Size of file

I hear that this isn't always valid:

FILE *in;
long size;
in = fopen("foo.bar","rb");
fseek(in,0,SEEK_END);
size = ftell(in);
fseek(in,0,SEEK_SET);

then fread size many bytes into memory.

Apparently fseek is not guaranteed to work because of 0xFF EOF or other
characters. Is this true only in text mode, or also in binary mode? Is there
any way to get a file size without having to read bytes one at a time? Is it
best to just fread until it fails?
Oct 2 '08 #1

"MisterE" <Mi*****@nimga.com> wrote in message
news:48**********************@news.optusnet.com.au...
>I hear that this isn't always valid:
>
>FILE *in;
>long size;
>in = fopen("foo.bar","rb");
>fseek(in,0,SEEK_END);
>size = ftell(in);
>fseek(in,0,SEEK_SET);
>
>then fread size many bytes into memory.
>
>Apparently fseek is not guaranteed to work because of 0xFF EOF or other
>characters, is this true only in text mode or also in binary mode? Is
>there anyway to get a filesize without having to read bytes on at a time.
>Is it best to just fread until it fails?

Works for me, using binary mode files. But there are various pitfalls:

In text mode, the size you get might be wrong because the file on disk might
contain '\r' '\n' sequences where text-mode reads give you just '\n'.

Some types of files may not have a beginning or end (like stdin, or some
serial device), so don't have a size.

Some OSs may not store the exact bytesize of a file (for example may only
store a block size), so the value might be approximate. (And there might be
other OS things to bear in mind such as use of compression.)

And whatever file size you get might change if the file is modified (by any
other process) by the time you use the file size information.

For more details, see threads on this subject in c.l.c.

But within those constraints, I've been using code like yours successfully
for a decade or two.
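
A minimal sketch of the fseek/ftell approach with the error checks filled in. The helper name file_size_by_seek is made up for illustration and does not come from the thread. The result is only as meaningful as the platform's support for SEEK_END on binary streams, and it can be stale if another process changes the file afterwards.

```c
#include <stdio.h>

/* Hypothetical helper: returns the size reported by fseek/ftell for a
 * file opened in binary mode, or -1 if any step fails. */
long file_size_by_seek(const char *path)
{
    FILE *in = fopen(path, "rb");
    long size = -1;

    if (in == NULL)
        return -1;

    /* fseek can fail, for example on pipes or devices. */
    if (fseek(in, 0L, SEEK_END) == 0)
        size = ftell(in);   /* ftell returns -1L on failure */

    fclose(in);
    return size;
}
```

A caller would treat a negative return as "size unknown" and fall back to reading until fread stops delivering data.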

--
Bartc

Oct 2 '08 #2
>I hear that this isn't always valid:

There are many, many, many different definitions of "file size",
(probably more than there are file sizes on a 64-bit machine) and
you need to decide which definition you want to use if you intend
calling any result "correct" or "incorrect".
>FILE *in;
>long size;
>in = fopen("foo.bar","rb");
>fseek(in,0,SEEK_END);
>size = ftell(in);
>fseek(in,0,SEEK_SET);
>
>then fread size many bytes into memory.

In binary mode, SEEK_END need not be meaningfully supported because
the system may pad the file with trailing 0 bytes. For example, CP/M
only counts sectors on binary files and rounds the size of the file
up to the next multiple of 128 bytes, and pads the last sector with
trailing 0 bytes.

In text mode, the size returned from ftell need not be meaningful as
a number. For example, it might be a bitfield of a number of values
like sector, head, cylinder, track, train, etc. so that subtracting
two of them does not give anything meaningful.

(Try, for example, subtracting 09302008 from 10022008, treating
them as decimal integers rather than dates, and try to make sense
out of the result that would indicate that they are 2 days apart.
The same kind of encoding can be done on text file offsets.)
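
The arithmetic in that parenthetical can be checked with a small sketch. The function name pack_mmddyyyy and the packing formula are assumptions made for illustration; they simply reproduce the MMDDYYYY decimal encoding used in the example. Note that the literal 09302008 cannot be written that way in C source, because a leading 0 marks an octal constant and 9 is not an octal digit.

```c
/* Illustrative only: pack a date as the decimal integer MMDDYYYY,
 * so 30 Sep 2008 becomes 9302008 and 2 Oct 2008 becomes 10022008. */
long pack_mmddyyyy(int month, int day, int year)
{
    return (long)month * 1000000L + (long)day * 10000L + (long)year;
}
```

Subtracting the two packed values gives 720000, which says nothing about the dates being 2 days apart. An ftell value from a text stream may be encoded in a similarly opaque way, which is why it is only safe to pass it back to fseek.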

Byte offsets into a text file are likely to be misleading because
of the \r\n -> \n translation done by some systems (e.g. Windows).
>Apparently fseek is not guaranteed to work because of 0xFF EOF or other
>characters,
There is no "EOF character". Even on one of those systems which use
an end marker for text files (Windows), that marker isn't 0xFF.
Many systems (UNIX & variants) just store a file length (yet another
definition of "file size") and don't use an end marker.

EOF is a value that won't *fit* in a char (unless sizeof int ==
sizeof char) which is why getchar() returns int, not char.
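
A short sketch of why the return value of getc must be kept in an int. The helper name count_bytes is hypothetical. Because c is an int, a 0xFF byte in the data is distinct from EOF and does not end the loop early.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes readable from an already-open
 * stream, or returns -1 if a read error occurred. */
long count_bytes(FILE *in)
{
    long n = 0;
    int c;   /* int, not char: must hold every byte value and EOF */

    while ((c = getc(in)) != EOF)
        n++;

    return ferror(in) ? -1 : n;
}
```

If c were declared as a plain char, a 0xFF byte could compare equal to EOF on systems where char is signed, or the loop could never terminate where char is unsigned.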
>is this true only in text mode or also in binary mode?
You are screwed in both text mode and in binary mode for different
reasons.
>Is there
>anyway to get a filesize
Do you want *A* filesize (in which case, I pick 0, it's easy, and
you didn't say it had to be correct, and some files actually do
have size 0) or do you want a *correct* filesize, in which case you
have to pick a definition of filesize?
>without having to read bytes on at a time. Is it
>best to just fread until it fails?
If you want to read the file into memory, two definitions
of file size come to mind:

1. The number of bytes read from the file in binary mode.
2. The number of bytes read from the file in text mode.

Chances are high that these two definitions will give different
answers for the file size for any given file. Neither of these
necessarily says anything about how much space the file takes on
disk. But if you want to read the file into memory, these are
the right definitions to use (pick the one that uses the same
file mode as the file mode you're going to use).
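
One way to act on those definitions is to skip the size query entirely and read until fread returns nothing more, growing the buffer as needed. This is a sketch under assumptions: the name read_all and the 4096-byte starting capacity are arbitrary choices, not anything from the thread. The byte count it reports is, by construction, the number of bytes read in whatever mode the stream was opened with.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads everything from `in` into a malloc'd buffer
 * and stores the byte count in *len. Returns NULL on a read or allocation
 * failure. The caller frees the returned buffer. */
unsigned char *read_all(FILE *in, size_t *len)
{
    size_t cap = 4096, used = 0, got;
    unsigned char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    /* fread returns 0 at end of file or on error; ferror tells them apart. */
    while ((got = fread(buf + used, 1, cap - used, in)) > 0) {
        used += got;
        if (used == cap) {
            unsigned char *tmp;
            if (cap > (size_t)-1 / 2) { free(buf); return NULL; }
            tmp = realloc(buf, cap * 2);
            if (tmp == NULL) { free(buf); return NULL; }
            buf = tmp;
            cap *= 2;
        }
    }

    if (ferror(in)) { free(buf); return NULL; }
    *len = used;
    return buf;
}
```

Doubling the capacity keeps the number of realloc calls logarithmic in the file size, at the cost of some unused space at the end of the buffer.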
Oct 3 '08 #3
On Fri, 3 Oct 2008 09:24:53 +1000, "MisterE" <Mi*****@nimga.com>
wrote:
>I hear that this isn't always valid:
You heard right.
>
>FILE *in;
>long size;
>in = fopen("foo.bar","rb");
You open the file in binary.
>fseek(in,0,SEEK_END);
The standard specifically states "A binary stream need not
meaningfully support fseek calls with a whence value of SEEK_END."
>size = ftell(in);
>fseek(in,0,SEEK_SET);
>
>then fread size many bytes into memory.
>
>Apparently fseek is not guaranteed to work because of 0xFF EOF or other
I don't know where you came up with this. 0xFF is not a special character
in a binary file. It could even be a normal printable character since
the standard does not mandate ASCII or EBCDIC. EOF is not a
character. It is a macro. It is entirely possible that the value
used in that macro is not representable as a char.
>characters, is this true only in text mode or also in binary mode? Is there
>anyway to get a filesize without having to read bytes on at a time. Is it
>best to just fread until it fails?
Depends on how important portability is to you.

--
Remove del for email
Oct 3 '08 #4
On Fri, 03 Oct 2008 09:24:53 +1000, MisterE wrote:
>I hear that this isn't always valid:
>
>FILE *in;
>long size;
>in = fopen("foo.bar","rb");
>fseek(in,0,SEEK_END);
>size = ftell(in);
>fseek(in,0,SEEK_SET);
>
>then fread size many bytes into memory.

Most people use fopen and fseek. In my programs I used stat. One thing
that always made me wonder is that stat reports filesize == 0 if the file
is open. Only on a closed file does it report the correct size.


--
www.lispmachine.wordpress.com
my email is @ the above blog.
Gooogle Groups is Blocked. Reason: Excessive Spamming

Oct 3 '08 #5
arnuld <su*****@invalid.address> writes:
>On Fri, 03 Oct 2008 09:24:53 +1000, MisterE wrote:
>>I hear that this isn't always valid:
>>
>>FILE *in;
>>long size;
>>in = fopen("foo.bar","rb");
>>fseek(in,0,SEEK_END);
>>size = ftell(in);
>>fseek(in,0,SEEK_SET);
>>
>>then fread size many bytes into memory.
>
>Most people use fopen and fseek. In my programs I used stat. One thing
>that always made me wonder is that stat reports filesize == 0 if the file
>is opened . Only on closed file it reports the correct size.

Sorry this is becoming off-topic, but where did you find this
behavior? Under Unix this would be very strange.

The only thing I can think of is that you opened the file for writing,
which ordinarily would truncate it, so that its size would indeed be
0. But opening for reading should not do this.
Oct 3 '08 #6
On Thu, 02 Oct 2008 23:04:30 -0700, Nate Eldredge wrote:

>Sorry this is becoming off-topic, but where did you find this
>behavior? Under Unix this would be very strange.

well, it happens on my machine all the time.

>The only thing I can think of is that you opened the file for writing,
>which ordinarily would truncate it, so that its size would indeed be
>0. But opening for reading should not do this.

fopen(file, "a")

--
www.lispmachine.wordpress.com
my email is @ the above blog.
Gooogle Groups is Blocked. Reason: Excessive Spamming

Oct 3 '08 #7
arnuld <su*****@invalid.address> writes:
>On Thu, 02 Oct 2008 23:04:30 -0700, Nate Eldredge wrote:
>
>>Sorry this is becoming off-topic, but where did you find this
>>behavior? Under Unix this would be very strange.
>
>well, it happens on my machine all the time.

What operating system / compiler / standard library?

>>The only thing I can think of is that you opened the file for writing,
>>which ordinarily would truncate it, so that its size would indeed be
>>0. But opening for reading should not do this.
>
>fopen(file, "a")

Peculiar. Can you post a complete example of a program that shows
this behavior?
Oct 3 '08 #8
arnuld wrote:
>MisterE wrote:
>>I hear that this isn't always valid:
>>
>>FILE *in;
>>long size;
>>in = fopen("foo.bar","rb");
>>fseek(in,0,SEEK_END);
>>size = ftell(in);
>>fseek(in,0,SEEK_SET);
>>
>>then fread size many bytes into memory.
>
>Most people use fopen and fseek. In my programs I used stat. One
>thing that always made me wonder is that stat reports filesize
>== 0 if the file is opened . Only on closed file it reports the
>correct size.

stat is not present in standard C. Thus it can do anything, and is
off topic here unless you present its actual coding (in standard
C).

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Oct 3 '08 #9
On 3 Oct 2008 at 7:35, CBFalconer wrote:
>arnuld wrote:
>>Most people use fopen and fseek. In my programs I used stat. One
>>thing that always made me wonder is that stat reports filesize
>>== 0 if the file is opened . Only on closed file it reports the
>>correct size.

I don't know exactly what you mean. Perhaps you're talking about writes
that might have been buffered and not yet actually made, which stat()
won't detect? For example:
#include <stdio.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    FILE *out;
    struct stat buf;
    out=fopen("foo", "w");
    if(out) {
        fputs("12345", out);
        /* the data is still in the stdio buffer here */
        if(stat("foo", &buf)==0)
            printf("size: %lu\n", (unsigned long) buf.st_size);
        fflush(out);
        /* after the flush the write has reached the file */
        if(stat("foo", &buf)==0)
            printf("flushed size: %lu\n", (unsigned long) buf.st_size);
        fclose(out);
    }
    return 0;
}

$ ./a
size: 0
flushed size: 5
>stat is not present in standard C. Thus it can do anything, and is
>off topic here unless you present its actual coding (in standard
>C).

Why don't you just crawl back in your hole and die if you don't have
anything useful to contribute?

Oct 3 '08 #10
After takin' a swig o' grog, CBFalconer belched out
this bit o' wisdom:
>arnuld wrote:
>>
>>Most people use fopen and fseek. In my programs I used stat. One
>>thing that always made me wonder is that stat reports filesize
>>== 0 if the file is opened . Only on closed file it reports the
>>correct size.
>
>stat is not present in standard C. Thus it can do anything, and is
>off topic here unless you present its actual coding (in standard
>C).

It's called _stat() by Microsoft.

--
If builders built buildings the way programmers wrote programs,
then the first woodpecker to come along would destroy civilization.
Oct 3 '08 #11
CBFalconer <cb********@yahoo.com> writes:
>arnuld wrote:
>[...]
>>Most people use fopen and fseek. In my programs I used stat. One
>>thing that always made me wonder is that stat reports filesize
>>== 0 if the file is opened . Only on closed file it reports the
>>correct size.
>
>stat is not present in standard C. Thus it can do anything, and is
>off topic here unless you present its actual coding (in standard
>C).

Let's assume that arnuld is referring to the "stat" function specified
by POSIX; it's theoretically possible that he's talking about
something else, but common sense points to that one particular
function. Presenting an actual implementation of the POSIX stat() in
standard C is not possible; it depends on characteristics of the file
system that C does not define. Even if it were possible, posting a
complete implementation would be a waste of bandwidth; you don't need
to post a function's implementation to discuss what it does.

If you want to say it's off-topic, just say it's off-topic (and I
agree, it is off-topic, though I don't object to a brief mention).
Dragging in absurd, and presumably unserious, suggestions about how it
*could* be topical is not at all helpful.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Oct 3 '08 #12
Chris Ahlstrom wrote:
>CBFalconer belched out this bit o' wisdom:
>>arnuld wrote:
>>>
>>>Most people use fopen and fseek. In my programs I used stat. One
>>>thing that always made me wonder is that stat reports filesize
>>>== 0 if the file is opened . Only on closed file it reports the
>>>correct size.
>>
>>stat is not present in standard C. Thus it can do anything, and is
>>off topic here unless you present its actual coding (in standard
>>C).
>
>It's called _stat() by Microsoft.

So what? It is not present in standard C, the subject of this
newsgroup. If you want to bring it up on a newsgroup that deals
with Microsoft or Posix, that is another matter. Then there is a
definition available for it. It still isn't portable.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Oct 3 '08 #13
Keith Thompson wrote:
>CBFalconer <cb********@yahoo.com> writes:
.... snip ...
>
>>stat is not present in standard C. Thus it can do anything,
>>and is off topic here unless you present its actual coding (in
>>standard C).
.... snip ...
>
>If you want to say it's off-topic, just say it's off-topic (and
>I agree, it is off-topic, though I don't object to a brief
>mention). Dragging in absurd, and presumably unserious,
>suggestions about how it *could* be topical is not at all helpful.

I disagree. There is no reason a user can't write his own stat()
function, say as:

int stat(char *s) {
    return !!*s;
}

I think my response (above) covered the possibilities.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Oct 3 '08 #14
CBFalconer <cb********@yahoo.com> writes:
>Keith Thompson wrote:
>>CBFalconer <cb********@yahoo.com> writes:
>... snip ...
>>
>>>stat is not present in standard C. Thus it can do anything,
>>>and is off topic here unless you present its actual coding (in
>>>standard C).
>... snip ...
>>
>>If you want to say it's off-topic, just say it's off-topic (and
>>I agree, it is off-topic, though I don't object to a brief
>>mention). Dragging in absurd, and presumably unserious,
>>suggestions about how it *could* be topical is not at all helpful.
>
>I disagree. There is no reason a user can't write his own stat()
>function, say as:
>
>int stat(char *s) {
>    return !!*s;
>}
>
>I think my response (above) covered the possibilities.

The previous poster talked about using stat to determine the size of a
file.

Topicality doesn't preclude using a little common sense. When you
talk about Pascal, I generally assume you mean the language, not the
philosopher.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Oct 3 '08 #15
Keith Thompson said:

<snip>
>Topicality doesn't preclude using a little common sense. When you
>talk about Pascal, I generally assume you mean the language, not the
>philosopher.

So no pressure, huh?

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Oct 4 '08 #16
Richard Heathfield wrote:
>Keith Thompson said:
>
><snip>
>>Topicality doesn't preclude using a little common sense. When
>>you talk about Pascal, I generally assume you mean the language,
>>not the philosopher.
>
>So no pressure, huh?

Hey, over here we don't live in a MKS world.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Oct 4 '08 #17
On Thu, 02 Oct 2008 23:43:22 -0700, Nate Eldredge wrote:
>What operating system / compiler / standard library?

[arnuld@dune ~]$ uname -a
Linux dune 2.6.9-67.EL #1 Fri Nov 16 12:34:13 EST 2007 i686 athlon i386 GNU/Linux

[arnuld@dune ~]$ cat /etc/issue
CentOS release 4.6 (Final)
Kernel \r on an \m

[arnuld@dune ~]$ gcc --version
gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-9)
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

>Peculiar. Can you post a complete example of a program that shows
>this behavior?

yeah, quite peculiar because it's working fine now:

#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
    FILE* fp = NULL;
    char file_name[] = "test.c";
    char* pf = file_name;
    struct stat statbuf;
    unsigned long file_size = 0;

    if( (fp = fopen(pf, "a")) )
    {
        if( ! stat(pf, &statbuf) )
        {
            file_size = (unsigned long) statbuf.st_size;
        }
        /* close only a stream that was actually opened */
        if( fclose(fp) )
        {
            printf("error closing file\n");
        }
    }

    /* %lu matches the unsigned long argument */
    printf("size of %s = %lu bytes\n", file_name, file_size);
    return 0;
}
====================== OUTPUT ===========================
[arnuld@dune ztest]$ gcc -ansi -pedantic -Wall -Wextra new.c
[arnuld@dune ztest]$ ./a.out
size of test.c = 256 bytes
[arnuld@dune ztest]$

I don't know why it was behaving like that last time. Maybe you are right,
because sometimes the program was supposed to create a new file while
sometimes it was supposed to append.

--
www.lispmachine.wordpress.com
my email is @ the above blog.
Gooogle Groups is Blocked. Reason: Excessive Spamming

Oct 4 '08 #18
Chris Ahlstrom <li*****@bollsouth.nut> wrote:
>After takin' a swig o' grog, CBFalconer belched out
>>arnuld wrote:
>>>
>>>Most people use fopen and fseek. In my programs I used stat. One
>>>thing that always made me wonder is that stat reports filesize
>>>== 0 if the file is opened . Only on closed file it reports the
>>>correct size.
>>stat is not present in standard C. Thus it can do anything, and is
>>off topic here unless you present its actual coding (in standard
>>C).
>
>It's called _stat() by Microsoft.

Does it work _exactly_ the same as stat() under POSIX? _All_ POSIXes?
No? Well, it's not very dependable, then, is it?

Richard
Oct 6 '08 #19
