
Size of file

P: n/a
I hear that this isn't always valid:

FILE *in;
long size;
in = fopen("foo.bar","rb");
fseek(in,0,SEEK_END);
size = ftell(in);
fseek(in,0,SEEK_SET);

then fread size many bytes into memory.

Apparently fseek is not guaranteed to work because of 0xFF EOF or other
characters. Is this true only in text mode or also in binary mode? Is there
any way to get a filesize without having to read bytes one at a time? Is it
best to just fread until it fails?
Oct 2 '08 #1
18 Replies


P: n/a

"MisterE" <Mi*****@nimga.com> wrote in message
news:48**********************@news.optusnet.com.au ...
>I hear that this isn't always valid:

FILE *in;
long size;
in = fopen("foo.bar","rb");
fseek(in,0,SEEK_END);
size = ftell(in);
fseek(in,0,SEEK_SET);

then fread size many bytes into memory.

Apparently fseek is not guaranteed to work because of 0xFF EOF or other
characters, is this true only in text mode or also in binary mode? Is
there anyway to get a filesize without having to read bytes on at a time.
Is it best to just fread until it fails?
Works for me, using binary mode files. But there are various pitfalls:

In text mode, the size you get might be wrong because it might count '\r'
'\n' sequences that are read back as just '\n'.

Some types of files may not have a beginning or end (like stdin, or some
serial device), so don't have a size.

Some OSs may not store the exact bytesize of a file (for example may only
store a block size), so the value might be approximate. (And there might be
other OS things to bear in mind such as use of compression.)

And whatever file size you get might change if the file is modified (by any
other process) by the time you use the file size information.

For more details, see threads on this subject in c.l.c.

But within those constraints, I've been using code like yours successfully
for a decade or two.
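
A minimal sketch of the seek-and-tell approach with the error checks the original snippet leaves out. The helper name file_size_by_seek is made up for illustration, and the result is only as trustworthy as the pitfalls listed above allow: it assumes a regular file opened in binary mode on a system where SEEK_END is meaningful.

```c
#include <stdio.h>

/* Hypothetical helper (not from the thread): returns the offset that
 * ftell() reports at the end of a binary stream, or -1L on any failure.
 * The C standard does not require binary streams to support SEEK_END
 * meaningfully, so treat the result as a hint, not a guarantee. */
long file_size_by_seek(const char *path)
{
    FILE *in = fopen(path, "rb");
    long size;

    if (in == NULL)
        return -1L;
    if (fseek(in, 0L, SEEK_END) != 0) {
        fclose(in);
        return -1L;
    }
    size = ftell(in);   /* ftell itself returns -1L on failure */
    fclose(in);
    return size;
}
```

A caller would check for -1L before using the value, and should still be prepared for fread to deliver fewer bytes than this number suggests.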

--
Bartc

Oct 2 '08 #2

P: n/a
>I hear that this isn't always valid:

There are many, many, many different definitions of "file size",
(probably more than there are file sizes on a 64-bit machine) and
you need to decide which definition you want to use if you intend
calling any result "correct" or "incorrect".
>FILE *in;
long size;
in = fopen("foo.bar","rb");
fseek(in,0,SEEK_END);
size = ftell(in);
fseek(in,0,SEEK_SET);

then fread size many bytes into memory.
In binary mode, SEEK_END need not be meaningfully supported because
the system may pad the file with trailing 0 bytes. For example, CP/M
only counts sectors on binary files and rounds the size of the file
up to the next multiple of 128 bytes, and pads the last sector with
trailing 0 bytes.

In text mode, the size returned from ftell need not be meaningful as
a number. For example, it might be a bitfield of a number of values
like sector, head, cylinder, track, train, etc. so that subtracting
two of them does not give anything meaningful.

(Try, for example, subtracting 09302008 from 10022008, treating
them as decimal integers rather than dates, and try to make sense
out of the result that would indicate that they are 2 days apart.
The same kind of encoding can be done on text file offsets.)
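
The date arithmetic above can be checked with a few lines of C. This is only an illustration of the analogy; naive_difference is a made-up name, and the leading zero of 09302008 has to be dropped because C would otherwise try to parse it as an (invalid) octal constant.

```c
/* Hypothetical helper: subtracts two values as plain integers, ignoring
 * the fact that they are really packed MMDDYYYY fields. */
long naive_difference(long later, long earlier)
{
    return later - earlier;
}
```

Calling naive_difference(10022008L, 9302008L) gives 720000, which says nothing about the two dates being 2 days apart. An ftell value from a text stream may be packed in a similar way, which is why it is only safe to hand it back to fseek.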

Byte offsets into a text file are likely to be misleading because
of the \r\n -> \n translation done by some systems (e.g. Windows).
>Apparently fseek is not guaranteed to work because of 0xFF EOF or other
characters,
There is no "EOF character". Even on one of those systems which use
an end marker for text files (Windows), that marker isn't 0xFF.
Many systems (UNIX & variants) just store a file length (yet another
definition of "file size") and don't use an end marker.

EOF is a value that won't *fit* in a char (unless sizeof int ==
sizeof char) which is why getchar() returns int, not char.
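
A short sketch of why the int return type matters, assuming a stream opened in binary mode. The helper name count_ff_bytes is hypothetical. A byte with value 0xFF comes back from getc as 255, which is ordinary data; only the out-of-band value EOF ends the loop.

```c
#include <stdio.h>

/* Hypothetical helper: counts how many bytes in the stream have the
 * value 0xFF. Returns -1L if a read error occurred. */
long count_ff_bytes(FILE *in)
{
    long n = 0;
    int c;              /* int, not char: it must be able to hold EOF */

    while ((c = getc(in)) != EOF)
        if (c == 0xFF)  /* a data byte, not an end marker */
            n++;
    return ferror(in) ? -1L : n;
}
```

Had c been declared as char, the comparison with EOF could either never succeed or succeed too early on a 0xFF byte, depending on whether char is signed.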
>is this true only in text mode or also in binary mode?
You are screwed in both text mode and in binary mode for different
reasons.
>Is there
anyway to get a filesize
Do you want *A* filesize (in which case, I pick 0, it's easy, and
you didn't say it had to be correct, and some files actually do
have size 0) or do you want a *correct* filesize, in which case you
have to pick a definition of filesize?
>without having to read bytes on at a time. Is it
best to just fread until it fails?
If you want to read the file into memory, two definitions
of file size come to mind:

1. The number of bytes read from the file in binary mode.
2. The number of bytes read from the file in text mode.

Chances are high that these two definitions will give different
answers for the file size for any given file. Neither of these
necessarily says anything about how much space the file takes on
disk. But if you want to read the file into memory, these are
the right definitions to use (pick the one that uses the same
file mode as the file mode you're going to use).
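
The two definitions above can be measured directly. The following is a sketch only; bytes_read_in_mode is a made-up helper that counts what fread actually delivers for a given fopen mode.

```c
#include <stdio.h>

/* Hypothetical helper: number of bytes fread() delivers when `path` is
 * opened with `mode` (for example "rb" or "r"); -1L on failure. */
long bytes_read_in_mode(const char *path, const char *mode)
{
    unsigned char chunk[4096];
    FILE *in = fopen(path, mode);
    long total = 0;
    size_t got;

    if (in == NULL)
        return -1L;
    while ((got = fread(chunk, 1, sizeof chunk, in)) > 0)
        total += (long)got;
    if (ferror(in))
        total = -1L;    /* stopped because of an error, not end of file */
    fclose(in);
    return total;
}
```

On systems that do no text translation, the "r" and "rb" counts should agree. On systems that translate line endings, the "r" count is typically smaller for a file containing \r\n pairs.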
Oct 3 '08 #3

P: n/a
On Fri, 3 Oct 2008 09:24:53 +1000, "MisterE" <Mi*****@nimga.com>
wrote:
>I hear that this isn't always valid:
You heard right.
>
FILE *in;
long size;
in = fopen("foo.bar","rb");
You open the file in binary.
>fseek(in,0,SEEK_END);
The standard specifically states "A binary stream need not
meaningfully support fseek calls with a whence value of SEEK_END."
>size = ftell(in);
fseek(in,0,SEEK_SET);

then fread size many bytes into memory.

Apparently fseek is not guaranteed to work because of 0xFF EOF or other
I don't know where you came up with this. 0xFF is not a special character
in a binary file. It could even be a normal printable character since
the standard does not mandate ASCII or EBCDIC. EOF is not a
character. It is a macro. It is entirely possible that the value
used in that macro is not representable as a char.
>characters, is this true only in text mode or also in binary mode? Is there
anyway to get a filesize without having to read bytes on at a time. Is it
best to just fread until it fails?
Depends on how important portability is to you.
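
If portability matters most, one option is to skip the size query and fread until it stops delivering data, growing the buffer as needed. The sketch below assumes the whole file fits in memory; read_all and the 4096-byte starting capacity are illustrative choices, not anything from the standard.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the rest of `in` into a malloc'd buffer.
 * On success returns the buffer (the caller frees it) and stores the
 * byte count in *len. On failure returns NULL. */
unsigned char *read_all(FILE *in, size_t *len)
{
    size_t cap = 4096, used = 0, got;
    unsigned char *buf = malloc(cap), *tmp;

    if (buf == NULL)
        return NULL;
    while ((got = fread(buf + used, 1, cap - used, in)) > 0) {
        used += got;
        if (used == cap) {          /* buffer full: double it */
            if (cap > (size_t)-1 / 2 ||
                (tmp = realloc(buf, cap * 2)) == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    if (ferror(in)) {               /* stopped on an error, not on EOF */
        free(buf);
        return NULL;
    }
    *len = used;
    return buf;
}
```

The byte count stored in *len is then the "file size" in the sense that matters for an in-memory copy, in whichever mode the stream was opened.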

--
Remove del for email
Oct 3 '08 #4

P: n/a
On Fri, 03 Oct 2008 09:24:53 +1000, MisterE wrote:
I hear that this isn't always valid:

FILE *in;
long size;
in = fopen("foo.bar","rb");
fseek(in,0,SEEK_END);
size = ftell(in);
fseek(in,0,SEEK_SET);

then fread size many bytes into memory.
Most people use fopen and fseek. In my programs I used stat. One thing
that always made me wonder is that stat reports filesize == 0 if the file
is open. Only on a closed file does it report the correct size.


--
www.lispmachine.wordpress.com
my email is @ the above blog.
Gooogle Groups is Blocked. Reason: Excessive Spamming

Oct 3 '08 #5

P: n/a
arnuld <su*****@invalid.address> writes:
>On Fri, 03 Oct 2008 09:24:53 +1000, MisterE wrote:
>I hear that this isn't always valid:

FILE *in;
long size;
in = fopen("foo.bar","rb");
fseek(in,0,SEEK_END);
size = ftell(in);
fseek(in,0,SEEK_SET);

then fread size many bytes into memory.

Most people use fopen and fseek. In my programs I used stat. One thing
that always made me wonder is that stat reports filesize == 0 if the file
is opened . Only on closed file it reports the correct size.
Sorry this is becoming off-topic, but where did you find this
behavior? Under Unix this would be very strange.

The only thing I can think of is that you opened the file for writing,
which ordinarily would truncate it, so that its size would indeed be
0. But opening for reading should not do this.
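
The truncation theory is easy to test in standard C. In this sketch, both helper names are made up: bytes_left_after_open opens a file with a given mode, closes it without writing anything, and then counts what is left.

```c
#include <stdio.h>

/* Counts the bytes currently readable from `path` in binary mode,
 * or returns -1L on failure. */
static long readable_bytes(const char *path)
{
    FILE *in = fopen(path, "rb");
    long n = 0;

    if (in == NULL)
        return -1L;
    while (getc(in) != EOF)
        n++;
    if (ferror(in))
        n = -1L;
    fclose(in);
    return n;
}

/* Hypothetical helper: opens `path` with `mode`, closes it again without
 * writing, and reports how many bytes the file holds afterwards. */
long bytes_left_after_open(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);

    if (fp == NULL)
        return -1L;
    if (fclose(fp) != 0)
        return -1L;
    return readable_bytes(path);
}
```

For an existing 5-byte file, modes "r" and "a" should leave 5 bytes, while "w" should leave 0, since "w" truncates an existing file to zero length.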
Oct 3 '08 #6

P: n/a
On Thu, 02 Oct 2008 23:04:30 -0700, Nate Eldredge wrote:

Sorry this is becoming off-topic, but where did you find this
behavior? Under Unix this would be very strange.
well, it happens on my machine all the time.

The only thing I can think of is that you opened the file for writing,
which ordinarily would truncate it, so that its size would indeed be
0. But opening for reading should not do this.
fopen(file, "a")

--
www.lispmachine.wordpress.com
my email is @ the above blog.
Gooogle Groups is Blocked. Reason: Excessive Spamming

Oct 3 '08 #7

P: n/a
arnuld <su*****@invalid.address> writes:
>On Thu, 02 Oct 2008 23:04:30 -0700, Nate Eldredge wrote:

>Sorry this is becoming off-topic, but where did you find this
behavior? Under Unix this would be very strange.

well, it happens on my machine all the time.
What operating system / compiler / standard library?
>
>The only thing I can think of is that you opened the file for writing,
which ordinarily would truncate it, so that its size would indeed be
0. But opening for reading should not do this.

fopen(file, "a")
Peculiar. Can you post a complete example of a program that shows
this behavior?
Oct 3 '08 #8

P: n/a
arnuld wrote:
>MisterE wrote:
>I hear that this isn't always valid:

FILE *in;
long size;
in = fopen("foo.bar","rb");
fseek(in,0,SEEK_END);
size = ftell(in);
fseek(in,0,SEEK_SET);

then fread size many bytes into memory.

Most people use fopen and fseek. In my programs I used stat. One
thing that always made me wonder is that stat reports filesize
== 0 if the file is opened . Only on closed file it reports the
correct size.
stat is not present in standard C. Thus it can do anything, and is
off topic here unless you present its actual coding (in standard
C).

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Oct 3 '08 #9

P: n/a
On 3 Oct 2008 at 7:35, CBFalconer wrote:
arnuld wrote:
>Most people use fopen and fseek. In my programs I used stat. One
thing that always made me wonder is that stat reports filesize
== 0 if the file is opened . Only on closed file it reports the
correct size.
I don't know exactly what you mean. Perhaps you're talking about writes
that might have been buffered and not yet actually made, which stat()
won't detect? For example:
#include <stdio.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
  FILE *out;
  struct stat buf;
  out=fopen("foo", "w");
  if(out) {
    fputs("12345", out);
    if(stat("foo", &buf)==0)
      printf("size: %lu\n", (unsigned long) buf.st_size);
    fflush(out);
    if(stat("foo", &buf)==0)
      printf("flushed size: %lu\n", (unsigned long) buf.st_size);
    fclose(out);
  }
  return 0;
}

$ ./a
size: 0
flushed size: 5
stat is not present in standard C. Thus it can do anything, and is
off topic here unless you present its actual coding (in standard
C).
Why don't you just crawl back in your hole and die if you don't have
anything useful to contribute?

Oct 3 '08 #10

P: n/a
After takin' a swig o' grog, CBFalconer belched out
this bit o' wisdom:
arnuld wrote:
>>
Most people use fopen and fseek. In my programs I used stat. One
thing that always made me wonder is that stat reports filesize
== 0 if the file is opened . Only on closed file it reports the
correct size.

stat is not present in standard C. Thus it can do anything, and is
off topic here unless you present its actual coding (in standard
C).
It's called _stat() by Microsoft.

--
If builders built buildings the way programmers wrote programs,
then the first woodpecker to come along would destroy civilization.
Oct 3 '08 #11

P: n/a
CBFalconer <cb********@yahoo.com> writes:
arnuld wrote:
[...]
>Most people use fopen and fseek. In my programs I used stat. One
thing that always made me wonder is that stat reports filesize
== 0 if the file is opened . Only on closed file it reports the
correct size.

stat is not present in standard C. Thus it can do anything, and is
off topic here unless you present its actual coding (in standard
C).
Let's assume that arnuld is referring to the "stat" function specified
by POSIX; it's theoretically possible that he's talking about
something else, but common sense points to that one particular
function. Presenting an actual implementation of the POSIX stat() in
standard C is not possible; it depends on characteristics of the file
system that C does not define. Even if it were possible, posting a
complete implementation would be a waste of bandwidth; you don't need
to post a function's implementation to discuss what it does.

If you want to say it's off-topic, just say it's off-topic (and I
agree, it is off-topic, though I don't object to a brief mention).
Dragging in absurd, and presumably unserious, suggestions about how it
*could* be topical is not at all helpful.
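
For reference, a minimal sketch of how the POSIX stat function is typically used to query a file's size. This is POSIX, not standard C, so it is only an option where <sys/stat.h> is available. The wrapper name file_size_by_stat is made up for illustration.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical wrapper around POSIX stat(): returns st_size for `path`,
 * or -1 if stat() fails (for example, if the file does not exist). */
long long file_size_by_stat(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) != 0)
        return -1;
    return (long long)sb.st_size;
}
```

The value reflects what the file system has recorded at the time of the call, so data still sitting in a stdio buffer of an open stream is not counted until it has been flushed.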

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Oct 3 '08 #12

P: n/a
Chris Ahlstrom wrote:
CBFalconer belched out this bit o' wisdom:
>arnuld wrote:
>>>
Most people use fopen and fseek. In my programs I used stat. One
thing that always made me wonder is that stat reports filesize
== 0 if the file is opened . Only on closed file it reports the
correct size.

stat is not present in standard C. Thus it can do anything, and is
off topic here unless you present its actual coding (in standard
C).

It's called _stat() by Microsoft.
So what? It is not present in standard C, the subject of this
newsgroup. If you want to bring it up on a newsgroup that deals
with Microsoft or Posix, that is another matter. Then there is a
definition available for it. It still isn't portable.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Oct 3 '08 #13

P: n/a
Keith Thompson wrote:
CBFalconer <cb********@yahoo.com> writes:
.... snip ...
>
>stat is not present in standard C. Thus it can do anything,
and is off topic here unless you present its actual coding (in
standard C).
.... snip ...
>
If you want to say it's off-topic, just say it's off-topic (and
I agree, it is off-topic, though I don't object to a brief
mention). Dragging in absurd, and presumably unserious,
suggestions about how it *could* be topical is not at all helpful.
I disagree. There is no reason a user can't write his own stat()
function, say as:

int stat(char *s) {
return !!*s;
}

I think my response (above) covered the possibilities.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Oct 3 '08 #14

P: n/a
CBFalconer <cb********@yahoo.com> writes:
Keith Thompson wrote:
>CBFalconer <cb********@yahoo.com> writes:
... snip ...
>>
>>stat is not present in standard C. Thus it can do anything,
and is off topic here unless you present its actual coding (in
standard C).
... snip ...
>>
If you want to say it's off-topic, just say it's off-topic (and
I agree, it is off-topic, though I don't object to a brief
mention). Dragging in absurd, and presumably unserious,
suggestions about how it *could* be topical is not at all helpful.

I disagree. There is no reason a user can't write his own stat()
function, say as:

int stat(char *s) {
return !!*s;
}

I think my response (above) covered the possibilities.
The previous poster talked about using stat to determine the size of a
file.

Topicality doesn't preclude using a little common sense. When you
talk about Pascal, I generally assume you mean the language, not the
philosopher.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Oct 3 '08 #15

P: n/a
Keith Thompson said:

<snip>
Topicality doesn't preclude using a little common sense. When you
talk about Pascal, I generally assume you mean the language, not the
philosopher.
So no pressure, huh?

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Oct 4 '08 #16

P: n/a
Richard Heathfield wrote:
Keith Thompson said:

<snip>
>Topicality doesn't preclude using a little common sense. When
you talk about Pascal, I generally assume you mean the language,
not the philosopher.

So no pressure, huh?
Hey, over here we don't live in a MKS world.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Oct 4 '08 #17

P: n/a
On Thu, 02 Oct 2008 23:43:22 -0700, Nate Eldredge wrote:
What operating system / compiler / standard library?
[arnuld@dune ~]$ uname -a
Linux dune 2.6.9-67.EL #1 Fri Nov 16 12:34:13 EST 2007 i686 athlon i386 GNU/Linux

[arnuld@dune ~]$ cat /etc/issue
CentOS release 4.6 (Final)
Kernel \r on an \m

[arnuld@dune ~]$ gcc --version
gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-9)
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Peculiar. Can you post a complete example of a program that shows
this behavior?
yeah, quite peculiar because it's working fine now:
#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
  FILE* fp = NULL;
  char file_name[] = "test.c";
  char* pf = file_name;
  struct stat statbuf;
  unsigned long file_size = 0;

  if( (fp = fopen(pf, "a")) )
  {
    if( ! stat(pf, &statbuf) )
    {
      file_size = (unsigned long) statbuf.st_size;
    }
    /* close only a stream that was actually opened */
    if( fclose(fp) )
    {
      printf("error closing file\n");
    }
  }

  /* %lu matches the unsigned long argument */
  printf("size of %s = %lu bytes\n", file_name, file_size);
  return 0;
}
====================== OUTPUT ===========================
[arnuld@dune ztest]$ gcc -ansi -pedantic -Wall -Wextra new.c
[arnuld@dune ztest]$ ./a.out
size of test.c = 256 bytes
[arnuld@dune ztest]$

I don't know why it was behaving like that last time. Maybe you are right,
because sometimes the program was supposed to create a new file while
sometimes it was supposed to append.

--
www.lispmachine.wordpress.com
my email is @ the above blog.
Gooogle Groups is Blocked. Reason: Excessive Spamming

Oct 4 '08 #18

P: n/a
Chris Ahlstrom <li*****@bollsouth.nut> wrote:
After takin' a swig o' grog, CBFalconer belched out
arnuld wrote:
>
Most people use fopen and fseek. In my programs I used stat. One
thing that always made me wonder is that stat reports filesize
== 0 if the file is opened . Only on closed file it reports the
correct size.
stat is not present in standard C. Thus it can do anything, and is
off topic here unless you present its actual coding (in standard
C).

It's called _stat() by Microsoft.
Does it work _exactly_ the same as stat() under POSIX? _All_ POSIXes?
No? Well, it's not very dependable, then, is it?

Richard
Oct 6 '08 #19
