
Size of a File

P: n/a
Hi,
I'm new to C programming and I'm stuck.
I want to find the size of a file.
I couldn't find a proper way of doing it.
What I was planning to do is...
Open the requested file;
the default position is the 1st byte.
Now I use fseek() and move to the end of the file.
Then I use ftell() to get the current position in the file.

This should return the number of bytes (length / size) of the file.

Is this a proper way to do it?
Is there any efficient way to do so?

Thanks

Sep 7 '08 #1
20 Replies


P: n/a
Ashit Vora wrote:
>Hi,
>I'm new to C programming and I'm stuck.
>I want to find the size of a file.
>I couldn't find a proper way of doing it.
>What I was planning to do is...
>Open the requested file;
>the default position is the 1st byte.
>Now I use fseek() and move to the end of the file.
>Then I use ftell() to get the current position in the file.
That's what I would do, too (and what was suggested to me when I had
the same problem some time ago).
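
For reference, the seek-and-tell approach under discussion might look like the minimal sketch below. The name file_size_seek is made up for illustration. It assumes the stream was opened in binary mode and that the implementation gives fseek() with SEEK_END and ftell() the byte-count meaning, which the C standard does not promise.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the size in bytes of an open binary stream, or -1L on failure.
 * Assumption: SEEK_END is meaningful for this stream and ftell() reports
 * a byte offset. The caller's file position is restored before returning. */
long file_size_seek(FILE *fp)
{
    assert(fp != NULL);

    long saved = ftell(fp);              /* remember where the caller was */
    if (saved == -1L)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) != 0)    /* jump to the end of the file */
        return -1L;
    long size = ftell(fp);               /* offset of the end of the file */
    if (fseek(fp, saved, SEEK_SET) != 0) /* go back to the saved position */
        return -1L;
    return size;
}
```

Because the result is a long, this cannot report sizes of 2 GiB or more on systems where long is 32 bits.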

I don't know of any other way (to do it portably). There may be
"better" tricks using platform specific APIs (like POSIX) though.

Daniel

--
Done: Arc-Bar-Cav-Sam-Val-Wiz, Dwa-Elf-Gno-Hum-Orc, Law-Neu-Cha, Fem-Mal
To go: Hea-Kni-Mon-Pri-Ran-Rog-Tou
Sep 7 '08 #2

P: n/a
>I'm new to C programming and I'm stuck.
>I want to find the size of a file.
First of all, you need to define what *IS* the size of a file.
It's not a trivial question.

Is it the amount of space it takes on disk, including partially unused
blocks? Does that include the size of a directory slot? Inode?
Is it the number of bytes you can read from the file in binary mode?
Is it the number of bytes you can read from the file in text mode? (This
is likely to differ from the above on Windows systems with \r\n line
endings).
Is it the number of bytes that the UNIX "ls -l" or Windows "DIR"
command returns for that file?

What bad things will happen if the size of the file changes between
the time you compute the size of the file and the time you use that
number (for what?)?
>I couldn't find a proper way of doing it.
Depending on your definition of the size of a file, opening the
file (in binary or text mode; which mode you use may affect the
answer), and reading and counting characters will work.
>What I was planning to do is...
>Open the requested file;
>the default position is the 1st byte.
>Now I use fseek() and move to the end of the file.
Problem: a binary file need not meaningfully support seeking
to the end of the file. In CP/M, for example, the size of a
binary file is a multiple of the block size, and it doesn't
keep track of how far into the last block you've written.
>Then I use ftell() to get the current position in the file.
Problem: For a text file, the position need not be a number
of anything. It could be a bitfield of sector, head, cylinder,
track, train, disk number, etc. that has no correlation to a number
of bytes.
>This should return the number of bytes (length / size) of the file.
>
>Is this a proper way to do it?
It's not portable under ANSI C.
>Is there any efficient way to do so?
There isn't a portable, efficient way to do so. stat(), if available,
might be more efficient than your method. Opening the file, reading
it, and counting bytes may be very inefficient.

Sep 7 '08 #3

P: n/a
On September 7, 2008 16:04, in comp.lang.c, Gordon Burditt
(go***********@burditt.org) wrote:
>>I'm new to C programming and I'm stuck.
>>I want to find the size of a file.
>
>First of all, you need to define what *IS* the size of a file.
>It's not a trivial question.
Agreed.
>Is it the amount of space it takes on disk, including partially unused
>blocks? Does that include the size of a directory slot? Inode?
>Is it the number of bytes you can read from the file in binary mode?
>Is it the number of bytes you can read from the file in text mode? (This
>is likely to differ from the above on Windows systems with \r\n line
>endings).
>Is it the number of bytes that the UNIX "ls -l" or Windows "DIR"
>command returns for that file?
Is it the count of the number of bytes stored in a sparse file (where there
are areas of the file with no data at all) or is it the "virtual" size of
the sparse file, where the intervening empty spots are presumed to have
data? (Note, at least in unix, the "ls" command shows the "virtual" size,
rather than a count of the real data written to a sparse file.)
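
On a POSIX system the two numbers can be compared with stat(). This is a rough sketch only: it assumes stat() and the st_blocks member exist (st_blocks is an XSI extension, and its unit is commonly, but not universally, 512 bytes). The helper name logical_and_allocated is invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Stores the logical ("ls -l") size in *logical and the number of
 * allocated blocks in *blocks. Returns 0 on success, -1 on failure.
 * For a sparse file, blocks times the block unit can be much smaller
 * than the logical size. */
int logical_and_allocated(const char *path, long long *logical,
                          long long *blocks)
{
    assert(path != NULL && logical != NULL && blocks != NULL);

    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    *logical = (long long)st.st_size;   /* "virtual" size in bytes */
    *blocks = (long long)st.st_blocks;  /* blocks actually allocated */
    return 0;
}
```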

[snip]
--
Lew Pitcher

Master Codewright & JOAT-in-training | Registered Linux User #112576
http://pitcher.digitalfreehold.ca/ | GPG public key available by request
---------- Slackware - Because I know what I'm doing. ------
Sep 7 '08 #4

P: n/a
Ashit Vora wrote:
>Hi,
>I'm new to C programming and I'm stuck.
>I want to find the size of a file.
>I couldn't find a proper way of doing it.
>What I was planning to do is...
>Open the requested file;
>the default position is the 1st byte.
>Now I use fseek() and move to the end of the file.
>Then I use ftell() to get the current position in the file.
>
>This should return the number of bytes (length / size) of the file.
>
>Is this a proper way to do it?
>Is there any efficient way to do so?
Unfortunately, this is problematic. There are two kinds of streams,
binary and text.

7.19.9.2p3 says "A binary stream need not meaningfully support fseek
calls with a whence value of SEEK_END".

7.19.9.4p2 says, with regard to the ftell() function, that "For a
text stream, its file position indicator contains unspecified
information ...; the difference between two such return values is not
necessarily a meaningful measure of the number of characters written
or read."

Therefore, no matter what kind of stream you have, either the fseek()
or the ftell() function is, at least in principle, unreliable for this
purpose. In practice, for many implementations they will all work
exactly as you expect. However, the only reasonably portable way to do
it is to open the file in binary mode, read it a character at a time,
keep count of how many characters have been read and wait for the end
of file. Even this technique won't work if the stream that you're
reading isn't really a file, but is a special device.
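
The read-and-count technique could be sketched roughly as follows. The function name count_bytes is hypothetical, and the caller is assumed to have opened the stream in binary mode ("rb") and positioned it at the start.

```c
#include <assert.h>
#include <stdio.h>

/* Counts the bytes readable from fp, from the current position to end
 * of file. Returns -1 if a read error occurred. */
long long count_bytes(FILE *fp)
{
    assert(fp != NULL);

    long long n = 0;
    while (getc(fp) != EOF)   /* getc returns EOF at end of file or on error */
        n++;
    return ferror(fp) ? -1 : n;
}
```

This reads the whole file, so it is slow for large files, but it relies only on standard stream behavior.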

There are more efficient ways of finding out the file size, but the
appropriate method is different for different operating systems. On
the systems I use most frequently, the relevant function is called
stat(). It might be a very different function on your system.
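
As an illustration of the platform-specific route, here is a minimal POSIX-only sketch using fstat() on an already open stream. The name file_size_fstat is invented, and the code assumes fileno() and fstat() are available, which is not the case on every system.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* POSIX only. Returns the size of the regular file behind fp, or -1 on
 * error or when fp does not refer to a regular file (pipe, terminal,
 * device). Output still sitting in the stdio buffer is not counted, so
 * call fflush() first if the stream has been written to. */
long long file_size_fstat(FILE *fp)
{
    assert(fp != NULL);

    struct stat st;
    if (fstat(fileno(fp), &st) != 0)
        return -1;
    if (!S_ISREG(st.st_mode))   /* st_size is only meaningful for regular files here */
        return -1;
    return (long long)st.st_size;
}
```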

Also, keep in mind that there are many different numbers that might be
described as the size of a file (the amount of space it takes up on
disk, the amount of data it contains, the compressed size, the
uncompressed size, etc.), and different operating systems may give you
access to numbers with different meanings. Also, if you're using a
file whose size might change for reasons outside of your control
during the time you're working on it, the value for the size that you
get at one time might be out-of-date by the time you actually use that
information.
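
One way to sidestep a stale size when the goal is to load the file into memory is not to ask for the size at all, and to grow the buffer while reading. A minimal sketch under that assumption follows; read_all is a made-up name and the 4096-byte starting capacity is arbitrary.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads from fp until end of file into a malloc'd buffer that doubles
 * in size as needed. On success stores the byte count in *len and
 * returns the buffer (the caller frees it). Returns NULL on a read or
 * allocation failure. */
unsigned char *read_all(FILE *fp, size_t *len)
{
    assert(fp != NULL && len != NULL);

    size_t cap = 4096, used = 0;
    unsigned char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        used += fread(buf + used, 1, cap - used, fp);
        if (used < cap)
            break;                      /* short read: end of file or error */
        if (cap > (size_t)-1 / 2) {     /* doubling would overflow size_t */
            free(buf);
            return NULL;
        }
        unsigned char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }
    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    *len = used;
    return buf;
}
```

The length reported is whatever was actually read, so a file that grows or shrinks beforehand cannot make the buffer too small.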
Sep 7 '08 #5

P: n/a
Daniel Kraft wrote, On 07/09/08 20:52:
>Ashit Vora wrote:
>>Hi,
>>I'm new to C programming and I'm stuck.
>>I want to find the size of a file.
>>I couldn't find a proper way of doing it.
>>What I was planning to do is...
>>Open the requested file;
>>the default position is the 1st byte.
>>Now I use fseek() and move to the end of the file.
>>Then I use ftell() to get the current position in the file.
>
>That's what I would do, too (and what was suggested to me when I had
>the same problem some time ago).
That is not a portable way to do it, and in any case it depends on what
you mean by the size of a file.
>I don't know of any other way (to do it portably). There may be
>"better" tricks using platform specific APIs (like POSIX) though.
I suggest both of you read question 19.12 of the comp.lang.c FAQ at
http://c-faq.com/ and the questions it links to. Also search this group
for all the long discussions about this topic. Then ask if you need
further information.
--
Flash Gordon
Sep 7 '08 #6

P: n/a
rio

"Gordon Burditt" <go***********@burditt.org> wrote in message
news:_-******************************@posted.internetamerica...
>>I'm new to C programming and I'm stuck.
>>I want to find the size of a file.
>
>What bad things will happen if the size of the file changes between
>the time you compute the size of the file and the time you use that
>number (for what?)?
Easy: I detect it and send an error message,
something like
a=getsize(file)
if((int)a<0) goto error;
b=malloc(a+128)
if(b==0) goto error;
r=getfile(b, file, a+120)
if(r==0) /* not get all the file until EOF */
goto error;

Sep 8 '08 #7

P: n/a
On 2008-09-08, rio <a@b.c> wrote:
>
>"Gordon Burditt" <go***********@burditt.org> wrote in message
>news:_-******************************@posted.internetamerica...
>>>I'm new to C programming and I'm stuck.
>>>I want to find the size of a file.
>>
>>What bad things will happen if the size of the file changes between
>>the time you compute the size of the file and the time you use that
>>number (for what?)?
>
>Easy: I detect it and send an error message,
It wasn't a challenge, it was a legitimate question about your
implementation - so not "easy" :-)
>something like
Whitespace? Fixed:
a = getsize(file)
if(a < 0)
    goto error;
b = malloc(a + 128)
if(b == 0)
    goto error;
r = getfile(b, file, a + 120)

if(r == 0) /* not get all the file until EOF */
    goto error;
Well, this code still doesn't quite make sense. Maybe if I knew
how getsize() and getfile() were defined it would.
--
Andrew Poelstra ap*******@wpsoftware.com
To email me, use the above email address with .com set to .net
Sep 8 '08 #8

P: n/a
On Mon, 8 Sep 2008 19:10:54 +0200, "rio" <a@b.c> wrote:
>
>"Gordon Burditt" <go***********@burditt.org> wrote in message
>news:_-******************************@posted.internetamerica...
>>>I'm new to C programming and I'm stuck.
>>>I want to find the size of a file.
>>
>>What bad things will happen if the size of the file changes between
>>the time you compute the size of the file and the time you use that
>>number (for what?)?
>
>Easy: I detect it and send an error message,
>something like
>a=getsize(file)
>if((int)a<0) goto error;
What is 'a' that you think casting it to an int will do any good?
>b=malloc(a+128)
>if(b==0) goto error;
>r=getfile(b, file, a+120)
>if(r==0) /* not get all the file until EOF */
>goto error;

--
Remove del for email
Sep 9 '08 #9

P: n/a
On Mon, 08 Sep 2008 23:04:41 GMT, Andrew Poelstra
<ap*******@supernova.home> wrote:
>On 2008-09-08, rio <a@b.c> wrote:
>>
>>"Gordon Burditt" <go***********@burditt.org> wrote in message
>>news:_-******************************@posted.internetamerica...
>>>>I'm new to C programming and I'm stuck.
>>>>I want to find the size of a file.
>>>
>>>What bad things will happen if the size of the file changes between
>>>the time you compute the size of the file and the time you use that
>>>number (for what?)?
>>
>>Easy: I detect it and send an error message,
>
>It wasn't a challenge, it was a legitimate question about your
>implementation - so not "easy" :-)
>>something like
>
>Whitespace? Fixed:
>a = getsize(file)
>if(a < 0)
It's one thing to reformat code to make it readable but you really
should include all the code from the original post. You left out a
cast (which admittedly makes no sense but it was in the original).
>    goto error;
>b = malloc(a + 128)
>if(b == 0)
>    goto error;
>r = getfile(b, file, a + 120)
>
>if(r == 0) /* not get all the file until EOF */
>    goto error;
>
>Well, this code still doesn't quite make sense. Maybe if I knew
>how getsize() and getfile() were defined it would.
--
Remove del for email
Sep 9 '08 #10

P: n/a
On 2008-09-09, Barry Schwarz <sc******@dqel.com> wrote:
>On Mon, 08 Sep 2008 23:04:41 GMT, Andrew Poelstra
><ap*******@supernova.home> wrote:
>>On 2008-09-08, rio <a@b.c> wrote:
>>>
>>>"Gordon Burditt" <go***********@burditt.org> wrote in message
>>>news:_-******************************@posted.internetamerica...
>>>>>I'm new to C programming and I'm stuck.
>>>>>I want to find the size of a file.
>>>>
>>>>What bad things will happen if the size of the file changes between
>>>>the time you compute the size of the file and the time you use that
>>>>number (for what?)?
>>>
>>>Easy: I detect it and send an error message,
>>
>>It wasn't a challenge, it was a legitimate question about your
>>implementation - so not "easy" :-)
>>>something like
>>
>>Whitespace? Fixed:
>>a = getsize(file)
>>if(a < 0)
>
>It's one thing to reformat code to make it readable but you really
>should include all the code from the original post. You left out a
>cast (which admittedly makes no sense but it was in the original).
Oops! Sorry about that.

The original code was
if((int) a < 0)

not
if(a < 0)

..

--
Andrew Poelstra ap*******@wpsoftware.com
To email me, use the above email address with .com set to .net
Sep 9 '08 #13

P: n/a
rio

"Andrew Poelstra" <ap*******@supernova.home> wrote in message
news:sl**********************@supernova.home...
>On 2008-09-08, rio <a@b.c> wrote:
>Well, this code still doesn't quite make sense. Maybe if I knew
>how getsize() and getfile() were defined it would.
Use your imagination ...
>
>--
>Andrew Poelstra ap*******@wpsoftware.com
>To email me, use the above email address with .com set to .net

Sep 9 '08 #14

P: n/a
On Sep 9, 2:04 am, Andrew Poelstra <apoels...@supernova.home> wrote:
>On 2008-09-08, rio <a...@b.c> wrote:
>>"Gordon Burditt" <gordonb.7m...@burditt.org> wrote in message
><snip>
>Well, this code still doesn't quite make sense. Maybe if I knew
>how getsize() and getfile() were defined it would.
rio is a troll that posts with many names here, just ignore him.
Gordon Burditt is also a troll, but he usually offers good advice. (He
continues to remove attribution lines even when told many times not to
do so.)
Sep 9 '08 #15

P: n/a
Ashit Vora wrote:
>Hi,
>I'm new to C programming and I'm stuck.
>I want to find the size of a file.
>I couldn't find a proper way of doing it.
>What I was planning to do is...
>Open the requested file;
>the default position is the 1st byte.
I call that position 0, just /before/ the first byte. If there are N bytes
in a file, there will be N+1 in-between positions from 0 to N (just beyond
the last byte).
>Now I use fseek() and move to the end of the file.
>Then I use ftell() to get the current position in the file.
>
>This should return the number of bytes (length / size) of the file.
This is exactly what I do. (I know about stat() but have never had much luck
with it.)

BUT, I only work with binary files that are not streams; I know my OS
stores the byte-size of my files; and I know the files I will work with are
stable in that their sizes won't change while working on them.
>Is this a proper way to do it?
Not according to this newsgroup, but if it works for you that might be
adequate.

--
Bartc

Sep 9 '08 #16

P: n/a
rio

"Barry Schwarz" <sc******@dqel.com> wrote in message
news:ps********************************@4ax.com...
>On Mon, 8 Sep 2008 19:10:54 +0200, "rio" <a@b.c> wrote:
>>
>>"Gordon Burditt" <go***********@burditt.org> wrote in message
>>news:_-******************************@posted.internetamerica...
>>>>I'm new to C programming and I'm stuck.
>>>>I want to find the size of a file.
>>>
>>>What bad things will happen if the size of the file changes between
>>>the time you compute the size of the file and the time you use that
>>>number (for what?)?
>>
>>Easy: I detect it and send an error message,
>>something like
>>a=getsize(file)
>>if((int)a<0) goto error;
>
>What is 'a' that you think casting it to an int will do any good?
It is an error if sizeof(Type(a))!=sizeof(int).
Beyond that, I like to use one bit of the value in "a" as an error
flag.
That means I cannot read files whose size needs the full "unsigned" range
of "a", but I have some way to prevent overflow.
>>b=malloc(a+128)
>>if(b==0) goto error;
>>r=getfile(b, file, a+120)
>>if(r==0) /* not get all the file until EOF */
>>goto error;


--
Remove del for email


Sep 10 '08 #17

P: n/a
"rio" <a@b.c> writes:
>"Barry Schwarz" <sc******@dqel.com> wrote in message
>news:ps********************************@4ax.com...
>>On Mon, 8 Sep 2008 19:10:54 +0200, "rio" <a@b.c> wrote:
>>>
>>>"Gordon Burditt" <go***********@burditt.org> wrote in message
>>>news:_-******************************@posted.internetamerica...
>>>>>I'm new to C programming and I'm stuck.
>>>>>I want to find the size of a file.
>>>>
>>>>What bad things will happen if the size of the file changes between
>>>>the time you compute the size of the file and the time you use that
>>>>number (for what?)?
>>>
>>>Easy: I detect it and send an error message,
>>>something like
>>>a=getsize(file)
>>>if((int)a<0) goto error;
>>
>>What is 'a' that you think casting it to an int will do any good?
>
>it is error if sizeof(Type(a))!=sizeof(int);
No it isn't. The comparison will be done by converting both operands
to int or by converting 0 to the same type as a depending on the
details. The cast can introduce an error, missing from plain a < 0,
if the value in a can't be converted to int.

By the way, you can write sizeof a != sizeof (int) -- there is no need
to invent a Type(a) pseudo-macro.

--
Ben.
Sep 10 '08 #18

P: n/a
rio wrote:
>"Barry Schwarz" <sc******@dqel.com> wrote in message
>news:ps********************************@4ax.com...
>>On Mon, 8 Sep 2008 19:10:54 +0200, "rio" <a@b.c> wrote:
>....
>>>a=getsize(file)
>>>if((int)a<0) goto error;
>>What is 'a' that you think casting it to an int will do any good?
>
>it is error if sizeof(Type(a))!=sizeof(int);
I don't know what Type(a) means, unless it's equivalent to the typeof()
feature that many implementations provide as an extension. If that's
what you mean, then I should hope you would realize that sizeof(a) will
give you exactly the same answer.

Calling it an error is ambiguous; do you mean that it is a syntax error,
a constraint violation, or has undefined behavior, or do you simply mean
that the result of the conversion will be incorrect?

This code is not a syntax error.

It can be a constraint violation if 'a' has a struct, union, or array
type; but it would still be a constraint violation in those cases
without the cast.

The conversion has undefined behavior if the result of the conversion would
otherwise be a value which cannot be represented by an int. However, as
far as I can see that's a reason for NOT using the cast. You were asked
to give a reason FOR using the cast.

The expression "(int)a" has defined behavior and a defined value which
is different from the value of 'a' itself, only if a has either a
pointer type, or a floating point type and a value within the range of int.

If 'a' has a pointer type, when I say "defined behavior", I mean
"implementation-defined", not "standard-defined", and even then only in
some cases. The expression (int)a<0 is not portable. The expression
(intptr_t)a would be safer, because it will either be an error (by
reason of intptr_t not being defined), or have an implementation-defined
meaning and value. However, the expression (intptr_t)a<0 still has no
portable meaning.

I wouldn't expect that the expression "getsize(file)" would return a
floating point value; though you did say "use your imagination"; this is
the only case I can imagine where the "(int)" cast makes a meaningful
difference. If 'a' has a value which is negative but greater than -1.0,
then "a<0" will be true, but "(int)a < 0" will be false. Otherwise the
cast doesn't change anything. Is that what you're trying to do? If so,
"a<=-1.0" is a clearer way of expressing your intent.

Regardless of which meaning you're using for "error",
sizeof(a)!=sizeof(int) is NOT the correct condition for identifying the
problem.
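
The floating-point case described in this post can be checked with a small sketch. The function name cast_changes_result is hypothetical, and the assertion restricts the input to values that fit in int so that the conversion is well defined.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 when "a < 0" and "(int)a < 0" disagree, 0 when they agree. */
int cast_changes_result(double a)
{
    /* Precondition: a lies within the range of int (this also rejects
     * NaN), so converting it to int is defined. */
    assert(a >= INT_MIN && a <= INT_MAX);

    int plain = a < 0;            /* compares the double itself */
    int with_cast = (int)a < 0;   /* truncates toward zero first */
    return plain != with_cast;
}
```

For a value such as -0.5 the two tests disagree, because the cast truncates it to 0; for -1.5 or 3.7 they agree.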
Sep 10 '08 #19

P: n/a
James Kuyper <ja*********@verizon.net> writes:
[...]
>The conversion has undefined behavior if the result of the conversion
>would otherwise be a value which cannot be represented by an
>int. However, as far as I can see that's a reason for NOT using the
>cast. You were asked to give a reason FOR using the cast.
[...]

Quibble: A conversion to int yields an implementation-defined value
(or raises an implementation-defined signal) if the value is out of
range.

Which is still an excellent reason not to use the cast.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Sep 10 '08 #20

P: n/a
Keith Thompson wrote:
>James Kuyper <ja*********@verizon.net> writes:
>[...]
>>The conversion has undefined behavior if the result of the conversion
>>would otherwise be a value which cannot be represented by an
>>int. However, as far as I can see that's a reason for NOT using the
>>cast. You were asked to give a reason FOR using the cast.
>[...]
>
>Quibble: A conversion to int yields an implementation-defined value
>(or raises an implementation-defined signal) if the value is out of
>range.
You're right - it's not actually undefined behavior. However, because
the signal being raised is implementation-defined, there's no way to
write strictly conforming code which handles it. So this is the next-
worst thing to undefined behavior.
>Which is still an excellent reason not to use the cast.
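
If a wider integer value really does have to end up in an int, one way to avoid the implementation-defined result is to test the range before converting. A minimal sketch, with to_int_checked as an invented name and long long assumed as the wider type:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Stores a in *out and returns 1 when a fits in int. Returns 0 and
 * leaves *out untouched when it does not. */
int to_int_checked(long long a, int *out)
{
    assert(out != NULL);

    if (a < INT_MIN || a > INT_MAX)
        return 0;        /* out of range: do not convert */
    *out = (int)a;       /* in range, so the converted value is exact */
    return 1;
}
```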
Sep 10 '08 #21
