
File size


Hello all,

I recall several threads over the years about how reading file size
cannot be done consistently or portably, but I don't remember any good
reasons (not that I haven't read them, I'm sure, but it's more of a
failure to hold them in my brain). Here is an attempt that I was
commissioned to write, and I'd appreciate any comments and/or criticism
(specific or general) before I release it to my customer. Thanks!

Andrew

/* begin filesize.c */
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

int main ( int argc, char *argv [] )
{
    int status;

    if ( 2 != argc )
    {
        fprintf ( stderr, "Usage: %s <filename>\n", argv [ 0 ] );
        status = EXIT_FAILURE;
    }
    else
    {
        FILE *fp;

        fp = fopen ( argv [ 1 ], "rb" );
        if ( !fp )
        {
            fprintf ( stderr, "Cannot open file \"%s\" for reading. [Error %d]\n",
                      argv [ 1 ], errno );
            status = EXIT_FAILURE;
        }
        else
        {
            char c;
            long unsigned size = 0;

            while ( fread ( &c, sizeof c, 1, fp ) )
            {
                size++;
            }

            fclose ( fp );
            printf ( "Size of file (in bytes): %lu\n", size );
#if 0
            if ( size > 1 << 10 )
            {
                /*** 1 KB ***/
                printf ( "[%lu KB]\n", size / ( 1 << 10 ) );
            }
#endif

            status = EXIT_SUCCESS;
        }
    }

#if 0
    printf ( "Returning status code: %d\n", status );
#endif
    return status;
}

/* end filesize.c */

Nov 14 '05 #1
6 Replies


"Andrew Clark" <an*****@syr.edu> wrote in message
news:Xn********************@208.33.61.211...

Hello all,

I recall several threads over the years about how reading file size
cannot be done consistently or portably, but I don't remember any good
reasons (not that I haven't read them, I'm sure, but it's more of a
failure to hold them in my brain).
You don't have to know facts, you merely have to know where to find them. ;)
Here is an attempt that I was commissioned to write, and I'd appreciate
any comments and/or criticism (specific or general) before I release it to
my customer. Thanks!
If it's commercial code, what's preventing you from using some POSIX or other non-standard
but _much_ more practical method?
/* begin filesize.c */
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

int main ( int argc, char *argv [] )
{
int status;

if ( 2 != argc )
{
fprintf ( stderr, "Usage: %s <filename>\n", argv [ 0 ] );
argv[0] may be "" or even NULL.
status = EXIT_FAILURE;
}
else
{
FILE *fp;

fp = fopen ( argv [ 1 ], "rb" );
if ( !fp )
{
fprintf ( stderr, "Cannot open file \"%s\" for reading. [Error %d]\n", argv [ 1 ], errno );
status = EXIT_FAILURE;
}
else
{
char c;
long unsigned size = 0;
Files can be larger than unsigned long can represent. Hence the presence of fpos_t. You
should check for overflow of size. [Using two unsigned longs (64+ bits) should be enough
for the next couple of years though.]

while ( fread ( &c, sizeof c, 1, fp ) )
getchar() is likely to be much more efficient. Faster still is reading in larger chunks,
say 1024 bytes at a time.

But why bother reading the file at all? Why not just keep fseek-ing by large amounts?
{
size++;
}

You don't check for read errors. You may be misreporting the size.

Actually, even if you read the file without error, you have no guarantee that the file
will be the same size on subsequent reading.

Note that neither reading nor fseeking will work for streams which can't be rewound.
fclose ( fp );
printf ( "Size of file (in bytes): %lu\n", size );
#if 0
if ( size > 1 << 10 )
1024 is clearer to me than 1 << 10.
{
/*** 1 KB ***/
printf ( "[%lu KB]\n", size / ( 1 << 10 ) );
}
#endif

status = EXIT_SUCCESS;
}
}

#if 0
printf ( "Returning status code: %d\n", status );
#endif
return status;
}

/* end filesize.c */
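
To illustrate the points above about reading in larger chunks, checking for read errors, and checking for overflow of the size, here is a minimal sketch in standard C. It is not from the original post: the helper name count_bytes and the 1024-byte buffer are assumptions chosen for illustration.

```c
#include <stdio.h>
#include <limits.h>

/* Hypothetical helper (not from the original post): count the bytes in an
   open binary stream by reading it in 1024-byte chunks.
   Returns 0 and stores the count in *size on success.
   Returns -1 on a read error or if the count would overflow unsigned long. */
int count_bytes(FILE *fp, unsigned long *size)
{
    unsigned char buf[1024];
    unsigned long total = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
    {
        if (n > ULONG_MAX - total)
            return -1;          /* total would overflow unsigned long */
        total += n;
    }

    if (ferror(fp))
        return -1;              /* the loop stopped on an error, not end of file */

    *size = total;
    return 0;
}
```

A caller would open the file with "rb", pass the stream to count_bytes(), and report a failure instead of a size when it returns -1. As noted above, the file may still change size between this call and any later read.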


--
Peter
Nov 14 '05 #2

In 'comp.lang.c', "Peter Nilsson" <ai***@acay.com.au> wrote:
while ( fread ( &c, sizeof c, 1, fp ) )


getchar() is likely to be much more efficient. Faster still is reading
in larger chunks, say 1024 bytes at a time.


You meant fgetc(), of course...

--
-ed- get my email here: http://marreduspam.com/ad672570
The C-language FAQ: http://www.eskimo.com/~scs/C-faq/top.html
C-reference: http://www.dinkumware.com/manuals/reader.aspx?lib=c99
FAQ de f.c.l.c : http://www.isty-info.uvsq.fr/~rumeau/fclc/
Nov 14 '05 #3

"Emmanuel Delahaye" <em**********@noos.fr> wrote in message
news:Xn***************************@212.27.42.73...
In 'comp.lang.c', "Peter Nilsson" <ai***@acay.com.au> wrote:
while ( fread ( &c, sizeof c, 1, fp ) )


getchar() is likely to be much more efficient. Faster still is reading
in larger chunks, say 1024 bytes at a time.


You meant fgetc(), of course...


Actually, I meant getc(), but the gist is the same. :-)

--
Peter
Nov 14 '05 #4


"Peter Nilsson" <ai***@acay.com.au> wrote in message
If it's commercial code, what's preventing you from using some
POSIX or other non-standard
but _much_ more practical method?
Maybe the customer wants portable ANSI C, or maybe this has been used as a
selling point.
Files can be larger than unsigned long can represent. Hence the
presence of fpos_t. You should check for overflow of size. [Using
two unsigned longs (64+ bits) should be enough
for the next couple of years though.]
And then we've all the fun of writing an int_to_ascii function, because two
longs can't be passed to printf().
But why bother reading the file at all? Why not just keep fseek-ing by large amounts?

Unfortunately, if the file is text then fseek() / ftell() may not represent
the size. Here it is binary, so we don't have that problem, but there is
another issue with fseek() and ftell() not necessarily reporting the end of
the file.
That said, fseek()ing to the end of the file and calling ftell() is a good
enough method for most practical purposes.
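
A minimal sketch of that fseek() / ftell() approach, assuming the stream was opened in binary mode ("rb") and is seekable. The helper name file_size_by_seek is hypothetical and not from the thread.

```c
#include <stdio.h>

/* Hypothetical helper (not from the thread): report the size of a seekable
   binary stream by seeking to its end and asking for the position.
   Returns the size in bytes, or -1L on failure.
   Caveats: the C standard does not require binary streams to support
   SEEK_END meaningfully, and long may be too narrow for very large files. */
long file_size_by_seek(FILE *fp)
{
    long start, size;

    start = ftell(fp);                     /* remember the current position */
    if (start == -1L)
        return -1L;

    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1L;

    size = ftell(fp);                      /* -1L here also signals failure */

    if (fseek(fp, start, SEEK_SET) != 0)   /* restore the original position */
        return -1L;

    return size;
}
```

The helper restores the original position so the caller can keep reading from where it left off. On a text stream the value returned by ftell() is only meaningful as an argument to fseek(), which is why the sketch is limited to binary streams.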
Nov 14 '05 #5

"Malcolm" <ma*****@55bank.freeserve.co.uk> wrote in message
news:c9**********@newsg3.svr.pol.co.uk...

"Peter Nilsson" <ai***@acay.com.au> wrote in message
If it's commercial code, what's preventing you from using some
POSIX or other non-standard
but _much_ more practical method?


Maybe the customer wants portable ANSI C, ...


Why would a client want ANSI C for _this_ task? What's stopping a programmer from
informing their client of better options?
Files can be larger than unsigned long can represent. Hence the
presence of fpos_t. You should check for overflow of size. [Using
two unsigned longs (64+ bits) should be enough
for the next couple of years though.]


And then we've all the fun of writing an int_to_ascii function, because two
longs can't be passed to printf().


Here's one to get you started...

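/* Convert the two-word value hi:lo (hi is the more significant word)
   to a decimal string in s, and return s. */
char *ul2toa(char *s, unsigned long hi, unsigned long lo)
{
    char *u, *v = s;
    unsigned long q, r, d;

    /* Long division of hi:lo by 10, emitting the least significant digit first. */
    while (hi)
    {
        r = hi % 10;
        hi = hi / 10;

        /* Carry the remainder of hi into the upper part of lo
           (everything except its low 16 bits). */
        d = (lo >> 16) + r * ((-1ul >> 16) + 1);
        r = d % 10;
        q = d / 10;

        /* Carry that remainder into the low 16 bits of lo. */
        lo = (r << 16) + (lo & 0xFFFF);
        r = lo % 10;
        lo = lo / 10 + (q << 16);

        *v++ = '0' + r;
    }

    /* hi is now zero; emit the remaining digits of lo. */
    do
    {
        *v++ = '0' + (lo % 10);
    }
    while (lo /= 10);

    /* Terminate the string, then reverse the digits in place. */
    *v = 0;
    for (u = s; u < --v; u++)
    { char c = *u; *u = *v; *v = c; }

    return s;
}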
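
To produce the hi and lo values that ul2toa() expects, the byte counter itself has to carry from one word into the other. A minimal sketch of such a counter follows; the struct big_count and the function big_count_add are hypothetical names, not part of the thread.

```c
#include <limits.h>

/* Hypothetical two-word counter (not from the thread): the value is
   hi * (ULONG_MAX + 1) + lo, matching the hi/lo arguments of ul2toa(). */
struct big_count
{
    unsigned long hi;
    unsigned long lo;
};

/* Add n to the counter.
   Returns 0 on success, or -1 (leaving the counter unchanged) if the sum
   does not fit in two unsigned longs. */
int big_count_add(struct big_count *c, unsigned long n)
{
    if (n > ULONG_MAX - c->lo)      /* adding n would wrap lo */
    {
        if (c->hi == ULONG_MAX)
            return -1;              /* no room left to carry into hi */
        c->hi++;
    }

    c->lo += n;                     /* unsigned arithmetic wraps as intended */
    return 0;
}
```

A reading loop would call big_count_add() with the number of bytes returned by each fread(), then pass the final hi and lo to ul2toa() with a buffer large enough for the decimal digits of a two-word value.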

--
Peter
Nov 14 '05 #6

"Peter Nilsson" <ai***@acay.com.au> wrote in message

Why would a client want ANSI C for _this_ task [file size]?
What's stopping a programmer from
informing their client of better options?
Let's say that the customer wants a suite of programs that use the stdin /
stdout model. Let's say that some are very processor intensive and they are
constantly investing in the latest hardware. If they have any sense they
will specify that all programs must be written in portable ANSI C, and
recompile the whole lot if new kit arrives.

Nov 14 '05 #7
