
Programming in standard c

In my "Happy Christmas" message, I proposed a function to read
a file into a RAM buffer and return that buffer or NULL if
the file doesn't exist or some other error is found.

It is interesting to see that the answers to that message prove that
programming exclusively in standard C is completely impossible even
for a small and ridiculously simple program like the one I proposed.

1) I read the file contents in binary mode, which should allow me
to use ftell/fseek to determine the file size.

No objections to this were raised, except of course the obvious
one: it fails if the "file" is actually a device associated with stdin,
for instance /dev/tty01 or similar on a Unix machine...

I did not test for this since it is impossible in standard C:
isatty() is not in the standard.

2) There is NO portable way to determine which characters should be
ignored when transforming a binary file into a text file. One
reader (CB Falconer) proposed to open the file in binary mode
and then in text mode and compare the two buffers to see which
characters were missing... Well, that would be too expensive.

3) I used errno values defined by POSIX but not by the C standard,
   which defines only a few. Again, error handling is apparently not
   something important enough to be standardized, according to the
   committee. errno is there, but its usage is not portable at all
   and immediately goes beyond what standard C offers.

We hear again and again that this group is about standard C *"ONLY"*.
Could someone here then, tell me how this simple program could be
written in standard C?

This confirms my arguments about the need to improve the quality
of the standard library!

You can't do *anything* in just standard C.
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 26 '07 #1
In article <fk**********@aioe.org>, jacob navia <ja***@nospam.orgwrote:
>In my "Happy Christmas" message, I proposed a function to read
a file into a RAM buffer and return that buffer or NULL if
the file doesn't exist or some other error is found.
>1 I read the file contents in binary mode, what should allow me
to use ftell/fseek to determine the file size.
No objections to this were raised, except of course the obvious
one, if the "file" was some file associated with stdin, for
instance under some unix machine /dev/tty01 or similar...
Binary streams may be read back with an indeterminate number of extra
null bytes (binary zeros) appended at the end.
--
"History is a pile of debris" -- Laurie Anderson
Dec 26 '07 #2
In article <fk**********@aioe.org>, jacob navia <ja***@nospam.orgwrote:
>In my "Happy Christmas" message, I proposed a function to read
a file into a RAM buffer and return that buffer or NULL if
the file doesn't exist or some other error is found.
>2) There is NO portable way to determine which characters should be
ignored when transforming a binary file into a text file. One
reader (CB Falconer) proposed to open the file in binary mode
and then in text mode and compare the two buffers to see which
characters were missing... Well, that would be too expensive.
tmpnam() to get a temporary file name; write something short to the file
in one mode, freopen() the file in the other mode (making sure
you don't clobber the contents), then see what is read in.
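
A rough sketch of that probing idea (text_mode_differs is just a hypothetical
name, and tmpnam has well-known weaknesses, so treat this purely as an
illustration):

#include <stdio.h>
#include <string.h>

/* Sketch: write a known byte sequence in binary mode, reread the same file
   in text mode, and report whether the text-mode view differs.
   Returns 1 if it differs, 0 if not, -1 on failure. */
int text_mode_differs(void)
{
    static const char probe[] = "a\r\nb\x1a" "c";
    char name[L_tmpnam];
    char buf[sizeof probe + 8];
    FILE *f;
    size_t n;

    if (tmpnam(name) == NULL)
        return -1;
    f = fopen(name, "wb");
    if (f == NULL)
        return -1;
    fwrite(probe, 1, sizeof probe - 1, f);
    f = freopen(name, "r", f);      /* closes, then reopens the same file in text mode */
    if (f == NULL) {
        remove(name);
        return -1;
    }
    n = fread(buf, 1, sizeof buf, f);
    fclose(f);
    remove(name);
    return n != sizeof probe - 1 || memcmp(buf, probe, n) != 0;
}
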
--
"I was very young in those days, but I was also rather dim."
-- Christopher Priest
Dec 26 '07 #3
jacob navia wrote:
In my "Happy Christmas" message, I proposed a function to read
a file into a RAM buffer and return that buffer or NULL if
the file doesn't exist or some other error is found.

It is interesting to see that the answers to that message prove that
programming exclusively in standard C is completely impossible even
for a small and ridiculously simple program like the one I proposed.
The most portable way to determine the size of the buffer needed to
store a stream that might have been opened in text mode is to use
repeated calls to fread() until the end of file has been reached,
keeping track of the total number of bytes read, and reallocating as
you go along. Sure, it's inefficient; but for streams which
correspond to devices, rather than files, there's really no
alternative. Even for streams which correspond to actual files, there
are real OSs where there's no more efficient method of finding the
length of the file. If you don't care about portability to those
platforms, there's no reason why you can't use OS-specific techniques
for determining the file length.
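
A minimal sketch of that approach, assuming a hypothetical helper called
readstream and an arbitrary chunk size:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Sketch: read a stream to end-of-file, growing the buffer as we go.
   Returns a malloc'd buffer and stores the byte count in *len, or NULL
   on a read or allocation error (an empty stream also yields NULL here;
   a fuller version would distinguish that case). */
char *readstream(FILE *in, size_t *len)
{
    char *buf = NULL;
    size_t used = 0, cap = 0;
    char chunk[4096];
    size_t n;

    while ((n = fread(chunk, 1, sizeof chunk, in)) > 0) {
        if (used + n > cap) {
            size_t newcap = cap ? cap * 2 : sizeof chunk;
            char *tmp;
            while (newcap < used + n)
                newcap *= 2;
            tmp = realloc(buf, newcap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = newcap;
        }
        memcpy(buf + used, chunk, n);
        used += n;
    }
    if (ferror(in)) {
        free(buf);
        return NULL;
    }
    *len = used;
    return buf;
}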

If you use that approach, there's no need to worry about how lines are
terminated for any particular implementation, whether with CR, LF,
CRLF, LFCR, or by padding the lines with null characters to the end of
a fixed maximum line size, or any other method that an implementation
finds convenient.
3) I used different values for errno defined by POSIX, but not by
I saw no reason why, at any point, your program should set errno to
any value other than whichever one the underlying C standard library
functions would have already set it to. If fopen() doesn't already set
ENOENT, why should strfromfile()? If fseek() doesn't already set EIO,
why should strfromfile()? If malloc() doesn't already set ENOMEM, why
should strfromfile()? Don't make things harder than they have to be;
even if strfromfile() were standardized, I'd still recommend that the
implementor piggy-back off the lower level functions it calls, rather
than setting its own error codes.
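
As a sketch of that piggy-backing style (strfromfile is just the name proposed
for the function under discussion; whether fopen or the allocation functions
set errno at all is implementation-specific, which is rather the point):

#include <stdio.h>

char *readstream(FILE *in, size_t *len);   /* the fread/realloc loop sketched above */

/* Sketch: report failure with NULL and simply leave errno at whatever
   value the failing library call left it; set no error codes ourselves. */
char *strfromfile(const char *name, size_t *len)
{
    FILE *f = fopen(name, "rb");
    char *buf;

    if (f == NULL)
        return NULL;            /* errno, if set at all, came from fopen */
    buf = readstream(f, len);
    fclose(f);
    return buf;                 /* NULL here would mean the read or an allocation failed */
}
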
Dec 26 '07 #4
jacob navia wrote:
[...]
You can't do *anything* in just standard C.
Then why do you bother with this newsgroup? Why do
you waste your time on a powerless language? Why don't
you go away and become a regular on comp.lang.mumps or
comp.lang.apl or any newsgroup devoted to a language you
consider more useful than C? Since C has zero utility
(in your stated estimation), even comp.lang.cobol would
be a forum of more value. Go! Spend your talent on
something more useful than the torment of us poor old
dinosaurs! Go!

--
Eric Sosman
es*****@ieee-dot-org.invalid
Dec 26 '07 #5
Eric Sosman <es*****@ieee-dot-org.invalidwrites:
jacob navia wrote:
>[...]
You can't do *anything* in just standard C.

Then why do you bother with this newsgroup? Why do
you waste your time on a powerless language? Why don't
you go away and become a regular on comp.lang.mumps or
comp.lang.apl or any newsgroup devoted to a language you
consider more useful than C? Since C has zero utility
(in your stated estimation), even comp.lang.cobol would
No. Real C has lots of uses. You know: the reason this group was
founded. For C programmers to discuss real-world C situations.
be a forum of more value. Go! Spend your talent on
something more useful than the torment of us poor old
dinosaurs! Go!
Dec 26 '07 #6
jacob navia wrote, On 26/12/07 20:54:
In my "Happy Christmas" message, I proposed a function to read
a file into a RAM buffer and return that buffer or NULL if
the file doesn't exist or some other error is found.

It is interesting to see that the answers to that message prove that
programming exclusively in standard C is completely impossible even
for a small and ridiculously simple program like the one I proposed.
Incorrect. It was shown that YOU failed to do it, not that it cannot be
done. A simple modification of Chuck's ggets would probably do the job
(remove the termination on EOL and let it run to EOF instead).
1 I read the file contents in binary mode, what should allow me
to use ftell/fseek to determine the file size.

No objections to this were raised, except of course the obvious
one, if the "file" was some file associated with stdin, for
instance under some unix machine /dev/tty01 or similar...
Incorrect, I also pointed out that it would fail on WINDOWS with a ^Z
terminated text file and that the C standard does not guarantee that you
can seek to the end of a binary file at all.
I did not test for this since it is impossible in standard C:
isatty() is not in the standard.
There are ways to achieve the stated aim that do not involve finding the
length in advance.
2) There is NO portable way to determine which characters should be
ignored when transforming a binary file into a text file. One
True but irrelevant to solving the problem properly.
reader (CB Falconer) proposed to open the file in binary mode
and then in text mode and compare the two buffers to see which
characters were missing... Well, that would be too expensive.
So use the simple method, i.e. the one C provides, and open the file in
text mode.
3) I used different values for errno defined by POSIX, but not by
the C standard, that defines only a few. Again, error handling
is not something important to be standardized, according to
the committee. errno is there but its usage is absolutely
not portable at all and goes immediately beyond what standard C
offers.
No, *your* usage of it is not portable.
We hear again and again that this group is about standard C *"ONLY"*.
Could someone here then, tell me how this simple program could be
written in standard C?
See above.
This confirms my arguments about the need to improve the quality
of the standard library!
No, it shows that you do not know how to use the standard C library to
solve your problem and that is a different thing entirely.
You can't do *anything* in just standard C.
Oh, *I* can do things in standard C, I can also do things using
extensions where it is appropriate, and I can mix the two as I choose.
The same seems to apply to some of the others here.

Now, personally I cannot be bothered to modify Chuck's code to solve
your problem, however any competent C programmer should find the task
easy enough. I could also write a function to do it myself easily.
However, I'm fundamentally lazy and cannot be bothered to dig out
Chuck's code and modify it to meet a need I do not have.
--
Flash Gordon
Dec 26 '07 #7
Eric Sosman wrote:
jacob navia wrote:
>[...]
You can't do *anything* in just standard C.

Then why do you bother with this newsgroup? Why do
you waste your time on a powerless language? Why don't
you go away and become a regular on comp.lang.mumps or
comp.lang.apl or any newsgroup devoted to a language you
consider more useful than C? Since C has zero utility
(in your stated estimation), even comp.lang.cobol would
be a forum of more value. Go! Spend your talent on
something more useful than the torment of us poor old
dinosaurs! Go!
Stop whining and see the sentence in my message:
<quote>
This confirms my arguments about the need to improve the quality
of the standard library!
<end quote>

The solution for me is to improve what is there.

For you is just "go away" "go to another language" and similar
nonsense.

Every time I point out something that needs to be improved
the regulars are unable to put any coherent argumentation.
look at that:
Why don't
you go away and become a regular on comp.lang.mumps or
comp.lang.apl or any newsgroup devoted to a language you
consider more useful than C? Since C has zero utility
(in your stated estimation), even comp.lang.cobol would
be a forum of more value. Go! Spend your talent on
something more useful than the torment of us poor old
dinosaurs! Go!
Not even the shadow of an argumentation of WHY things
should STAY as they are.
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 26 '07 #8
On Dec 26, 3:54 pm, jacob navia <ja...@nospam.comwrote:
In my "Happy Christmas" message, I proposed a function to read
a file into a RAM buffer and return that buffer or NULL if
the file doesn't exist or some other error is found.

It is interesting to see that the answers to that message prove that
programming exclusively in standard C is completely impossible even
for a small and ridiculously simple program like the one I proposed.

1 I read the file contents in binary mode, what should allow me
to use ftell/fseek to determine the file size.

No objections to this were raised, except of course the obvious
one, if the "file" was some file associated with stdin, for
instance under some unix machine /dev/tty01 or similar...

I did not test for this since it is impossible in standard C:
isatty() is not in the standard.

2) There is NO portable way to determine which characters should be
ignored when transforming a binary file into a text file. One
reader (CB Falconer) proposed to open the file in binary mode
and then in text mode and compare the two buffers to see which
characters were missing... Well, that would be too expensive.
Would building the buffer as you go be too expensive too? Personally,
I was wondering why you went to the trouble of even *trying* to
transform a binary file into a text file. Why not simply take a FILE
pointer and let the caller worry about the file mode? Then instead of
the clearly non-portable method of getting a file size, use a
reasonably efficient buffering technique and read the file
sequentially. That solves issues 1 and 2 quite nicely.
3) I used different values for errno defined by POSIX, but not by
the C standard, that defines only a few. Again, error handling
is not something important to be standardized, according to
the committee. errno is there but its usage is absolutely
not portable at all and goes immediately beyond what standard C
offers.
Then don't use it? It's not like you're barred from writing your own
error handling mechanism, and fortunately you're also not forced to
follow the conventions of the standard library. I see a lot of self-
imposed restrictions in that attempt (no offense), and if you remove
them, I don't think your current claim holds water.
We hear again and again that this group is about standard C *"ONLY"*.
Could someone here then, tell me how this simple program could be
written in standard C?
Here's a prototype:

#include <stdio.h>
#include <stdlib.h>

char *readall ( FILE *in, int *n )
{
    char *result = NULL;
    size_t size = 0;
    int curr = 0;
    int ch;

    while ( ( ch = fgetc ( in ) ) != EOF ) {
        if ( curr == size ) {
            char *save = realloc ( result, size + BUFSIZ + 1 );

            if ( save == NULL )
                break;

            result = save;
            size += BUFSIZ;
        }

        result[curr++] = (char)ch;
    }

    *n = curr;

    return result;
}

It's fairly naive for a start, but with more detailed requirements I
still don't see how it can't be done in standard C.
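
Given the includes above, a caller might use it along these lines (the file
name is made up, and the caller decides the open mode):

int main(void)
{
    FILE *in = fopen("example.txt", "r");   /* caller picks the mode, text or binary */
    char *data;
    int n;

    if (in == NULL)
        return EXIT_FAILURE;
    data = readall(in, &n);
    fclose(in);
    if (data != NULL) {
        printf("read %d bytes\n", n);
        free(data);
    }
    return 0;
}
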
This confirms my arguments about the need to improve the quality
of the standard library!
I agree that the standard library needs work, but I don't agree that
this is one of the things that it needs. You'll need to convince me a
little more.
You can't do *anything* in just standard C.
Excepting any silly mistakes in my code, I'd say that your claim is
false.
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatiquehttp://www.cs.virginia.edu/~lcc-win32
Dec 26 '07 #9
jacob navia wrote:
Eric Sosman wrote:
>jacob navia wrote:
>>[...]
You can't do *anything* in just standard C.

Then why do you bother with this newsgroup? Why do
you waste your time on a powerless language? Why don't
you go away and become a regular on comp.lang.mumps or
comp.lang.apl or any newsgroup devoted to a language you
consider more useful than C? Since C has zero utility
(in your stated estimation), even comp.lang.cobol would
be a forum of more value. Go! Spend your talent on
something more useful than the torment of us poor old
dinosaurs! Go!

Stop whining and see the sentence in my message:
<quote>
This confirms my arguments about the need to improve the quality
of the standard library!
<end quote>
You wrote: "You can't do *anything* in just standard C."
Do you stand by that statement, or do you retreat from it?
If you stand by it, why are you here?

--
Eric Sosman
es*****@ieee-dot-org.invalid
Dec 26 '07 #10
ja*********@verizon.net wrote:
jacob navia wrote:
>In my "Happy Christmas" message, I proposed a function to read
a file into a RAM buffer and return that buffer or NULL if
the file doesn't exist or some other error is found.

It is interesting to see that the answers to that message prove that
programming exclusively in standard C is completely impossible even
for a small and ridiculously simple program like the one I proposed.

The most portable way to determine the size of the buffer needed to
store a stream that might have been opened in text mode is to use
repeated calls to fread() until the end of file has been reached,
keeping track of the total number of bytes read, and reallocating as
you go along. Sure, it's inefficient;
er... YES!
but for streams which
correspond to devices, rather than files, there's really no
alternative.
We could restrict this to normal files.
Even for streams which correspond to actual files, there
are real OSs where there's no more efficient method of finding the
length of the file.
I just can't imagine a file system that doesn't provide a way
of knowing the length of a file. Maybe there is SOMEWHERE in
the world a crazy file system like that but why should we
care about it?
If you don't care about portability to those
platforms, there's no reason why you can't use OS-specific techniques
for determining the file length.
Much simpler would be if we had

size_t filesize(FILE *);

isn't it?

I.e. the standard would abstract away from the programmer all the
details of how to do this quite ELEMENTARY operation!

If there are file systems where there is no way to know that besides
by reading the whole file, then THOSE SYSTEMS would be forced to do that
not everyone!

If you use that approach, there's no need to worry about how lines are
terminated for any particular implementation, whether with CR, LF,
CRLF, LFCR, or by padding the lines with null characters to the end of
a fixed maximum line size, or any other method that an implementation
finds convenient.
THAT could be left to the implementation of "filesize" and we
could rely on a portable way of doing that. The objective of this
message is precisely to point to the lack of functionality in
the standard library.
>3) I used different values for errno defined by POSIX, but not by

I saw no reason why, at any point, your program should set errno to
any value other than whichever one the underlying C standard library
functions would have already set it to. If fopen() doesn't already set
ENOENT, why should strfromfile()? If fseek() doesn't already set EIO,
why should strfromfile()? If malloc() doesn't already set ENOMEM, why
should strfromfile()? Don't make things harder than they have to be;
even if strfromfile() were standardized, I'd still recommend that the
implementor piggy-back off the lower level functions it calls, rather
than setting it's own error codes.
I *had* to do this since there is NOW no guarantee that the lower level
functions do this. I was programming around a limitation of standard C.

I do not understand why the committee ignores all the work done by POSIX
and incorporates their error analysis into standard C for the low level
functions like fread/fopen, etc.

Why refuse to do a more sophisticated error analysis than just testing
for NULL?

I cite the standard for the fopen function:
Returns
The fopen function returns a pointer to the object controlling the
stream. If the open operation fails, fopen returns a null pointer.

Not a SINGLE WORD of error analysis more sophisticated than

"it failed".

This is really level ZERO of error analysis.

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 26 '07 #11
On Dec 26, 2:01 pm, jacob navia <ja...@nospam.com> wrote:
Eric Sosman wrote:
jacob navia wrote:
[...]
You can't do *anything* in just standard C.
    Then why do you bother with this newsgroup? Why do
you waste your time on a powerless language? Why don't
you go away and become a regular on comp.lang.mumps or
comp.lang.apl or any newsgroup devoted to a language you
consider more useful than C? Since C has zero utility
(in your stated estimation), even comp.lang.cobol would
be a forum of more value. Go! Spend your talent on
something more useful than the torment of us poor old
dinosaurs! Go!

Stop whining and see the sentence in my message:
<quote>
This confirms my arguments about the need to improve the quality
of the standard library!
<end quote>

The solution for me is to improve what is there.

For you is just "go away" "go to another language" and similar
nonsense.

Every time I point out something that needs to be improved
the regulars are unable to put any coherent argumentation.
look at that:

Why don't
you go away and become a regular on comp.lang.mumps or
comp.lang.apl or any newsgroup devoted to a language you
consider more useful than C? Since C has zero utility
(in your stated estimation), even comp.lang.cobol would
be a forum of more value. Go! Spend your talent on
something more useful than the torment of us poor old
dinosaurs! Go!

Not even the shadow of an argumentation of WHY things
should STAY as they are.
When you act in an inflammatory way, surely you expect an inflammatory
response.
Of course, the game would not be nearly so fun if we all talked to
each other in a civil manner. But that would assume that we actually
wanted to *make* progress.
Dec 26 '07 #12
Julienne Walker wrote:
>
Would building the buffer as you go be too expensive too? Personally,
I was wondering why you went to the trouble of even *trying* to
transform a binary file into a text file. Why not simply take a FILE
pointer and let the caller worry about the file mode? Then instead of
the clearly non-portable method of getting a file size, use a
reasonably efficient buffering technique and read the file
sequentially. That solves issues 1 and 2 quite nicely.
If you reduce the requirements of course, it is easy...
But then the usage of your utility is greatly reduced.
>3) I used different values for errno defined by POSIX, but not by
the C standard, that defines only a few. Again, error handling
is not something important to be standardized, according to
the committee. errno is there but its usage is absolutely
not portable at all and goes immediately beyond what standard C
offers.

Then don't use it? It's not like you're barred from writing your own
error handling mechanism, and fortunately you're also not forced to
follow the conventions of the standard library. I see a lot of self-
imposed restrictions in that attempt (no offense), and if you remove
them, I don't think your current claim holds water.
But the point is that that error mechanism wouldn't be standard.

I do not want to argue that it is impossible to write this program
in C. I am arguing that it is not possible to write it in STANDARD C.
>We hear again and again that this group is about standard C *"ONLY"*.
Could someone here then, tell me how this simple program could be
written in standard C?

Here's a prototype:

#include <stdio.h>
#include <stdlib.h>

char *readall ( FILE *in, int *n )
{
    char *result = NULL;
    size_t size = 0;
    int curr = 0;
    int ch;

    while ( ( ch = fgetc ( in ) ) != EOF ) {
        if ( curr == size ) {
            char *save = realloc ( result, size + BUFSIZ + 1 );

            if ( save == NULL )
                break;

            result = save;
            size += BUFSIZ;
        }

        result[curr++] = (char)ch;
    }

    *n = curr;

    return result;
}

It's fairly naive for a start, but with more detailed requirements I
still don't see how it can't be done in standard C.
1) You suppose that the file pointer is at the beginning of the file
To be sure you should do an fseek before reading...
2) You always allocate in BUFSIZ chunks, and you have an almost 100%
probability of wasting memory.
3) If you run out of memory you return a truncated file, giving the user
NO WAY to know that the data is missing!
4) The string is not zero terminated... You write the EOF value at the
end in most cases.

Look, all those bugs can be easily corrected and your approach is maybe
sounder than mine. You will agree however, that

fpos_t filesize(FILE *);

would be useful isn't it?
>
>This confirms my arguments about the need to improve the quality
of the standard library!

I agree that the standard library needs work, but I don't agree that
this is one of the things that it needs. You'll need to convince me a
little more.
>You can't do *anything* in just standard C.

Excepting any silly mistakes in my code, I'd say that your claim is
false.
You took down the most important use of this utility:
Abstracting away the difference between binary and text files
from the user. If we take that away, it would be useful only
for binary files.

Thanks for your input.

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 26 '07 #13
On Dec 26, 2:12 pm, jacob navia <ja...@nospam.com> wrote:
jameskuy...@verizon.net wrote:
jacob navia wrote:
In my "Happy Christmas" message, I proposed a function to read
a file into a RAM buffer and return that buffer or NULL if
the file doesn't exist or some other error is found.
It is interesting to see that the answers to that message prove that
programming exclusively in standard C is completely impossible even
for a small and ridiculously simple program like the one I proposed.
The most portable way to determine the size of the buffer needed to
store a stream that might have been opened in text mode is to use
repeated calls to fread() until the end of file has been reached,
keeping track of the total number of bytes read, and reallocating as
you go along. Sure, it's inefficient;

er... YES!
but for streams which
correspond to devices, rather than files, there's really no
alternative.

We could restrict this to normal files.
Do you understand that 'normal' files behave in completely different
ways, depending on what kind of files they are?
On OpenVMS, for instance, there are sequential files, indexed files,
and relative files. The files can be compressed, and (if indexed) the
indexes can be compressed. This compression can be partial or total.
How will your "simple" functions deal with this one complexity on a
single OS?
Even for streams which correspond to actual files, there
are real OSs where there's no more efficient method of finding the
length of the file.

I just can't imagine a file system that doesn't provide a way
of knowing the length of a file. Maybe there is SOMEWHERE in
the world a crazy file system like that but why should we
care about it?
The best you can ever hope for is an estimate. On the RMS example
from above, if you have a compressed file, it might be very tiny but
contain a million records. So will knowing the size of the file give
you meaningful information about what it contains?
If you don't care about portability to those
platforms, there's no reason why you can't use OS-specific techniques
for determining the file length.

Much simpler would be if we had

size_t filesize(FILE *);

isn't it?

I.e. the standard would abstract away from the programmer all the
details of how to do this quite ELEMENTARY operation!
You do understand that this operation has no real meaning on a multi-
user system?
If there are file systems where there is no way to know that besides
by reading the whole file, then THOSE SYSTEMS would be forced to do that
not everyone!
For systems where we would have to special-case things, we will need
'one-off' functions to handle it. Do these functions belong in the
standard? BTW, the systems where we have to special case things is
"almost all of them."[*]
[*] my informal estimate, based on experience but not measured.
If you use that approach, there's no need to worry about how lines are
terminated for any particular implementation, whether with CR, LF,
CRLF, LFCR, or by padding the lines with null characters to the end of
a fixed maximum line size, or any other method that an implementation
finds convenient.

THAT could be left to the implementation of "filesize" and we
could rely on a portable way of doing that. The objective of this
message is precisely to point to the lack of functionality in
the standard library.
Please describe how this portable 'filesize' function works just on a
system 3090. If it does not work there, will the compiler vendors
want to implement it? If the compiler vendors do not want to
implement it, shall we add it to the standard?
3) I used different values for errno defined by POSIX, but not by
I saw no reason why, at any point, your program should set errno to
any value other than whichever one the underlying C standard library
functions would have already set it to. If fopen() doesn't already set
ENOENT, why should strfromfile()? If fseek() doesn't already set EIO,
why should strfromfile()? If malloc() doesn't already set ENOMEM, why
should strfromfile()? Don't make things harder than they have to be;
even if strfromfile() were standardized, I'd still recommend that the
implementor piggy-back off the lower level functions it calls, rather
than setting it's own error codes.

I *had* to do this since there is NOW no guarantee that the lower level
functions do this. I was programming around a limitation of standard C.

I do not understand why the committee ignores all the work done by POSIX
and incorporates their error analysis into standard C for the low level
functions like fread/fopen, etc.

Why refusing to do a more sophisticated error analysis than just testing
for NULL?

I cite the standard for the fopen function:
Returns
The fopen function returns a pointer to the object controlling the
stream. If the open operation fails, fopen returns a null pointer.

Not a SINGLE WORD of error analysis more sophisticated than

"it failed".

This is really level ZERO of error analysis.
That is more than zero.

Do you understand that operating system X can return errors that do
not exist on operating system Y?
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatiquehttp://www.cs.virginia.edu/~lcc-win32
Dec 26 '07 #14
user923005 wrote:
On Dec 26, 2:12 pm, jacob navia <ja...@nospam.comwrote:
>jameskuy...@verizon.net wrote:
>>jacob navia wrote:
We could restrict this to normal files.

Do you understand that 'normal' files behave in completely different
ways, depending on what kind of files they are?
Who cares?

The C implementation ALREADY abstracts that away from us.

I know that I can open a file for writing under OpenVMS
or whatever and I can write two bytes in it and read them again.

SO FAR the abstraction works. What I am proposing is just a
bit more of FUNCTIONALITY.
On OpenVMS, for instance, there are sequential files, indexed files,
and relative files. The files can be compressed, and (if indexed) the
indexes can be compressed. This compression can be partial or total.
How will your "simple" functions deal with this one complexity on a
single OS?
In the same manner that fopen abstract all that away from
me.
>>Even for streams which correspond to actual files, there
are real OSs where there's no more efficient method of finding the
length of the file.
I just can't imagine a file system that doesn't provide a way
of knowing the length of a file. Maybe there is SOMEWHERE in
the world a crazy file system like that but why should we
care about it?

The best you can ever hope for is an estimate. On the RMS example
from above, if you have a compressed file, it might be very tiny but
contain a million records. So will knowing the size of the file give
you meaningful information about what it contains?
The value returned should be equivalent to the bytes that I would read
in binary mode.
>>If you don't care about portability to those
platforms, there's no reason why you can't use OS-specific techniques
for determining the file length.
Much simpler would be if we had

size_t filesize(FILE *);

isn't it?

I.e. the standard would abstract away from the programmer all the
details of how to do this quite ELEMENTARY operation!

You do understand that this operation has no real meaning on a multi-
user system?
Ah well. I am dreaming then all the time.

I write

dir

and the multi-user file system tells me the size of each file.

And in unix I do

ls

and (WONDER) I get a meaningless result with the file size of
each file.

>If there are file systems where there is no way to know that besides
by reading the whole file, then THOSE SYSTEMS would be forced to do that
not everyone!

For systems where we would have to special-case things, we will need
'one-off' functions to handle it. Do these functions belong in the
standard? BTW, the systems where we have to special case things is
"almost all of them."[*]
If we define filesize as the number of bytes that would
be returned when reading the file we do not have to special case
anything.

But granted, a binary/text mode would be nice.
>
Please describe how this portable 'filesize' function works just on a
system 3090. If it does not work there, will the compiler vendors
want to implement it? If the compiler vendors do not want to
implement it, shall we add it to the standard?
In system 3090 it returns the size of the file. Even in that system,
I can see the size of each file.

But you are surely a bit outmoded, I would say.

According to IBM, you can easily upgrade your system 3090 to a system
9000.

Only thing is that system 9000 was introduced in 1990 (so system 3090
must be a mainframe of 198x!). Even system 9000 doesn't exist anymore,
since IBM retired it in 1998...

You search for VERY current examples isn't it?

:-)
>>
Why refusing to do a more sophisticated error analysis than just testing
for NULL?

I cite the standard for the fopen function:
Returns
The fopen function returns a pointer to the object controlling the
stream. If the open operation fails, fopen returns a null pointer.

Not a SINGLE WORD of error analysis more sophisticated than

"it failed".

This is really level ZERO of error analysis.

That is more than zero.

Do you understand that operating system X can return errors that do
not exist on operating system Y?
So what?

If a mapping from the native error to the given error palette is not
possible, the implementation can return that error code!

But we could PORTABLY test for IO errors, "no memory" errors, etc!
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 26 '07 #15
jacob navia wrote:
>
.... snip ...
>
I do not understand why the committee ignores all the work done
by POSIX and incorporates their error analysis into standard C
for the low level functions like fread/fopen, etc.
Yes. Read your own quote above and ponder it for a while. As you
come up with reasons write them down. After you have twenty
written, come back here.

--
Merry Christmas, Happy Hanukah, Happy New Year
Joyeux Noel, Bonne Annee, Frohe Weihnachten
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Dec 26 '07 #16
jacob navia wrote:
>
.... snip ...
>
But the point is that that error mechanism wouldn't be standard.

I do not want to argue that it is impossible to write this program
in C. I am arguing that it is not possible to write it in STANDARD C.
I believe 'this program' has to read in a file of text lines, and
store it in a buffer. I can do that in standard C. Do you want it
broken into lines, or just one solid string?

It will also work if the input file is stdin. However, it will not
work if you exceed the limits of the machine on which it runs.

How much are you willing to pay me for the source?

--
Merry Christmas, Happy Hanukah, Happy New Year
Joyeux Noel, Bonne Annee, Frohe Weihnachten
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Dec 26 '07 #17
Eric Sosman wrote:
You wrote: "You can't do *anything* in just standard C."
Do you stand by that statement, or do you retreat from it?
If you stand by it, why are you here?
int main(void) { int n = printf("hello\n");}
How much is n?

No way to know since the error codes of printf
are NOT standardized. This means that I can only
know that n can be... *ANYTHING*. Maybe it wrote
some characters, then stopped, or whatever!

The problem with the lack of standardization of error codes
means that I can't do error checking in a portable way
and thus, no portable program of any importance can be
written that handles the different error situations that
could arise.

In normal software, you *are* interested in why a program or function
call failed. You can't portably do that in standard C.

You can't even know the size of a file without reading it all.

A bit more functionality would be better for all of us. But
if I am in this group it is obviously not because I
believe standard C is useless, but because I want to fix some
problems with it.

Does this answer your question?
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 26 '07 #18

"user923005" <dc*****@connx.comwrote in message
news:79**********************************@v4g2000h sf.googlegroups.com...
jacob navia wrote:
...
>I just can't imagine a file system that doesn't provide a way
of knowing the length of a file. Maybe there is SOMEWHERE in
the world a crazy file system like that but why should we
care about it?
....
>Much simpler would be if we had
size_t filesize(FILE *);
I've been using a function like the following:

unsigned int getfilesize(FILE* handle)
{
    unsigned int p,size;
    p=ftell(handle);              /* p = current position */
    fseek(handle,0,SEEK_END);     /* go to eof */
    size=ftell(handle);           /* size in bytes */
    fseek(handle,p,SEEK_SET);     /* restore file position */
    return size;
}

What is wrong with this, is it non-standard? (Apart from the likely 4Gb
limit)
>You do understand that this operation has no real meaning on a multi-
user system?
Because anything could happen between getting the size and making use of it?
In that case pretty much everything is impossible.
>and relative files. The files can be compressed, and (if indexed) the
indexes can be compressed. This compression can be partial or total.
How will your "simple" functions deal with this one complexity on a
single OS?
So the file is compressed, so what? If the compression is made transparent
by the OS, I will get the full file size. If not, I will get the size of the
compressed file. The compression is likely irrelevant, and I can't do
anything with it anyway. And if I can, I will know how to decompress and how
to get the inflated size.

Someone mentioned streams in this thread, but on my computer as an example,
I have so many hundred thousand files 99.99...% of which are just a bunch of
so many bytes. This type of 'File' seems so dominant that surely it should
have been given special treatment apart from streams.

Bart
Dec 26 '07 #19
Bart C wrote:
"user923005" <dc*****@connx.comwrote in message
news:79**********************************@v4g2000h sf.googlegroups.com...
>>jacob navia wrote:
..
>>I just can't imagine a file system that doesn't provide a way
of knowing the length of a file. Maybe there is SOMEWHERE in
the world a crazy file system like that but why should we
care about it?
...
>>Much simpler would be if we had
size_t filesize(FILE *);

I've been using a function like the following:

unsigned int getfilesize(FILE* handle)
{
unsigned int p,size;
p=ftell(handle); /*p=current position*/
fseek(handle,0,2); /*get eof position*/
size=ftell(handle); /*size in bytes*/
fseek(handle,p,0); /*restore file position*/
return size;
}

What is wrong with this, is it non-standard? (Apart from the likely 4Gb
limit)
I proposed exactly that, and it works in most systems...

But I was flamed without end. I am interested to see what you will get

:-)

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 26 '07 #20
[My first reply appears to have got lost in the ether. This reply is
somewhat shorter, alas.]

Bart C said:
unsigned int getfilesize(FILE* handle)
{
unsigned int p,size;
p=ftell(handle); /*p=current position*/
fseek(handle,0,2); /*get eof position*/
size=ftell(handle); /*size in bytes*/
fseek(handle,p,0); /*restore file position*/
return size;
}

What is wrong with this, is it non-standard?
Yes. See the Standard's definitions of fseek and ftell, which make it clear
why this is not a portable way for determining a file's size.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Dec 26 '07 #21
On Dec 26, 3:10 pm, jacob navia <ja...@nospam.com> wrote:
user923005 wrote:
On Dec 26, 2:12 pm, jacob navia <ja...@nospam.comwrote:
jameskuy...@verizon.net wrote:
jacob navia wrote:
We could restrict this to normal files.
Do you understand that 'normal' files behave in completely different
ways, depending on what kind of files they are?

Who cares?

The C implementation ALREADY abstracts that away from us.

I know that I can open a file for writing under OpenVMS
or whatever and I can write two bytes in it and read them again.

SO FAR the abstraction works. What I am proposing is just a
bit more of FUNCTIONALITY.
This functionality cannot be achieved. It is literally impossible on
a multi-user system for obvious reasons.
On OpenVMS, for instance, there are sequential files, indexed files,
and relative files. The files can be compressed, and (if indexed) the
indexes can be compressed. This compression can be partial or total.
How will your "simple" functions deal with this one complexity on a
single OS?

In the same manner that fopen abstract all that away from
me.
But fopen(), by abstraction of those details, means that in reality it
won't work for most files on an OpenVMS system. That is why the DEC/
COMPAQ/HP C compilers for OpenVMS have tons of extensions for fopen()
that are specific to OpenVMS. Your technique could give you an
answer, but it would be wrong almost all of the time.
>Even for streams which correspond to actual files, there
are real OSs where there's no more efficient method of finding the
length of the file.
I just can't imagine a file system that doesn't provide a way
of knowing the length of a file. Maybe there is SOMEWHERE in
the world a crazy file system like that but why should we
care about it?
The best you can ever hope for is an estimate. On the RMS example
from above, if you have a compressed file, it might be very tiny but
contain a million records. So will knowing the size of the file give
you meaningful information about what it contains?

The value returned should be equivalent to the bytes that I would read
in binary mode.
You realize, of course, that with a compressed file, that value has
more than one meaning. In addition, with a multiuser system, the
microsecond after you collect the file, it can be truncated or even
deleted. So the number that it tells you is only a guess at best and
may be a total lie. Do you want an answer that you cannot rely on?
What are you going to do with that answer?
>If you don't care about portability to those
platforms, there's no reason why you can't use OS-specific techniques
for determining the file length.
Much simpler would be if we had
size_t filesize(FILE *);
isn't it?
I.e. the standard would abstract away from the programmer all the
details of how to do this quite ELEMENTARY operation!
You do understand that this operation has no real meaning on a multi-
user system?

Ah well. I am dreaming then all the time.

I write

dir

and the multi-user file system tells me the size of each file.

ANd in unix I do

ls

and (WONDER) I get a meaningless result with the file size of
each file.
And if you do ls again, one second later, all of those files might be
gone or different sized. If you relied on the answer that you got,
you would (at best) get the wrong answer on occasion. If you expected
that the size you got would hold all of the file, and if someone added
a record, then your memory allocation to hold it is too small and when
you read the data, the operation will overwrite memory. If you can
come up with a simple work-around for this obvious and fundamental
problem, I would like to hear of it.
If there are file systems where there is no way to know that besides
by reading the whole file, then THOSE SYSTEMS would be forced to do that
not everyone!
For systems where we would have to special-case things, we will need
'one-off' functions to handle it. Do these functions belong in the
standard? BTW, the systems where we have to special case things is
"almost all of them."[*]

If we define filesize as the number of bytes that would
be returned when reading the file we do not have to special case
anything.
I am surprised that you do not understand the ramifications of not
being the only one allowed to access a file.
But granted, a binary/text mode would be nice.
Please describe how this portable 'filesize' function works just on a
system 3090. If it does not work there, will the compiler vendors
want to implement it? If the compiler vendors do not want to
implement it, shall we add it to the standard?

In system 3090 it returns the size of the file. Even in that system,
I can see the size of each file.

But you are surely a bit of outmoded I would say.

According to IBM, you can easily upgrade your system 3090 to a system
9000.

Only thing is that system 9000 was introduced in 1990 (so system 3090
must be a mainframe of 198x!). Even system 9000 doesn't exist anymore,
since IBM retired it in 1998...
Most of the data in the world resides on IBM 3090 hardware. But I
guess that it is not very important.
You search for VERY current examples isn't it?
Exactly.
:-)


Why refusing to do a more sophisticated error analysis than just testing
for NULL?
I cite the standard for the fopen function:
Returns
The fopen function returns a pointer to the object controlling the
stream. If the open operation fails, fopen returns a null pointer.
Not a SINGLE WORD of error analysis more sophisticated than
"it failed".
This is really level ZERO of error analysis.
That is more than zero.
Do you understand that operating system X can return errors that do
not exist on operating system Y?

So what?

If a mapping from the native error to the given error palette is not
possible, the implementation can return that error code!

But we could PORTABLY test for IO errors, "no memory" errors, etc!
I think that you will find the errno.h file contains:
1. The error values mandated by the standard (hint: there is more
than 1)
2. Any other error values that are pertinent to the system on which
the code is compiled.
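
For what it's worth, the only errno macros the C standard itself mandates are
EDOM, ERANGE and (since C99) EILSEQ; things like ENOENT, EIO or ENOMEM come
from POSIX or the implementation. About the most a strictly portable program
can rely on is something like:

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    long v;

    errno = 0;
    v = strtol("999999999999999999999999999", NULL, 10);
    if (errno == ERANGE)
        printf("overflow clamped to %ld and reported via ERANGE\n", v);
    /* EDOM, ERANGE and (in C99) EILSEQ are the only errno macros the C
       standard itself defines; ENOENT, EIO, ENOMEM, ... are additions
       made by POSIX or the implementation. */
    return 0;
}
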
Dec 27 '07 #22
On Dec 26, 5:27 pm, jacob navia <ja...@nospam.comwrote:
Julienne Walker wrote:
Would building the buffer as you go be too expensive too? Personally,
I was wondering why you went to the trouble of even *trying* to
transform a binary file into a text file. Why not simply take a FILE
pointer and let the caller worry about the file mode? Then instead of
the clearly non-portable method of getting a file size, use a
reasonably efficient buffering technique and read the file
sequentially. That solves issues 1 and 2 quite nicely.

If you reduce the requirements of course, it is easy...
But then the usage of your utility is greatly reduced.
Perhaps I missed some of the requirements, but my version was based on
your description of the problem: "we all some day needed to read an
entire file into RAM to process it". Given that, it's trivial to write
a function in standard C (pick a standard).
3) I used different values for errno defined by POSIX, but not by
the C standard, that defines only a few. Again, error handling
is not something important to be standardized, according to
the committee. errno is there but its usage is absolutely
not portable at all and goes immediately beyond what standard C
offers.
Then don't use it? It's not like you're barred from writing your own
error handling mechanism, and fortunately you're also not forced to
follow the conventions of the standard library. I see a lot of self-
imposed restrictions in that attempt (no offense), and if you remove
them, I don't think your current claim holds water.

But the point is that that error mechanism wouldn't be standard.

I do not want to argue trhat it is impossible to write this program
in C. I am arguing that it is not possible to write it in STANDARD C.
If you throw in a bunch of non-portable requirements, it's pretty much
a given that the resulting program will also be non-portable.
We hear again and again that this group is about standard C *"ONLY"*.
Could someone here then, tell me how this simple program could be
written in standard C?
Here's a prototype:
#include <stdio.h>
#include <stdlib.h>
char *readall ( FILE *in, int *n )
{
    char *result = NULL;
    size_t size = 0;
    int curr = 0;
    int ch;
    while ( ( ch = fgetc ( in ) ) != EOF ) {
        if ( curr == size ) {
            char *save = realloc ( result, size + BUFSIZ + 1 );
            if ( save == NULL )
                break;
            result = save;
            size += BUFSIZ;
        }
        result[curr++] = (char)ch;
    }
    *n = curr;
    return result;
}
It's fairly naive for a start, but with more detailed requirements I
still don't see how it can't be done in standard C.

1) You suppose that the file pointer is at the beginning of the file
To be sure you should to an fseek before reading...
Actually, I made an explicit decision not to rewind the stream because
that limits the usefulness of this function. Just like I expect the
file to be open, I also expect the "file pointer" to be located
wherever the caller wants. This is not a bug.
2) You allocate always in BUFSIZE chunks, and you have an almost 100%
probability of wasting memory.
It's a naive example (I'm reasonably sure I said that already) for
illustrative purposes. I didn't intend it to be flawless production
code. I literally spent two minutes writing it. Anything more
sophisticated would be terribly buggy after two minutes. ;-)
3) If you run out of memory you return a truncated file, giving the user
NO WAY to know that the data is missing!
Once again, naive example for illustrative purposes. Didn't I say that
it's a start and not the finished product?
4) The string is not zero terminated... You write the EOF value at the
end in most cases.
Again, this was an explicit decision, not a bug. Your alternative is
equally valid.
Look, all those bugs can be easily corrected and your approach is maybe
sounder than mine. You will agree however, that

fpos_t filesize(FILE *);

would be useful isn't it?
Taken at face value, yes. But when you throw in all of the variables,
it's not quite as obvious. For example, how would we define the size
of a file? The number of bytes? The actual storage cost? The number of
characters after textual conversions are made? All of those are useful
metrics, yet if we include a function for each, the standard library
starts to become bloated. If we define a function that can handle all
of the options, we're likely to get yet another weird function that's
way more complicated than we want.

There's more to standardizing functions than saying "wouldn't it be
useful?".
This confirms my arguments about the need to improve the quality
of the standard library!
I agree that the standard library needs work, but I don't agree that
this is one of the things that it needs. You'll need to convince me a
little more.
You can't do *anything* in just standard C.
Excepting any silly mistakes in my code, I'd say that your claim is
false.

You took down the most important use of this utility:
Abstracting away the difference from binary/text files
from the user. If we take that away, it would be useful only
for binary files.
You didn't abstract it away either. The caller still has to consider
these things to pass in the proper mode. If you "take it away", as you
say I've done, the function is still useful for both binary and text
files because fgetc does the right thing for the stream without any
special work on my part.

I get the distinct impression that you're basing these complaints on
requirements that I'm not aware of. Can you give me a formal
description of this function so that I have a better idea of what I'm
dealing with?
Thanks for your input.

No no, thank you. I noticed that you've been quick to work on bug
reports with lcc-win32. ;-)
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatiquehttp://www.cs.virginia.edu/~lcc-win32
Dec 27 '07 #23
On Dec 26, 3:32 pm, "Bart C" <b...@freeuk.com> wrote:
"user923005" <dcor...@connx.com> wrote in message

news:79**********************************@v4g2000h sf.googlegroups.com...
jacob navia wrote:
..
I just can't imagine a file system that doesn't provide a way
of knowing the length of a file. Maybe there is SOMEWHERE in
the world a crazy file system like that but why should we
care about it?
...
Much simpler would be if we had
size_t filesize(FILE *);

I've been using a function like the following:

unsigned int getfilesize(FILE* handle)
{
    unsigned int p,size;
    p=ftell(handle);        /*p=current position*/
    fseek(handle,0,2);      /*get eof position*/
    size=ftell(handle);     /*size in bytes*/
    fseek(handle,p,0);      /*restore file position*/
    return size;
}

What is wrong with this, is it non-standard? (Apart from the likely 4Gb
limit)
You do understand that this operation has no real meaning on a multi-
user system?

Because anything could happen between getting the size and making use of it?
In that case pretty much everything is impossible.
Exactly. Your solution is portable across every single-user system
that has only one point of access (but Windows does not qualify, even
though it is single-user, because drives, folders and files can be shared).
I guess it works well for a toaster IC.
and relative files. The files can be compressed, and (if indexed) the
indexes can be compressed. This compression can be partial or total.
How will your "simple" functions deal with this one complexity on a
single OS?

So the file is compressed, so what? If the compression is made transparent
by the OS, I will get the full filesize.
No. You only get an estimate. At least with some file systems that I
know of. The only way to get the real count is to do a table scan,
and that only works if you lock the file. Chances are very good that
even if you are allowed to lock the file, users will be very angry at
you for locking the file for the duration of a table scan. I also
guess that they won't be too happy when you map a 12 gig file into
memory on a machine that has 32 bit virtual memory but only 8 gigs
physical RAM.
If not, will get the size of a
compressed file. The compression is likely irrelevant, and I can't do
anything with it anyway. And if I can, I will know how to decompress and how
to get the inflated size.
The operating system does it. You have no control over it
whatsoever. You are not even told the algorithm that they are using.
Someone mentioned streams in this thread, but on my computer as an example,
I have so many hundred thousand files 99.99...% of which are just a bunch of
so many bytes. This type of 'File' seems so dominant that surely it should
have been given special treatment apart from streams.
I agree that it would be nice to have a stream classifier.

e.g.:
struct file_kinds fdescribe(FILE *F);

Dec 27 '07 #24
user923005 wrote:
On Dec 26, 3:10 pm, jacob navia <ja...@nospam.comwrote:
>user923005 wrote:
>>On Dec 26, 2:12 pm, jacob navia <ja...@nospam.comwrote:
jameskuy...@verizon.net wrote:
jacob navia wrote:
We could restrict this to normal files.
Do you understand that 'normal' files behave in completely different
ways, depending on what kind of files they are?
Who cares?

The C implementation ALREADY abstracts that away from us.

I know that I can open a file for writing under OpenVMS
or whatever and I can write two bytes in it and read them again.

SO FAR the abstraction works. What I am proposing is just a
bit more of FUNCTIONALITY.

This functionality cannot be achieved. It is literally impossible on
a multi-user system for obvious reasons.
Please stop that. With the same arguments I can tell that

fseek() is bogus since somebody else can erase the
file after you do your fseek.

ftell: same problem.

fread: the same
fwrite: the same. You wrote something but root took it away.

etc etc.

Please: LET'S BE REALISTIC.

filesize returns the size in bytes that reading character
by character would return if the file is unchanged.

It does NOT guarantee that your coffee has sugar, that your shoes
do not hurt, or that the file will still be there tomorrow.
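
Under that definition, a slow but strictly conforming fallback is simply to
count the characters from the current position (sketch only; it consumes the
stream, so a real implementation would have to remember and restore the
position somehow):

#include <stdio.h>

/* Sketch of that definition: the number of bytes that reading the stream
   character by character from its current position would yield.
   Returns -1 if a read error occurs along the way. */
long countbytes(FILE *f)
{
    long n = 0;

    while (fgetc(f) != EOF)
        n++;
    return ferror(f) ? -1L : n;
}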

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 27 '07 #25
jacob navia wrote, On 26/12/07 23:10:
user923005 wrote:
>On Dec 26, 2:12 pm, jacob navia <ja...@nospam.comwrote:
>>jameskuy...@verizon.net wrote:
jacob navia wrote:
<snip>
>>If there are file systems where there is no way to know that besides
by reading the whole file, then THOSE SYSTEMS would be forced to do that
not everyone!

For systems where we would have to special-case things, we will need
'one-off' functions to handle it. Do these functions belong in the
standard? BTW, the systems where we have to special case things is
"almost all of them."[*]

If we define filesize as the number of bytes that would
be returned when reading the file we do not have to special case
anything.
Ah, so you don't want to know if the file will fit on the floppy disk
you are about to write it to...

BTW, I can sometimes fit a file larger than the floppy disk (as far as
ls is concerned) on the floppy disk, I just have to do it the
appropriate way.
But granted, a binary/text mode would be nice.
<snip>

I actually agree that some of what you are saying would be nice, but I
can see the problems. I can also see that none of what you are asking
for is actually required for the problem you were trying to solve.
--
Flash Gordon
Dec 27 '07 #26
On Dec 26, 4:53 pm, jacob navia <ja...@nospam.comwrote:
user923005 wrote:
On Dec 26, 3:10 pm, jacob navia <ja...@nospam.comwrote:
user923005 wrote:
On Dec 26, 2:12 pm, jacob navia <ja...@nospam.comwrote:
jameskuy...@verizon.net wrote:
jacob navia wrote:
We could restrict this to normal files.
Do you understand that 'normal' files behave in completely different
ways, depending on what kind of files they are?
Who cares?
The C implementation ALREADY abstracts that away from us.
I know that I can open a file for writing under OpenVMS
or whatever and I can write two bytes in it and read them again.
SO FAR the abstraction works. What I am proposing is just a
bit more of FUNCTIONALITY.
This functionality cannot be achieved. It is literally impossible on
a multi-user system for obvious reasons.

Please stop that. With the same arguments I can tell that

fseek() is bogus since somebody else can erase the
file after you do your fseek.

ftell: same problem.

fread: the same
fwrite: the same. You wrote something but root took it away.
Really? You have the file open and seek to a position and the OS lets
someone else delete it? Marvelous. Show me this system that I may
stand in astonishment. For instance, in Windows:

C:\tmp>cl /D_CRT_SECURE_NO_WARNINGS /W4 /Ox fst.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762
for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.

fst.c
Microsoft (R) Incremental Linker Version 8.00.50727.762
Copyright (C) Microsoft Corporation. All rights reserved.

/out:fst.exe
fst.obj

C:\tmp>type fst.c

#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

char s[32767];

int main(void)
{
    FILE *f;
    f = fopen("fst.c", "r");
    if (f == NULL) {
        puts(strerror(errno));
        exit(EXIT_FAILURE);
    }
    fseek(f, 9, SEEK_SET);
    puts("Waiting for user input...");
    fgets(s, sizeof s, stdin);
    fclose(f);
    return 0;
}

C:\tmp>fst
Waiting for user input...

In another window:
C:\tmp>del fst.c
C:\tmp\FST.C
The process cannot access the file because it is being used by another
process.

C:\tmp>
etc etc.

Please: LET'S BE REALISTIC.

filesize returns the size in bytes that reading character
by character would return if the file is unchanged.

It does NOT guarantee that your coffee has sugar, that your shoes
do not hurt, or that the file will still be there tomorrow.
I would like the number to be actually useful. Unfortunately you have
not proposed any system where it would have utility.
Dec 27 '07 #27
On Dec 26, 3:32 pm, "Bart C" <b...@freeuk.comwrote:
"user923005" <dcor...@connx.comwrote in message

news:79**********************************@v4g2000hsf.googlegroups.com...
jacob navia wrote:
..
I just can't imagine a file system that doesn't provide a way
of knowing the length of a file. Maybe there is SOMEWHERE in
the world a crazy file system like that but why should we
care about it?
...
Much simpler would be if we had
size_t filesize(FILE *);

I've been using a function like the following:

unsigned int getfilesize(FILE* handle)
{
    unsigned int p,size;
    p=ftell(handle);        /*p=current position*/
    fseek(handle,0,2);      /*get eof position*/
    size=ftell(handle);     /*size in bytes*/
    fseek(handle,p,0);      /*restore file position*/
    return size;
}

What is wrong with this, is it non-standard? (Apart from the likely 4Gb
limit)
You do understand that this operation has no real meaning on a multi-
user system?

Because anything could happen between getting the size and making use of it?
In that case pretty much everything is impossible.
and relative files. The files can be compressed, and (if indexed) the
indexes can be compressed. This compression can be partial or total.
How will your "simple" functions deal with this one complexity on a
single OS?

So the file is compressed, so what? If the compression is made transparent
by the OS, I will get the full filesize. If not, I will get the size of a
compressed file. The compression is likely irrelevant, and I can't do
anything with it anyway. And if I can, I will know how to decompress and how
to get the inflated size.

Someone mentioned streams in this thread, but on my computer, as an example,
I have several hundred thousand files, 99.99...% of which are just a bunch of
bytes. This type of 'File' seems so dominant that surely it should
have been given special treatment apart from streams.
An aside:

12.25: What's the difference between fgetpos/fsetpos and ftell/fseek?
What are fgetpos() and fsetpos() good for?

A: ftell() and fseek() use type long int to represent offsets
(positions) in a file, and may therefore be limited to offsets
of about 2 billion (2**31-1). The newer fgetpos() and fsetpos()
functions, on the other hand, use a special typedef, fpos_t, to
represent the offsets. The type behind this typedef, if chosen
appropriately, can represent arbitrarily large offsets, so
fgetpos() and fsetpos() can be used with arbitrarily huge files.
fgetpos() and fsetpos() also record the state associated with
multibyte streams. See also question 1.4.

References: K&R2 Sec. B1.6 p. 248; ISO Sec. 7.9.1,
Secs. 7.9.9.1,7.9.9.3; H&S Sec. 15.5 p. 252.
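A minimal sketch of the fgetpos()/fsetpos() usage described above, in
standard C:

#include <stdio.h>

/* Save the current position of a stream, scan ahead, then return to it.
   fpos_t is opaque, so this works even where the offset would not fit
   in a long. */
int scan_and_return(FILE *f)
{
    fpos_t pos;

    if (fgetpos(f, &pos) != 0)     /* remember where we are */
        return -1;

    /* ... read forward as far as needed ... */

    if (fsetpos(f, &pos) != 0)     /* go back to the saved position */
        return -1;
    return 0;
}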
Dec 27 '07 #28
>You do understand that this operation has no real meaning on a multi-
>user system?

Ah well. I am dreaming then all the time.

I write

dir

and the multi-user file system tells me the size of each file.

ANd in unix I do

ls

and (WONDER) I get a meaningless result with the file size of
each file.
You mean it tells you what the size of the file USED TO BE when the
command ran. It might not be that size now. Files can grow. Or
be chopped to zero length. Or be deleted.

Try running "ls -l log.txt" and see what it lists for the file
size of log.txt.

If you want to use a filesize() function to provide an INITIAL ESTIMATE
of the file size, it might work, provided you can tolerate the actual
value being higher or lower. And in this case, issues of line endings
might not matter so much. The estimate for "reasonable" files might
be off by less than 10% and not hurt efficiency too much.
>>If there are file systems where there is no way to know that besides
by reading the whole file, then THOSE SYSTEMS would be forced to do that
not everyone!
Even some unitasking systems might have problems with file sizes
changing, if, for example, it was invoked on the file stdout is
redirected to.
>If we define filesize as the number of bytes that would
be returned when reading the file we do not have to special case
anything.
Only if you can make the answer not change between the time you compute
the answer and the time you use it.

Dec 27 '07 #29
On Dec 26, 5:30 pm, gordonb.x8...@burditt.org (Gordon Burditt) wrote:
You do understand that this operation has no real meaning on a multi-
user system?
Ah well. I am dreaming then all the time.
I write
dir
and the multi-user file system tells me the size of each file.
ANd in unix I do
ls
and (WONDER) I get a meaningless result with the file size of
each file.

You mean it tells you what the size of the file USED TO BE when the
command ran. It might not be that size now. Files can grow. Or
be chopped to zero length. Or be deleted.

Try running "ls -l log.txt" and see what it lists for the file
size of log.txt.

If you want to use a filesize() function to provide an INITIAL ESTIMATE
of the file size, it might work, provided you can tolerate the actual
value being higher or lower. *And in this case, issues of line endings
might not matter so much. *The estimate for "reasonable" files might
be off by less than 10% and not hurt efficiency too much.
>If there are file systems where there is no way to know that besides
by reading the whole file, then THOSE SYSTEMS would be forced to do that
not everyone!

Even some unitasking systems might have problems with file sizes
changing, if, for example, it was invoked on the file stdout is
redirected to.
If we define filesize as the number of bytes that would
be returned when reading the file we do not have to special case
anything.

Only if you can make the answer not change between the time you compute
the answer and the time you use it.
This also assumes that the answer is correct.
For instance, many operating systems report file sizes in blocks. That
tells you (even if you lock the file) only an approximate size.

You do know that the real file size is <= (reported blocks * bytes per
block) but this number will be totally bogus for compressed files.

It seems like such a handy thing to have this "file size" function.
Is it astounding that we don't have one? Not at all, when you
consider the problems associated with collection of a number like that
in a way that is reliable. In fact, the remarkable thing is that
thinking people would even debate it, because the problems with the
collection of such a number are so patently obvious. The system
library functions are designed to return something better than a
guess. If a given function would only be able to produce a guess, the
implementors decided not to write them. After all, someone might use
that guess and it will be wrong sometimes.
Dec 27 '07 #30
>This functionality cannot be achieved. It is literally impossible on
>a multi-user system for obvious reasons.

Please stop that. With the same arguments I can tell that

fseek() is bogus since somebody else can erase the
file after you do your fseek.
On a POSIX system, fseek() still works EVEN IF someone remove()s
the file between the fseek() and a fread() or fwrite() after it.
>ftell: same problem.
On a POSIX system, ftell() still works EVEN IF someone remove()s
the file between a ftell() and a fread(), fwrite(), or fseek() after
it.

>fread: the same
fwrite: the same. You wrote something but root took it away.
On a POSIX system, fread() and fwrite() still work EVEN IF someone
remove()s the file. Ok, the file does go away after it gets
fclose()d.
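A small sketch of that behaviour; this relies on POSIX semantics, not on
anything guaranteed by standard C, and on Windows the remove() call below
would simply fail because the file is still open:

#include <stdio.h>

int main(void)
{
    char line[64];
    FILE *f = fopen("demo.txt", "w+");
    if (f == NULL)
        return 1;

    fputs("still readable\n", f);
    remove("demo.txt");               /* POSIX: the open stream keeps working */

    rewind(f);
    if (fgets(line, sizeof line, f) != NULL)
        fputs(line, stdout);          /* prints "still readable" */

    fclose(f);                        /* the storage is reclaimed here */
    return 0;
}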
>Please: LET'S BE REALISTIC.
Files that are continuously growing are common. This could include
the standard output of the running program itself, or files being
syslogged to, or all sorts of things. If a program creates a file,
chances are the file exists at some point with a size between zero
and its final size.
>filesize returns the size in bytes that reading character
by character would return if the file is unchanged.
And if someone changes the file instead, it's acceptable to segfault?
>It does NOT guarantee that your coffee has sugar, that your shoes
do not hurt, or that the file will still be there tomorrow.
Dec 27 '07 #31
This functionality cannot be achieved. It is literally impossible on
a multi-user system for obvious reasons.

Please stop that. With the same arguments I can tell that

fseek() is bogus since somebody else can erase the
file after you do your fseek.

ftell: same problem.

fread: the same
fwrite: the same. You wrote something but root took it away.

Really? You have the file open and seek to a position and the OS lets
someone else delete it? Marvelous.
Yes, and furthermore you can still read from the file, or write to it,
*AFTER* someone else deletes it (until you fclose() it).
>Show me this system that I may
stand in astonishment.
UNIX or POSIX.

[Windows example of deleting failing on open file deleted.]

Dec 27 '07 #32
Eric Sosman wrote:
jacob navia wrote:
>Eric Sosman wrote:
>>jacob navia wrote:
[...]
You can't do *anything* in just standard C.

Then why do you bother with this newsgroup? Why do
you waste your time on a powerless language? Why don't
you go away and become a regular on comp.lang.mumps or
comp.lang.apl or any newsgroup devoted to a language you
consider more useful than C? Since C has zero utility
(in your stated estimation), even comp.lang.cobol would
be a forum of more value. Go! Spend your talent on
something more useful than the torment of us poor old
dinosaurs! Go!

Stop whining and see the sentence in my message:
<quote>
This confirms my arguments about the need to improve the quality
of the standard library!
<end quote>

You wrote: "You can't do *anything* in just standard C."
Do you stand by that statement, or do you retreat from it?
If you stand by it, why are you here?
I think his "anything" was hyperbole, and clearly meant as such! IMHO
hyperbole is a proper form of communication, but does require that the
recipient isn't a pedant!
Dec 27 '07 #33
user923005 wrote:
When you act in an inflamatory way, surely you expect an inflamatory
response.
Of course, the game would not be nearly so fun if we all talked to
each other in a civil manner. But that would assume that we actually
wanted to *make* progress.
Yes, but inflammatory exchanges is a major part of what makes this group
so fun to read. So please don't discourage his participation!
Dec 27 '07 #34
jacob navia wrote:
In my "Happy Christmas" message, I proposed a function to read
a file into a RAM buffer and return that buffer or NULL if
the file doesn't exist or some other error is found.

It is interesting to see that the answers to that message prove that
programming exclusively in standard C is completely impossible even
for a small and ridiculously simple program like the one I proposed.

1 I read the file contents in binary mode, what should allow me
to use ftell/fseek to determine the file size.

No objections to this were raised, except of course the obvious
one, if the "file" was some file associated with stdin, for
instance under some unix machine /dev/tty01 or similar...

I did not test for this since it is impossible in standard C:
isatty() is not in the standard.

2) There is NO portable way to determine which characters should be
ignored when transforming a binary file into a text file. One
reader (CB Falconer) proposed to open the file in binary mode
and then in text mode and compare the two buffers to see which
characters were missing... Well, that would be too expensive.

3) I used different values for errno defined by POSIX, but not by
the C standard, that defines only a few. Again, error handling
is not something important to be standardized, according to
the committee. errno is there but its usage is absolutely
not portable at all and goes immediately beyond what standard C
offers.

We hear again and again that this group is about standard C *"ONLY"*.
Could someone here then, tell me how this simple program could be
written in standard C?

This confirms my arguments about the need to improve the quality
of the standard library!

You can't do *anything* in just standard C.

As a newcomer to this group who hasn't even read the FAQ, let me
nevertheless brazenly seek to answer your question.

I think you are correct in that standard C is of somewhat limited value.
But perhaps we should see standard C as perhaps a tool to be embedded
into real C, rather than as an object with value in and of itself. By "real
C", I mean any implementation that is used in real life (Visual C, GCC
on Linux, etc).

Now there is a sense in which the kind of function you are asking about
- to put a file into memory - is really the kind of thing a systems
programmer would do. Any portable version of such a function would
typically be much slower than any special function designed around a
particular OS. Most importantly, it would be a pointless thing to add
to the standard, because rather than liberating OS creators it would
hamstring them. Instead of standard C being this tremendously powerful
springboard from which to create useful implementations, it would go the
way PASCAL went, great in theory, but too limited in practice. Even if
your function existed in standard C, I would still use mmap() for my
unix programs, because I know that mmap is designed to work well with my
operating system of choice.

Now this particular newsgroup has chosen to make standard C its only
legitimate discussion point. This is a bit awkward to newcomers to this
group, because for most groups the name is somewhat self-explanatory,
and one would normally expect a group with the name comp.lang.c to be a
general discussion ground of all things related to C. So people just
post their messages without reading the FAQ, and for most newsgroups
this works just fine.

But I can also see why some folks would like a "standard C only"
discussion group. One problem is that "all things related to C" is a
huge subject, especially for the kinds of people likely to be using
newsgroups. Now I can see that discussion of standard C only will
necessarily be rather arcane discussions, but there should be a place to
do this, and why not this place?

They could rename their newsgroup to comp.lang.c-standard or such like,
but then the group would get far less postings. As such it would become
like alt.sci.math.galois_fields (a random example that came to mind),
which is mostly spam with only sporadic postings that are even slightly
on-topic. No. They are much better off with a name like comp.lang.c so
that the off-topic but non-spam postings at least outnumber the spam
postings. Now if only the regulars could learn to be more friendly and
patient in redirecting newcomers to the groups they really need, but I
am not the behavior police! And anyway, it is fun for lurkers like me
to read postings by those who have such an awkward combination of easily
giving offense and easily taking offense.

Stephen
Dec 27 '07 #35
"Stephen Montgomery-Smith" <st*****@missouri.eduwrote in message
news:Qs******************************@centurytel.net...
Eric Sosman wrote:
>jacob navia wrote:
>>Stop whining and see the sentence in my message:
<quote>
This confirms my arguments about the need to improve the quality
of the standard library!
<end quote>

You wrote: "You can't do *anything* in just standard C."
Do you stand by that statement, or do you retreat from it?
If you stand by it, why are you here?

I think his "anything" was hyperbole, and clearly meant as such! IMHO
hyperbole is a proper form of communication, but does require that the
recipient isn't a pedant!
So 'stop whining' was from Jacob to Dr. Sosman. Tja.

Without alternatives such as Jacob's lcc, standard C, the syntax to which
*all* of ISO's sexiest syntaxes has reference, would be six thousand words
about const. And a thousand points of perfect, republican light.

'Pedant" is a badge he wears proudly. I love using antecedents improperly
as does Jabba. Arizonans don't seem to have a problem with being
systematically wrong.
--
Connecticut Sucks

Liebermann endorses McCain
>>>>[...]
You can't do *anything* in just standard C.

Then why do you bother with this newsgroup? Why do
you waste your time on a powerless language? Why don't
you go away and become a regular on comp.lang.mumps or
comp.lang.apl or any newsgroup devoted to a language you
consider more useful than C? Since C has zero utility
(in your stated estimation), even comp.lang.cobol would
be a forum of more value. Go! Spend your talent on
something more useful than the torment of us poor old
dinosaurs! Go!


Dec 27 '07 #36
backslash null wrote:
"Stephen Montgomery-Smith" <st*****@missouri.eduwrote in message
news:Qs******************************@centurytel.net...
>Eric Sosman wrote:
>>jacob navia wrote:
>>>Stop whining and see the sentence in my message:
<quote>
This confirms my arguments about the need to improve the quality
of the standard library!
<end quote>
You wrote: "You can't do *anything* in just standard C."
Do you stand by that statement, or do you retreat from it?
If you stand by it, why are you here?
I think his "anything" was hyperbole, and clearly meant as such! IMHO
hyperbole is a proper form of communication, but does require that the
recipient isn't a pedant!

So 'stop whining' was from Jacob to Dr. Sosman. Tja.

Without alternatives such as Jacob's lcc, standard C, the syntax to which
*all* of ISO's sexiest syntaxes has reference, would be six thousand words
about const. And a thousand points of perfect, republican light.

'Pedant" is a badge he wears proudly. I love using antecedents improperly
as does Jabba. Arizonans don't seem to have a problem with being
systematically wrong.
I have this feeling that you said something very intelligent here,
probably rubbing my face in the mud. But I confess I don't get it!
Dec 27 '07 #37
[Stephen's reply, whilst long, was well worth reading. I only have comments
to make on a tiny portion of it. Please imagine that, instead of snipping
the rest, I had quoted it all and written <aol>I agree!</aol> underneath.]

Stephen Montgomery-Smith said:
jacob navia wrote:
<snip>
>>
You can't do *anything* in just standard C.


As a newcomer to this group who hasn't even read the FAQ, let me
nevertheless brazenly seek to answer your question.

I think you are correct in that standard C is of somewhat limited value.
*All* tools are of somewhat limited value. I think many people would be
astounded at just how much can be done with standard C, and just how
widely that functionality can be implemented.
But perhaps we should see standard C as perhaps a tool to be embedded
into real C, rather than as an object with value in and of itself.
How do you feel about s/rather than/as well/ - because I think that such a
change reflects reality rather more closely. Certainly for my own part, I
know that my use of what you call "real C" (by which you appear to mean "C
+ non-ISO9899 libraries") is dwarfed by my use of ISO C. Most of the C
programs I write are ISO C programs. Only a very small proportion use
non-ISO9899 libraries.

<snip>

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Dec 27 '07 #38

"jacob navia" <ja***@nospam.comwrote in message
In my "Happy Christmas" message, I proposed a function to read
a file into a RAM buffer and return that buffer or NULL if
the file doesn't exist or some other error is found.

It is interesting to see that the answers to that message prove that
programming exclusively in standard C is completely impossible even
for a small and ridiculously simple program like the one I proposed.
/*
function to slurp in an ASCII file
Params: path - path to file
Returns: malloced string containing whole file
*/
char *loadfile(char *path)
{
FILE *fp;
int ch;
long i = 0;
long size = 0;
char *answer;

fp = fopen(path, "r");
if(!fp)
{
printf("Can't open %s\n", path);
return 0;
}

fseek(fp, 0, SEEK_END);
size = ftell(fp);
fseek(fp, 0, SEEK_SET);

answer = malloc(size + 100);
if(!answer)
{
printf("Out of memory\n");
fclose(fp);
return 0;
}

while( (ch = fgetc(fp)) != EOF)
answer[i++] = ch;

answer[i++] = 0;

fclose(fp);

return answer;
}

This will do it. Add 100 + size/10 for luck if paranoid.
You are right that a perverse implementation can break this, which is a bug
in the standard.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Dec 27 '07 #39

"Julienne Walker" <ha*********@hotmail.comwrote in message #
I get the distinct impression that you're basing these complaints on
requirements that I'm not aware of. Can you give me a formal
description of this function so that I have a better idea of what I'm
dealing with?
It's got to load a text file into a contiguous block of RAM, on any platform
running ANSI standard C.
Implied is that it shouldn't waste memory, make too many passes over the
data, or repeatedly reallocate.
It can't be done, because implementations don't have to return an index from
ftell(). So you need to call fgetc() iteratively to get the size of the
file. However MiniBasic has to load scripts, and the function I used is in
practise good enough.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Dec 27 '07 #40
Bart C wrote:
"user923005" <dc*****@connx.comwrote in message
news:79**********************************@v4g2000hsf.googlegroups.com...
>>jacob navia wrote:
..
>>I just can't imagine a file system that doesn't provide a way
of knowing the length of a file. Maybe there is SOMEWHERE in
the world a crazy file system like that but why should we
care about it?
...
>>Much simpler would be if we had
size_t filesize(FILE *);

I've been using a function like the following:

unsigned int getfilesize(FILE* handle)
{
unsigned int p,size;
p=ftell(handle); /*p=current position*/
fseek(handle,0,2); /*get eof position*/
size=ftell(handle); /*size in bytes*/
fseek(handle,p,0); /*restore file position*/
return size;
}

What is wrong with this, is it non-standard? (Apart from the likely 4Gb
limit)
Several things are wrong with it, even apart from the
possible 64KB limit.

Zeroth, you should have #include'd <stdio.h>. I'll let
you get away with this one, though, on the grounds that since
you're using FILE you probably *have* #include'd it but just
failed to show the inclusion.

First, there's no error checking. None, nada, zero, zip.

Second, ftell() returns a long. When you store the long
value in an unsigned int, the conversion might not preserve
the value; you may end up seeking back to a different place
than you started. (Or, on a text stream, you may invoke
undefined behavior since the value of `p' in the second fseek()
may not be the value ftell() returned.)

Third, what are the magic numbers 2 and 0 that you use
as the third arguments in the fseek() calls? My guess is
that they are the expansions of the macros SEEK_END and
SEEK_CUR on some system you once used, and that you've
decided for some bizarre reason to avoid using the macros.
So the values will be right (one supposes) on that system,
but there's no telling what they might mean on another.

Fourth, for a text stream the value returned by ftell() is
not necessarily a byte count; it is a value with an unspecified
encoding. Calling it a "file size" makes unwarranted assumptions.

Fifth, there's 7.19.9.2p3: "A binary stream need not
meaningfully support fseek calls with a whence value of SEEK_END."
So if SEEK_END expands to the value 2 (see above), the first
ftell() call may be meaningless on a binary stream.

Sixth, for a binary stream there may be an unspecified
number of extraneous zero bytes after the last byte actually
written to the file. (This isn't as bad as the others, because
if you read the file you'll actually be able to read those
zeroes if they are present: They behave as if they're in the
file, even though they may never have been written to it.)

But other than that, it looks pretty good.
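For what it's worth, a sketch that addresses most of those points (a long
return type, the SEEK_* macros, error checks) might look like the following;
it remains subject to the text-stream and SEEK_END caveats above:

#include <stdio.h>

/* Returns the end-of-stream offset in bytes, or -1L on error.
   Only meaningful for binary streams, and even there 7.19.9.2p3 applies. */
long getfilesize(FILE *handle)
{
    long p, size;

    p = ftell(handle);                      /* remember the current position */
    if (p == -1L)
        return -1L;
    if (fseek(handle, 0L, SEEK_END) != 0)   /* go to the end */
        return -1L;
    size = ftell(handle);                   /* offset of the end */
    if (fseek(handle, p, SEEK_SET) != 0)    /* restore the old position */
        return -1L;
    return size;
}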

--
Eric Sosman
es*****@ieee-dot-org.invalid
Dec 27 '07 #41
jacob navia wrote:
Eric Sosman wrote:
> You wrote: "You can't do *anything* in just standard C."
Do you stand by that statement, or do you retreat from it?
If you stand by it, why are you here?

int main(void) { int n = printf("hello\n");}
How much is n?
Answer for the code as shown: Impossible to tell,
because the code needn't even compile under C99 rules,
and invokes undefined behavior in both C90 and C99.

Answer for the code as probably intended: Either
six or an unspecified negative number.
No way to know since the error codes of printf
are NOT standardized. This means that I can only
know that n can be... *ANYTHING*. Maybe it wrote
some characters, then stopped, or whatever!
No, n cannot be "*ANYTHING*". For example, it cannot
be forty-two.
The problem with the lack of standardization of error codes
means that I can't do error checking in a portable way
and thus, no portable program of any importance can be
written that handles the different error situations that
could arise.
No such program can be portable anyhow, since the list
of potential failure modes is system-specific. Do you want
to force an implementation to throw away information about
the cause of a failure, simply to cram its diagnosis into
one least-common-denominator framework of failure codes?
Perhaps you do: I see that your "Happy christmas" effort
diagnoses *every* fopen() failure as "file not found" --
no "file locked by another user," no "too many open files,"
no "insufficient memory," no "permission denied," just "file
not found." (Well, at least you're following an established
precedent: "Tapes? What tapes? There are no such tapes, and
besides, we burned 'em.")
In normal software, you *are* interested into why this program/function
call failed. You can't portably do that in standard C;
Right. When you have enumerated all the failure conditions
for all the file systems that C has run on, runs on today, or
will run on in the future, then you can talk about a comprehensive
and portable encoding scheme for them.
You can't even know the size of a file without reading it all.
This is true, and sometimes a problem. Not usually, but
sometimes.
A bit of more functionality would be better for all of us. But
if I am in this group obviously, it is not because I
believe standard C is useless but because I want to fix some
problems with it.
Either you don't comprehend the difficulty, or you have
seen a way to solve it that has eluded a lot of other people.
The latter would be better for everyone (if you're willing to
share the solution under not-too-expensive terms), but from
the content of your posts over the years I greatly fear that
the case is the former.
Does this answer your question?
No. You made a blanket, all-inclusive statement that
"You can't do *anything* in just standard C," and I asked
whether you stood by it or would retreat from it. You have
still neither affirmed nor recanted your claim.

--
Eric Sosman
es*****@ieee-dot-org.invalid
Dec 27 '07 #42
jacob navia wrote:
fpos_t filesize(FILE *);

would be useful isn't it?
On my system fpos_t isn't an integer. It isn't an arithmetic type, either.
It isn't a scalar, either.
How do I convert an object whose type looks like
typedef struct
{
__off_t __pos;
__mbstate_t __state;
} _G_fpos_t;
typedef _G_fpos_t fpos_t;
to a number?

--
Army1987 (Replace "NOSPAM" with "email")
Dec 27 '07 #43
army1987 wrote:
jacob navia wrote:
>fpos_t filesize(FILE *);

would be useful isn't it?
On my system fpos_t isn't an integer. It isn't an arithmetic type, either.
It isn't a scalar, either.
How do I convert an object whose type looks like
typedef struct
{
__off_t __pos;
__mbstate_t __state;
} _G_fpos_t;
typedef _G_fpos_t fpos_t;
to a number?
you convert the __pos member into a long long.
Read the docs, maybe you are interested in the
mbstate member, maybe not.

In any case I would say that a long long
result would be a better return type.

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 27 '07 #44
jacob navia said:
Eric Sosman wrote:
>jacob navia wrote:
>>Eric Sosman wrote:
You wrote: "You can't do *anything* in just standard C."
Do you stand by that statement, or do you retreat from it?
If you stand by it, why are you here?
int main(void) { int n = printf("hello\n");}
How much is n?

Answer for the code as shown: Impossible to tell,
because the code needn't even compile under C99 rules,
and invokes undefined behavior in both C90 and C99.

WOW. How clever you are.
Sarcasm doesn't work very well when you're in the wrong. If you don't want
people to post blindingly obvious corrections to your code, don't make
blindingly obvious mistakes.

<snip>
You establish a false alternative. If somebody asks for
better standardization of error codes, you say that the
alternatives are
o NOTHING (no standardization at all)
o a comprehensive error list of all possible error codes.

The OBVIOUS alternative of standardizing the most common ones
(IO error, not enough memory, incorrect argument, etc)
and leaving to the implementation to return more explicit error codes
is not at all considered...
On the contrary, that's what ISO did. That's why we have EDOM and ERANGE.
The difference between what you suggest and what they actually
standardised is mere haggling over where to draw the line. If you want
more error codes added to the Standard, lobby ISO to that effect.
Complaining about it in comp.lang.c won't achieve anything, because
comp.lang.c doesn't write the Standard.
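For reference, a minimal sketch of what the Standard does give you today,
EDOM and ERANGE from <errno.h> for the math library; whether errno is
actually set depends on math_errhandling in C99, so even this much is
hedged:

#include <errno.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    double r, big;

    errno = 0;
    r = sqrt(-1.0);                 /* domain error */
    if (errno == EDOM)
        printf("sqrt(-1.0): EDOM, result %g\n", r);

    errno = 0;
    big = exp(1e6);                 /* range error (overflow) */
    if (errno == ERANGE)
        printf("exp(1e6): ERANGE, result %g\n", big);

    return 0;
}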

<snip>

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Dec 27 '07 #45
Malcolm McLean wrote, On 27/12/07 12:12:
>
"jacob navia" <ja***@nospam.comwrote in message
>In my "Happy Christmas" message, I proposed a function to read
a file into a RAM buffer and return that buffer or NULL if
the file doesn't exist or some other error is found.

It is interesting to see that the answers to that message prove that
programming exclusively in standard C is completely impossible even
for a small and ridiculously simple program like the one I proposed.
/*
function to slurp in an ASCII file
Params: path - path to file
Returns: malloced string containing whole file
*/
char *loadfile(char *path)
{
FILE *fp;
int ch;
long i = 0;
long size = 0;
char *answer;

fp = fopen(path, "r");
OK, you got the mode right for the file so you've done better than Jacob.
if(!fp)
{
printf("Can't open %s\n", path);
return 0;
}

fseek(fp, 0, SEEK_END);
You should check for success.
size = ftell(fp);
Using a method you know is not portable is hardly the best way to answer
Jacob's challenge.
fseek(fp, 0, SEEK_SET);

answer = malloc(size + 100);
if(!answer)
{
printf("Out of memory\n");
fclose(fp);
return 0;
You should try for consistent indenting.
}

while( (ch = fgetc(fp)) != EOF)
answer[i++] = ch;
This could overrun your buffer since you don't check.
answer[i++] = 0;

fclose(fp);

return answer;
}

This will do it. Add 100 + size/10 for luck if paranoid.
You are right that a perverse implementation can break this, which is a
bug in the standard.
Or a limitation due to the limitations of existing systems.

Of course, if you had bothered to add in a few simple checks you could
have produced a solution that would work for files up to the maximum
size of block that can be allocated. So get your best guess of the file
size and then expand the buffer if the file turns out to be larger (or
the fseek or ftell failed) and optionally shrink it down at the end.
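A minimal sketch of that grow-and-shrink approach, in standard C only and
with no attempt to guess the size up front:

#include <stdio.h>
#include <stdlib.h>

/* Read the whole of an already-opened stream into a malloc'd, NUL-terminated
   buffer, growing as needed.  Returns NULL on allocation failure. */
char *slurp(FILE *fp, size_t *len)
{
    size_t cap = 4096, used = 0;
    char *buf = malloc(cap), *tmp;
    int ch;

    if (buf == NULL)
        return NULL;
    while ((ch = fgetc(fp)) != EOF) {
        if (used + 1 >= cap) {              /* keep room for the terminator */
            cap *= 2;
            tmp = realloc(buf, cap);
            if (tmp == NULL) { free(buf); return NULL; }
            buf = tmp;
        }
        buf[used++] = (char)ch;
    }
    buf[used] = '\0';
    tmp = realloc(buf, used + 1);           /* shrink to fit; keep old on failure */
    if (tmp != NULL)
        buf = tmp;
    if (len != NULL)
        *len = used;
    return buf;
}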

Since the systems I work with can have larger files than the total of
physical+virtual memory such a function is of no real use to me.
--
Flash Gordon
Dec 27 '07 #46
Richard Heathfield wrote:
[Stephen's reply, whilst long, was well worth reading. I only have comments
to make on a tiny portion of it. Please imagine that, instead of snipping
the rest, I had quoted it all and written <aol>I agree!</aol> underneath.]

Stephen Montgomery-Smith said:
>jacob navia wrote:
<snip>
>>You can't do *anything* in just standard C.

As a newcomer to this group who hasn't even read the FAQ, let me
nevertheless brazenly seek to answer your question.

I think you are correct in that standard C is of somewhat limited value.

*All* tools are of somewhat limited value. I think many people would be
astounded at just how much can be done with standard C, and just how
widely that functionality can be implemented.
> But perhaps we should see standard C as perhaps a tool to be embedded
into real C, rather than as an object with value in and of itself.

How do you feel about s/rather than/as well/ - because I think that such a
change reflects reality rather more closely. Certainly for my own part, I
know that my use of what you call "real C" (by which you appear to mean "C
+ non-ISO9899 libraries") is dwarfed by my use of ISO C. Most of the C
programs I write are ISO C programs. Only a very small proportion use
non-ISO9899 libraries.
Of course, you are correct.

But to reiterate my points - many years ago I used to program in PASCAL.
The problem was PASCAL had certain limitations, and so to overcome
them every implementation had to have certain non-standard extensions.

Then I switched to C. C also has limitations, because a programming
language simply cannot cover every eventuality that a user or OS might
need. But C was defined in a sufficiently ambiguous manner that all the
extensions were permitted by the standard, and one still had standard C.
Somehow the inventors of C (and their successor standards bodies)
attained that delicate balance, because of course to be too ambiguous
would be just as bad as being too strict.

Another thing about C - somehow it is easy to use. PASCAL, I remember,
was very klunky, and it took too many typestrokes to accomplish
something very simple. Next, the other day, a friend sent me a program
written in FORTRAN, and I simply couldn't read it! And this program was
performing numerical analysis, something that, while perhaps
mathematically difficult, is simple from a programming point of view.
On the other hand, I can read C code for OS internals, minimally
commented, and as long as I know broadly what the code is meant to do,
it reads very easily.

Stephen
Dec 27 '07 #47
What do I mean with error analysis?

Something like this
FOPEN
[snip]

ERRORS
The fopen() function shall fail if:
[EACCES]
Search permission is denied on a component of the path prefix, or the
file exists and the permissions specified by mode are denied, or the
file does not exist and write permission is denied for the parent
directory of the file to be created.
[EINTR]
A signal was caught during fopen().
[EISDIR]
The named file is a directory and mode requires write access.
[ELOOP]
A loop exists in symbolic links encountered during resolution of the
path argument.
[EMFILE]
{OPEN_MAX} file descriptors are currently open in the calling process.
[ENAMETOOLONG]
The length of the filename argument exceeds {PATH_MAX} or a pathname
component is longer than {NAME_MAX}.
[ENFILE]
The maximum allowable number of files is currently open in the system.
[ENOENT]
A component of filename does not name an existing file or filename is an
empty string.
[ENOSPC]
The directory or file system that would contain the new file cannot be
expanded, the file does not exist, and the file was to be created.
[ENOTDIR]
A component of the path prefix is not a directory.
[ENXIO]
The named file is a character special or block special file, and the
device associated with this special file does not exist.
[EOVERFLOW]
The named file is a regular file and the size of the file cannot be
represented correctly in an object of type off_t.
[EROFS]
The named file resides on a read only file system and write access was
specified.

You see?
An implementation would be allowed to extend these errors, but we could
portably test for a certain kind of error.

To test if a file does not exist I could test for ENOENT when I try
to open it. I could test EISDIR to see if this file is a directory...
etc etc!
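On a POSIX system such a test is short; a sketch, remembering that ISO C
itself defines neither ENOENT nor EACCES and does not even require fopen()
to set errno:

#include <errno.h>
#include <stdio.h>
#include <string.h>

FILE *open_or_explain(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        if (errno == ENOENT)              /* POSIX: no such file or directory */
            fprintf(stderr, "%s: does not exist\n", path);
        else if (errno == EACCES)         /* POSIX: permission denied */
            fprintf(stderr, "%s: permission denied\n", path);
        else
            fprintf(stderr, "%s: %s\n", path, strerror(errno));
    }
    return f;
}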
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 27 '07 #48

"Flash Gordon" <sp**@flash-gordon.me.ukwrote in message
Malcolm McLean wrote, On 27/12/07 12:12:
>>
fseek(fp, 0, SEEK_END);

You should check for success.
> size = ftell(fp);

Using a method you know is not portable is hardly the best way to answer
Jacob's challenge.
The code is designed to be used in a production environment, and it is
adequate for that. It reads in a MiniBasic script file. If the file is huge
the function will fail, but the interpreter will choke on such an input
anyway.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm
Dec 27 '07 #49
Malcolm McLean wrote, On 27/12/07 12:21:
>
"Julienne Walker" <ha*********@hotmail.comwrote in message #
>I get the distinct impression that you're basing these complaints on
requirements that I'm not aware of. Can you give me a formal
description of this function so that I have a better idea of what I'm
dealing with?
It's got to load a text file into a contiguous block of RAM, on any
platform running ANSI standard C.
Easy for reasonably sized files where it is possible, not possible if
the file is larger than the memory available to the process.
Implied is that it shouldn't waste memory, make too many passes over the
data, or repeatedly reallocate.
Those are not implied by the initial statement of requirements. They
also make it impossible even if you leave behind the strictures of
standard C, since the only way to avoid waste memory is to find the file
size, and on Windows (to take one example) the only way to find the
space required is to do a complete scan of the file since Windows uses 2
bytes in a file to indicate the end of a line and can signal the end of
a text file with another byte at *any* point in the physical file. So
the impossibility is nothing to do with C but everything to do with the
way *common* systems work.
It can't be done, because implementations don't have to return an index
from ftell().
That is a limitation of C because it is a limitation of some of the
underlying systems C runs on, such as Windows.
So you need to call fgetc() iteratively to get the size of
the file.
Any "getfilesize()" function that worked "correctly" for text files on
Windows (i.e. reported the number of characters you can read if the file
is not modified) would have to read the file a byte at a time anyway.
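That byte-at-a-time reading is about this much code; portable, but it really
does touch every character:

#include <stdio.h>

/* Count the characters a text stream will actually deliver.  This is the
   only portable way to learn the "text size" on systems that translate
   line endings, and it costs a full pass over the file. */
long text_length(const char *path)
{
    FILE *f = fopen(path, "r");
    long n = 0;

    if (f == NULL)
        return -1L;
    while (fgetc(f) != EOF)
        n++;
    fclose(f);
    return n;
}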
However MiniBasic has to load scripts, and the function I used
is in practise good enough.
Well, I've claimed that writing a function that can (subject to system
limitations) read an entire text file is not hard, so I'm not surprised
by your claim.
--
Flash Gordon
Dec 27 '07 #50
