Programming in standard C

In my "Happy Christmas" message, I proposed a function to read
a file into a RAM buffer and return that buffer or NULL if
the file doesn't exist or some other error is found.

It is interesting to see that the answers to that message prove that
programming exclusively in standard C is completely impossible even
for a small and ridiculously simple program like the one I proposed.

1) I read the file contents in binary mode, which should allow me
to use ftell/fseek to determine the file size.

No objections to this were raised, except of course the obvious
one, if the "file" was some file associated with stdin, for
instance under some unix machine /dev/tty01 or similar...

I did not test for this since it is impossible in standard C:
isatty() is not in the standard.

2) There is NO portable way to determine which characters should be
ignored when transforming a binary file into a text file. One
reader (CB Falconer) proposed to open the file in binary mode
and then in text mode and compare the two buffers to see which
characters were missing... Well, that would be too expensive.

3) I used different values for errno defined by POSIX, but not by
the C standard, that defines only a few. Again, error handling
is not something important to be standardized, according to
the committee. errno is there but its usage is absolutely
not portable at all and goes immediately beyond what standard C
offers.

We hear again and again that this group is about standard C *"ONLY"*.
Could someone here then, tell me how this simple program could be
written in standard C?

This confirms my arguments about the need to improve the quality
of the standard library!

You can't do *anything* in just standard C.
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 26 '07
Bart C wrote:
"user923005" <dc*****@connx.com> wrote in message
news:79******** *************** ***********@v4g2000hsf.googlegroups.com...
>>jacob navia wrote:
..
>>I just can't imagine a file system that doesn't provide a way
of knowing the length of a file. Maybe there is SOMEWHERE in
the world a crazy file system like that but why should we
care about it?
...
>>Much simpler would be if we had
size_t filesize(FILE *);

I've been using a function like the following:

unsigned int getfilesize(FILE* handle)
{
unsigned int p,size;
p=ftell(handle); /*p=current position*/
fseek(handle,0,2); /*get eof position*/
size=ftell(handle); /*size in bytes*/
fseek(handle,p,0); /*restore file position*/
return size;
}

What is wrong with this, is it non-standard? (Apart from the likely 4Gb
limit)
Several things are wrong with it, even apart from the
possible 64KB limit.

Zeroth, you should have #include'd <stdio.h>. I'll let
you get away with this one, though, on the grounds that since
you're using FILE you probably *have* #include'd it but just
failed to show the inclusion.

First, there's no error checking. None, nada, zero, zip.

Second, ftell() returns a long. When you store the long
value in an unsigned int, the conversion might not preserve
the value; you may end up seeking back to a different place
than you started. (Or, on a text stream, you may invoke
undefined behavior since the value of `p' in the second fseek()
may not be the value ftell() returned.)

Third, what are the magic numbers 2 and 0 that you use
as the third arguments in the fseek() calls? My guess is
that they are the expansions of the macros SEEK_END and
SEEK_SET on some system you once used, and that you've
decided for some bizarre reason to avoid using the macros.
So the values will be right (one supposes) on that system,
but there's no telling what they might mean on another.

Fourth, for a text stream the value returned by ftell() is
not necessarily a byte count; it is a value with an unspecified
encoding. Calling it a "file size" makes unwarranted assumptions.

Fifth, there's 7.19.9.2p3: "A binary stream need not
meaningfully support fseek calls with a whence value of SEEK_END."
So if SEEK_END expands to the value 2 (see above), the first
fseek() call may be meaningless on a binary stream.

Sixth, for a binary stream there may be an unspecified
number of extraneous zero bytes after the last byte actually
written to the file. (This isn't as bad as the others, because
if you read the file you'll actually be able to read those
zeroes if they are present: They behave as if they're in the
file, even though they may never have been written to it.)

But other than that, it looks pretty good.
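
For comparison, here is a minimal sketch of the same idea with the first three objections addressed: results are checked, positions stay in a long, and the SEEK_ macros replace the magic numbers. The name stream_size is hypothetical and does not come from the thread. The sketch does nothing about the fourth, fifth and sixth points, so the value is a byte count only where the implementation happens to make it one.

```c
#include <stdio.h>

/* Hypothetical helper: returns the offset of end-of-file for a stream
   opened in binary mode, or -1L if any step fails. The original file
   position is restored before returning. */
long stream_size(FILE *fp)
{
    long saved, size;

    saved = ftell(fp);                      /* remember where we were */
    if (saved == -1L)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) != 0)       /* may be unsupported (7.19.9.2p3) */
        return -1L;
    size = ftell(fp);                       /* -1L on failure */
    if (fseek(fp, saved, SEEK_SET) != 0)    /* put the position back */
        return -1L;
    return size;
}
```

A caller still has to treat -1L as "size unknown" and fall back to reading the stream.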

--
Eric Sosman
es*****@ieee-dot-org.invalid
Dec 27 '07 #41
jacob navia wrote:
Eric Sosman wrote:
> You wrote: "You can't do *anything* in just standard C."
Do you stand by that statement, or do you retreat from it?
If you stand by it, why are you here?

int main(void) { int n = printf("hello\n");}
How much is n?
Answer for the code as shown: Impossible to tell,
because the code needn't even compile under C99 rules,
and invokes undefined behavior in both C90 and C99.

Answer for the code as probably intended: Either
six or an unspecified negative number.
No way to know since the error codes of printf
are NOT standardized. This means that I can only
know that n can be... *ANYTHING*. Maybe it wrote
some characters, then stopped, or whatever!
No, n cannot be "*ANYTHING*". For example, it cannot
be forty-two.
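
What a caller can portably distinguish here can be sketched as follows. The helper name say_hello is made up for illustration; the only facts relied on are that fprintf returns the number of characters transmitted, or a negative value if an output or encoding error occurred.

```c
#include <stdio.h>

/* Hypothetical helper: writes "hello\n" to out.
   Returns the number of characters written (6 on success),
   or -1 if fprintf reported an error. */
int say_hello(FILE *out)
{
    int n = fprintf(out, "hello\n");

    if (n < 0)
        return -1;   /* which negative value fprintf returned is unspecified */
    return n;        /* count of characters transmitted */
}
```

The sketch collapses every failure into -1 precisely because the standard says nothing more about the negative value.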
The problem with the lack of standardization of error codes
means that I can't do error checking in a portable way
and thus, no portable program of any importance can be
written that handles the different error situations that
could arise.
No such program can be portable anyhow, since the list
of potential failure modes is system-specific. Do you want
to force an implementation to throw away information about
the cause of a failure, simply to cram its diagnosis into
one least-common-denominator framework of failure codes?
Perhaps you do: I see that your "Happy christmas" effort
diagnoses *every* fopen() failure as "file not found" --
no "file locked by another user," no "too many open files,"
no "insufficient memory," no "permission denied," just "file
not found." (Well, at least you're following an established
precedent: "Tapes? What tapes? There are no such tapes, and
besides, we burned 'em.")
In normal software, you *are* interested in why this program/function
call failed. You can't portably do that in standard C;
Right. When you have enumerated all the failure conditions
for all the file systems that C has run on, runs on today, or
will run on in the future, then you can talk about a comprehensive
and portable encoding scheme for them.
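
A middle ground that stays within the standard library is to report whatever the implementation left in errno, without testing for any particular code. This is a sketch under assumptions: ISO C does not require fopen to set errno at all, the text returned by strerror is implementation-defined, and the name open_or_report is hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: opens path with the given mode. On failure it
   writes a diagnostic to log and returns NULL. */
FILE *open_or_report(const char *path, const char *mode, FILE *log)
{
    FILE *fp;

    errno = 0;                       /* so a stale value is not reported */
    fp = fopen(path, mode);
    if (fp == NULL) {
        int saved = errno;           /* fprintf below may change errno */

        if (saved != 0)
            fprintf(log, "cannot open %s: %s\n", path, strerror(saved));
        else                         /* the implementation gave no reason */
            fprintf(log, "cannot open %s\n", path);
    }
    return fp;
}
```

The program cannot branch portably on the cause, but the user at least sees the system's own description of it.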
You can't even know the size of a file without reading it all.
This is true, and sometimes a problem. Not usually, but
sometimes.
A bit more functionality would be better for all of us. But
if I am in this group obviously, it is not because I
believe standard C is useless but because I want to fix some
problems with it.
Either you don't comprehend the difficulty, or you have
seen a way to solve it that has eluded a lot of other people.
The latter would be better for everyone (if you're willing to
share the solution under not-too-expensive terms), but from
the content of your posts over the years I greatly fear that
the case is the former.
Does this answer your question?
No. You made a blanket, all-inclusive statement that
"You can't do *anything* in just standard C," and I asked
whether you stood by it or would retreat from it. You have
still neither affirmed nor recanted your claim.

--
Eric Sosman
es*****@ieee-dot-org.invalid
Dec 27 '07 #42
jacob navia wrote:
fpos_t filesize(FILE *);

would be useful isn't it?
On my system fpos_t isn't an integer. It isn't an arithmetic type, either.
It isn't a scalar, either.
How do I convert an object whose type looks like
typedef struct
{
__off_t __pos;
__mbstate_t __state;
} _G_fpos_t;
typedef _G_fpos_t fpos_t;
to a number?
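
In portable code an fpos_t is only ever handed back to fsetpos(), never converted to anything. A minimal sketch of that intended use, with the hypothetical name peek_char:

```c
#include <stdio.h>

/* Hypothetical helper: returns the next character of fp without
   changing the stream position, or EOF at end of file or on error.
   The fpos_t is treated as opaque: it is filled in by fgetpos and
   passed back to fsetpos, nothing else. */
int peek_char(FILE *fp)
{
    fpos_t pos;
    int ch;

    if (fgetpos(fp, &pos) != 0)
        return EOF;
    ch = getc(fp);
    if (fsetpos(fp, &pos) != 0)   /* restore the saved position */
        return EOF;
    return ch;
}
```

This works whether fpos_t is an integer, a struct, or something else, which is exactly why it cannot double as a file size.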

--
Army1987 (Replace "NOSPAM" with "email")
Dec 27 '07 #43
army1987 wrote:
jacob navia wrote:
>fpos_t filesize(FILE *);

would be useful isn't it?
On my system fpos_t isn't an integer. It isn't an arithmetic type, either.
It isn't a scalar, either.
How do I convert an object whose type looks like
typedef struct
{
__off_t __pos;
__mbstate_t __state;
} _G_fpos_t;
typedef _G_fpos_t fpos_t;
to a number?
you convert the __pos member into a long long.
Read the docs, maybe you are interested in the
mbstate member, maybe not.

In any case I would say that a long long
result would be a better return type.

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 27 '07 #44
jacob navia said:
Eric Sosman wrote:
>jacob navia wrote:
>>Eric Sosman wrote:
You wrote: "You can't do *anything* in just standard C."
Do you stand by that statement, or do you retreat from it?
If you stand by it, why are you here?
int main(void) { int n = printf("hello\n");}
How much is n?

Answer for the code as shown: Impossible to tell,
because the code needn't even compile under C99 rules,
and invokes undefined behavior in both C90 and C99.

WOW. How clever you are.
Sarcasm doesn't work very well when you're in the wrong. If you don't want
people to post blindingly obvious corrections to your code, don't make
blindingly obvious mistakes.

<snip>
You establish a false alternative. If somebody asks for
better standardization of error codes, you say that the
alternatives are
o NOTHING (no standardization at all)
o a comprehensive error list of all possible error codes.

The OBVIOUS alternative of standardizing the most common ones
(IO error, not enough memory, incorrect argument, etc)
and leaving to the implementation to return more explicit error codes
is not at all considered...
On the contrary, that's what ISO did. That's why we have EDOM and ERANGE.
The difference between what you suggest and what they actually
standardised is mere haggling over where to draw the line. If you want
more error codes added to the Standard, lobby ISO to that effect.
Complaining about it in comp.lang.c won't achieve anything, because
comp.lang.c doesn't write the Standard.
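
As an illustration of the error codes the Standard does define, here is a small sketch that uses ERANGE with strtol, one of the few places where ISO C specifies the errno value. The function name parse_long and its return codes are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: converts the leading decimal number in s.
   Returns 0 and stores the value in *out on success,
   -1 if s does not start with a number,
   -2 if the value does not fit in a long. */
int parse_long(const char *s, long *out)
{
    char *end;
    long value;

    errno = 0;                    /* strtol only sets errno, never clears it */
    value = strtol(s, &end, 10);
    if (end == s)
        return -1;                /* no conversion was performed */
    if (errno == ERANGE)
        return -2;                /* result clamped to LONG_MIN or LONG_MAX */
    *out = value;
    return 0;
}
```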

<snip>

--
Richard Heathfield <http://www.cpax.org.uk >
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Dec 27 '07 #45
Malcolm McLean wrote, On 27/12/07 12:12:
>
"jacob navia" <ja***@nospam.com> wrote in message
>In my "Happy Christmas" message, I proposed a function to read
a file into a RAM buffer and return that buffer or NULL if
the file doesn't exist or some other error is found.

It is interesting to see that the answers to that message prove that
programming exclusively in standard C is completely impossible even
for a small and ridiculously simple program like the one I proposed.
/*
function to slurp in an ASCII file
Params: path - path to file
Returns: malloced string containing whole file
*/
char *loadfile(char *path)
{
FILE *fp;
int ch;
long i = 0;
long size = 0;
char *answer;

fp = fopen(path, "r");
OK, you got the mode right for the file so you've done better than Jacob.
if(!fp)
{
printf("Can't open %s\n", path);
return 0;
}

fseek(fp, 0, SEEK_END);
You should check for success.
size = ftell(fp);
Using a method you know is not portable is hardly the best way to answer
Jacob's challenge.
fseek(fp, 0, SEEK_SET);

answer = malloc(size + 100);
if(!answer)
{
printf("Out of memory\n");
fclose(fp);
return 0;
You should try for consistent indenting.
}

while( (ch = fgetc(fp)) != EOF)
answer[i++] = ch;
This could overrun your buffer since you don't check.
answer[i++] = 0;

fclose(fp);

return answer;
}

This will do it. Add 100 + size/10 for luck if paranoid.
You are right that a perverse implementation can break this, which is a
bug in the standard.
Or a limitation due to the limitations of existing systems.

Of course, if you had bothered to add in a few simple checks you could
have produced a solution that would work for files up to the maximum
size of block that can be allocated. So get your best guess of the file
size and then expand the buffer if the file turns out to be larger (or
the fseek or ftell failed) and optionally shrink it down at the end.
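
The approach described in this paragraph might look roughly like the sketch below. It is one possible reading, not code from the thread: the name load_text_file, the 1024-byte starting capacity and the doubling strategy are all assumptions. The fseek/ftell result is used purely as a hint, so a wrong or missing value costs only extra reallocations.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a whole text file into a malloc'ed,
   NUL-terminated buffer. Returns NULL on any failure. If len_out is
   not NULL it receives the number of characters read. */
char *load_text_file(const char *path, size_t *len_out)
{
    FILE *fp = fopen(path, "r");
    char *buf, *tmp;
    size_t cap = 1024, len = 0;
    long hint;

    if (fp == NULL)
        return NULL;

    /* Best guess at the size; failures just leave the default capacity. */
    if (fseek(fp, 0L, SEEK_END) == 0) {
        hint = ftell(fp);
        if (hint > 0 && (unsigned long)hint < (size_t)-1 - 1)
            cap = (size_t)hint + 2;   /* data + terminator + 1 spare byte */
    }
    rewind(fp);

    buf = malloc(cap);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    for (;;) {
        /* Always leave one byte free for the terminator. */
        len += fread(buf + len, 1, cap - len - 1, fp);
        if (len < cap - 1)            /* short read: end of file or error */
            break;
        if (cap > (size_t)-1 / 2 || (tmp = realloc(buf, cap * 2)) == NULL) {
            free(buf);
            fclose(fp);
            return NULL;
        }
        buf = tmp;                    /* the guess was too small: grow */
        cap *= 2;
    }

    if (ferror(fp)) {
        free(buf);
        fclose(fp);
        return NULL;
    }
    fclose(fp);

    buf[len] = '\0';
    tmp = realloc(buf, len + 1);      /* optional shrink to the exact size */
    if (tmp != NULL)
        buf = tmp;
    if (len_out != NULL)
        *len_out = len;
    return buf;
}
```

The caller frees the result with free(). The function still fails for files larger than the biggest block malloc can provide, which is the limitation noted above.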

Since the systems I work with can have larger files than the total of
physical+virtual memory, such a function is of no real use to me.
--
Flash Gordon
Dec 27 '07 #46
Richard Heathfield wrote:
[Stephen's reply, whilst long, was well worth reading. I only have comments
to make on a tiny portion of it. Please imagine that, instead of snipping
the rest, I had quoted it all and written <aol>I agree!</aol> underneath.]

Stephen Montgomery-Smith said:
>jacob navia wrote:
<snip>
>>You can't do *anything* in just standard C.

As a newcomer to this group who hasn't even read the FAQ, let me
nevertheless brazenly seek to answer your question.

I think you are correct in that standard C is of somewhat limited value.

*All* tools are of somewhat limited value. I think many people would be
astounded at just how much can be done with standard C, and just how
widely that functionality can be implemented.
> But perhaps we should see standard C as perhaps a tool to be embedded
into real C, rather than as an object with value in of itself.

How do you feel about s/rather than/as well/ - because I think that such a
change reflects reality rather more closely. Certainly for my own part, I
know that my use of what you call "real C" (by which you appear to mean "C
+ non-ISO9899 libraries") is dwarfed by my use of ISO C. Most of the C
programs I write are ISO C programs. Only a very small proportion use
non-ISO9899 libraries.
Of course, you are correct.

But to reiterate my points - many years ago I used to program in PASCAL.
The problem was PASCAL had certain limitations, and so to overcome
them every implementation had to have certain non-standard extensions.

Then I switched to C. C also has limitations, because a programming
language simply cannot cover every eventuality that a user or OS might
need. But C was defined in a sufficiently ambiguous manner that all the
extensions were permitted by the standard, and one still had standard C.
Somehow the inventors of C (and their successor standards bodies)
attained that delicate balance, because of course to be too ambiguous
would be just as bad as being too strict.

Another thing about C - somehow it is easy to use. PASCAL, I remember,
was very klunky, and it took too many typestrokes to accomplish
something very simple. Next, the other day, a friend sent me a program
written in FORTRAN, and I simply couldn't read it! And this program
was performing numerical analysis, something that while perhaps
mathematically difficult, is simple from a programming point of view.
On the other hand, I can read C code for OS internals, minimally
commented, and as long as I know broadly what the code is meant to do,
it reads very easily.

Stephen
Dec 27 '07 #47
What do I mean with error analysis?

Something like this
FOPEN
[snip]

ERRORS
The fopen() function shall fail if:
[EACCES]
Search permission is denied on a component of the path prefix, or the
file exists and the permissions specified by mode are denied, or the
file does not exist and write permission is denied for the parent
directory of the file to be created.
[EINTR]
A signal was caught during fopen().
[EISDIR]
The named file is a directory and mode requires write access.
[ELOOP]
A loop exists in symbolic links encountered during resolution of the
path argument.
[EMFILE]
{OPEN_MAX} file descriptors are currently open in the calling process.
[ENAMETOOLONG]
The length of the filename argument exceeds {PATH_MAX} or a pathname
component is longer than {NAME_MAX}.
[ENFILE]
The maximum allowable number of files is currently open in the system.
[ENOENT]
A component of filename does not name an existing file or filename is an
empty string.
[ENOSPC]
The directory or file system that would contain the new file cannot be
expanded, the file does not exist, and the file was to be created.
[ENOTDIR]
A component of the path prefix is not a directory.
[ENXIO]
The named file is a character special or block special file, and the
device associated with this special file does not exist.
[EOVERFLOW]
The named file is a regular file and the size of the file cannot be
represented correctly in an object of type off_t.
[EROFS]
The named file resides on a read only file system and write access was
specified.

You see?
An implementation would be allowed to extend these errors, but we could
portably test for a certain kind of error.

To test if a file does not exist I could test for ENOENT when I try
to open it. I could test EISDIR to see if this file is a directory...
etc etc!
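
A sketch of the kind of test meant here. ENOENT comes from POSIX, not from ISO C, so the check is guarded with #ifdef and the code falls back to a generic result where the macro is missing. The enum and the function name classify_open are hypothetical.

```c
#include <errno.h>
#include <stdio.h>

enum open_result { OPEN_OK, OPEN_MISSING, OPEN_OTHER };

/* Hypothetical helper: tries to open path for reading and says why
   it failed, as far as the platform lets us tell. */
enum open_result classify_open(const char *path)
{
    FILE *fp;

    errno = 0;
    fp = fopen(path, "r");
    if (fp != NULL) {
        fclose(fp);
        return OPEN_OK;
    }
#ifdef ENOENT                 /* defined by POSIX, not by the C standard */
    if (errno == ENOENT)
        return OPEN_MISSING;  /* some component of the name does not exist */
#endif
    return OPEN_OTHER;        /* any other failure, or no information */
}
```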
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 27 '07 #48

"Flash Gordon" <sp**@flash-gordon.me.uk> wrote in message
Malcolm McLean wrote, On 27/12/07 12:12:
>>
fseek(fp, 0, SEEK_END);

You should check for success.
> size = ftell(fp);

Using a method you know is not portable is hardly the best way to answer
Jacob's challenge.
The code is designed to be used in a production environment, and it is
adequate for that. It reads in a MiniBasic script file. If the file is huge
the function will fail, but the interpreter will choke on such an input
anyway.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm
Dec 27 '07 #49
Malcolm McLean wrote, On 27/12/07 12:21:
>
"Julienne Walker" <ha*********@hotmail.com> wrote in message
>I get the distinct impression that you're basing these complaints on
requirements that I'm not aware of. Can you give me a formal
description of this function so that I have a better idea of what I'm
dealing with?
It's got to load a text file into a contiguous block of RAM, on any
platform running ANSI standard C.
Easy for reasonably sized files where it is possible, not possible if
the file is larger than the memory available to the process.
Implied is that it shouldn't waste memory, make too many passes over the
data, or repeatedly reallocate.
Those are not implied by the initial statement of requirements. They
also make it impossible even if you leave behind the strictures of
standard C, since the only way to avoid wasting memory is to find the file
size, and on Windows (to take one example) the only way to find the
space required is to do a complete scan of the file since Windows uses 2
bytes in a file to indicate the end of a line and can signal the end of
a text file with another byte at *any* point in the physical file. So
the impossibility is nothing to do with C but everything to do with the
way *common* systems work.
It can't be done, because implementations don't have to return an index
from ftell().
That is a limitation of C because it is a limitation of some of the
underlying systems C runs on, such as Windows.
So you need to call fgetc() iteratively to get the size of
the file.
Any "getfilesize()" function that worked "correctly" for text files on
Windows (i.e. reported the number of characters you can read if the file
is not modified) would have to read the file a byte at a time anyway.
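
Such a function can be sketched in a few lines of standard C. The name count_text_chars is hypothetical, and the sketch assumes the caller has opened the stream in text mode and does not mind the position ending up at end of file.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: counts the characters that can be read from fp,
   starting at the current position. Returns -1L on a read error or if
   the count would not fit in a long. */
long count_text_chars(FILE *fp)
{
    long count = 0;

    while (getc(fp) != EOF) {
        if (count == LONG_MAX)
            return -1L;          /* too many characters to report */
        count++;
    }
    return ferror(fp) ? -1L : count;
}
```

The count is what a later read in the same mode will deliver, which on a text stream can differ from the number of bytes stored on disk.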
However MiniBasic has to load scripts, and the function I used
is in practise good enough.
Well, I've claimed that writing a function that can (subject to system
limitations) read an entire text file is not hard, so I'm not surprised
by your claim.
--
Flash Gordon
Dec 27 '07 #50
