Why does GCC warn me when I use the gets() function for accessing a file?

After compiling the source code with gcc v.4.1.1, I got a warning
message:
"/tmp/ccixzSIL.o: In function 'main';ex.c: (.text+0x9a): warning: the
'gets' function is dangerous and should not be used."

Could anybody tell me why the gets() function is dangerous?
Thank you very much.

Cuthbert

Here is the source code I was testing:
---------------------------------------------------
/* count.c -- using standard I/O */

#include "stdafx.h"
#include <stdio.h>
#include <stdlib.h> // ANSI C exit() prototype

int main(int argc, char *argv[])
{
    int ch; // place to store each character as read
    FILE *fp; // "file pointer"
    long count = 0;

    if (argc != 2)
    {
        printf("Usage: %s filename\n", argv[0]);
        exit(1);
    }
    if ((fp = fopen(argv[1], "r")) == NULL)
    {
        printf("Can't open %s\n", argv[1]);
        exit(1);
    }
    while ((ch = getc(fp)) != EOF)
    {
        putc(ch, stdout); // same as putchar(ch);
        count++;
    }
    fclose(fp);
    printf("File %s has %ld characters\n", argv[1], count);

    return 0;
}
------------------------------------------------------------------

Sep 3 '06
we******@gmail.com wrote:
Philip Potter wrote:
<we******@gmail.com> wrote in message
This is a completely different situation from gets(). The ANSI C
committee has openly declared hostile intent towards the software
industry by putting their stamp of approval on this function. They
even go so far as to put deceptive language in the standard in an
attempt to demonstrate they've addressed the problem of potential bad
uses of gets().
Is this true? Please tell me more - I'd be interested to hear.

I found this in the C9X Rationale (sorry, got this mixed up with the
standard itself):

"Because gets does not check for buffer overrun,
it is generally unsafe to use when its input is not
under the programmer's control. [...]"

Ok, so they have a rudimentary understanding of the problem.

"[...] This has cause some to question whether it
should appear in the Standard at all. [...]"
My dear fellow, if you can't even be bothered to quote them correctly,
you shouldn't be the one to whine.

Richard
Sep 6 '06 #71
Philip Potter wrote:
<we******@gmail.com> wrote in message
Philip Potter wrote:
[...] However, if I know that on my
system int is 17 bits or more, I can guarantee I haven't. The size of an int
is outside the programmer/language/specification's control, so according to
your argument this is still UB, and my implementation is free to reformat my
hard drive instead. I don't think many people would agree with you on this.
Uhhh ... the ANSI C committee itself agrees with this point of view.
I would actually prefer to take your point of view, and describe a
limited scope of "bad behavior" for certain failures like numerical
overflows. But the ANSI C people decided not to bother with that. So
yes, overflowing an integer apparently can email the KGB the US nuclear
launch codes.

You've completely missed the point here. Re-read the sentence "If I know
that on my system int is 17 bits or more, I can guarantee I haven't [invoked
UB when adding 32767 to 1]".
What relevance is that? The C standard says nothing about such
guarantees. You are conflating a particular implementation with the
standard. Of course an implementation can do whatever it wants for
platform-specific and undefined behavior. So it does -- this does not
prove any point.
Similarly, if I know that on my system stdin gets a \n character at
least every 20 characters, I can use gets() and guarantee no UB.
All you're doing is going ahead and translating the universe of UB that
comes with gets()

http://en.wikipedia.org/wiki/Begging_the_question
That does not apply here. That gets() comes with UB is not in dispute.
That you can remove its undefinedness on a particular platform, is
nothing more than an abuse of the meaning of the word UB.
into one narrow manifestation or predictable
behavior. That's exactly what I did in my sample implementation, BTW.

Except that yours isn't standard-compliant.
Of course it is -- under conditions of UB any behavior is
standard-compliant.
[...] gets() does not invoke UB unless
it actually overruns the buffer.
But nothing in the program can make this condition either happen or not
happen. I.e., to be well defined the spec has to specify something
outside of the C language. Besides not being its mandate -- it
actually does not do that. So the specification does not describe
conditions within the confines of what it's describing (the C language,
not what the user should do) under which the call can be made to be well
defined. The "unless it actually overruns the buffer" is nothing that
the C standard can explain with any specificity -- and it doesn't try.
[...] If you believe otherwise, quote C&V, rather
than just asserting it.
Quoting from the standard is useless since the standard does not make
any attempt to analyze things to their logical conclusion. Taking a
classic cue from Keith Thompson, if you narrow your view of C
programming to just the spec you would conclude that the C language
does not contain variables (I kid you not, he posted this). Or as
other people keep claiming -- that C doesn't have a stack (if function
calls and returns are always bracketed like pushes and pops, do we not
have a stack? In fact a "call" stack?).

But telling me to justify my "beliefs" by citing chapter and verse you
are just being like the standards committee was with the gets()
rationale by attempting to subvert the rules for the argument.

Buffer overruns invoke UB. gets() may invoke buffer overruns
independent of how the programmer uses it. In some implementations it's
possible to go outside the ANSI specification to force gets() to not
buffer overflow and have the behavior that's described in the spec,
however clearly the spec does not delineate these conditions.

So regardless of what the spec says under the heading of gets(), the UB
is inescapable from within the spec, which means the behavior described
in the spec is irrelevant since UB can invoke any behavior.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Sep 6 '06 #72
<we******@gmail.com> wrote in message
news:11**********************@m79g2000cwm.googlegroups.com...
Philip Potter wrote:
You've completely missed the point here. Re-read the sentence "If I know
that on my system int is 17 bits or more, I can guarantee I haven't [invoked
UB when adding 32767 to 1]".

What relevance is that? The C standard says nothing about such
guarantees.
It says that the size of an int is implementation-defined. It describes the
meaning of "implementation-defined" carefully. It talks about minimum values
for INT_MAX, and says that integer overflow is UB.

If integer overflow happens, the behaviour is undefined. If an addition does
not overflow, the behaviour is well-defined, and must conform to the
standard's definition of addition.

On an implementation with INT_MAX>32767, 32767+1 is not an overflow, and
therefore not UB. It must result in 32768 - no other behaviour is
conforming.
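
The argument above can be made concrete. What follows is only a sketch and not code from the thread: checked_add is a hypothetical helper that compares against INT_MAX and INT_MIN before adding, so the addition it performs never overflows, whatever width int has on the implementation at hand.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): stores a + b in *out only
 * when the mathematical result fits in an int. Returns false, leaving
 * *out untouched, when the addition would overflow. */
bool checked_add(int a, int b, int *out)
{
    assert(out != NULL);

    if (b > 0 && a > INT_MAX - b)
        return false;               /* result would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b)
        return false;               /* result would fall below INT_MIN */

    *out = a + b;                   /* cannot overflow at this point */
    return true;
}
```

On an implementation where INT_MAX is greater than 32767, checked_add(32767, 1, &r) succeeds and stores 32768; where INT_MAX is exactly 32767, it reports failure instead of overflowing.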

Please tell me which step in this argument you disagree with.
You are conflating a particular implementation with the
standard. Of course an implementation can do whatever it wants for
platform-specific and undefined behavior. So it does -- this does not
prove any point.
No it can't. Please see FAQ 11.33. Implementation-defined behaviour must be
consistent, and must fit within the restrictions imposed by the standard.

If you are going to continue to place your hands over your ears, singing to
yourself, you are welcome to. I am tired of trying to talk sense over the
endless noise you put out.
[...] If you believe otherwise, quote C&V, rather
than just asserting it.

Quoting from the standard is useless since the standard does not make
any attempt to analyze things to their logical conclusion.
It sure beats your preferred method of "Yes it is!" "No it isn't!"

Philip

Sep 6 '06 #73
we******@gmail.com writes:
Philip Potter wrote:
[...]
Except that yours isn't standard-compliant.

Of course it is -- under conditions of UB any behavior is
standard-compliant.
No. I'll expand on that below.

[...]
Quoting from the standard is useless since the standard does not make
any attempt to analyze things to their logical conclusion. Taking a
classic cue from Keith Thompson, if you narrow your view of C
programming to just the spec you would conclude that the C language
does not contain variables (I kid you not, he posted this).
I think you're referring to the "curious about array initialization."
thread from last April and May.

I did not say that "the C language does not contain variables". (If I
did, please cite the article in which I said that.) I said that the C
standard does not define the term "variable". The discussion is
archived on groups.google.com; anyone who's interested can read it
there.
Or as
other people keep claiming -- that C doesn't have a stack (if function
calls and returns are always bracketed like pushes and pops, do we not
have a stack? In fact a "call" stack?).
The semantics of function calling does require some sort of structure
that behaves in a stack-like manner (last-in first-out). On the other
hand, the term "stack" is also commonly used to refer to a particular
data structure implemented in hardware, where a CPU register is
dedicated as a "stack pointer", and the stack grows and shrinks
through contiguous memory addresses. This kind of "stack" is not
required or implied by the C standard, and there are implementations
that don't have such a "stack"; the data storage required for the
local objects created by a function call is allocated by something
similar to malloc(), and released by something similar to free().
Referring to "the stack" on such a system would be misleading.

[...]
So regardless of what the spec says under the heading of gets(), the UB
is inescapable from within the spec, which means the behavior described
in the spec is irrelevant since UB can invoke any behavior.
Ok, getting back to gets().

You've encouraged me to do something I don't believe I've ever done
here. I'm going to defend gets().

Suppose I've written a function that takes a char* argument (that
points to a string), and I want to write a quick and dirty test
program for it. (I might write a more rigorous test framework later
on; for now, I just want to try it with a few arguments to see if the
results seem plausible.) So, I write a small program like this:

#include <stdio.h>
#include <string.h>

void show(char *s)
{
    printf("%d: \"%s\"\n", (int)strlen(s), s);
}

int main(void)
{
    char buf[256];
    while (gets(buf) != NULL) {
        show(buf);
    }
    return 0;
}

This lets me manually test my function with a few values. Since I
wrote the program in the last 5 minutes, I *know* that it could fail
if I enter too long a line. Once I've satisfied myself that the
function works more or less as I want it to, I delete the program.
I've never made it available to anyone else. I have exactly as much
control over the program's input as I do over the program itself.

If I enter a 300-character line while running this program, I get
undefined behavior. The consequences would be entirely my own fault.

But suppose I enter a 10-character line. The C standard guarantees
that it will work properly. If I'm using your proposed
implementation, on which any call to gets() attempts to reformat my
hard drive, then the damage to my system is entirely *your* fault. I
used a standard function in a safe manner, in a way that *cannot*
invoke undefined behavior (because I control the input, and I will not
enter an overly long input line).

Having said that, if I were writing such a quick-and-dirty test
program in real life I *still* wouldn't use gets(). I'd use fgets()
and remove the trailing '\n'. (There would always be one because,
again, I wouldn't feed very long lines to the program; if I
accidentally did so, the program would misbehave, but in a benign and
predictable manner.) gets() should not be used, and it should be
removed from the standard, or at least formally deprecated.
Implementations should warn about any calls to gets().
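
The fgets() approach described above might look like the following sketch. The name read_line is hypothetical and does not come from the thread; the helper uses strcspn() to cut the string at the first '\n', which is a no-op when no new-line was stored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads at most size-1 characters of one line from
 * fp into buf and removes the trailing '\n' if one was stored.
 * Returns buf on success, NULL on end-of-file or read error. */
char *read_line(char *buf, int size, FILE *fp)
{
    assert(buf != NULL && fp != NULL && size > 0);

    if (fgets(buf, size, fp) == NULL)
        return NULL;

    /* strcspn() gives the index of the first '\n', or of the
     * terminating '\0' when the line holds no new-line. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

In the test program above, the loop condition would then read `while (read_line(buf, sizeof buf, stdin) != NULL)`. An over-long line is split across several calls rather than written past the end of buf.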

But *if* I use it in a manner whose behavior is guaranteed by the
standard, I have every right to expect it to behave as the standard
specifies.

I don't expect you to be willing to understand this, but I'm prepared
to be pleasantly surprised.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Sep 6 '06 #74
Keith Thompson wrote:
we******@gmail. com writes:
Philip Potter wrote:
[...]
Except that yours isn't standard-compliant.
Of course it is -- under conditions of UB any behavior is
standard-compliant.

No. I'll expand on that below.
Actually you didn't. You simply tried to defend gets() by describing a
scenario outside the specification (hence under UB) that was
predictable in a way you've constructed which happens to coincide with
what the optimistic things that specification tries to describe in its
explanation of gets(). But you never removed the "UB cloud" which
covers the whole thing.
I did not say that "the C language does not contain variables". (If I
did, please cite the article in which I said that.) I said that the C
standard does not define the term "variable".
The relevant quote:

" [...] It is not obvious what the word "variable" should mean in the
context of C. [...]"

And if you think that quote is out of context, you can look it up
for yourself and see the follow up with a half dozen examples of things
in C where it supposedly can't be decided whether or not something is a
variable.
Or as
other people keep claiming -- that C doesn't have a stack (if function
calls and returns are always bracketed like pushes and pops, do we not
have a stack? In fact a "call" stack?).

The semantics of function calling does require some sort of structure
that behaves in a stack-like manner (last-in first-out). On the other
hand, the term "stack" is also commonly used to refer to a particular
data structure implemented in hardware, where a CPU register is
dedicated as a "stack pointer", and the stack grows and shrinks
through contiguous memory addresses. This kind of "stack" is not
required or implied by the C standard, and there are implementations
that don't have such a "stack"; the data storage required for the
local objects created by a function call is allocated by something
similar to malloc(), and released by something similar to free().
Referring to "the stack" on such a system would be misleading.
What the hell are you talking about? If you think "the stack" means a
hardware stack, it's because of something in your mind. We all
understand that the C specification is implemented on an abstract
machine, and that's where its "stack" is. If someone is conflating
this stack with a particular stack implementation (such as Sparc's
register window mechanism, or Itanium's register block stack thingy),
it's no different from the people who post with gcc-specific extensions
(like an extra envp parameter in main) which happens here all the time.

And of course in this case the conflation is usually harmless since it's
a very rare thing for someone to use an *extension* or
platform-specific feature of a hardware stack in real world code. You
usually use it exactly in the same way you use it in its abstract form
-- you push and pop to it. Compilers may play games with hardware
stacks, general programmers (even hard-core low-level programmers like
myself) usually do not.

So insisting that C has no stack because the specification doesn't say
that it does is just silly. This is why confining discussion of C only
to the language in the specification is idiotic.
[...]
So regardless of what the spec says under the heading of gets(), the UB
is inescapable from within the spec, which means the behavior described
in the spec is irrelevant since UB can invoke any behavior.

Ok, getting back to gets().

You've encouraged me to do something I don't believe I've ever done
here. I'm going to defend gets().
I know your mind doesn't work very "flexibly" at all but I'll give
it a shot -- replace your bad gets() program with another program,
which say, performs a simple buffer overflow:

char digs[5];
sprintf (digs, "%d", (int) val);

Ok, then continue to apply the reasoning and statements you just made
with your gets() program, but in the obvious analogous way. Ok, so
here are the statements you made which apply equally to a program
which contains the above:

1) " ... But suppose I enter a 10-character line. The C standard
guarantees that it will work properly."

-- Similarly, if we make val small enough here, it will work
properly.

2) "If I enter a 300-character line while running this program, I get
undefined behavior. The consequences would be entirely my own fault.
[...] But suppose I enter a 10-character line. The C standard
guarantees that it will work properly."

-- Similarly if I make val a 5+ digit integer, the program that
includes the above will have UB. But if I make val a 4 digit
positive number, or 3 digit negative number, it will work
just fine.

The UB we get from overrunning digs[] here obviously can lead to
arbitrary action since it will smash any adjacent declarations
including possibly volatiles, sig_atomic_t or whatever. Same with your
gets() program. So both programs occupy the same space of what's the
worst that can go wrong. Either program could easily format your hard
drive with the right set of circumstances.

So we see the analogy is a pretty close fit, and because of that we
usually look at code such as the above very skeptically. In other
words your argument about gets() hasn't specifically bolstered gets() in
any way that doesn't also bolster the code above. Let me repeat --
your *argument* doesn't significantly distinguish gets() from the code
snippet above in the context we are in.

Where the analogy falls down, however, is that the above code can be
made to work solely through mechanisms inside the program itself. If I
have some way of guaranteeing that val is between -999 and 9999 solely
through mechanisms inside the program itself, then everything is fine.
I would be using things *IN THE C STANDARD* to make sure that the
semantics of that code remained compliant. The key point is that I do
not need to venture outside the system/program or invoke platform
specific behavior to guarantee that code and bring it within spec.
I.e., the semantic correctness is guaranteed, essentially by other
contents from the standard itself. I.e., the code above is actually
correct within certain assumptions, and those assumptions can be
enforced by nothing more than the standard itself. The potential for
UB is *eliminated* from within usage of the specification itself.
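
The in-program guarantee described above can be sketched as follows. format_small is a hypothetical helper, not code from the thread: it rejects any value outside -999..9999 before calling sprintf(), so the five-byte buffer (four characters plus the terminating null) is never overrun.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: formats val into a 5-byte buffer only when its
 * decimal form, including any '-' sign, needs at most 4 characters.
 * Returns false and leaves digs untouched otherwise. */
bool format_small(char digs[5], int val)
{
    assert(digs != NULL);

    if (val < -999 || val > 9999)
        return false;           /* would need 5 or more characters */

    sprintf(digs, "%d", val);   /* writes at most 4 chars plus '\0' */
    return true;
}
```

The range check is ordinary C, which is the point being made: the precondition is enforced inside the program. C99's snprintf(digs, sizeof digs, "%d", val) is another way to bound the write itself.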

Your gets() program cannot be similarly fixed, or similarly rely on
analogous guarantees. In order to make gets() work according to what
the standard is optimistically hoping for you *MUST* step outside of
the standard. Thus the standard is trying to specify something that
specifically needs something that isn't (and can't realistically be
put) in the standard. BTW, what does UB mean? Doesn't it mean
arbitrary behavior outside the specification? So trying to format your
hard drive (perhaps successfully, perhaps not) because you stepped
outside the spec

Your argument fails to make this distinction (can you see this?) and by
implication misses the whole point.
Having said that, if I were writing such a quick-and-dirty test
program in real life I *still* wouldn't use gets().
And in this case, it's not because of any typically wrong reasoning on
your part. You are actually behaving correctly. As would any
programmer that behaved this way. So why is this being specified? The
rationale is not convincing, and in fact is clearly meant as
subterfuge.
[...] I'd use fgets()
and remove the trailing '\n'. (There would always be one because,
again, I wouldn't feed very long lines to the program; if I
accidentally did so, the program would misbehave, but in a benign and
predictable manner.)
So you've traded one bad behavior for another? ... Whatever, that's
another discussion entirely. You won't UB with this strategy (just get
wrong results, but predictably so.) The \n can also be omitted if EOF
is encountered without a \n just before it, btw. A \n can also
*appear* to be omitted if a \0 is consumed before a \n is, and you are
just using C's char * string semantics on the results.
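
The cases listed above can be told apart in code. This is a sketch under assumptions: the enum and the name read_line_status are hypothetical, and it cannot distinguish an over-long line from one containing an embedded '\0', since both simply show up as a string with no visible new-line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification of what one fgets() call produced. */
enum line_status {
    LINE_COMPLETE,  /* a full line; its '\n' has been stripped          */
    LINE_AT_EOF,    /* final line of the stream, which had no '\n'      */
    LINE_PARTIAL,   /* no '\n' visible: buffer filled first, or an
                       embedded '\0' hides the rest of the data         */
    LINE_NONE       /* nothing read: end-of-file or read error          */
};

enum line_status read_line_status(char *buf, int size, FILE *fp)
{
    size_t len;

    assert(buf != NULL && fp != NULL && size > 1);

    if (fgets(buf, size, fp) == NULL)
        return LINE_NONE;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return LINE_COMPLETE;
    }
    /* No new-line at the end of the string: see whether fgets() stopped
     * because the stream ended or for some other reason. */
    return feof(fp) ? LINE_AT_EOF : LINE_PARTIAL;
}
```

One corner case remains: a final line without '\n' that exactly fills the buffer is reported as LINE_PARTIAL, and only the next call (LINE_NONE) reveals that the stream had ended.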
[...] gets() should not be used, and it should be
removed from the standard, or at least formally deprecated.
Implementations should warn about any calls to gets().
So what are you defending?
But *if* I use it in a manner whose behavior is guaranteed by the
standard, I have every right to expect it to behave as the standard
specifies.
Ok, but the standard *CANNOT* specify that guarantee. It makes a
"chicken before the egg" kind of specification about how gets() works.
It basically says *IF* the call to gets() doesn't invoke UB, then it
reflects some kind of stdin input. But that *IF* cannot be satisfied
by any content in the standard at all. Are you following? Therefore
the standard is not *specifying* a way for gets() to behave in the
optimistic way they are hoping it does.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Sep 6 '06 #75
we******@gmail.com writes:
Keith Thompson wrote:
we******@gmail.com writes:
Philip Potter wrote:
[...]
Except that yours isn't standard-compliant.

Of course it is -- under conditions of UB any behavior is
standard-compliant.

No. I'll expand on that below.

Actually you didn't. You simply tried to defend gets() by describing a
scenario outside the specification (hence under UB) that was
predictable in a way you've constructed which happens to coincide with
what the optimistic things that specification tries to describe in its
explanation of gets(). But you never removed the "UB cloud" which
covers the whole thing.
I did not say that "the C language does not contain variables". (If I
did, please cite the article in which I said that.) I said that the C
standard does not define the term "variable".

The relevant quote:

" [...] It is not obvious what the word "variable" should mean in the
context of C. [...]"

And if you think that quote is out of context, you can look it up
for yourself and see the follow up with a half dozen examples of things
in C where it supposedly can't be decided whether or not something is a
variable.
Thank you for confirming that I did *not* say that "the C language
does not contain variables".
Or as
other people keep claiming -- that C doesn't have a stack (if function
calls and returns are always bracketed like pushes and pops, do we not
have a stack? In fact a "call" stack?).

The semantics of function calling does require some sort of structure
that behaves in a stack-like manner (last-in first-out). On the other
hand, the term "stack" is also commonly used to refer to a particular
data structure implemented in hardware, where a CPU register is
dedicated as a "stack pointer", and the stack grows and shrinks
through contiguous memory addresses. This kind of "stack" is not
required or implied by the C standard, and there are implementations
that don't have such a "stack"; the data storage required for the
local objects created by a function call is allocated by something
similar to malloc(), and released by something similar to free().
Referring to "the stack" on such a system would be misleading.

What the hell are you talking about? If you think "the stack" means a
hardware stack, it's because of something in your mind. We all
understand that the C specification is implemented on an abstract
machine, and that's where its "stack" is.
[snip]
So insisting that C has no stack because the specification doesn't say
that it does is just silly. This is why confining discussion of C only
to the language in the specification is idiotic.
In most implementations, local variables and other storage associated
with a called function are allocated on "the stack". My understanding
of the phrase "the stack" in this context is exactly the kind of
hardware-based stack I discussed above, something that is not
guaranteed by the standard. The word "the" implies something
specific.

If the phrase "the stack" doesn't carry that implication for you,
that's terrific, but I strongly suspect that it does for most people.
[...]
So regardless of what the spec says under the heading of gets(), the UB
is inescapable from within the spec, which means the behavior described
in the spec is irrelevant since UB can invoke any behavior.

Ok, getting back to gets().

You've encouraged me to do something I don't believe I've ever done
here. I'm going to defend gets().
[snip]
Your gets() program cannot be similarly fixed, or similarly rely on
analogous guarantees. In order to make gets() work according to what
the standard is optimistically hoping for you *MUST* step outside of
the standard. Thus the standard is trying to specify something that
specifically needs something that isn't (and can't realistically be
put) in the standard. BTW, what does UB mean? Doesn't it mean
arbitrary behavior outside the specification? So trying to format your
hard drive (perhaps successfully, perhaps not) because you stepped
outside the spec
Ok, there's no way within the standard to use gets() safely. Beyond
the question of whether it should be used in any circumstances, it
certainly shouldn't be used in code that's intended to be portable.
(The sample program I posted was not intended to be portable; it was
specifically designed to be used in tightly controlled conditions and
then discarded.)

Not all C code has to be portable. Most C code should be portable,
but most C *programs* are not; they depend on system-specific
features. fopen() can't be successfully called without a valid file
name, and there's no portable way (other than tmpnam()) to generate a
valid file name. (And yes, fopen() behaves in a well-defined manner
if you give it an invalid file name, which makes it more robust than
gets().)
Your argument fails to make this distinction (can you see this?) and by
implication misses the whole point.
I didn't miss the point. I made a different point.
Having said that, if I were writing such a quick-and-dirty test
program in real life I *still* wouldn't use gets().

And in this case, it's not because of any typically wrong reasoning on
your part. You are actually behaving correctly. As would any
programmer that behaved this way. So why is this being specified? The
rationale is not convincing, and in fact is clearly meant as
subterfuge.
A subterfuge? Do you think that the ISO C committee keeps gets() in
the standard for malicious purposes? What is their motivation?

[...]
[...] gets() should not be used, and it should be
removed from the standard, or at least formally deprecated.
Implementations should warn about any calls to gets().

So what are you defending?
Just this: Given that gets() is defined by the standard, a conforming
implementation must implement it properly. gets() does not always
invoke undefined behavior. In those cases where it doesn't, it must
behave as specified.
But *if* I use it in a manner whose behavior is guaranteed by the
standard, I have every right to expect it to behave as the standard
specifies.

Ok, but the standard *CANNOT* specify that guarantee. It makes a
"chicken before the egg" kind of specification about how gets() works.
It basically says *IF* the call to gets() doesn't invoke UB, then it
reflects some kind of stdin input.
Correct.
But that *IF* cannot be satisfied
by any content in the standard at all. Are you following? Therefore
the standard is not *specifying* a way for gets() to behave in the
optimistic way they are hoping it does.
The standard provides no portable way to use gets() safely.

There are *non-portable* ways to use gets() safely.

C is specifically designed to support both portable and non-portable
programming.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Sep 7 '06 #76
On Thu, 6 Sep 2006 we******@gmail.com wrote:
In order to make gets() work according to what
the standard is optimistically hoping for you *MUST* step outside of
the standard. Thus the standard is trying to specify something that
specifically needs something that isn't (and can't realistically be
put) in the standard. BTW, what does UB mean? Doesn't it mean
arbitrary behavior outside the specification? So trying to format your
hard drive (perhaps successfully, perhaps not) because you stepped
outside the spec
I believe that gets() works as follows. I simply cannot see
how you could apply the as-if rule to ``optimize'' this code into
unconditional arbitrary behavior.

/*
 * 7.19.7.7 The gets function
 *
 * Implemented by Tak-Shing Chan
 */

#include <stdio.h>

char *
gets(char *s)
{
    int c;
    char *itaptbs = s;

    /*
     * 7.19.7.7 paragraph 2
     *
     * The gets function reads characters from the input
     * stream pointed to by stdin, into the array pointed to
     * by s, until end-of-file is encountered or a new-line
     * character is read.
     */
    while (!((c = getchar()) == EOF || c == '\n'))
        *itaptbs++ = c;

    /*
     * 7.19.7.7 paragraph 3
     *
     * If end-of-file is encountered and no characters have
     * been read into the array, the contents of the array
     * remain unchanged and a null pointer is returned. If
     * a read error occurs during the operation, the array
     * contents are indeterminate and a null pointer is
     * returned.
     */
    if (c == EOF && (itaptbs == s || ferror(stdin)))
        return NULL;

    /*
     * 7.19.7.7 paragraph 2
     *
     * Any new-line character is discarded, and a null
     * character is written immediately after the last
     * character read into the array.
     */
    *itaptbs = 0;

    /*
     * 7.19.7.7 paragraph 3
     *
     * The gets function returns s if successful.
     */
    return s;
}

Tak-Shing
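
For comparison, here is a sketch of the same loop with the one thing gets() lacks: a size parameter. The name bounded_gets, the FILE * argument, and the policy of silently discarding the excess of an over-long line are all assumptions made for illustration; none of them comes from the standard or from this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical bounded variant of the loop above: stores at most n-1
 * characters of the next line from fp into s and discards the rest. */
char *bounded_gets(char *s, size_t n, FILE *fp)
{
    int c;
    size_t stored = 0;  /* characters written into s            */
    size_t total = 0;   /* characters consumed from the stream  */

    assert(s != NULL && fp != NULL && n > 0);

    /* Keep consuming up to the new-line even once s is full, so the
     * next call starts at the beginning of a fresh line. */
    while ((c = getc(fp)) != EOF && c != '\n') {
        total++;
        if (stored < n - 1)
            s[stored++] = (char)c;
    }

    /* Mirror gets(): null pointer on end-of-file with nothing read,
     * or on a read error. */
    if (c == EOF && (total == 0 || ferror(fp)))
        return NULL;

    s[stored] = '\0';
    return s;
}
```

Because the caller passes the array size, whether a write stays in bounds no longer depends on what arrives on the input stream.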
Sep 7 '06 #77
Tak-Shing Chan wrote:
On Thu, 6 Sep 2006 we******@gmail.com wrote:
In order to make gets() work according to what
the standard is optimistically hoping for you *MUST* step outside of
the standard. Thus the standard is trying to specify something that
specifically needs something that isn't (and can't realistically be
put) in the standard. BTW, what does UB mean? Doesn't it mean
arbitrary behavior outside the specification? So trying to format your
hard drive (perhaps successfully, perhaps not) because you stepped
outside the spec

I believe that gets() works as follows. I simply cannot see
how you could apply the as-if rule to ``optimize'' this code into
unconditional arbitrary behavior.
This is because you are doing a literal translation of what they are
saying without taking anything to a logical conclusion. This has
nothing to do with optimization. "As-if" also has little meaning once
UB is encountered -- every behaviour is "as-if" once you enact UB. All
that needs to be established is that there is a UB here.
/*
* 7.19.7.7 The gets function
*
* Implemented by Tak-Shing Chan
*/

#include <stdio.h>

char *
gets(char *s)
{
int c;
char *itaptbs = s;

/*
* 7.19.7.7 paragraph 2
*
* The gets function reads characters from the input
* stream pointed to by stdin, into the array pointed to
* by s, until end-of-file is encountered or a new-line
* character is read.
*/
while (!((c = getchar()) == EOF || c == '\n'))
*itaptbs++ = c;
This last line causes an unfixable and unaddressable UB. The fact that
this is not stated in the specification does not change it from being
so. Because of that, the code can in fact, undo the stream state, send
the characters back, send the state of s into anything it likes, then
proceed to format your hard drive. In fact it can do anything, and a
programmer cannot have any expectation that anything less happens,
except in platform-specific scenarios that are not covered by the
specification.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Sep 7 '06 #78
On Thu, 6 Sep 2006 we******@gmail. com wrote:
> Tak-Shing Chan wrote:
>> On Thu, 6 Sep 2006 we******@gmail. com wrote:
>>> In order to make gets() work according to what
>>> the standard is optimistically hoping for you *MUST* step outside of
>>> the standard. Thus the standard is trying to specify something that
>>> specifically needs something that isn't (and can't realistically be
>>> put) in the standard. BTW, what does UB mean? Doesn't it mean
>>> arbitrary behavior outside the specification? So trying to format your
>>> hard drive (perhaps successfully, perhaps not) because you stepped
>>> outside the spec
>>
>> I believe that gets() works as follows. I simply cannot see
>> how you could apply the as-if rule to ``optimize'' this code into
>> unconditional arbitrary behavior.
>
> This is because you are doing a literal translation of what they are
> saying without taking anything to a logical conclusion. This has
> nothing to do with optimization. "As-if" also has little meaning once
> UB is encountered -- every behaviour is "as-if" once you enact UB. All
> that needs to be established is that there is a UB here.

UB would only occur if the input really exceeds the size of
the array pointed to by s.

>> /*
>>  * 7.19.7.7 The gets function
>>  *
>>  * Implemented by Tak-Shing Chan
>>  */
>>
>> #include <stdio.h>
>>
>> char *
>> gets(char *s)
>> {
>>     int c;
>>     char *itaptbs = s;
>>
>>     /*
>>      * 7.19.7.7 paragraph 2
>>      *
>>      * The gets function reads characters from the input
>>      * stream pointed to by stdin, into the array pointed to
>>      * by s, until end-of-file is encountered or a new-line
>>      * character is read.
>>      */
>>     while (!((c = getchar()) == EOF || c == '\n'))
>>         *itaptbs++ = c;
>
> This last line causes an unfixable and unaddressable UB. The fact that
> this is not stated in the specification does not change it from being
> so. Because of that, the code can in fact, undo the stream state, send
> the characters back, send the state of s into anything it likes, then
> proceed to format your hard drive. In fact it can do anything, and a
> programmer cannot have any expectation that anything less happens,
> except in platform-specific scenarios that are not covered by the
> specification.

It is not UB if the array pointed to by s is large enough for
the input.
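
To make that condition concrete, here is a minimal sketch (not taken from the standard or from any real library) of the same loop with the array size passed in. The name read_line_bounded and its parameters are assumptions chosen for illustration; fgets() is the standard function that plays a similar role.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * read_line_bounded: hypothetical name, not part of the standard library.
 * Same loop as the gets() listing quoted above, but the caller passes the
 * array size, so the function can stop before writing past the end.
 * Characters that do not fit are left unread in the stream.
 * Returns s on success, NULL if size is 0 or if end-of-file is reached
 * before any character has been read.
 */
char *
read_line_bounded(char *s, size_t size, FILE *in)
{
    int c = 0;
    size_t n = 0;

    if (size == 0)
        return NULL;

    /* Stop one short of size to leave room for the null character. */
    while (n + 1 < size && (c = getc(in)) != EOF && c != '\n')
        s[n++] = (char)c;

    s[n] = '\0';

    if (n == 0 && c == EOF)
        return NULL;
    return s;
}
```

Called as read_line_bounded(buf, sizeof buf, stdin), it cannot write past buf whatever the input is, which is the one guarantee the gets() interface has no way to give because it never learns the size.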

Tak-Shing
Sep 7 '06 #79
Keith Thompson wrote:
we******@gmail. com writes:
Keith Thompson wrote:
we******@gmail. com writes:
Philip Potter wrote:
Or as
other people keep claiming -- that C doesn't have a stack (if function
calls and returns are always bracketed like pushes and pops, do we not
have a stack? In fact a "call" stack?).

The semantics of function calling does require some sort of structure
that behaves in a stack-like manner (last-in first-out). On the other
hand, the term "stack" is also commonly used to refer to a particular
data structure implemented in hardware, where a CPU register is
dedicated as a "stack pointer", and the stack grows and shrinks
through contiguous memory addresses. This kind of "stack" is not
required or implied by the C standard, and there are implementations
that don't have such a "stack"; the data storage required for the
local objects created by a function call is allocated by something
similar to malloc(), and released by something similar to free().
Referring to "the stack" on such a system would be misleading.
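
As a rough illustration of that last case, the sketch below keeps each call's storage in a separately allocated record linked to its caller. The struct and function names are hypothetical, and no particular implementation is claimed to work this way; the point is only that last-in first-out behaviour does not need contiguous memory or a dedicated stack pointer.

```c
#include <stdlib.h>

/* One activation record; all names here are hypothetical. */
struct frame {
    struct frame *caller;   /* record of the calling function */
    int local;              /* stands in for the callee's local objects */
};

/* "Call": allocate a record anywhere on the heap and link it to its caller. */
struct frame *
frame_enter(struct frame *caller, int local)
{
    struct frame *f = malloc(sizeof *f);

    if (f == NULL)
        return NULL;
    f->caller = caller;
    f->local = local;
    return f;
}

/* "Return": release the newest record and hand back the caller's record. */
struct frame *
frame_leave(struct frame *f)
{
    struct frame *caller = f->caller;

    free(f);
    return caller;
}
```

Records are still created and destroyed in last-in first-out order, yet successive records may live at unrelated addresses.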
What the hell are you talking about? If you think "the stack" means a
hardware stack, it's because of something in your mind. We all
understand that the C specification is implemented on an abstract
machine, and that's where its "stack" is.
[snip]

So insisting that C has no stack because the specification doesn't say
that it does is just silly. This is why confining discussion of C only
to the language in the specification is idiotic.

In most implementations, local variables and other storage associated
with a called function are allocated on "the stack".
Really? Most implementations I know of actually throw these things
into registers first. Many even throw return addresses into "link
registers". Even on the x86 (a very popular platform), there are at
least *two* stacks (one for floating point, and one for the rest). We
must inhabit different planes of existence.
[...] My understanding
of the phrase "the stack" in this context is exactly the kind of
hardware-based stack I discussed above, something that is not
guaranteed by the standard. The word "the" implies something
specific.

If the phrase "the stack" doesn't carry that implication for you,
that's terrific, but I strongly suspect that it does for most people.
You think most people know assembly language? You really do live in a
bizarre fantasy world.
[...]
So regardless of what the spec says under the heading of gets(), the UB
is inescapable from within the spec, which means the behavior described
in the spec is irrelevant since UB can invoke any behavior.

Ok, getting back to gets().

You've encouraged me to do something I don't believe I've ever done
here. I'm going to defend gets().

[snip]
Your gets() program cannot be similarly fixed, or similarly rely on
analogous guarantees. In order to make gets() work according to what
the standard is optimistically hoping for you *MUST* step outside of
the standard. Thus the standard is trying to specify something that
specifically needs something that isn't (and can't realistically be
put) in the standard. BTW, what does UB mean? Doesn't it mean
arbitrary behavior outside the specification? So trying to format your
hard drive (perhaps successfully, perhaps not) because you stepped
outside the spec

Ok, there's no way within the standard to use gets() safely. Beyond
the question of whether it should be used in any circumstances, it
certainly shouldn't be used in code that's intended to be portable.
portable?!?! What? What has that got to do with anything? It fails
in *every* system. *Every* time it's put into a program it's wrong.
Only in systems where the input is redirected *and* the system does not
support multitasking can you even build a credible case for a well
defined scenario where it can satisfy the committee's fantasies about
how gets() is supposed to behave. Even there, you are relying on
specific platform behavior.
(The sample program I posted was not intended to be portable;
It is if you ignore the UB -- which of course you are. It's only not
portable because UB is not portable. Portability just isn't the issue.
Every platform must fail except by extraordinary intervention (that
can't realistically be called programming).
[...] it was
specifically designed to be used in tightly controlled conditions and
then discarded.)
I thought it was designed for you to post and make a point. If you
actually used it for any reason, besides contradicting earlier
statements you made, it would just be irresponsible.
Not all C code has to be portable. Most C code should be portable,
but most C *programs* are not; they depend on system-specific
features. fopen() can't be successfully called without a valid file
name, and there's no portable way (other than tmpnam()) to generate a
valid file name.
You are confusing platform specific with undefined behavior. Calls to
fopen(), and system() can't be made portable. This is well understood.
This has nothing to do with the situation with gets().
[...] (And yes, fopen() behaves in a well-defined manner
if you give it an invalid file name, which makes it more robust than
gets().)
It makes it well defined. As opposed to gets().
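
A minimal sketch of that difference, assuming a path that does not exist on the system where it runs (the helper name can_open_for_reading and any path passed to it are made up for the example): the failure comes back through the return value, so the caller can test for it.

```c
#include <stdio.h>

/*
 * can_open_for_reading: hypothetical helper, not a standard function.
 * Returns 1 if the named file could be opened for reading, 0 otherwise.
 * fopen() reports failure by returning a null pointer, so a bad name is
 * an ordinary, checkable outcome rather than undefined behavior.
 */
int
can_open_for_reading(const char *name)
{
    FILE *fp = fopen(name, "r");

    if (fp == NULL)
        return 0;
    fclose(fp);
    return 1;
}
```

gets() has no comparable channel: it receives no buffer size, so it cannot detect an oversized line, let alone report one.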
Your argument fails to make this distinction (can you see this?) and by
implication misses the whole point.

I didn't miss the point. I made a different point.
There's no point in there. You can't use non-portability as a
protection for gets(), and that clearly was not the point you were
making.
Having said that, if I were writing such a quick-and-dirty test
program in real life I *still* wouldn't use gets().
And in this case, it's not because of any typically wrong reasoning on
your part. You are actually behaving correctly. As would any
programmer that behaved this way. So why is this being specified? The
rationale is not convincing, and in fact is clearly meant as
subterfuge.

A subterfuge? Do you think that the ISO C committee keeps gets() in
the standard for malicious purposes? What is their motivation?
I have no idea *WHY* they do things like that. I just know that they
did it. I mean we *KNOW* that the committee is aware of what the issue
is. But they have gone on record to say that that doesn't matter to them
and they are leaving it in, and they've created a "doublespeak" kind of
rationale for their behavior.
[...]
[...] gets() should not be used, and it should be
removed from the standard, or at least formally deprecated.
Implementations should warn about any calls to gets().
So what are you defending?

Just this: Given that gets() is defined by the standard, a conforming
implementation must implement it properly. gets() does not always
invoke undefined behavior. In those cases where it doesn't, it must
behave as specified.
You're a broken record. I have asked and you have not explained the
difference between undefined behavior and sometimes undefined behavior.
Literally you gave an example of a platform and environment specific
way of making the undefined behavior emit some sort of predictable
results. But that's generally exactly the case for every other kind of
UB that you can create as well. So you have not made a distinction,
and thus have not made the case. There is a built-in contradiction of
language in the specification -- they just omit the blatant expression
of that contradiction, even though they cannot excise it from real
manifestations.
But *if* I use it in a manner whose behavior is guaranteed by the
standard, I have every right to expect it to behave as the standard
specifies.
Ok, but the standard *CANNOT* specify that guarantee. It makes a
"chicken before the egg" kind of specification about how gets() works.
It basically says *IF* the call to gets() doesn't invoke UB, then it
reflects some kind of stdin input.

Correct.
But that *IF* cannot be satisfied
by any content in the standard at all. Are you following? Therefore
the standard is not *specifying* a way for gets() to behave in the
optimistic way they are hoping it does.

The standard provides no portable way to use gets() safely.
It provides *NO* way to use gets() safely. Portable or not.
There are *non-portable* ways to use gets() safely.
There are non-portable ways of making every UB safe. *EVERY*. That's
an irrelevant tautology.
C is specifically designed to support both portable and non-portable
programming.
It was *supposed* to be designed to be well defined, regardless of
portability. They specified gets() obviously -- so you have to
reinterpret the spec to realize that gets() always invokes UB, to retain
this well-definedness property. Your portability argument is just a
red herring.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Sep 7 '06 #80
