Why has "gets" not been deprecated yet?

P: n/a
We all know that the "gets" function from the Standard C Library (which
is part of the Standard C++ Library) is dangerous. It provides no
bounds check, so it's easy to overwrite memory when using it, and
impossible to guarantee that it won't happen.

Therefore, I think it's surprising that this function has not been
deprecated.
The C++98 Standard keeps it from the C89 standard.
The C99 Standard has kept it :-o.

Now, the C standard committee is working on safe functions (the ones
that end with "_s") for the C Standard Library. I don't know if they
are going to deprecate the dreaded "gets". Even if not, I think it
would be a good idea to deprecate it in the next C++ standard, since
C++ has better ways to accomplish the same task (getline). It's too
early to expect the safe functions (*_s) in the C++ Standard, but
getting rid of "gets" is not that hard, is it? Programs that use it
are broken anyway. Also, C++ has deprecated other features from C just
because C++ has better alternatives (static meaning "internal linkage"
and headers ending in ".h").

Opinions? Should this message be posted on comp.std.c++?

Nov 3 '05 #1
32 Replies


P: n/a
Marcus wrote:
Opinions? Should this message be posted on comp.std.c++?


You probably want to post it over there, as people on here generally
focus more on application, and less on changing / debating the
standard.

Be careful with the assumption that all things using gets are
inherently flawed, however. =P

Nov 3 '05 #2

P: n/a
Josh Mcfarlane wrote:
Be careful with the assumption that all things using gets are
inherently flawed, however. =P


Well, gets invokes undefined behavior.

Nov 3 '05 #3

P: n/a
Rolf Magnus wrote:
Well, gets invokes undefined behavior.

From a buffer overrun? If not, what else causes the undefined behavior?


Nov 3 '05 #4

P: n/a
Josh Mcfarlane wrote:
Rolf Magnus wrote:
Well, gets invokes undefined behavior.

From a buffer overrun? If not, what else causes the undefined behavior?

Many functions in the C library have undefined behavior when given
arguments outside their range. They are by and large a piece of
inherited crap that should have never received standard status (the
STDIO part of the library is the most misdesigned malodorous thing
ever foisted on the community, it was derived from a misnamed
piece of crap from an ancient UNIX project called the "portable
IO library").
Nov 3 '05 #5

P: n/a
Ron Natalie wrote:
Many functions in the C library have undefined behavior when given
arguments outside their range. They are by and large a piece of
inherited crap that should have never received standard status (the
STDIO part of the library is the most misdesigned malodorous thing
ever foisted on the community, it was derived from a misnamed
piece of crap from an ancient UNIX project called the "portable
IO library").


Well, ya, my point was, if you can confine to arguments within their
range, they do function (at least to my knowledge). Good? No, but still
functional.

Anywho, let's go throw this at the std people and see if it can get any
support.

Nov 3 '05 #6

P: n/a
> Many functions in the C library have undefined behavior when given
arguments outside their range. They are by and large a piece of
inherited crap that should have never received standard status (the
STDIO part of the library is the most misdesigned malodorous thing
ever foisted on the community, it was derived from a misnamed
piece of crap from an ancient UNIX project called the "portable
IO library").


Wow! I had never heard about that. Can you explain a little more about
the problems with <stdio.h>?

Nov 3 '05 #7

P: n/a
Gaijinco wrote:
Many functions in the C library have undefined behavior when given
arguments outside their range. They are by and large a piece of
inherited crap that should have never received standard status (the
STDIO part of the library is the most misdesigned malodorous thing
ever foisted on the community, it was derived from a misnamed
piece of crap from an ancient UNIX project called the "portable
IO library").


Wow! I had never heard about that. Can you explain a little more about
the problems with <stdio.h>?

Functions like gets that have no provisions for safety.
All the functions have arguments in different order. Some
of them have the file stream arg first, some last.
fwrite/fread have a number of records and record size number
that nobody knows what to do with other than multiply together.
It just goes on from there; the library is crap.
Nov 3 '05 #8

P: n/a
This propensity for undefined behaviour is an example of Design by
Contract (DbC): you meet the preconditions, and you get the contracted
behaviour. The philosophy says: if you stuff up, and fail to pick it
up in your testing, it's your fault and you're a pathetic excuse for a
programmer, (and probably a human being). Anyway, the point is that
DbC can work, but you have to guarantee the preconditions. For gets,
they're extreme: if you know that standard input necessarily sends
lines below a certain length, then you can use it. This is probably
only the case when standard input is coming from some other source that
you control. For example, you might write a filter that works on some
fixed-length records, and is designed to be used in a pipeline ala
(UNIX) "cat file | filter" or (DOS) "type file | filter". Who's to say
that you don't know what you're doing well enough to guarantee the line
length precondition? It's your own call whether you use it.

FWIW, I dislike DbC and agree that gets should hardly ever be used,
would happily consider that it should never be used in new code, but
wouldn't go to the extent of saying that it must never be used and it's
worth breaking existing code using it. More generally, the stdio
library has proven itself a well-designed bit of work, in that while
its error-proneness has been the cause of innumerable errors, its
concision, usability and flexibility have supported innumerable systems
that do useful work. If you think you can write better in C, go ahead
and see if anyone wants to use your creations.... One of the
compromises of C++ is that it should overwhelmingly be a superset of C,
with benefits in porting, skills transfer etc..

Tony

Nov 3 '05 #9

P: n/a
Josh Mcfarlane wrote:
Ron Natalie wrote:
Many functions in the C library have undefined behavior when given
arguments outside their range. They are by and large a piece of
inherited crap that should have never received standard status (the
STDIO part of the library is the most misdesigned malodorous thing
ever foisted on the community, it was derived from a misnamed
piece of crap from an ancient UNIX project called the "portable
IO library").


Well, ya, my point was, if you can confine to arguments within their
range, they do function (at least to my knowledge). Good? No, but still
functional.


The problem about gets is that there is no way for the program to provide
arguments that are really 100% safe. gets will produce a buffer overflow if
the buffer you provided isn't large enough for the incoming data. There is
no (portable) way to make the buffer big enough in every case, since the
program can't control the amount of data that is read. This lack of control
leads me to the conclusion that gets() can be seen as generally invoking
undefined behavior.
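
For contrast, here is a minimal sketch of a bounded read built on
std::fgets, which does take the buffer size. The names LineStatus and
read_line_bounded are hypothetical, invented for this illustration, and
the policy of discarding the tail of an over-long line is just one
possible choice, not something the standard prescribes.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical result type for this sketch.
enum class LineStatus { Ok, TooLong, End };

// Reads one line of at most size-1 characters into buf (newline removed).
// An over-long line is truncated, its remainder is discarded, and
// TooLong is returned, so the buffer can never be overrun.
LineStatus read_line_bounded(std::FILE* in, char* buf, int size) {
    assert(in != nullptr && buf != nullptr && size > 1);
    if (std::fgets(buf, size, in) == nullptr) {
        return LineStatus::End;  // end of input or read error
    }
    std::size_t len = std::strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return LineStatus::Ok;
    }
    // No newline stored: the line exactly filled the buffer, the input
    // ended without a newline, or the line was too long.
    int c = std::fgetc(in);
    if (c == EOF || c == '\n') {
        return LineStatus::Ok;
    }
    while (c != '\n' && c != EOF) {
        c = std::fgetc(in);  // skip the rest of the over-long line
    }
    return LineStatus::TooLong;
}
```

The caller decides what to do with a TooLong result (reject the record,
report an error, and so on), which is exactly the choice gets never
offers.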

Nov 3 '05 #10

P: n/a
Rolf Magnus wrote:

The problem about gets is that there is no way for the program to provide
arguments that are really 100% safe. gets will produce a buffer overflow if
the buffer you provided isn't large enough for the incoming data. There is
no (portable) way to make the buffer big enough in every case, since the
program can't control the amount of data that is read. This lack of control
leads me to the conclusion that gets() can be seen as generally invoking
undefined behavior.


That's too broad. The behavior of gets is undefined if the input in fact
is too large for the buffer. If it isn't, the behavior is well defined.

That's not a comment on its utility, but on how to apply technical terms.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Nov 3 '05 #11

P: n/a
Pete Becker wrote:
Rolf Magnus wrote:

The problem about gets is that there is no way for the program to provide
arguments that are really 100% safe. gets will produce a buffer overflow
if the buffer you provided isn't large enough for the incoming data.
There is no (portable) way to make the buffer big enough in every case,
since the program can't control the amount of data that is read. This
lack of control leads me to the conclusion that gets() can be seen as
generally invoking undefined behavior.
That's too broad. The behavior of gets is undefined if the input in fact
is too large for the buffer. If it isn't, the behavior is well defined.


However, the C++ standard does not specify how large the input is or may be,
and there is no way for the program to know it, so the "if it isn't, the
behavior is well defined" part is of no relevance for my program. I must
assume that the input may be too large, no matter what my program does.
That's not a comment on its utility, but on how to apply technical terms.


Ok, let's apply technical terms then:
According to the C++ standard, UB is "behavior, such as might arise upon use
of an erroneous program construct or erroneous data, for which this
International Standard imposes no requirements". Applying that to your
sentence above, that means that my program has "an erroneous program
construct or erroneous data", if the input is too large and is correct if
the input fits in the provided space. But my program can't control whether
the input fits or not. It can control the size of the buffer, but not the
amount of data coming in, so it doesn't have any way of ensuring the
well-defined behavior that you are writing about.
It's as if you say "the behavior is well-defined only on full moon".

Nov 3 '05 #12

P: n/a
On 2005-11-03, to***********@yahoo.co.uk
<to***********@yahoo.co.uk> wrote:
This propensity for undefined behaviour is an example of Design
by Contract (DbC): you meet the preconditions, and you get the
contracted behaviour. The philosophy says: if you stuff up,
and fail to pick it up in your testing, it's your fault and
you're a pathetic excuse for a programmer, (and probably a
human being). Anyway, the point is that DbC can work, but you
have to guarantee the preconditions. For gets, they're
extreme: if you know that standard input necessarily sends
lines below a certain length, then you can use it. This is
probably only the case when standard input is coming from some
other source that you control. For example, you might write a
filter that works on some fixed-length records, and is designed
to be used in a pipeline ala (UNIX) "cat file | filter" or
(DOS) "type file | filter". Who's to say that you don't know
what you're doing well enough to guarantee the line length
precondition?


Crackers.

--
Neil Cerutti
Nov 3 '05 #13

P: n/a
Ron Natalie wrote:
fwrite/fread have a number of records and record size number
that nobody knows what to do with other than multiply together.
It just goes on from there; the library is crap.


fwrite and fread return the number of objects written or read,
not the number of chars. But in general you may as well
use <iostream> and friends.
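
A small sketch of what that return value means in practice. The helper
name read_ints is hypothetical; std::fread itself is the standard call.
The element size is what lets fread report complete objects rather than
raw bytes.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: reads up to `count` ints from `in` into `out`.
// std::fread returns the number of complete elements read, so trailing
// bytes that do not make up a whole int are not counted.
std::size_t read_ints(std::FILE* in, int* out, std::size_t count) {
    assert(in != nullptr && out != nullptr);
    return std::fread(out, sizeof(int), count, in);
}
```

If the stream holds three ints followed by a single stray byte, a
request for four elements returns 3, not the byte count.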

--
imalone
Nov 3 '05 #14

P: n/a
Consider: someone writes two programs that share a header file
containing a buffer-size constant. In one program, lines are generated
and checked against this maximum length. The other program defines a
buffer based on this length, but uses gets(). The two programs may be
reasonably well synchronised, in that a change to the header triggers
rebuilds of both. Just hope they're distributed together too! This is
arguably in line with a workable (but deeply unappealing to me) DbC
philosophy. I can't say it's crackers, even though I'd like to be able
to! - Tony

Nov 3 '05 #15

P: n/a
Ian Malone wrote:
Ron Natalie wrote:
fwrite/fread have a number of records and record size number
that nobody knows what to do with other than multiply together.
It just goes on from there; the library is crap.


fwrite and fread return the number of objects written or read,
not the number of chars. But in general you may as well
use <iostream> and friends.

Yeah, so? But there is no concept of reading anything other
than chars from the stream. All the function does is multiply
those two args together and divide by the size on return.
It's a stupid design.
Nov 3 '05 #16

P: n/a
Rolf Magnus wrote:

Ok, let's apply technical terms then:
According to the C++ standard, UB is "behavior, such as might arise upon use
of an erroneous program construct or erroneous data, for which this
International Standard imposes no requirements". Applying that to your
sentence above, that means that my program has "an erroneous program
construct or erroneous data", if the input is too large and is correct if
the input fits in the provided space. But my program can't control whether
the input fits or not. It can control the size of the buffer, but not the
amount of data coming in, so it doesn't have any way of ensuring the
well-defined behavior that you are writing about.
That's correct.
It's as if you say "the behavior is well-defined only on full moon".


No, it's not. Not being able to control input is not the same as input
always being ill-formed. For a quick and dirty one-off command line
utility I'd have no qualms about using gets.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Nov 3 '05 #17

P: n/a
<to***********@yahoo.co.uk> wrote in message
news:11**********************@g47g2000cwa.googlegr oups.com...
This propensity for undefined behaviour is an example of Design by
Contract (DbC): you meet the preconditions, and you get the contracted
behaviour. The philosophy says: if you stuff up, and fail to pick it
up in your testing, it's your fault and you're a pathetic excuse for a
programmer, (and probably a human being). Anyway, the point is that
DbC can work, but you have to guarantee the preconditions. For gets,
they're extreme: if you know that standard input necessarily sends
lines below a certain length, then you can use it. This is probably
only the case when standard input is coming from some other source that
you control. For example, you might write a filter that works on some
fixed-length records, and is designed to be used in a pipeline ala
(UNIX) "cat file | filter" or (DOS) "type file | filter". Who's to say
that you don't know what you're doing well enough to guarantee the line
length precondition? It's your own call whether you use it.


Who's to say that an input stream with a 'guaranteed' limit of
'record size', did not get corrupted by some outside influence,
rendering the 'guarantee' spurious? I've actually had to deal
with this issue in the real world (receiving data over an RS232
line, subject to occasional 'noise'). My program was not able
to make *any* assumptions about the expected data stream.

'Knowing what I was doing', I knew that such 'guarantee' was
impossible to implement. 'Knowing what I was doing' meant that
it was my program's responsibility to deal with 'dirty' data
in a safe manner (e.g. discarding it, or perhaps re-acquiring it).
-Mike
Nov 3 '05 #18

P: n/a
Pete Becker wrote:
Rolf Magnus wrote:

Ok, let's apply technical terms then:
According to the C++ standard, UB is "behavior, such as might arise upon
use of an erroneous program construct or erroneous data, for which this
International Standard imposes no requirements". Applying that to your
sentence above, that means that my program has "an erroneous program
construct or erroneous data", if the input is too large and is correct if
the input fits in the provided space. But my program can't control
whether the input fits or not. It can control the size of the buffer,
but not the amount of data coming in, so it doesn't have any way of
ensuring the well-defined behavior that you are writing about.
That's correct.


So, you think the correctness of a C++ program can depend on what the user
enters at run-time?
It's as if you say "the behavior is well-defined only on full moon".


No, it's not. Not being able to control input is not the same as input
always being ill-formed.


It's always potentially being ill-formed.
For a quick and dirty one-off command line utility I'd have no qualms
about using gets.


That way of thinking is the reason for quite a lot of security holes.
Nov 4 '05 #19

P: n/a
Rolf Magnus wrote:
It's always potentially being ill-formed.


That's like saying that every value that you set is potentially
invalid. If I have a set output from another digital source on the
machine, it COULD be ill-formed if the machine doesn't work as it's
supposed to, just as a pointer set to a certain object could randomly
change if the OS did not operate as it is supposed to and overwrote
that segment of memory.

Nov 4 '05 #20

P: n/a
Rolf Magnus wrote:
Pete Becker wrote:

Rolf Magnus wrote:
Ok, let's apply technical terms then:
According to the C++ standard, UB is "behavior, such as might arise upon
use of an erroneous program construct or erroneous data, for which this
International Standard imposes no requirements". Applying that to your
sentence above, that means that my program has "an erroneous program
construct or erroneous data", if the input is too large and is correct if
the input fits in the provided space. But my program can't control
whether the input fits or not. It can control the size of the buffer,
but not the amount of data coming in, so it doesn't have any way of
ensuring the well-defined behavior that you are writing about.
That's correct.

So, you think the correctness of a C++ program can depend on what the user
enters at run-time?


It's what the language definition says.
It's as if you say "the behavior is well-defined only on full moon".

No, it's not. Not being able to control input is not the same as input
always being ill-formed.

It's always potentially being ill-formed.


Whatever.
For a quick and dirty one-off command line utility I'd have no qualms
about using gets.

That way of thinking is the reason for quite a lot of security holes.


A quick and dirty one-off command line utility by definition isn't
secure, so security holes aren't important.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Nov 4 '05 #21

P: n/a

"Pete Becker" <pe********@acm.org> wrote in message
news:au********************@rcn.net...
Rolf Magnus wrote:
Pete Becker wrote:

Rolf Magnus wrote:

Ok, let's apply technical terms then:
According to the C++ standard, UB is "behavior, such as might arise upon
use of an erroneous program construct or erroneous data,


I think it's unfortunate that the phrase 'erroneous data'
is not elaborated upon. Does it mean only 'data embedded
in the program', or is it intended to include 'external input'?

Or perhaps the concept of 'data' is defined elsewhere,
in a place I'm unaware of?

-Mike

Nov 4 '05 #22

P: n/a
Mike, perhaps you're missing my point, or perhaps you're trying to
illustrate the complexities of making the determination. Anyway, your
post highlights that some issues can and should be anticipated and
usefully handled. In contrast, other things can't, or needn't be given
the robustness requirements of a system.

Some examples may help (but I'm starting to wonder). For example,
there's rarely any point worrying about whether the file will be
corrupted during intra-host comms or hard disk I/O, as if that does
happen all bets about the integrity of your process and its operational
environment are off. Obviously, inter-host comms that doesn't perform
its own stream validation benefits from the care you prescribe. In
contrast, it's generally considered unnecessary to validate the
integrity of a TCP/IP comms stream, as the protocol detects errors and
coordinates resends as necessary, and anything reaching your app may be
deemed to be what was sent for all but the most extremely demanding of
purposes.

Consider that mistakes in array/vector indexing cause millions of bugs,
so why not have std::vector::operator[] checked? It is an issue that
has been considered and debated, and most people are happy with the
prioritisation of performance over robustness for this function, are
aware that at() is available for checked access, and can wrap vector<>
redirecting operator[]() to at() if desired. People can make an
informed choice based on their needs.
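
A minimal sketch of the wrapper idea described above. The class name
checked_vector is hypothetical and the interface is deliberately tiny;
the only point is that operator[] forwards to std::vector::at(), which
throws std::out_of_range on a bad index instead of invoking undefined
behaviour.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical wrapper: same storage as std::vector, but indexing is
// always bounds-checked because operator[] is redirected to at().
template <typename T>
class checked_vector {
public:
    explicit checked_vector(std::vector<T> data) : data_(std::move(data)) {}

    T& operator[](std::size_t i) { return data_.at(i); }
    const T& operator[](std::size_t i) const { return data_.at(i); }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<T> data_;
};
```

Code that wants the unchecked speed keeps using std::vector directly;
code that wants robustness opts in to the wrapper.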

Similarly, you could argue that a text viewer should be written such
that it can view files larger than the available virtual RAM, but that
doesn't mean that it's not useful and sufficient in most cases to
implement one that can't.

In summary, I'm saying that there is an argument as follows: when you
know an approach is sufficiently robust for your requirements, why
shouldn't you be allowed to use it? You can agree or disagree, but I
can assure you that there will be many people out there who believe
passionately in such a position who you'll never convince otherwise.

Tony

Nov 4 '05 #23

P: n/a
Josh Mcfarlane wrote:
That's like saying that every value that you set is potentially
invalid. If I have a set output from another digital source on the
machine, it COULD be ill-formed if the machine doesn't work as it's
supposed to,
There is a difference between the machine not working as it's supposed to
and the program making assumptions that it's not supposed to make.
just as if a pointer set to a certain object could randomly
change if the OS did not operate as it is supposed to and overwrote
that segment of memory.


Then that machine is not standard C++ compliant. However, if gets() attempts
to put 10 Terabytes into the buffer, that's perfectly fine with the C++
standard.

Nov 4 '05 #24

P: n/a
Rolf Magnus wrote:
Josh Mcfarlane wrote:
That's like saying that every value that you set is potentially
invalid. If I have a set output from another digital source on the
machine, it COULD be ill-formed if the machine doesn't work as it's
supposed to,


There is a difference between the machine not working as it's supposed to
and the program making assumptions that it's not supposed to make.


My point is, if you have Program A, which outputs to a buffer that
Program B reads, there are certain assumptions you can make about the
stream, assuming Programs A and B are packaged together. A very small case,
yes, but it is still a case in which you could be sure the input data
would be valid.
just as if a pointer set to a certain object could randomly
change if the OS did not operate as it is supposed to and overwrote
that segment of memory.


Then that machine is not standard C++ compliant. However, if gets() attempts
to put 10 Terabytes into the buffer, that's perfectly fine with the C++
standard.


Exactly my point. When you're dealing with knowns from another section
of the program or a helper program, you know what you're dealing with.

Nov 4 '05 #25

P: n/a
Rolf Magnus wrote:

Then that machine is not standard C++ compliant. However, if gets() attempts
to put 10 Terabytes into the buffer, that's perfectly fine with the C++
standard.


If the buffer is smaller than 10 Terabytes the behavior is undefined.
That's not "perfectly fine with the C++ standard."

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Nov 4 '05 #26

P: n/a
to***********@yahoo.co.uk escreveu:
Mike, perhaps you're missing my point, or perhaps you're trying to
Consider that mistakes in array/vector indexing cause millions of bugs,
so why not have std::vector::operator[] checked? It is an issue that
has been considered and debated, and most people are happy with the
prioritisation of performance over robustness for this function, are
aware that at() is available for checked access, and can wrap vector<>
redirecting operator[]() to at() if desired. People can make an
informed choice based on their needs.


If you are comparing std::vector::operator[] and gets() (I'm not sure
this is your point), I think the comparison is not valid. The program
creates the vector and knows its size (or uses vector::size()). On the
other hand, it's very hard to control standard input. Maybe you can if
you are doing interprocess communication (as Josh Mcfarlane pointed
out), but I don't think it's a compelling argument to keep
gets(). If gets() is removed in C++2020 (after deprecation in C++0x),
people who miss it may reimplement it. But I expect that in 2020
everybody will have changed their gets() to getline()...

Nov 4 '05 #27

P: n/a
Marcus wrote:
If you are comparing std::vector::operator[] and gets() (I'm not sure
this is your point), I think the comparison is not valid. The program
creates the vector and knows its size (or uses vector::size()). On the
other hand, it's very hard to control standard input. Maybe you can if
you are doing interprocess communication (as Josh Mcfarlane pointed
out), but I don't think it's a compelling argument to keep
gets(). If gets() is removed in C++2020 (after deprecation in C++0x),
people who miss it may reimplement it. But I expect that in 2020
everybody will have changed their gets() to getline()...


Don't get me wrong, I'm not advocating using gets(), I'm just trying to
show those people that have it in their head that gets() always invokes
undefined behavior that their notion is ill-formed.

Nov 4 '05 #28

P: n/a
Josh Mcfarlane escreveu:
Don't get me wrong, I'm not advocating using gets(), I'm just trying to
show those people that have it in their head that gets() always invokes
undefined behavior that their notion is ill-formed.


I see :-)

I don't know if this is really going to add to the discussion, but
let's restate the problem as:
"gets() doesn't allow any form of graceful failure"
Is this a better argument?

Or what about:
"It's embarrassing to explain to newbies that they should avoid gets(),
even though it's part of the standard library and seems very useful at
first". :-P

Nov 4 '05 #29

P: n/a
Marcus wrote:
We all know that the "gets" function from the Standard C Library (which
is part of the Standard C++ Library) is dangerous. It provides no
bounds check, so it's easy to overwrite memory when using it, and
impossible to guarantee that it won't happen.
There is an infinite variety of things you can write in a C++ program
which will render it undefined and potentially dangerous in some
situation. Calling the gets() function is just one of these things.

Among harmful things, gets() is one of the easiest to diagnose. It's
easy for an implementation to detect that a program calls gets(), and
issue a diagnostic. Quite simply, the program's translation units
contain an unresolved reference to that function.
Therefore, i think it's surprising that this function has not been
deprecated.
Deprecated is only a status change that exists in the minds of the
members of the community. It has no practical impact on what's
happening in the actual software.

A C or C++ implementation is free to emit whatever diagnostics it
wants, and to support stricter modes of operation in which it rejects
some programs which are correct according to the standard.

For instance, if you run the GNU C compiler with '-Werror', it will
reject all programs for which it emits any kind of diagnostic. Even
something harmless like the suggestion of extra parentheses, or the
definition of a variable that is not used.

I know of one environment that provides a warning when a reference to gets
occurs among the translation units of a program being linked to produce
an image.

There are also environments that support bounds checking on objects,
such as compilers that use a "fat" representation for pointers, and C
interpreters. There, the gets() function is harmless, to the extent that if it
overruns the array, the condition will be detected and turned into a
diagnostic. I.e. there are conceivable situations in which gets() isn't
disastrous.

So it's basically up to implementors and their community: what they
care about.
Now, the C standard committee is working on safe functions (the ones
that end with "_s") for the C Standard Library. I don't know if they
are going to deprecate the dreaded "gets".
gets is not "dreaded". Only dumb programmers are "dreaded". Dumb
programmers will foil any attempt to provide a safe environment. The
only way to make the world nearly 100% safe from dumb programmers is to
put them on an island with no Internet connection.

If you "dread" gets, you have some emotional problem. Normal people
don't think about it, let alone regard it as some Bogey Man.
getting rid of "gets" is not that hard, isn't it?
Yes, don't use it!
Programs that use it are broken anyway.


Not necessarily. Suppose I have a compiler application which is
separated into two programs, the compiler proper and an assembler.

The compiler generates assembly code, which the assembler reads from
its standard input.

My compiler never generates lines longer than 1023 characters, by
design. So the assembler can safely use gets() on a 1024 character
buffer to read the compiler's output.

The assembler is part of my compiler application; it's not meant to be
used alone. So the interface between the two is a private interface.

Years ago I did some Motorola 68000 programming using the GNU assembler
that served as the back-end for gcc. With that assembler, if I
mis-spelled the mnemonic name of an instruction opcode, there wasn't
any nice error message with a line number. Guess what, the assembler
crashed with a segfault! That wasn't a problem, because the assembler
didn't have to be designed to handle incorrect input. It was an
internal interface to be used by the compiler, which put out correct
opcodes. I got my assembly routines working anyway and life went on.

If you think that's a bad idea to have such an interface, well consider
that modules in C and C++ programs often have such "unsafe" internal
interfaces between them. It's not unusual for pointers to arrays to be
passed around without any size being mentioned anywhere, because all of
the modules just "know" the size. It is some manifest constant that
comes from a header file.

If you linked that compiler and assembler into one program, the gets()
would disappear. The compiler would just pass char * pointers directly
into the assembler, which would be understood to point to arrays of
1024 characters.

How are you going to mark /that/ type of practice as deprecated?

Nov 4 '05 #30

P: n/a
Mike Wahler wrote:
"Pete Becker" <pe********@acm.org> wrote in message
news:au********************@rcn.net...
Rolf Magnus wrote:
Pete Becker wrote:
Rolf Magnus wrote:

>Ok, let's apply technical terms then:
>According to the C++ standard, UB is "behavior, such as might arise upon
>use of an erronous program construct or erroneous data,


I think it's unfortunate that the phrase 'erroneous data'
is not elaborated upon. Does it mean only 'data embedded
in the program', or is it intended to include 'external input'?


An external input becomes data in the program once it is read. The
erroneous data in the case of gets() is not the text that is coming
from standard input, but rather the array index, or buffer pointer,
that is driven out of bounds by that text. We don't exactly know what
that is because it's an implementation detail in the library.

At some point, the gets() function will internally form an lvalue that
is one element past the end of the array and assign to it. The pointer
behind that lvalue is erroneous data, with respect to that assignment,
just like zero is erroneous data with respect to its use as a
denominator in division.

If you write a program that inputs two numbers from the user and
divides them, its behavior is undefined if the user inputs a zero
denominator. At some point, the zero value exists as perfectly good
data. It is not inherently "erroneous". It's scanned, assigned to a
variable of type double or int or whatever, and sits there, being a
perfectly good zero. Then, suddenly, it looks up and sees that it's
walking under a slash! Bad luck ...

Nov 4 '05 #31

P: n/a
Kaz Kylheku wrote:
There is an infinite variety of things you can write in a C++ program
which will render it undefined and potentially dangerous in some
situation. Calling the gets() function is just one of these things.

Among harmful things, gets() is one of the easiest to diagnose. It's
easy for an implementation to detect that a program calls gets(), and
issue a diagnostic. Quite simply, the program's translation units
contain an unresolved reference to that function.

Yes. gets() is one of the easiest to diagnose. That's why i posted the
first message. Sure, complex things like rvalue references (random
example) are important, but these simple things are important too. They
change the "feel" of the language.

Therefore, i think it's surprising that this function has not been
deprecated.


Deprecated is only a status change that exists in the minds of the
members of the community. It has no practical impact on what's
happening in the actual software.


This is "excuse number 3" according to a post by John Nagle on
comp.std.c++
(do links like this:
http://groups.google.com.br/group/co...358f4b0fbe5842
work?)
BTW: Thanks, John!

A C or C++ implementation is free to emit whatever diagnostics it
wants, and to support stricter modes of operation in which it rejects
some programs which are correct according to the standard.

And the implementation is also free to provide no diagnostic. So, i
think it would be a good thing to encourage them to provide
diagnostics...
(...)

So it's basically up to implementors and their community: what they
care about.

Now, the C standard committee is working on safe functions (the ones
that end with "_s") for the C Standard Library. I don't know if they
are going to deprecate the dreaded "gets".

gets is not "dreaded". Only dumb programmers are "dreaded". Dumb
programmers will foil any attempt to provide a safe environment. The
only way to make the world nearly 100% safe from dumb programmers is to
put them on an island with no Internet connection.


"Excuse number 5"... I mean, this is irrelevant. If we look at those
extremes, we'll come to no conclusion, because "any attempt to do
anything will fail, since dumb programmers..."

If you "dread" gets, you have some emotional problem. Normal people
don't think about it, let alone regard it as some Bogey Man.

Yes, i scream and cry every time i see a call to gets() or a forum
reply that says to a newbie: "just use gets() to read input". :-PPPPPPP
:-D

getting rid of "gets" is not that hard, isn't it?


Yes, don't use it!


I don't. Maybe that's the reason the K&R book starts showing how to
write a getline function. Luckily, C++ improved this aspect of C.
Programs that use it are broken anyway.


Not necessarily. Suppose I have a compiler application which is
separated into two programs, the compiler proper and an assembler.

The compiler generates assembly code, which the assembler reads from
its standard input.

My compiler never generates lines longer than 1023 characters, by
design. So the assembler can safely use gets() on a 1024 character
buffer to read the compiler's output.

The assembler is part of my compiler application; it's not meant to be
used alone. So the interface between the two is a private interface.

Years ago I did some Motorola 68000 programming using the GNU assembler
that served as the back-end for gcc. With that assembler, if I
mis-spelled the mnemonic name of an instruction opcode, there wasn't
any nice error message with a line number. Guess what, the assembler
crashed with a segfault! That wasn't a problem, because the assembler
didn't have to be designed to handle incorrect input. It was an
internal interface to be used by the compiler, which put out correct
opcodes. I got my assembly routines working anyway and life went on.


Wow. Would you like to see your grandchildren facing the same problems?
:-P :-D

Let's do things right this time. :-)

If you think it's a bad idea to have such an interface, well, consider
that modules in C and C++ programs often have such "unsafe" internal
interfaces between them. It's not unusual for pointers to arrays to be
passed around without any size being mentioned anywhere, because all of
the modules just "know" the size. It is some manifest constant that
comes from a header file.


I don't think about interfaces between functions the same way i think
about interfaces between programs. The security implications are
different.

Nov 5 '05 #32

P: n/a
Marcus wrote:
This is "excuse number 3" according to a post by John Nagle on
comp.std.c++
(do links like this:
http://groups.google.com.br/group/co...358f4b0fbe5842
work?)


"work?" is not part of the link :-o

Nov 5 '05 #33
