Why has "gets" not been deprecated yet?

Marcus wrote:

Opinions? Should this message be posted on comp.std.c++?

You probably want to post it over there, as people on here generally
focus more on application, and less on changing / debating the
standard.

Be careful with the assumption that all things using gets are
inherently flawed, however. =P

Nov 3 '05 #2

Josh Mcfarlane wrote:

Be careful with the assumption that all things using gets are
inherently flawed, however. =P

Well, gets invokes undefined behavior.

Nov 3 '05 #3

Rolf Magnus wrote:

Well, gets invokes undefined behavior.

From a buffer overrun? If not, what else causes the undefined behavior?

Nov 3 '05 #4

Ron Natalie

Josh Mcfarlane wrote:

Rolf Magnus wrote:
Well, gets invokes undefined behavior.

From a buffer overrun? If not, what else causes the undefined behavior?

Many functions in the C library have undefined behavior when given
arguments outside their range. They are by and large a piece of
inherited crap that should have never received standard status (the
STDIO part of the library is the most misdesigned malodorous thing
ever foisted on the community, it was derived from a misnamed
piece of crap from an ancient UNIX project called the "portable
IO library").

Nov 3 '05 #5

Ron Natalie wrote:

Many functions in the C library have undefined behavior when given
arguments outside their range. They are by and large a piece of
inherited crap that should have never received standard status (the
STDIO part of the library is the most misdesigned malodorous thing
ever foisted on the community, it was derived from a misnamed
piece of crap from an ancient UNIX project called the "portable
IO library").

Well, ya, my point was, if you can confine to arguments within their
range, they do function (at least to my knowledge). Good? No, but still
functional.

Anywho, let's go throw this at the std people and see if it can get any
support.

Nov 3 '05 #6

Gaijinco

> Many functions in the C library have undefined behavior when given
arguments outside their range. They are by and large a piece of
inherited crap that should have never received standard status (the
STDIO part of the library is the most misdesigned malodorous thing
ever foisted on the community, it was derived from a misnamed
piece of crap from an ancient UNIX project called the "portable
IO library").

Wow! I had never heard about that. Can you explain a little more about
what the problems of <stdio.h> are?

Nov 3 '05 #7

Ron Natalie

Gaijinco wrote:

Many functions in the C library have undefined behavior when given
arguments outside their range. They are by and large a piece of
inherited crap that should have never received standard status (the
STDIO part of the library is the most misdesigned malodorous thing
ever foisted on the community, it was derived from a misnamed
piece of crap from an ancient UNIX project called the "portable
IO library").

Wow! I had never heard about that. Can you explain a little more about
what the problems of <stdio.h> are?

Functions like gets that have no provisions for safety.
All the functions have arguments in different order. Some
of them have the file stream arg first, some last.
fwrite/fread have a number of records and record size number
that nobody knows what to do with other than multiply together.
It just goes on from there; the library is crap.

Nov 3 '05 #8

tony_in_da_uk

This propensity for undefined behaviour is an example of Design by
Contract (DbC): you meet the preconditions, and you get the contracted
behaviour. The philosophy says: if you stuff up, and fail to pick it
up in your testing, it's your fault and you're a pathetic excuse for a
programmer, (and probably a human being). Anyway, the point is that
DbC can work, but you have to guarantee the preconditions. For gets,
they're extreme: if you know that standard input necessarily sends
lines below a certain length, then you can use it. This is probably
only the case when standard input is coming from some other source that
you control. For example, you might write a filter that works on some
fixed-length records, and is designed to be used in a pipeline ala
(UNIX) "cat file | filter" or (DOS) "type file | filter". Who's to say
that you don't know what you're doing well enough to guarantee the line
length precondition? It's your own call whether you use it.

FWIW, I dislike DbC and agree that gets should hardly ever be used,
would happily consider that it should never be used in new code, but
wouldn't go to the extent of saying that it must never be used and it's
worth breaking existing code using it. More generally, the stdio
library has proven itself a well-designed bit of work, in that while
its error-proneness has been the cause of innumerable errors, its
concision, usability and flexibility have supported innumerable systems
that do useful work. If you think you can write better in C, go ahead
and see if anyone wants to use your creations.... One of the
compromises of C++ is that it should overwhelmingly be a superset of C,
with benefits in porting, skills transfer etc..

Tony

Nov 3 '05 #9

Josh Mcfarlane wrote:

Ron Natalie wrote:
Many functions in the C library have undefined behavior when given
arguments outside their range. They are by and large a piece of
inherited crap that should have never received standard status (the
STDIO part of the library is the most misdesigned malodorous thing
ever foisted on the community, it was derived from a misnamed
piece of crap from an ancient UNIX project called the "portable
IO library").

Well, ya, my point was, if you can confine to arguments within their
range, they do function (at least to my knowledge). Good? No, but still
functional.

The problem about gets is that there is no way for the program to provide
arguments that are really 100% safe. gets will produce a buffer overflow if
the buffer you provided isn't large enough for the incoming data. There is
no (portable) way to make the buffer big enough in every case, since the
program can't control the amount of data that is read. This lack of control
leads me to the conclusion that gets() can be seen as generally invoking
undefined behavior.
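
To make the contrast concrete, here is a minimal sketch of a bounded read
built on std::fgets, which, unlike gets, is told how large the buffer is.
The helper name read_line_bounded and its truncated flag are made up for
illustration and are not part of any standard library; the sketch assumes
the caller passes a buffer of at least two bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper, not a standard function.
// Reads one line from `in` into `buf` without ever writing more than `cap` bytes.
// Returns false on end of file or read error. `truncated` is set when the line
// was longer than the buffer; the excess is read and thrown away.
bool read_line_bounded(std::FILE* in, char* buf, std::size_t cap, bool& truncated)
{
    assert(in != nullptr && buf != nullptr && cap >= 2);  // caller's side of the contract
    truncated = false;
    if (std::fgets(buf, static_cast<int>(cap), in) == nullptr)
        return false;

    std::size_t len = std::strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';            // drop the newline, as gets() did
        return true;
    }

    // No newline stored: the input ended, or the buffer filled up first.
    int c = std::fgetc(in);
    if (c != EOF && c != '\n') {
        truncated = true;
        while ((c = std::fgetc(in)) != EOF && c != '\n') {
            // discard the rest of the over-long line
        }
    }
    return true;
}
```

With an 8-byte buffer, an over-long line is cut to its first 7 characters
and flagged, instead of being written past the end of buf.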

Nov 3 '05 #10

Rolf Magnus wrote:

The problem about gets is that there is no way for the program to provide
arguments that are really 100% safe. gets will produce a buffer overflow if
the buffer you provided isn't large enough for the incoming data. There is
no (portable) way to make the buffer big enough in every case, since the
program can't control the amount of data that is read. This lack of control
leads me to the conclusion that gets() can be seen as generally invoking
undefined behavior.

That's too broad. The behavior of gets is undefined if the input in fact
is too large for the buffer. If it isn't, the behavior is well defined.

That's not a comment on its utility, but on how to apply technical terms.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)

Nov 3 '05 #11

Pete Becker wrote:

Rolf Magnus wrote:

The problem about gets is that there is no way for the program to provide
arguments that are really 100% safe. gets will produce a buffer overflow
if the buffer you provided isn't large enough for the incoming data.
There is no (portable) way to make the buffer big enough in every case,
since the program can't control the amount of data that is read. This
lack of control leads me to the conclusion that gets() can be seen as
generally invoking undefined behavior.

That's too broad. The behavior of gets is undefined if the input in fact
is too large for the buffer. If it isn't, the behavior is well defined.

However, the C++ standard does not specify how large the input is or may be,
and there is no way for the program to know it, so the "if it isn't, the
behavior is well defined" part is of no relevance for my program. I must
assume that the input may be too large, no matter what my program does.

That's not a comment on its utility, but on how to apply technical terms.

Ok, let's apply technical terms then:
According to the C++ standard, UB is "behavior, such as might arise upon use
of an erroneous program construct or erroneous data, for which this
International Standard imposes no requirements". Applying that to your
sentence above, that means that my program has "an erroneous program
construct or erroneous data", if the input is too large and is correct if
the input fits in the provided space. But my program can't control whether
the input fits or not. It can control the size of the buffer, but not the
amount of data coming in, so it doesn't have any way of ensuring the
well-defined behavior that you are writing about.
It's as if you say "the behavior is well-defined only on full moon".

Nov 3 '05 #12

Neil Cerutti

On 2005-11-03, to***********@yahoo.co.uk
<to***********@yahoo.co.uk> wrote:

This propensity for undefined behaviour is an example of Design
by Contract (DbC): you meet the preconditions, and you get the
contracted behaviour. The philosophy says: if you stuff up,
and fail to pick it up in your testing, it's your fault and
you're a pathetic excuse for a programmer, (and probably a
human being). Anyway, the point is that DbC can work, but you
have to guarantee the preconditions. For gets, they're
extreme: if you know that standard input necessarily sends
lines below a certain length, then you can use it. This is
probably only the case when standard input is coming from some
other source that you control. For example, you might write a
filter that works on some fixed-length records, and is designed
to be used in a pipeline ala (UNIX) "cat file | filter" or
(DOS) "type file | filter". Who's to say that you don't know
what you're doing well enough to guarantee the line length
precondition?

Crackers.

--
Neil Cerutti

Nov 3 '05 #13

Ian Malone

Ron Natalie wrote:

fwrite/fread have a number of records and record size number
that nobody knows what to do with other than multiply together.
It just goes on from there; the library is crap.

fwrite and fread return the number of objects written or read,
not the number of chars. But in general you may as well
use <iostream> and friends.
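
A small sketch of what the two size arguments change in practice, assuming
a hypothetical 16-byte Record type (the type and both function names are
invented for this example). The number of bytes requested is the same
product either way, but the return value is counted in whole objects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical fixed-size record, 16 bytes of char data.
struct Record {
    char id[4];
    char payload[12];
};
static_assert(sizeof(Record) == 16, "sketch assumes no padding");

// Asks for `max` objects of sizeof(Record) bytes each.
// The return value counts complete records; a trailing partial one is not counted.
std::size_t read_records(std::FILE* in, Record* out, std::size_t max)
{
    assert(in != nullptr && out != nullptr);
    return std::fread(out, sizeof(Record), max, in);
}

// Asks for `nbytes` objects of 1 byte each, so the return value is a byte count.
std::size_t read_bytes(std::FILE* in, char* out, std::size_t nbytes)
{
    assert(in != nullptr && out != nullptr);
    return std::fread(out, 1, nbytes, in);
}
```

On a 40-byte file, asking for 4 records reports 2 complete records, while
asking for 64 single bytes reports 40.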

--
imalone

Nov 3 '05 #14

tony_in_da_uk

Consider: someone writes two programs that share a header file
containing a buffer-size constant. In one program, lines are generated
and checked against this maximum length. The other program defines a
buffer based on this length, but uses gets(). The two programs may be
reasonably well synchronised, in that a change to the header triggers
rebuilds of both. Just hope they're distributed together too! This is
arguably in line with a workable (but deeply unappealing to me) DbC
philosophy. I can't say it's crackers, even though I'd like to be able
to! - Tony
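
A rough sketch of that two-program arrangement, with everything in one
file for brevity. The constant kMaxLine and the functions fits_contract,
emit and read_contract_line are hypothetical names; the consumer here
deliberately uses the bounded istream::getline rather than gets(), so a
broken contract shows up as a failed read and not as an overrun.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>   // only needed to try this on in-memory streams
#include <string>

// Would live in the header shared by both programs (hypothetical name and value).
constexpr std::size_t kMaxLine = 80;

// Producer side: does this line honour the contract?
bool fits_contract(const std::string& line)
{
    return line.size() <= kMaxLine && line.find('\n') == std::string::npos;
}

// Producer side: the contract is asserted before anything is written.
void emit(std::ostream& out, const std::string& line)
{
    assert(fits_contract(line));
    out << line << '\n';
}

// Consumer side: the buffer is sized from the same constant, and the read is
// bounded, so an over-long line sets failbit instead of overrunning `buf`.
bool read_contract_line(std::istream& in, char (&buf)[kMaxLine + 1])
{
    in.getline(buf, sizeof buf);
    return static_cast<bool>(in);
}
```

The contract still has to be kept in sync by rebuilding both sides; the
difference is only in what happens when it is not.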

Nov 3 '05 #15

Ron Natalie

Ian Malone wrote:

Ron Natalie wrote:
fwrite/fread have a number of records and record size number
that nobody knows what to do with other than multiply together.
It just goes on from there; the library is crap.

fwrite and fread return the number of objects written or read,
not the number of chars. But in general you may as well
use <iostream> and friends.

Yeah, so? But there is no concept of reading anything other
than chars from the stream. All the function does is multiply
those two args together and divide by the size on return.
It's a stupid design.

Nov 3 '05 #16

Rolf Magnus wrote:

Ok, let's apply technical terms then:
According to the C++ standard, UB is "behavior, such as might arise upon use
of an erroneous program construct or erroneous data, for which this
International Standard imposes no requirements". Applying that to your
sentence above, that means that my program has "an erroneous program
construct or erroneous data", if the input is too large and is correct if
the input fits in the provided space. But my program can't control whether
the input fits or not. It can control the size of the buffer, but not the
amount of data coming in, so it doesn't have any way of ensuring the
well-defined behavior that you are writing about.

That's correct.

It's as if you say "the behavior is well-defined only on full moon".

No, it's not. Not being able to control input is not the same as input
always being ill-formed. For a quick and dirty one-off command line
utility I'd have no qualms about using gets.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)

Nov 3 '05 #17

Mike Wahler

<to***********@yahoo.co.uk> wrote in message
news:11**********************@g47g2000cwa.googlegroups.com...

This propensity for undefined behaviour is an example of Design by
Contract (DbC): you meet the preconditions, and you get the contracted
behaviour. The philosophy says: if you stuff up, and fail to pick it
up in your testing, it's your fault and you're a pathetic excuse for a
programmer, (and probably a human being). Anyway, the point is that
DbC can work, but you have to guarantee the preconditions. For gets,
they're extreme: if you know that standard input necessarily sends
lines below a certain length, then you can use it. This is probably
only the case when standard input is coming from some other source that
you control. For example, you might write a filter that works on some
fixed-length records, and is designed to be used in a pipeline ala
(UNIX) "cat file | filter" or (DOS) "type file | filter". Who's to say
that you don't know what you're doing well enough to guarantee the line
length precondition? It's your own call whether you use it.

Who's to say that an input stream with a 'guaranteed' limit of
'record size', did not get corrupted by some outside influence,
rendering the 'guarantee' spurious? I've actually had to deal
with this issue in the real world (receiving data over an RS232
line, subject to occasional 'noise'). My program was not able
to make *any* assumptions about the expected data stream.

'Knowing what I was doing', I knew that such 'guarantee' was
impossible to implement. 'Knowing what I was doing' meant that
it was my program's responsibility to deal with 'dirty' data
in a safe manner (e.g. discarding it, or perhaps re-acquiring it).
-Mike
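
One way to deal with 'dirty' records safely, sketched under the assumption
that records are newline-terminated and that anything longer than max_len
characters is to be discarded. The function name read_clean_records is
hypothetical; the stream calls (istream::getline, clear, ignore) are the
standard ones.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <limits>
#include <sstream>   // only needed to try this on in-memory streams
#include <string>
#include <vector>

// Returns the records of at most `max_len` characters and drops the rest.
// Nothing is ever written past the end of the internal buffer.
std::vector<std::string> read_clean_records(std::istream& in, std::size_t max_len)
{
    assert(max_len > 0);
    std::vector<char> buf(max_len + 1);          // room for the terminating '\0'
    std::vector<std::string> good;
    for (;;) {
        in.getline(buf.data(), static_cast<std::streamsize>(buf.size()));
        if (in) {
            good.emplace_back(buf.data());       // record fit in the buffer
            continue;
        }
        if (in.eof() || in.bad())
            break;                               // nothing more to read
        // failbit alone: the record was too long. Recover and skip its tail.
        in.clear();
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return good;
}
```

Re-acquiring the data instead of discarding it would go in the same branch
that currently calls ignore.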

Nov 3 '05 #18

Pete Becker wrote:

Rolf Magnus wrote:

Ok, let's apply technical terms then:
According to the C++ standard, UB is "behavior, such as might arise upon
use of an erroneous program construct or erroneous data, for which this
International Standard imposes no requirements". Applying that to your
sentence above, that means that my program has "an erroneous program
construct or erroneous data", if the input is too large and is correct if
the input fits in the provided space. But my program can't control
whether the input fits or not. It can control the size of the buffer,
but not the amount of data coming in, so it doesn't have any way of
ensuring the well-defined behavior that you are writing about.

That's correct.

So, you think the correctness of a C++ program can depend on what the user
enters at run-time?

It's as if you say "the behavior is well-defined only on full moon".

No, it's not. Not being able to control input is not the same as input
always being ill-formed.

It's always potentially being ill-formed.

For a quick and dirty one-off command line utility I'd have no qualms
about using gets.

That way of thinking is the reason for quite a lot of security holes.

Nov 4 '05 #19

Rolf Magnus wrote:

It's always potentially being ill-formed.

That's like saying that every value that you set is potentially
invalid. If I have a set output from another digital source on the
machine, it COULD be ill-formed if the machine doesn't work as it's
supposed to, just as if a pointer set to a certain object could randomly
change from the OS not operating as it is supposed to and overwriting
that segment of memory.

Nov 4 '05 #20

Rolf Magnus wrote:

Pete Becker wrote:

Rolf Magnus wrote:
Ok, let's apply technical terms then:
According to the C++ standard, UB is "behavior, such as might arise upon
use of an erroneous program construct or erroneous data, for which this
International Standard imposes no requirements". Applying that to your
sentence above, that means that my program has "an erroneous program
construct or erroneous data", if the input is too large and is correct if
the input fits in the provided space. But my program can't control
whether the input fits or not. It can control the size of the buffer,
but not the amount of data coming in, so it doesn't have any way of
ensuring the well-defined behavior that you are writing about.

That's correct.

So, you think the correctness of a C++ program can depend on what the user
enters at run-time?

It's what the language definition says.

It's as if you say "the behavior is well-defined only on full moon".

No, it's not. Not being able to control input is not the same as input
always being ill-formed.

It's always potentially being ill-formed.

Whatever.

For a quick and dirty one-off command line utility I'd have no qualms
about using gets.

That way of thinking is the reason for quite a lot of security holes.

A quick and dirty one-off command line utility by definition isn't
secure, so security holes aren't important.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)

Nov 4 '05 #21

Mike Wahler

"Pete Becker" <pe********@acm.org> wrote in message
news:au********************@rcn.net...

Rolf Magnus wrote:
Pete Becker wrote:

Rolf Magnus wrote:

Ok, let's apply technical terms then:
According to the C++ standard, UB is "behavior, such as might arise upon
use of an erroneous program construct or erroneous data,

I think it's unfortunate that the phrase 'erroneous data'
is not elaborated upon. Does it mean only 'data embedded
in the program', or is it intended to include 'external input'?

Or perhaps the concept of 'data' is defined elsewhere
I'm unaware of?

-Mike

Nov 4 '05 #22

tony_in_da_uk

Mike, perhaps you're missing my point, or perhaps you're trying to
illustrate the complexities of making the determination. Anyway, your
post highlights that some issues can and should be anticipated and
usefully handled. In contrast, other things can't, or needn't be given
the robustness requirements of a system.

Some examples may help (but I'm starting to wonder). For example,
there's rarely any point worrying about whether the file will be
corrupted during intra-host comms or hard disk I/O, as if that does
happen all bets about the integrity of your process and its operational
environment are off. Obviously, inter-host comms that doesn't perform
its own stream validation benefits from the care you prescribe. In
contrast, it's generally considered unnecessary to validate the
integrity of a TCP/IP comms stream, as the protocol detects errors and
coordinates resends as necessary, and anything reaching your app may be
deemed to be what was sent for all but the most extremely demanding of
purposes.

Consider that mistakes in array/vector indexing cause millions of bugs,
so why not have std::vector::operator[] checked? It is an issue that
has been considered and debated, and most people are happy with the
prioritisation of performance over robustness for this function, are
aware that at() is available for checked access, and can wrap vector<>
redirecting operator[]() to at() if desired. People can make an
informed choice based on their needs.
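
As a minimal sketch of that wrapping idea: the class name checked_vector
and its unchecked() member are invented for this example, and only a few
members are forwarded. at() is the standard checked accessor and throws
std::out_of_range on a bad index.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical wrapper: operator[] is redirected to the checked at().
template <typename T>
class checked_vector {
public:
    checked_vector() = default;
    explicit checked_vector(std::size_t n, const T& value = T()) : data_(n, value) {}

    // Always checked: throws std::out_of_range when i >= size().
    T&       operator[](std::size_t i)       { return data_.at(i); }
    const T& operator[](std::size_t i) const { return data_.at(i); }

    // Contract-style access: the precondition is only asserted, so it costs
    // nothing when NDEBUG is defined and is undefined behavior if violated.
    T& unchecked(std::size_t i)
    {
        assert(i < data_.size());
        return data_[i];
    }

    void push_back(const T& v) { data_.push_back(v); }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<T> data_;
};
```

Code that wants the speed keeps using std::vector directly; code that
wants the check swaps the type and leaves its indexing expressions alone.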

Similarly, you could argue that a text viewer should be written such
that it can view files larger than the available virtual RAM, but that
doesn't mean that it's not useful and sufficient in most cases to
implement one that can't.

In summary, I'm saying that there is an argument as follows: when you
know an approach is sufficiently robust for your requirements, why
shouldn't you be allowed to use it? You can agree or disagree, but I
can assure you that there will be many people out there who believe
passionately in such a position who you'll never convince otherwise.

Tony

Nov 4 '05 #23

Josh Mcfarlane wrote:

That's like saying that every value that you set is potentially
invalid. If I have a set output from another digital source on the
machine, it COULD be ill-formed if the machine doesn't work as it's
supposed to,

There is a difference between the machine not working as it's supposed to
and the program making assumptions that it's not supposed to make.

just as if a pointer set to a certain object could randomly
change from the OS not operating as it is supposed to and overwriting
that segment of memory.

Then that machine is not standard C++ compliant. However, if gets() attempts
to put 10 Terabytes into the buffer, that's perfectly fine with the C++
standard.

Nov 4 '05 #24

Rolf Magnus wrote:

Josh Mcfarlane wrote:
That's like saying that every value that you set is potentially
invalid. If I have a set output from another digital source on the
machine, it COULD be ill-formed if the machine doesn't work as it's
supposed to,

There is a difference between the machine not working as it's supposed to
and the program making assumptions that it's not supposed to make.

My point is, if you have program A, which outputs to a buffer that
program B reads, there are certain assumptions you can make about the
stream, assuming programs A and B are packaged together. A very small case,
yes, but it is still a case in which you could be sure the input data
would be valid.

just as if a pointer set to a certain object could randomly
change from the OS not operating as it is supposed to and overwriting
that segment of memory.

Then that machine is not standard C++ compliant. However, if gets() attempts
to put 10 Terabytes into the buffer, that's perfectly fine with the C++
standard.

Exactly my point. When you're dealing with knowns from another section
of the program or a helper program, you know what you're dealing with.

Nov 4 '05 #25

Rolf Magnus wrote:

Then that machine is not standard C++ compliant. However, if gets() attempts
to put 10 Terabytes into the buffer, that's perfectly fine with the C++
standard.

If the buffer is smaller than 10 Terabytes the behavior is undefined.
That's not "perfectly fine with the C++ standard."

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)

Nov 4 '05 #26

Marcus

to***********@yahoo.co.uk wrote:

Mike, perhaps you're missing my point, or perhaps you're trying to
Consider that mistakes in array/vector indexing cause millions of bugs,
so why not have std::vector::operator[] checked? It is an issue that
has been considered and debated, and most people are happy with the
prioritisation of performance over robustness for this function, are
aware that at() is available for checked access, and can wrap vector<>
redirecting operator[]() to at() if desired. People can make an
informed choice based on their needs.

If you are comparing std::vector::operator[] and gets() (I'm not sure
this is your point), I think the comparison is not valid. The program
creates the vector and knows its size (or uses vector::size()). On the
other hand, it's very hard to control standard input. Maybe if you are
doing interprocess communication (as was pointed out by Josh
Mcfarlane), but I don't think it's a compelling argument to keep
gets(). If gets() is removed in C++2020 (after deprecation in C++0x),
people who miss it may reimplement it. But I expect that in 2020
everybody will have changed their gets() to getline()...
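
For code that still wants the line in a fixed char buffer, such a
reimplementation might look roughly like this. The name get_line_into is
hypothetical; it leans on std::getline, which grows a std::string as
needed, and copies into the caller's buffer only when the line fits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <sstream>   // only needed to try this on in-memory streams
#include <string>

// Hypothetical migration shim for code that used to call gets(buf).
// Reads a whole line with std::getline, then copies it into `buf` only if it
// fits, terminating '\0' included. Returns false at end of input or when the
// line is too long; `buf` is left untouched in both cases.
bool get_line_into(std::istream& in, char* buf, std::size_t cap)
{
    assert(buf != nullptr && cap > 0);   // caller's side of the contract
    std::string line;
    if (!std::getline(in, line) || line.size() >= cap)
        return false;
    std::memcpy(buf, line.c_str(), line.size() + 1);
    return true;
}
```

Passing std::cin gives the closest match to the old gets() call; the
difference is that the buffer size is now an argument.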

Nov 4 '05 #27