integer overflow in scanf functions

hi.

I wanted to know why the scanf functions don't check for overflow
when reading a number. For example, scanf("%d", ...) on a 32-bit machine
considers "1" and "4294967297" to be the same.

I tracked the code to where the conversion itself happens. The code in
scanf just ignores the return value from the conversion procedures.
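
The "1" versus "4294967297" coincidence is consistent with the converted value wrapping modulo 2^32, since 4294967297 = 2^32 + 1. A minimal sketch of that arithmetic, assuming a 32-bit unsigned target; the helper name wrap_to_u32 is hypothetical, and this only illustrates the arithmetic, not what any particular scanf does internally:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal string with strtoull (wide enough
   for this example) and reduce it modulo 2^32 by converting to uint32_t.
   Unsigned conversion is well defined, unlike the overflowing %d case. */
uint32_t wrap_to_u32(const char *text)
{
    unsigned long long wide = strtoull(text, NULL, 10);
    return (uint32_t)wide;
}
```

With this helper, both "1" and "4294967297" reduce to the same 32-bit value.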

More info in case of glibc posted here:
http://board.flatassembler.net/topic.php?t=6359

AFAIK, the standard doesn't define the behavior in case of overflow, so
glibc could treat this as an error and set errno=ERANGE

Dec 15 '06 #1
In article <11**********************@79g2000cws.googlegroups.com>,
vi****@gmail.com <vi****@gmail.com> wrote:
>i wanted to know why doesn't the scanf functions check for overflow
>when reading number. For example scanf("%d" on 32bit machine considers
>"1" and "4294967297" to be the same.

Because that's how it is spec'd.

"An input item is defined as the longest matching sequence of
characters, unless that exceeds a specified field width, in
which case it is the initial subsequence of that length in
the sequence." [...]

"Except in the case of a % specifier, the input item (or, in the
case of a %n directive, the count of input characters) is
converted to a type appropriate for the conversion specifier. [...]
Unless assignment suppression was indicated by a *, the result
of the conversion is placed in the object pointed to by the first
argument following the format argument that has not
already received a conversion result. If this object does not
have an appropriate type, or if the result of the conversion cannot
be represented in the space provided, the behaviour is undefined."
So there you have it: if you didn't put in a field width, then
the %d is *required* to pull in all the decimal digits there, and
if that's too big for an int, then the result is officially undefined.
This is how fscanf (and hence scanf) are -required- to work according
to the standard.
--
I was very young in those days, but I was also rather dim.
-- Christopher Priest
Dec 15 '06 #2
2006-12-15 <el**********@canopus.cc.umanitoba.ca>,
Walter Roberson wrote:
In article <11**********************@79g2000cws.googlegroups.com>,
vi****@gmail.com <vi****@gmail.com> wrote:
>>i wanted to know why doesn't the scanf functions check for overflow
when reading number. For example scanf("%d" on 32bit machine considers
"1" and "4294967297" to be the same.

Because that's how it is spec'd.

"An input item is defined as the longest matching sequence of
characters,
And in what way is "429496729" a matching sequence of characters, if
there is no such integer value?
unless that exceeds a specified field width, in
which case it is the initial subsequence of that length in
the sequence." [...]

"Except in the case of a % specifier, the input item (or, in the
case of a %n directive, the count of input characters) is
converted to a type appropriate for the conversion specifier. [...]
Unless assignment suppression was indicated by a *, the result
of the conversion is placed in the object pointed to by the first
argument following the format argument that has not
already received a conversion result. If this object does not
have an appropriate type, or if the result of the conversion cannot
be represented in the space provided, the behaviour is undefined."
It's undefined. Which means there _are_ no requirements. An
implementation is free to treat it as 1, or as 429496729 with 7 still on
the stream, or as such with 7 _not_ still on the stream, or as
4294967295 (saturation), etc, etc

Anyway, I found a possible situation in which my scanf is
non-conformant:

Numerical strings are truncated to 512 characters; for example, %f
and %d are implicitly %512f and %512d.

So, if I send %f

1.000000000000000000000000000000000000000000000000 0000000000000
00000000000000000000000000000000000000000000000000 0000000000000
00000000000000000000000000000000000000000000000000 0000000000000
00000000000000000000000000000000000000000000000000 0000000000000
00000000000000000000000000000000000000000000000000 0000000000000
00000000000000000000000000000000000000000000000000 0000000000000
00000000000000000000000000000000000000000000000000 0000000000000
00000000000000000000000000000000000000000000000000 0000000000000
e1

it converts to 1 instead of 10. Does the standard allow this?
Dec 15 '06 #3
Walter Roberson a écrit :
In article <11**********************@79g2000cws.googlegroups.com>,
vi****@gmail.com <vi****@gmail.com> wrote:

>>i wanted to know why doesn't the scanf functions check for overflow
when reading number. For example scanf("%d" on 32bit machine considers
"1" and "4294967297" to be the same.


Because that's how it is spec'd.

"An input item is defined as the longest matching sequence of
characters, unless that exceeds a specified field width, in
which case it is the initial subsequence of that length in
the sequence." [...]

"Except in the case of a % specifier, the input item (or, in the
case of a %n directive, the count of input characters) is
converted to a type appropriate for the conversion specifier. [...]
Unless assignment suppression was indicated by a *, the result
of the conversion is placed in the object pointed to by the first
argument following the format argument that has not
already received a conversion result. If this object does not
have an appropriate type, or if the result of the conversion cannot
be represented in the space provided, the behaviour is undefined."
So there you have it: if you didn't put in a field width, then
the %d is *required* to pull in all the decimal digits there, and
if that's too big for an int, then the result is officially undefined.
This is how fscanf (and hence scanf) are -required- to work according
to the standard.
In general functions like scanf are unusable. They are so
problematic, that it is a wonder when they work at all.

Use strtol, or a similar function that will give reasonable
error returns...
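
A minimal sketch of that suggestion, assuming the text to convert is already in a string; parse_int and its return codes are hypothetical names chosen for illustration:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert a decimal string to int.
   Returns 0 on success, -1 if no digits were found or trailing
   characters remain, and -2 if the value does not fit in an int. */
int parse_int(const char *text, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return -1;               /* not a (complete) number */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return -2;               /* overflow is reported, not undefined */
    *out = (int)value;
    return 0;
}
```

Unlike scanf("%d"), an input such as "4294967297" is rejected here instead of being silently stored, whether long is 32 or 64 bits wide.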
Dec 15 '06 #4
In article <sl*******************@rlaptop.random.yi.org>,
Random832 <ra*******@gmail.com> wrote:
>2006-12-15 <el**********@canopus.cc.umanitoba.ca>,
Walter Roberson wrote:
>In article <11**********************@79g2000cws.googlegroups.com>,
vi****@gmail.com <vi****@gmail.com> wrote:
>>>i wanted to know why doesn't the scanf functions check for overflow
>"An input item is defined as the longest matching sequence of
characters,
>And in what way is "429496729" a matching sequence of characters, if
there is no such integer value?
The match is based upon the lexical grammar, and the lexical
grammar does not put limitations on the number or content of the
decimal digits.
--
Okay, buzzwords only. Two syllables, tops. -- Laurie Anderson
Dec 15 '06 #5
2006-12-15 <el**********@canopus.cc.umanitoba.ca>,
Walter Roberson wrote:
In article <sl*******************@rlaptop.random.yi.org>,
Random832 <ra*******@gmail.com> wrote:
>>2006-12-15 <el**********@canopus.cc.umanitoba.ca>,
Walter Roberson wrote:
>>In article <11**********************@79g2000cws.googlegroups.com>,
vi****@gmail.com <vi****@gmail.com> wrote:
>>>>i wanted to know why doesn't the scanf functions check for overflow
>>"An input item is defined as the longest matching sequence of
characters,
>>And in what way is "429496729" a matching sequence of characters, if
there is no such integer value?

The match is based upon the lexical grammar, and the lexical
grammar does not put limitations on the number or content of the
decimal digits.
OK. The rest of my post stands. undefined is undefined, it's not
"required" to do anything in such a case.
Dec 15 '06 #6
so, we agree, it's undefined.

wouldn't it be better to report this overflow as an error? 10 digits would
be read off the file/stream/whatever, and the function would return as if
the number format were invalid, with errno=ERANGE.

i don't think the current behavior is what people expect. and the scanf
functions already do a lot of "smart" stuff, just because people
expect such behavior.

Dec 15 '06 #7
In article <sl*******************@rlaptop.random.yi.org>,
Random832 <ra*******@gmail.com> wrote:
>"Except in the case of a % specifier, the input item (or, in the
case of a %n directive, the count of input characters) is
converted to a type appropriate for the conversion specifier. [...]
Unless assignment suppression was indicated by a *, the result
of the conversion is placed in the object pointed to by the first
argument following the format argument that has not
already received a conversion result. If this object does not
have an appropriate type, or if the result of the conversion cannot
be represented in the space provided, the behaviour is undefined."
>It's undefined. Which means there _are_ no requirements. An
implementation is free to treat it as 1, or as 429496729 with 7 still on
the stream, or as such with 7 _not_ still on the stream, or as
4294967295 (saturation), etc, etc
No, consumption of the maximum characters is -required-. It cannot
leave the other characters in the stream. The undefined part comes
in the valuation and storage of the overly-long result, not in
how many characters are consumed from input.
--
All is vanity. -- Ecclesiastes
Dec 15 '06 #8
Walter Roberson wrote:
In article <sl*******************@rlaptop.random.yi.org>,
Random832 <ra*******@gmail.com> wrote:
>>
It's undefined. Which means there _are_ no requirements. An
implementation is free to treat it as 1, or as 429496729 with 7 still on
the stream, or as such with 7 _not_ still on the stream, or as
4294967295 (saturation), etc, etc

No, consumption of the maximum characters is -required-. It cannot
leave the other characters in the stream. The undefined part comes
in the valuation and storage of the overly-long result, not in
how many characters are consumed from input.
Once undefined behavior strikes, the program has no way
to tell how many characters were or were not consumed. All
requirements lose their force in the face of U.B.

--
Eric Sosman
es*****@acm-dot-org.invalid
Dec 15 '06 #9
2006-12-15 <el**********@canopus.cc.umanitoba.ca>,
Walter Roberson wrote:
In article <sl*******************@rlaptop.random.yi.org>,
Random832 <ra*******@gmail.com> wrote:
>>"Except in the case of a % specifier, the input item (or, in the
case of a %n directive, the count of input characters) is
converted to a type appropriate for the conversion specifier. [...]
Unless assignment suppression was indicated by a *, the result
of the conversion is placed in the object pointed to by the first
argument following the format argument that has not
already received a conversion result. If this object does not
have an appropriate type, or if the result of the conversion cannot
be represented in the space provided, the behaviour is undefined."
>>It's undefined. Which means there _are_ no requirements. An
implementation is free to treat it as 1, or as 429496729 with 7 still on
the stream, or as such with 7 _not_ still on the stream, or as
4294967295 (saturation), etc, etc

No, consumption of the maximum characters is -required-. It cannot
leave the other characters in the stream. The undefined part comes
in the valuation and storage of the overly-long result, not in
how many characters are consumed from input.
No, I don't think you get it.

In an undefined situation, the standard forbids nothing.

Meaning the implementation gets to do whatever the f*** it wants to,
regarding anything, once anything has happened that has been undefined.
Dec 16 '06 #10
jacob navia wrote:
>
.... snip ...
>
In general functions like scanf are unusable. They are so
problematic, that it is a wonder when they work at all.

Use strtol, or a similar function that will give reasonable
error returns...
No, that requires assigning a buffer of sufficient size, which is
unknown a-priori. Instead take a look at:

<http://cbfalconer.home.att.net/download/txtio.zip>

(which has been revised, but not posted) for a method of reading
values from a text stream without any buffer assignment needed. In
particular see txtinput.c.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>
Dec 16 '06 #11
Eric Sosman wrote:
Walter Roberson wrote:
In article <sl*******************@rlaptop.random.yi.org>,
Random832 <ra*******@gmail.com> wrote:
>
It's undefined. Which means there _are_ no requirements. An
implementation is free to treat it as 1, or as 429496729 with 7 still on
the stream, or as such with 7 _not_ still on the stream, or as
4294967295 (saturation), etc, etc
No, consumption of the maximum characters is -required-. It cannot
leave the other characters in the stream. The undefined part comes
in the valuation and storage of the overly-long result, not in
how many characters are consumed from input.

Once undefined behavior strikes, the program has no way
to tell how many characters were or were not consumed.
All requirements lose their force in the face of U.B.
True, but suppose an implementation defines the usual non-trapping 2c
overflow or strtoxxx style behaviour for the %d fscanf case, then the
behaviour is no longer undefined and the normal rules apply.

Of course, few implementations go so far as to actually define (i.e.
guarantee) such behaviour, let alone document it.

--
Peter

Dec 16 '06 #12
In article <sl*******************@rlaptop.random.yi.org>,
Random832 <ra*******@gmail.com> wrote:
>2006-12-15 <el**********@canopus.cc.umanitoba.ca>,
Walter Roberson wrote:
>In article <sl*******************@rlaptop.random.yi.org>,
Random832 <ra*******@gmail.com> wrote:
>>>Unless assignment suppression was indicated by a *, the result
of the conversion is placed in the object pointed to by the first
argument following the format argument that has not
already received a conversion result. If this object does not
have an appropriate type, or if the result of the conversion cannot
be represented in the space provided, the behaviour is undefined."
>No, I don't think you get it.
>In an undefined situation, the standard forbids nothing.
>Meaning the implementation gets to do whatever the f*** it wants to,
regarding anything, once anything has happened that has been undefined.
The C90 standard defines a three-part operation, first reading
the characters, then converting the type of the value, and then
attempting to store the received value. The first two parts
do not allow for undefined behaviour: only the storage aspect does.

Therefore, in a conforming C90 implementation, the complete sequence
of decimal digits is certain to be read. Stopping reading the stream
at the maximum usable int length (for %d) is not one of the options.
The "undefined behaviour" might then go through the trouble of
"putting back" the extra characters somehow, but read them first it
must.

Ah, there's a simple way to tell: use assignment suppression. Then no
actual storage attempt takes place, so whether the receiving variable
is the right size or type is not at question, and undefined behaviour
cannot take place. If you then have another format element to read a
value, or use %n to find the number of characters read, you can
determine where the %d scan left off. C90 tells you where you
should be (i.e., after the sequence of decimal characters); if
your system does leave you in the middle then your system is wrong.
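
The assignment-suppression test can be sketched as follows. This is a minimal illustration, assuming the input is already in a string and sscanf is used; digits_consumed is a hypothetical helper name:

```c
#include <stdio.h>

/* Hypothetical helper: return how many characters a suppressed %d
   consumes from text, or -1 if no integer matches. Nothing is stored
   for %*d, so no object has to hold the oversized value. */
int digits_consumed(const char *text)
{
    int count = -1;   /* stays -1 if the %d directive fails to match */

    if (sscanf(text, "%*d%n", &count) == EOF)
        return -1;    /* input ended before any conversion */
    return count;
}
```

On an implementation that behaves as described above, digits_consumed("4294967297 rest") should report 10, one for each digit.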
--
There are some ideas so wrong that only a very intelligent person
could believe in them. -- George Orwell
Dec 16 '06 #13
In article <sl*******************@rlaptop.random.yi.org>
Random832 <ra*******@gmail.com> wrote:
>Anyway, I found a possible situation in which my scanf is
non-conformant:

Numerical strings are truncated to 512 characters; for example, %f
and %d are implicitly %512f and %512d.

So, if I send %f

1.00000000000000000000000000000000000000000000000 00000000000000
0000000000000000000000000000000000000000000000000 00000000000000
0000000000000000000000000000000000000000000000000 00000000000000
0000000000000000000000000000000000000000000000000 00000000000000
0000000000000000000000000000000000000000000000000 00000000000000
0000000000000000000000000000000000000000000000000 00000000000000
0000000000000000000000000000000000000000000000000 00000000000000
0000000000000000000000000000000000000000000000000 00000000000000
e1

it converts to 1 instead of 10. Does the standard allow this?
Yes:

Environmental limits

[#7] An implementation shall support text files with lines
containing at least 254 characters, including the
terminating new-line character. The value of the macro
BUFSIZ shall be at least 256.

(under "7.13.2 Streams" in the draft .txt file I keep handy).

Most stdio implementations will have *some* convenient limit, as
they will read numerical input into a buffer and then use strtol(),
strtoll(), strtod(), etc., to perform the actual conversions. That
limit must be at least 254, but need not be as high as BUFSIZ (that
is, just because BUFSIZ is, say, 8192, does not mean that scanf()
must be able to eat 8192-digit numbers).
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Dec 16 '06 #14
2006-12-16 <em**********@canopus.cc.umanitoba.ca>,
Walter Roberson wrote:
The C90 standard defines a three-part operation, first reading
the characters, then converting the type of the value, and then
attempting to store the received value. The first two parts
do not allow for undefined behaviour: only the storage aspect does.
And once the storage aspect _does_ have undefined behavior, it can
then go backwards in time and change how the other two aspects operated
in the first place.

In an undefined situation, the C standard forbids nothing.
Therefore, in a conforming C90 implementation, the complete sequence
of decimal digits is certain to be read. Stopping reading the stream
at the maximum usable int length (for %d) is not one of the options.
The "undefined behaviour" might then go through the trouble of
"putting back" the extra characters somehow, but read them first it
must.
It's undefined, there's no rule against time paradoxes.
Ah, there's a simple way to tell: use assignment suppression. Then no
actual storage attempt takes place, so whether the receiving variable
is the right size or type is not at question, and undefined behaviour
cannot take place.
But since the behavior is undefined when assignment suppression is not
used, it's free to act differently than if it is used.
Dec 17 '06 #15
2006-12-16 <em*********@news1.newsguy.com>,
Chris Torek wrote:
In article <sl*******************@rlaptop.random.yi.org>
Random832 <ra*******@gmail.com> wrote:
>>Anyway, I found a possible situation in which my scanf is
non-conformant:

Numerical strings are truncated to 512 characters; for example, %f
and %d are implicitly %512f and %512d.

So, if I send %f

1.0000000000000000000000000000000000000000000000 000000000000000
000000000000000000000000000000000000000000000000 000000000000000
000000000000000000000000000000000000000000000000 000000000000000
000000000000000000000000000000000000000000000000 000000000000000
000000000000000000000000000000000000000000000000 000000000000000
000000000000000000000000000000000000000000000000 000000000000000
000000000000000000000000000000000000000000000000 000000000000000
000000000000000000000000000000000000000000000000 000000000000000
e1

it converts to 1 instead of 10. Does the standard allow this?

Yes:

Environmental limits

[#7] An implementation shall support text files with lines
containing at least 254 characters, including the
terminating new-line character. The value of the macro
BUFSIZ shall be at least 256.
And what about sscanf?

#include <stdio.h>
#include <string.h>

int main(void) {
    char x[515];   /* "1." + 510 zeros + "e1" + terminator */
    double n;
    memset(x+2,'0',510);
    x[0] = '1'; x[1] = '.'; x[512] = 'e'; x[513] = '1'; x[514] = 0;
    sscanf(x,"%lf",&n); printf("%f",n);
    return 0;
}

prints 1 or 10?
Dec 17 '06 #16
>>In article <sl*******************@rlaptop.random.yi.org>
>>Random832 <ra*******@gmail.com> wrote:
>>>Does the standard allow [scanf to place limits on the size of
numbers converted with %d, %f, etc]

In article <em*********@news1.newsguy.com> I wrote:
>Yes:
Environmental limits
[snippage]

In article <sl*******************@rlaptop.random.yi.org>
Random832 <ra*******@gmail.com> wrote:
>And what about sscanf?
As far as I can tell, the same rules apply.

Since there is no documentation requirement and no fixed upper
bound (just that "254" I quoted as a lower bound), each scanf
(either each call, or each member of the family, or both) could
use a different limit, too, as long as it is at least 254 each
time.

Practically speaking, I would expect either all the functions
(scanf, fscanf, and sscanf) would have the same limit because they
use the same internal engine; or the engine might "see" that sscanf
is working off a string in memory, hence there is no need to make
a copy of digit-sequences for strto*(), hence sort of "accidentally"
avoid upper limits there. (However, the ruling that conversion
of, e.g., "1.23e-x" must fail, instead of converting "1.23" and
leaving the e-x for the next directive, would make this harder than
one might think at first. If the implementor just used the endptr
parameter from strtod(), sscanf against "1.23e-x" with "%f%s" would
convert two items successfully, instead of failing as required.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Dec 18 '06 #17
Chris Torek wrote:
>
.... snip ...
>
Practically speaking, I would expect either all the functions
(scanf, fscanf, and sscanf) would have the same limit because they
use the same internal engine; or the engine might "see" that sscanf
is working off a string in memory, hence there is no need to make
a copy of digit-sequences for strto*(), hence sort of "accidentally"
avoid upper limits there. (However, the ruling that conversion
of, e.g., "1.23e-x" must fail, instead of converting "1.23" and
leaving the e-x for the next directive, would make this harder than
one might think at first. If the implementor just used the endptr
parameter from strtod(), sscanf against "1.23e-x" with "%f%s" would
convert two items successfully, instead of failing as required.)
There is no necessity to have ANY string length limit affect these
textstream-to-number conversions. I have written code that avoids
the problem entirely. However the error condition for "1.2e-x"
sequences remains. This can obviously be handled easily when the
input is a string, and is otherwise limited by the guaranteed
lookback (ungetc) level.

I disagree that such an input must fail. The interpretation as a
number, followed by a string, seems perfectly reasonable to me.
The cure here is that the application must check the termination
char for the numeric field.

In addition, there should be no problem at the system level in
providing multi-level ungetc ability, provided that the system
never has to back up across line ends. Since a '\n' will always
terminate any numeric input field, this is no hardship. A short
time ago I wrote a small test program to detect this capability,
and found that DJGPP has it. I published the little test here at
the time. So this reduces to a quality of implementation issue.

In practice this all means that the scanf series of functions
should not be used to input numerics without limiting the call to a
single field.

Here is my test program for ungetc levels (tungetc.c):

#include <stdio.h>
#include <stdlib.h>
#define MAXLN 10

int main(void) {
   char line[MAXLN + 1];
   int ix, ch;

   puts("Test ability to ungetc for multiple chars in one line");
   fputs("Enter no more than 10 chars:", stdout); fflush(stdout);
   ix = 0;
   while ((EOF != (ch = getchar())) && ('\n' != ch)) {
      if (MAXLN <= ix) break;
      line[ix++] = ch;
   }
   line[ix] = '\0';
   if ('\n' != ungetc('\n', stdin)) {
      puts("Can't unget a '\\n'");
      return(EXIT_FAILURE);
   }
   puts(line);
   puts("Trying to push back the whole line");
   while (ix > 0) {
      ch = ungetc(line[--ix], stdin);
      if (ch == line[ix]) putchar(ch);
      else {
         putchar(line[ix]);
         puts(" failed to push back");
         return(EXIT_FAILURE);
      }
   }
   puts("\nTrying to reread the whole line");
   while ((EOF != (ch = getchar())) && ('\n' != ch)) {
      if (ix++ == MAXLN) break;
      putchar(ch);
   }
   return 0;
} /* main */

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>
Dec 18 '06 #18
2006-12-18 <em*********@news4.newsguy.com>,
Chris Torek wrote:
>>>In article <sl*******************@rlaptop.random.yi.org>
Random832 <ra*******@gmail.com> wrote:
Does the standard allow [scanf to place limits on the size of
numbers converted with %d, %f, etc]

In article <em*********@news1.newsguy.com> I wrote:
>>Yes:
Environmental limits
[snippage]

In article <sl*******************@rlaptop.random.yi.org>
Random832 <ra*******@gmail.com> wrote:
>>And what about sscanf?

As far as I can tell, the same rules apply.
That rule does not allow a limit for any scanf function - it allows
limits for other things which allows an implementation to be written for
which no such case is possible for scanf or fscanf - that is not the
same thing.
Since there is no documentation requirement and no fixed upper
bound (just that "254" I quoted as a lower bound), each scanf
(either each call, or each member of the family, or both) could
use a different limit, too, as long as it is at least 254 each
time.
The section you quoted has absolutely nothing to do with any *scanf
function, and even less to do with sscanf.
Dec 18 '06 #19
2006-12-18 <45***************@yahoo.com>,
CBFalconer wrote:
Chris Torek wrote:
>>
... snip ...
>>
Practically speaking, I would expect either all the functions
(scanf, fscanf, and sscanf) would have the same limit because they
use the same internal engine; or the engine might "see" that sscanf
is working off a string in memory, hence there is no need to make
a copy of digit-sequences for strto*(), hence sort of "accidentally"
avoid upper limits there. (However, the ruling that conversion
of, e.g., "1.23e-x" must fail, instead of converting "1.23" and
leaving the e-x for the next directive, would make this harder than
one might think at first. If the implementor just used the endptr
parameter from strtod(), sscanf against "1.23e-x" with "%f%s" would
convert two items successfully, instead of failing as required.)

There is no necessity to have ANY string length limit affect these
textstream-to-number conversions.
He was apparently saying, though, that it is _permitted_ for an
implementation to limit it to 512 characters, and quoted an unrelated
section of the standard that makes it difficult [but clearly not
impossible, as shown by my post] to construct a test case.

If I pass
1.00000000000000000000000000000000000000000000000000000000000000\
0000000000000000000000000000000000000000000000000000000000000000\
0000000000000000000000000000000000000000000000000000000000000000\
0000000000000000000000000000000000000000000000000000000000000000\
0000000000000000000000000000000000000000000000000000000000000000\
0000000000000000000000000000000000000000000000000000000000000000\
0000000000000000000000000000000000000000000000000000000000000000\
0000000000000000000000000000000000000000000000000000000000000000e1 to
scanf, I expect it to come back with ten, not one, as the result value.

No-one has provided a convincing argument that an implementation which
stores 1. in the pointed-to argument is legal.
Dec 18 '06 #20
>In article <sl*******************@rlaptop.random.yi.org>
>Random832 <ra*******@gmail.com> wrote:
>>>And what about sscanf?
>2006-12-18 <em*********@news4.newsguy.com>,
Chris Torek wrote:
>As far as I can tell, the ["environmental limits"] rules apply.
In article <sl*******************@rlaptop.random.yi.org>,
Random832 <ra*******@gmail.com> wrote:
>That rule does not allow a limit for any scanf function - it allows
limits for other things which allows an implementation to be written for
which no such case is possible for scanf or fscanf - that is not the
same thing.
Possibly not. I think it still applies, though.
>The section you quoted has absolutely nothing to do with any *scanf
function, and even less to do with sscanf.
If you do not like my answer, you probably want comp.std.c, not
comp.lang.c. :-)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Dec 19 '06 #21
2006-12-19 <em*********@news4.newsguy.com>,
Chris Torek wrote:
>>In article <sl*******************@rlaptop.random.yi.org>
Random832 <ra*******@gmail.com> wrote:
And what about sscanf?
>>2006-12-18 <em*********@news4.newsguy.com>,
Chris Torek wrote:
>>As far as I can tell, the ["environmental limits"] rules apply.

In article <sl*******************@rlaptop.random.yi.org>,
Random832 <ra*******@gmail.com> wrote:
>>That rule does not allow a limit for any scanf function - it allows
limits for other things which allows an implementation to be written for
which no such case is possible for scanf or fscanf - that is not the
same thing.

Possibly not. I think it still applies, though.
What environmental limit does my test program posted earlier violate?
[if there's an auto array size limit that i'm not considering, move the
array to file scope]
Dec 19 '06 #22
>2006-12-19 <em*********@news4.newsguy.com>,
>Chris Torek wrote:
>>Possibly not. I think [the "environmental limit"] still applies [to
sscanf], though.
In article <sl*******************@rlaptop.random.yi.org>,
Random832 <ra*******@gmail.com> wrote:
>What environmental limit does my test program posted earlier violate?
Although the limit is for "lines" in text files, and a string is
not a line in a text file, I believe it is intended to generalize
to "text mode stdio streams". If (as I believe) sscanf treats a
string as if it were a text-mode stdio stream, the limit then
intrudes itself rudely upon strings fed to sscanf().

(Since this is really a question about interpreting the number of
angels that must be collectible on various pin-heads in the Standard,
comp.std.c is a better group.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Dec 19 '06 #23
2006-12-19 <em********@news3.newsguy.com>,
Chris Torek wrote:
>>2006-12-19 <em*********@news4.newsguy.com>,
Chris Torek wrote:
>>>Possibly not. I think [the "environmental limit"] still applies [to
sscanf], though.

In article <sl*******************@rlaptop.random.yi.org>,
Random832 <ra*******@gmail.com> wrote:
>>What environmental limit does my test program posted earlier violate?

Although the limit is for "lines" in text files, and a string is
not a line in a text file, I believe it is intended to generalize
to "text mode stdio streams". If (as I believe) sscanf treats a
string as if it were a text-mode stdio stream, the limit then
intrudes itself rudely upon strings fed to sscanf().

(Since this is really a question about interpreting the number of
angels that must be collectible on various pin-heads in the Standard,
comp.std.c is a better group.)
Agreed. For those of you just joining us, my implementation is documented thus:

Numerical strings are truncated to 512 characters; for example, %f
and %d are implicitly %512f and %512d.

Is this permitted by the standard?

That is, is an implementation allowed to, on a %f format, interpret
1.0<etc>0e1 as 1.0 rather than 10.0, and to leave some of the <etc>0e1
for the next format specifier? Some people think the limitation on
text file line lengths covers this case. I think it does not: first,
the implementation otherwise supports text file lines of arbitrary
length, and second, the argument to sscanf is not a line of
a text file.
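
As a point of comparison, here is a small sketch of what an explicit field width does, where the standard clearly allows the split. The helper name split_with_width is hypothetical, and a width of 4 stands in for the documented 512 so the effect is easy to see.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: scan a double with an explicit field width of 4,
 * then collect the next whitespace-delimited token into rest.
 * Returns the number of items sscanf assigned. */
int split_with_width(const char *input, double *value, char rest[16])
{
    assert(input != NULL && value != NULL && rest != NULL);
    rest[0] = '\0';
    /* %4lf may consume at most 4 characters; %15s takes what follows. */
    return sscanf(input, "%4lf%15s", value, rest);
}
```

Given the input "1.00e1", the width stops the first conversion after "1.00", so the value is 1.0 and "e1" is left for the %15s. With a plain %lf the same input converts to 10.0. The open question is whether an implementation may impose such a width silently.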
Dec 20 '06 #24
Note that here, "such an input" is, e.g., "1.23e-xyz":

int n;
double d;
char buf[100];

n = sscanf("1.23e-xyz", "%lf%99s", &d, buf);

In article <45***************@yahoo.com>
CBFalconer <cb********@maineline.net> wrote:
>I disagree that such an input must fail.

If you mean "it is possible to handle this in a computer program,
without having it `fail', so that d is set to 1.23 and buf is set
to e-xyz", then yes.

If you mean "the Standard does not require that this fail", then
no: a DR or TR at some point in the past (back in the 1990s) said
otherwise.

This irked me, because my stdio handled it just fine, setting n
to 2, d to 1.23, and copying the string "e-xyz" into buf[]. But
that is what they said: it must fail. Here, n must be set to 0,
and d and buf[] must be unaltered.
>In practice this all means that the scanf series of functions
>should not be used to input numerics without limiting the call to a
>single field.

Because of the silly required failure, it should not even be used
for that. Better to get the string into a buffer, and then sscanf()
or (better) strtod(), strtol(), etc.
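
A minimal sketch of the buffer-then-strtol() approach, which also catches the overflow this thread started with. The name parse_int and its return codes are hypothetical, chosen only for illustration; the text is assumed to have been read into a buffer elsewhere, for example with fgets().

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert decimal text to an int.
 * Returns 0 on success, -1 if no digits were found,
 * -2 if the value does not fit in an int. */
int parse_int(const char *text, int *out)
{
    char *end;
    long v;

    assert(text != NULL && out != NULL);
    errno = 0;                      /* strtol only sets errno on error */
    v = strtol(text, &end, 10);
    if (end == text)
        return -1;                  /* no conversion was performed */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -2;                  /* out of range for long or for int */
    *out = (int)v;
    return 0;
}
```

With this, "1" and "4294967297" are no longer confused: the first converts normally and the second is reported as out of range, whether long is 32 or 64 bits wide.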

Note that the Standard requires that, given:

char *ep;
d = strtod("1.23e-xyz", &ep);

d must be set to 1.23, and ep must point to the 'e' in "e-xyz".
That is, the requirements for strtod() and the scanf() family are
different.
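
The strtod() half of that comparison can be checked with a sketch like the following. The wrapper strtod_prefix is hypothetical, not a library function; it simply reports how many characters strtod() consumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper: convert the leading part of text with strtod()
 * and return the offset of the first character that was not consumed. */
size_t strtod_prefix(const char *text, double *out)
{
    char *end;

    assert(text != NULL && out != NULL);
    *out = strtod(text, &end);
    return (size_t)(end - text);
}
```

For "1.23e-xyz" the returned offset should be 4, the position of the 'e', with the value 1.23 stored through out.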

It might be nice if the Standard would (or, possibly, does) also
require that both the scanf engine and the strtod() routine handle
"arbitrarily long" inputs wherever they can occur, i.e., in sscanf(),
and in fscanf() and plain scanf() if there are no actual line-length
limits "underneath" the C library, as it were. A scanf engine
*could* handle this internally: if LDBL_MAX_EXP is, say, 10000, at
most a few more than 10000 decimal digits are required to hold a
number (and in fact even fewer are really necessary, if the
implementor wants to fiddle with mantissa and exponent in stringy
ways before calling strtod() internally).
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Dec 20 '06 #25
av
On 17 Dec 2006 04:13:59 GMT, Random832 wrote:
>2006-12-16 <em*********@news1.newsguy.com>,
Chris Torek wrote:
>In article <sl*******************@rlaptop.random.yi.org>
Random832 <ra*******@gmail.com> wrote:
>>>Anyway, I found a possible situation in which my scanf is
non-conformant:

Numerical strings are truncated to 512 characters; for example, %f
and %d are implicitly %512f and %512d.

So, if I send %f

1.000000000000000000000000000000000000000000000 0000000000000000
00000000000000000000000000000000000000000000000 0000000000000000
00000000000000000000000000000000000000000000000 0000000000000000
00000000000000000000000000000000000000000000000 0000000000000000
00000000000000000000000000000000000000000000000 0000000000000000
00000000000000000000000000000000000000000000000 0000000000000000
00000000000000000000000000000000000000000000000 0000000000000000
00000000000000000000000000000000000000000000000 0000000000000000
e1

it converts to 1 instead of 10. Does the standard allow this?

Yes:

Environmental limits

[#7] An implementation shall support text files with lines
containing at least 254 characters, including the
terminating new-line character. The value of the macro
BUFSIZ shall be at least 256.

And what about sscanf?

int main() {
char *x[515];
is it not better here char x[515];?
double n;
memset(x+2,'0',510);
here x[2..511]='0';
x[0] = '1'; x[1] = '.'; x[512] = 'e'; x[513] = '1'; x[514] = 0;
sscanf(x,"%lf",&n); printf("%f",x);
}

prints 1 or 10?
For me it has to print 10,
or sscanf has to report failure.
Merry Christmas
Dec 22 '06 #26
2006-12-22 <u5********************************@4ax.com>,
av wrote:
On 17 Dec 2006 04:13:59 GMT, Random832 wrote:
>>2006-12-16 <em*********@news1.newsguy.com>,
Chris Torek wrote:
>>In article <sl*******************@rlaptop.random.yi.org>
Random832 <ra*******@gmail.com> wrote:
Anyway, I found a possible situation in which my scanf is
non-conformant:

Numerical strings are truncated to 512 characters; for example, %f
and %d are implicitly %512f and %512d.

So, if I send %f

1.00000000000000000000000000000000000000000000 00000000000000000
0000000000000000000000000000000000000000000000 00000000000000000
0000000000000000000000000000000000000000000000 00000000000000000
0000000000000000000000000000000000000000000000 00000000000000000
0000000000000000000000000000000000000000000000 00000000000000000
0000000000000000000000000000000000000000000000 00000000000000000
0000000000000000000000000000000000000000000000 00000000000000000
0000000000000000000000000000000000000000000000 00000000000000000
e1

it converts to 1 instead of 10. Does the standard allow this?

Yes:

Environmental limits

[#7] An implementation shall support text files with lines
containing at least 254 characters, including the
terminating new-line character. The value of the macro
BUFSIZ shall be at least 256.

And what about sscanf?

int main() {
char *x[515];

is it not better here char x[515];?
Yes, sorry, that was a typo.

double n;
memset(x+2,'0',510);

here x[2..511]='0';
x[0] = '1'; x[1] = '.'; x[512] = 'e'; x[513] = '1'; x[514] = 0;
sscanf(x,"%lf",&n); printf("%f",x);
}

prints 1 or 10?

For me it has to print 10,
or sscanf has to report failure.
Right, I didn't check the return value of sscanf, but my feeling was that
"succeeding" by translating the first 512 bytes to a 1 and leaving the
e1 alone is not the right way to do it.

Incidentally, despite the documentation, my implementation actually does
produce 10. So it's a quality-of-implementation issue with the docs. I'm
still curious as to whether the behavior described is permitted
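
For anyone who wants to try this, here is a cleaned-up sketch of the test quoted above. It assumes the intent was an array of char and that the value to print is n rather than x. The helper name scan_long_float is made up for illustration; it returns sscanf's result so the caller can tell a failure from a short conversion.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "1." followed by 510 zeros and "e1",
 * scan it with %lf, and return what sscanf returned. */
int scan_long_float(double *out)
{
    char x[515];                    /* array of char, not of char * */

    assert(out != NULL);
    x[0] = '1';
    x[1] = '.';
    memset(x + 2, '0', 510);        /* fills x[2] through x[511] */
    x[512] = 'e';
    x[513] = '1';
    x[514] = '\0';
    return sscanf(x, "%lf", out);
}
```

A return of 1 with the value 10 means the whole number was consumed. A return of 1 with the value 1 means the input was cut short at the documented limit, and a return of 0 means the conversion was rejected.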
Dec 22 '06 #27
