integer overflow in scanf functions

vid512

Hi.

I wanted to know why the scanf functions don't check for overflow
when reading a number. For example, scanf("%d") on a 32-bit machine
treats "1" and "4294967297" as the same.

I tracked the code to where the conversion itself happens. The code in
the scanf functions just ignores the return value from the conversion
procedures.

More info in case of glibc posted here:
http://board.flatassembler.net/topic.php?t=6359

AFAIK, the standard doesn't define the behavior in case of overflow, so
glibc could treat this as an error and set errno=ERANGE.
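
A quick check of the arithmetic behind the example: 4294967297 is 2^32 + 1, so any conversion that keeps only the low 32 bits yields 1. The sketch below reproduces that reduction with a well-defined unsigned conversion. `wrap_to_32_bits` is a hypothetical helper for illustration only; it is not glibc's code, and it assumes the observed result comes from plain modulo-2^32 wraparound.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low 32 bits of a 64-bit value.
 * Converting to an unsigned type is well defined in C and reduces
 * the value modulo 2^32. */
uint32_t wrap_to_32_bits(uint64_t value)
{
    return (uint32_t)value;
}
```

Calling `wrap_to_32_bits(4294967297ULL)` gives 1, which matches the result described above.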

Dec 15 '06 #1


Walter Roberson

In article <11**********************@79g2000cws.googlegroups.com>,
vi****@gmail.com <vi****@gmail.com> wrote:

>i wanted to know why doesn't the scanf functions check for overflow
>when reading number. For example scanf("%d" on 32bit machine considers
>"1" and "4294967297" to be the same.

Because that's how it is spec'd.

"An input item is defined as the longest matching sequence of
characters, unless that exceeds a specified field width, in
which case it is the initial subsequence of that length in
the sequence." [...]

"Except in the case of a % specifier, the input item (or, in the
case of a %n directive, the count of input characters) is
converted to a type appropriate for the conversion specifier. [...]
Unless assignment suppression was indicated by a *, the result
of the conversion is placed in the object pointed to by the first
argument following the format argument that has not
already received a conversion result. If this object does not
have an appropriate type, or if the result of the conversion cannot
be represented in the space provided, the behaviour is undefined."

So there you have it: if you didn't put in a field width, then
the %d is *required* to pull in all the decimal digits there, and
if that's too big for an int, then the result is officially undefined.
This is how fscanf (and hence scanf) are -required- to work according
to the standard.
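
One way to stay inside defined behaviour is to supply the field width that the quoted text mentions. The following is a minimal sketch, assuming `int` is at least 32 bits wide so that any value of up to 9 decimal digits fits. `read_int_width9` is a hypothetical name, and note that the width also counts a leading sign character.

```c
#include <stdio.h>

/* Hypothetical helper: read an int from text using a field width of 9,
 * so the input item can never be too large for a 32-bit int.
 * On success, stores the value and the number of characters consumed
 * (reported by %n) and returns 1; returns 0 on a matching failure. */
int read_int_width9(const char *text, int *value, int *consumed)
{
    int n = 0;

    if (sscanf(text, "%9d%n", value, &n) != 1)
        return 0;

    *consumed = n;
    return 1;
}
```

With the input "4294967297" this stores 429496729 and reports 9 characters consumed, leaving the final "7" for whatever reads next.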
--
I was very young in those days, but I was also rather dim.
-- Christopher Priest

Dec 15 '06 #2

Random832

2006-12-15 <el**********@canopus.cc.umanitoba.ca>,
Walter Roberson wrote:

>In article <11**********************@79g2000cws.googlegroups.com>,
>vi****@gmail.com <vi****@gmail.com> wrote:
>
>>i wanted to know why doesn't the scanf functions check for overflow
>>when reading number. For example scanf("%d" on 32bit machine considers
>>"1" and "4294967297" to be the same.
>
>Because that's how it is spec'd.
>
>"An input item is defined as the longest matching sequence of
>characters,

And in what way is "429496729" a matching sequence of characters, if
there is no such integer value?

>unless that exceeds a specified field width, in
>which case it is the initial subsequence of that length in
>the sequence." [...]
>
>"Except in the case of a % specifier, the input item (or, in the
>case of a %n directive, the count of input characters) is
>converted to a type appropriate for the conversion specifier. [...]
>Unless assignment suppression was indicated by a *, the result
>of the conversion is placed in the object pointed to by the first
>argument following the format argument that has not
>already received a conversion result. If this object does not
>have an appropriate type, or if the result of the conversion cannot
>be represented in the space provided, the behaviour is undefined."

It's undefined. Which means there _are_ no requirements. An
implementation is free to treat it as 1, or as 429496729 with 7 still on
the stream, or as such with 7 _not_ still on the stream, or as
4294967295 (saturation), etc, etc

Anyway, I found a possible situation in which my scanf is
non-conformant:

Numerical strings are truncated to 512 characters; for example, %f
and %d are implicitly %512f and %512d.

So, if I send %f

1.0000000000000 000000000000000 000000000000000 000000000000000 000
000000000000000 000000000000000 000000000000000 000000000000000 000
000000000000000 000000000000000 000000000000000 000000000000000 000
000000000000000 000000000000000 000000000000000 000000000000000 000
000000000000000 000000000000000 000000000000000 000000000000000 000
000000000000000 000000000000000 000000000000000 000000000000000 000
000000000000000 000000000000000 000000000000000 000000000000000 000
000000000000000 000000000000000 000000000000000 000000000000000 000
e1

it converts to 1 instead of 10. Does the standard allow this?
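
For comparison, here is a sketch that builds the same kind of input in memory and converts it with strtod, which works on the whole string rather than through a scanf buffer. It only shows what value the text denotes; it does not settle what the standard permits for scanf. `parse_long_decimal` is a hypothetical helper name.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build "1." followed by zero_count zeros and "e1",
 * convert it with strtod, and return the result.
 * Returns 0.0 if the temporary buffer cannot be allocated. */
double parse_long_decimal(size_t zero_count)
{
    /* "1." (2) + zeros + "e1" (2) + terminating '\0' (1) */
    char *text = malloc(zero_count + 5);
    double result;

    if (text == NULL)
        return 0.0;

    text[0] = '1';
    text[1] = '.';
    memset(text + 2, '0', zero_count);
    memcpy(text + 2 + zero_count, "e1", 3);  /* copies the '\0' too */

    result = strtod(text, NULL);
    free(text);
    return result;
}
```

Calling it with a zero count of 600 produces a string well past 512 characters, and the exponent at the end is still taken into account.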

Dec 15 '06 #3

jacob navia

Walter Roberson wrote:

>In article <11**********************@79g2000cws.googlegroups.com>,
>vi****@gmail.com <vi****@gmail.com> wrote:
>
>>i wanted to know why doesn't the scanf functions check for overflow
>>when reading number. For example scanf("%d" on 32bit machine considers
>>"1" and "4294967297" to be the same.
>
>Because that's how it is spec'd.
>
>"An input item is defined as the longest matching sequence of
>characters, unless that exceeds a specified field width, in
>which case it is the initial subsequence of that length in
>the sequence." [...]
>
>"Except in the case of a % specifier, the input item (or, in the
>case of a %n directive, the count of input characters) is
>converted to a type appropriate for the conversion specifier. [...]
>Unless assignment suppression was indicated by a *, the result
>of the conversion is placed in the object pointed to by the first
>argument following the format argument that has not
>already received a conversion result. If this object does not
>have an appropriate type, or if the result of the conversion cannot
>be represented in the space provided, the behaviour is undefined."
>
>So there you have it: if you didn't put in a field width, then
>the %d is *required* to pull in all the decimal digits there, and
>if that's too big for an int, then the result is officially undefined.
>This is how fscanf (and hence scanf) are -required- to work according
>to the standard.

In general functions like scanf are unusable. They are so
problematic, that it is a wonder when they work at all.

Use strtol, or a similar function that will give reasonable
error returns...
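
A minimal sketch of the strtol approach, for a string that has already been read (for example with fgets). The function name `parse_int` and its return codes are assumptions made for this illustration; trailing characters after the number are simply ignored here.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert the start of text to an int.
 * Returns  0 on success and stores the value in *out,
 *         -1 if no digits could be converted,
 *         -2 if the value does not fit in an int. */
int parse_int(const char *text, int *out)
{
    char *end;
    long value;

    errno = 0;                       /* strtol only sets errno on error */
    value = strtol(text, &end, 10);

    if (end == text)
        return -1;                   /* nothing was converted */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return -2;                   /* out of range for long or for int */

    *out = (int)value;
    return 0;
}
```

On a machine with a 32-bit int, `parse_int("4294967297", &v)` returns -2 instead of quietly storing 1.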

Dec 15 '06 #4

Walter Roberson

In article <sl*******************@rlaptop.random.yi.org>,
Random832 <ra*******@gmail.com> wrote:

>2006-12-15 <el**********@canopus.cc.umanitoba.ca>,
>Walter Roberson wrote:
>>In article <11**********************@79g2000cws.googlegroups.com>,
>>vi****@gmail.com <vi****@gmail.com> wrote:

>>>i wanted to know why doesn't the scanf functions check for overflow

>>"An input item is defined as the longest matching sequence of
>>characters,

>And in what way is "429496729" a matching sequence of characters, if
>there is no such integer value?

The match is based upon the lexical grammar, and the lexical
grammar does not put limitations on the number or content of the
decimal digits.
--
Okay, buzzwords only. Two syllables, tops. -- Laurie Anderson

Dec 15 '06 #5

Random832

2006-12-15 <el**********@canopus.cc.umanitoba.ca>,
Walter Roberson wrote:

>In article <sl*******************@rlaptop.random.yi.org>,
>Random832 <ra*******@gmail.com> wrote:
>>2006-12-15 <el**********@canopus.cc.umanitoba.ca>,
>>Walter Roberson wrote:
>>>In article <11**********************@79g2000cws.googlegroups.com>,
>>>vi****@gmail.com <vi****@gmail.com> wrote:
>
>>>>i wanted to know why doesn't the scanf functions check for overflow
>
>>>"An input item is defined as the longest matching sequence of
>>>characters,
>
>>And in what way is "429496729" a matching sequence of characters, if
>>there is no such integer value?
>
>The match is based upon the lexical grammar, and the lexical
>grammar does not put limitations on the number or content of the
>decimal digits.

OK. The rest of my post stands. Undefined is undefined; the
implementation is not "required" to do anything in such a case.

Dec 15 '06 #6

vid512

So, we agree, it's undefined.

Wouldn't it be better to report this overflow as an error? The 10 digits
would be read off the file/stream/whatever, and the function would return
as if the number format was invalid, with errno=ERANGE.
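
The proposal could be sketched roughly as follows. This is not how any existing scanf works; `scan_int_checked` is a hypothetical helper that reads from a string cursor instead of a stream and handles only unsigned digit sequences, to keep the example short.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical helper: consume the longest run of decimal digits at
 * *cursor and advance *cursor past all of them.
 * Returns 1 and stores the value if it fits in an int.
 * Returns 0 on failure: either there were no digits (nothing consumed),
 * or the value overflowed, in which case errno is set to ERANGE. */
int scan_int_checked(const char **cursor, int *out)
{
    const char *p = *cursor;
    int value = 0;
    int overflow = 0;

    if (!isdigit((unsigned char)*p))
        return 0;                        /* matching failure */

    for (; isdigit((unsigned char)*p); ++p) {
        int digit = *p - '0';

        /* value * 10 + digit would exceed INT_MAX */
        if (value > (INT_MAX - digit) / 10)
            overflow = 1;
        else if (!overflow)
            value = value * 10 + digit;
    }

    *cursor = p;                         /* every digit is consumed */
    if (overflow) {
        errno = ERANGE;
        return 0;
    }
    *out = value;
    return 1;
}
```

The range check is done before each multiplication, so the accumulator itself never overflows.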

I don't think the current behavior is what people expect. And the scanf
functions already do a lot of "smart" stuff, just because people
expect such behavior.

Dec 15 '06 #7

Walter Roberson

In article <sl*******************@rlaptop.random.yi.org>,
Random832 <ra*******@gmail.com> wrote:

>>"Except in the case of a % specifier, the input item (or, in the
>>case of a %n directive, the count of input characters) is
>>converted to a type appropriate for the conversion specifier. [...]
>>Unless assignment suppression was indicated by a *, the result
>>of the conversion is placed in the object pointed to by the first
>>argument following the format argument that has not
>>already received a conversion result. If this object does not
>>have an appropriate type, or if the result of the conversion cannot
>>be represented in the space provided, the behaviour is undefined."

>It's undefined. Which means there _are_ no requirements. An
>implementation is free to treat it as 1, or as 429496729 with 7 still on
>the stream, or as such with 7 _not_ still on the stream, or as
>4294967295 (saturation), etc, etc

No, consumption of the maximum characters is -required-. It cannot
leave the other characters in the stream. The undefined part comes
in the valuation and storage of the overly-long result, not in
how many characters are consumed from input.
--
All is vanity. -- Ecclesiastes

Dec 15 '06 #8

Eric Sosman

Walter Roberson wrote:

>In article <sl*******************@rlaptop.random.yi.org>,
>Random832 <ra*******@gmail.com> wrote:
>
>>It's undefined. Which means there _are_ no requirements. An
>>implementation is free to treat it as 1, or as 429496729 with 7 still on
>>the stream, or as such with 7 _not_ still on the stream, or as
>>4294967295 (saturation), etc, etc
>
>No, consumption of the maximum characters is -required-. It cannot
>leave the other characters in the stream. The undefined part comes
>in the valuation and storage of the overly-long result, not in
>how many characters are consumed from input.

Once undefined behavior strikes, the program has no way
to tell how many characters were or were not consumed. All
requirements lose their force in the face of U.B.

--
Eric Sosman
es*****@acm-dot-org.invalid

Dec 15 '06 #9

Random832

2006-12-15 <el**********@canopus.cc.umanitoba.ca>,
Walter Roberson wrote:

>In article <sl*******************@rlaptop.random.yi.org>,
>Random832 <ra*******@gmail.com> wrote:
>
>>>"Except in the case of a % specifier, the input item (or, in the
>>>case of a %n directive, the count of input characters) is
>>>converted to a type appropriate for the conversion specifier. [...]
>>>Unless assignment suppression was indicated by a *, the result
>>>of the conversion is placed in the object pointed to by the first
>>>argument following the format argument that has not
>>>already received a conversion result. If this object does not
>>>have an appropriate type, or if the result of the conversion cannot
>>>be represented in the space provided, the behaviour is undefined."
>
>>It's undefined. Which means there _are_ no requirements. An
>>implementation is free to treat it as 1, or as 429496729 with 7 still on
>>the stream, or as such with 7 _not_ still on the stream, or as
>>4294967295 (saturation), etc, etc
>
>No, consumption of the maximum characters is -required-. It cannot
>leave the other characters in the stream. The undefined part comes
>in the valuation and storage of the overly-long result, not in
>how many characters are consumed from input.

No, I don't think you get it.

In an undefined situation, the standard forbids nothing.

Meaning the implementation gets to do whatever the f*** it wants to,
regarding anything, once anything has happened that has been undefined.

Dec 16 '06 #10
