
# Is this a bug in int()?

>>> int('0x', 16)
0

I'm working on a tokenizer and I'm thinking about returning a
MALFORMED_NUMBER token (1.2E, .5E+)
Dec 20 '07 #1
Ma************@gmail.com wrote under the subject line "Is this a
bug in int()?":
> >>> int('0x', 16)
> 0

I think it is a general problem in the tokenizer, not just the 'int'
constructor. The syntax for integers says:

hexinteger ::= "0" ("x" | "X") hexdigit+

but 0x appears to be accepted in source code as an integer.

If I were you, I'd try reporting it as a bug.
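
For anyone who needs the documented behavior regardless of what the interpreter accepts, the hexinteger production quoted above can be transcribed directly into a regular expression and checked before calling int(). This is only a sketch: `is_hexinteger` and `parse_hexinteger` are made-up helper names, not part of Python.

```python
import re

# Direct transcription of: hexinteger ::= "0" ("x" | "X") hexdigit+
HEXINTEGER = re.compile(r"0[xX][0-9a-fA-F]+")


def is_hexinteger(text):
    """Return True only if the whole string matches the hexinteger production."""
    return HEXINTEGER.fullmatch(text) is not None


def parse_hexinteger(text):
    """Convert a hex literal to int, rejecting anything the grammar disallows."""
    if not is_hexinteger(text):
        raise ValueError("malformed hex literal: %r" % (text,))
    return int(text, 16)


if __name__ == "__main__":
    for candidate in ("0x1F", "0XfF", "0x"):
        print(candidate, is_hexinteger(candidate))
```

With this check in front of int(), a bare "0x" is rejected because the `+` in the pattern demands at least one hex digit, which is exactly what the grammar says.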

> I'm working on a tokenizer and I'm thinking about returning a
> MALFORMED_NUMBER token (1.2E, .5E+)

Why would you return a token rather than throwing an exception?
Dec 20 '07 #2
Ma************@gmail.com wrote:
> >>> int('0x', 16)
> 0
>
> I'm working on a tokenizer and I'm thinking about returning a
> MALFORMED_NUMBER token (1.2E, .5E+)

Somewhat surprisingly, "0x" is a valid integer literal in Python:

>>> 0x
0

</F>

Dec 20 '07 #3

"Duncan Booth" <du**********@invalid.invalid> wrote in message
news:Xn*************************@127.0.0.1...
| Ma************@gmail.com wrote under the subject line "Is this a
| bug in int()?":
| > >>> int('0x', 16)
| > 0
| >
| I think it is a general problem in the tokenizer, not just the 'int'
| constructor. The syntax for integers says:
|
| hexinteger ::= "0" ("x" | "X") hexdigit+
|
| but 0x appears to be accepted in source code as an integer.
|
| If I were you, I'd try reporting it as a bug.

The mismatch between the documentation and the behavior certainly is a
bug. One of them should change.

Dec 21 '07 #4

Duncan Booth wrote:
> Why would you return a token rather than throwing an exception?

Tokenizers have lots of uses. Colorizing text in an editor, for
example. We've got a MALFORMED_NUMBER when you type '0x'. We've got an
INTEGER when we get your next keystroke (probably).
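
A rough sketch of that idea, for illustration only: a classifier that never raises and instead hands back a token kind the editor can colorize. The FLOAT kind, the function name `classify_number`, and the simplified number patterns are assumptions made for this example; they are not the poster's tokenizer and they do not cover Python's full numeric grammar.

```python
import re

# Token kinds. MALFORMED_NUMBER and INTEGER come from the post above;
# FLOAT is an assumed extra kind for this sketch.
INTEGER = "INTEGER"
FLOAT = "FLOAT"
MALFORMED_NUMBER = "MALFORMED_NUMBER"

# Simplified patterns (assumptions, not the complete Python grammar).
_HEX = re.compile(r"0[xX][0-9a-fA-F]+")
_DEC = re.compile(r"[0-9]+")
_FLOAT = re.compile(
    r"([0-9]+\.[0-9]*|\.[0-9]+)([eE][+-]?[0-9]+)?|[0-9]+[eE][+-]?[0-9]+"
)


def classify_number(text):
    """Return a token kind for text that looks like a number; never raise."""
    if _HEX.fullmatch(text) or _DEC.fullmatch(text):
        return INTEGER
    if _FLOAT.fullmatch(text):
        return FLOAT
    return MALFORMED_NUMBER


if __name__ == "__main__":
    # Simulate an editor re-tokenizing after each keystroke.
    for typed in ("0", "0x", "0x1", "1.2E", "1.2E3", ".5E+"):
        print(typed, classify_number(typed))
```

Because the classifier returns a value instead of raising, the editor can keep highlighting while the user is halfway through typing '0x1'.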
Dec 21 '07 #5
Tokenizer bug reported.

MartinRineh...@gmail.com wrote:
> >>> int('0x', 16)
> 0
>
> I'm working on a tokenizer and I'm thinking about returning a
> MALFORMED_NUMBER token (1.2E, .5E+)
Dec 21 '07 #6
The tokenizer accepts "0x" as zero. The spec says it's an error not to
have at least one hex digit after "0x".

This is a more serious bug than I had originally thought. Consider
this:

Joe types "security_code = 0x" and then goes off to the
Guardian-of-the-Codes to get the appropriate hex string. When Joe returns
to his computer, his boss grabs him and tells him that, effective
immediately, he's on the "rescue us from this crisis" team; his other
project can wait.

Some hours, days or weeks later Joe returns to the first project. At
this point Joe has a line of code that says "security_code = 0x". I
think Joe would be well served by a compiler error on that line. As it
is now, Joe's program assigns 0 to security_code and compiles without
complaint. I'm pretty sure any line of the form "name = 0x" was a
product of some form of programmer interruptus.
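
Until the tokenizer itself complains, a line like Joe's could be caught with a crude source scan. This is a sketch under loud assumptions: `find_bare_hex_prefixes` is a hypothetical helper, and the regular expression knows nothing about strings or comments, so it can report false positives there.

```python
import re

# "0x" or "0X" that is not followed by a hex digit and is not the tail of
# a longer name or number (so "max0x" and "10x" are ignored).
BARE_HEX_PREFIX = re.compile(r"(?<![0-9A-Za-z_])0[xX](?![0-9a-fA-F])")


def find_bare_hex_prefixes(source):
    """Return the 1-based numbers of lines containing a digitless hex prefix."""
    return [
        lineno
        for lineno, line in enumerate(source.splitlines(), start=1)
        if BARE_HEX_PREFIX.search(line)
    ]


if __name__ == "__main__":
    sample = "security_code = 0x\nmask = 0xFF\n"
    print(find_bare_hex_prefixes(sample))
```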

Dec 22 '07 #7
On Dec 22, 5:03 pm, MartinRineh...@gmail.com wrote:
> The tokenizer accepts "0x" as zero. The spec says it's an error not to
> have at least one hex digit after "0x".
>
> This is a more serious bug than I had originally thought. Consider
> this:
>
> Joe types "security_code = 0x" and then goes off to the
> Guardian-of-the-Codes to get the appropriate hex string. When Joe returns
> to his computer, his boss grabs him and tells him that, effective
> immediately, he's on the "rescue us from this crisis" team; his other
> project can wait.
>
> Some hours, days or weeks later Joe returns to the first project. At
> this point Joe has a line of code that says "security_code = 0x". I
> think Joe would be well served by a compiler error on that line. As it
> is now, Joe's program assigns 0 to security_code and compiles without
> complaint. I'm pretty sure any line of the form "name = 0x" was a
> product of some form of programmer interruptus.

:-) Are you a fiction writer by any chance? Nice story, but I somehow
doubt that the number of lines of the form "name = 0x" ever written in
Python is greater than a single digit (with zero the most likely one).

George
Dec 23 '07 #8
