Underscores in Python numbers

P: n/a
I tried finding a discussion around adding the possibility to have
optional underscores inside numbers in Python. This is a popular option
available in several "competing" scripting languages, that I would love
to see in Python.

Examples:
1_234_567
0xdead_beef
3.141_592

Would appreciate if someone could find a pointer to a previous
discussion on this topic, or add it to a Python-feature-wishlist.

- Gustav

Nov 8 '05 #1
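
Note for later readers: underscores in numeric literals were eventually accepted into the language (PEP 515, Python 3.6). A minimal sketch, assuming a 3.6 or newer interpreter, showing that the three examples from the question parse and that the underscores do not change the value. The variable names are made up for illustration.

```python
# Requires Python 3.6+ (PEP 515); older interpreters raise SyntaxError here.
million = 1_234_567
magic = 0xdead_beef
approx_pi = 3.141_592

# Underscores are purely visual: each literal equals its plain spelling.
checks = [
    million == 1234567,
    magic == 0xdeadbeef,
    approx_pi == 3.141592,
]
print(all(checks))
```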
96 Replies


P: n/a
Gustav Hållberg wrote:
I tried finding a discussion around adding the possibility to have
optional underscores inside numbers in Python. This is a popular option
available in several "competing" scripting languages, that I would love
to see in Python.

Examples:
1_234_567
0xdead_beef
3.141_592

Would appreciate if someone could find a pointer to a previous
discussion on this topic, or add it to a Python-feature-wishlist.

- Gustav


I'm not sure what the _s are for, but I'm guessing they serve as
separators ("." or "," depending on where you're from). The _s look
ugly to me; besides, underscores look more like spaces than
separators.

Nov 8 '05 #2

P: n/a
Gustav Hållberg wrote:
I tried finding a discussion around adding the possibility to have
optional underscores inside numbers in Python. This is a popular option
available in several "competing" scripting languages, that I would love
to see in Python.

Examples:
1_234_567
0xdead_beef
3.141_592

Would appreciate if someone could find a pointer to a previous
discussion on this topic, or add it to a Python-feature-wishlist.


Perhaps these threads, via Google?

http://groups.google.com/group/comp....ls+underscores
-Peter
Nov 8 '05 #3

P: n/a
On 7 Nov 2005 18:02:09 -0800 in comp.lang.python, "Gustav Hållberg"
<gu****@gmail.com> wrote:
I tried finding a discussion around adding the possibility to have
optional underscores inside numbers in Python. This is a popular option
available in several "competing" scripting languages, that I would love
to see in Python.

Examples:
1_234_567
0xdead_beef
3.141_592

Would appreciate if someone could find a pointer to a previous
discussion on this topic, or add it to a Python-feature-wishlist.


I've never needed them in Python, but I've very often wished for them
in C. Along with 0b(0|1)* for binary numbers, where they'd be even
more useful.

Of course, I write _far_ more code in C than Python. But I've seen
enough bugs of the sort where someone wrote 1200000 when they meant
12000000, that I see great value in being able to specify 12_000_000.

Regards,
-=Dave

--
Change is inevitable, progress is not.
Nov 8 '05 #4

P: n/a
Dave Hansen <id**@hotmail.com> wrote:
Of course, I write _far_ more code in C than Python. But I've seen
enough bugs of the sort where someone wrote 1200000 when they meant
12000000, that I see great value in being able to specify 12_000_000.


I'll admit that being able to write 12_000_000 would be convenient.
On the other hand, writing 12 * 1000 * 1000 is almost as clear. In C,
the multiplication would be done at compile time, so it's not even any
less efficient. I'm not sure how Python handles that, but if it
turned out to be a serious run-time performance issue, it's easy
enough to factor it out into something that's done once and stored.

Bottom line, embedded no-op underscores in numbers would be nice (and,
IMHO, should be added), but the lack of such a feature should not be
used as an excuse to write such unreadable monstrosities as 12000000
in source code.

Semi-related: see Jakob Nielsen's complaint about having to enter
credit card numbers as 16-digit strings with no breaks on web forms
(http://www.useit.com/alertbox/designmistakes.html, item #7, last
bullet point).
Nov 8 '05 #5
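
On the open question of how Python handles 12 * 1000 * 1000: one way to check on a given interpreter is to compile the expression and look at the constants stored in the code object. A minimal sketch; whether the product is folded at compile time is an implementation detail of the interpreter, so treat the printed result as an observation about your build, not a language guarantee.

```python
import dis

# Compile the expression without running it, then inspect the code object.
code = compile("12 * 1000 * 1000", "<example>", "eval")

# If the compiler folded the product, the result is stored as a constant.
is_folded = 12000000 in code.co_consts
print("folded at compile time:", is_folded)

# Show the bytecode for a closer look.
dis.dis(code)

# Either way, the expression evaluates to the same number.
value = eval(code)
```

If the result turns out not to be folded, the fallback suggested in the post still applies: compute the value once and store it in a module-level name.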

P: n/a
Sorry for the delayed response. I somehow missed this earlier.

On Tue, 8 Nov 2005 15:39:09 +0000 (UTC) in comp.lang.python,
ro*@panix.com (Roy Smith) wrote:
Dave Hansen <id**@hotmail.com> wrote:
Of course, I write _far_ more code in C than Python. But I've seen
enough bugs of the sort where someone wrote 1200000 when they meant
Digression: 1 was enough.
12000000, that I see great value in being able to specify 12_000_000.


I'll admit that being able to write 12_000_000 would be convenient.
On the other hand, writing 12 * 1000 * 1000 is almost as clear. In C,


Perhaps, but it's pretty obvious that something's wrong when you have
to resort to ugly tricks like this to make the value of a simple
integer constant "clear."

And think about 64-bit (or longer) unsigned long long hexadecimal
values. How much nicer is 0xFFF0_FF0F_F0FF_0FFF_ULL than
0xFFF0FF0FF0FF0FFFULL? I guess we could do something like
((((((0xFFF0ULL<<16)|0xFF0FULL)<<16)|0xF0FFULL)<<16)|0x0FFFULL), but I'm
not sure it's any better.

Regards,
-=Dave

--
Change is inevitable, progress is not.
Nov 22 '05 #6
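
The same comparison can be tried in Python, with two caveats: Python integers take no ULL suffix, and the grouped form needs Python 3.6 or newer. A small sketch of the three spellings of the 64-bit value discussed above; the variable names are illustrative only.

```python
# Requires Python 3.6+ for the underscore form.
grouped = 0xFFF0_FF0F_F0FF_0FFF
plain = 0xFFF0FF0FF0FF0FFF

# The shift-and-or construction from the post, written with balanced parentheses.
shifted = (((((0xFFF0 << 16) | 0xFF0F) << 16) | 0xF0FF) << 16) | 0x0FFF

print(grouped == plain == shifted)
```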

P: n/a
Personally, I would rather see the int() and float() functions be
smarter and accept what is already used for this, i.e.:

a = int("1,234,567")

Of course, also support the locale variant where the meaning of "," and
"." is swapped in most European countries.

Gustav Hållberg wrote:
I tried finding a discussion around adding the possibility to have
optional underscores inside numbers in Python. This is a popular option
available in several "competing" scripting languages, that I would love
to see in Python.

Examples:
1_234_567
0xdead_beef
3.141_592

Would appreciate if someone could find a pointer to a previous
discussion on this topic, or add it to a Python-feature-wishlist.

- Gustav


Nov 22 '05 #8
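
The built-in int() rejects "1,234,567" with a ValueError, so the behaviour asked for here has to live in a wrapper. A minimal sketch, assuming a hypothetical helper named parse_grouped_int (not part of any library) that strips a single, caller-chosen grouping character before handing the text to int(). It does not check that the groups are three digits wide.

```python
def parse_grouped_int(text, group_sep=","):
    """Hypothetical helper: int() that tolerates one grouping character."""
    digits = text.strip().replace(group_sep, "")
    # int() still validates what is left, so "1,2x3" raises ValueError.
    return int(digits)

a = parse_grouped_int("1,234,567")
print(a)
```

Passing the separator explicitly keeps the result independent of whatever locale the program happens to run under.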

P: n/a
On Fri, 18 Nov 2005 16:26:08 -0800, bo****@gmail.com wrote:
Personally, I would rather see the int() and float() function be
smarter to take what is used for this, i.e. :

a = int("1,234,567")


But the problem isn't just with conversion of strings. It is also
with literals.

n = 99999999999

Without counting, how many nines?

Obviously repeated digits is an extreme case, but even different digits
are easier to process if grouped. That's why we write phone numbers like
62 3 9621 2377 instead of 62396212377.

Here is a thought: Python already concatenates string literals:

"abc" "def" is the same as "abcdef".

Perhaps Python should concatenate numeric literals at compile time:

123 456 is the same as 123456.

Off the top of my head, I don't think this should break any older code,
because 123 456 is not currently legal in Python.
--
Steven.

Nov 22 '05 #10
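
Both claims in this post are easy to check without changing the language: adjacent string literals are joined by the compiler, while two adjacent numeric literals are a syntax error. A short sketch; is_valid_expression is a made-up helper name.

```python
# Adjacent string literals are joined by the compiler.
joined = "abc" "def"

def is_valid_expression(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True

print(joined == "abcdef")
print(is_valid_expression('"abc" "def"'))
print(is_valid_expression("123 456"))
```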

P: n/a
On Fri, 18 Nov 2005 16:26:08 -0800, bo****@gmail.com wrote:
Personally, I would rather see the int() and float() function be
smarter to take what is used for this, i.e. :

a = int("1,234,567")


But the problem isn't just with conversion of strings. It is also
with literals.

n = 99999999999

Without counting, how many nines?

Obviously repeated digits is an extreme case, but even different digits
are easier to process if grouped. That's why we write phone numbers like
62 3 9621 2377 instead of 62396212377.

Here is a thought: Python already concatenates string literals:

"abc" "def" is the same as "abcdef".

Perhaps Python should concatenate numeric literals at compile time:

123 456 is the same as 123456.

Off the top of my head, I don't think this should break any older code,
because 123 456 is not currently legal in Python.
--
Steven.

Nov 22 '05 #11

P: n/a

Steven D'Aprano wrote:
On Fri, 18 Nov 2005 16:26:08 -0800, bo****@gmail.com wrote:
Personally, I would rather see the int() and float() function be
smarter to take what is used for this, i.e. :

a = int("1,234,567")


But the problem isn't just with conversion of strings. It is also
with literals.

n = 99999999999

Without counting, how many nines?

For readability, I don't see why it cannot be written as :

n = int("99,999,999,999")

we already need to do this for Decimal("9.9")

Nov 22 '05 #12

P: n/a
on 19.11.2005 06:56 Steven D'Aprano said the following:
[snip]

Perhaps Python should concatenate numeric literals at compile time:

123 456 is the same as 123456.

Off the top of my head, I don't think this should break any older code,
because 123 456 is not currently legal in Python.


+1

but only allow (a single ?) space(s), otherwise readability issues ensue.

The other idea of teaching int() about separator characters has
internationalis/zation issues:
In many European countries, one would naturally try::

int('500.000,23')

instead of::

int('500,000.23')

--
stefan

Nov 22 '05 #14
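
One way around the ambiguity described here is to make the caller name both separators instead of guessing them from a locale. A minimal sketch, assuming a hypothetical parse_number helper; it returns a Decimal because the example values have a fractional part, which int() could not represent anyway.

```python
from decimal import Decimal

def parse_number(text, group_sep, decimal_sep):
    """Hypothetical helper: parse text using separators the caller names."""
    if group_sep == decimal_sep:
        raise ValueError("separators must differ")
    cleaned = text.strip().replace(group_sep, "")
    # Decimal expects "." as the decimal point.
    cleaned = cleaned.replace(decimal_sep, ".")
    return Decimal(cleaned)

european = parse_number("500.000,23", group_sep=".", decimal_sep=",")
american = parse_number("500,000.23", group_sep=",", decimal_sep=".")
print(european == american)
```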

P: n/a
Steven D'Aprano:
Perhaps Python should concatenate numeric literals at compile time:
123 456 is the same as 123456.


I think using the underscore is more explicit:
n = 123_456

Alternatively the underscore syntax may be used to separate the number
from its base:
22875 == 22875_10 == 595b_16 == 123456_7
But this is probably less commonly useful (and not very explicit).

Bye,
bearophile

Nov 22 '05 #16
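
The number_base notation above is only a proposal, but the arithmetic behind the example can be checked with the existing int(text, base) call. A short sketch:

```python
# int(text, base) accepts bases 2 through 36.
decimal_value = int("22875", 10)
hex_value = int("595b", 16)
base7_value = int("123456", 7)

print(decimal_value == hex_value == base7_value == 22875)
```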

P: n/a

Stefan Rank wrote:
The other idea of teaching int() about separator characters has
internationalis/zation issues:
In many European countries, one would naturally try::

int('500.000,23')

instead of::

int('500,000.23')


That is why I said

"Of course, also support the locale variant where the meaning of ","
and "." is swapped in most European countries."

We are seeing the same with bases 2, 8, 10 and 16.

Maybe:

int("E500.000,23")

just as we already use:

0xffff

for hex numbers.

Nov 22 '05 #18

P: n/a
On Sat, 19 Nov 2005 01:33:40 -0800, bearophileHUGS wrote:
Steven D'Aprano:
Perhaps Python should concatenate numeric literals at compile time:
123 456 is the same as 123456.


I think using the underscore is more explicit:
n = 123_456


It is also easy to make a typo:

n = 123-456
--
Steven.

Nov 22 '05 #20
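
The hazard is that both spellings are valid code, so the typo produces a wrong number silently, with no error. A small sketch, assuming Python 3.6+ for the underscore form:

```python
intended = 123_456   # Python 3.6+: the integer 123456
mistyped = 123-456   # also valid: a subtraction

print(intended, mistyped)
```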

P: n/a
bo****@gmail.com enlightened us with:
Of course, also support the locale variant where the meaning of ","
and "." is swapped in most European countries.


This is exactly why I wouldn't use that notation. What happens if it
is hardcoded into the source? I mean, that's what we're talking about.
Then the program would have to have an indication of which locale is
used for which source file. Without that, a program would be
interpreted in a different way on different computers. I think that
would be rather messy.

I'm in favour of using spaces or underscores.

Sybren
--
The problem with the world is stupidity. Not saying there should be a
capital punishment for stupidity, but why don't we just take the
safety labels off of everything and let the problem solve itself?
Frank Zappa
Nov 22 '05 #22

P: n/a

Sybren Stuvel wrote:
bo****@gmail.com enlightened us with:
Of course, also support the locale variant where the meaning of ","
and "." is swapped in most European countries.


This is exactly why I wouldn't use that notation. What happens if it
is hardcoded into the source? I mean, that's what we're talking about.
Then the program would have to have an indication of which locale is
used for which source file. Without that, a program would be
interpreted in a different way on different computers. I think that
would be rather messy.

As mentioned in another post, we have that situation in many other
places, such as:

mm/dd/yyyy vs dd/mm/yyyy
decimal("10.23") - would European people expect decimal("10,23") to
work?
0xffff - a notation for base 16

Why can't I have "E100.000,23" mean "100,000.23"? It is nothing but
notation.

Nov 22 '05 #24

P: n/a
be************@lycos.com wrote:
Alternatively the underscore syntax may be used to separate the number
from its base:
22875 == 22875_10 == 595b_16 == 123456_7
But this is probably less commonly useful (and not very explicit).


We already have a perfectly good syntax for entering octal and hex
integers, because those are commonly used in many applications. There are,
on occasion, need for other bases, but they are so rare, specialized, and
non-standard (RFC-1924, for example, uses an interesting flavor of base-85)
that having syntax built into the language to support them would be
completely unjustified.
Nov 22 '05 #26

P: n/a
bo****@gmail.com wrote:
Stefan Rank wrote:
The other idea of teaching int() about separator characters has
internationalis/zation issues:
In many European countries, one would naturally try::

int('500.000,23')

instead of::

int('500,000.23')

That is why I said

"Of course, also support the locale variant where the meaning of ","
and
"." is swapped in most European countries. "

We are seeing the same about base 2, 8, 10, 16.

May be :

int("E500.000,23")

as we are using :

0xffff

already for hex number

I really wouldn't want it to become possible to write Python code in one
locale that had to be edited before the numeric literals were valid in
another locale. That way madness lies.

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

Nov 22 '05 #28

P: n/a
Roy Smith>We already have a perfectly good syntax for entering octal
and hex integers,

There is this syntax:
1536 == int("600", 16)
that accepts strings only, up to a base of 36.
There are the hex() and oct() functions.
There is the %x and %o syntax, which isn't easy to remember.
There are the 0x600 and 0600 syntaxes that probably look good only from
the point of view of a C programmer.
I think some cleaning up, with a simpler and more consistent and
general way of converting bases, can be positive. But probably no one
shares this point of view, and compatibility with C syntax is probably
positive, so you are right. I am still learning the correct way of
thinking in python.

Bye,
bearophile

Nov 22 '05 #30
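
For comparison, here are the conversion routes listed above side by side. A short sketch, assuming Python 3, where the octal literal is spelled with a 0o prefix instead of the bare leading zero used in Python 2 at the time of this thread.

```python
n = int("600", 16)          # parse a string in a given base (2 to 36)

as_hex = hex(n)             # string with a "0x" prefix
as_oct = oct(n)             # string with a "0o" prefix on Python 3
fmt_hex = "%x" % n          # printf-style formatting, no prefix
fmt_oct = "%o" % n
lit_hex = 0x600             # hex literal
lit_oct = 0o3000            # Python 3 octal literal; Python 2 wrote 03000

print(n, as_hex, as_oct, fmt_hex, fmt_oct, lit_hex == lit_oct == n)
```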

P: n/a
be************@lycos.com wrote:
Steven D'Aprano:
Perhaps Python should concatenate numeric literals at compile time:
123 456 is the same as 123456.


I think using the underscore is more explicit:
n = 123_456

Alternatively the underscore syntax may be used to separate the number
from its base:
22875 == 22875_10 == 595b_16 == 123456_7
But this is probably less commonly useful (and not very explicit).


Umm... in other words, "the underscore is under-used so let's assign
some arbitrary meaning to it" (to make the language more like Perl
perhaps?).

Or maybe one should instead interpret this as "numeric literals need
more bells and whistles, and I don't care which of these two we add, but
we have to do *something*!". :-)

-Peter
Nov 22 '05 #32

P: n/a

Steve Holden wrote:
bo****@gmail.com wrote:
Stefan Rank wrote:
The other idea of teaching int() about separator characters has
internationalis/zation issues:
In many European countries, one would naturally try::

int('500.000,23')

instead of::

int('500,000.23')

That is why I said

"Of course, also support the locale variant where the meaning of ","
and
"." is swapped in most European countries. "

We are seeing the same about base 2, 8, 10, 16.

May be :

int("E500.000,23")

as we are using :

0xffff

already for hex number

I really wouldn't want it to become possible to write Python code in one
locale that had to be edited before the numeric literals were valid in
another locale. That way madness lies.

That is the fact, from the very beginning. 1.234, strictly speaking,
can have different meanings. So if you don't want it, don't support it and
always use the non-European notation.

Nov 22 '05 #34

P: n/a
bo****@gmail.com wrote:
Steve Holden wrote:

[...]
I really wouldn't want it to become possible to write Python code in one
locale that had to be edited before the numeric literals were valid in
another locale. That way madness lies.


That is the fact, from the very beginning. 1.234, strictly speaking,
can have different meanings. So if you don't want it, don't support it and
always use the non-European notation.


Being European myself I am well aware of the notational differences of
the different locales, and I am perfectly happy that users can enter
numbers in their preferred format when they execute a program.

However, I am not happy about the idea that a program source would need
to be edited before it would work after being moved to another locale.

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

Nov 22 '05 #36

P: n/a
On Sat, 19 Nov 2005 13:08:57 -0500, Peter Hansen wrote:
Umm... in other words, "the underscore is under-used so let's assign
some arbitrary meaning to it" (to make the language more like Perl
perhaps?).
+1

I *really* don't like the idea of allowing underscores in numeric
literals. Firstly, for aesthetic reasons: I think 123_456 is seriously
ugly. Secondly, for pragmatic reasons, I think it is too easy to mistype
as 123-456. (I know that Python can't protect you from typing 9-1 instead
of 901, but why add special syntax that makes that sort of error MORE
common?)
Or maybe one should instead interpret this as "numeric literals need
more bells and whistles, and I don't care which of these two we add, but
we have to do *something*!". :-)


-1

That's a tad unfair. Dealing with numeric literals with lots of digits is
a real (if not earth-shattering) human interface problem: it is hard for
people to parse long numeric strings. In the wider world outside of IT,
people deal with long strings of digits by grouping. This is *exceedingly*
common: mathematicians do it, economists do it, everybody who handles long
numeric literals does it *except* computer language designers.

Depending on personal preference and context, we use any of comma, period,
dash or space as a separator. Underscore is never used. Of these, the
comma clashes with tuples, the period opens a rather large can of worms
vis-a-vis internationalisation, and the dash clashes with the minus sign.
Allowing spaces to group digits is subtle but effective, doesn't clash
with other syntax, and is analogous to string concatenation.

I don't believe it is either practical or desirable for a computer
language to accept every conceivable digit separator in literals. If you
need full support for internationalised numbers, that should go into a
function. But the question of including a digit separator for numeric
literals does solve a real problem, it isn't just meaningless bells and
whistles.

Likewise, base conversion into arbitrary bases is not, in my opinion,
common enough a task that support for it needs to be built into the syntax
for literals. If somebody cares enough about it, write a module to handle
it and try to get it included with the Python standard modules.
--
Steven.

Nov 22 '05 #38
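
As a rough idea of what such a module could contain: int(text, base) already parses bases 2 to 36, so the missing half is rendering an integer in an arbitrary base. A minimal sketch under that assumption; to_base and DIGITS are hypothetical names, not an existing standard-library API.

```python
import string

DIGITS = string.digits + string.ascii_lowercase  # "0-9a-z", enough for base 36

def to_base(number, base):
    """Hypothetical helper: render a non-negative int in base 2..36."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 36")
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return "0"
    out = []
    while number:
        # Peel off the least significant digit each round.
        number, remainder = divmod(number, base)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))

print(to_base(22875, 7), to_base(22875, 16))
```

Round-tripping through int(to_base(n, b), b) is a cheap sanity check for any base in range.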

P: n/a

Steve Holden wrote:
Being European myself I am well aware of the notational differences of
the different locales, and I am perfectly happy that users can enter
numbers in their preferred format when they execute a program.

However, I am not happy about the idea that a program source would need
to be edited before it would work after being moved to another locale.

Huh?

Up to now, all I am talking about is making the three init
functions (int/float/decimal) smarter about converting strings to their
type. It doesn't affect your code in any way if you don't need it or don't
want to use it. It is more like a helper function for people who are so
concerned about the separators in big numbers. It introduces no new
syntax to the language at all. And if you use, say, the imaginary
format "E500.000,23", it still works no matter where your program is
running or what the hosting locale is. I don't understand what changes
you are referring to.

We are facing a similar issue today. A typical case is the MM/DD/YYYY date
format. Or maybe I need to import a text file (CSV, for example) which may
already contain numbers in this format.
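
To make the idea concrete, here is a minimal sketch of such a converter as a
standalone helper rather than a change to the built-in constructors. The
name parse_decimal and its two keyword parameters are hypothetical; the
point is only that the caller states the separators explicitly, so the
hosting locale never enters into it.

```python
from decimal import Decimal


def parse_decimal(text, thousands=".", decimal=","):
    """Convert a grouped numeric string to Decimal using explicit separators.

    Hypothetical helper: the separators come from the caller, so the result
    does not depend on the locale the program happens to run under.
    """
    # Drop the grouping character first, then normalise the decimal mark.
    cleaned = text.strip().replace(thousands, "")
    if decimal != ".":
        cleaned = cleaned.replace(decimal, ".")
    return Decimal(cleaned)


european = parse_decimal("500.000,23")
american = parse_decimal("500,000.23", thousands=",", decimal=".")
```

Both calls describe the same value, and the source file does not need
editing when it moves to a machine with a different locale.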

Nov 22 '05 #40

P: n/a
Steven D'Aprano wrote:
[...]
Likewise, base conversion into arbitrary bases is not, in my opinion,
common enough a task that support for it needs to be built into the syntax
for literals. If somebody cares enough about it, write a module to handle
it and try to get it included with the Python standard modules.

In fact Icon managed to offer a syntax that allowed every base up to 36
to be used: an "r" was used to indicate the radix of the literal, so hex
453FF would be represented as "16r453FF". This worked fine. Upper- and
lower-case letters were regarded as equivalent.
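
For anyone who wants the same notation in Python without new syntax, a
string-based sketch might look like this. The function name
parse_radix_literal is made up for illustration, and it leans on the
built-in int(), which handles bases 2 to 36 and ignores letter case.

```python
def parse_radix_literal(text):
    """Parse an Icon-style '<radix>r<digits>' string such as '16r453FF'."""
    # Lower-casing is safe because int() treats letter digits
    # case-insensitively. The first 'r' is always the radix marker,
    # since the radix part may contain decimal digits only.
    radix_part, marker, digits = text.lower().partition("r")
    if not marker or not radix_part.isdigit() or not digits:
        raise ValueError(f"not a radix literal: {text!r}")
    radix = int(radix_part)
    if not 2 <= radix <= 36:
        raise ValueError(f"radix out of range in {text!r}")
    # int() raises ValueError itself if a digit is invalid for the radix.
    return int(digits, radix)


value = parse_radix_literal("16r453FF")
```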

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

Nov 22 '05 #42

P: n/a

Steven D'Aprano wrote:
That's a tad unfair. Dealing with numeric literals with lots of digits is
a real (if not earth-shattering) human interface problem: it is hard for
people to parse long numeric strings. In the wider world outside of IT,
people deal with long numeric digits by grouping. This is *exceedingly*
common: mathematicians do it, economists do it, everybody who handles long
numeric literals does it *except* computer language designers.

However, what percentage of these big number literals appear in source
code? I believe most of them appear either in some data file (and thus
are nothing but strings) or during data input (again strings).
Why change the language when we just want a smarter string converter?

Nov 22 '05 #44

P: n/a
On Sun, 20 Nov 2005 01:39:04 +0000, Steve Holden wrote:
Steven D'Aprano wrote:
[...]
Likewise, base conversion into arbitrary bases is not, in my opinion,
common enough a task that support for it needs to be built into the syntax
for literals. If somebody cares enough about it, write a module to handle
it and try to get it included with the Python standard modules.

In fact Icon managed to offer a syntax that allowed every base up to 36
to be used: an "r" was used to indicate the radix of the literal, so hex
453FF would be represented as "16r453FF". This worked fine. Upper- and
lower-case letters were regarded as equivalent.


Forth goes significantly further than that: you can tell the Forth
interpreter what base you are using, and all numbers are then read and
displayed using that base. Numbers were case sensitive, which meant Forth
understood bases to at least 62. I don't remember whether it allows
non-alphanumeric digits, and therefore higher bases -- I think it does,
but am not sure.

Nevertheless, I don't believe that sort of functionality belongs in the
language itself. It is all well and good to be able to write 32r37gm, but
how often do you really need to write numbers in base 32?
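
If somebody does need it, case-sensitive digits up to base 62 fit in a few
lines of library code. This is only a sketch of the general idea, not of
how any Forth system works: the function name parse_in_base and the digit
ordering (0-9, then A-Z, then a-z) are assumptions of mine.

```python
import string

# Assumed digit ordering: 0-9, then A-Z, then a-z, giving 62 distinct digits.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def parse_in_base(text, base):
    """Parse text as a non-negative integer with case-sensitive digits."""
    if not 2 <= base <= len(ALPHABET):
        raise ValueError("base must be between 2 and 62")
    if not text:
        raise ValueError("empty string is not a number")
    value = 0
    for ch in text:
        digit = ALPHABET.find(ch)
        # find() returns -1 for unknown characters.
        if digit < 0 or digit >= base:
            raise ValueError(f"invalid digit {ch!r} for base {base}")
        value = value * base + digit
    return value
```

Under this ordering lower-case letters only become valid digits above base
36, so "A" and "a" are different digits in base 62.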

--
Steven.

Nov 22 '05 #46

P: n/a
Steven D'Aprano wrote:
Dealing with numeric literals with lots of digits is
a real (if not earth-shattering) human interface problem: it is hard for
people to parse long numeric strings.


I'm totally unconvinced that this _is_ a real problem, if we define
"real" as being even enough to jiggle my mouse, let alone shattering the
planet.

What examples does anyone have of where it is necessary to define a
large number of large numeric literals? Isn't it the case that other
than the odd constants in various programs, defining a large number of
such values would be better done by creating a data file and parsing it?

And if that's the case, one could easily define any convention one
desired for formatting the raw data.

And for the odd constant, either take a moment to verify the value, or
define it in parts (e.g. 24*60*60*1000*1000 microseconds per day), or
write a nice little variant on int() that can do exactly what you would
have done for the external data file if you had more values.
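
The "define it in parts" approach can be sketched like this. The constant
names are just illustrative, and the datetime line is an optional
cross-check, not something the approach requires.

```python
from datetime import timedelta

# Each factor is small enough to verify at a glance.
SECONDS_PER_DAY = 24 * 60 * 60
MICROSECONDS_PER_SECOND = 1000 * 1000
MICROSECONDS_PER_DAY = SECONDS_PER_DAY * MICROSECONDS_PER_SECOND

# Cross-check against the standard library's own arithmetic.
assert MICROSECONDS_PER_DAY == timedelta(days=1) // timedelta(microseconds=1)
```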

-Peter
Nov 22 '05 #48

P: n/a
Steven D'Aprano <st***@REMOVETHIScyber.com.au> wrote:
That's a tad unfair. Dealing with numeric literals with lots of digits is
a real (if not earth-shattering) human interface problem: it is hard for
people to parse long numeric strings.


There are plenty of ways to make numeric literals easier to read without
resorting to built-in language support. One way is:

sixTrillion = 6 * 1000 * 1000 * 1000 * 1000

Or, a more general solution might be to write a little factory function
which took a string, stripped out the underscores (or spaces, or commas, or
whatever bit of punctuation turned you on), and then converted the
remaining digit string to an integer. You could then write:

creditCardNumber = myInt ("1234 5678 9012 3456 789")

Perhaps not as convenient as having it built into the language, but
workable in those cases which justify the effort.
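
A minimal sketch of such a factory function, assuming spaces, underscores
and commas are the grouping characters to ignore (the default set and the
snake_case name my_int are my choices for the example):

```python
def my_int(text, separators=" _,"):
    """Strip grouping characters from text and convert the rest to an int."""
    # Map each separator to None so that translate() deletes it.
    cleaned = text.translate({ord(ch): None for ch in separators})
    return int(cleaned)


credit_card_number = my_int("1234 5678 9012 3456 789")
six_trillion = my_int("6,000,000,000,000")
```

Anything that is not a digit once the separators are gone still raises
ValueError from int(), so typos in the literal are not silently accepted.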
Nov 22 '05 #50
