Bytes IT Community

int/long unification hides bugs

there seems to be a serious problem with allowing numbers to grow in a
nearly unbounded manner, as int/long unification does: it hides bugs.
most of the time, i expect my numbers to be small. 2**31 is good
enough for most uses of variables, and when more is needed, 2**63
should do most of the time.

granted, unification allows code to work for larger numbers than
foreseen (as PEP 237 states) but i feel the potential for more
undetected bugs outweighs this benefit.

the other benefit of the unification - portability - can be achieved
by defining int32 & int64 types (or by defining all integers to be
32-bit (or 64-bit))

PEP 237 says, "It will give new Python programmers [...] one less
thing to learn [...]". i feel this is not so important as the quality
of code a programmer writes once he does learn the language.

-kartik
Jul 18 '05 #1
83 Replies


kartik wrote:
there seems to be a serious problem with allowing numbers to grow in a
nearly unbounded manner, as int/long unification does: it hides bugs.
[snip]
PEP 237 says, "It will give new Python programmers [...] one less
thing to learn [...]". i feel this is not so important as the quality
of code a programmer writes once he does learn the language.


Do you feel strongly enough about the quality of your code to write
automated tests for it? Or are you just hoping that one tiny class
of potential bugs will be caught for you by this feature of the
language?

I'm not sure what you're asking, because even the exposure of
latent bugs which you are describing can happen only when you
*run* the code. Are you planning to have your users report
that there are bugs when the program crashes in a code path
which you didn't get around to testing manually?

-Peter
Jul 18 '05 #2

kartik wrote:
there seems to be a serious problem with allowing numbers to grow in a
nearly unbounded manner, as int/long unification does: it hides bugs.
No it does not.

Just because a runaway program stops sooner by hitting the
integer limit does not mean that having this limit
is a validation method.
i feel this is not so important as the quality
of code a programmer writes


A code that relies on hitting the integer limit
is anything but high quality.

If you are worried about some numbers growing too much, then
check them yourself, you'll get much better results that way.

Istvan.
Jul 18 '05 #3

kartik wrote:
there seems to be a serious problem with allowing numbers to grow in a
nearly unbounded manner, as int/long unification does: it hides bugs.
most of the time, i expect my numbers to be small.
The question is how small is small? Less than 2**7? Less than 2**15?
Less than 2**31? Less than 2**63? And what's the significance of powers
of two? And what happens if you move from a 32 bit machine to a 64 bit
one? (or a 1024 bit one in a hundred years time?)
PEP 237 says, "It will give new Python programmers [...] one less
thing to learn [...]". i feel this is not so important as the quality
of code a programmer writes once he does learn the language.


The thing is, the int/long cutoff is arbitrary, determined solely by
implementation detail. A much better idea is the judicious use of assertions.

assert x < 15000

Not only does it protect you from runaway numbers, it also documents
what the expected range is, resulting in a much better "quality of code"
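A minimal sketch of the assertion approach (the `set_age` helper and the 0-150 range are invented here for illustration, not part of the original post):

```python
def set_age(person, age):
    # Explicit bounds check: documents the expected range and fails
    # loudly, instead of relying on a machine word size to overflow.
    assert 0 <= age < 150, "age out of expected range: %r" % age
    person["age"] = age

record = {}
set_age(record, 42)      # fine
# set_age(record, 500)   # would raise AssertionError
```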
Jul 18 '05 #4

kartik <ka*************@yahoo.com> wrote:
there seems to be a serious problem with allowing numbers to grow in a
nearly unbounded manner, as int/long unification does: it hides bugs.
So does allowing strings to be any length.
most of the time, i expect my numbers to be small. 2**31 is good
enough for most uses of variables, and when more is needed, 2**63
should do most of the time.
Most of the time, I expect my strings to be short. 1000 characters is
good enough for most uses of strings, and when more is needed, a million
should do most of the time.
granted, unification allows code to work for larger numbers than
foreseen (as PEP 237 states) but i feel the potential for more
undetected bugs outweighs this benefit.


Granted, unlimited string length allows code to work for longer strings
than foreseen (as common sense states) but (if you're consistent) you
feel the potential for more undetected bugs outweighs this benefit.
By this parallel, I intend to communicate that (and part of why) I
consider your observations to be totally without merit.
Alex
Jul 18 '05 #5

In article <94**************************@posting.google.com>,
ka*************@yahoo.com (kartik) wrote:
there seems to be a serious problem with allowing numbers to grow in a
nearly unbounded manner, as int/long unification does: it hides bugs.


Can you exhibit any non-trivial examples of the types of bugs you are
talking about?

-M

--
Michael J. Fromberger | Lecturer, Dept. of Computer Science
http://www.dartmouth.edu/~sting/ | Dartmouth College, Hanover, NH, USA
Jul 18 '05 #6

Here's a bug that passes silently because ints are not limited in range
from 1 to 100:
...

OK, just joking. I couldn't think of one.

Jeff


Jul 18 '05 #7

On Mon, 2004-10-25 at 21:48 -0500, Jeff Epler wrote:
Here's a bug that passes silently because ints are not limited in range
from 1 to 100:
...

OK, just joking. I couldn't think of one.


Here's one:

# count how many ferrets I have
ferrets = 0
while 1:
    try:
        ferrets += 1
    except:
        break
print ferrets

As you can clearly see, the answer should have been 3, but due to Python
silently allowing numbers larger than 3 the program gets stuck in an
apparently interminable loop, requiring me to reboot Microsoft Bob.

--
Cliff Wells <cl************@comcast.net>

Jul 18 '05 #8

Peter Hansen <pe***@engcorp.com> wrote in message news:<_a********************@powergate.ca>...
Do you feel strongly enough about the quality of your code to write
automated tests for it? Or are you just hoping that one tiny class
of potential bugs will be caught for you by this feature of the
language?
1)catching overflow bugs in the language itself frees u from writing
overflow tests.
2)no test (or test suite) can catch all errors, so language support 4
error detection is welcome.
3)overflow detection helps when u dont have automated tests 4 a
particular part of your program.
I'm not sure what you're asking, because even the exposure of
latent bugs which you are describing can happen only when you
*run* the code.


Agreed. i'm saying that without int/long unification, the bugs will b
found sooner & closer to where they occur, rather than propagating
throughout the program's objects & being found far away from the
source, if at all.

-kartik
Jul 18 '05 #9

Istvan Albert <ia*****@mailblocks.com> wrote in message news:<Ho********************@giganews.com>...
kartik wrote:
there seems to be a serious problem with allowing numbers to grow in a
nearly unbounded manner, as int/long unification does: it hides bugs.
No it does not.

Just because a runaway program stops sooner by hitting the
integer limit does not mean that having this limit
is a validation method.


i didn't say it is. all i say is that it catches bugs - & that's
valuable.

> i feel this is not so important as the quality
> of code a programmer writes


A code that relies on hitting the integer limit
is anything but high quality.


once again, i'm not relying on the integer limit to catch bugs, but
i'd much rather have bugs exposed by an overflow exception than by end
users complaining about wrong data values.

If you are worried about some numbers growing too much, then
check them yourself, you'll get much better results that way.


maybe, why not use an automated test built-in 2 the language? i get it
4 free.

-kartik
Jul 18 '05 #10

On Mon, 2004-10-25 at 20:37 -0700, kartik wrote:
Istvan Albert <ia*****@mailblocks.com> wrote in message news:<Ho********************@giganews.com>...
kartik wrote:
there seems to be a serious problem with allowing numbers to grow in a
nearly unbounded manner, as int/long unification does: it hides bugs.
No it does not.

Just because a runaway program stops sooner by hitting the
integer limit does not mean that having this limit
is a validation method.


i didn't say it is. all i say is that it catches bugs - & that's
valuable.


You did say it is. And then you said it again right there.
once again, i'm not relying on the integer limit to catch bugs, but
i'd much rather have bugs exposed by an overflow exception


Again that is using the integer limit to catch bugs. Repeated self-
contradiction does little to bolster your argument.
If you are worried about some numbers growing too much, then
check them yourself, you'll get much better results that way.


maybe, why not use an automated test built-in 2 the language? i get it
4 free.


Because, strangely enough, most people want limitations *removed* from
the language, not added to it. If you are looking for a language with
arbitrary limits then I think Python isn't quite right for you.

--
Cliff Wells <cl************@comcast.net>

Jul 18 '05 #11

> The question is how small is small? Less than 2**7? Less than 2**15?
Less than 2**31? Less than 2**63? And what's the significance of powers
of two? And what happens if you move from a 32 bit machine to a 64 bit
one? (or a 1024 bit one in a hundred years time?)
less than 2**31 most of the time & hardly ever greater than 2**63 - no
matter if my machine is 32-bit, 64-bit or 1024-bit. the required range
depends on the data u want 2 store in the variable & not on the
hardware.

> PEP 237 says, "It will give new Python programmers [...] one less
> thing to learn [...]". i feel this is not so important as the quality
> of code a programmer writes once he does learn the language.


The thing is, the int/long cutoff is arbitrary, determined solely by
implementation detail.


agreed, but it need not be that way. ints can be defined to be 32-bit
(or 64-bit) on all architectures.

A much better idea is the judicious use of assertions.

assert x < 15000

Not only does it protect you from runaway numbers, it also documents
what the expected range is, resulting in a much better "quality of code"


such an assertion must be placed before avery assignment to the
variable - & that's tedious. moreover, it can give u a false sense of
security when u think u have it wherever needed but u've forgotten it
somewhere.

a 32-bit limit is a crude kind of assertion that u get for free, and
one u expect should hold for most variables. for those few variables
it doesn't, u can use a long.

-kartik
Jul 18 '05 #12


"kartik" <ka*************@yahoo.com> wrote in message
news:94**************************@posting.google.com...
1)catching overflow bugs in the language itself frees u from writing
overflow tests.


It is a fundamental characteristic of counts and integers that adding 1 is
always valid. Given that, raising an overflow exception is itself a bug,
one that Python had and has now eliminated.

If one wishes to work with residue classes mod n, +1 is also still always
valid. It is just that (n-1) + 1 is 0 instead of n. So again, raising an
overflow error is a bug.

A number system that prohibits +1 for some fixed number n models, for
instance, packing items into a container. However, the limit n could be
anything, so fixing it at, say, 2**31 - 1 is almost always useless.

The use of fixed range ints is a space-time machine performance hack that
has been costly in human thought time.
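Terry's point about residue classes can be checked directly (a sketch; the modulus is arbitrary):

```python
n = 2**31                      # an arbitrary modulus; any n behaves the same
# In arithmetic mod n, adding 1 is always valid: (n-1) + 1 wraps to 0
# rather than raising an overflow error.
assert ((n - 1) + 1) % n == 0

# Python's unified ints likewise never overflow on +1:
big = 2**63 - 1
assert big + 1 == 2**63
```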

Terry J. Reedy

Jul 18 '05 #13

kartik wrote:
Peter Hansen <pe***@engcorp.com> wrote in message news:<_a********************@powergate.ca>...

Do you feel strongly enough about the quality of your code to write
automated tests for it? Or are you just hoping that one tiny class
of potential bugs will be caught for you by this feature of the
language?

1)catching overflow bugs in the language itself frees u from writing
overflow tests.


That seems to me to be a bit like saying you don't need to do any
engineering calculations for your bridge because you'll find out if it's
not strong enough when it falls down.
2)no test (or test suite) can catch all errors, so language support 4
error detection is welcome.
Yes, but you appear to feel that an arbitrary limit on the size of
integers will be helpful, while I feel it's much better to assert that
they are in bounds as necessary. Relying on hardware overflows as error
detection is pretty poor, really.
3)overflow detection helps when u dont have automated tests 4 a
particular part of your program.
But writing such tests would help much more.
I'm not sure what you're asking, because even the exposure of
latent bugs which you are describing can happen only when you
*run* the code.

Agreed. i'm saying that without int/long unification, the bugs will b
found sooner & closer to where they occur, rather than propagating
throughout the program's objects & being found far away from the
source, if at all.

Even if we assume that this specious argument is valid, what consolation
would you offer the people who actually did find that huge integers were
helpful and that their programs no longer ran after such a change?

regards
Steve
--
http://www.holdenweb.com
http://pydish.holdenweb.com
Holden Web LLC +1 800 494 3119
Jul 18 '05 #14

On 2004-10-26, kartik <ka*************@yahoo.com> wrote:
such an assertion must be placed before avery assignment to the
variable - & that's tedious. moreover, it can give u a false sense of
security when u think u have it wherever needed but u've forgotten it
somewhere.


Aargh... The word is "you".

We now return you to an argument over I'm not sure what...

--
Grant Edwards grante Yow! I know how to get the
at hostesses released! Give
visi.com them their own television
series!
Jul 18 '05 #15

kartik wrote:
Istvan Albert <ia*****@mailblocks.com> wrote in message news:<Ho********************@giganews.com>...
kartik wrote:
[yada yada]
maybe, why not use an automated test built-in 2 the language? i get it
4 free.

-kartik

Perhaps you'd like Intel to produce a CPU where the overflow limit can
be arbitrarily set?

I'm getting a bit 6 of this nonsense. Maybe I 8 something that didn't
agree with me. I of10 do that. You've 1. 4give me.

regards
S3ve
--
http://www.holdenweb.com
http://pydish.holdenweb.com
Holden Web LLC +1 800 494 3119
Jul 18 '05 #16

Cliff Wells wrote:
On Mon, 2004-10-25 at 21:48 -0500, Jeff Epler wrote:
Here's a bug that passes silently because ints are not limited in range
from 1 to 100:
...

OK, just joking. I couldn't think of one.

Here's one:

# count how many ferrets I have
ferrets = 0
while 1:
    try:
        ferrets += 1
    except:
        break
print ferrets

As you can clearly see, the answer should have been 3, but due to Python
silently allowing numbers larger than 3 the program gets stuck in an
apparently interminable loop, requiring me to reboot Microsoft Bob.

Come on, the answer should clearly have been seven. Don't try your trick
with me, buddy. I know octal is the only true number system.

And heaven knows how all those different characters got encoded in
three-bit bytes. That's got to be tricky.

7-I-can-see-that-ly y'rs - steve
--
http://www.holdenweb.com
http://pydish.holdenweb.com
Holden Web LLC +1 800 494 3119
Jul 18 '05 #17

al*****@yahoo.com (Alex Martelli) wrote in message news:<1gm8jkc.7gtvc31avo0xhN%al*****@yahoo.com>...
kartik <ka*************@yahoo.com> wrote:
there seems to be a serious problem with allowing numbers to grow in a
nearly unbounded manner, as int/long unification does: it hides bugs.


So does allowing strings to be any length.
most of the time, i expect my numbers to be small. 2**31 is good
enough for most uses of variables, and when more is needed, 2**63
should do most of the time.


Most of the time, I expect my strings to be short. 1000 characters is
good enough for most uses of strings, and when more is needed, a million
should do most of the time.
granted, unification allows code to work for larger numbers than
foreseen (as PEP 237 states) but i feel the potential for more
undetected bugs outweighs this benefit.


Granted, unlimited string length allows code to work for longer strings
than foreseen (as common sense states) but (if you're consistent) you
feel the potential for more undetected bugs outweighs this benefit.
By this parallel, I intend to communicate that (and part of why) I
consider your observations to be totally without merit.


integers are used in different ways from strings. i may expect file
paths to be around 100 characters, and if i get a 500-character path,
i have no problem just because of the length. but if a person's age is
500 where i expect it to be less than 100, then **definitely**
something's wrong.

as another example, using too long a string as an index into a
dictionary is not a problem (true, the dictionary may not have a
mapping, but i have the same issue with a short string). but too long
an index into a list rewards me with an exception.

as i look at my code, i rarely have an issue with string sizes, but if
an integer variable gets very large (say > 2**31 or 2**63), it
generally reflects a bug in my code.

i suggest u base your comments on real code, rather than reasoning in
an abstract manner from your ivory tower.

-kartik
Jul 18 '05 #18

On 2004-10-26, Steve Holden <st***@holdenweb.com> wrote:
Come on, the answer should clearly have been seven. Don't try your trick
with ne, buddy. I know octal is the only true number system.


Ah, but split octal or, uh, unsplit octal?

377 377 vs. 177777

That is the real question.

--
Grant Edwards grante Yow! I'm sitting on my
at SPEED QUEEN... To me,
visi.com it's ENJOYABLE... I'm
WARM... I'm VIBRATORY...
Jul 18 '05 #19

On Tue, 2004-10-26 at 00:12 -0400, Steve Holden wrote:
Come on, the answer should clearly have been seven. Don't try your trick
with ne, buddy. I know octal is the only true number system.


Microsoft Bob assures me that octal isn't a word, but suggested
'octopus' instead. My program was about ferrets, so I think you are
mistaken.

Octopus hugs,
Cliff

--
Cliff Wells <cl************@comcast.net>

Jul 18 '05 #20

On Mon, 2004-10-25 at 21:05 -0700, kartik wrote:
The question is how small is small? Less than 2**7? Less than 2**15?
Less than 2**31? Less than 2**63? And what's the significance of powers
of two? And what happens if you move from a 32 bit machine to a 64 bit
one? (or a 1024 bit one in a hundred years time?)


less than 2**31 most of the time & hardly ever greater than 2**63 - no
matter if my machine is 32-bit, 64-bit or 1024-bit. the required range
depends on the data u want 2 store in the variable & not on the
hardware.


I'm going to rewrite that last line in English so that perhaps you'll
catch on to what you are saying:

'''
the required range depends on the data you want to store in the variable
and not on the hardware.
'''

The pivotal word here is "you". The data *you* want to store. One more
time YOU. I'm not correcting your mangling of English at this point,
rather I'm pointing out that it's *you*, not Python, that knows what
sort of data *you* want to store. If *you* want to limit your integers
to some arbitrary amount then *you* are going to have to write code to
do that. What *you* need for *your* application isn't necessarily what
anyone else needs for theirs.

If you want such a domain-specific language then you should take a look
at http://kartik.sourceforge.net as I think that's the only language
that's going to suit you.

--
Cliff Wells <cl************@comcast.net>

Jul 18 '05 #21

On 25 Oct 2004 21:25:07 -0700, kartik <ka*************@yahoo.com> wrote:

integers are used in different ways from strings. i may expect file
paths to be around 100 characters, and if i get a 500-character path,
i have no problem just because of the length. but if a person's age is
500 where i expect it to be less than 100, then **definitely**
something's wrong.


What if Noah is using your program?

--
Sam Holden
Jul 18 '05 #22

On Mon, 2004-10-25 at 21:25 -0700, kartik wrote:
integers are used in different ways from strings. i may expect file
paths to be around 100 characters, and if i get a 500-character path,
i have no problem just because of the length. but if a person's age is
500 where i expect it to be less than 100, then **definitely**
something's wrong.
So Python should raise an exception if it notices that your variable is
named "age" and you've put a number greater than 100 in it? I'm amazed
there isn't already a PEP on this. Have you checked to make sure?
as another example, using too long a string as an index into a
dictionary is not a problem (true, the dictionary may not have a
mapping, but i have the same issue with a short string). but too long
an index into a list rewards me with an exception.
I'd rather be rewarded with a cookie.
as i look at my code, i rarely have an issue with string sizes, but if
an integer variable gets very large (say > 2**31 or 2**63), it
generally reflects a bug in my code.
Well, of course. After all, there are only 2**63 numbers in the
universe, so any number higher than that is clearly an error. In fact,
given that so many people use numbers on a regular basis, I suspect
there are actually far fewer numbers left now than there were
originally.
i suggest u base your comments on real code, rather than reasoning in
an abstract manner from your ivory tower.


I can see why you would want to avoid that.

--
Cliff Wells <cl************@comcast.net>

Jul 18 '05 #23

On Tue, 2004-10-26 at 05:10 +0000, Sam Holden wrote:
On 25 Oct 2004 21:25:07 -0700, kartik <ka*************@yahoo.com> wrote:

integers are used in different ways from strings. i may expect file
paths to be around 100 characters, and if i get a 500-character path,
i have no problem just because of the length. but if a person's age is
500 where i expect it to be less than 100, then **definitely**
something's wrong.


What if Noah is using your program?


Then Python clearly needs a special mode so that when a person's age is
entered large numbers are allowed, but when counting animals it throws
an exception if the number is greater than two.

Duh.

--
Cliff Wells <cl************@comcast.net>

Jul 18 '05 #24


ka*************@yahoo.com (kartik) wrote:
integers are used in different ways from strings. i may expect file
paths to be around 100 characters, and if i get a 500-character path,
i have no problem just because of the length. but if a person's age is
500 where i expect it to be less than 100, then **definitely**
something's wrong.
So *you* need to do bounds checking. Last time I checked, Python didn't
raise an overflow error on getting to 500, so you're going to need to do
the check anyways.

as i look at my code, i rarely have an issue with string sizes, but if
an integer variable gets very large (say > 2**31 or 2**63), it
generally reflects a bug in my code.
And you should check that, and not rely on a misfeature and a mistake.
It is going in whether you like it or not, and has been planned since
Python 2.3.

i suggest u base your comments on real code, rather than reasoning in
an abstract manner from your ivory tower.


Ahh, real code. Ok *rummages through some code*. Yeah, so I checked,
and I don't rely on overflow errors. I use explicit bounds checking in
all cases where values outside my expected range can come. Then again,
I prefer to check my variables when I rely on them.
- Josiah

Jul 18 '05 #25


ka*************@yahoo.com (kartik) wrote:

Istvan Albert <ia*****@mailblocks.com> wrote in message news:<Ho********************@giganews.com>...

If you are worried about some numbers growing too much, then
check them yourself, you'll get much better results that way.


maybe, why not use an automated test built-in 2 the language? i get it
4 free.


Overflow errors are not automated tests. Overflow errors were a
(mis)feature of Python (non-long) integers. Had the unification occurred
prior to your using of Python, this conversation wouldn't be happening.

- Josiah

Jul 18 '05 #26

kartik wrote:
i suggest u base your comments on real code, rather than reasoning in
an abstract manner from your ivory tower.


My experience, based on 10+ years of making a living as a
professional programmer, says that you are wrong and the
comments made by others have been spot on.

Real code? Here's one used for generating the canonical
SMILES representation of a chemical compound. It comes
from the FROWNS package.

try:
    val = 1
    for offset, bondtype in offsets[index]:
        val *= symclasses[offset] * bondtype
except OverflowError:
    # Hmm, how often does this occur?
    val = 1L
    for offset, bondtype in offsets[index]:
        val *= symclasses[offset] * bondtype
The algorithm uses the fundamental theorem of arithmetic
as part of computing a unique characteristic value for
every atom in the molecule, up to symmetry.

It's an iterative algorithm, and the new value for
a given atom is the product of the old values of its
neighbor atoms in the graph:

V'(atom1) = V(atom1.neighbor[0]) * V(atom1.neighbor[1]) * ...

In very rare cases this can overflow 32 bits. Rare
enough that it's faster to do everything using 32 bit
numbers and just redo the full calculation if there's
an overflow.

Because Python now no longer gives this overflow error,
we have the advantage of both performance and simplified
code.
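With unified ints the fallback branch disappears entirely; a sketch of the simplified form (the function name and toy inputs are invented, with `offsets`, `symclasses`, and `index` shaped as in the snippet above):

```python
def atom_value(offsets, symclasses, index):
    # One pass, no OverflowError fallback: Python ints grow as needed.
    val = 1
    for offset, bondtype in offsets[index]:
        val *= symclasses[offset] * bondtype
    return val

# Toy inputs: one atom with two (neighbor, bondtype) pairs.
print(atom_value([[(0, 1), (1, 2)]], [3, 5], 0))
```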

Relatively speaking, 2**31 is tiny. My little laptop
can count that high in Python in about 7 minutes, and
my hard drive has about 2**35 bits of space. I deal
with single files bigger than 2**32 bits.

Why then should I have to put in all sorts of workarounds
into *my* code because *you* don't know how to write
good code, useful test cases, and appropriate internal
sanity checks?

Your examples, btw, are hypothetical. Having an
OverflowException at 2**31 doesn't fix your 500 year
old person problem and you'll have an IndexError / KeyError
well before you reach that limit, assuming your string /
dictionary doesn't have that much data.

Why not give some example of real code that shows
1) that giving an OverflowError is the right behaviour
(excluding talking to hardware or other system that
requires a fixed size number), 2) that there is a
fixed number N that is always appropriate, and 3)
that value is < sys.maxint.

For bonus points, use proper spelling and capitalization.
Andrew
da***@dalkescientific.com
Jul 18 '05 #27

Cliff Wells wrote:
Then Python clearly needs a special mode so that when a person's age is
entered large numbers are allowed, but when counting animals it throws
an exception if the number is greater than two.


Unless of course the count is of a clean animal, in which case
the exception limit is greater than seven.

Andrew
da***@dalkescientific.com

Jul 18 '05 #28

kartik <ka*************@yahoo.com> wrote:
...
integers are used in different ways from strings. i may expect file
Integers are used in a huge variety of ways, and so are strings.
paths to be around 100 characters, and if i get a 500-character path,
i have no problem just because of the length. but if a person's age is
500 where i expect it to be less than 100, then **definitely**
something's wrong.
Try doing some accounting in Turkish liras, one of these days. Today,
each Euro is 189782957 cents of Turkish liras. If an Italian firm
selling (say) woodcutting equipment bids on a pretty modest contract in
Turkey, offering machinery worth 2375220 Euros, they need to easily
compute that their bid is 450776275125540 cents of Turkish Liras. And
that's a _pretty modest_ contract, again -- if you're doing some
computation about truly substantial sums (e.g. ones connected to
government budgets) the numbers get way larger.

[[and yes, integer numbers of some fraction of a currency, typically
cents, are a good and practical way to do accounting -- nix floating
point, be it binary or decimal]].

Sure, Turkey will rebase its currency in 2005 -- but who can predict
when some other country's currency will similarly debase. Even just for
accounting, unlimited-size integers are simply much more practical.
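The arithmetic itself, using plain Python ints (the figures are the ones quoted above):

```python
# Figures from the post: Turkish lira cents per Euro, and a bid in Euros.
rate_cents_per_euro = 189782957
bid_euros = 2375220

bid_lira_cents = bid_euros * rate_cents_per_euro
# The product is exact and well past 2**31, yet needs no special handling.
assert bid_lira_cents == 450776275125540
assert bid_lira_cents > 2**31
```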

as another example, using too long a string as an index into a
dictionary is not a problem (true, the dictionary may not have a
mapping, but i have the same issue with a short string). but too long
an index into a list rewards me with an exception.
But the same index, used as a dictionary key, works just fine. Specious
argument, therefore.

as i look at my code, i rarely have an issue with string sizes, but if
an integer variable gets very large (say > 2**31 or 2**63), it
generally reflects a bug in my code.
This may be peculiar to the kind of code you write -- if you hit more
bugs whose symptom is a large integer than ones whose symptom is a large
string, you're probably generating fewer strings than integers. But
many other people's code have the opposite character, and it's quite
presumptous of you to suggest changing Python without considering that.

i suggest u base your comments on real code, rather than reasoning in
an abstract manner from your ivory tower.


I suggest you take your blinkers off, and look at all ways integers are
commonly used in all sorts of computations, rather than imagining your
personal code is the end of the world. As common and everyday a
computation (in some fields) as the factorial of 1000 (number of
permutations of 1000 objects) is 2**8530 -- and combinatorial arithmetic
is anything but an "ivory tower" pursuit these days, and factorial is
the simplest building block in combinatorial arithmetic.
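That factorial claim is easy to verify with the standard library (a quick check, added for illustration):

```python
import math

# 1000! has 8530 bits: it lies between 2**8529 and 2**8530.
assert math.factorial(1000).bit_length() == 8530
```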

If you need objects with constraints, build them as custom types.
Subclass int, str, whatever, to accept bounds or other checkers in their
constructors (or as compiletime constants, whatever), convert the result
of each operation to the same type, and raise as soon as an instance is
constructed that's out of bounds. It's not a difficult exercise, and if
you design and apply your constraints well it may help you catch some
categories of bugs sooner than without such constraints.
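A minimal sketch of such a subclass (the class name and bounds are invented for illustration; a full version would wrap every arithmetic operator, not just `__add__`):

```python
class BoundedInt(int):
    """An int constrained to [LO, HI); out-of-range values raise at construction."""
    LO, HI = 0, 2**31          # illustrative bounds, not a language-imposed limit

    def __new__(cls, value):
        if not cls.LO <= value < cls.HI:
            raise ValueError("out of bounds: %r" % (value,))
        return super().__new__(cls, value)

    def __add__(self, other):
        # Re-wrap so the bound is re-checked after each operation; a real
        # version would do the same for __sub__, __mul__, and friends.
        return type(self)(int(self) + int(other))

x = BoundedInt(2**31 - 2)
x = x + 1                      # still in range: 2**31 - 1
# x + 1 would now raise ValueError
```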

Once you have convincing use cases and experience to show that this kind
of thing is more useful than previously noticed, _then_ you may stand a
chance to have some kind of optional limit-checking subtypes rolled into
Python's core/standard library. So far, you have nothing of the kind.
Alex
Jul 18 '05 #29

Cliff Wells <cl************@comcast.net> wrote:
...
The pivotal word here is "you". The data *you* want to store. One more
time YOU. I'm not correcting your mangling of English at this point,
rather I'm pointing out that it's *you*, not Python, that knows what
sort of data *you* want to store. If *you* want to limit your integers
to some arbitrary amount then *you* are going to have to write code to
do that. What *you* need for *your* application isn't necessarily what
anyone else needs for theirs.


This is correct, but optional constraint checking (on all kinds of data
types) can be a handy feature for many kinds of applications. It might
be handy to have a 'type decorator' for that, one which would wrap all
operations returning a result (and all mutations, for mutables) into
suitable checks. If I have a list that's supposed to always be between
5 and 8 items, it might be handy to write:

x = constrain(list, LenConstraint(5, 9))([0]*7)

and if I have an integer number that's supposed to always be (and return
when operated upon other such ints) nonnegative and less than 500,

y = constrain(int, SizeConstraint(0, 500))()

Now, y-1 would raise a ConstraintViolation, as would, say,
x.extend('hello'). Sure, nothing momentous, but this would catch some
typos (where I meant y+1) or thinkos (where I meant x.append('hello'))
faster than the unittests would.

There is some previous art for that, of course. For integers only,
Pascal let you declare a range of admissible values; in debug mode only,
many compilers would helpfully catch range-mistakes, much like (say) in
Python, debug or not, list indexing catches out-of-bounds errors. In
SQL (and other data modeling situations), it's normal and helpful to
express optional constraints on the range of allowed values.

Of course, this has nothing to do with silly and arbitrary bounds such
as 2**31-1. But constraint checking should not necessarily be ruled out
as a generally helpful technique.

Maybe it's better to attach constraints to attributes rather than to
types -- use c.y rather than bare y, so that assignments to c.y that do
not meet the constraints will raise ConstraintViolation, say. That
would surely be easier to program, and speedier, for non-mutables.

Exploring these design-space variations and gathering real-world use
cases is best done by designing and implementing a Python extension for
the purpose, of course. Rolling some such functionality into the Python
core / standard library would be silly without such specific previous
experience.
Alex
Jul 18 '05 #30

P: n/a
kartik wrote:
i suggest u base your comments on real code, rather than reasoning in
an abstract manner from your ivory tower.


Could someone please help me with a related question? Why do I have the
unmistakable feeling that this is a troll? Is it the misspellings? The
gratuitous sniping? Or the OP's suggestions to provide "real code" when
he has not done the same?
--
Michael Hoffman
Jul 18 '05 #31

P: n/a
Michael Hoffman <m.*********************************@example.com> wrote:
kartik wrote:
i suggest u base your comments on real code, rather than reasoning in
an abstract manner from your ivory tower.


Could someone please help me with a related question? Why do I have the
unmistakable feeling that this is a troll? Is it the misspellings? The
gratuitous sniping? Or the OP's suggestions to provide "real code" when
he has not done the same?


It's probably just the usual way he interacts, no big deal. He didn't
ask for real code to be provided, but used to base one's comments on,
please note; and I gave two real-life examples (a woodworking equipment
company in Italy bidding to supply some equipment to a Turkish customer
who prefers to get billed in their local currency; a factorial) without
any trouble and without needing to post the code (it's quite obvious).
Alex
Jul 18 '05 #32

P: n/a
Michael Hoffman wrote:
Could someone please help me with a related question? Why do I have the
unmistakable feeling that this is a troll? Is it the misspellings? The
gratuitous sniping? Or the OP's suggestions to provide "real code" when
he has not done the same?


If you check the archives you'll see the OP's nym come up in two
other hits. One for a proposal to gcc and another as a proposal to
OpenOffice. Both of the sort "I think XYZ would be cool", both using
the poor syntax.

Neither show much troll-like behavior.

My guess is the not unusual case of someone who works mostly alone
and doesn't have much experience in diverse projects nor working
with more experience people.

I've seen some similar symptoms working with, for example,
undergraduate students who are hotshot programmers ... when
compared to other students in their non-CS department but not
when compared to, say, a CS student, much less an experienced
developer.

Andrew
da***@dalkescientific.com
Jul 18 '05 #33

P: n/a
Me:
I've seen some similar symptoms ...


But of course correlation does not imply causation and I use
that solely as an example to show how it could be caused by
non-troll behavior.

There could also be cultural reasons, and/or personal ones,
and/or the keyboard could have a broken shift state... Though
that would require '*' to not be on a shifted position.

Andrew
da***@dalkescientific.com
Jul 18 '05 #34

P: n/a
Andrew Dalke wrote:
Neither show much troll-like behavior.

My guess is the not unusual case of someone who works mostly alone
and doesn't have much experience in diverse projects nor working
with more experience people.


Perhaps. Maybe I have just been spending too much time in troll-infested
forums, which sets one's troll-o-meter off a lot earlier.
--
Michael Hoffman
Jul 18 '05 #35

P: n/a
On Tue, 2004-10-26 at 09:17 +0200, Alex Martelli wrote:
Cliff Wells <cl************@comcast.net> wrote:
...
The pivotal word here is "you". The data *you* want to store. One more
time YOU. I'm not correcting your mangling of English at this point,
rather I'm pointing out that it's *you*, not Python, that knows what
sort of data *you* want to store. If *you* want to limit your integers
to some arbitrary amount then *you* are going to have to write code to
do that. What *you* need for *your* application isn't necessarily what
anyone else needs for theirs.
This is correct, but optional constraint checking (on all kinds of data
types) can be a handy feature for many kinds of applications. It might
be handy to have a 'type decorator' for that, one which would wrap all
operations returning a result (and all mutations, for mutables) into
suitable checks. If I have a list that's supposed to always be between
5 and 8 items, it might be handy to write:


<snip speculative examples>
Of course, this has nothing to do with silly and arbitrary bounds such
as 2**31-1. But constraint checking should not necessarily be ruled out
as a generally helpful technique.


Not at all. I do quite a bit of database programming and use
constraints (foreign keys, unique indices, etc) extensively. The
concept is also in widespread use in GUI programming libraries for
controls that deal with user input (e.g. masked input controls). In
fact, most controls present in a GUI implicitly constrain user input
(menus, buttons, etc).

Of course what you describe above can be done now using functions and
derived classes, but it would certainly be interesting to have a general
(and concise) way of describing constraints within the language itself.
Regards,
Cliff

--
Cliff Wells <cl************@comcast.net>

Jul 18 '05 #36

P: n/a
Cliff Wells <cl************@comcast.net> wrote in message news:<ma**************************************@python.org>...
On Mon, 2004-10-25 at 21:48 -0500, Jeff Epler wrote:
Here's a bug that passes silently because ints are not limited in range
from 1 to 100:
...

OK, just joking. I couldn't think of one.


Here's one:

# count how many ferrets I have
ferrets = 0
while 1:
    try:
        ferrets += 1
    except:
        break
print ferrets

As you can clearly see, the answer should have been 3, but due to Python
silently allowing numbers larger than 3 the program gets stuck in an
apparently interminable loop, requiring me to reboot Microsoft Bob.


There always were legends that sys.maxferretpopulation is
implementation-dependent, not readable, and not writable. More
recently the whisper is that at least under Windows it is set by the
installer at installation time using an algorithm known only to the
timbot.
Jul 18 '05 #37

P: n/a
Cliff Wells <cl************@comcast.net> wrote:
...
Of course, this has nothing to do with silly and arbitrary bounds such
as 2**31-1. But constraint checking should not necessarily be ruled out
as a generally helpful technique.
Not at all. I do quite a bit of database programming and use
constraints (foreign keys, unique indices, etc) extensively. The


Yep, and besides such 'structural' constraints even the simple kind of
check such as "this number is always between X and Y" may to a lesser
extent come in handy.
concept is also in widespread use in GUI programming libraries for
controls that deal with user input (e.g. masked input controls). In
fact, most controls present in a GUI implicitly constrain user input
(menus, buttons, etc).
A different case, IMHO. An input/'edit' box with inherent checks is a
bit closer.
Of course what you describe above can be done now using functions and
derived classes, but it would certainly be interesting to have a general
(and concise) way of describing constraints within the language itself.


Functions are in general the proper way to check, but having to
explicitly call the checking function each time a change may have
occurred gets old fast.

What I was musing about is (well, half of it) is easy to implement if
you're using qualified names rather than barenames -- each assignment to
a.x may easily be made to go through a setter-function that calls an
appropriate checker function. Attaching similar setter-functions to
barenames is a very alien concept to Python today, but in a sense the
difference could be seen as mere syntax sugar.

The other half of the problem has to do with mutables, and ensuring a
checker function (on object invariants, if you will) runs after each
mutation -- not all that new a notion, just a part of design by contract
(people always focus on preconditions and postconditions and appear to
forget invariants, which are just as crucial;-).
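The qualified-name half is easy to sketch with a descriptor: every assignment to c.y goes through a setter that runs a checker first (the Checked class and the 0..500 predicate are illustrative, not an existing API):

```python
class Checked:
    """Data descriptor: assignments to the attribute must pass `checker`."""
    def __init__(self, checker, name):
        self.checker = checker
        self.name = '_' + name          # where the real value is stored
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.name)
    def __set__(self, obj, value):
        if not self.checker(value):
            raise ValueError("constraint violated for %r" % (value,))
        setattr(obj, self.name, value)

class C:
    # y must always be nonnegative and less than 500
    y = Checked(lambda v: 0 <= v < 500, 'y')
    def __init__(self, y=0):
        self.y = y                      # goes through Checked.__set__ too
```

Here c = C(100) and c.y = 499 are accepted, while c.y = 500 raises immediately; the checker call is also the natural place to no-op when a debug flag is off.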

I'm not sure there's a "grand unification" between these halves just
waiting to happen. Surely, though, a little add-on package making it
easy to add checker functions and providing a few typical such checkers
might be of some help. If the checkers could be easily disabled at the
flip of a switch (say the debug flag;-) they might even help by being
potentially-executable specifications of programmer intention (much like
DbC helps in part exactly because it _can_ be disabled that way;-).

Guess I'm mostly musing on the general issue and the 2**31-1 silliness
was just a spark that lit some waiting tinder in my mind;-).
Alex
Jul 18 '05 #38

P: n/a
Cliff Wells wrote:
On Tue, 2004-10-26 at 05:10 +0000, Sam Holden wrote:
On 25 Oct 2004 21:25:07 -0700, kartik <ka*************@yahoo.com> wrote:
integers are used in different ways from strings. i may expect file
paths to be around 100 characters, and if i get a 500-character path,
i have no problem just because of the length. but if a person's age is
500 where i expect it to be less than 100, then **definitely**
something's wrong.


What if Noah is using your program?


Then Python clearly needs a special mode so that when a person's age is
entered large numbers are allowed, but when counting animals it throws
an exception if the number is greater than two.

Duh.


And what about Methuselah? He's not going to be receiving his
social security cheques if he can't enter his age as higher than,
say, 2**7 (as our friend kartik would arbitrarily say...). We
actually need 2**10 for his age, but now that's a little too
high so maybe we'd better just kill the old bugger off at 2**9
(hey, he's lived so long that shaving 208 years off his age
won't bother him, right?).

-Peter
Jul 18 '05 #39

P: n/a
On Tue, 2004-10-26 at 09:36 -0400, Peter Hansen wrote:
Cliff Wells wrote:
And what about Methuselah? He's not going to be receiving his
social security cheques if he can't enter his age as higher than,
say, 2**7 (as our friend kartik would arbitrarily say...). We
actually need 2**10 for his age, but now that's a little too
high so maybe we'd better just kill the old bugger off at 2**9
(hey, he's lived so long that shaving 208 years off his age
won't bother him, right?).


Ah, the infamous Methuselah quandary. I think actually killing him with
a program would involve specialized hardware (and possibly a gun
permit), which is why this problem has yet to be solved satisfactorily.
Without corporate interest this problem will probably remain unresolved.

Of course the Social Security Administration is one of those places
where constraints define reality, so my guess is he's already received a
letter informing him he's dead and so is no longer eligible anyway. Yet
another example of practicality-beats-purity.

Regards,
Cliff

--
Cliff Wells <cl************@comcast.net>

Jul 18 '05 #40

P: n/a
kartik wrote:
The question is how small is small? Less than 2**7? Less than 2**15?
Less than 2**31? Less than 2**63? And what's the significance of powers
of two? And what happens if you move from a 32 bit machine to a 64 bit
one? (or a 1024 bit one in a hundred years time?)


less than 2**31 most of the time & hardly ever greater than 2**63 - no
matter if my machine is 32-bit, 64-bit or 1024-bit. the required range
depends on the data u want 2 store in the variable & not on the
hardware.


Yes. My point exactly. Very rarely will the platform limit reflect the
algorithmic limit. If you want to limit the range of your numbers, you
need to have knowledge of your particular use case - something that
can't be done with a predefined language limit.
> PEP 237 says, "It will give new Python programmers [...] one less
> thing to learn [...]". i feel this is not so important as the quality
> of code a programmer writes once he does learn the language.


The thing is, the int/long cutoff is arbitrary, determined solely by
implementation detail.


agreed, but it need not be that way. ints can be defined to be 32-bit
(or 64-bit) on all architectures.


But again, even though consistent, the limit is still arbitrary. Which
one will it be? How do we decide? If we're platform independent, why
bother with hardware based sizes anyway? Why not use a base 10 limit
like 10**10? As mentioned above, the choice of limit depends on the
particular algorithm, which can't be know by the language designers a
priori.
A much better idea is the judicious use of assertions.

assert x < 15000

Not only does it protect you from runaway numbers, it also documents
what the expected range is, resulting in a much better "quality of code"


such an assertion must be placed before avery assignment to the
variable - & that's tedious. moreover, it can give u a false sense of
security when u think u have it wherever needed but u've forgotten it
somewhere.


I was thinking of judicious use for local variables inside of a loop.
But if you want general, just subclass int (with 2.3.4):
>>> class limint(long):
...     def __add__(self, other):
...         value = long.__add__(self, other)
...         if value > 100:
...             raise OverflowError
...         return limint(value)
...
>>> a = limint(10)
>>> a
10L
>>> b = a+90
>>> b
100L
>>> c = b+1
Traceback (most recent call last):
  File "<pyshell#24>", line 1, in -toplevel-
    c = b+1
  File "<pyshell#18>", line 5, in __add__
    raise OverflowError
OverflowError


A bit crude, but it will get you started. If it's too slow, there is
nothing stopping you from making a C extension module with the
appropriate types.

I think that one of the problems we're having in this conversation is
that we are talking across each other. Nobody is denying that finding
bugs is a good thing. It's just that, for the bugs which the overflow
catches, there are much better ways of discovering them. (I'm surprised
no one has mentioned unit testing yet.)

Any decision always has a cost/benefit analysis. For long/int
unification, the benefits have been pointed out by others, and your
proposed costs are minor, and can be ameliorated by other practices,
which most here would argue are the better way of going about it in the
first place.
Jul 18 '05 #41

P: n/a
Rocco Moretti wrote:
(I'm surprised no one has mentioned unit testing yet.)


The topic (without the word "unit" in it, mind you) was raised
fifteen minutes after kartik first posted. Istvan and Steve
have both mentioned it as well.

-Peter
Jul 18 '05 #42

P: n/a
"Terry Reedy" <tj*****@udel.edu> wrote in message news:<ma**************************************@python.org>...
"kartik" <ka*************@yahoo.com> wrote in message
news:94**************************@posting.google.com...
1)catching overflow bugs in the language itself frees u from writing
overflow tests.
It is a fundamental characteristic of counts and integers that adding 1 is
always valid. Given that, raising an overflow exception is itself a bug,
one that Python had and has now eliminated.

If one wishes to work with residue classes mod n, +1 is also still always
valid. It is just that (n-1) + 1 is 0 instead of n. So again, raising an
overflow error is a bug.


i don't care what mathematical properties are satisfied; what matters
is to what extent the type system helps me in writing bug-free code
[...]However, the limit n could be
anything, so fixing it at, say, 2**31 - 1 is almost always useless.
i dont think so. if it catches bugs that cause numbers to increase
beyond 2**31, that's valuable.
The use of fixed range ints is a space-time machine performance hack that
has been costly in human thought time.


on what basis do u say that

-kartik
Jul 18 '05 #43

P: n/a
Steve Holden <st***@holdenweb.com> wrote in message news:<Z6kfd.18413$SW3.4432@fed1read01>...
kartik wrote:
Peter Hansen <pe***@engcorp.com> wrote in message news:<_a********************@powergate.ca>...

Do you feel strongly enough about the quality of your code to write
automated tests for it? Or are you just hoping that one tiny class
of potential bugs will be caught for you by this feature of the
language?

1)catching overflow bugs in the language itself frees u from writing
overflow tests.


That seems to me to be a bit like saying you don't need to do any
engineering calculations for your bridge because you'll find out if it's
not strong enough when it falls down.


i was inaccurate. what i meant was that overflow errors provide a
certain amount of sanity checking in the absence of explicit testing -
& do u check every assignment for bounds?
2)no test (or test suite) can catch all errors, so language support 4
error detection is welcome.


Yes, but you appear to feel that an arbitrary limit on the size of
integers will be helpful [...] Relying on hardware overflows as error
detection is pretty poor, really.


i'm not relying on overflow errors to ensure correctness. it's only a
mechanism that sometimes catches bugs - & that's valuable.
3)overflow detection helps when u dont have automated tests 4 a
particular part of your program.

But writing such tests would help much more.


agreed, but do u test your code so thoroughly that u can guarantee
your code is bug-free. till then, overflow errors help.

-kartik
Jul 18 '05 #44

P: n/a
Cliff Wells <cl************@comcast.net> wrote in message news:<ma**************************************@python.org>...
On Mon, 2004-10-25 at 20:37 -0700, kartik wrote:
Istvan Albert <ia*****@mailblocks.com> wrote in message news:<Ho********************@giganews.com>...
kartik wrote:

> there seems to be a serious problem with allowing numbers to grow in a
> nearly unbounded manner, as int/long unification does: it hides bugs.

No it does not.

Just because a runaway program stops sooner by hitting the
integer limit it does not mean that this having this limit
is a validation method.
i didn't say it is. all i say is that it catches bugs - & that's
valuable.


You did say it is. And then you said it again right there.


i think you are getting confused between a mechanism that catches some
bugs & one that can catch all (a validation method)
once again, i'm not relying on the integer limit to catch bugs, but
i'd much rather have bugs exposed by an overflow exception


Again that is using the integer limit to catch bugs. Repeated self-
contradiction does little to bolster your argument.
If you are worried about some numbers growing too much, then
check them yourself, you'll get much better results that way.


maybe, why not use an automated test built-in 2 the language? i get it
4 free.


Because, strangely enough, most people want limitations *removed* from
the language, not added to it.


limits that catch bugs are good. without *any* limitations, i should
be able to redefine the symbol "4" to mean "8". would you program in
such a language? i wouldn't
If you are looking for a language with
arbitrary limits then I think Python isn't quite right for you.


not arbitrary limits, but ones that catch bugs.
Jul 18 '05 #45

P: n/a
Cliff Wells <cl************@comcast.net> wrote in message news:<ma**************************************@python.org>...
On Mon, 2004-10-25 at 21:05 -0700, kartik wrote:
The question is how small is small? Less than 2**7? Less than 2**15?
Less than 2**31? Less than 2**63? And what's the significance of powers
of two? And what happens if you move from a 32 bit machine to a 64 bit
one? (or a 1024 bit one in a hundred years time?)
less than 2**31 most of the time & hardly ever greater than 2**63 - no
matter if my machine is 32-bit, 64-bit or 1024-bit. the required range
depends on the data u want 2 store in the variable & not on the
hardware.


I'm going to rewrite that last line in English so that perhaps you'll
catch on to what you are saying:


thank u so much 4 your help, but i know what i'm saying without
assistance from clowns like u. & i dont give a damn about your rules 4
proper communciation, as long as i'm understood.

'''
the required range depends on the data you want to store in the variable
and not on the hardware.
'''

The pivotal word here is "you". The data *you* want to store. One more
time YOU. I'm not correcting your mangling of English at this point,
rather I'm pointing out that it's *you*, not Python, that knows what
sort of data *you* want to store. If *you* want to limit your integers
to some arbitrary amount then *you* are going to have to write code to
do that. What *you* need for *your* application isn't necessarily what
anyone else needs for theirs.


the required range, while being different for different variables, is
generally is less than 2**31 - & *that* can be checked by the
language.
Jul 18 '05 #46

P: n/a
al*****@yahoo.com (Alex Martelli) wrote in message news:<1gm9a9j.s0b279yqpnlvN%al*****@yahoo.com>...
Cliff Wells <cl************@comcast.net> wrote: optional constraint checking [...] can be a handy feature for many kinds of
applications [...] Of course, this has nothing to do with silly and arbitrary
bounds such as 2**31-1.


bounds such as 2**31 are a crude form of constraint checking that you
get by default. if you feel your data is going to be larger, you can
use a long type

-kartik
Jul 18 '05 #47

P: n/a
On 26 Oct 2004 20:50:58 -0700, kartik <ka*************@yahoo.com> wrote:

not arbitrary limits, but ones that catch bugs.


Please give an example of some code containing such a bug which
would be caught by integer limits but not caught by the unit tests
someone who has been programming for only a week would write.

--
Sam Holden
Jul 18 '05 #48

P: n/a
On Tue, 2004-10-26 at 20:50 -0700, kartik wrote:
Cliff Wells <cl************@comcast.net> wrote in message news:<ma**************************************@python.org>...
If you are looking for a language with
arbitrary limits then I think Python isn't quite right for you.


not arbitrary limits, but ones that catch bugs.


You win. I'll give you kudos for having seemingly infinite patience,
but frankly you've worn mine out.

--
Cliff Wells <cl************@comcast.net>

Jul 18 '05 #49

P: n/a
Rocco Moretti <ro**********@hotpop.com> wrote in message news:<cl**********@news.doit.wisc.edu>...
Very rarely will the platform limit reflect the
algorithmic limit. If you want to limit the range of your numbers, you
need to have knowledge of your particular use case - something that
can't be done with a predefined language limit.
i'm saying that most of the time, the algorithmic limit will be less
than 2**31 or 2**63 - & that can be checked by the language.

the limit is still arbitrary. Which
one will it be? How do we decide? If we're platform independent, why
bother with hardware based sizes anyway? Why not use a base 10 limit
like 10**10?
it doesn't really matter what the limit is, as long as it's large
enough that it's not crossed often. (it's only that a limit of 2**31
or 2**63 can be efficiently checked.)

I think that one of the problems we're having in this conversation is
that we are talking across each other. Nobody is denying that finding
bugs is a good thing. It's just that, for the bugs which the overflow
catches, there are much better ways of discovering them. (I'm surprised
no one has mentioned unit testing yet.)

Any decision always has a cost/benefit analysis. For long/int
unification, the benefits have been pointed out by others, and your
proposed costs are minor, and can be ameliorated by other practices,
which most here would argue are the better way of going about it in the
first place.


agreed, but what about when you don't use these "better practices"? do
you use them for every variable? overflow catches sometimes help you
then.
Jul 18 '05 #50
