
int/long unification hides bugs

There seems to be a serious problem with allowing numbers to grow in a
nearly unbounded manner, as int/long unification does: it hides bugs.
Most of the time, I expect my numbers to be small. 2**31 is good
enough for most uses of variables, and when more is needed, 2**63
should do most of the time.

Granted, unification allows code to work for larger numbers than
foreseen (as PEP 237 states), but I feel the potential for more
undetected bugs outweighs this benefit.

The other benefit of the unification - portability - can be achieved
by defining int32 & int64 types (or by defining all integers to be
32-bit (or 64-bit)).
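A rough sketch of what such a checked type could look like (the int32
helper below is hypothetical, not an existing Python builtin):

def int32(value):
    # hypothetical checked conversion: reject anything outside the
    # signed 32-bit range instead of silently promoting to long
    if not (-2**31 <= value < 2**31):
        raise OverflowError("value does not fit in 32 bits")
    return value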

PEP 237 says, "It will give new Python programmers [...] one less
thing to learn [...]". I feel this is not so important as the quality
of code a programmer writes once he does learn the language.

-kartik
Jul 18 '05
kartik wrote:

The question is how small is small? Less than 2**7? Less than 2**15?
Less than 2**31? Less than 2**63? And what's the significance of powers
of two? And what happens if you move from a 32-bit machine to a 64-bit
one? (Or a 1024-bit one in a hundred years' time?)

Less than 2**31 most of the time, and hardly ever greater than 2**63 - no
matter whether my machine is 32-bit, 64-bit, or 1024-bit. The required range
depends on the data you want to store in the variable and not on the
hardware.

Yes. My point exactly. Very rarely will the platform limit reflect the
algorithmic limit. If you want to limit the range of your numbers, you
need to have knowledge of your particular use case - something that
can't be done with a predefined language limit.
> PEP 237 says, "It will give new Python programmers [...] one less
> thing to learn [...]". I feel this is not so important as the quality
> of code a programmer writes once he does learn the language.

The thing is, the int/long cutoff is arbitrary, determined solely by
implementation detail.

Agreed, but it need not be that way. ints can be defined to be 32-bit
(or 64-bit) on all architectures.


But again, even though consistent, the limit is still arbitrary. Which
one will it be? How do we decide? If we're platform-independent, why
bother with hardware-based sizes anyway? Why not use a base-10 limit
like 10**10? As mentioned above, the choice of limit depends on the
particular algorithm, which can't be known by the language designers a
priori.

A much better idea is the judicious use of assertions:

assert x < 15000

Not only does it protect you from runaway numbers, it also documents
what the expected range is, resulting in a much better "quality of code".
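For instance, a minimal sketch of the idiom (process_orders and its
bound are hypothetical, chosen only for illustration):

def process_orders(orders):
    total = 0
    for amount in orders:
        total += amount
        # the assertion documents the expected range and fails fast
        # if the total runs away
        assert total < 15000, "total out of expected range"
    return total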


Such an assertion must be placed before every assignment to the
variable - and that's tedious. Moreover, it can give you a false sense of
security when you think you have it wherever needed but you've forgotten it
somewhere.


I was thinking of judicious use for local variables inside of a loop.
But if you want something general, just subclass long (with 2.3.4):

class limint(long):
    def __add__(self, other):
        value = long.__add__(self, other)
        if value > 100:
            raise OverflowError
        return limint(value)

>>> a = limint(10)
>>> a
10L
>>> b = a + 90
>>> b
100L
>>> c = b + 1
Traceback (most recent call last):
  File "<pyshell#24>", line 1, in -toplevel-
    c = b + 1
  File "<pyshell#18>", line 5, in __add__
    raise OverflowError
OverflowError

A bit crude, but it will get you started. If it's too slow, there is
nothing stopping you from making a C extension module with the
appropriate types.
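If you need more than addition, a slightly more general sketch (still
using the arbitrary bound of 100, and assuming only a few operators
matter) might funnel every result through one check:

class limint(long):
    LIMIT = 100  # arbitrary demonstration bound

    def _check(self, value):
        # route every arithmetic result through one range check
        if value > self.LIMIT:
            raise OverflowError("limint result exceeds %d" % self.LIMIT)
        return limint(value)

    def __add__(self, other):
        return self._check(long.__add__(self, other))

    def __sub__(self, other):
        return self._check(long.__sub__(self, other))

    def __mul__(self, other):
        return self._check(long.__mul__(self, other))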

I think that one of the problems we're having in this conversation is
that we are talking across each other. Nobody is denying that finding
bugs is a good thing. It's just that, for the bugs which the overflow
catches, there are much better ways of discovering them. (I'm surprised
no one has mentioned unit testing yet.)
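For example, a minimal sketch of such a unit test - count_items is a
hypothetical stand-in for real application code, written against the
unittest module of that era:

import unittest

def count_items(items):
    # hypothetical application code under test
    return len(items)

class CountTest(unittest.TestCase):
    def test_count_stays_in_expected_range(self):
        result = count_items([1, 2, 3])
        # checks the algorithmic limit directly, rather than relying
        # on an integer-overflow side effect
        self.failUnless(0 <= result < 2**31)

if __name__ == '__main__':
    unittest.main()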

Any decision always has a cost/benefit analysis. For long/int
unification, the benefits have been pointed out by others, and your
proposed costs are minor and can be ameliorated by other practices,
which most here would argue are the better way of going about it in the
first place.
Jul 18 '05 #41
Rocco Moretti wrote:
(I'm surprised no one has mentioned unit testing yet.)


The topic (without the word "unit" in it, mind you) was raised
fifteen minutes after kartik first posted. Istvan and Steve
have both mentioned it as well.

-Peter
Jul 18 '05 #42
"Terry Reedy" <tj*****@udel.e du> wrote in message news:<ma******* *************** *************** *@python.org>.. .
"kartik" <ka************ *@yahoo.com> wrote in message
news:94******** *************** ***@posting.goo gle.com...
1) Catching overflow bugs in the language itself frees you from writing
overflow tests.

It is a fundamental characteristic of counts and integers that adding 1 is
always valid. Given that, raising an overflow exception is itself a bug,
one that Python had and has now eliminated.

If one wishes to work with residue classes mod n, +1 is also still always
valid. It is just that (n-1) + 1 is 0 instead of n. So again, raising an
overflow error is a bug.
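A quick sketch makes the residue-class point concrete (Python 2 print
syntax; the choice of n is arbitrary):

n = 2**31
# incrementing the largest residue wraps to 0; in arithmetic mod n this
# is well-defined, not an error condition
print (n - 1 + 1) % n   # prints 0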


I don't care what mathematical properties are satisfied; what matters
is to what extent the type system helps me in writing bug-free code.

[...] However, the limit n could be
anything, so fixing it at, say, 2**31 - 1 is almost always useless.

I don't think so. If it catches bugs that cause numbers to increase
beyond 2**31, that's valuable.

The use of fixed-range ints is a space-time machine performance hack that
has been costly in human thought time.

On what basis do you say that?

-kartik
Jul 18 '05 #43
Steve Holden <st***@holdenweb.com> wrote in message news:<Z6kfd.18413$SW3.4432@fed1read01>...
kartik wrote:
Peter Hansen <pe***@engcorp.com> wrote in message news:<_a**************@powergate.ca>...

Do you feel strongly enough about the quality of your code to write
automated tests for it? Or are you just hoping that one tiny class
of potential bugs will be caught for you by this feature of the
language?

1) Catching overflow bugs in the language itself frees you from writing
overflow tests.


That seems to me to be a bit like saying you don't need to do any
engineering calculations for your bridge because you'll find out if it's
not strong enough when it falls down.


I was inaccurate. What I meant was that overflow errors provide a
certain amount of sanity checking in the absence of explicit testing -
and do you check every assignment for bounds?

2) No test (or test suite) can catch all errors, so language support for
error detection is welcome.


Yes, but you appear to feel that an arbitrary limit on the size of
integers will be helpful [...] Relying on hardware overflows as error
detection is pretty poor, really.


I'm not relying on overflow errors to ensure correctness. It's only a
mechanism that sometimes catches bugs - and that's valuable.

3) Overflow detection helps when you don't have automated tests for a
particular part of your program.

But writing such tests would help much more.


Agreed, but do you test your code so thoroughly that you can guarantee
it is bug-free? Until then, overflow errors help.

-kartik
Jul 18 '05 #44
Cliff Wells <cl************@comcast.net> wrote in message news:<ma*****************************************@python.org>...
On Mon, 2004-10-25 at 20:37 -0700, kartik wrote:
Istvan Albert <ia*****@mailblocks.com> wrote in message news:<Ho**************@giganews.com>...
kartik wrote:

> There seems to be a serious problem with allowing numbers to grow in a
> nearly unbounded manner, as int/long unification does: it hides bugs.

No it does not.

Just because a runaway program stops sooner by hitting the
integer limit does not mean that having this limit
is a validation method.

I didn't say it is. All I say is that it catches bugs - and that's
valuable.


You did say it is. And then you said it again right there.


I think you are getting confused between a mechanism that catches some
bugs and one that can catch all of them (a validation method).
Once again, I'm not relying on the integer limit to catch bugs, but
I'd much rather have bugs exposed by an overflow exception.


Again that is using the integer limit to catch bugs. Repeated self-
contradiction does little to bolster your argument.
If you are worried about some numbers growing too much, then
check them yourself, you'll get much better results that way.


Maybe, but why not use an automated test built into the language? I get
it for free.


Because, strangely enough, most people want limitations *removed* from
the language, not added to it.


Limits that catch bugs are good. Without *any* limitations, I should
be able to redefine the symbol "4" to mean "8". Would you program in
such a language? I wouldn't.
If you are looking for a language with
arbitrary limits then I think Python isn't quite right for you.


Not arbitrary limits, but ones that catch bugs.
Jul 18 '05 #45
Cliff Wells <cl************@comcast.net> wrote in message news:<ma*****************************************@python.org>...
On Mon, 2004-10-25 at 21:05 -0700, kartik wrote:
The question is how small is small? Less than 2**7? Less than 2**15?
Less than 2**31? Less than 2**63? And what's the significance of powers
of two? And what happens if you move from a 32-bit machine to a 64-bit
one? (Or a 1024-bit one in a hundred years' time?)

Less than 2**31 most of the time, and hardly ever greater than 2**63 - no
matter whether my machine is 32-bit, 64-bit, or 1024-bit. The required range
depends on the data you want to store in the variable and not on the
hardware.


I'm going to rewrite that last line in English so that perhaps you'll
catch on to what you are saying:


Thank you so much for your help, but I know what I'm saying without
assistance from clowns like you. And I don't give a damn about your rules
for proper communication, as long as I'm understood.

'''
the required range depends on the data you want to store in the variable
and not on the hardware.
'''

The pivotal word here is "you". The data *you* want to store. One more
time YOU. I'm not correcting your mangling of English at this point,
rather I'm pointing out that it's *you*, not Python, that knows what
sort of data *you* want to store. If *you* want to limit your integers
to some arbitrary amount then *you* are going to have to write code to
do that. What *you* need for *your* application isn't necessarily what
anyone else needs for theirs.


The required range, while being different for different variables, is
generally less than 2**31 - and *that* can be checked by the
language.
Jul 18 '05 #46
al*****@yahoo.com (Alex Martelli) wrote in message news:<1gm9a9j.s0b279yqpnlvN%al*****@yahoo.com>...
Cliff Wells <cl************@comcast.net> wrote:

optional constraint checking [...] can be a handy feature for many kinds of
applications [...] Of course, this has nothing to do with silly and arbitrary
bounds such as 2**31-1.


Bounds such as 2**31 are a crude form of constraint checking that you
get by default. If you feel your data is going to be larger, you can
use a long type.

-kartik
Jul 18 '05 #47
On 26 Oct 2004 20:50:58 -0700, kartik <ka*************@yahoo.com> wrote:

Not arbitrary limits, but ones that catch bugs.


Please give an example of some code containing such a bug which
would be caught by integer limits but not caught by the unit tests
someone who has been programming for only a week would write.

--
Sam Holden
Jul 18 '05 #48
On Tue, 2004-10-26 at 20:50 -0700, kartik wrote:
Cliff Wells <cl************@comcast.net> wrote in message news:<ma*****************************************@python.org>...
If you are looking for a language with
arbitrary limits then I think Python isn't quite right for you.


Not arbitrary limits, but ones that catch bugs.


You win. I'll give you kudos for having seemingly infinite patience,
but frankly you've worn mine out.

--
Cliff Wells <cl************@comcast.net>

Jul 18 '05 #49
Rocco Moretti <ro**********@hotpop.com> wrote in message news:<cl**********@news.doit.wisc.edu>...
Very rarely will the platform limit reflect the
algorithmic limit. If you want to limit the range of your numbers, you
need to have knowledge of your particular use case - something that
can't be done with a predefined language limit.

I'm saying that most of the time, the algorithmic limit will be less
than 2**31 or 2**63 - and that can be checked by the language.

The limit is still arbitrary. Which
one will it be? How do we decide? If we're platform-independent, why
bother with hardware-based sizes anyway? Why not use a base-10 limit
like 10**10?

It doesn't really matter what the limit is, as long as it's large
enough that it's not crossed often. (It's only that a limit of 2**31
or 2**63 can be efficiently checked.)

I think that one of the problems we're having in this conversation is
that we are talking across each other. Nobody is denying that finding
bugs is a good thing. It's just that, for the bugs which the overflow
catches, there are much better ways of discovering them. (I'm surprised
no one has mentioned unit testing yet.)

Any decision always has a cost/benefit analysis. For long/int
unification, the benefits have been pointed out by others, and your
proposed costs are minor and can be ameliorated by other practices,
which most here would argue are the better way of going about it in the
first place.


Agreed, but what about when you don't use these "better practices"? Do
you use them for every variable? Overflow catches sometimes help you
then.
Jul 18 '05 #50
