
Interesting list Validity (True/False)

P: n/a
Hello all,

First let me apologise if this has been answered, but I could not find
an accurate answer to this interesting problem.

If the following is true:
C:\Python25\rg.py>python
Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> [] == []
True
>>> ['-o'] == []
False
>>> ['-o'] == False
False
>>>
Then why do I get the following results:
C:\Python25\rg.py>help.py -o
print arg ['-o']
type(arg): <type 'list'>
arg is True? False
help.py version 1.0 Copyright RDEG (c) 2007
['-o'] is an unrecognized option.
Progam Exit (0)

<python>
import sys

_ver_ = 1.00

if '-h' in sys.argv or '--help' in sys.argv:
    print
    print " help.py Version", _ver_, "Copyright RDEG (c) 2007"
    print '''

Options : -h, --help -- display this message
Progam Exit (0)'''
    sys.exit(0)
else:
    arg = sys.argv[1:]
    print 'print arg', arg
    print 'type(arg):', type(arg)
    print 'arg is True?', arg == True
    print " help.py version", _ver_, "Copyright RDEG (c) 2007"
    print " ", arg, "is an unrecognized option."
    print " Progam Exit (0)"
    sys.exit(0)
</python>

May 11 '07 #1
40 Replies


P: n/a
On 2007-05-11, nu******@gmail.com <nu******@gmail.com> wrote:
> Then why do I get the following results:
> C:\Python25\rg.py>help.py -o
> print arg ['-o']
> type(arg): <type 'list'>
> arg is True? False
> help.py version 1.0 Copyright RDEG (c) 2007
> ['-o'] is an unrecognized option.
> Progam Exit (0)

You got those results because that's what your program does.

Were you intending it to do something else? If so, you're
going to have to explain what you wanted, because we can't read
your mind.

--
Grant Edwards (grante at visi.com)
Yow! Hey, wait a minute!! I want a divorce!! ... you're not Clint Eastwood!!
May 11 '07 #2

P: n/a
Does this clear things up?
import sys
_ver_ = 1.00
if '-h' in sys.argv or '--help' in sys.argv:
    print
    print " help.py Version", _ver_, "Copyright RDEG (c) 2007"
    print '''
Options : -h, --help -- display this message
Progam Exit (0)'''
    sys.exit(0)
else:
    arg = sys.argv[1:]
    print 'print arg', arg
    print 'type(arg):', type(arg)
    print 'arg is True?', arg == True

    print
    if arg:
        print 'was True'
    else:
        print 'was False'
    print

    print " help.py version", _ver_, "Copyright RDEG (c) 2007"
    print " ", arg, "is an unrecognized option."
    print " Progam Exit (0)"
    sys.exit(0)

## C:\python25\user>python arghhh!.py -o
## print arg ['-o']
## type(arg): <type 'list'>
## arg is True? False
##
## was True
##
## help.py version 1.0 Copyright RDEG (c) 2007
## ['-o'] is an unrecognized option.
## Progam Exit (0)

## C:\python25\user>python arghhh!.py
## print arg []
## type(arg): <type 'list'>
## arg is True? False
##
## was False
##
## help.py version 1.0 Copyright RDEG (c) 2007
## [] is an unrecognized option.
## Progam Exit (0)

May 11 '07 #3

P: n/a
On Fri, May 11, 2007 at 01:20:44PM -0700, nu******@gmail.com wrote:
> On May 11, 3:55 pm, Grant Edwards <gra...@visi.com> wrote:
> > You got those results because that's what your program does.
> >
> > Were you intending it to do something else? If so, you're
> > going to have to explain what you wanted, because we can't
>
> According to my output, it seems that arg is False even when I
> give an option of '-o' which according to the book should be
> True. No?

'-o' is not equal to True. However, that does not mean it
evaluates to false when tested by an if or while statement.

> If arg == ['-o'] then shouldn't arg == True return True and
> skip the if?

No. See the following link regarding the "truth value" of an
object:

http://docs.python.org/lib/truth.html

There are many objects other than True that evaluate to "true"
in the context of an if/while statement. Just because an
object has a "true" truth-value doesn't mean that it is equal
to the True object.
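
As a rough illustration of the point above, the following sketch compares "equal to True" with "truth value" for a few objects. It is written for a current Python 3 interpreter (where print is a function) rather than the 2.5 session shown in the thread, and the helper name describe is only an assumption made for this example.

```python
def describe(obj):
    """Return a pair: (is obj equal to True?, does obj count as true in an if?)."""
    return (obj == True, bool(obj))

# A non-empty list is "true" in an if statement, yet it is not equal to True.
for value in (['-o'], [], 1, 0, 'text', ''):
    equal, truthy = describe(value)
    print(repr(value), '| equal to True:', equal, '| truth value:', truthy)
```

Only the integer 1 is both equal to True and true in an if; ['-o'] and 'text' are true without being equal to True.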

--
Grant Edwards (grante at visi.com)
Yow! Why don't you ever enter any CONTESTS, Marvin?? Don't you know your own ZIPCODE?
May 11 '07 #4

P: n/a
I hope this helps (I have tried to post this twice already but it
seems to be going somewhere else) you help me.

What I would like to happen is:
else:
    arg = sys.argv[1:]
    print 'print arg', arg
    print 'type(arg):', type(arg)
    print 'arg is True?', arg == True
    if arg != True:
        print " No Option Provided"
    print " help.py version", _ver_, "Copyright RDEG (c) 2007"
    print " ", arg, "is an unrecognized option."
    print " Progam Exit (0)"
    sys.exit(0)

But as you can see by my output ['-o'] seems to be False as well as []
so the if happens regardless.

According to the "Book", ['-o'] should return True which should fail
the if, no?

May 11 '07 #5

P: n/a
On Fri, 2007-05-11 at 12:28 -0700, nu******@gmail.com wrote:
> Hello all,
>
> First let me apologise if this has been answered, but I could not find
> an accurate answer to this interesting problem.
>
> If the following is true:
> C:\Python25\rg.py>python
> Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> [] == []
> True
> >>> ['-o'] == []
> False
> >>> ['-o'] == False
> False
> >>>

Your confusion stems from the fact that for a given object, the answer
to the following three questions can be vastly different:
a) Is the object identical to True?
b) Is the object equal to True?
c) Is the object considered to be True in an "if" statement?

Observe:
>>> def check_trueness(obj):
...     if obj is True: print repr(obj), "is identical to True."
...     else: print repr(obj), "is not identical to True."
...     if obj == True: print repr(obj), "is equal to True."
...     else: print repr(obj), "is not equal to True."
...     if obj: print repr(obj), "is considered to be True by if."
...     else: print repr(obj), "is not considered to be True by if."
...
>>> check_trueness(True)
True is identical to True.
True is equal to True.
True is considered to be True by if.
>>> check_trueness(1)
1 is not identical to True.
1 is equal to True.
1 is considered to be True by if.
>>> check_trueness([1])
[1] is not identical to True.
[1] is not equal to True.
[1] is considered to be True by if.
>>> check_trueness([])
[] is not identical to True.
[] is not equal to True.
[] is not considered to be True by if.

Testing whether an object is equal to True is a much stronger test than
whether it is considered to be True in an 'if' statement, and the test
for identity is stronger still. Testing whether an object is equal to
True or identical to True is useless in most Python programs.

So, rather than doing this:

if thing==True:
    # blah

Just do this:

if thing:
    # blah

Hope this helps,

--
Carsten Haese
http://informixdb.sourceforge.net
May 11 '07 #6

P: n/a
OK. Then how would you differentiate between a call with an option
versus one without (e.g. help.py -o (where arg == ['-o']) vs. help.py
(where arg == []))?

May 11 '07 #7

P: n/a
Thanks Carsten (& all), I will give the "if thing: # blah" trick a try. I
guess I am starting to see my own confusion. As Grant mentioned, I
was comparing ['-o'] to True which of course is False :o)

However, how would you test for the falseness of the object arg?

May 11 '07 #8

P: n/a
On Fri, 2007-05-11 at 14:07 -0700, nu******@gmail.com wrote:
> OK. Then how would you differentiate between a call with an option
> versus one without (e.g. help.py -o (where arg == ['-o']) vs. help.py
> (where arg == []))?

if arg:
    print "With options"
else:
    print "Without options"
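
Applied to the script from the original post, that truth test might look roughly like the sketch below. It targets a current Python 3 interpreter, and the function name classify and the exact messages are assumptions chosen for illustration, not the original poster's code.

```python
import sys

def classify(args):
    """Describe an option list using plain truth tests instead of '== True'."""
    if '-h' in args or '--help' in args:
        return 'help requested'
    if not args:                      # an empty list counts as false
        return 'No option provided'
    return '%r is an unrecognized option.' % (args,)

if __name__ == '__main__':
    # sys.argv[1:] is always a list; it is empty when no option was given.
    print(classify(sys.argv[1:]))
```

The list is never compared with True or False; its emptiness alone decides which branch runs.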

--
Carsten Haese
http://informixdb.sourceforge.net
May 11 '07 #9

P: n/a
On May 11, 5:12 pm, nufuh...@gmail.com wrote:
> However, how would you test for the falseness of the object arg?

Would that be "if arg is not True: # blah"?

May 11 '07 #10

P: n/a
On Fri, 2007-05-11 at 14:12 -0700, nu******@gmail.com wrote:
> However, how would you test for the falseness of the object arg?

if not arg:
    # stuff

--
Carsten Haese
http://informixdb.sourceforge.net
May 11 '07 #11

P: n/a
On May 11, 5:19 pm, Carsten Haese <cars...@uniqsys.com> wrote:
> On Fri, 2007-05-11 at 14:12 -0700, nufuh...@gmail.com wrote:
> > However, how would you test for the falseness of the object arg?
>
> if not arg:
>     # stuff
>
> --
> Carsten Haese http://informixdb.sourceforge.net

I think that is the ticket Carsten! Thanks for all the good
information all y'all.

May 11 '07 #12

P: n/a
wrote in news:11**********************@e51g2000hsg.googlegroups.com in
comp.lang.python:
> >>> [] == []
> True
> >>> ['-o'] == []
> False
> >>> ['-o'] == False
> False
> >>>

To test whether something is true use if.
To test whether something is false use if not.

The python values "True" and "False" are for when you need to
*store* a boolean value (for later testing).

If you want to see if an arbitrary expression would test as true
or false at the interactive prompt use bool():
>>> bool([])
False
>>> bool(['-o'])
True
>>>
There is *never* any need to write things like:

expression == True

or:
expression == False

Once you stop doing this things will become much simpler.
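
A small sketch of that advice, assuming Python 3: boolean values are stored once and later tested with plain if / if not. The '-v' flag and the name parse_flags are hypothetical and exist only for this example.

```python
def parse_flags(args):
    """Store truth values as real bools so they can be tested later."""
    return {
        'verbose': '-v' in args,    # the 'in' operator already yields a bool
        'has_args': bool(args),     # bool() turns any truth value into True/False
    }

flags = parse_flags(['-v', 'file.txt'])
if flags['verbose']:                # plain "if", no "== True"
    print('verbose mode')
if not flags['has_args']:           # plain "if not", no "== False"
    print('nothing to do')
```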

Rob.
--
http://www.victim-prime.dsl.pipex.com/
May 11 '07 #13

P: n/a
On May 11, 3:36 pm, nufuh...@gmail.com wrote:
> But as you can see by my output ['-o'] seems to be False as well as []
> so the if happens regardless.
>
> According to the "Book", ['-o'] should return True which should fail
> the if, no?

You're mistaking the properties of an object for the object itself.

if arg:

tests the property (of being empty).

if arg==True:

tests the type property (whether a list is a boolean).

Change the code I gave above to be:

print
if arg:
    print 'The argument given was:',arg
else:
    print 'No argument given'
print

then you'll get

## C:\python25\user>python arghhh!.py -o
## print arg ['-o']
## type(arg): <type 'list'>
## arg is True? False
##
## The argument given was: ['-o']
##
## help.py version 1.0 Copyright RDEG (c) 2007
## ['-o'] is an unrecognized option.
## Progam Exit (0)
##
## C:\python25\user>python arghhh!.py
## print arg []
## type(arg): <type 'list'>
## arg is True? False
##
## No argument given
##
## help.py version 1.0 Copyright RDEG (c) 2007
## [] is an unrecognized option.
## Progam Exit (0)

May 11 '07 #14

P: n/a
Just an update of my output after Carsten and company's advice:

<out>
C:\Python25\rg.py>help.py -h

help.py Version 1.0 Copyright RDEG (c) 2007
Options : -h, --help -- display this message
Progam Exit (0)

C:\Python25\rg.py>help.py -i
print arg ['-i']
type(arg): <type 'list'>
arg is True? False
help.py version 1.0 Copyright RDEG (c) 2007
['-i'] is an unrecognized option.
Progam Exit (0)

C:\Python25\rg.py>help.py -i
help.py version 1.0 Copyright RDEG (c) 2007
['-i'] is an unrecognized option.
Progam Exit (0)

C:\Python25\rg.py>help.py
No Option provided
help.py version 1.0 Copyright RDEG (c) 2007
No Option is an unrecognized option.
Progam Exit (0)
</out>

Thanks again.
May 11 '07 #15

P: n/a
On Fri, 2007-05-11 at 14:26 -0700, me********@aol.com wrote:
> if arg==True:
>
> tests the type property (whether a list is a boolean).

That sounds nonsensical and incorrect. Please explain what you mean.

"if arg==True" tests whether the object known as arg is equal to the
object known as True.

Regards,

--
Carsten Haese
http://informixdb.sourceforge.net
May 12 '07 #16

P: n/a
On May 12, 12:56 pm, Carsten Haese <cars...@uniqsys.com> wrote:
> On Fri, 2007-05-11 at 14:26 -0700, mensana...@aol.com wrote:
> > if arg==True:
> > tests the type property (whether a list is a boolean).
>
> That sounds nonsensical and incorrect. Please explain what you mean.

<quote>
Sec 2.2.3:
Objects of different types, except different numeric types and
different string types, never compare equal;
</quote>

> "if arg==True" tests whether the object known as arg is equal to the
> object known as True.
>
> Regards,
>
> --
> Carsten Haese http://informixdb.sourceforge.net

May 13 '07 #17

P: n/a
On Sat, 2007-05-12 at 17:55 -0700, me********@aol.com wrote:
> On May 12, 12:56 pm, Carsten Haese <cars...@uniqsys.com> wrote:
> > On Fri, 2007-05-11 at 14:26 -0700, mensana...@aol.com wrote:
> > > if arg==True:
> > > tests the type property (whether a list is a boolean).
> > That sounds nonsensical and incorrect. Please explain what you mean.
>
> <quote>
> Sec 2.2.3:
> Objects of different types, except different numeric types and
> different string types, never compare equal;
> </quote>

That doesn't explain what you mean. How does "if arg==True" test whether
"a list is a boolean"?

--
Carsten Haese
http://informixdb.sourceforge.net
May 13 '07 #18

P: n/a
On May 12, 8:10 pm, Carsten Haese <cars...@uniqsys.com> wrote:
> On Sat, 2007-05-12 at 17:55 -0700, mensana...@aol.com wrote:
> > On May 12, 12:56 pm, Carsten Haese <cars...@uniqsys.com> wrote:
> > > On Fri, 2007-05-11 at 14:26 -0700, mensana...@aol.com wrote:
> > > > if arg==True:
> > > > tests the type property (whether a list is a boolean).
> > > That sounds nonsensical and incorrect. Please explain what you mean.
> > <quote>
> > Sec 2.2.3:
> > Objects of different types, except different numeric types and
> > different string types, never compare equal;
> > </quote>
>
> That doesn't explain what you mean. How does "if arg==True" test whether
> "a list is a boolean"?

>>> type(sys.argv)
<type 'list'>
>>> type(True)
<type 'bool'>
Actually, it's this statement that's non-sensical.

<quote>
"if arg==True" tests whether the object known as arg is equal to the
object known as True.
</quote>

None of these four examples are "equal" to any other.
>>> a = 1
>>> b = (1,)
>>> c = [1]
>>> d = gmpy.mpz(1)
>>>
>>> type(a)
<type 'int'>
>>> type(b)
<type 'tuple'>
>>> type(c)
<type 'list'>
>>> type(d)
<type 'mpz'>
>>> a==b
False
>>> b==c
False
>>> a==d
True

And yet a==d returns True. So why doesn't b==c
also return True, they both have a 1 at index position 0?
>>> x = [1]
>>> y = [1]
>>> x==y
True


May 13 '07 #19

P: n/a
On Sat, 12 May 2007 18:43:54 -0700, me********@aol.com wrote:
> On May 12, 8:10 pm, Carsten Haese <cars...@uniqsys.com> wrote:
> > On Sat, 2007-05-12 at 17:55 -0700, mensana...@aol.com wrote:
> > > On May 12, 12:56 pm, Carsten Haese <cars...@uniqsys.com> wrote:
> > > > On Fri, 2007-05-11 at 14:26 -0700, mensana...@aol.com wrote:
> > > > > if arg==True:
> > > > > tests the type property (whether a list is a boolean).
> > > > That sounds nonsensical and incorrect. Please explain what you mean.
> > > <quote>
> > > Sec 2.2.3:
> > > Objects of different types, except different numeric types and
> > > different string types, never compare equal;
> > > </quote>

I should point out that only applies to built-in types, not custom classes.

> > That doesn't explain what you mean. How does "if arg==True" test whether
> > "a list is a boolean"?
> >>> type(sys.argv)
> <type 'list'>
> >>> type(True)
> <type 'bool'>

That still doesn't make sense. However, using my incredible psychic
ability to read between the lines, I think what Mensanator is trying (but
failing) to say is that "if arg==True" first tests whether arg is of type
bool, and if it is not, it knows they can't be equal. That's not actually
correct. We can check this:
>>> import dis
>>> def test(arg):
...     return arg == True
...
>>> dis.dis(test)
  2           0 LOAD_FAST                0 (arg)
              3 LOAD_GLOBAL              0 (True)
              6 COMPARE_OP               2 (==)
              9 RETURN_VALUE
As you can see, there is no explicit type test. (There may or may not be
an _implicit_ type test, buried deep in the Python implementation of the
COMPARE_OP operation, but that is neither here nor there.)

Also, since bool is a subclass of int, we can do this:
>>> 1.0+0j == True
True
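
The same point can be checked on a current Python 3 interpreter with a short sketch like the following; the function name compare_with_true is only illustrative.

```python
def compare_with_true(obj):
    """Plain equality test; no explicit type check appears in the source."""
    return obj == True

# bool is a subclass of int, so True takes part in numeric comparisons.
print(issubclass(bool, int))
print(compare_with_true(1.0 + 0j))   # a complex number equal to 1 compares equal to True
print(compare_with_true([1]))        # a list never compares equal to a bool
print(True + True)                   # arithmetic treats True as the integer 1
```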


> Actually, it's this statement that's non-sensical.
>
> <quote>
> "if arg==True" tests whether the object known as arg is equal to the
> object known as True.
> </quote>

Not at all, it makes perfect sense. X == Y always tests whether the
argument X is equal to the object Y regardless of what X and Y are.

> None of these four examples are "equal" to any other.

That's actually wrong, as you show further down.

> >>> a = 1
> >>> b = (1,)
> >>> c = [1]
> >>> d = gmpy.mpz(1)
> >>>
> >>> type(a)
> <type 'int'>
> >>> type(b)
> <type 'tuple'>
> >>> type(c)
> <type 'list'>
> >>> type(d)
> <type 'mpz'>
> >>> a==b
> False
> >>> b==c
> False
> >>> a==d
> True

See, a and d are equal.

> And yet a==d returns True. So why doesn't b==c
> also return True, they both have a 1 at index position 0?

Why should they return true just because the contents are the same? A bag
of shoes is not the same as a box of shoes, even if they are the same
shoes. Since both lists and tuples are containers, neither are strings or
numeric types, so the earlier rule applies: they are different types, so
they can't be equal.

gmpy.mpz(1) on the other hand, is both a numeric type and a custom class.
It is free to define equal any way that makes sense, and it treats itself
as a numeric type and therefore says that it is equal to 1, just like 1.0
and 1+0j are equal to 1.
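
For readers without gmpy installed, a similar effect can be sketched with two numeric classes from the standard library of a current Python 3, fractions.Fraction and decimal.Decimal, used here as stand-ins for gmpy.mpz. The helper equal_to_one is an assumption made for this example.

```python
from decimal import Decimal
from fractions import Fraction

def equal_to_one(values):
    """Return the values that compare equal to the int 1."""
    return [v for v in values if v == 1]

# Numeric types define equality by value; containers and strings do not match a number.
candidates = [1.0, 1 + 0j, True, Fraction(1), Decimal(1), (1,), [1], '1']
for value in equal_to_one(candidates):
    print(repr(value), 'compares equal to 1')
```

The tuple, the list and the string are filtered out, while every numeric object with the value one is kept.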
--
Steven.

May 13 '07 #20

P: n/a
On May 12, 11:02 pm, Steven D'Aprano
<s...@REMOVE.THIS.cybersource.com.au> wrote:
> > Actually, it's this statement that's non-sensical.
> > <quote>
> > "if arg==True" tests whether the object known as arg is equal to the
> > object known as True.
> > </quote>
>
> Not at all, it makes perfect sense. X == Y always tests whether the
> argument X is equal to the object Y regardless of what X and Y are.

Except for the exceptions, that's why the statement is wrong.

> > None of these four examples are "equal" to any other.
>
> That's actually wrong, as you show further down.

No, it's not, as I show further down.

> > >>> a = 1
> > >>> b = (1,)
> > >>> c = [1]
> > >>> d = gmpy.mpz(1)
> > >>> type(a)
> > <type 'int'>
> > >>> type(b)
> > <type 'tuple'>
> > >>> type(c)
> > <type 'list'>
> > >>> type(d)
> > <type 'mpz'>
> > >>> a==b
> > False
> > >>> b==c
> > False
> > >>> a==d
> > True
>
> See, a and d are equal.

No, they are not "equal". Ints and mpzs should NEVER
be used together in loops, even though it's legal. The ints
ALWAYS have to be coerced to mpzs to perform arithmetic
and this takes time...LOTS of it. The absolute stupidest
thing you can do (assuming n is an mpz) is:

while n > 1:
    if n % 2 == 0:
        n = n/2
    else:
        n = 3*n + 1

You should ALWAYS do:

ZED = gmpy.mpz(0)
ONE = gmpy.mpz(1)
TWO = gmpy.mpz(2)
TWE = gmpy.mpz(3)

while n > ONE:
    if n % TWO == ZED:
        n = n/TWO
    else:
        n = TWE*n + ONE

This way, no coercion is performed.

> > And yet a==d returns True. So why doesn't b==c
> > also return True, they both have a 1 at index position 0?
>
> Why should they return true just because the contents are the same?

Why should the int 1 return True when compared to mpz(1)?

a = [1]
b = [1]

returns True for a==b? After all, it returns false if b is [2],
so it looks at the content in this case. So for numerics,
it's the value that matters, not the type. And this creates
a false sense of "equality" when a==d returns True.

> A bag
> of shoes is not the same as a box of shoes, even if they are the same
> shoes.

Exactly. For the very reason I show above. The fact that the int
has the same shoes as the mpz doesn't mean the int should be
used, it has to be coerced.

> Since both lists and tuples are containers, neither are strings or
> numeric types, so the earlier rule applies: they are different types, so
> they can't be equal.

But you can't trust a==d returning True to mean a and d are
"equal". To say the comparison means the two objects are
equal is misleading, in other words, wrong. It only takes one
turd to spoil the whole punchbowl.

> gmpy.mpz(1) on the other hand, is both a numeric type and a custom class.
> It is free to define equal any way that makes sense, and it treats itself
> as a numeric type and therefore says that it is equal to 1, just like 1.0
> and 1+0j are equal to 1.

They are equal in the mathematical sense, but not otherwise.
And to think that makes no difference is to be naive.

> --
> Steven
May 13 '07 #21

P: n/a
On Sat, 12 May 2007 21:50:12 -0700, me********@aol.com wrote:
> > > Actually, it's this statement that's non-sensical.
> > > <quote>
> > > "if arg==True" tests whether the object known as arg is equal to the
> > > object known as True.
> > > </quote>
> >
> > Not at all, it makes perfect sense. X == Y always tests whether the
> > argument X is equal to the object Y regardless of what X and Y are.
>
> Except for the exceptions, that's why the statement is wrong.

But there are no exceptions. X == Y tests for equality. If it returns
True, then the objects are equal by definition. That's what equal means in
Python.

One can abuse the technology to give nonsensical results:

class EqualToEverything(object):
    def __eq__(self, other):
        return True

>>> x = EqualToEverything()
>>> x == 1.0
True
>>> x == [2.9, "hello world"]
True

but that's no different from any language that allows you to override
operators.

> > > None of these four examples are "equal" to any other.
> >
> > That's actually wrong, as you show further down.
>
> No, it's not, as I show further down.

But you show no such thing.

Or, to put it another way:

Did! Did not! Did! Did not! Did! Did not! ...

> > > >>> a = 1
> > > >>> b = (1,)
> > > >>> c = [1]
> > > >>> d = gmpy.mpz(1)
> [snip]
> > > >>> a==d
> > > True
> >
> > See, a and d are equal.
>
> No, they are not "equal".

Of course they are. It says so right there: "a equals d" is true.

> Ints and mpzs should NEVER
> be used together in loops, even though it's legal.

Why ever not? If you need an mpz value in order to do something, and no
other data type will do, what would you suggest? Just give up and say
"Don't do this, because it is Bad, m'kay?"

> The ints
> ALWAYS have to be coerced to mpzs to perform arithmetic
> and this takes time...LOTS of it.

Really? Just how much time?

timeit.Timer("x == y", "import gmpy; x = 1; y = gmpy.mpz(1)").repeat()
timeit.Timer("x == y", "x = 1; y = 1").repeat()

I don't have gmpy installed here, so I can't time it, but I look forward
to seeing the results, if you would be so kind.

Even if it is terribly slow, that's just an implementation detail. What
happens when Python 2.7 comes out (or Python 3.0 or Python 99.78) and
coercion from int to mpz is lightning fast? Would you then say "Well,
int(1) and mpz(1) used to be unequal, but now they are equal?".

Me, I'd say they always were equal, but previously it used to be slow to
coerce one to the other.

The absolute stupidest
thing you can do (assuming n is an mpz) is:

while n > 1:
    if n % 2 == 0:
        n = n/2
    else:
        n = 3*n + 1
Oh, I can think of much stupider things to do.

while len([math.sin(random.random()) for i in range(n)[:]][:]) > 1:
    if len( "+" * \
        int(len([math.cos(time.time()) for i in \
        range(1000, n+1000)[:]][:])/2.0)) == 0:
        n = len([math.pi**100/i for i in range(n) if i % 2 == 1][:])
    else:
        s = '+'
        for i in range(n - 1):
            s += '+'
        s += s[:] + ''.join(reversed(s[:]))
        s += s[:].replace('+', '-')[0:1]
        n = s[:].count('+') + s[:].count('-')
You should ALWAYS do:

ZED = gmpy.mpz(0)
ONE = gmpy.mpz(1)
TWO = gmpy.mpz(2)
TWE = gmpy.mpz(3)

while n > ONE:
    if n % TWO == ZED:
        n = n/TWO
    else:
        n = TWE*n + ONE

This way, no coercion is performed.
I know that algorithm, but I don't remember what it is called...

In any case, what you describe is a local optimization. It's probably a
good optimization, but in no way, shape or form does it imply that mpz(1)
is not equal to 1.

And yet a==d returns True. So why doesn't b==c
also return True, they both have a 1 at index position 0?

Why should they return true just because the contents are the same?

Why should the int 1 return True when compared to mpz(1)?
Because they both represent the same mathematical number, whereas a list
containing 1 and a tuple containing 1 are different containers. Even if
the contents are the same, lists aren't equal to tuples.

a = [1]
b = [1]

returns True for a==b?
That's because both are the same kind of container, and they both have the
same contents.

After all, it returns false if b is [2],
so it looks at the content in this case. So for numerics,
it's the value that matters, not the type. And this creates
a false sense of "equality" when a==d returns True.
There's nothing false about it. Ask any mathematician, does 1 equal 1.0,
and they will say "of course".
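
The rule being argued over can be checked directly. The following is a minimal sketch, written for Python 3 rather than the Python 2.5 used in this thread. It substitutes the standard library's Fraction and Decimal for gmpy.mpz, an assumption made only so that it runs without gmpy installed. Numeric types compare by value across types, while a list and a tuple with the same contents do not compare equal.

```python
from decimal import Decimal
from fractions import Fraction

# Fraction and Decimal are stand-ins for gmpy.mpz (assumption: gmpy is not
# installed); like mpz, they are numeric types that are not plain ints.
checks = {
    "1 == 1.0": 1 == 1.0,
    "1 == 1+0j": 1 == 1 + 0j,
    "1 == Fraction(1)": 1 == Fraction(1),
    "1 == Decimal(1)": 1 == Decimal(1),
    "[1] == [1]": [1] == [1],      # same container type, same contents
    "[1] == [2]": [1] == [2],      # same container type, different contents
    "[1] == (1,)": [1] == (1,),    # different container types
}

for expression, result in checks.items():
    print(expression, "->", result)
```

Only the last two expressions come out False: one because the contents differ, the other because the container types differ.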

A bag
of shoes is not the same as a box of shoes, even if they are the same
shoes.

Exactly. For the very reason I show above. The fact that the int
has the same shoes as the mpz doesn't mean the int should be
used, it has to be coerced.
Ints are not containers. An int doesn't contain values, an int is the
value.

Numeric values are automatically coerced because that's more practical.
That's a design decision, and it works well.

As for gmpy.mpz, since equality tests are completely under the control of
the class author, the gmpy authors obviously wanted mpz values to compare
equal with ints.
Since both lists and tuples are containers, neither are strings or
numeric types, so the earlier rule applies: they are different types, so
they can't be equal.

But you can't trust a==d returning True to mean a and d are
"equal".
What does it mean then?

To say the comparison means the two objects are
equal is misleading, in other words, wrong. It only takes one
turd to spoil the whole punchbowl.
>>
gmpy.mpz(1) on the other hand, is both a numeric type and a custom class.
It is free to define equal any way that makes sense, and it treats itself
as a numeric type and therefore says that it is equal to 1, just like 1.0
and 1+0j are equal to 1.

They are equal in the mathematical sense, but not otherwise.
Since they are mathematical values, what other sense is meaningful?
And to think that makes no difference is to be naive.
I never said that there were no efficiency differences. Comparing X with Y
might take 0.02ms or it could take 2ms depending on how much work needs
to be done. I just don't understand why you think that has a bearing on
whether they are equal or not.
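
To make the distinction between cost and result concrete, here is a hedged sketch in Python 3. It is not a gmpy benchmark: Fraction stands in for gmpy.mpz (an assumption, so the code runs on a stock install), and the helper name time_equality is invented for this illustration. It times a same-type and a mixed-type comparison of equal values and reports both the answers and the elapsed times.

```python
import timeit
from fractions import Fraction


def time_equality(exponent, number=10000):
    """Time int == int and int == Fraction for the value 2**exponent - 1.

    Returns (same_type_equal, mixed_type_equal, same_seconds, mixed_seconds).
    Fraction is only a stand-in for gmpy.mpz here.
    """
    a = 2 ** exponent - 1
    b = 2 ** exponent - 1      # equal value, built separately
    c = Fraction(a)            # equal value, different numeric type
    same = timeit.timeit("a == b", globals={"a": a, "b": b}, number=number)
    mixed = timeit.timeit("a == c", globals={"a": a, "c": c}, number=number)
    return (a == b, a == c, same, mixed)


if __name__ == "__main__":
    for exponent in (0, 200, 400, 800):
        print(exponent, time_equality(exponent))
```

Whatever the two timings turn out to be on a given machine, the first two fields stay True, which is the point being made above: how long the comparison takes and what it returns are separate questions.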
--
Steven.

May 13 '07 #22

P: n/a
On Sat, 2007-05-12 at 18:43 -0700, me********@aol.com wrote:
That doesn't explain what you mean. How does "if arg==True" test whether
"a list is a boolean"?
>>> type(sys.argv)
<type 'list'>
>>> type(True)
<type 'bool'>
All right, so what you meant was "Assuming that arg is a list, 'if
arg==True' will always fail because lists never compare equal to any
boolean."
Actually, it's this statement that's non-sensical.

<quote>
"if arg==True" tests whether the object known as arg is equal to the
object known as True.
</quote>

[snip examples of "surprising" equality tests...]
The statement I made is simply the meaning of "if arg==True" by
definition, so I don't see how it can be nonsensical.

The problem is that you consider equality tests in Python to be
nonsensical because they don't fit with your opinion of what equality
should mean.

Regards,

--
Carsten Haese
http://informixdb.sourceforge.net
May 13 '07 #23

P: n/a
On May 13, 8:57 am, Carsten Haese <cars...@uniqsys.com> wrote:
On Sat, 2007-05-12 at 18:43 -0700, mensana...@aol.com wrote:
That doesn't explain what you mean. How does "if arg==True" test whether
"a list is a boolean"?
>>> type(sys.argv)
<type 'list'>
>>> type(True)
<type 'bool'>

All right, so what you meant was "Assuming that arg is a list, 'if
arg==True' will always fail because lists never compare equal to any
boolean."
Actually, it's this statement that's non-sensical.
<quote>
"if arg==True" tests whether the object known as arg is equal to the
object known as True.
</quote>
[snip examples of "surprising" equality tests...]

The statement I made is simply the meaning of "if arg==True" by
definition, so I don't see how it can be nonsensical.
Because you didn't allow for exceptions, which are
prominently pointed out in the Python docs.
>
The problem is that you consider equality tests in Python to be
nonsensical because they don't fit with your opinion of what equality
should mean.
No, it has nothing to do with what it means. 1, [1], (1,)
and mpz(1) are all different types and all mathematically
the same. Yet 1 and mpz(1) compare equal but (1,) and
[1] do not. The latter fails due to a type mismatch; the
former succeeds despite a type mismatch because the two
values are the same mathematically.

I'm not saying the situation is wrong. What I'm saying
is that someone who doesn't understand why arg==True
is failing should be told ALL the rules, not just the easy
ones.
>
Regards,

--
Carsten Haese
http://informixdb.sourceforge.net

May 13 '07 #24

P: n/a
On Sun, 2007-05-13 at 09:26 -0700, me********@aol.com wrote:
The statement I made is simply the meaning of "if arg==True" by
definition, so I don't see how it can be nonsensical.

Because you didn't allow for exceptions, which are
prominently pointed out in the Python docs.
I said: "if arg==True" tests whether the object known as arg is equal to
the object known as True. There are no exceptions. "==" means "equal",
period! Your problem is that Python's notion of "equal" is different
from your notion of "equal".
The problem is that you consider equality tests in Python to be
nonsensical because they don't fit with your opinion of what equality
should mean.

No, it has nothing to do with what it means. 1, [1], (1,)
and mpz(1) are all different types and all mathematically
the same. Yet 1 and mpz(1) compare equal but (1,) and
[1] do not.
And that just proves my point. You insist on the notion that equality
means "mathematically the same". Python's equality tests sometimes work
out that way, but that's not how equality actually works, nor how it is
actually defined in Python.

Regards,

--
Carsten Haese
http://informixdb.sourceforge.net
May 13 '07 #25

P: n/a
On May 13, 2:09 pm, Carsten Haese <cars...@uniqsys.com> wrote:
On Sun, 2007-05-13 at 09:26 -0700, mensana...@aol.com wrote:
There are no exceptions.
"...and when I say none, I mean there is a certain amount."

May 14 '07 #26

P: n/a
On Sun, 13 May 2007 23:45:22 -0300, me********@aol.com
<me********@aol.com> wrote:
On May 13, 2:09 pm, Carsten Haese <cars...@uniqsys.com> wrote:
>There are no exceptions.
"...and when I say none, I mean there is a certain amount."
One of the beautiful things about Python that I like, is how few
exceptions it has; most things are rather regular.

--
Gabriel Genellina

May 14 '07 #27

P: n/a
On May 13, 8:24 am, Steven D'Aprano
<s...@REMOVE.THIS.cybersource.com.au> wrote:
On Sat, 12 May 2007 21:50:12 -0700, mensana...@aol.com wrote:
I intended to reply to this yesterday, but circumstances
(see timeit results) prevented it.
Actually, it's this statement that's non-sensical.
<quote>
"if arg==True" tests whether the object known as arg is equal to the
object known as True.
</quote>
Not at all, it makes perfect sense. X == Y always tests whether the
argument X is equal to the object Y regardless of what X and Y are.
Except for the exceptions, that's why the statement is wrong.

But there are no exceptions.
<quote emphasis added>
Sec 2.2.3:
Objects of different types, *--->except<---* different numeric types
and different string types, never compare equal;
</quote>

X == Y tests for equality. If it returns
True, then the objects are equal by definition. That's what equal means in
Python.

One can abuse the technology to give nonsensical results:

class EqualToEverything(object):
    def __eq__(self, other):
        return True

>>> x = EqualToEverything()
>>> x == 1.0
True
>>> x == [2.9, "hello world"]
True

but that's no different from any language that allows you to override
operators.
None of these four examples are "equal" to any other.
That's actually wrong, as you show further down.
No, it's not, as I show further down.

But you show no such thing.

Or, to put it another way:

Did! Did not! Did! Did not! Did! Did not! ...
>>> a = 1
>>> b = (1,)
>>> c = [1]
>>> d = gmpy.mpz(1)

[snip]
>>> a==d
True
See, a and d are equal.
No, they are not "equal".

Of course they are. It says so right there: "a equals d" is true.
Ok, but they are an exception to the rule "different types compare
False".
>
Ints and mpzs should NEVER
be used together in loops, even though it's legal.

Why ever not? If you need an mpz value in order to do something, and no
other data type will do, what would you suggest? Just give up and say
"Don't do this, because it is Bad, m'kay?"
It's not the mpzs you shouldn't use, it's the ints. I also stressed
"in loops". Replacing an integer literal with a variable still
requires a coercion, so it doesn't matter if n + 1 occurs outside
a loop.
>
The ints
ALWAYS have to be coerced to mpzs to perform arithmetic
and this takes time...LOTS of it.

Really? Just how much time?
Can't say, had to abort the following.
Returns the count of n/2 and 3n+1 operations [1531812, 854697].

import gmpy

def collatz(a):
    ONE = gmpy.mpz(1)
    TWO = gmpy.mpz(2)
    TWE = gmpy.mpz(3)
    a = gmpy.mpz(a)
    t = 0
    u = 0
    done = 0
    while done==0:
        f = gmpy.scan1(a,0)
        if f>0:
            a = a >> f
            u += f
        else:
            if a==1:
                done = 1
            else:
                a = a*TWE + ONE
                t += 1
    return [u,t]

def collatz2(a):
    t = 0
    u = 0
    done = 0
    while done==0:
        f = gmpy.scan1(a,0)
        if f>0:
            a = a >> f
            u += f
        else:
            if a==1:
                done = 1
            else:
                a = a*3 + 1
                t += 1
    return [u,t]

def test():
    collatz(2**177149-1)

def test2():
    collatz2(2**177149-1)

if __name__=='__main__':
    from timeit import Timer
    t = Timer("a = test()", "from __main__ import test")
    u = Timer("b = test2()", "from __main__ import test2")
    print t.timeit(10)
    print u.timeit(10)

## 723.430377542
## *ABORTED after 20 hours*

>
timeit.Timer("x == y", "import gmpy; x = 1; y = gmpy.mpz(1)").repeat()
timeit.Timer("x == y", "x = 1; y = 1").repeat()

I don't have gmpy installed here,
Good Lord! How do you solve a Linear Congruence? :-)
so I can't time it, but I look forward
to seeing the results, if you would be so kind.
I had a lot of trouble with this, but I think I finally got a
handle on it. I had to abort the previous test after 20+ hours
and abort a second test (once I figured out how to do your example)
on another machine after 14+ hours. I had forgotten just how
significant the difference is.

import timeit

## t = timeit.Timer("a == b", "a = 1; b = 1")
## u = timeit.Timer("c == d", "import gmpy; c = 1; d = gmpy.mpz(1)")
## t.repeat()
## [0.22317417437132372, 0.22519314605627253, 0.22474588250741367]
## u.repeat()
## [0.59943819675405763, 0.5962260566636246, 0.60122920650529466]

Unfortunately, this is not a very useful test, since mpz
coercion appears to vary by the size of the number involved.
Although changing t to

## t = timeit.Timer("a == b", "a = 2**177149-1; b = 2**177149-1")

still produces tractable results
## t.repeat()
## [36.323597552202841, 34.727026758987506, 34.574566320579862]

the same can't be said for mpz coercion:

## u = timeit.Timer("c == d", "import gmpy; c = 2**177149-1; d = gmpy.mpz(2**177149-1)")
## u.repeat()
## *ABORTED after 14 hours*

So I changed it to (using yet a third machine)

for i in xrange(8):
    e = 2*i*100
    n = 2**e-1
    r = 'a = %d; b = %d' % (n,n)
    s = 'import gmpy; a = %d; b = gmpy.mpz(%d)' % (n,n)
    print 'For 2**e-1',e
    t = timeit.Timer("a == b",r)
    u = timeit.Timer("a == b",s)
    print t.repeat()
    print u.repeat()
    print

which clearly shows the growth rate of the mpz coercion.

## int==int vs. int==mpz
##
## For 2**e-1 0
## [0.054264941118974445, 0.054553378257723141, 0.054355515455681791]
## [0.16161957500399435, 0.16188363643198839, 0.16197491752897064]
##
## For 2**e-1 200
## [0.093393746299376912, 0.093660961833065492, 0.092977494572419439]
## [1.0425794607193544, 1.0436544844503342, 1.0451038279715417]
##
## For 2**e-1 400
## [0.10496130299527184, 0.10528292779203152, 0.10497603593951155]
## [2.2687503839249636, 2.2685411490493506, 2.2691453463783233]
##
## For 2**e-1 600
## [0.11724617625774236, 0.11701867087715279, 0.11747874550051129]
## [3.616420796797021, 3.617562537946073, 3.6152373342355801]
##
## For 2**e-1 800
## [0.13156379733273482, 0.1310266632832402, 0.13168082630802047]
## [5.2398534562645089, 5.2389728893525458, 5.2353889230364388]
##
## For 2**e-1 1000
## [0.153719968797283, 0.15383679852633492, 0.15352625633217798]
## [6.967458038928207, 6.9640038947002409, 6.9675019294931388]
##
## For 2**e-1 1200
## [0.16716219584402126, 0.16743472335786436, 0.16782637005291434]
## [11.603391791430532, 11.601063020084396, 11.603106936964878]
##
## For 2**e-1 1400
## [0.179120966908215, 0.17908259508838853, 0.17934175430681876]
## [14.753954507946347, 14.755623642634944, 14.756064585859164]

And, just for laughs, I compared mpzs to mpzs,

s = 'import gmpy; a = gmpy.mpz(%d); b = gmpy.mpz(%d)' % (n,n)

which ended up faster than comparing ints to ints.

## int==int vs. mpz==mpz
##
## For 2**e-1 0
## [0.054301433257206225, 0.054502401293220933, 0.054274144039999611]
## [0.12487657446828507, 0.099130500653189346, 0.094799646619862565]
##
## For 2**e-1 200
## [0.10013419046813476, 0.10156139134030695, 0.10151083166511599]
## [0.091683807483012414, 0.091326269489948375, 0.091261281378934411]
##
## For 2**e-1 400
## [0.10716937998703036, 0.10704403530042028, 0.10705119312788414]
## [0.099165500324245093, 0.097540568227742153, 0.10131808159697742]
##
## For 2**e-1 600
## [0.12060785142996777, 0.11720683828159517, 0.11800506010281886]
## [0.11328210449149934, 0.1146064679843235, 0.11307050873582014]
##
## For 2**e-1 800
## [0.12996358680839437, 0.13021352430898236, 0.12973684081916526]
## [0.12344120825932059, 0.11454960385710677, 0.12339954699673861]
##
## For 2**e-1 1000
## [0.15328649918703752, 0.15362917265815135, 0.15313422618208516]
## [0.12753811336359666, 0.12534907002753748, 0.12588097104350471]
##
## For 2**e-1 1200
## [0.16756264696760326, 0.16747118166182684, 0.167885034915086]
## [0.12162660501311073, 0.13368267591470051, 0.13387503876843265]
##
## For 2**e-1 1400
## [0.17867761017283623, 0.17829534684824377, 0.17826312158720281]
## [0.13718761665773815, 0.13779106963280441, 0.13708166276632738]

>
Even if it is terribly slow, that's just an implementation detail. What
happens when Python 2.7 comes out (or Python 3.0 or Python 99.78) and
coercion from int to mpz is lightning fast? Would you then say "Well,
int(1) and mpz(1) used to be unequal, but now they are equal?".
Are you saying I should be unconcerned about implementation details?
That it's silly of me to be concerned about implementation side effects
due to mismatched types?
>
Me, I'd say they always were equal, but previously it used to be slow to
coerce one to the other.
So, when you're giving advice to the OP you don't feel any need to point
this out? That's all I'm trying to do, supply some "yes, but you should
be aware of..." commentary.
>
The absolute stupidest
thing you can do (assuming n is an mpz) is:
while n > 1:
    if n % 2 == 0:
        n = n/2
    else:
        n = 3*n + 1

Oh, I can think of much stupider things to do.

while len([math.sin(random.random()) for i in range(n)[:]][:]) > 1:
    if len( "+" * \
        int(len([math.cos(time.time()) for i in \
        range(1000, n+1000)[:]][:])/2.0)) == 0:
        n = len([math.pi**100/i for i in range(n) if i % 2 == 1][:])
    else:
        s = '+'
        for i in range(n - 1):
            s += '+'
        s += s[:] + ''.join(reversed(s[:]))
        s += s[:].replace('+', '-')[0:1]
        n = s[:].count('+') + s[:].count('-')
You should ALWAYS do:
ZED = gmpy.mpz(0)
ONE = gmpy.mpz(1)
TWO = gmpy.mpz(2)
TWE = gmpy.mpz(3)
while n > ONE:
    if n % TWO == ZED:
        n = n/TWO
    else:
        n = TWE*n + ONE
This way, no coercion is performed.

I know that algorithm, but I don't remember what it is called...
The Collatz Conjecture. If true, it means the while loop
terminates for any n.
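
For readers who have not met the algorithm, here is a rough sketch using plain Python 3 integers and no gmpy. The function name collatz_counts is made up for this illustration, and the bit trick `(n & -n).bit_length() - 1` is only a standard-library stand-in for what gmpy.scan1(a, 0) does in the code earlier in this post (finding the lowest set bit). It is not the code from the Collatz Functions library mentioned in this thread.

```python
def collatz_counts(n):
    """Return (halvings, triplings) needed to bring n down to 1.

    Whether this loop ends for every positive n is exactly what the
    Collatz Conjecture asserts; nobody has proved it.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    halvings = 0
    triplings = 0
    while n != 1:
        # Number of trailing zero bits, i.e. how many times n can be halved.
        shift = (n & -n).bit_length() - 1
        if shift > 0:
            n >>= shift          # do all the n/2 steps at once
            halvings += shift
        else:
            n = 3 * n + 1        # n is odd
            triplings += 1
    return halvings, triplings


if __name__ == "__main__":
    print(collatz_counts(6))     # 6, 3, 10, 5, 16, 8, 4, 2, 1
```

Starting from 6, the sequence takes six halvings and two 3n+1 steps, so the call returns (6, 2).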
>
In any case, what you describe is a local optimization. It's probably a
good optimization, but in no way, shape or form does it imply that mpz(1)
is not equal to 1.
It's a different type. It is an exception to the "different types compare
False" rule. That exception is not without cost, the type mismatch
causes coercion.
>
And yet a==d returns True. So why doesn't b==c
also return True, they both have a 1 at index position 0?
Why should they return true just because the contents are the same?
Why should the int 1 return True when compared to mpz(1)?

Because they both represent the same mathematical number, whereas a list
containing 1 and a tuple containing 1 are different containers. Even if
the contents are the same, lists aren't equal to tuples.
a = [1]
b = [1]
returns True for a==b?

That's because both are the same kind of container, and they both have the
same contents.
After all, it returns false if b is [2],
so it looks at the content in this case. So for numerics,
it's the value that matters, not the type. And this creates
a false sense of "equality" when a==d returns True.

There's nothing false about it. Ask any mathematician, does 1 equal 1.0,
and they will say "of course".
And if you ask any mathematician, he'll say that (1,) is equal to [1].
That's the difference between a mathematician and a programmer.
A programmer will say "of course not, the int has to be coerced."
>
A bag
of shoes is not the same as a box of shoes, even if they are the same
shoes.
Exactly. For the very reason I show above. The fact that the int
has the same shoes as the mpz doesn't mean the int should be
used, it has to be coerced.
Ints are not containers. An int doesn't contain values, an int is the
value.

Numeric values are automatically coerced because that's more practical.
That's a design decision, and it works well.
And I'm not saying it shouldn't be that way. But when I wrote my
Collatz Functions library, I wasn't aware of the performance issues
when doing millions of loop cycles with numbers having millions
of digits. I only found that out later. Would I have gotten a
proper answer on this newsgroup had I asked here? Sure doesn't look
like it.

BTW, in reviewing my Collatz Functions library, I noticed a coercion
I had overlooked, so as a result of this discussion, my library is
now slightly faster. So some good comes out of this argument after
all.
>
As for gmpy.mpz, since equality tests are completely under the control of
the class author, the gmpy authors obviously wanted mpz values to compare
equal with ints.
And they chose to do a silent coercion rather than raise a type exception.
It says right in the gmpy documentation that this coercion will be
performed. What it DOESN'T say is what the implications of this silent
coercion are.
>
Since both lists and tuples are containers, neither are strings or
numeric types, so the earlier rule applies: they are different types, so
they can't be equal.
But you can't trust a==d returning True to mean a and d are
"equal".

What does it mean then?
It means they are mathematically equivalent, which is not the same as
being programmatically equivalent. Mathematical equivalency is what most
people want most of the time. Not all of the people all of the time,
however. For example, I can calculate my Hailstone Function parameters
using either a list or a tuple:
>>> import collatz_functions as cf
>>> print cf.calc_xyz([1,2])
(mpz(8), mpz(9), mpz(5))
>>> print cf.calc_xyz((1,2))
(mpz(8), mpz(9), mpz(5))

But [1,2]==(1,2) yields False, so although they are not equal,
they ARE interchangeable in this application because they are
mathematically equivalent.
>
To say the comparison means the two objects are
equal is misleading, in other words, wrong. It only takes one
turd to spoil the whole punchbowl.
gmpy.mpz(1) on the other hand, is both a numeric type and a custom class.
It is free to define equal any way that makes sense, and it treats itself
as a numeric type and therefore says that it is equal to 1, just like 1.0
and 1+0j are equal to 1.
They are equal in the mathematical sense, but not otherwise.

Since they are mathematical values, what other sense is meaningful?
And to think that makes no difference is to be naive.

I never said that there were no efficiency differences. Comparing X with Y
might take 0.02ms or it could take 2ms depending on how much work needs
to be done. I just don't understand why you think that has a bearing on
whether they are equal or not.
The bearing it has matters when you're writing a function library that
you want to execute efficiently.
>
--
Steven.

May 14 '07 #28

P: n/a
On Mon, 2007-05-14 at 11:41 -0700, me********@aol.com wrote:
On May 13, 8:24 am, Steven D'Aprano
<s...@REMOVE.THIS.cybersource.com.au> wrote:
On Sat, 12 May 2007 21:50:12 -0700, mensana...@aol.com wrote:
<quote>
"if arg==True" tests whether the object known as arg is equal to the
object known as True.
</quote>
>Not at all, it makes perfect sense. X == Y always tests whether the
>argument X is equal to the object Y regardless of what X and Y are.
Except for the exceptions, that's why the statement is wrong.
But there are no exceptions.

<quote emphasis added>
Sec 2.2.3:
Objects of different types, *--->except<---* different numeric types
and different string types, never compare equal;
</quote>
The exceptions you mean are not exceptions to "'X==Y' means 'X equals
Y'". They are exceptions to "'X equals Y' means 'X is mathematically the
same as Y'," but that is not how equality is actually defined.

--
Carsten Haese
http://informixdb.sourceforge.net
May 15 '07 #29

P: n/a
On May 14, 8:10 pm, Carsten Haese <cars...@uniqsys.com> wrote:
On Mon, 2007-05-14 at 11:41 -0700, mensana...@aol.com wrote:
On May 13, 8:24 am, Steven D'Aprano
<s...@REMOVE.THIS.cybersource.com.au> wrote:
On Sat, 12 May 2007 21:50:12 -0700, mensana...@aol.com wrote:
<quote>
"if arg==True" tests whether the object known as arg is equal to the
object known as True.
</quote>
Not at all, it makes perfect sense. X == Y always tests whether the
argument X is equal to the object Y regardless of what X and Y are.
Except for the exceptions, that's why the statement is wrong.
But there are no exceptions.
<quote emphasis added>
Sec 2.2.3:
Objects of different types, *--->except<---* different numeric types
and different string types, never compare equal;
</quote>

The exceptions you mean are not exceptions to "'X==Y' means 'X equals
Y'".
I never said they were. I said they were exceptions to
"Objects of different types never compare equal".
They are exceptions to "'X equals Y' means 'X is mathematically the
same as Y',"
Who's "they"? (1,2) and [1,2] are mathematically equal but
the == comparison returns False. They are not an exception
to "mathematically equal", neither are they exceptions to
"different types never compare equal".

1 and mpz(1) compare equal so aren't an exception to
"mathematically equal" although they are an exception
to "different types never compare equal".

You need to be more explicit about what you're
talking about, as this last argument makes no sense.
but that is not how equality is actually defined.
Ok, I'll bite. How is "equality" defined?

Are you implying that I can interchange 1 and mpz(1)
because the == comparison returns True?

Are you implying that I can't interchange (1,2) and [1,2]
because the == comparison returns False?

Please make sure your definition deals with these cases.
>
--
Carsten Haese
http://informixdb.sourceforge.net
May 15 '07 #30

P: n/a
On Tue, 15 May 2007 01:37:07 -0300, me********@aol.com
<me********@aol.com> wrote:
<quote emphasis added>
Sec 2.2.3:
Objects of different types, *--->except<---* different numeric types
and different string types, never compare equal;
</quote>

The exceptions you mean are not exceptions to "'X==Y' means 'X equals
Y'".

I never said they were. I said they were exceptions to
"Objects of different types never compare equal".
This is an unfortunate wording, and perhaps should read: "For most builtin
types, objects of different types never compare equal; such objects are
ordered consistently but arbitrarily (so that sorting a heterogeneous
sequence yields a consistent result). The exceptions being different
numeric types and different string types, that have a special treatment;
see section 5.9 in the Reference Manual for details."

And said section 5.9 should be updated too: "The objects need not have the
same type. If both are numbers or strings, they are converted to a common
type. Otherwise, objects of different builtin types always compare
unequal, and are ordered consistently but arbitrarily. You can control
comparison behavior of objects of non-builtin types by defining a __cmp__
method or rich comparison methods like __gt__, described in section 3.4."

I hope this helps a bit. Your performance issues don't have to do with the
*definition* of equal or not equal, only with how someone decided to write
the mpz class.

--
Gabriel Genellina

May 15 '07 #31

P: n/a
On May 15, 12:30 am, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
wrote:
On Tue, 15 May 2007 01:37:07 -0300, mensana...@aol.com
<mensana...@aol.com> wrote:
<quote emphasis added>
Sec 2.2.3:
Objects of different types, *--->except<---* different numeric types
and different string types, never compare equal;
</quote>
The exceptions you mean are not exceptions to "'X==Y' means 'X equals
Y'".
I never said they were. I said they were exceptions to
"Objects of different types never compare equal".

This is an unfortunate wording, and perhaps should read: "For most builtin
types, objects of different types never compare equal; such objects are
ordered consistently but arbitrarily (so that sorting a heterogeneous
sequence yields a consistent result). The exceptions being different
numeric types and different string types, that have a special treatment;
see section 5.9 in the Reference Manual for details."

And said section 5.9 should be updated too: "The objects need not have the
same type. If both are numbers or strings, they are converted to a common
type.
Except when they aren't.
>>> import gmpy
>>> a = 2**177149-1
>>> b = gmpy.mpz(2**177149-1)
>>> a==b
True
>>> print '%d' % (b)
Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    print '%d' % (b)
TypeError: int argument required

So although the comparison operator is smart enough to realize
the equivalency of numeric types and do the type conversion,
the print statement isn't so smart.
Otherwise, objects of different builtin types always compare
unequal, and are ordered consistently but arbitrarily. You can control
comparison behavior of objects of non-builtin types by defining a __cmp__
method or rich comparison methods like __gt__, described in section 3.4."

I hope this helps a bit. Your performance issues don't have to do with the
*definition* of equal or not equal,
I didn't say that, I said the performance issues were related
to type conversion. Can you explain how the "definition" of
equal does not involve type conversion?
only with how someone decided to write the mpz class.
I'm beginning to think there's a problem there.
>
--
Gabriel Genellina

May 15 '07 #32

P: n/a
On Tue, 15 May 2007 14:01:20 -0300, me********@aol.com
<me********@aol.com> wrote:
On May 15, 12:30 am, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
wrote:
And said section 5.9 should be updated too: "The objects need not have the
same type. If both are numbers or strings, they are converted to a common
type.

Except when they aren't.
I think you don't get the difference between a builtin object, fully under
the Python developers' control, and a user-defined class that can behave
arbitrarily at the wish of its writer and about which the Python
documentation can barely say a word.
The docs say how the Python interpreter will try to compare objects
(invoke the rich comparison methods, invoke __cmp__, etc.) and how the
*builtin* objects behave. For other objects, it's up to the object
*writer* to provide such methods, and he can do whatever he wishes:

py> class Reversed(int):
...     def __lt__(self, other): return cmp(int(self),other)>0
...     def __gt__(self, other): return cmp(int(self),other)<0
...     def __le__(self, other): return cmp(int(self),other)>=0
...     def __ge__(self, other): return cmp(int(self),other)<=0
...
py> j=Reversed(6)
py> j==6
True
py> j>5
False
py> j>10
True
py> j<=5
True

You can't blame Python for this.
>>> import gmpy
>>> a = 2**177149-1
>>> b = gmpy.mpz(2**177149-1)
>>> a==b
True
>>> print '%d' % (b)

Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    print '%d' % (b)
TypeError: int argument required

So although the comparison operator is smart enough to realize
the equivalency of numeric types and do the type conversion,
the print statement isn't so smart.
This is up to the gmpy designers/writers/maintainers. Anyone writing a
class chooses which features to implement, which ones to omit, how to
implement them, etc. The code may contain bugs, may not be efficient, may
not behave exactly as the users expect, may not have anticipated all usage
scenarios, and so on. In this case, the gmpy writers have probably chosen
not to allow conversion to int, and they may have good reasons for that
(I don't know which platform you are working on, but I suspect that your
b object is somewhat larger than sys.maxint...).
Otherwise, objects of different builtin types always compare
unequal, and are ordered consistently but arbitrarily. You can control
comparison behavior of objects of non-builtin types by defining a __cmp__
method or rich comparison methods like __gt__, described in section 3.4."

I hope this helps a bit. Your performance issues don't have to do with the
*definition* of equal or not equal,

I didn't say that, I said the performance issues were related
to type conversion. Can you explain how the "definition" of
equal does not involve type conversion?
There is no type conversion involved for user defined classes, *unless*
the class writer chooses to do so.
Let's invent some new class Number; they can be added and have basic
str/repr support

py> class Number(object):
...   def __init__(self, value): self.value=value
...   def __add__(self, other): return Number(self.value+other.value)
...   def __str__(self): return str(self.value)
...   def __repr__(self): return 'Number(%s)' % self.value
...
py> x = Number(2)
py> y = Number(3)
py> z = x+y
py> z
Number(5)
py> z == 5
False
py> 5 == z
False
py> z == Number(5)
False
py> int(z)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: int() argument must be a string or a number
py> "%d" % z
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: int argument required

You can't compare them to anything or convert them to an integer; neither works yet.
Let's add "int conversion" first:

py> Number.__int__ = lambda self: int(self.value)
py> int(z)
5
py> "%d" % z
'5'
py> z == 5
False
py> 5 == z
False

Ok, a Number knows how to convert itself to integer, but still can't be
compared successfully to anything. (Perhaps another language would try to
convert automagically z to int, to compare against 5, but not Python).
Let's add basic comparison support:

py> Number.__cmp__ = lambda self, other: cmp(self.value, other.value)
py> z == Number(5)
True
py> z > Number(7)
False
py> z == z
True
py> z == 5
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 1, in <lambda>
AttributeError: 'int' object has no attribute 'value'

Now, a Number can be compared to another Number, but still not compared to
integers. Let's make the comparison a bit smarter (uhm, I'll write it as a
regular function because it's getting long...)

py> def NumberCmp(self, other):
...   if isinstance(other, Number): return cmp(self.value, other.value)
...   else: return cmp(self.value, other)
...
py> Number.__cmp__ = NumberCmp
py> z == 5
True
py> z == 6
False
py> 5 == z
True

As you can see, until I explicitly wrote some code to do the comparison
and to allow other types of comparands, Python did not "convert" anything.
If you find that some class appears to do a type conversion when comparing
instances, it's because the class writer has explicitly coded it that
way, not because Python does the conversion automagically.
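
The same point can be sketched in Python 3 syntax, where `__cmp__` and `cmp()` no longer exist and equality is opted into through `__eq__`. This is only an illustration: the `Number` class below is a hypothetical rewrite of the one above, not code from gmpy or from the standard library.

```python
class Number:
    """Hypothetical wrapper class, mirroring the Number example above."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        # Explicit opt-in: int(Number(5)) works only because this method exists.
        return int(self.value)

    def __eq__(self, other):
        # Explicit opt-in: accept other Numbers and plain numeric values.
        if isinstance(other, Number):
            return self.value == other.value
        if isinstance(other, (int, float)):
            return self.value == other
        # Anything else: tell Python we do not know how to compare.
        return NotImplemented

    def __hash__(self):
        # Keep the hash consistent with __eq__.
        return hash(self.value)


z = Number(5)
print(z == 5)          # True
print(5 == z)          # True, via the reflected call to Number.__eq__
print(z == Number(5))  # True
print(z == "5")        # False, no comparison with str was written
```

Remove the `isinstance(other, (int, float))` branch and `z == 5` goes back to being False: nothing is converted unless the class author writes the code for it.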
>only with how someone decided to write the mpz class.
I'm beginning to think there's a problem there.
Yes: you don't recognize that gmpy is not a builtin package, it's an
external package, and its designers/writers/implementors/coders/whatever
decide how it will behave, not Python itself nor the Python developers.

--
Gabriel Genellina

May 16 '07 #33

P: n/a
me********@aol.com wrote:
On May 12, 11:02 pm, Steven D'Aprano
[ ... ]
>
But you can't trust a==d returning True to mean a and d are
"equal". To say the comparison means the two objects are
equal is misleading, in other words, wrong. It only takes one
turd to spoil the whole punchbowl.
Unfortunately that is the very *definition* of "equal".
>gmpy.mpz(1) on the other hand, is both a numeric type and a custom class.
It is free to define equal any way that makes sense, and it treats itself
as a numeric type and therefore says that it is equal to 1, just like 1.0
and 1+0j are equal to 1.

They are equal in the mathematical sense, but not otherwise.
And to think that makes no difference is to be naive.
Perhaps so, but you are a long way from the original question now!

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden

May 16 '07 #34

P: n/a
On Mon, 14 May 2007 11:41:21 -0700, me********@aol.com wrote:
On May 13, 8:24 am, Steven D'Aprano
<s...@REMOVE.THIS.cybersource.com.au> wrote:
>On Sat, 12 May 2007 21:50:12 -0700, mensana...@aol.com wrote:

I intended to reply to this yesterday, but circumstances (see timeit
results) prevented it.
Actually, it's this statement that's non-sensical.
<quote>
"if arg==True" tests whether the object known as arg is equal to
the object known as True.
</quote>
>Not at all, it makes perfect sense. X == Y always tests whether the
argument X is equal to the object Y regardless of what X and Y are.
Except for the exceptions, that's why the statement is wrong.

But there are no exceptions.

<quote emphasis added>
Sec 2.2.3:
Objects of different types, *--->except<---* different numeric types and
different string types, never compare equal; </quote>
Yes, and all swans are white, except for the black swans from Australia,
but we're not talking about swans, nor are we talking about objects of
different type comparing unequal, we're talking about whether X == Y
means X is equal to Y.

THERE ARE NO EXCEPTIONS TO THIS, BECAUSE IT IS TRUE BY DEFINITION.

In Python, the meaning of "equal" is nothing more and nothing less than
"does X == Y return True?". End of story, there is nothing more to
discuss. If it returns True, they are equal. If it doesn't, they aren't.

If you want to drag in non-Python meanings of "equal", you are wrong to
do so. "Lizzie Windsor", "Queen Elizabeth the Second", "the Queen of
England" and "Her Royal Majesty, Queen Elizabeth II" are all equal in the
sense that they refer to the same person, but it would be crazy to expect
Python to compare those strings equal.

If you want to complain that lists and tuples should compare equal if
their contents are the same, that's a different issue. I don't believe
you'll have much support for that.

If you want to complain that numeric types shouldn't compare equal, so
that 1.0 != 1 != 1L != gmpy.mpz(1), that's also a different issue. I
believe you'll have even less support for that suggestion.
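
A minimal sketch of how the built-in types behave, written for Python 3 (which has no separate `1L` long type); `Fraction` stands in here as one more numeric type and is not part of the original discussion:

```python
from fractions import Fraction

# Built-in numeric types of different types compare equal when their values match.
print(1 == 1.0 == 1 + 0j == Fraction(1, 1))   # True

# Equal numbers also hash alike, so a set keeps only one of them.
numeric_set = {1, 1.0, 1 + 0j, Fraction(1, 1)}
print(len(numeric_set))                       # 1

# A list and a tuple with the same contents are not equal...
print([1, 2] == (1, 2))                       # False
# ...unless one is explicitly converted first.
print(list((1, 2)) == [1, 2])                 # True
```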

[snip]

No, they are not "equal".

Of course they are. It says so right there: "a equals d" is true.

Ok, but they are an exception to the rule "different types compare
False".
You are only quoting part of the rule. The rule says that numeric types
and strings are not included in the "different types" clause. If you
quote the full rule, you will see that it is not an exception to the
rule, it matches perfectly.

Although, the rule as given is actually incomplete, because it only
applies to built-in types. It does not apply to classes, because the
class designer has complete control over the behaviour of his class. If
the designer wants his class to compare equal to lists on Wednesdays and
unequal on other days, he can. (That would be a stupid thing to do, but
possible.)
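
As a sketch of how far that control goes, here is the Wednesday example in Python 3 syntax. The class name `WednesdayList` and its `today` parameter are invented for illustration; `today` exists only so the behaviour can be shown on any day of the week.

```python
import datetime


class WednesdayList:
    """Hypothetical class: compares equal to a list only on Wednesdays."""

    def __init__(self, items, today=None):
        self.items = list(items)
        # Injectable date; falls back to the real current date when omitted.
        self._today = today

    def __eq__(self, other):
        today = self._today or datetime.date.today()
        if isinstance(other, list):
            # date.weekday() returns 2 for Wednesday (Monday is 0).
            return today.weekday() == 2 and self.items == other
        return NotImplemented


wednesday = datetime.date(2007, 5, 16)   # a Wednesday
thursday = datetime.date(2007, 5, 17)

print(WednesdayList([1, 2], today=wednesday) == [1, 2])  # True
print(WednesdayList([1, 2], today=thursday) == [1, 2])   # False
```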
[snip]
The ints
ALWAYS have to be coerced to mpzs to perform arithmetic and this
takes time...LOTS of it.

Really? Just how much time?

Can't say, had to abort the following. Returns the count of n/2 and 3n+1
operations [1531812, 854697].
Maybe you should use a test function that isn't so insane then. Honestly,
if you want to time something, time something that actually completes!
You don't gain any accuracy by running a program for twenty hours instead
of twenty minutes.

[snip functions generating the Collatz sequence]

>timeit.Timer("x == y", "import gmpy; x = 1; y = gmpy.mpz(1)").repeat()
timeit.Timer("x == y", "x = 1; y = 1").repeat()

I don't have gmpy installed here,

Good Lord! How do you solve a Linear Congruence? :-)
In my head of course. Don't you?

*wink*

>so I can't time it, but I look forward to seeing the results, if you
would be so kind.

I had a lot of trouble with this, but I think I finally got a handle on
it. I had to abort the previous test after 20+ hours and abort a second
test (once I figured out to do your example) on another machine after
14+ hours. I had forgotten just how significant the difference is.

import timeit

## t = timeit.Timer("a == b", "a = 1; b = 1")
## u = timeit.Timer("c == d", "import gmpy; c = 1; d = gmpy.mpz(1)")
## t.repeat()
## [0.22317417437132372, 0.22519314605627253, 0.22474588250741367]
## u.repeat()
## [0.59943819675405763, 0.5962260566636246, 0.60122920650529466]

Comparisons between ints take about 0.2 microseconds, compared to about
0.6 microseconds for small gmpy.mpz values. That's an optimization worth
considering, but certainly not justifying your claim that one should
NEVER compare an int and a mpz "in a loop". If the rest of the loop takes
five milliseconds, who cares about a fraction of a microsecond difference?
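
For anyone reading the numbers above: each value returned by `repeat()` is the total time in seconds for one run of `number` executions (one million by default), so the per-comparison cost is that total divided by `number`. A minimal sketch in Python 3 syntax, with a smaller `number` chosen here just so it finishes quickly:

```python
import timeit

NUMBER = 100_000   # executions per timing run (timeit's own default is 1,000,000)

# Each element of `totals` is the total time, in seconds, for NUMBER executions.
totals = timeit.Timer("a == b", "a = 1; b = 1").repeat(repeat=3, number=NUMBER)

# Divide by the execution count to get the cost of a single comparison.
per_comparison = min(totals) / NUMBER
print("%.3f microseconds per comparison" % (per_comparison * 1e6))
```

Taking the minimum of the runs is the usual choice, since the slower runs mostly reflect other activity on the machine.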

Unfortunately, this is not a very useful test, since mpz coercion
appears to vary by the size of the number involved.
No, it is a very useful test. It's not an EXHAUSTIVE test.

(By the way, you're not testing coercion. You're testing the time it
takes to compare the two. There may or may not be any coercion involved.)

Although changing t to

## t = timeit.Timer("a == b", "a = 2**177149-1; b = 2**177149-1")

still produces tractable results
## t.repeat()
## [36.323597552202841, 34.727026758987506, 34.574566320579862]
About 36 microseconds per comparison, for rather large longints.

the same can't be said for mpz coercion:

## u = timeit.Timer("c == d", "import gmpy; c = 2**177149-1; d = gmpy.mpz(2**177149-1)")
## u.repeat()
## *ABORTED after 14 hours*
This tells us that a comparison between a large longint and a large gmpy.mpz
value takes a minimum of 14 hours divided by three million, or roughly 17
milliseconds. That's horribly expensive if you have a lot of them.
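
The arithmetic behind that estimate, as a short sketch. It assumes the defaults that `repeat()` had at the time: three runs of one million executions each.

```python
REPEAT = 3            # timing runs requested by repeat()
NUMBER = 1_000_000    # executions per run (timeit's default)

elapsed_seconds = 14 * 60 * 60      # 14 hours before the run was aborted
executions = REPEAT * NUMBER        # 3,000,000 comparisons requested

# The run never finished, so this is only a lower bound on the cost of one comparison.
lower_bound_ms = elapsed_seconds / executions * 1000
print(round(lower_bound_ms, 1))     # 16.8
```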

It isn't clear _why_ the comparison takes so long.
[snip]
And, just for laughs, I compared mpzs to mpzs,

s = 'import gmpy; a = gmpy.mpz(%d); b = gmpy.mpz(%d)' % (n,n)

which ended up faster than comparing ints to ints.
I'm hardly surprised. If speed is critical, gmpy is likely to be faster
than anything you can do in pure Python.

[snip]

>Even if it is terribly slow, that's just an implementation detail. What
happens when Python 2.7 comes out (or Python 3.0 or Python 99.78) and
coercion from int to mpz is lightning fast? Would you then say "Well,
int(1) and mpz(1) used to be unequal, but now they are equal?".

Are you saying I should be unconcerned about implementation details?
That it's silly of me to be concerned about implementation side effects
due to mis-matched types?
Of course not. But the discussion isn't about optimization, that's just
an irrelevant side-track.

>Me, I'd say they always were equal, but previously it used to be slow
to coerce one to the other.

So, when you're giving advice to the OP you don't feel any need to point
this out? That's all I'm trying to do, supply some "yes, but you should
be aware of..." commentary.
Why on Earth would I need to mention gmpy.mpz()? Does the OP even use
gmpy? You were the one who brought gmpy into the discussion, not him. Why
not give him a lecture about not repeatedly adding strings together, or
using << instead of multiplication by two, or any other completely
irrelevant optimization? My favorite, by the way, is that you can save
anything up to an hour of driving time by avoiding Hoddle Street during
peak hour and using the back-streets through Abbotsford, next to Yarra
Bend Park and going under the Eastern Freeway. Perhaps I should have
raised that as well?

>In any case, what you describe is a local optimization. It's probably a
good optimization, but in no way, shape or form does it imply that
mpz(1) is not equal to 1.

It's a different type. It is an exception to the "different types
compare False" rule.
What does this have to do with your ridiculous claim that mpz(1) is not
equal to 1? It clearly is equal.

That exception is not without cost, the type mis-match
causes coercion.
Any comparison has a cost. Sometimes it's a lot, sometimes a little. That
has nothing to do with equality.
>There's nothing false about it. Ask any mathematician, does 1 equal
1.0, and they will say "of course".

And if you ask any mathematician, he'll say that (1,) is equal to [1].
I'd like to find the mathematician who says that. The first thing he'd
say is "what is this (1,) notation you are using?" and the second thing
he'd ask is "equal in what sense?".

Perhaps you should ask a mathematician if the set {1, 2} and the vector
[1, 2] are equal, and if either of them are equal to the coordinate pair
(1, 2).

That's the difference between a mathematician and a programmer. A
programmer will say "of course not, the int has to be coerced."
A C programmer maybe.

[snip]
>Numeric values are automatically coerced because that's more practical.
That's a design decision, and it works well.

And I'm not saying it shouldn't be that way. But when I wrote my Collatz
Functions library, I wasn't aware of the performance issues when doing
millions of loop cycles with numbers having millions of digits. I only
found that out later. Would I have gotten a proper answer on this
newsgroup had I asked here? Sure doesn't look like it.
If you had asked _what_? Unless you tell me what question you asked, how
can anyone guess what answer you would have received?

If you had asked a question about optimization, you surely would have
received an answer about optimization.

If you had asked about string concatenation, you would have received an
answer about string concatenation.

If you had asked a question about inheritance, you would have received an
answer about inheritance.

See the pattern?

BTW, in reviewing my Collatz Functions library, I noticed a coercion I
had overlooked, so as a result of this discussion, my library is now
slightly faster. So some good comes out of this argument after all.

>As for gmpy.mpz, since equality tests are completely under the control
of the class author, the gmpy authors obviously wanted mpz values to
compare equal with ints.

And they chose to do a silent coercion rather than raise a type
exception.
It says right in the gmpy documentation that this coercion will be
performed.
What it DOESN'T say is what the implications of this silent coercion
are.
OF COURSE a coercion takes time. This is Python, where everything is a
rich object, not some other language where a coercion merely tells the
compiler to consider bytes to be some other type. If you need your hand
held to the point that you need somebody to tell you that operations take
time, maybe you need to think about changing professions.

The right way to do this is to measure first, then worry about
optimizations. The wrong way is to try to guess the bottlenecks ahead of
time. The worst way is to expect other people to tell you where your
bottlenecks are ahead of time.
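
One way to "measure first" is the standard library profiler. The sketch below is in Python 3 syntax; `slow_step`, `fast_step` and `workload` are made-up stand-ins for real code, not anything from the Collatz library being discussed.

```python
import cProfile
import io
import pstats


def slow_step(n):
    # Hypothetical hot spot: deliberately wasteful string building.
    text = ""
    for i in range(n):
        text += str(i)
    return len(text)


def fast_step(n):
    return n * 2


def workload():
    return slow_step(20_000) + fast_step(20_000)


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Sort by cumulative time so the most expensive call paths come first.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The report shows where the time actually went, which is the only reliable basis for deciding what to optimize.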
>
>Since both lists and tuples are containers, neither are strings or
numeric types, so the earlier rule applies: they are different
types, so they can't be equal.
But you can't trust a==d returning True to mean a and d are "equal".

What does it mean then?

It means they are mathematically equivalent, which is not the same as
being programmatically equivalent. Mathematical equivalency is what most
people want most of the time.
I think that by "most people", you mean you.

Not all of the people all of the time,
however. For example, I can calculate my Hailstone Function parameters
using either a list or a tuple:
>>> import collatz_functions as cf
>>> print cf.calc_xyz([1,2])
(mpz(8), mpz(9), mpz(5))
>>> print cf.calc_xyz((1,2))
(mpz(8), mpz(9), mpz(5))

But [1,2]==(1,2) yields False, so although they are not equal, they ARE
interchangeable in this application because they are mathematically
equivalent.
No, they aren't mathematically equivalent, because Python data structures
aren't mathematical entities. (They may be _similar to_ mathematical
entities, but they aren't the same. Just ask a mathematician about the
difference between a Real number and a float.)

They are, however, both sequences, and so if your function expects any
sequence, they will both work.
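
A small sketch of that distinction in Python 3 syntax. `sum_of_squares` is a hypothetical function used only for illustration; it is not `calc_xyz`, whose source is not shown in this thread.

```python
from collections.abc import Sequence


def sum_of_squares(values):
    """Hypothetical function that accepts any sequence of numbers."""
    if not isinstance(values, Sequence):
        raise TypeError("expected a sequence")
    return sum(v * v for v in values)


as_list = [1, 2]
as_tuple = (1, 2)

print(as_list == as_tuple)                                   # False
print(sum_of_squares(as_list) == sum_of_squares(as_tuple))   # True
```

The two arguments are interchangeable for this function because it only relies on the sequence interface, not because the objects themselves are equal.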
[snip]
>I never said that there were no efficiency differences. Comparing X with
Y might take 0.02ms or it could take 2ms depending on how much work
needs to be done. I just don't understand why you think that has a
bearing on whether they are equal or not.

The bearing it has matters when you're writing a function library that
you want to execute efficiently.
Which is true, but entirely irrelevant to the question in hand, which is
"are they equal?".

--
Steven.
May 16 '07 #35

P: n/a
On May 15, 9:23 pm, Steven D'Aprano
<ste...@REMOVE.THIS.cybersource.com.au> wrote:
On Mon, 14 May 2007 11:41:21 -0700, mensana...@aol.com wrote:
On May 13, 8:24 am, Steven D'Aprano
<s...@REMOVE.THIS.cybersource.com.au> wrote:
On Sat, 12 May 2007 21:50:12 -0700, mensana...@aol.com wrote:
I intended to reply to this yesterday, but circumstances (see timeit
results) prevented it.
Actually, it's this statement that's non-sensical.
<quote>
"if arg==True" tests whether the object known as arg is equal to
the object known as True.
</quote>
Not at all, it makes perfect sense. X == Y always tests whether the
argument X is equal to the object Y regardless of what X and Y are.
Except for the exceptions, that's why the statement is wrong.
But there are no exceptions.
<quote emphasis added>
Sec 2.2.3:
Objects of different types, *--->except<---* different numeric types and
different string types, never compare equal; </quote>

Yes, and all swans are white, except for the black swans from Australia,
but we're not talking about swans, nor are we talking about objects of
different type comparing unequal, we're talking about whether X == Y
means X is equal to Y.

THERE ARE NO EXCEPTIONS TO THIS, BECAUSE IT IS TRUE BY DEFINITION.

In Python, the meaning of "equal" is nothing more and nothing less than
"does X == Y return True?". End of story, there is nothing more to
discuss. If it returns True, they are equal. If it doesn't, they aren't.

If you want to drag in non-Python meanings of "equal", you are wrong to
do so. "Lizzie Windsor", "Queen Elizabeth the Second", "the Queen of
England" and "Her Royal Majesty, Queen Elizabeth II" are all equal in the
sense that they refer to the same person, but it would be crazy to expect
Python to compare those strings equal.

If you want to complain that lists and tuples should compare equal if
their contents are the same, that's a different issue. I don't believe
you'll have much support for that.

If you want to complain that numeric types shouldn't compare equal, so
that 1.0 != 1 != 1L != gmpy.mpz(1), that's also a different issue. I
believe you'll have even less support for that suggestion.

[snip]
No, they are not "equal".
Of course they are. It says so right there: "a equals d" is true.
Ok, but they are an exception to the rule "different types compare
False".

You are only quoting part of the rule. The rule says that numeric types
and strings are not included in the "different types" clause. If you
quote the full rule, you will see that it is not an exception to the
rule, it matches perfectly.
Uh...ok, I get it...I think.

I always thought that when someone said "all primes are
odd except 2" it meant that 2 was an exception.
But since the rule specifically says 2 is an exception,
it's not an exception.
>
Although, the rule as given is actually incomplete, because it only
applies to built-in types. It does not apply to classes, because the
class designer has complete control over the behaviour of his class. If
the designer wants his class to compare equal to lists on Wednesdays and
unequal on other days, he can. (That would be a stupid thing to do, but
possible.)

[snip]
The ints
ALWAYS have to be coerced to mpzs to perform arithmetic and this
takes time...LOTS of it.
Really? Just how much time?
Can't say, had to abort the following. Returns the count of n/2 and 3n+1
operations [1531812, 854697].

Maybe you should use a test function that isn't so insane then. Honestly,
if you want to time something, time something that actually completes!
You don't gain any accuracy by running a program for twenty hours instead
of twenty minutes.
Actually, I misunderstood the timeit tests, didn't quite realize the
difference between .timeit() and .repeat(). And although that number
may look insane, it's one I'm quite familiar with so I can tell that
everything's working right. My Collatz research tends to be on the
fringe, in places where angels fear to tread.
>
[snip functions generating the Collatz sequence]
timeit.Timer("x == y", "import gmpy; x = 1; y = gmpy.mpz(1)").repeat()
timeit.Timer("x == y", "x = 1; y = 1").repeat()
I don't have gmpy installed here,
Good Lord! How do you solve a Linear Congruence? :-)

In my head of course. Don't you?

*wink*
so I can't time it, but I look forward to seeing the results, if you
would be so kind.
I had a lot of trouble with this, but I think I finally got a handle on
it. I had to abort the previous test after 20+ hours and abort a second
test (once I figured out to do your example) on another machine after
14+ hours. I had forgotten just how significant the difference is.
import timeit
## t = timeit.Timer("a == b", "a = 1; b = 1")
## u = timeit.Timer("c == d", "import gmpy; c = 1; d = gmpy.mpz(1)")
## t.repeat()
## [0.22317417437132372, 0.22519314605627253, 0.22474588250741367]
## u.repeat()
## [0.59943819675405763, 0.5962260566636246, 0.60122920650529466]

Comparisons between ints take about 0.2 microseconds, compared to about
0.6 microseconds for small gmpy.mpz values. That's an optimization worth
considering, but certainly not justifying your claim that one should
NEVER compare an int and a mpz "in a loop". If the rest of the loop takes
five milliseconds, who cares about a fraction of a microsecond difference?
Unfortunately, this is not a very useful test, since mpz coercion
appears to vary by the size of the number involved.

No, it is a very useful test. It's not an EXHAUSTIVE test.

(By the way, you're not testing coercion. You're testing the time it
takes to compare the two. There may or may not be any coercion involved.)
But isn't the difference between t.repeat() and u.repeat() due to
coercion?
>
Although changing t to
## t = timeit.Timer("a == b", "a = 2**177149-1; b = 2**177149-1")
still produces tractable results
## t.repeat()
## [36.323597552202841, 34.727026758987506, 34.574566320579862]

About 36 microseconds per comparison, for rather large longints.
the same can't be said for mpz coercion:
## u = timeit.Timer("c == d", "import gmpy; c = 2**177149-1; d = gmpy.mpz(2**177149-1)")
## u.repeat()
## *ABORTED after 14 hours*

This tells us that a comparison between a large longint and a large gmpy.mpz
value takes a minimum of 14 hours divided by three million,
I thought it was 14 hours divided by 3. I said I didn't quite
understand how timeit worked.
or roughly 17
milliseconds each. That's horribly expensive if you have a lot of them.
Yeah, and that will be the case for large numbers which is
why I chose that insane number. In the Collatz test, that
works out to about 1.7 million loop cycles. Run time is
logarithmic to number size, so truly insane values still have
tractable run times. Provided you don't mistakenly ask for
3 million tests thinking it's 3.
>
It isn't clear _why_ the comparison takes so long.
I'm thinking there may be something wrong.
>
[snip]
And, just for laughs, I compared mpzs to mpzs,
s = 'import gmpy; a = gmpy.mpz(%d); b = gmpy.mpz(%d)' % (n,n)
which ended up faster than comparing ints to ints.

I'm hardly surprised. If speed is critical, gmpy is likely to be faster
than anything you can do in pure Python.

[snip]
Even if it is terribly slow, that's just an implementation detail. What
happens when Python 2.7 comes out (or Python 3.0 or Python 99.78) and
coercion from int to mpz is lightning fast? Would you then say "Well,
int(1) and mpz(1) used to be unequal, but now they are equal?".
Are you saying I should be unconcerned about implementation details?
That it's silly of me to be concerned about implementation side effects
due to mis-matched types?

Of course not. But the discussion isn't about optimization, that's just
an irrelevant side-track.
Me, I'd say they always were equal, but previously it used to be slow
to coerce one to the other.
So, when you're giving advice to the OP you don't feel any need to point
this out? That's all I'm trying to do, supply some "yes, but you should
be aware of..." commentary.

Why on Earth would I need to mention gmpy.mpz()? Does the OP even use
gmpy? You were the one who brought gmpy into the discussion, not him. Why
not give him a lecture about not repeatedly adding strings together, or
using << instead of multiplication by two, or any other completely
irrelevant optimization? My favorite, by the way, is that you can save
anything up to an hour of driving time by avoiding Hoddle Street during
peak hour and using the back-streets through Abbotsford, next to Yarra
Bend Park and going under the Eastern Freeway. Perhaps I should have
raised that as well?
In any case, what you describe is a local optimization. It's probably a
good optimization, but in no way, shape or form does it imply that
mpz(1) is not equal to 1.
It's a different type. It is an exception to the "different types
compare False" rule.

What does this have to do with your ridiculous claim that mpz(1) is not
equal to 1? It clearly is equal.
That exception is not without cost, the type mis-match
causes coercion.

Any comparison has a cost. Sometimes it's a lot, sometimes a little. That
has nothing to do with equality.
There's nothing false about it. Ask any mathematician, does 1 equal
1.0, and they will say "of course".
And if you ask any mathematician, he'll say that (1,) is equal to [1].

I'd like to find the mathematician who says that. The first thing he'd
say is "what is this (1,) notation you are using?" and the second thing
he'd ask is "equal in what sense?".

Perhaps you should ask a mathematician if the set {1, 2} and the vector
[1, 2] are equal, and if either of them are equal to the coordinate pair
(1, 2).
That's the difference between a mathematician and a programmer. A
programmer will say "of course not, the int has to be coerced."

A C programmer maybe.

[snip]
Numeric values are automatically coerced because that's more practical.
That's a design decision, and it works well.
And I'm not saying it shouldn't be that way. But when I wrote my Collatz
Functions library, I wasn't aware of the performance issues when doing
millions of loop cycles with numbers having millions of digits. I only
found that out later. Would I have gotten a proper answer on this
newsgroup had I asked here? Sure doesn't look like it.

If you had asked _what_? Unless you tell me what question you asked, how
can anyone guess what answer you would have received?

If you had asked a question about optimization, you surely would have
received an answer about optimization.

If you had asked about string concatenation, you would have received an
answer about string concatenation.

If you had asked a question about inheritance, you would have received an
answer about inheritance.

See the pattern?
BTW, in reviewing my Collatz Functions library, I noticed a coercion I
had overlooked, so as a result of this discussion, my library is now
slightly faster. So some good comes out of this argument after all.
As for gmpy.mpz, since equality tests are completely under the control
of the class author, the gmpy authors obviously wanted mpz values to
compare equal with ints.
And they chose to do a silent coercion rather than raise a type
exception.
It says right in the gmpy documentation that this coercion will be
performed.
What it DOESN'T say is what the implications of this silent coercion
are.

OF COURSE a coercion takes time. This is Python, where everything is a
rich object, not some other language where a coercion merely tells the
compiler to consider bytes to be some other type. If you need your hand
held to the point that you need somebody to tell you that operations take
time, maybe you need to think about changing professions.

The right way to do this is to measure first, then worry about
optimizations. The wrong way is to try to guess the bottlenecks ahead of
time. The worst way is to expect other people to tell you where your
bottlenecks are ahead of time.
Since both lists and tuples are containers, neither are strings or
numeric types, so the earlier rule applies: they are different
types, so they can't be equal.
But you can't trust a==d returning True to mean a and d are "equal".
What does it mean then?
It means they are mathematically equivalent, which is not the same as
being programmatically equivalent. Mathematical equivalency is what most
people want most of the time.

I think that by "most people", you mean you.
Not all of the people all of the time,
however. For example, I can calculate my Hailstone Function parameters
using either a list or a tuple:
>>> import collatz_functions as cf
>>> print cf.calc_xyz([1,2])
(mpz(8), mpz(9), mpz(5))
>>> print cf.calc_xyz((1,2))
(mpz(8), mpz(9), mpz(5))
But [1,2]==(1,2) yields False, so although they are not equal, they ARE
interchangeable in this application because they are mathematically
equivalent.

No, they aren't mathematically equivalent, because Python data structures
aren't mathematical entities. (They may be _similar to_ mathematical
entities, but they aren't the same. Just ask a mathematician about the
difference between a Real number and a float.)

They are, however, both sequences, and so if your function expects any
sequence, they will both work.

[snip]
I never said that there were no efficiency differences. Comparing X with
Y might take 0.02ms or it could take 2ms depending on how much work
needs to be done. I just don't understand why you think that has a
bearing on whether they are equal or not.
The bearing it has matters when you're writing a function library that
you want to execute efficiently.

Which is true, but entirely irrelevant to the question in hand, which is
"are they equal?".
Hey, here's an idea...let's forget the whole thing.
>
--
Steven.

May 16 '07 #36

P: n/a
On May 15, 7:07 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
wrote:
On Tue, 15 May 2007 14:01:20 -0300, mensana...@aol.com
<mensana...@aol.com> wrote:
On May 15, 12:30 am, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
wrote:
And said section 5.9 should be updated too: "The objects need not have the
same type. If both are numbers or strings, they are converted to a common
type.
Except when they aren't.

I think you don't get the difference between a builtin object, fully under
the Python developers' control, and a user defined class that can behave
arbitrarily at the wish of its writer and about which the Python documentation
can barely say a word.
The docs say how the Python interpreter will try to compare objects
(invoke the rich comparison methods, invoke __cmp__, etc) and how the
*builtin* objects behave. For other objects, it's up to the object
*writer* to provide such methods, and he can do whatever he wishes:

py> class Reversed(int):
...   def __lt__(self, other): return cmp(int(self),other)>0
...   def __gt__(self, other): return cmp(int(self),other)<0
...   def __le__(self, other): return cmp(int(self),other)>=0
...   def __ge__(self, other): return cmp(int(self),other)<=0
...
py>
py> j=Reversed(6)
py> j==6
True
py> j>5
False
py> j>10
True
py> j<=5
True

You can't blame Python for this.
>>> import gmpy
>>> a = 2**177149-1
>>> b = gmpy.mpz(2**177149-1)
>>> a==b
True
>>> print '%d' % (b)
Traceback (most recent call last):
File "<pyshell#4>", line 1, in <module>
print '%d' % (b)
TypeError: int argument required
So although the comparison operator is smart enough to realize
the equivalency of numeric types and do the type conversion,
the print statement isn't so smart.

This is up to the gmpy designers/writers/maintainers. Anyone writing a
class chooses which features to implement, which ones to omit, how to
implement them, etc. The code may contain bugs, may not be efficient, may
not behave exactly as the users expect, may not have anticipated all usage
scenarios, and so on. In this case, the gmpy writers have probably chosen
not to allow conversion to int, and they may have good reasons not to do
that (I don't know what platform you are working on, but I suspect that your
b object is somewhat larger than sys.maxint...).
Then how does this work?
>>> print '%d' % (long(gmpy.mpz(2**177149-1)))
1454...<53320 digits snipped>...3311

I honestly don't understand why there's a problem here.
If print can handle arbitrary precision longs without
a problem, why does it fail on mpzs > sys.maxint?
If the gmpy writers are not allowing the conversion,
then why do small mpz values work? Something smells
inconsistent here.

How is it that
>>> print '%d' % (1.0)
1

doesn't make a type mismatch? Obviously, the float
got changed to an int and this had nothing to do with
gmpy. Is it the print process responsible for doing
the conversion? Maybe I should say invoking the
conversion? Maybe the gmpy call tries to literally
convert to an integer rather than sneakily substitute
a long?

How else can this phenomenon be explained?
Otherwise, objects of different builtin types always compare
unequal, and are ordered consistently but arbitrarily. You can control
comparison behavior of objects of non-builtin types by defining a __cmp__
method or rich comparison methods like __gt__, described in section 3.4."
I hope this helps a bit. Your performance issues don't have to do with the
*definition* of equal or not equal,
I didn't say that, I said the performance issues were related
to type conversion. Can you explain how the "definition" of
equal does not involve type conversion?

There is no type conversion involved for user defined classes, *unless*
the class writer chooses to do so.
Let's invent a new class Number; its instances can be added and have basic
str/repr support:

py> class Number(object):
...     def __init__(self, value): self.value=value
...     def __add__(self, other): return Number(self.value+other.value)
...     def __str__(self): return str(self.value)
...     def __repr__(self): return 'Number(%s)' % self.value
...
py> x = Number(2)
py> y = Number(3)
py> z = x+y
py> z
Number(5)
py> z == 5
False
py> 5 == z
False
py> z == Number(5)
False
py> int(z)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: int() argument must be a string or a number
py> "%d" % z
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: int argument required

You can't compare them to anything, convert to integer, still nothing.
Let's add "int conversion" first:

py> Number.__int__ = lambda self: int(self.value)
py> int(z)
5
py> "%d" % z
'5'
py> z == 5
False
py> 5 == z
False

Ok, a Number knows how to convert itself to integer, but still can't be
compared successfully to anything. (Perhaps another language would try to
convert automagically z to int, to compare against 5, but not Python).
Let's add basic comparison support:

py> Number.__cmp__ = lambda self, other: cmp(self.value, other.value)
py> z == Number(5)
True
py> z > Number(7)
False
py> z == z
True
py> z == 5
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 1, in <lambda>
AttributeError: 'int' object has no attribute 'value'

Now, a Number can be compared to another Number, but still not compared to
integers. Let's make the comparison a bit smarter (uhm, I'll write it as a
regular function because it's getting long...)

py> def NumberCmp(self, other):
...     if isinstance(other, Number): return cmp(self.value, other.value)
...     else: return cmp(self.value, other)
...
py> Number.__cmp__ = NumberCmp
py> z == 5
True
py> z == 6
False
py> 5 == z
True

As you can see, until I explicitly wrote some code to do the comparison
and to allow other types of comparands, Python did not "convert" anything.
If you find that some class appears to do a type conversion when comparing
instances, it's because the class writer has explicitly coded it that
way, not because Python does the conversion automagically.
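
The same point can be sketched in Python 3 syntax. This is only an illustration under stated assumptions: `__cmp__` and `cmp()` do not exist in Python 3, so the sketch uses `__eq__` instead, and the choice to accept `int` comparands is made by the class, not by the interpreter. Returning `NotImplemented` tells the interpreter to try the other operand's method, which is how `5 == z` ends up calling `Number.__eq__`.

```python
class Number:
    """Illustrative class; the comparison rules are chosen by its writer."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        # Explicitly coded: compare against another Number ...
        if isinstance(other, Number):
            return self.value == other.value
        # ... or against a plain int, because this class chooses to allow it.
        if isinstance(other, int):
            return self.value == other
        # Anything else: let the interpreter try the other operand.
        return NotImplemented

    def __hash__(self):
        # Defining __eq__ disables the default hash, so restore one.
        return hash(self.value)

    def __repr__(self):
        return "Number(%r)" % (self.value,)


z = Number(5)
print(z == 5)          # Number.__eq__(z, 5)
print(5 == z)          # int.__eq__ gives up, then Number.__eq__(z, 5) runs
print(z == Number(5))
print(z == "5")        # both sides give up; falls back to identity
```

Without the `isinstance(other, int)` branch, `z == 5` and `5 == z` would both be False, which matches the behaviour shown in the session above before any comparison code was written.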
Ok, ok. But how does the subroutine that the class
writer created to do the actual conversion get invoked?
only with how someone decided to write the mpz class.
I'm beginning to think there's a problem there.

Yes: you don't recognize that gmpy is not a builtin package, it's an
external package, and its designers/writers/implementors/coders/whatever
decide how it will behave, not Python itself nor the Python developers.

--
Gabriel Genellina

May 16 '07 #37

On Wed, 16 May 2007 03:16:59 -0300, me********@aol.com
<me********@aol.com> wrote:
On May 15, 7:07 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
wrote:
>>> import gmpy
>>> a = 2**177149-1
>>> b = gmpy.mpz(2**177149-1)
>>> a==b
True
>>> print '%d' % (b)
Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    print '%d' % (b)
TypeError: int argument required
So although the comparison operator is smart enough to realize
the equivalency of numeric types and do the type conversion,
the print statement isn't so smart.

This is up to the gmpy designers/writers/maintainers. Anyone writing a
class chooses which features to implement, which ones to omit, how to
implement them, etc. The code may contain bugs, may not be efficient, may
not behave exactly as the users expect, may not have anticipated all usage
scenarios, a long etc. In this case, probably the gmpy writers have chosen
not to allow to convert to int, and they may have good reasons to not do
that (I don't know what platform are you working in, but I feel that your
b object is somewhat larger than sys.maxint...).

Then how does this work?
>>> print '%d' % (long(gmpy.mpz(2**177149-1)))
1454...<53320 digits snipped>...3311

I honestly don't understand why there's a problem here.
If print can handle arbitrary precision longs without
a problem, why does it fail on mpzs > sys.maxint?
If the gmpy writers are not allowing the conversion,
then why do small mpz values work? Something smells
inconsistent here.
Python (builtin) "integral numbers" come in two flavors: int and long.
ints correspond to the C `long` type usually, and have a limited range, at
least from -2**31 to 2**31-1; most operations have hardware support (or at
least it's up to the C compiler). Long integers are a totally different
type, they have unlimited range but are a lot slower, and all operations
must be done "by hand". See http://docs.python.org/ref/types.html

If you say "%d" % something, Python first tries to see if `something` is a
long integer -not to *convert* it to a long integer, just to see if the
object *is* a long integer. If it's a long, it's formatted accordingly.
If not, Python sees if `something` is a plain integer. If not, it sees if
it's a number (in this context, that means that the structure describing
its type contains a non-NULL tp_as_number member) and tries to *convert*
it to an integer. Notice that if the object was not originally a long
integer, no attempt is made to convert it to a long using the nb_long
member - just a plain integer conversion is attempted.
It's at this stage that a large mpz object may fail - when its value can't
fit in a plain integer, it raises an OverflowError and the "%d" formatting
fails.
If you force a conversion to long integer, using long(mpz(...)) as above,
the % operator sees a long integer from start and it can be formatted
without problems.
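
The order of checks described above can be modelled in a few lines. This is a rough sketch, not CPython's code: it is written for Python 3, where the separate `long` type no longer exists, so the names `Long`, `FakeMpz`, `PLAIN_INT_MIN`, `PLAIN_INT_MAX` and `legacy_percent_d` are all hypothetical stand-ins, and a 32-bit plain integer range is assumed.

```python
# Assumed range of a Python 2 "plain" int on a 32-bit platform.
PLAIN_INT_MIN = -2**31
PLAIN_INT_MAX = 2**31 - 1


class Long(int):
    """Hypothetical stand-in for Python 2's separate `long` type."""


class FakeMpz(object):
    """Hypothetical number-like object that can convert itself on request."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        return self.value


def legacy_percent_d(value):
    """Model of the described "%d" logic: check for long, else convert to int."""
    # Step 1: is it already a long? Then format it directly, no conversion.
    if isinstance(value, Long):
        return str(int(value))
    # Step 2: otherwise it must at least be a number ...
    if not hasattr(type(value), "__int__"):
        raise TypeError("int argument required")
    # Step 3: ... and only a *plain integer* conversion is attempted.
    as_int = int(value)
    if not PLAIN_INT_MIN <= as_int <= PLAIN_INT_MAX:
        # The value does not fit in a plain int, so formatting fails.
        raise TypeError("int argument required")
    return str(as_int)


print(legacy_percent_d(1.0))                     # small float: converted
print(legacy_percent_d(Long(2**100)))            # already a long: formatted
try:
    legacy_percent_d(FakeMpz(2**100))            # too big for a plain int
except TypeError as exc:
    print("failed:", exc)
print(legacy_percent_d(Long(int(FakeMpz(2**100)))))  # explicit conversion first
```

The last line corresponds to the long(mpz(...)) call shown earlier: the value arrives as a long, so the first check succeeds and no plain-integer conversion is ever tried.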

I don't know if this asymmetric behavior is a design decision, a historic
relic, a change in protocol (is nb_int allowed now to return a
PyLongObject, but not before?), a "nobody cares" issue, or just a bug.
Perhaps someone else can give an opinion - and certainly I may be wrong, I
had never looked at the PyString_Format function internal details before
(thanks for providing an excuse!).

As a workaround you can always write "%d" % long(mpznumber) when you want
to print them (or perhaps "%s" % mpznumber, which might be faster).
How is it that
>>> print '%d' % (1.0)
1

doesn't make a type mismatch? Obviously, the float
got changed to an int and this had nothing to do with
gmpy. Is it the print process responsible for doing
the conversion? Maybe I should say invoking the
conversion? Maybe the gmpy call tries to literally
convert to an integer rather than sneakily substitute
a long?
Same as above: is the argument a long integer? no. is it a number? yes.
Convert to int. No errors? Apply format.

--
Gabriel Genellina

May 16 '07 #38

On May 16, 4:12 am, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
wrote:
On Wed, 16 May 2007 03:16:59 -0300, mensana...@aol.com
<mensana...@aol.com> wrote:


On May 15, 7:07 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
wrote:
>>> import gmpy
>>> a = 2**177149-1
>>> b = gmpy.mpz(2**177149-1)
>>> a==b
True
>>> print '%d' % (b)
Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    print '%d' % (b)
TypeError: int argument required
So although the comparison operator is smart enough to realize
the equivalency of numeric types and do the type conversion,
the print statement isn't so smart.
This is up to the gmpy designers/writers/maintainers. Anyone writing a
class chooses which features to implement, which ones to omit, how to
implement them, etc. The code may contain bugs, may not be efficient, may
not behave exactly as the users expect, may not have anticipated all usage
scenarios, a long etc. In this case, probably the gmpy writers have chosen
not to allow to convert to int, and they may have good reasons to not do
that (I don't know what platform are you working in, but I feel that your
b object is somewhat larger than sys.maxint...).
Then how does this work?
>>> print '%d' % (long(gmpy.mpz(2**177149-1)))
1454...<53320 digits snipped>...3311
I honestly don't understand why there's a problem here.
If print can handle arbitrary precision longs without
a problem, why does it fail on mpzs > sys.maxint?
If the gmpy writers are not allowing the conversion,
then why do small mpz values work? Something smells
inconsistent here.

Python (builtin) "integral numbers" come in two flavors: int and long.
ints correspond to the C `long` type usually, and have a limited range, at
least from -2**31 to 2**31-1; most operations have hardware support (or at
least it's up to the C compiler). Long integers are a totally different
type, they have unlimited range but are a lot slower, and all operations
must be done "by hand". See http://docs.python.org/ref/types.html

If you say "%d" % something, Python first tries to see if `something` is a
long integer -not to *convert* it to a long integer, just to see if the
object *is* a long integer. If it's a long, it's formatted accordingly.
If not, Python sees if `something` is a plain integer. If not, it sees if
it's a number (in this context, that means that the structure describing
its type contains a non-NULL tp_as_number member) and tries to *convert*
it to an integer. Notice that if the object was not originally a long
integer, no attempt is made to convert it to a long using the nb_long
member - just a plain integer conversion is attempted.
It's at this stage that a large mpz object may fail - when its value can't
fit in a plain integer, it raises an OverflowError and the "%d" formatting
fails.
If you force a conversion to long integer, using long(mpz(...)) as above,
the % operator sees a long integer from start and it can be formatted
without problems.

I don't know if this asymmetric behavior is a design decision, a historic
relic, a change in protocol (is nb_int allowed now to return a
PyLongObject, but not before?), a "nobody cares" issue, or just a bug.
Perhaps someone else can give an opinion - and certainly I may be wrong, I
had never looked at the PyString_Format function internal details before
(thanks for providing an excuse!).
Ah, thanks for the info, I know nothing about Python internals.

That implies that although this works:
>>> print '%d' %(1234567890.0)
1234567890

this does not:
>>> print '%d' %(12345678901234567890.0)
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    print '%d' %(12345678901234567890.0)
TypeError: int argument required

So we can work around it by doing the long conversion
ourselves since print only knows how to invoke int conversion.
>>> print '%d' %(long(12345678901234567890.0))
12345678901234567168

which demonstrates the problem is not with gmpy.
As a workaround you can always write "%d" % long(mpznumber) when you want
to print them (or perhaps "%s" % mpznumber, which might be faster).
How is it that
>>> print '%d' % (1.0)
1
doesn't make a type mismatch? Obviously, the float
got changed to an int and this had nothing to do with
gmpy. Is it the print process responsible for doing
the conversion? Maybe I should say invoking the
conversion? Maybe the gmpy call tries to literally
convert to an integer rather than sneakily substitute
a long?

Same as above: is the argument a long integer? no. is it a number? yes.
Convert to int. No errors? Apply format.
Thanks again, as long as I know why the behaviour is strange,
I know how to work around it.
--
Gabriel Genellina
May 16 '07 #39

Gabriel Genellina <ga*******@yahoo.com.ar> wrote:
>>> import gmpy
>>> a = 2**177149-1
>>> b = gmpy.mpz(2**177149-1)
>>> a==b
True
>>> print '%d' % (b)
Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    print '%d' % (b)
TypeError: int argument required

So although the comparison operator is smart enough to realize
the equivalency of numeric types and do the type conversion,
the print statement isn't so smart.

This is up to the gmpy designers/writers/maintainers. Anyone writing a
class chooses which features to implement, which ones to omit, how to
implement them, etc. The code may contain bugs, may not be efficient, may
not behave exactly as the users expect, may not have anticipated all usage
scenarios, a long etc. In this case, probably the gmpy writers have chosen
not to allow to convert to int, and they may have good reasons to not do
that (I don't know what platform are you working in, but I feel that your
b object is somewhat larger than sys.maxint...).
The gmpy designer, writer and maintainer (all in the singular -- that's
me) has NOT chosen anything of the sort. gmpy.mpz does implement
__int__ and __long__ -- but '%d'%somempzinstance chooses not to call
either of them. sys.maxint has nothing to do with the case:
'%d'%somelonginstance DOES work just fine -- hey, even a *float*
instance formats just fine here (it gets truncated). I personally
consider this a bug in %d-formatting, definitely NOT in gmpy.
Alex
May 18 '07 #40

On Fri, 18 May 2007 01:48:29 -0300, Alex Martelli <al***@mac.com> wrote:
The gmpy designer, writer and maintainer (all in the singular -- that's
me) has NOT chosen anything of the sort. gmpy.mpz does implement
__int__ and __long__ -- but '%d'%somempzinstance chooses not to call
either of them. sys.maxint has nothing to do with the case:
'%d'%somelonginstance DOES work just fine -- hey, even a *float*
instance formats just fine here (it gets truncated). I personally
consider this a bug in %d-formatting, definitely NOT in gmpy.
Yes, sorry, at first I thought it was gmpy which refused to convert itself
to long. But the fault is in the string formatting code, and it was
pointed out later on this same thread. Floats have the same problem: "%d"
% 5.2 does work, but "%d" % 1e30 does not.

After digging a bit in the implementation of PyString_Format, for a "%d"
format it does:
- test if the value to be printed is actually a long integer (using
PyLong_Check). Yes? Format as a long integer.
- else, convert the value into a plain integer (using PyInt_AsLong), and
format that.
No attempt is made to *convert* the value to a long integer. I understand
that this could be a slow operation, so the various tests should be
carefully ordered, but anyway the __long__ conversion should be done.
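
For comparison, here is a small check that can be run on a current Python 3 interpreter, where `int` and `long` were merged into a single arbitrary-precision type. The class `Wrapped` is a hypothetical stand-in for an extension type that only defines `__int__`; gmpy itself is not used, so this says nothing about how any gmpy release behaves.

```python
class Wrapped(object):
    """Hypothetical number-like type that only knows how to convert itself."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        return int(self.value)


big = 2**200

small_float = "%d" % 5.2            # float is truncated, as in the thread
large_float = "%d" % 1e30           # no plain-int size limit in Python 3
wrapped = "%d" % Wrapped(big)       # the object's own conversion is used

print(small_float)
print(large_float == str(int(1e30)))
print(wrapped == str(big))
```

In this setting the conversion step that the thread asks for is effectively the only path, because there is no smaller integer type left to convert to.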

--
Gabriel Genellina

May 18 '07 #41
