pickle broken: can't handle NaN or Infinity under win32

I finally figured out why one of my apps sometimes fails under
Win32 when it always works fine under Linux: Under Win32, the
pickle module only works with a subset of floating point
values. In particular, if you try to dump/load an infinity
or NaN value, the load operation chokes:
Under Linux:

$ python
Python 2.3.4 (#2, Feb 9 2005, 14:22:48)
[GCC 3.4.1 (Mandrakelinux 10.1 3.4.1-4mdk)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

$ python pickletest.py
(inf, nan) (inf, nan)
Under Win32:

$ python
ActivePython 2.3.4 Build 233 (ActiveState Corp.) based on
Python 2.3.4 (#53, Oct 18 2004, 20:35:07) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.


$ python pickletest.py
Traceback (most recent call last):
File "pickletest.py", line 8, in ?
d = pickle.loads(s)
File "C:\PYTHON23\lib\pickle.py", line 1394, in loads
return Unpickler(file).load()
File "C:\PYTHON23\lib\pickle.py", line 872, in load
dispatch[key](self)
File "C:\PYTHON23\lib\pickle.py", line 968, in load_float
self.append(float(self.readline()[:-1]))
ValueError: invalid literal for float(): 1.#INF

I realize that this is probably due to underlying brokenness in
the Win32 libc implementation, but should the pickle module
hide such platform dependencies from the user?
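To make the failure mode concrete: the text (protocol 0) pickle writes
repr(f) for each float and reads it back with float(), so the round trip
breaks as soon as repr() emits a spelling float() refuses to parse. A
minimal sketch of just that step, assuming a Win32 CPython 2.3 whose
Microsoft C runtime spells infinity "1.#INF":

inf = 1e300 * 1e300    # overflows to infinity
s = repr(inf)          # '1.#INF' under the MS C runtime, 'inf' under glibc
f = float(s)           # raises ValueError on Win32, succeeds under glibc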

Best case, it would be nice if pickle could handle all floats
in a portable way.

Worst case, shouldn't the pickle module documentation mention
that pickling floats is non-portable or only partially implemented?

On a more immediate note, are there hooks in pickle to allow
the user to handle types that pickle can't deal with? Or, do I
have to throw out pickle and write something from scratch?

[NaN and Infinity are perfectly valid (and extremely useful)
floating point values, and not using them would require huge
complexity increases in my apps (not using them would probably
at least triple the amount of code required in some cases).]

-- Grant Edwards   grante at visi.com   Yow! Yow!
Jul 19 '05 #1
On 2005-06-21, Grant Edwards <gr****@visi.com> wrote:
I finally figured out why one of my apps sometimes fails under
Win32 when it always works fine under Linux [...]

Oh, I forgot, here's pickletest.py:

#!/usr/bin/python
import pickle

f1 = (1e300*1e300)    # overflows to infinity
f2 = f1/f1            # inf/inf yields NaN
o = (f1, f2)
s = pickle.dumps(o)
d = pickle.loads(s)

print o, d
Under Linux:

$ python pickletest.py
(inf, nan) (inf, nan)
Under Win32:

$ python pickletest.py
Traceback (most recent call last):
File "pickletest.py", line 8, in ?
d = pickle.loads(s)
File "C:\PYTHON23\lib\pickle.py", line 1394, in loads
return Unpickler(file).load()
File "C:\PYTHON23\lib\pickle.py", line 872, in load
dispatch[key](self)
File "C:\PYTHON23\lib\pickle.py", line 968, in load_float
self.append(float(self.readline()[:-1]))
ValueError: invalid literal for float(): 1.#INF

-- Grant Edwards   grante at visi.com   Yow! Here I am in 53 B.C. and all I want is a dill pickle!!
Jul 19 '05 #2
Grant Edwards wrote:
I finally figured out why one of my apps sometimes fails under
Win32 when it always works fine under Linux: Under Win32, the
pickle module only works with a subset of floating point
values. In particular, if you try to dump/load an infinity
or NaN value, the load operation chokes:

There is no completely portable way to do this. Any single platform
can have a solution, but (since the C standards don't address how
NaNs and Infs are represented) there is not a good portable way to do
the pickle / unpickle. It is nice that the exception is raised, since at
one point it was not (and a simple 1.0 was returned).
See explanations in article 654866:

http://sourceforge.net/tracker/index...70&atid=105470
$ python pickletest.py
Traceback (most recent call last): ...
File "C:\PYTHON23\lib\pickle.py", line 968, in load_float
self.append(float(self.readline()[:-1]))
ValueError: invalid literal for float(): 1.#INF

I realize that this is probably due to underlying brokenness in
the Win32 libc implementation, but should the pickle module
hide such platform dependencies from the user?

As mentioned above, there is no C standard-accessible way to
predictably build or represent NaNs, negative zeroes, or Infinities.
[NaN and Infinity are perfectly valid (and extremely useful)
floating point values, and not using them would require huge
complexity increases in my apps (not using them would probably
at least triple the amount of code required in some cases).]


You could check to see if the Python 2.5 pickling does a better
job. Otherwise, you've got your work cut out for you.

-Scott David Daniels
Sc***********@Acm.Org

Jul 19 '05 #3
On 2005-06-22, Scott David Daniels <Sc***********@Acm.Org> wrote:
I finally figured out why one of my apps sometimes fails under
Win32 when it always works fine under Linux: Under Win32, the
pickle module only works with a subset of floating point
values. In particular, if you try to dump/load an infinity
or nan value, the load operation chokes:
There is no completely portable way to do this.


Python deals with all sorts of problems for which there is no
completely portable solution. Remember: "practicality beats purity."
Any single platform can have a solution, but (since the C
standards don't address how NaNs and Infs are represented)
there is not a good portable way to do the pickle / unpickle.
Likewise, there is no completely portable Python
implementation. Any single platform can have a Python
implementation, but since the C standards don't address a
universal standard for "a computer" there is not a good
portable way to do Python. I guess we'd better give up on
Python. :)
It is nice the exception is raised, since at one point it was
not (and a simple 1.0 was returned).


That would be even worse.
[NaN and Infinity are prefectly valid (and extremely useful)
floating point values, and not using them would require huge
complexity increases in my apps (not using them would probably
at least triple the amount of code required in some cases).]


You could check to see if the Python 2.5 pickling does a better
job. Otherwise, you've got your work cut out for you.


Fixing it is really quite trivial. It takes less than a dozen
lines of code. Just catch the exception and handle it.

def load_float(self):
    s = self.readline()[:-1]
    try:
        f = float(s)
    except ValueError:
        s = s.upper()
        if s in ["1.#INF", "INF"]:
            f = 1e300*1e300
        elif s in ["-1.#INF", "-INF"]:
            f = -1e300*1e300
        elif s in ["NAN","1.#QNAN","QNAN","1.#IND","IND","-1.#IND"]:
            f = -((1e300*1e300)/(1e300*1e300))
        else:
            raise ValueError, "Don't know what to do with "+`s`
    self.append(f)

Obviously the list of accepted string values should be expanded
to include other platforms as needed. The above example
handles Win32 and glibc (e.g. Linux).

Even better, add that code to float().
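On the hooks question: nothing in pickle.py has to change to get that
behaviour; a hedged sketch of wiring the same replacement in from outside
(the SafeUnpickler name is invented here, and this assumes Python 2.x's
pickle.Unpickler keeps its opcode handlers in a class-level dispatch dict,
as 2.3's does):

import pickle
from StringIO import StringIO

class SafeUnpickler(pickle.Unpickler):
    # Private copy of the opcode dispatch table, so rebinding 'F' below
    # affects only this subclass and not the stock Unpickler.
    dispatch = pickle.Unpickler.dispatch.copy()

    def load_float(self):
        s = self.readline()[:-1]
        try:
            f = float(s)
        except ValueError:
            u = s.upper()
            if u in ("1.#INF", "INF"):
                f = 1e300*1e300
            elif u in ("-1.#INF", "-INF"):
                f = -1e300*1e300
            elif u in ("NAN", "1.#QNAN", "QNAN", "1.#IND", "IND", "-1.#IND"):
                f = -((1e300*1e300)/(1e300*1e300))
            else:
                raise ValueError("Don't know what to do with " + repr(s))
        self.append(f)

    # 'F' is the text-mode float opcode (pickle.FLOAT).
    dispatch[pickle.FLOAT] = load_float

data = pickle.dumps((1e300*1e300,))          # dumping works everywhere; it's loads() that chokes
print SafeUnpickler(StringIO(data)).load()   # drop-in replacement for pickle.loads(data)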

-- Grant Edwards   grante at visi.com   Yow! Is the EIGHTIES when they had ART DECO and GERALD McBOING-BOING lunch boxes??
Jul 19 '05 #4
Grant Edwards wrote:
On 2005-06-22, Scott David Daniels <Sc***********@Acm.Org> wrote:
...Under Win32, the pickle module only works with a subset of
floating point values. In particular ... infinity or nan ...


There is no completely portable way to do this.


Python deals with all sorts of problems for which there is no
completely portable solution. Remember: "practicality beats purity."
Any single platform can have a solution, but (since the C
standards don't address how NaNs and Infs are represented)
there is not a good portable way to do the pickle / unpickle.

...
Fixing it is really quite trivial. It takes less than a dozen
lines of code. Just catch the exception and handle it.

Since you know it is quite trivial, and I don't, why not submit a
patch resolving this issue? Be sure to include tests for all
supported Python platforms.

--Scott David Daniels
Sc***********@Acm.Org
Jul 19 '05 #5
On 2005-06-22, Scott David Daniels <Sc***********@Acm.Org> wrote:
Fixing it is really quite trivial. It takes less than a dozen
lines of code. Just catch the exception and handle it.


Since you know it is quite trivial, and I don't, why not
submit a patch resolving this issue? Be sure to include tests
for all supported Python platforms.


I'm working on it. I should have said it's trivial if you have
access to the platforms to be supported. I've tested a fix
that supports pickle streams generated under Win32 and glibc.
That's using the "native" string representation of a NaN or
Inf.

A perhaps simpler approach would be to define a string
representation for Python to use for NaN and Inf. Just because
something isn't defined by the C standard doesn't mean it can't
be defined by Python.
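To make "define a string representation" concrete on the writing side, a
hedged sketch; the SafePickler name and the 'inf'/'-inf'/'nan' spellings
are inventions of the sketch (nothing the pickle module itself defines),
and on platforms whose float() rejects those strings the reading side
still needs something like the load_float replacement earlier in the
thread:

import pickle
from types import FloatType

class SafePickler(pickle.Pickler):
    # Private copy of the type-keyed dispatch table, so only this class changes.
    dispatch = pickle.Pickler.dispatch.copy()

    def save_float(self, obj):
        inf = 1e300 * 1e300
        if self.bin:
            # Binary protocols pack the raw 8-byte double; leave them alone.
            pickle.Pickler.save_float(self, obj)
        elif obj != obj:
            self.write(pickle.FLOAT + 'nan\n')     # NaN: the only value unequal to itself
        elif obj == inf:
            self.write(pickle.FLOAT + 'inf\n')
        elif obj == -inf:
            self.write(pickle.FLOAT + '-inf\n')
        else:
            self.write(pickle.FLOAT + repr(obj) + '\n')

    dispatch[FloatType] = save_float

A dump produced this way is at least the same on every platform; whether
'inf'/'nan' are the spellings Python should bless is exactly the question
being raised here.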

-- Grant Edwards   grante at visi.com   Yow! I'm shaving!! I'M SHAVING!!
Jul 19 '05 #6
Grant Edwards wrote:
I'm working on it. I should have said it's trivial if you have
access to the platforms to be supported. I've tested a fix
that supports pickle streams generated under Win32 and glibc.
That's using the "native" string representation of a NaN or
Inf.

Several issues:
(1) The number of distinct NaNs varies among platforms. There are
quiet and signaling NaNs, negative 0, the NaN that Windows VC++
calls "Indeterminate," and so on.
(2) There is no standard-conforming way to create these values.
(3) There is no standard-conforming way to detect these values.

--Scott David Daniels
Sc***********@Acm.Org
Jul 19 '05 #7

"Grant Edwards" <gr****@visi.com> wrote in message
news:11*************@corp.supernews.com...
I'm working on it. I should have said it's trivial if you have
access to the platforms to be supported. I've tested a fix
that supports pickle streams generated under Win32 and glibc.
That's using the "native" string representation of a NaN or
Inf.

A perhaps simpler approach would be to define a string
representation for Python to use for NaN and Inf. Just because
something isn't defined by the C standard doesn't mean it can't
be defined by Python.


I believe that changes have been made to marshal/unmarshal in 2.5 CVS with
respect to NAN/INF to eliminate annoying/surprising behavior differences
between corresponding .py and .pyc files. Perhaps these revisions would be
relevant to pickle changes.

TJR

Jul 19 '05 #8
On 2005-06-22, Scott David Daniels <Sc***********@Acm.Org> wrote:
I'm working on it. I should have said it's trivial if you have
access to the platforms to be supported. I've tested a fix
that supports pickle streams generated under Win32 and glibc.
That's using the "native" string representation of a NaN or
Inf.

Several issues:

(1) The number of distinct NaNs varies among platforms.
According to the IEEE standard, there are exactly two:
signalling and quiet, and on platforms that don't implement
floating point exceptions (probably in excess of 99.9% of
Python installations), the difference between the two is moot.
There are quiet and signaling NaNs, negative 0,
Negative 0 isn't a NaN, it's just negative 0.
the NaN that Windows VC++ calls "Indeterminate," and so
on.
That's just Microsoft's way of spelling "signalling NaN."
(2) There is no standard-conforming way to create these values.
What standard are you looking at? My copy of the IEEE 754
standard is pretty clear.
(3) There is no standard-conforming way to detect these
values.


The bit patterns are defined by the IEEE 754 standard. If
there are Python-hosting platforms that don't use IEEE 754 as
the floating point representation, then that can be dealt with.

Python has _tons_ of platform-specific code in it.

Why all of a sudden is it taboo for Python to implement
something that's not universally portable and defined in a
standard? Where's the standard defining Python?

-- Grant Edwards   grante at visi.com   Yow! ... A housewife is wearing a polypyrene jumpsuit!!
Jul 19 '05 #9

"Grant Edwards" <gr****@visi.com> wrote in message
news:11*************@corp.supernews.com...
The bit patterns are defined by the IEEE 754 standard. If
there are Python-hosting platforms that don't use IEEE 754 as
the floating point representation, then that can be dealt with.

Python has _tons_ of platform-specific code in it.
More, I believe, than the current maintainers would like. Adding more
would probably require a commitment to maintain the addition (respond to
bug reports) for a few years.
Why all of a sudden is it taboo for Python to implement
something that's not universally portable and defined in a
standard?
??

Perhaps you wrote this before reading my last post reporting that some
NaN/Inf changes have already been made for 2.5. I believe that more would
be considered if properly submitted.
Where's the standard defining Python?


The Language and Library Reference Manuals at python.org.

Terry J. Reedy

Jul 19 '05 #10
Grant Edwards wrote:
On 2005-06-22, Scott David Daniels <Sc***********@Acm.Org> wrote:
Several issues:
(1) The number of distinct NaNs varies among platforms.


According to the IEEE standard, there are exactly two:
signalling and quiet, and on platforms that don't implement
floating point exceptions (probably in excess of 99.9% of
Python installations), the difference between the two is moot.

But it does not specify the representation of such NaNs.
Negative 0 isn't a NaN, it's just negative 0.

Right, but it is hard to construct in standard C.
(2) There is no standard-conforming way to create these values.

What standard are you looking at? My copy of the IEEE 754
standard is pretty clear.

I was talking about the C89 standard. In the absence of a C
standard way to create, manipulate, and test for these values,
you must implement and test platform-by-platform.
The bit patterns are defined by the IEEE 754 standard.

Perhaps this is right and I misunderstand the standard, but my
understanding is that the full bit pattern is not, in fact,
defined.
If there are Python-hosting platforms that don't use IEEE 754 as
the floating point representation, then that can be dealt with.

There are.
Python has _tons_ of platform-specific code in it.

But the _tons_ are written in C89 C.
Why all of a sudden is it taboo for Python to implement
something that's not universally portable and defined in a
standard? Where's the standard defining Python?

It is not taboo. I am trying to explain why it is not a
trivial task, but a substantial effort. If you are willing
to perform the substantial effort, good on you, and I'll help.
If you simply want to implement on the two platforms you use,
and want everyone else to implement the interface you choose,
that seems to me an unreasonable request.

--Scott David Daniels
Sc***********@Acm.Org
Jul 19 '05 #11
Scott David Daniels <Sc***********@Acm.Org> writes:
Negative 0 isn't a NaN, it's just negative 0.

Right, but it is hard to construct in standard C.


Huh? It's just a hex constant.
Jul 19 '05 #12
Paul Rubin wrote:
Scott David Daniels <Sc***********@Acm.Org> writes:
Negative 0 isn't a NaN, it's just negative 0.


Right, but it is hard to construct in standard C.

Huh? It's just a hex constant.

Well, -0.0 doesn't work, and (double)0x80000000 doesn't work,
and.... I think you have to use quirks of a compiler to create
it. And I don't know how to test for it either, x < 0.0 is
not necessarily true for negative 0.

I am not trying to say there is no way to do this. I am
trying to say it takes thought and effort on every detail,
in the definition, implementations, and unit tests.

--Scott David Daniels
Sc***********@Acm.Org
Jul 19 '05 #13
Scott David Daniels <Sc***********@Acm.Org> writes:
Negative 0 isn't a NaN, it's just negative 0.

Right, but it is hard to construct in standard C.

Huh? It's just a hex constant.

Well, -0.0 doesn't work, and (double)0x80000000 doesn't work,
and.... I think you have to use quirks of a compiler to create
it. And I don't know how to test for it either, x < 0.0 is
not necessarily true for negative 0.


Aren't we talking about IEEE 754 arithmetic? There's some specific
bit pattern(s) for -0.0 and you can assign a float variable to
such a pattern.
Jul 19 '05 #14
[with the start of US summer comes the start of 754 ranting season]

[Grant Edwards]
Negative 0 isn't a NaN, it's just negative 0.
[Scott David Daniels] Right, but it is hard to construct in standard C.
[Paul Rubin]
Huh? It's just a hex constant.

[Scott David Daniels] Well, -0.0 doesn't work,
C89 doesn't define the result of that, but "most" C compilers these
days will create a negative 0.
and (double)0x80000000 doesn't work,
In part because that's an integer <wink>, and in part because it's
only 32 bits. It requires representation casting tricks (not
conversion casting tricks like the above), knowledge of the platform
endianness, and knowledge of the platform integer sizes. Assuming the
platform uses 754 bit layout to begin with, of course.
and.... I think you have to use quirks of a compiler to create
it.
You at least need platform knowledge. It's really not hard, if you
can assume enough about the platform.
And I don't know how to test for it either, x < 0.0 is
not necessarily true for negative 0.
If it's a 754-conforming C compiler, that's necessarily false (+0 and
-0 compare equal in 754). Picking the bits apart is again the closest
thing to a portable test. Across platforms with a 754-conforming
libm, the most portable way is via using atan2(!):
>>> pz = 0.0
>>> mz = -pz
>>> from math import atan2
>>> atan2(pz, pz)
0.0
>>> atan2(mz, mz)
-3.1415926535897931

It's tempting to divide into 1, then check the sign of the infinity,
but Python stops you from doing that:
>>> 1/pz

Traceback (most recent call last):
File "<stdin>", line 1, in ?
ZeroDivisionError: float division

That can't be done at the C level either, because _some_ people run
Python with their 754 HW floating-point zero-division, overflow, and
invalid operation traps enabled, and then anything like division by 0
causes the interpreter to die. The CPython implementation is
constrained that way.

Note that Python already has Py_IS_NAN and Py_IS_INFINITY macros in
pyport.h, and the Windows build maps them to appropriate
Microsoft-specific library functions. I think it's stuck waiting on
others to care enough to supply them for other platforms. If a
platform build doesn't #define them, a reasonable but cheap attempt is
made to supply "portable" code sequences for them, but, as the
pyport.h comments note, they're guaranteed to do wrong things in some
cases, and may not work at all on some platforms. For example, the
default

#define Py_IS_NAN(X) ((X) != (X))

is guaranteed never to return true under MSVC 6.0.
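For what it's worth, the same comparison tricks are available from pure
Python; a hedged sketch, assuming IEEE-754 doubles with default rounding
(and inheriting the same caveat about runtimes, like MSVC 6.0, where the
self-comparison lies):

def is_nan(x):
    # Same idea as the Py_IS_NAN default: NaN is the only value unequal to itself.
    return x != x

def is_inf(x):
    # Apart from zero (excluded explicitly), only the infinities survive halving unchanged.
    return x == x and x != 0.0 and x * 0.5 == x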
I am not trying to say there is no way to do this. I am
trying to say it takes thought and effort on every detail,
in the definition, implementations, and unit tests.


It's par for the course -- everyone thinks "this must be easy" at
first, and everyone who persists eventually gives up. Kudos to
Michael Hudson for persisting long enough to make major improvements
here in pickle, struct and marshal for Python 2.5!
Jul 19 '05 #15
On 2005-06-22, Scott David Daniels <Sc***********@Acm.Org> wrote:
Grant Edwards wrote:
On 2005-06-22, Scott David Daniels <Sc***********@Acm.Org> wrote:
Several issues:
(1) The number of distinct NaNs varies among platforms.


According to the IEEE standard, there are exactly two:
signalling and quiet, and on platforms that don't implement
floating point exceptions (probably in excess of 99.9% of
Python installations), the difference between the two is moot.


But it does not specify the representation of such NaNs.


Yes, it does. It specifies it exactly: certain bits are ones,
certain other bits are zeros. I don't know how much more exactly
the representation can be defined.
The bit patterns are defined by the IEEE 754 standard.

Perhaps this is right and I misunderstand the standard, but my
understanding is that the full bit pattern is not, in fact,
defined.
The representations of NaNs, infinities, normalized numbers and
denormal numbers are all completely defined by the standard.
If there are Python-hosting platforms that don't use IEEE 754 as
the floating point representation, then that can be dealt with.

There are.
That's where it gets nasty.
Python has _tons_ of platform-specific code in it.

But the _tons_ are written in C89 C.
True.
It is not taboo. I am trying to explain why it is not a
trivial task, but a substantial effort.
It's trivial for platforms that obey the IEEE 754 standard.
If you are willing to perform the substantial effort, good on
you, and I'll help. If you simply want to implement on the two
platforms you use, and want everyone else to implement the
interface you choose, that seems to me an unreasonable
request.


I would think that implementing things according to the IEEE
standard and letting non-standard platforms figure out what to
do for themselves would seem a reasonable approach.

-- Grant Edwards   grante at visi.com   Yow! Now I understand the meaning of "THE MOD SQUAD"!
Jul 19 '05 #16
On 2005-06-22, Paul Rubin <http> wrote:
Negative 0 isn't a NaN, it's just negative 0.


Right, but it is hard to construct in standard C.


Huh? It's just a hex constant.


Yup. There are two ways to construct a NaN. One is to do
something like (1e300*1e300)/(1e300*1e300) and hope for the
best. The other is to assume IEEE 754 and just use 7f800000 or
7fc00000 depending on whether you want a signalling or quiet
NaN.
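For the bit-pattern route, a hedged sketch of what that looks like from
Python, assuming IEEE-754 doubles and a struct module whose standard-size
codes pass the bits through unchanged (true of modern CPython; the
2.3/2.4-era codecs rebuilt the value arithmetically, so results there may
differ):

import struct

# IEEE-754 double patterns: exponent all ones with a zero mantissa is an
# infinity; exponent all ones with a nonzero mantissa is a NaN.
pos_inf   = struct.unpack('>d', '\x7f\xf0\x00\x00\x00\x00\x00\x00')[0]
neg_inf   = struct.unpack('>d', '\xff\xf0\x00\x00\x00\x00\x00\x00')[0]
quiet_nan = struct.unpack('>d', '\x7f\xf8\x00\x00\x00\x00\x00\x00')[0]

print pos_inf, neg_inf, quiet_nan   # inf -inf nan on an IEEE platform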

-- Grant Edwards   grante at visi.com   Yow! Don't hit me!! I'm in the Twilight Zone!!!
Jul 19 '05 #17
On 2005-06-23, Paul Rubin <http> wrote:
Scott David Daniels <Sc***********@Acm.Org> writes:
>>>Negative 0 isn't a NaN, it's just negative 0.
>>
>>Right, but it is hard to construct in standard C.
> Huh? It's just a hex constant.
Well, -0.0 doesn't work, and (double)0x80000000 doesn't work,
and.... I think you have to use quirks of a compiler to create
it. And I don't know how to test for it either, x < 0.0 is
not necessarily true for negative 0.


Aren't we talking about IEEE 754 arithmetic?


Mainly, yes.
There's some specific bit pattern(s) for -0.0 and you can
assign a float variable to such a pattern.


Yup.

-- Grant Edwards   grante at visi.com   Yow! My Aunt MAUREEN was a military advisor to IKE & TINA TURNER!!
Jul 19 '05 #18
On 2005-06-23, Tim Peters <ti********@gmail.com> wrote:
C89 doesn't define the result of that, but "most" C compilers these
days will create a negative 0.
and (double)0x80000000 doesn't work,
I think you meant something like

float f;
*((uint32_t*)&d) = 0xNNNNNNNN;
And I don't know how to test for it either, x < 0.0 is not
necessarily true for negative 0.


If it's a 754-conforming C compiler, that's necessarily false (+0 and
-0 compare equal in 754). Picking the bits apart is again the closest
thing to a portable test. Across platforms with a 754-conforming
libm, the most portable way is via using atan2(!):


[brain-bending example elided]

It's probably because of the domains in which I work, but I
don't think I've ever cared whether a zero is positive or
negative. I understand why it's easier to implement things
that way, but I don't see why anybody would care. OTOH, NaNs
and Infinities are indispensable for real-world stuff.
I am not trying to say there is no way to do this. I am
trying to say it takes thought and effort on every detail, in
the definition, implementations, and unit tests.


It's par for the course -- everyone thinks "this must be easy"
at first, and everyone who persists eventually gives up.
Kudos to Michael Hudson for persisting long enough to make
major improvements here in pickle, struct and marshal for
Python 2.5!


I would think it doable if one assumed IEEE-754 FP (famous last
words). I suppose there are still a few VAX machines around.
And there are things like TI DSPs that don't use IEEE-754.

-- Grant Edwards   grante at visi.com   Yow! RELAX!!... This is gonna be a HEALING EXPERIENCE!! Besides, I work for DING DONGS!
Jul 19 '05 #19
On 2005-06-23, Grant Edwards <gr****@visi.com> wrote:
On 2005-06-23, Tim Peters <ti********@gmail.com> wrote:
C89 doesn't define the result of that, but "most" C compilers these
days will create a negative 0.
and (double)0x80000000 doesn't work,


I think you meant something like

float f;
*((uint32_t*)&d) = 0xNNNNNNNN;


*((uint32_t*)&f) = 0xNNNNNNNN;

It doesn't matter how many times one proofreads things like
that...

-- Grant Edwards   grante at visi.com   Yow! I will establish the first SHOPPING MALL in NUTLEY, New Jersey...
Jul 19 '05 #20
Hi All--

Tim Peters wrote:
Across platforms with a 754-conforming
libm, the most portable way is via using atan2(!):
>>> pz = 0.0
>>> mz = -pz
>>> from math import atan2
>>> atan2(pz, pz)
0.0
>>> atan2(mz, mz)
-3.1415926535897931


Never fails. Tim, you gave me the best laugh of the day.

Metta,
Ivan
----------------------------------------------
Ivan Van Laningham
God N Locomotive Works
http://www.andi-holmes.com/
http://www.foretec.com/python/worksh...oceedings.html
Army Signal Corps: Cu Chi, Class of '70
Author: Teach Yourself Python in 24 Hours
Jul 19 '05 #21
[Tim Peters]
....
Across platforms with a 754-conforming libm, the most portable way [to
distinguish +0.0 from -0.0 in standard C] is via using atan2(!):
>>> pz = 0.0
>>> mz = -pz
>>> from math import atan2
>>> atan2(pz, pz)
0.0
>>> atan2(mz, mz)

-3.1415926535897931


[Ivan Van Laningham] Never fails. Tim, you gave me the best laugh of the day.


Well, I try, Ivan. But lest the point be missed <wink>, 754 doesn't
_want_ +0 and -0 to act differently in "almost any" way. The only
good rationale I've seen for why it makes the distinction at all is in
Kahan's paper "Branch Cuts for Complex
Elementary Functions, or Much Ado About Nothing's Sign Bit". There
are examples in that where, when working with complex numbers, you can
easily stumble into getting real-world dead-wrong results if there's
only one flavor of 0. And, of course, atan2 exists primarily to help
convert complex numbers from rectangular to polar form.
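(For the record, that rectangular-to-polar conversion is easy to sketch
with nothing beyond atan2 and hypot, and it shows the branch-cut role the
sign of zero plays:)

from math import atan2, hypot

def to_polar(z):
    # Magnitude and angle of a complex value given in rectangular form.
    return hypot(z.real, z.imag), atan2(z.imag, z.real)

print to_polar(complex(-1.0,  0.0))   # (1.0, pi)
print to_polar(complex(-1.0, -0.0))   # (1.0, -pi): the zero's sign picks the side of the branch cut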

Odd bit o' trivia: following "the rules" for signed zeroes in 754
makes exponentiation c**n ambiguous, where c is a complex number with
c.real == c.imag == 0.0 (but the zeroes may be signed), and n is a
positive integer. The signs on the zeroes coming out can depend on
the exact order in which multiplications are performed, because the
underlying multiplication isn't associative despite that it's exact.
I stumbled into this in the 80's when KSR's Fortran compiler failed a
federal conformance test, precisely because the test did atan2 on the
components of an all-zero complex raised to an integer power, and I
had written one of the few 754-conforming libms at the time. They
wanted 0, while my atan2 dutifully returned -pi. I haven't had much
personal love for 754 esoterica since then ...
Jul 19 '05 #22
On Thu, 23 Jun 2005 00:11:20 -0400, Tim Peters wrote:
Well, I try, Ivan. But lest the point be missed <wink>, 754 doesn't
_want_ +0 and -0 to act differently in "almost any" way. The only
good rationale I've seen for why it makes the distinction at all is in
Kahan's paper "Branch Cuts for Complex
Elementary Functions, or Much Ado About Nothing's Sign Bit". There
are examples in that where, when working with complex numbers, you can
easily stumble into getting real-world dead-wrong results if there's
only one flavor of 0. And, of course, atan2 exists primarily to help
convert complex numbers from rectangular to polar form.
It isn't necessary to look at complex numbers to see the difference
between positive and negative zero. Just look at a graph of y=1/x. In
particular, look at the behaviour of the graph around x=0. Now tell me
that the sign of zero doesn't make a difference.

Signed zeroes also preserve 1/(1/x) == x for all x, admittedly at the cost
of y==x iff 1/y == 1/x (which fails for y=-0 and x=+0). Technically, -0
and +0 are not the same (for some definition of "technically"); but
practicality beats purity and it is more useful to have -0==+0 than the
alternative.
Odd bit o' trivia: following "the rules" for signed zeroes in 754
makes exponentiation c**n ambiguous, where c is a complex number with
c.real == c.imag == 0.0 (but the zeroes may be signed), and n is a
positive integer. The signs on the zeroes coming out can depend on
the exact order in which multiplications are performed, because the
underlying multiplication isn't associative despite that it's exact.
That's an implementation failure. Mathematically, the sign of 0**n should
depend only on whether n is odd or even. If c**n is ambiguous, then that's
a bug in the implementation, not the standard.
I stumbled into this in the 80's when KSR's Fortran compiler failed a
federal conformance test, precisely because the test did atan2 on the
components of an all-zero complex raised to an integer power, and I
had written one of the few 754-conforming libms at the time. They
wanted 0, while my atan2 dutifully returned -pi. I haven't had much
personal love for 754 esoterica since then ...


Sounds to me that the Feds wanted something broken and you gave them
something that was working. No wonder they failed you :-)

--
Steven.
Jul 19 '05 #23
[Tim Peters']
Well, I try, Ivan. But lest the point be missed <wink>, 754 doesn't
_want_ +0 and -0 to act differently in "almost any" way. The only
good rationale I've seen for why it makes the distinction at all is in
Kahan's paper "Branch Cuts for Complex
Elementary Functions, or Much Ado About Nothing's Sign Bit". There
are examples in that where, when working with complex numbers, you can
easily stumble into getting real-world dead-wrong results if there's
only one flavor of 0. And, of course, atan2 exists primarily to help
convert complex numbers from rectangular to polar form.
[Steven D'Aprano]
It isn't necessary to look at complex numbers to see the difference
between positive and negative zero. Just look at a graph of y=1/x. In
particular, look at the behaviour of the graph around x=0. Now tell me
that the sign of zero doesn't make a difference.
OK, I looked, and it made no difference to me. Really. If I had an
infinitely tall monitor, maybe I could see a difference, but I don't
-- the sign of 0 on the nose makes no difference to the behavior of
1/x for any x other than 0. On my finite monitor, I see it looks like
the line x=0 is an asymptote, and the graph approaches minus infinity
on that line from the left and positive infinity from the right; the
value of 1/0 doesn't matter to that.
Signed zeroes also preserve 1/(1/x) == x for all x,
No, signed zeros "preverse" that identity for exactly the set {+Inf,
-Inf}, and that's all. That's worth something, but 1/(1/x) == x isn't
generally true in 754 anyway. Most obviously, when x is subnormal,
1/x overflows to an infinity (the 754 exponent range isn't symmetric
around 0 -- subnormals make it "heavy" on the negative side), and then
1/(1/x) is a zero, not x. 1/(1/x) == x doesn't hold for a great many
normal x either (pick a pile at random and check -- you'll find
counterexamples quickly).
admittedly at the cost of y==x iff 1/y == 1/x (which fails for y=-0 and x=+0).

Technically, -0 and +0 are not the same (for some definition of "technically"); but
practicality beats purity and it is more useful to have -0==+0 than the alternative.
Can just repeat that the only good rationale I've seen is in Kahan's
paper (previously referenced).
Odd bit o' trivia: following "the rules" for signed zeroes in 754
makes exponentiation c**n ambiguous, where c is a complex number with
c.real == c.imag == 0.0 (but the zeroes may be signed), and n is a
positive integer. The signs on the zeroes coming out can depend on
the exact order in which multiplications are performed, because the
underlying multiplication isn't associative despite that it's exact.

That's an implementation failure. Mathematically, the sign of 0**n should
depend only on whether n is odd or even. If c**n is ambiguous, then that's
a bug in the implementation, not the standard.
As I said, these are complex zeroes, not real zeroes. The 754
standard doesn't say anything about complex numbers. In rectangular
form, a complex zero contains two real zeroes. There are 4
possibilities for a complex zero if the components are 754
floats/doubles:

+0+0i
+0-0i
-0+0i
-0-0i

Implement Cartesian complex multiplication in the obvious way:

(a+bi)(c+di) = (ac-bd) + (ad+bc)i

Now use that to raise the four complex zeroes above to various integer
powers, trying different ways of grouping the multiplications. For
example, x**4 can be computed as

((xx)x)x

or

(xx)(xx)

or

x((xx)x)

etc. You'll discover that, in some cases, for fixed x and n, the
signs of the zeroes in the result depend how the multiplications were
grouped. The 754 standard says nothing about any of this, _except_
for the results of multiplying and adding 754 zeroes. Multiplication
of signed zeroes in 754 is associative. The problem is that the
extension to Cartesian complex multiplication isn't associative under
these rules in some all-zero cases, mostly because the sum of two
signed zeroes is (under 3 of the rounding modes) +0 unless both
addends are -0. Try examples and you'll discover this for yourself.
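A hedged way to try exactly that from Python, assuming CPython's Cartesian
complex multiply and reusing the atan2 trick from earlier in the thread to
make the zero signs visible:

from math import atan2

def zero_sign(x):
    # atan2 tells +0.0 (-> 0.0) apart from -0.0 (-> -pi).
    return atan2(x, x)

x = complex(-0.0, 0.0)             # one of the four complex zeroes

a = ((x * x) * x) * x              # x**4, grouped one way
b = (x * x) * (x * x)              # x**4, grouped another way

print zero_sign(a.real), zero_sign(a.imag)
print zero_sign(b.real), zero_sign(b.imag)   # the zero signs come out differently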

I was part of NCEG (the Numerical C Extension Group) at the time I
stumbled into this, and they didn't have any trouble following it
<wink>. It was a surprise to everyone at the time that Cartesian
multiplication of complex zeroes lost associativity when applying 754
rules in the obvious way, and no resolution was reached at that time.
I stumbled into this in the 80's when KSR's Fortran compiler failed a
federal conformance test, precisely because the test did atan2 on the
components of an all-zero complex raised to an integer power, and I
had written one of the few 754-conforming libms at the time. They
wanted 0, while my atan2 dutifully returned -pi. I haven't had much
personal love for 754 esoterica since then ...

Sounds to me that the Feds wanted something broken and you gave them
something that was working. No wonder they failed you :-)


Yup, and they did a lot of that <0.9 wink>. Luckily(?), Fortran is so
eager to allow optimizations that failure due to numeric differences
in conformance tests rarely withstood challenge.
Jul 19 '05 #24
Hi All--

Tim Peters wrote:
Fortran is so
eager to allow optimizations that failure due to numeric differences
in conformance tests rarely withstood challenge.


+1 QOTW

Metta,
Ivan
----------------------------------------------
Ivan Van Laningham
God N Locomotive Works
http://www.andi-holmes.com/
http://www.foretec.com/python/worksh...oceedings.html
Army Signal Corps: Cu Chi, Class of '70
Author: Teach Yourself Python in 24 Hours
Jul 19 '05 #25
Tim Peters wrote:
[Steven D'Aprano]
It isn't necessary to look at complex numbers to see the difference
between positive and negative zero. Just look at a graph of y=1/x. In
particular, look at the behaviour of the graph around x=0. Now tell me
that the sign of zero doesn't make a difference.

OK, I looked, and it made no difference to me. Really. If I had an
infinitely tall monitor, maybe I could see a difference, but I don't
-- the sign of 0 on the nose makes no difference to the behavior of
1/x for any x other than 0. On my finite monitor, I see it looks like
the line x=0 is an asymptote, and the graph approaches minus infinity
on that line from the left and positive infinity from the right; the
value of 1/0 doesn't matter to that.


Well, I didn't say that the value of zero made a
difference for _other_ values of x. Perhaps you and I
are interpreting the same graph differently. To me, the
behaviour of 1/x around x = 0 illustrates why +0 and -0
are different, but it isn't worth arguing about.
Signed zeroes also preserve 1/(1/x) == x for all x,

No, signed zeros "preverse" that identity for exactly the set {+Inf,
-Inf}, and that's all.


Preverse? Did you mean "pervert"? Or just a mispelt
"preserve"? Sorry, not trying to be a spelling Nazi, I
genuinely don't understand what you are trying to get
across here.

That's worth something, but 1/(1/x) == x isn't
generally true in 754 anyway.
Of course it isn't. Sorry, I was thinking like a
mathematician instead of a programmer again.

(Note to self: write out 1000 lines, "Real number !=
floating point number".)

[snip]
Odd bit o' trivia: following "the rules" for signed zeroes in 754
makes exponentiation c**n ambiguous, where c is a complex number with
c.real == c.imag == 0.0 (but the zeroes may be signed), and n is a
positive integer. The signs on the zeroes coming out can depend on
the exact order in which multiplications are performed, because the
underlying multiplication isn't associative despite that it's exact.

That's an implementation failure. Mathematically, the sign of 0**n should
depend only on whether n is odd or even. If c**n is ambiguous, then that's
a bug in the implementation, not the standard.

As I said, these are complex zeroes, not real zeroes. The 754
standard doesn't say anything about complex numbers. In rectangular
form, a complex zero contains two real zeroes. There are 4
possibilities for a complex zero if the components are 754
floats/doubles:

+0+0i
+0-0i
-0+0i
-0-0i

Implement Cartesian complex multiplication in the obvious way:

(a+bi)(c+di) = (ac-bd) + (ad+bc)i


Yes, but that supports what I said: it is an
_implementation_ issue. A different implementation
might recognise a complex zero and return the correctly
signed complex zero without actually doing the
multiplication.

Assuming mathematicians can decide on which complex
zero is the correct one.

Now use that to raise the four complex zeroes above to various integer
powers, trying different ways of grouping the multiplications. For
example, x**4 can be computed as

((xx)x)x

or

(xx)(xx)

or

x((xx)x)

etc. You'll discover that, in some cases, for fixed x and n, the
signs of the zeroes in the result depend how the multiplications were
grouped. The 754 standard says nothing about any of this, _except_
for the results of multiplying and adding 754 zeroes. Multiplication
of signed zeroes in 754 is associative. The problem is that the
extension to Cartesian complex multiplication isn't associative under
these rules in some all-zero cases, mostly because the sum of two
signed zeroes is (under 3 of the rounding modes) +0 unless both
addends are -0. Try examples and you'll discover this for yourself.


Yes, point taken. But that is just another example of
where floats fail to be sufficiently close to real
numbers. In a "perfect" representation, this would not
be a factor.

However... now that I think of it... in polar form,
there are an uncountably infinite number of complex zeroes.

z = 0*cis(0), z = 0*cis(0.1), z = 0*cis(-0.21), ...

I think I won't touch that one with a fifty foot pole.
--
Steven.

Jul 19 '05 #26
Tim Peters wrote:
OK, I looked, and it made no difference to me. Really. If I had an
infinitely tall monitor, maybe I could see a difference, but I don't
-- the sign of 0 on the nose makes no difference to the behavior of
1/x for any x other than 0. On my finite monitor, I see it looks like
the line x=0 is an asymptote, and the graph approaches minus infinity
on that line from the left and positive infinity from the right; the
value of 1/0 doesn't matter to that.


Well, the value of 1/0 is undefined. Occasionally, it's useful to report
+inf as the value of 1.0/+0.0 because practically we're more concerned
with limiting behavior from an assumed limiting process than being
correct. By the same token, we might also be concerned with the limiting
behavior coming from the other direction (a different limiting process),
so we might want 1.0/-0.0 to give -inf (although it's still actually
undefined, no different from the first expression, and inf is really the
same thing as -inf, too).

Although I haven't read the paper you cited, it seems to me that the
branch cut issue is the same thing. If you're on the cut itself, the
value, practically, depends on which end of the branch you're deciding
to approach the point from. It's arbitrary; there's no correct answer;
but signed zeros give a way to express some of the desired, useful but
wrong answers.

And floating point is about nothing if not being usefully wrong.

--
Robert Kern
rk***@ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

Jul 19 '05 #27
Robert Kern wrote:

And floating point is about nothing if not being usefully wrong.


+1 QOTW and maybe it could be worked into the FAQ about floats as well :-)
http://www.python.org/doc/faq/genera...-so-inaccurate

Kent
Jul 19 '05 #28
"Terry Reedy" <tj*****@udel.edu> writes:
"Grant Edwards" <gr****@visi.com> wrote in message
news:11*************@corp.supernews.com...
I'm working on it. I should have said it's trivial if you have
access to the platforms to be supported. I've tested a fix
that supports pickle streams generated under Win32 and glibc.
That's using the "native" string representation of a NaN or
Inf.

A perhaps simpler approach would be to define a string
representation for Python to use for NaN and Inf. Just because
something isn't defined by the C standard doesn't mean it can't
be defined by Python.


I believe that changes have been made to marshal/unmarshal in 2.5 CVS with
respect to NAN/INF to eliminate annoying/surprising behavior differences
between corresponding .py and .pyc files. Perhaps these revisions would be
relevant to pickle changes.


If you use a binary protocol for pickle, yes.
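A quick illustration of that point, assuming a Python (2.5 or later, per
the changes discussed above) whose binary float packing passes the
IEEE-754 bits through unchanged:

import pickle

inf = 1e300 * 1e300
nan = inf / inf
# Protocol 2 stores floats as packed 8-byte doubles rather than repr() text,
# so inf and nan survive the round trip even where repr()/float() disagree.
print pickle.loads(pickle.dumps((inf, nan), 2))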

Cheers,
mwh

--
Java sucks. [...] Java on TV set top boxes will suck so hard it
might well inhale people from off their sofa until their heads
get wedged in the card slots. --- Jon Rabone, ucam.chat
Jul 21 '05 #29
