pickle broken: can't handle NaN or Infinity under win32

Grant Edwards

I finally figured out why one of my apps sometimes fails under
Win32 when it always works fine under Linux: Under Win32, the
pickle module only works with a subset of floating point
values. In particular, if you try to dump/load an infinity
or NaN value, the load operation chokes:
Under Linux:

$ python
Python 2.3.4 (#2, Feb 9 2005, 14:22:48)
[GCC 3.4.1 (Mandrakelinux 10.1 3.4.1-4mdk)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

$ python pickletest.py
(inf, nan) (inf, nan)
Under Win32:

$ python
ActivePython 2.3.4 Build 233 (ActiveState Corp.) based on
Python 2.3.4 (#53, Oct 18 2004, 20:35:07) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.

$ python pickletest.py
Traceback (most recent call last):
  File "pickletest.py", line 8, in ?
    d = pickle.loads(s)
  File "C:\PYTHON23\lib\pickle.py", line 1394, in loads
    return Unpickler(file).load()
  File "C:\PYTHON23\lib\pickle.py", line 872, in load
    dispatch[key](self)
  File "C:\PYTHON23\lib\pickle.py", line 968, in load_float
    self.append(float(self.readline()[:-1]))
ValueError: invalid literal for float(): 1.#INF
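
The pickletest.py script itself is not shown in the post. A minimal sketch of the kind of round-trip it appears to perform, written for a current Python 3 (an assumption; the thread is about Python 2.3), might look like the following. The helper name `roundtrip` is made up for illustration.

```python
import math
import pickle

def roundtrip(value, protocol):
    """Pickle a value and immediately unpickle it again."""
    return pickle.loads(pickle.dumps(value, protocol))

original = (float("inf"), float("nan"))

# Protocol 0 is the text protocol that the traceback's load_float belongs to;
# protocol 2 stores floats as 8 raw IEEE 754 bytes instead of a repr string.
for protocol in (0, 2):
    restored = roundtrip(original, protocol)
    print(protocol, original, restored)
    # NaN never compares equal to itself, so check it with math.isnan.
    assert math.isinf(restored[0]) and math.isnan(restored[1])
```

Comparing the restored tuple to the original with `==` would not be a reliable check here, since a NaN is unequal to every float including itself.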

I realize that this is probably due to underlying brokenness in
the Win32 libc implementation, but should the pickle module
hide such platform dependencies from the user?

Best case, it would be nice if pickle could handle all floats
in a portable way.

Worst case, shouldn't the pickle module documentation mention
that pickling floats is non-portable or only partially implemented?

On a more immediate note, are there hooks in pickle to allow
the user to handle types that pickle can't deal with? Or, do I
have to throw out pickle and write something from scratch?
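
On the hooks question: in current Python 3 (not necessarily the 2.3 discussed here), one documented hook is the persistent_id / persistent_load pair on pickle.Pickler and pickle.Unpickler. The sketch below routes non-finite floats through that hook as strings. The class names, the "nonfinite-float" tag, and the dumps/loads helpers are all hypothetical, chosen only for illustration.

```python
import io
import math
import pickle

TAG = "nonfinite-float"  # hypothetical marker for our persistent ids

class NonFinitePickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Returning None means "pickle this object the normal way".
        if isinstance(obj, float) and not math.isfinite(obj):
            return (TAG, repr(obj))  # 'inf', '-inf' or 'nan'
        return None

class NonFiniteUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        tag, text = pid
        if tag != TAG:
            raise pickle.UnpicklingError("unknown persistent id: %r" % (pid,))
        return float(text)

def dumps(obj):
    buffer = io.BytesIO()
    NonFinitePickler(buffer).dump(obj)
    return buffer.getvalue()

def loads(data):
    return NonFiniteUnpickler(io.BytesIO(data)).load()

restored = loads(dumps([1.5, float("inf"), float("-inf"), float("nan")]))
print(restored)
```

The design choice here is to keep pickle and intercept only the troublesome values, rather than replacing the whole serializer.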

[NaN and Infinity are perfectly valid (and extremely useful)
floating point values, and not using them would require huge
complexity increases in my apps (not using them would probably
at least triple the amount of code required in some cases).]

--
Grant Edwards grante Yow! Yow!
at
visi.com

Jul 19 '05


Scott David Daniels

Grant Edwards wrote:

> On 2005-06-22, Scott David Daniels <Sc***********@Acm.Org> wrote:
>> Several issues:
>> (1) The number of distinct NaNs varies among platforms.
>
> According to the IEEE standard, there are exactly two:
> signalling and quiet, and on platforms that don't implement
> floating point exceptions (probably in excess of 99.9% of
> python installations), the difference between the two is moot.

But it does not specify the representation of such NaNs.

> Negative 0 isn't a NaN, it's just negative 0.

Right, but it is hard to construct in standard C.

>> (2) There is no standard-conforming way to create these values.
>
> What standard are you looking at? My copy of the IEEE 754
> standard is pretty clear.

I was talking about the C89 standard. In the absence of a C
standard way to create, manipulate, and test for these values,
you must implement and test platform-by-platform.

> The bit patterns are defined by the IEEE 754 standard.

Perhaps this is right and I misunderstand the standard, but my
understanding is that the full bit pattern is not, in fact,
defined.

> If there are Python-hosting platforms that don't use IEEE 754 as
> the floating point representation, then that can be dealt with.

There are.

> Python has _tons_ of platform-specific code in it.

But the _tons_ are written in C89 C.

> Why all of a sudden is it taboo for Python to implement
> something that's not universally portable and defined in a
> standard? Where's the standard defining Python?

It is not taboo. I am trying to explain why it is not a
trivial task, but a substantial effort. If you are willing
to perform the substantial effort, good on you, and I'll help.
If you simply want to implement on the two platforms you use,
and want everyone else to implement the interface you choose,
that seems to me an unreasonable request.

--Scott David Daniels
Sc***********@Acm.Org

Jul 19 '05 #11

Paul Rubin

Scott David Daniels <Sc***********@Acm.Org> writes:

>> Negative 0 isn't a NaN, it's just negative 0.
>
> Right, but it is hard to construct in standard C.

Huh? It's just a hex constant.

Jul 19 '05 #12

Scott David Daniels

Paul Rubin wrote:

> Scott David Daniels <Sc***********@Acm.Org> writes:
>>> Negative 0 isn't a NaN, it's just negative 0.
>>
>> Right, but it is hard to construct in standard C.
>
> Huh? It's just a hex constant.

Well, -0.0 doesn't work, and (double)0x80000000 doesn't work,
and.... I think you have to use quirks of a compiler to create
it. And I don't know how to test for it either, x < 0.0 is
not necessarily true for negative 0.

I am not trying to say there is no way to do this. I am
trying to say it takes thought and effort on every detail,
in the definition, implementations, and unit tests.

--Scott David Daniels
Sc***********@Acm.Org

Jul 19 '05 #13

Paul Rubin

Scott David Daniels <Sc***********@Acm.Org> writes:

>>>> Negative 0 isn't a NaN, it's just negative 0.
>>>
>>> Right, but it is hard to construct in standard C.
>>
>> Huh? It's just a hex constant.
>
> Well, -0.0 doesn't work, and (double)0x80000000 doesn't work,
> and.... I think you have to use quirks of a compiler to create
> it. And I don't know how to test for it either, x < 0.0 is
> not necessarily true for negative 0.

Aren't we talking about IEEE 754 arithmetic? There's some specific
bit pattern(s) for -0.0 and you can assign a float variable to
such a pattern.

Jul 19 '05 #14

Tim Peters

[with the start of US summer comes the start of 754 ranting season]

[Grant Edwards]
> Negative 0 isn't a NaN, it's just negative 0.

[Scott David Daniels]
> Right, but it is hard to construct in standard C.

[Paul Rubin]
> Huh? It's just a hex constant.

[Scott David Daniels]
> Well, -0.0 doesn't work,

C89 doesn't define the result of that, but "most" C compilers these
days will create a negative 0.

> and (double)0x80000000 doesn't work,

In part because that's an integer <wink>, and in part because it's
only 32 bits. It requires representation casting tricks (not
conversion casting tricks like the above), knowledge of the platform
endianness, and knowledge of the platform integer sizes. Assuming the
platform uses 754 bit layout to begin with, of course.
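
The difference between a conversion cast and a representation cast can be sketched in Python with the struct module, assuming the platform's floats are IEEE 754 doubles. Spelling out the '>' byte order on both sides sidesteps the endianness question. `double_from_bits` is an illustrative name, not something from the thread.

```python
import math
import struct

def double_from_bits(bits):
    """Reinterpret a 64-bit integer as an IEEE 754 double (big-endian on both sides)."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]

# Conversion: the integer's numeric value is converted, giving 2147483648.0.
converted = float(0x80000000)

# Reinterpretation: a 64-bit pattern with only the sign bit set is -0.0.
neg_zero = double_from_bits(0x8000000000000000)

print(converted, neg_zero, math.copysign(1.0, neg_zero))
```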
> and.... I think you have to use quirks of a compiler to create
> it.

You at least need platform knowledge. It's really not hard, if you
can assume enough about the platform.

> And I don't know how to test for it either, x < 0.0 is
> not necessarily true for negative 0.

If it's a 754-conforming C compiler, that's necessarily false (+0 and
-0 compare equal in 754). Picking the bits apart is again the closest
thing to a portable test. Across platforms with a 754-conforming
libm, the most portable way is via using atan2(!):

    >>> pz = 0.0
    >>> mz = -pz
    >>> from math import atan2
    >>> atan2(pz, pz)
    0.0
    >>> atan2(mz, mz)
    -3.1415926535897931
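
The atan2 trick carries over to a small helper. The sketch below also uses math.copysign, which was added to Python after this thread (in 2.6), so the posters did not have it available. Both function names are illustrative.

```python
import math

def is_negative_zero(x):
    """True only for a float zero whose sign bit is set."""
    return x == 0.0 and math.copysign(1.0, x) < 0.0

def is_negative_zero_atan2(x):
    """Same test via atan2: atan2(-0.0, -0.0) is -pi, atan2(0.0, 0.0) is 0.0."""
    return x == 0.0 and math.atan2(x, x) < 0.0

for value in (0.0, -0.0, -1.0):
    print(repr(value), is_negative_zero(value), is_negative_zero_atan2(value))
```

Both helpers check `x == 0.0` first, because +0 and -0 compare equal and the sign test alone would also accept ordinary negative numbers.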

It's tempting to divide into 1, then check the sign of the infinity,
but Python stops you from doing that:

    >>> 1/pz
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ZeroDivisionError: float division

That can't be done at the C level either, because _some_ people run
Python with their 754 HW floating-point zero-division, overflow, and
invalid operation traps enabled, and then anything like division by 0
causes the interpreter to die. The CPython implementation is
constrained that way.

Note that Python already has Py_IS_NAN and Py_IS_INFINITY macros in
pyport.h, and the Windows build maps them to appropriate
Microsoft-specific library functions. I think it's stuck waiting on
others to care enough to supply them for other platforms. If a
platform build doesn't #define them, a reasonable but cheap attempt is
made to supply "portable" code sequences for them, but, as the
pyport.h comments note, they're guaranteed to do wrong things in some
cases, and may not work at all on some platforms. For example, the
default

#define Py_IS_NAN(X) ((X) != (X))

is guaranteed never to return true under MSVC 6.0.
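
The same self-inequality idea can be tried from Python. This is only a sketch, and it assumes IEEE 754 comparison semantics, which is precisely the assumption the pyport.h comments warn about. math.isnan is the library-provided check in current Python; `is_nan_by_comparison` is an illustrative name.

```python
import math

def is_nan_by_comparison(x):
    """Under IEEE 754 rules a NaN is the only float unequal to itself."""
    return x != x

for value in (1.0, float("inf"), float("nan")):
    print(value, is_nan_by_comparison(value), math.isnan(value))
```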

> I am not trying to say there is no way to do this. I am
> trying to say it takes thought and effort on every detail,
> in the definition, implementations, and unit tests.

It's par for the course -- everyone thinks "this must be easy" at
first, and everyone who persists eventually gives up. Kudos to
Michael Hudson for persisting long enough to make major improvements
here in pickle, struct and marshal for Python 2.5!

Jul 19 '05 #15

Grant Edwards

On 2005-06-22, Scott David Daniels <Sc***********@Acm.Org> wrote:

> Grant Edwards wrote:
>> On 2005-06-22, Scott David Daniels <Sc***********@Acm.Org> wrote:
>>> Several issues:
>>> (1) The number of distinct NaNs varies among platforms.
>>
>> According to the IEEE standard, there are exactly two:
>> signalling and quiet, and on platforms that don't implement
>> floating point exceptions (probably in excess of 99.9% of
>> python installations), the difference between the two is moot.
>
> But it does not specify the representation of such NaNs.

Yes, it does. It specifies it exactly: certain bits are ones,
certain other bits are zeros. I don't know how much more exactly
the representation can be defined.

>> The bit patterns are defined by the IEEE 754 standard.
>
> Perhaps this is right and I misunderstand the standard, but my
> understanding is that the full bit pattern is not, in fact,
> defined.

The representation of NaNs, infinities, normalized numbers and
denormal numbers are all completely defined by the standard.
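
For doubles, the layout under discussion can be picked apart as below. This is a sketch assuming the 64-bit IEEE 754 binary format (1 sign bit, 11 exponent bits, 52 fraction bits). An all-ones exponent marks an infinity when the fraction is zero and a NaN when it is not, so more than one bit pattern decodes as a NaN. The function names are illustrative.

```python
import math
import struct

EXPONENT_MASK = 0x7FF            # 11 exponent bits, all ones
FRACTION_MASK = (1 << 52) - 1    # low 52 bits

def double_to_bits(x):
    """Return the 64-bit pattern of a float as an integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def classify_double_bits(bits):
    """Classify a 64-bit pattern as 'finite', 'infinity' or 'nan'."""
    exponent = (bits >> 52) & EXPONENT_MASK
    fraction = bits & FRACTION_MASK
    if exponent != EXPONENT_MASK:
        return "finite"
    return "infinity" if fraction == 0 else "nan"

for bits in (0x7FF0000000000000, 0x7FF8000000000000, 0x7FF0000000000001):
    print(hex(bits), classify_double_bits(bits))
```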

>> If there are Python-hosting platforms that don't use IEEE 754 as
>> the floating point representation, then that can be dealt with.
>
> There are.

That's where it gets nasty.

>> Python has _tons_ of platform-specific code in it.
>
> But the _tons_ are written in C89 C.

True.

> It is not taboo. I am trying to explain why it is not a
> trivial task, but a substantial effort.

It's trivial for platforms that obey the IEEE 754 standard.

> If you are willing to perform the substantial effort, good on
> you, and I'll help. If you simply want to implement on the two
> platforms you use, and want everyone else to implement the
> interface you choose, that seems to me an unreasonable
> request.

I would think that implementing things according to the IEEE
standard and letting non-standard platforms figure out what to
do for themselves would seem a reasonable approach.

--
Grant Edwards grante Yow! Now I understand the
at meaning of "THE MOD SQUAD"!
visi.com

Jul 19 '05 #16

Grant Edwards

On 2005-06-22, Paul Rubin <http> wrote:

>>> Negative 0 isn't a NaN, it's just negative 0.
>>
>> Right, but it is hard to construct in standard C.
>
> Huh? It's just a hex constant.

Yup. There are two ways to construct a NaN. One is to do
something like (1e300*1e300)/(1e300*1e300) and hope for the
best. The other is to assume IEEE 754 and just use a bit pattern
such as 7fa00000 or 7fc00000, depending on whether you want a
signalling or quiet NaN (7f800000 itself is +Infinity).
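
Constants like these are easy to check by decoding them, assuming IEEE 754 single precision. In the sketch below, 0x7f800000 has an all-zero fraction and decodes to +Infinity, while 0x7fc00000 has a nonzero fraction and decodes to a NaN. `float_from_hex32` is an illustrative name.

```python
import math
import struct

def float_from_hex32(pattern):
    """Decode a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", pattern))[0]

for pattern in (0x7F800000, 0x7FC00000, 0x80000000):
    value = float_from_hex32(pattern)
    print(hex(pattern), value, math.isinf(value), math.isnan(value))
```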

--
Grant Edwards grante Yow! Don't hit me!! I'm in
at the Twilight Zone!!!
visi.com

Jul 19 '05 #17

Grant Edwards

On 2005-06-23, Paul Rubin <http> wrote:

> Scott David Daniels <Sc***********@Acm.Org> writes:
>
>>>>> Negative 0 isn't a NaN, it's just negative 0.
>>>>
>>>> Right, but it is hard to construct in standard C.
>>>
>>> Huh? It's just a hex constant.
>>
>> Well, -0.0 doesn't work, and (double)0x80000000 doesn't work,
>> and.... I think you have to use quirks of a compiler to create
>> it. And I don't know how to test for it either, x < 0.0 is
>> not necessarily true for negative 0.
>
> Aren't we talking about IEEE 754 arithmetic?

Mainly, yes.

> There's some specific bit pattern(s) for -0.0 and you can
> assign a float variable to such a pattern.

Yup.

--
Grant Edwards grante Yow! My Aunt MAUREEN was
at a military advisor to IKE &
visi.com TINA TURNER!!

Jul 19 '05 #18

Grant Edwards

On 2005-06-23, Tim Peters <ti********@gmail.com> wrote:

> C89 doesn't define the result of that, but "most" C compilers these
> days will create a negative 0.
>
>> and (double)0x80000000 doesn't work,

I think you meant something like

    float f;
    *((uint32_t*)&d) = 0xNNNNNNNN;

>> And I don't know how to test for it either, x < 0.0 is not
>> necessarily true for negative 0.

> If it's a 754-conforming C compiler, that's necessarily false (+0 and
> -0 compare equal in 754). Picking the bits apart is again the closest
> thing to a portable test. Across platforms with a 754-conforming
> libm, the most portable way is via using atan2(!):
>
> [brain-bending example elided]

It's probably because of the domains in which I work, but I
don't think I've ever cared whether a zero is positive or
negative. I understand why it's easier to implement things
that way, but I don't see why anybody would care. OTOH, NaNs
and Infinities are indispensable for real-world stuff.

>> I am not trying to say there is no way to do this. I am
>> trying to say it takes thought and effort on every detail, in
>> the definition, implementations, and unit tests.
>
> It's par for the course -- everyone thinks "this must be easy"
> at first, and everyone who persists eventually gives up.
> Kudos to Michael Hudson for persisting long enough to make
> major improvements here in pickle, struct and marshal for
> Python 2.5!

I would think it doable if one assumed IEEE-754 FP (famous last
words). I suppose there are still a few VAX machines around.
And there are things like TI DSPs that don't use IEEE-754.

--
Grant Edwards grante Yow! RELAX!!... This
at is gonna be a HEALING
visi.com EXPERIENCE!! Besides,
I work for DING DONGS!

Jul 19 '05 #19

Grant Edwards

On 2005-06-23, Grant Edwards <gr****@visi.com> wrote:

> On 2005-06-23, Tim Peters <ti********@gmail.com> wrote:
>> C89 doesn't define the result of that, but "most" C compilers these
>> days will create a negative 0.
>>
>>> and (double)0x80000000 doesn't work,
>
> I think you meant something like
>
>     float f;
>     *((uint32_t*)&d) = 0xNNNNNNNN;

    *((uint32_t*)&f) = 0xNNNNNNNN;

It doesn't matter how many times one proofreads things like
that...

--
Grant Edwards grante Yow! I will establish
at the first SHOPPING MALL in
visi.com NUTLEY, New Jersey...

Jul 19 '05 #20
