numpy magic: cast scalar returns auto to python types float & int?

robert

Turning algorithms written for old NumPy modules into numpy code, I suffer from this:
When the return values of numpy calculations are processed further, lots of data in an app's object tree ends up as elementary numpy types.
First, there is some inefficiency in calculations. And then you get data inflation and questionable dependencies, e.g. with pickle, ZODB, MPI ...:

>>> l=array((1.,0))
>>> l.prod()
0.0
>>> cPickle.dumps(_)
"cnumpy.core.multiarray\nscalar\np1\n(cnumpy\ndtype\np2\n(S'f8'\nI0\nI1\ntRp3\n(I2\nS'<'\nNNNI-1\nI-1\ntbS'\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00'\ntRp4\n."
>>> cPickle.dumps(0.0)
'F0\n.'
>>> l=array((1,0))
>>> l.prod()
0
>>> cPickle.dumps(_)
"cnumpy.core.multiarray\nscalar\np1\n(cnumpy\ndtype\np2\n(S'i4'\nI0\nI1\ntRp3\n(I2\nS'<'\nNNNI-1\nI-1\ntbS'\\x00\\x00\\x00\\x00'\ntRp4\n."
>>> cPickle.dumps(0)
'I0\n.'
>>> type(l.prod())
<type 'numpy.int32'>

To avoid this you'd need a type cast in Python code everywhere you get scalars from numpy into a Python variable. An error-prone task. Or you'd have to check/re-render your whole object tree.
Wouldn't it be much better if numpy returned Python scalars for float64 (maybe even for float32) and int32, int64 ... where possible (as numarray and Numeric did)?
I suppose numpy knows internally how to cast very quickly.
Or is there maybe a config setting to make numpy behave this way?

Robert

Nov 17 '06 #1
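
For readers who do want to check and re-render a whole object tree, as the post above mentions, a minimal sketch could look like the following. It assumes a current NumPy on Python 3, not the 2006 release discussed in this thread. `to_builtin` and the sample `tree` are hypothetical names made up for illustration; the only NumPy features relied on are `np.generic` (the common base class of NumPy scalars) and its `.item()` method.

```python
import numpy as np


def to_builtin(obj):
    """Return a copy of obj with NumPy scalars replaced by Python scalars."""
    if isinstance(obj, np.generic):
        # .item() converts a NumPy scalar to the matching built-in type.
        return obj.item()
    if isinstance(obj, dict):
        return {to_builtin(k): to_builtin(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [to_builtin(v) for v in obj]
    if isinstance(obj, tuple):
        return tuple(to_builtin(v) for v in obj)
    # Anything else (arrays, strings, custom objects) is passed through as-is.
    return obj


# Hypothetical object tree holding results of numpy calculations.
tree = {"total": np.array((1.0, 0.0)).prod(), "counts": [np.int32(3), 4]}
plain = to_builtin(tree)
print({key: type(value).__name__ for key, value in plain.items()})
```

This only walks dicts, lists and tuples; attributes of custom objects would need their own handling, which is part of why the poster calls the approach error prone.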


Tim Hochberg

robert wrote:

Turning algorithms written for old NumPy modules into numpy code, I suffer from this:
When the return values of numpy calculations are processed further, lots of data in an app's object tree ends up as elementary numpy types.
First, there is some inefficiency in calculations. And then you get data inflation and questionable dependencies, e.g. with pickle, ZODB, MPI ...:

>>> l=array((1.,0))
>>> l.prod()
0.0
>>> cPickle.dumps(_)
"cnumpy.core.multiarray\nscalar\np1\n(cnumpy\ndtype\np2\n(S'f8'\nI0\nI1\ntRp3\n(I2\nS'<'\nNNNI-1\nI-1\ntbS'\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00'\ntRp4\n."
>>> cPickle.dumps(0.0)
'F0\n.'
>>> l=array((1,0))
>>> l.prod()
0
>>> cPickle.dumps(_)
"cnumpy.core.multiarray\nscalar\np1\n(cnumpy\ndtype\np2\n(S'i4'\nI0\nI1\ntRp3\n(I2\nS'<'\nNNNI-1\nI-1\ntbS'\\x00\\x00\\x00\\x00'\ntRp4\n."
>>> cPickle.dumps(0)
'I0\n.'
>>> type(l.prod())
<type 'numpy.int32'>

To avoid this you'd need a type cast in Python code everywhere you get scalars from numpy into a Python variable. An error-prone task. Or you'd have to check/re-render your whole object tree.
Wouldn't it be much better if numpy returned Python scalars for float64 (maybe even for float32) and int32, int64 ... where possible (as numarray and Numeric did)?
I suppose numpy knows internally how to cast very quickly.

The short answer is no, it would not be better. There are some trade-offs
involved here, but overall, always returning numpy scalars is a
significant improvement over returning Python scalars some of the time.
Which is why numpy does it that way now; it was a conscious choice, it
didn't just happen. Please search the archives of numpy-discussion for
previous discussions of this, and if that is not enlightening enough,
please ask on the numpy-discussion list (the address of which just
changed and I don't have it handy, but I'm sure you can find it).

For your particular issue, you might try tweaking pickle to convert
int64 objects to int objects. Assuming of course that you have enough of
these to matter, otherwise, I suggest just leaving things alone.

-tim

Nov 17 '06 #2
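
One possible reading of the "tweak pickle" suggestion above is a pickler with its own dispatch table. The sketch below is only an illustration under stated assumptions: it uses Python 3's `pickle` and `copyreg` with a current NumPy (the thread itself used `cPickle` on Python 2), and `PlainScalarPickler`, `dumps_plain` and `_as_builtin` are hypothetical names, not part of any library. The choice of scalar types to convert is also an assumption.

```python
import copyreg
import io
import pickle

import numpy as np


def _as_builtin(py_type):
    """Build a reducer that pickles a NumPy scalar as py_type(value)."""
    def reduce_scalar(x):
        # A reducer returns (callable, args); unpickling calls py_type(*args).
        return py_type, (py_type(x),)
    return reduce_scalar


class PlainScalarPickler(pickle.Pickler):
    """Pickler that stores selected NumPy scalars as built-in numbers."""

    # Per-class table, so the global copyreg registry is left untouched.
    dispatch_table = copyreg.dispatch_table.copy()
    dispatch_table[np.float64] = _as_builtin(float)
    dispatch_table[np.float32] = _as_builtin(float)
    dispatch_table[np.int32] = _as_builtin(int)
    dispatch_table[np.int64] = _as_builtin(int)


def dumps_plain(obj):
    """Pickle obj to bytes using PlainScalarPickler."""
    buf = io.BytesIO()
    PlainScalarPickler(buf).dump(obj)
    return buf.getvalue()


value = np.array((1, 0), dtype=np.int64).prod()
restored = pickle.loads(dumps_plain({"n": value, "x": np.float64(0.5)}))
print(type(restored["n"]).__name__, type(restored["x"]).__name__)
```

The resulting pickle refers only to the built-in `int` and `float`, so it can be read by an application without numpy installed. Arrays and other NumPy objects are still pickled the usual way.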

robert

Tim Hochberg wrote:

robert wrote:
To avoid this you'd need a type cast in Python code everywhere you get
scalars from numpy into a Python variable. An error-prone task. Or
you'd have to check/re-render your whole object tree.
Wouldn't it be much better if numpy returned Python scalars for
float64 (maybe even for float32) and int32, int64 ... where possible
(as numarray and Numeric did)?
I suppose numpy knows internally how to cast very quickly.

The short answer is no, it would not be better. There are some trade-offs
involved here, but overall, always returning numpy scalars is a
significant improvement over returning Python scalars some of the time.
Which is why numpy does it that way now; it was a conscious choice, it
didn't just happen. Please search the archives of numpy-discussion for
previous discussions of this, and if that is not enlightening enough,
please ask on the numpy-discussion list (the address of which just
changed and I don't have it handy, but I'm sure you can find it).

I didn't find the relevant reasoning in the time I had. Yet I guess the reason is isolated-module-centric.
All further computations in Python are much slower, and I cannot even see a speed increase when (rare case) putting a numpy scalar back into a numpy array:

>>> a=array([1.,0,0,0,0])
>>> f=1.0
>>> fn=a[0]
>>> type(fn)
>>> timeit.Timer("f+f",glbls=globals()).timeit(10000)
0.0048265910890909324
>>> timeit.Timer("f+f",glbls=globals()).timeit(100000)
0.045992158221226376
>>> timeit.Timer("fn+fn",glbls=globals()).timeit(100000)
0.14901307289054877
>>> timeit.Timer("a[1]=f",glbls=globals()).timeit(100000)
0.060825607723899111
>>> timeit.Timer("a[1]=fn",glbls=globals()).timeit(100000)
0.059519575812004177
>>> timeit.Timer("x=a[0]",glbls=globals()).timeit(100000)
0.12302317752676117
>>> timeit.Timer("x=float(a[0])",glbls=globals()).timeit(100000)
0.31556273213496411

Creation of numpy scalar objects does not seem to be cheap/advantageous anyway:

>>> oa=array([1.0,1.0,1.0,1.0,1],numpy.object)
>>> oa
array([1.0, 1.0, 1.0, 1.0, 1], dtype=object)
>>> timeit.Timer("x=a[0]",glbls=globals()).timeit(100000)
0.12025438987348025
>>> timeit.Timer("x=oa[0]",glbls=globals()).timeit(100000)
0.050609225474090636
>>> timeit.Timer("a+a",glbls=globals()).timeit(100000)
1.3081539692893784
>>> timeit.Timer("oa+oa",glbls=globals()).timeit(100000)
1.5201345422392478

For your particular issue, you might try tweaking pickle to convert
int64 objects to int objects. Assuming of course that you have enough of
these to matter, otherwise, I suggest just leaving things alone.

(I've not had int64 so far; I don't know what happens with Python's L longs.)

The main problem is with hundreds of ordinary, everyday float (now numpy.float64) and int (now numpy.int32) variables.
Speed issues, memory consumption... And a pickled tree cannot be read by an app which does not have numpy available, and the pickles are very big.

I still really wonder how all these observations, and the things I can imagine so far, can sum up to an overall advantage for letting numpy.float64 & numpy.int32 scalars out by default - and possibly not for numpy.float32 either, which is of some importance in practice?
Letting out nan and inf objects and offering an explicit type cast is of course OK.

Robert

Nov 17 '06 #3
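
The timing session in the post above can be repeated on a current setup with something like the sketch below. It is an assumption-laden reconstruction rather than the poster's script: the session passes `glbls=`, which does not appear to be a parameter of the standard `timeit.Timer`, so the sketch uses the `globals=` parameter that `timeit` has offered since Python 3.5. The `namespace` and `statements` names and the extra `a[0].item()` case are additions for illustration. No timings are quoted here because they depend on the machine and on the NumPy version.

```python
import timeit

import numpy as np

a = np.array([1.0, 0.0, 0.0, 0.0, 0.0])

# Names visible to the timed statements: a built-in float and a NumPy scalar.
namespace = {"a": a, "f": 1.0, "fn": a[0]}

statements = [
    "f + f",            # arithmetic on built-in floats
    "fn + fn",          # arithmetic on numpy.float64 scalars
    "a[1] = f",         # storing a built-in float into the array
    "a[1] = fn",        # storing a NumPy scalar into the array
    "x = a[0]",         # reading an element creates a NumPy scalar
    "x = float(a[0])",  # explicit cast, as discussed in the post
    "x = a[0].item()",  # alternative conversion (not in the original post)
]

results = {}
for stmt in statements:
    results[stmt] = timeit.timeit(stmt, globals=namespace, number=100_000)

for stmt, seconds in results.items():
    print(f"{stmt:18s} {seconds:.4f} s")
```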

Robert Kern

robert wrote:

Didn't find the relevant reasoning within time. Yet guess the reason is isolated-module-centric.

I gave you a brief rundown on this list already.

http://mail.python.org/pipermail/pyt...er/411145.html

And I'll note again that a fuller discussion is given in Chapter 2 of the _Guide
to NumPy_.

http://numpy.scipy.org/numpybooksample.pdf

And yet again, the best place for numpy questions is the numpy mailing list, not
here. Here, you will get maybe one or two people responding to you, usually me,
and I'm a cranky SOB. There you will get much nicer people answering your
questions and more fully.

http://www.scipy.org/Mailing_Lists

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Nov 17 '06 #4

robert

Robert Kern wrote:

robert wrote:

Didn't find the relevant reasoning within time. Yet guess the reason is isolated-module-centric.

I gave you a brief rundown on this list already.

http://mail.python.org/pipermail/pyt...er/411145.html

I think I already took this into account for this discussion.

>>> array([1.,.0,4,4,],float32)
array([ 1., 0., 4., 4.], dtype=float32)
>>> a=_
>>> a+3
array([ 4., 3., 7., 7.], dtype=float32)
>>> a+3.0
array([ 4., 3., 7., 7.], dtype=float32)
>>> 3+a
array([ 4., 3., 7., 7.], dtype=float32)
>>> 3.0+a
array([ 4., 3., 7., 7.], dtype=float32)
>>> 3.0*a
array([ 3., 0., 12., 12.], dtype=float32)
numpy does not force the precision upwards for float32 x python-float anyway (which is good).

There remains the argument that (float64, int32) scalars coming out should - by default - support the array interface.
How many people are there who expect and use this? I'd never have noticed it if it hadn't been mentioned here. I have never seen such code, nor seen or felt such a requirement myself. That's very special and maybe could be turned on by a global option - if there are more than a handful of users for it.
I still think this is over-design and that it brings many more disadvantages than advantages to let these beasts out by default into a general-purpose language like Python. The target area for numpy output is much bigger than that of, e.g., a Matlab script someone uses in a rush to create a paper. Maybe for users who want to make an alt-Matlab-only box out of Python there could be a global switch somewhere, or enforcing versions of the data types ...

Seeing the speed impact and pickle problems now everywhere in post-computations on numpy output, it's a critical decision to migrate code to numpy. Almost a killer. Even if I spread float() casts everywhere, this costs a lot of speed, makes code ugly etc., and it's a leaky sieve.

I think I'll stay as a voice to vote heavily against that scheme of numpy scalar types. 11 on a scale from 0 to 10 :-)

And yet again, the best place for numpy questions is the numpy mailing list, not
here. Here, you will get maybe one or two people responding to you, usually me,
and I'm a cranky SOB. There you will get much nicer people answering your
questions and more fully.

http://www.scipy.org/Mailing_Lists

Maybe once I get over the hurdle of using this. Access and searching on such lists is somewhat proprietary. Numerics is a major field in Python land. There are also lots of cross relations to other libs and techniques. Maybe there could be an NNTP-comfortable comp.lang.python.numeric for users - and also a comp.lang.python.net and comp.lang.python.ui. I think that would greatly strengthen Python's "marketing" in the numerics domain. The main clp's posting frequency is too high by now anyway.
Robert

Nov 18 '06 #5
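
The float32 session in the post above can be checked by looking at result dtypes directly. The sketch below assumes a current NumPy; the variable names are made up for illustration. It contrasts a built-in float operand, which leaves a float32 array at float32, with a float64 array operand, which promotes the result.

```python
import numpy as np

a32 = np.array([1.0, 0.0, 4.0, 4.0], dtype=np.float32)

# A built-in Python float does not push a float32 array up to float64.
with_py_float = a32 + 3.0

# A float64 array operand does promote the result.
with_f64_array = a32 + np.array([3.0, 3.0, 3.0, 3.0])

print(with_py_float.dtype)
print(with_f64_array.dtype)
```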

Robert Kern

robert wrote:

There remains the argument that (float64, int32) scalars coming out should - by default - support the array interface.
How many people are there who expect and use this? I'd never have noticed it if it hadn't been mentioned here. I have never seen such code, nor seen or felt such a requirement myself. That's very special and maybe could be turned on by a global option - if there are more than a handful of users for it.

It derived from our experience building scipy. Writing a library of functions
that work on scalars, vectors and higher-dimensional arrays requires either a
certain amount of generic behavior in its types or a lot of hairy code. We went
for the former. "Global options" affecting the behavior of types don't fit very
well in a library.

I still think this is over-design and that it brings many more disadvantages than advantages to let these beasts out by default into a general-purpose language like Python.

It's a judgement call. You judged differently than we have. <shrug>

I think I'll stay as a voice to vote heavily against that scheme of numpy scalar types. 11 on a scale from 0 to 10 :-)

Vote all you like; no one's taking a poll at this time.

[I wrote:]

And yet again, the best place for numpy questions is the numpy mailing list, not
here. Here, you will get maybe one or two people responding to you, usually me,
and I'm a cranky SOB. There you will get much nicer people answering your
questions and more fully.

http://www.scipy.org/Mailing_Lists

Maybe once I get over the hurdle of using this. Access and searching on such lists is somewhat proprietary. Numerics is a major field in Python land. There are also lots of cross relations to other libs and techniques. Maybe there could be an NNTP-comfortable comp.lang.python.numeric for users - and also a comp.lang.python.net and comp.lang.python.ui. I think that would greatly strengthen Python's "marketing" in the numerics domain. The main clp's posting frequency is too high by now anyway.

When you have to put "numpy" in your subject lines because you're asking
questions about how and why numpy does this one particular thing, it's time to
think about posting to the appropriate list. If you need NNTP, use GMane.

http://dir.gmane.org/gmane.comp.python.numeric.general

Here's the root of the problem: many of the people you want to talk to aren't
here. They don't read comp.lang.python; mostly it's just me, and I'm getting
crankier by the character.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Nov 18 '06 #6
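
The "generic behavior in its types" argument made above can be illustrated with a small sketch. It assumes a current NumPy, and `describe` is a hypothetical helper, not a scipy function. The point it tries to show is that a NumPy scalar carries the same attributes and methods as an array, so one code path can serve scalars, vectors and higher-dimensional arrays without type checks.

```python
import numpy as np


def describe(x):
    """Summarize x using only attributes shared by arrays and NumPy scalars."""
    # No isinstance checks: .shape, .dtype and .sum() exist on both kinds.
    return x.shape, x.dtype.name, float(x.sum())


a = np.array([[1.0, 2.0], [3.0, 4.0]])

print(describe(a))        # 2-d array
print(describe(a[0]))     # 1-d row
print(describe(a[0, 0]))  # NumPy scalar, treated like a 0-d array
```

A built-in `float` has no `.shape` or `.dtype`, so `describe(1.0)` would raise `AttributeError`; that is the kind of special casing the reply above describes as "hairy code".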
