473,847 Members | 1,771 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Lazy evaluation: overloading the assignment operator?


Python allows the binding behaviour to be defined for descriptors,
using the __set__ and __get__ methods. I think it would be a major
advantage if this could be generalized to any object, by allowing the
assignment operator (=) to be overloaded.

One particular use for this would be to implement "lazy evaluation".
For example it would allow us to get rid of all the temporary arrays
produced by NumPy.

For example, consider the expression:

y = a * b + c * d

If this expression is evaluated bya Fortran 90/95 compiler, it will
automatically generate code like

do i = 1,n
y(i) = a(i) * b(i) + c(i) * d(i)
enddo

On the other hand, conventional use of overloaded binary operators
would result in something like this:

allocate(tmp1,n )
do i = 1,n
tmp1(i) = a(i) * b(i)
enddo
allocate(tmp2,n )
do i = 1,n
tmp2(i) = c(i) * d(i)
enddo
allocate(tmp3,n )
do i = 1,n
tmp3(i) = tmp1(i) + tmp2(i)
enddo
deallocate(tmp1 )
deallocate(tmp2 )
do i = 1,n
y(i) = tmp3(i)
enddo
deallocate(tmp3 )

Traversing memory is one of the most expensive thing a CPU can do.
This approach is therefore extremely inefficient compared with what a
Fortran compiler can do.

However, if we could overload the assignment operator, there would be
an efficient solution to this problem. Instead of constructing
temporary temporary arrays, one could replace those with objects
containing lazy expressions "to be evaluated sometime in the future".
A statement like

y = a * b + c * d

would then result in something like this:

tmp1 = LazyExpr('__mul __',a,b) # symbolic representation of "a * b"
tmp2 = LazyExpr('__mul __',c,d) # symbolic representation of "c * d"
tmp3 = LazyExpr('__add __',tmp1,tmp1) # symbolic "a * b + c * d"
del tmp1
del tmp2
y = tmp3 # tmp3 gets evaluated as assignment is overloaded
Should there be a PEP to overload the assignment operator? In terms of
syntax, it would not be any worse than the current descriptor objects
- but it would make lazy evaluation idioms a lot easier to implement.

Sturla Molden

May 2 '07 #1
9 3523
sturlamolden wrote:
Python allows the binding behaviour to be defined for descriptors,
using the __set__ and __get__ methods.
AFAIK, __getattribute_ _ calls them *explicitly*.
I think it would be a major
advantage if this could be generalized to any object, by allowing the
assignment operator (=) to be overloaded.
>
One particular use for this would be to implement "lazy evaluation".
For example it would allow us to get rid of all the temporary arrays
produced by NumPy.

For example, consider the expression:
[snip]
>
y = a * b + c * d

would then result in something like this:

tmp1 = LazyExpr('__mul __',a,b) # symbolic representation of "a * b"
tmp2 = LazyExpr('__mul __',c,d) # symbolic representation of "c * d"
tmp3 = LazyExpr('__add __',tmp1,tmp1) # symbolic "a * b + c * d"
del tmp1
del tmp2
y = tmp3 # tmp3 gets evaluated as assignment is overloaded
To allow lazy evaluation, you need overloading of the assignment
operator? Where should you overload it? y is less than None when you do
that assignment. I don't really see the need for overloading here.
Following the binding rules, __mul__ would (even without any hackery) be
evaluated before __add__.
>
Should there be a PEP to overload the assignment operator?
If -- after this discussion -- community seems to like this feature, you
could try to come up with some patch and a PEP. But not yet.
In terms of
syntax, it would not be any worse than the current descriptor objects
- but it would make lazy evaluation idioms a lot easier to implement.
--
Stargaming
May 2 '07 #2
sturlamolden schrieb:
Python allows the binding behaviour to be defined for descriptors,
using the __set__ and __get__ methods. I think it would be a major
advantage if this could be generalized to any object, by allowing the
assignment operator (=) to be overloaded.

One particular use for this would be to implement "lazy evaluation".
For example it would allow us to get rid of all the temporary arrays
produced by NumPy.

For example, consider the expression:

y = a * b + c * d

If this expression is evaluated bya Fortran 90/95 compiler, it will
automatically generate code like

do i = 1,n
y(i) = a(i) * b(i) + c(i) * d(i)
enddo

On the other hand, conventional use of overloaded binary operators
would result in something like this:

allocate(tmp1,n )
do i = 1,n
tmp1(i) = a(i) * b(i)
enddo
allocate(tmp2,n )
do i = 1,n
tmp2(i) = c(i) * d(i)
enddo
allocate(tmp3,n )
do i = 1,n
tmp3(i) = tmp1(i) + tmp2(i)
enddo
deallocate(tmp1 )
deallocate(tmp2 )
do i = 1,n
y(i) = tmp3(i)
enddo
deallocate(tmp3 )

Traversing memory is one of the most expensive thing a CPU can do.
This approach is therefore extremely inefficient compared with what a
Fortran compiler can do.
I fail to see where laziness has anything to do with this. In C++, this
problem can be remedied with the so called temporary base class idiom.

But this has nothing to do with laziness, which does not reduce the
amount of code to execute, but instead defers the point of execution of
that code.

And AFAIK the general overhead of laziness versus eager evaluation does
not pay off - haskell is a tad slower than e.g. an ML dialect AFAIK.

Diez
May 2 '07 #3
On May 2, 9:46 pm, Stargaming <stargam...@gma il.comwrote:
del tmp2
y = tmp3 # tmp3 gets evaluated as assignment is overloaded

To allow lazy evaluation, you need overloading of the assignment
operator?
No I don't. I could for example delay evaluation until some data from
y is requested, say when

x = y[5]

is executed. However, that can allow mutual dependencies between
unevaluated expressions. These can be complicated to resolve, and is
an issue similar to to that of cyclic dependencies in reference
counting. With an overloaded assignment operator, we would avoid this
as assignments are natural places to flush lazy evaluations. No
dangling unevaluated expression would be produced, and thus there
would be no strange bugs of this sort.

Where should you overload it? y is less than None when you do
that assignment.
In this case, I am not suggesting overloading

y =

but rather overloading

= tmp3

That is, when a variable is bound to an object, a method is called
(e.g. __get__) and the variable gets the return value output from that
function instead. It is analogous to the __get__ method of
descriptors. COnsider happens when we call

a = someObject.some Property

and someProperty has a __get__ method. Even if a is less than None, we
still get a call to __get__.

On the other hand, if y had been bound to a value before hand, it
would be meaningful to call a method called __set__ on y when

y = tmp3

is executed. Just like

someObject.some Property = value

would call __set__ on someProperty if it had one. Obviously if
someObject.some Property had been unbound, there would have been no
call to __set__.

So I am suggesting generalising __set__ and __get__ to overload the
assignment operator.

This would be an example:
class Foo(object):

def __init__(self):
self.value = None

def __set__(self,va lue):
''' overloads bar = value after bar = Foo()'''
self.value = value

def __get__(self):
''' overloads obj = bar after bar = Foo()'''
return self.value
So it is just a generalization of the already existing descriptors. It
even makes descriptors and properties easier to understand.

I don't really see the need for overloading here.
Following the binding rules, __mul__ would (even without any hackery) be
evaluated before __add__.
Yes, but at the cost of generating several temporary arrays and
looping over the same memory several times. If the assignment operator
could be overloaded, we could avoid the temporary objects and only
have one loop. For numerical code, this can mean speed ups in the
order of several magnitudes.

Sturla Molden




>
Should there be a PEP to overload the assignment operator?

If -- after this discussion -- community seems to like this feature, you
could try to come up with some patch and a PEP. But not yet.
In terms of
syntax, it would not be any worse than the current descriptor objects
- but it would make lazy evaluation idioms a lot easier to implement.

--
Stargaming

May 2 '07 #4
On May 2, 11:08 pm, "Diez B. Roggisch" <d...@nospam.we b.dewrote:
And AFAIK the general overhead of laziness versus eager evaluation does
not pay off - haskell is a tad slower than e.g. an ML dialect AFAIK.
In the numerical Python community there is already a prototype
compiler called 'numexpr' which can provide efficient evaluation of
expressions like y = a*b + c*d. But as long as there is no way of
overloading an assignment, it cannot be seamlessly integrated in an
array framework. One will e.g. have to type up Python expressions as
strings and calling eval() on the string instead of working directly
with Python expressions.

In numerical work we all know how Fortran compares with C++. Fortran
knows about arrays and can generate efficient code. C++ doesn't and
have to resort to temporaries returned from overloaded operators. The
only case where C++ can compare to Fortran is libraries like Blitz++,
where for small fixes-sized arrays the temporary objects and loops can
be removed using template meta-programming and optimizing compilers.
NumPy has to generate a lot of temporary arrays and traverse memory
more than necessary. This is a tremendous slow down when arrays are
too large to fit in the CPU cache. Numexpr deals with this, but Python
cannot integrate it seamlessly. I think it is really a matter of what
you are trying to do. Some times lazy evaluation pays off, some times
it doesn't.

But overloaded assignment operators have more use than lazy
evaluation. It can be used and abused in numerous ways. For example
one can have classes where every assignment results in the creation of
a copy, which may seem to totally change the semantics of Python code
(except that it doesn't, it's just an illusion).
Sturla Molden

May 2 '07 #5

"sturlamold en" <st**********@y ahoo.nowrote in message
news:11******** **************@ n76g2000hsh.goo glegroups.com.. .
|
| Python allows the binding behaviour to be defined for descriptors,
| using the __set__ and __get__ methods. I think it would be a major
| advantage if this could be generalized to any object, by allowing the
| assignment operator (=) to be overloaded.

Conceptually, assignment is *not* an operator. Binary operators take two
values (of the same type) and produce a third (usually of either the input
type or a boolean) that usually depends on both inputs. Assignment does
nothing of the sort.

In Python, the equal sign is *not* an operator: it is a grammatical symbol.
One use of the operator fiction in C is to enable simple chained
assignment: a=b=2. Python does this directly without the fiction. C's
store-and-test usage can be implemented in Python with a 'purse' class.

| One particular use for this would be to implement "lazy evaluation".

Since (in Python, at least) operands are evaluated *before* the
operator/function is called, I do not see how.

| Should there be a PEP to overload the assignment operator?

You mean a PEP to make assignment an (pseudo)operati on and hence
overloadable (as all operators are). That has been proposed and rejected
before, more than once, but I don't have a reference handy.

Terry Jan Reedy

May 3 '07 #6
Diez B. Roggisch wrote:
I fail to see where laziness has anything to do with this.
In C++, this problem can be remedied with the so called
temporary base class idiom.
I have seen this referred to as lazy evaluation in C++,
so I suspect that Diez and Sturia are using "Lazy evaluation"
in different contexts with different meaning.
But this has nothing to do with laziness, which does not
reduce the amount of code to execute, but instead defers the
point of execution of that code.
But that is precisely what Sturia is suggesting, defer
(for a few nanoseconds) the evaluation of the multiplications
and addition until the assignment occurs. Admittedly a big
difference to the lazy evaluation implied by python's yield
statement, but still a version of lazy evaluation and (at
least sometimes) referred to as such in a C++ context.

I am a python newbie (about one month) but I think
some of what Sturia wants could be achieved by partially
following what is usually done in C++ to achieve what he
wants. It would involve a replacement array class (possibly
derived from NumPy's arrays) and a proxy class.

+ Addition, multiplication, etc of arrays and proxy
arrays does not return the result array, but returns
a proxy which stores the arguments and the
operation.

+ Array indexing of the proxy objects results in
the indexing methods of the arguments being
called and the operation being carried out and
returned. In C++ this is normally very efficient
as the operations are all defined inline and
expanded by the compiler.

+ If necessary, define an additional method to evaluate
the entire array and return it.

I think this would allow code like (if the new array type is
XArray)

a = Xarray(...)
b = Xarray(...)
c = Xarray(...)
d = Xarray(...)

y = a*b+c*d # Returns a proxy object

x = y[4] # Computes x = a[4]*b[4] + c[4]*d[4]

v = y.eval() # Evaluates all elements, returning Xarray

z = ((a+b)*(c+d)).e val() # Also evaluates all elements

Whether it would be any faster is doubtful, but it would eliminate
the temporaries.

Charles
May 3 '07 #7
On 2007-05-03, Terry Reedy <tj*****@udel.e duwrote:
>
"sturlamold en" <st**********@y ahoo.nowrote in message
news:11******** **************@ n76g2000hsh.goo glegroups.com.. .
|
| Python allows the binding behaviour to be defined for descriptors,
| using the __set__ and __get__ methods. I think it would be a major
| advantage if this could be generalized to any object, by allowing the
| assignment operator (=) to be overloaded.

Conceptually, assignment is *not* an operator. Binary operators take two
values (of the same type) and produce a third (usually of either the input
type or a boolean) that usually depends on both inputs. Assignment does
nothing of the sort.

In Python, the equal sign is *not* an operator: it is a grammatical symbol.
One use of the operator fiction in C is to enable simple chained
assignment: a=b=2. Python does this directly without the fiction. C's
store-and-test usage can be implemented in Python with a 'purse' class.

| One particular use for this would be to implement "lazy evaluation".

Since (in Python, at least) operands are evaluated *before* the
operator/function is called, I do not see how.
But they could evaluate to an expression tree instead of the actual
result. This tree could then be evaluate at the moment of assignment.

This is an idea I have been playing with myself in an other context.
You have a class of symbolic names. e.g. First, Last ... You can use
the normal operators to these names, the result will be an expression
tree. So Last - 2 will evaluate to something like

sub
/ \
Last 2

I want to use this in the context of a table (a list like structure
but with arbitrary start index, which can be negative, so tab[-1]
can't refer to the last element).

So I can use this as follows:

el = tab[Last - 2]

to access the element two places before the last, because the evaluation
of the tree happens in the __getitem__ method.

I could even write something like:

el = tab[(First + Last) / 2]

To get at the midle element.

--
Antoon Pardon
May 3 '07 #8
On May 3, 6:22 am, Charles Sanders <C.delete_this. Sand...@BoM.GOv .AU>
wrote:
y = a*b+c*d # Returns a proxy object

x = y[4] # Computes x = a[4]*b[4] + c[4]*d[4]

v = y.eval() # Evaluates all elements, returning Xarray

z = ((a+b)*(c+d)).e val() # Also evaluates all elements
When I suggested this on the NumPy mailing list, I too suggested using
the indexing operator to trigger the computations. But I am worried
that if an expression like

y = a*b+c*d

returns a proxy, it is somehow possible to mess things up by creating
cyclically dependent proxies. I may be wrong about this, in which case
__getitem__ et al. will do the job.

Whether it would be any faster is doubtful, but it would eliminate
the temporaries.
The Numexpr compiler in SciPy suggests that it can. It parses an
expression like 'y = a*b+c*d' and evaluates it. Numexpr is only a
slow prototype written in pure Python, but still it can sometimes give
dramatical speed-ups. Here we do not even need all the machinery of
Numexpr, as Python creates the parse tree on the fly.

Inefficiency of binary operators that return temporary arrays is
mainly an issue when the arrays in the expression is too large to fit
in cache. RAM access can be very expensive, but cache access is
usually quite cheap. One also avoids unnecessary allocation and
deallocation of buffers to hold temporary arrays. Again, it is mainly
an issue when arrays are large, as malloc and free can be rather
efficient for small objects.


May 3 '07 #9
On 2007-05-03, sturlamolden <st**********@y ahoo.nowrote:
On May 3, 6:22 am, Charles Sanders <C.delete_this. Sand...@BoM.GOv .AU>
wrote:
> y = a*b+c*d # Returns a proxy object

x = y[4] # Computes x = a[4]*b[4] + c[4]*d[4]

v = y.eval() # Evaluates all elements, returning Xarray

z = ((a+b)*(c+d)).e val() # Also evaluates all elements

When I suggested this on the NumPy mailing list, I too suggested using
the indexing operator to trigger the computations. But I am worried
that if an expression like

y = a*b+c*d

returns a proxy, it is somehow possible to mess things up by creating
cyclically dependent proxies. I may be wrong about this, in which case
__getitem__ et al. will do the job.
How do you expect to handle the following kind of situation:

while <condition>:
x = y
a = ...
b = ...
y = a * x + b

--
Antoon Pardon
May 3 '07 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

25
2401
by: Steven Bethard | last post by:
So I end up writing code like this a fair bit: map = {} for key, value in sequence: map.setdefault(key, ).append(value) This code basically constructs a one-to-many mapping -- each value that a key occurs with is stored in the list for that key. This code's fine, and seems pretty simple, but thanks to generator
3
1617
by: jim.brown | last post by:
The attached code implements a test class to show an error in overloading operator=. This code works on Windows with Visual Studio and simpler cases work with gcc 3.3.2 on Solaris 9. On Windows, operator= gets called twice and I'm not sure whay that is. This fails on Solaris with: In function 'int main(int, char**)' error: no match for 'operator=' in 't2 = TestCL::getTestCL()()'
77
4779
by: berns | last post by:
Hi All, A coworker and I have been debating the 'correct' expectation of evaluation for the phrase a = b = c. Two different versions of GCC ended up compiling this as b = c; a = b and the other ended up compiling it as a = c; b = c. (Both with no optimizations enabled). How did we notice you may ask? Well, in our case 'b' was a memory mapped register that only has a select number of writable bits. He claims it has been a 'C...
12
8098
by: Achim Domma | last post by:
Hi, I want to use Python to script some formulas in my application. The user should be able to write something like A = B * C where A,B,C are instances of some wrapper classes. Overloading * is no problem but I cannot overload the assignment of A. I understand that this is due to the nature of Python, but is there a trick to work around
22
3635
by: clicwar | last post by:
A simple program with operator overloading and copy constructor: #include <iostream> #include <string> using namespace std; class Vector { private: float x,y; public: Vector(float u, float v);
3
1895
by: Philipp Brune | last post by:
Hi all, i just played around a little with the c# operator overloading features and an idea came to my mind. You all possibly know the Nullable<T> Datatype. Now i thought it would be a nice idea to write my own Revertable<Ttype that maintains the values of multiple assignments and can revert to a previous value. An Example to make it clear : Revertable<intir0 = 123; ir0 = 500;
8
2979
by: Wayne Shu | last post by:
Hi everyone, I am reading B.S. 's TC++PL (special edition). When I read chapter 11 Operator Overloading, I have two questions. 1. In subsection 11.2.2 paragraph 1, B.S. wrote "In particular, operator =, operator, operator(), and operator-must be nonstatic member function; this ensures that their first operands will be lvalues". I know that these operators must be nonstatic member functions, but why this ensure their first operands will...
39
5094
by: Boltar | last post by:
Why does C do lazy evaluation for logical boolean operations but not bitwise ones? Ie: the following program prints "1 2" , not "1 1" under gcc main() { int a = 1; int b = 1; 0 && ++a;
6
3938
by: Peng Yu | last post by:
Hi, I'm wondering if the following assignment is lazy copy or not? Thanks, Peng std::vector<intv. v.push_back(1); v.push_back(2);
0
9730
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10983
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
1
10707
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
10341
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
1
7883
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5719
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
1
4529
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
4119
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
3164
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.