
New-style classes, iter and PySequence_Check

Someone pasted the original version of the following code snippet on
#python today. I started investigating why the new-style class didn't
work as expected, and found that at least some instances of new-style
classes apparently don't return true for PyInstance_Check, which causes
a problem in PySequence_Check, since it will only do an attribute lookup
for instances.

Things probably shouldn't be this way. Should I go to python-dev with this?

Demonstration snippet:

args = {'a': 0}

class Args(object):
    def __getattr__(self, attr):
        print "__getattr__:", attr
        return getattr(args, attr)

class ClassicArgs:
    def __getattr__(self, attr):
        print "__getattr__:", attr
        return getattr(args, attr)

if __name__ == '__main__':
    c = ClassicArgs()
    i = c.__iter__()
    print i
    i = iter(c)
    print i

    a = Args()
    i = a.__iter__()
    print i
    i = iter(a)
    print i
Jul 19 '05 #1
4 Replies


Tuure Laurinolli wrote:
Someone pasted the original version of the following code snippet on
#python today. I started investigating why the new-style class didn't
work as expected, and found that at least some instances of new-style
classes apparently don't return true for PyInstance_Check, which causes
a problem in PySequence_Check, since it will only do an attribute lookup
for instances.

Things probably shouldn't be this way. Should I go to python-dev with this?

Demonstration snippet:


For anyone who's curious, here's what the code actually does:

py> args={'a':0}
py> class Args(object):
...     def __getattr__(self,attr):
...         print "__getattr__:", attr
...         return getattr(args,attr)
...
py> class ClassicArgs:
...     def __getattr__(self, attr):
...         print "__getattr__:", attr
...         return getattr(args, attr)
...
py> c = ClassicArgs()
py> i = c.__iter__()
__getattr__: __iter__
py> print i
<dictionary-keyiterator object at 0x0115D920>
py> i = iter(c)
__getattr__: __iter__
py> print i
<dictionary-keyiterator object at 0x01163CA0>
py> a = Args()
py> i = a.__iter__()
__getattr__: __iter__
py> print i
<dictionary-keyiterator object at 0x01163D20>
py> i = iter(a)
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
  File "D:\Steve\My Programming\pythonmodules\sitecustomize.py", line 37, in iter
    return orig(*args)
TypeError: iteration over non-sequence

STeVe
Jul 19 '05 #2

[Tuure Laurinolli]
Someone pasted the original version of the following code snippet on
#python today. I started investigating why the new-style class didn't
work as expected, and found that at least some instances of new-style
classes apparently don't return true for PyInstance_Check, which causes
a problem in PySequence_Check, since it will only do an attribute lookup
for instances.

Things probably shouldn't be this way. Should I go to python-dev with this?


PyInstance_Check is defined as checking for instances of old style classes.
Hence, it is appropriate that it does not return True for instances of new-style
classes.

PySequence_Check is in pretty good shape. In general, it is not always possible to
tell a mapping from a sequence, but the function does try to take advantage of
all available information. In particular, it has a separate code path for
new-style instances which sometimes have additional information (i.e. dict and
its subclasses fill the tp_as_mapping->mp_subscript slot instead of the
tp_as_sequence->sq_slice slot).

Old style classes never inherit from dict, tuple, list, etc. so that slot
distinction isn't meaningful. Accordingly, PySequence_Check simply checks to
see if an object defines __getitem__ and that check entails an attribute lookup.
The posted code snippet uses a __getattr__ gimmick to supply an affirmative
answer to that check.

The docs do not make guarantees about how iter(o) will determine whether object
o is a sequence. Currently, it does the best it can and that entails having
different implementation logic for new-style and old-style objects as described
above. The provided snippet uses the __getattr__ gimmick to reveal the
implementation specific behavior. Mystery solved.

In contrast, the docs do make guarantees about explicit attribute lookup (i.e.
that an attribute lookup will actually occur and that slot checking logic will
not be substituted). The posted snippet reveals that these guarantees are
intact for both new and old-style classes.
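
The contrast between explicit attribute lookup and iter() can still be reproduced today. The following is only a rough sketch in Python 3, where every class is new-style, so it shows the Args half of the posted snippet and cannot show the classic-class half. The names Args and args are borrowed from the snippet; the variables explicit_keys and iter_worked are made up for illustration.

```python
# Python 3 sketch; Args and args mirror the names in the posted snippet.
args = {"a": 0}


class Args:
    def __getattr__(self, attr):
        # Called only when normal attribute lookup fails.
        return getattr(args, attr)


a = Args()

# Explicit attribute lookup falls through to __getattr__ and finds the
# dict's bound __iter__ method.
explicit_keys = list(a.__iter__())

# iter() inspects the type instead, so __getattr__ is never consulted.
try:
    iter(a)
    iter_worked = True
except TypeError:
    iter_worked = False

print(explicit_keys, iter_worked)
```

Under these assumptions the explicit call succeeds and iter(a) raises TypeError, which matches the new-style behavior shown earlier in the thread.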

Knowing all of this reveals the behavior to be a feature rather than a bug --
the feature being that operator.isMappingType and operator.isSequenceType do a
remarkable job (few false positives and no false negatives) across a huge
variety of object types. They have been tested with instances of set, int, float,
complex, long, bool, str, unicode, UserString, list, UserList, tuple, deque,
NoneType, NotImplemented, and new and old-style instances with and without
__getitem__ defined. Subclasses of the above types were also tested. The false
positives are limited to user-defined classes (new or old-style) defining
__getitem__. Since the mapping API overlaps with the sequence API, it is not
always possible to distinguish the two.

Of course, you found that it is possible to throw a monkey wrench into the process
by using __getattr__ to exploit undefined, implementation-specific behavior. Don't
do that. If a class needs to be iterable, then supply an __iter__ method. If
that method needs to be retargeted dynamically, then take advantage of the
guarantee that iter(o) will always find a defined __iter__ method. Perhaps
define __iter__ as lambda self: self.__getattr__('__iter__')() or somesuch. IOW,
to guarantee that iter() performs an __iter__ lookup, fill the slot with
something that does that lookup.
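
One way to read that advice is sketched below, in Python 3. The Forwarder class and its _target attribute are hypothetical and not from the thread; the only idea taken from the post is that a real __iter__ method is defined on the class and performs the dynamic lookup itself.

```python
class Forwarder:
    """Hypothetical wrapper that forwards unknown attributes to a target."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, attr):
        # Reached only for attributes not found the normal way.
        return getattr(self._target, attr)

    def __iter__(self):
        # A real method on the class, so iter() always finds it; it then
        # performs the dynamic lookup explicitly.
        return self.__getattr__("__iter__")()


f = Forwarder({"a": 0})
first = list(iter(f))

# Retarget dynamically: iter() follows because __iter__ looks it up each time.
f._target = [10, 20]
second = list(iter(f))

print(first, second)
```

Because the lookup happens inside __iter__, changing _target changes what iter(f) returns without touching the class.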
Raymond Hettinger
E = mc**2 # Einstein
E = IR # Ohm's law
therefore
IR = mc**2 # Raymond's grand unified theory
Jul 19 '05 #3

On Thu, 14 Apr 2005 15:18:20 -0600, Steven Bethard <st************@gmail.com> wrote:
Tuure Laurinolli wrote:
Someone pasted the original version of the following code snippet on
#python today. I started investigating why the new-style class didn't
work as expected, and found that at least some instances of new-style
classes apparently don't return true for PyInstance_Check, which causes
a problem in PySequence_Check, since it will only do an attribute lookup
for instances.

Things probably shouldn't be this way. Should I go to python-dev with this?

Demonstration snippet:


For anyone who's curious, here's what the code actually does:

py> args={'a':0}
py> class Args(object):
...     def __getattr__(self,attr):
...         print "__getattr__:", attr
...         return getattr(args,attr)
...
py> class ClassicArgs:
...     def __getattr__(self, attr):
...         print "__getattr__:", attr
...         return getattr(args, attr)
...
py> c = ClassicArgs()
py> i = c.__iter__()
__getattr__: __iter__
py> print i
<dictionary-keyiterator object at 0x0115D920>
py> i = iter(c)
__getattr__: __iter__
py> print i
<dictionary-keyiterator object at 0x01163CA0>
py> a = Args()
py> i = a.__iter__()
__getattr__: __iter__
py> print i
<dictionary-keyiterator object at 0x01163D20>
py> i = iter(a)
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
  File "D:\Steve\My Programming\pythonmodules\sitecustomize.py", line 37, in iter
    return orig(*args)
TypeError: iteration over non-sequence

I think this is a known thing about the way several builtin functions like iter work.
I'm not sure, but I think it may be expedient optimization practicality beating purity.
IOW, I think maybe iter(a) skips the instance logic altogether and goes right to
type(a).__iter__(a) instead of looking for something on the instance shadowing __iter__.
And I suspect iter(a) does its own internal mro chase for __iter__, bypassing even a
__getattribute__ in a metaclass of Args, as it appears if you try to monitor that way. E.g.,
>>> class Show(object):
...     class __metaclass__(type):
...         def __getattribute__(cls, attr):
...             print 'Show.__getattribute__:', attr
...             return type.__getattribute__(cls, attr)
...     def __getattribute__(self, attr):
...         print 'self.__getattribute__:', attr
...         return object.__getattribute__(self, attr)
...
>>> show = Show()
>>> Show.__module__
Show.__getattribute__: __module__
'__main__'
>>> show.__module__
self.__getattribute__: __module__
'__main__'
>>> show.__iter__
self.__getattribute__: __iter__
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 8, in __getattribute__
AttributeError: 'Show' object has no attribute '__iter__'
>>> Show.__iter__
Show.__getattribute__: __iter__
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 5, in __getattribute__
AttributeError: type object 'Show' has no attribute '__iter__'

But no interception this way:
>>> iter(show)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: iteration over non-sequence
Naturally __getattr__ gets bypassed too, if there's no instance attribute lookup even tried.
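
The same effect can be seen without __getattr__ or a metaclass at all. This is a small Python 3 sketch (the thread itself used Python 2); the class name Plain and the variable names are invented for the example.

```python
class Plain:
    pass


p = Plain()

# An __iter__ stored on the instance is visible to explicit attribute lookup...
p.__iter__ = lambda: iter([1, 2, 3])
via_attribute = list(p.__iter__())

# ...but iter() looks only at the type, so it does not see it.
try:
    iter(p)
    instance_hook_used = True
except TypeError:
    instance_hook_used = False

# Defining __iter__ on the class is what iter() actually honors.
Plain.__iter__ = lambda self: iter([4, 5, 6])
via_iter = list(iter(p))

print(via_attribute, instance_hook_used, via_iter)
```

In this sketch the instance attribute never influences iter(p); only the class-level definition does.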

If we actually supply an __iter__ method for iter(a) to find as type(a).__iter__, we can see:
>>> class Args(object):
...     def __getattr__(self, attr):
...         print "__getattr__:", attr
...         return getattr(args, attr)
...     def __iter__(self):
...         print "__iter__"
...         return iter('Some kind of iterator')
...
>>> a = Args()
>>> a.__iter__
<bound method Args.__iter__ of <__main__.Args object at 0x02EF8BEC>>
>>> a.__iter__()
__iter__
<iterator object at 0x02EF888C>
>>> iter(a)
__iter__
<iterator object at 0x02EF8CAC>
>>> type(a).__iter__
<unbound method Args.__iter__>
>>> type(a).__iter__(a)
__iter__
<iterator object at 0x02EF8CAC>

Now if we get rid of the __iter__ method, __getattr__ can come into play again:
>>> del Args.__iter__
>>> a.__iter__
__getattr__: __iter__
<method-wrapper object at 0x02EF8C0C>
>>> a.__iter__()
__getattr__: __iter__
<dictionary-keyiterator object at 0x02EF8CE0>
>>> type(a).__iter__
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: type object 'Args' has no attribute '__iter__'
>>> iter(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: iteration over non-sequence

It's kind of like iter(a) assuming that type(a).__iter__ is a data descriptor
delivering a bound method instead of an ordinary method, and therefore it can
assume that the data descriptor always trumps instance attribute lookup, and
it can optimize with that assumption. I haven't walked the relevant iter() code.
That would be too easy ;-)

Regards,
Bengt Richter
Jul 19 '05 #4

[Bengt Richter]
I'm not sure, but I think it may be expedient optimization practicality beating purity.

It is more a matter of coping with the limitations of the legacy API, where we
want to wrap a sequence iterator around __getitem__ in sequences but not in
mappings.
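
As an illustration of that legacy protocol, here is a Python 3 sketch with two hypothetical classes, Countdown and Lookup, that define only __getitem__. It is meant to show why the sequence/mapping distinction matters, not how the interpreter makes it.

```python
class Countdown:
    """Sequence-like: integer indexes, IndexError past the end."""

    def __init__(self, start):
        self.start = start

    def __getitem__(self, index):
        if index < 0 or index >= self.start:
            raise IndexError(index)
        return self.start - index


class Lookup:
    """Mapping-like: string keys only, but no __iter__ of its own."""

    def __getitem__(self, key):
        return {"x": 1}[key]


# iter() wraps a sequence iterator around __getitem__, calling it with 0, 1, 2...
counted = list(iter(Countdown(3)))

# A real dict supplies its own __iter__, so the wrapper is never used for it.
dict_keys = list(iter({"x": 1}))

# The mapping-like class is a false positive: iter() accepts it, but the
# first step asks for key 0 and fails.
lookup_iter = iter(Lookup())
try:
    next(lookup_iter)
    key_error_seen = False
except KeyError:
    key_error_seen = True

print(counted, dict_keys, key_error_seen)
```

The Lookup case is the kind of user-defined class with __getitem__ for which a sequence and a mapping cannot be told apart from the outside.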
<snipped code experiments>
I haven't walked the relevant iter() code.
That would be too easy ;-)


The actual code for Objects/abstract.c's PyObject_GetIter() is somewhat more
straightforward and less mysterious than those experiments suggest.
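
For readers who do not want to open the C source, the general shape of that function can be approximated in Python. The sketch below is an assumption-laden paraphrase written for Python 3, not a translation of the C code: the helper names lookup_on_type, sequence_iterator and get_iter_sketch are invented, the real function checks C-level type slots rather than class dictionaries, and the error messages differ.

```python
def lookup_on_type(obj, name):
    """Find name in the class dictionaries along the MRO, ignoring the instance."""
    for klass in type(obj).__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    return None


def sequence_iterator(obj):
    """Yield obj[0], obj[1], ... until IndexError, like the legacy protocol."""
    index = 0
    while True:
        try:
            item = obj[index]
        except IndexError:
            return
        yield item
        index += 1


def get_iter_sketch(obj):
    """Approximate the decision order of iter(obj); a sketch, not the real code."""
    # Step 1: an __iter__ defined on the type wins.
    iter_method = lookup_on_type(obj, "__iter__")
    if iter_method is not None:
        result = iter_method.__get__(obj, type(obj))()
        if lookup_on_type(result, "__next__") is None:
            raise TypeError("iter() returned a non-iterator")
        return result
    # Step 2: otherwise fall back to __getitem__ defined on the type.
    if lookup_on_type(obj, "__getitem__") is not None:
        return sequence_iterator(obj)
    # Step 3: nothing usable; __getattr__ on the instance is never asked.
    raise TypeError("object is not iterable")


class Squares:
    """Hypothetical class with only __getitem__."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * index


class OnlyGetattr:
    """Hypothetical class that answers every missing attribute via a dict."""

    def __getattr__(self, attr):
        return getattr({"a": 0}, attr)


print(list(get_iter_sketch([1, 2, 3])))
print(list(get_iter_sketch(Squares())))
try:
    get_iter_sketch(OnlyGetattr())
except TypeError as exc:
    print("TypeError:", exc)
```

The point of the sketch is the order of the checks: type-level __iter__ first, then the __getitem__ fallback, with no ordinary instance attribute lookup anywhere in the path.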

Raymond
Jul 19 '05 #5
