flatten() time trial mystery. (or, 101 ways to flatten a nested list using generators)

Francis Avila

A few days ago (see the 'itertools.flatten()?' thread from October 28) I
became obsessed with refactoring a recursive generator that yielded the
leaves of nested iterables. When the dust settled, I had many flatten
functions at hand.

So I had to time them. Results below.

History of the functions (from flattrial.py):

# There are three basic features:
# 1. Can specify a function that determines iterability.
# 2. Can specify class-iterability.
# 3. Can modify sequence before iterating.

##Function:            Features Supported : Author
##
##flatten_fa:          1                  : Francis Avila
##flatten_po:          1,2,3              : Peter Otten
##flatten_po2:         None               : Alex Martelli channeling Peter Otten
##flatten_am:          None               : Alex Martelli
##flatten_dict:        2                  : Peter Otten
##flatten_fastcond:    1,2                : Francis Avila
##flatten_itercond:    1,2,3              : Francis Avila
##flatten_dictcond:    1,2,3              : Francis Avila
##flatten_dictdef:     1,2,3              : Francis Avila
##flatten_trydictdef:  1,2,3              : Francis Avila
##flatten_fastdictdef: 1,2,3              : Francis Avila

Tree test flattens tree:
subtree = ['foo']*18 + [1,2,3]*6
tree = [ subtree*10, [ subtree * 8 ] ]

Node test flattens nodetree:
class Node(object):
    def __init__(self, label=None, data=None, children=()):
        self.children = children
        self.label = label
        self.data = data
    def __iter__(self):
        return iter(self.children)

leaves = [Node(chr(i+65),i) for i in range(10)]
branches = [Node(chr(i+65), i, leaves) for i in range(10,30)]
nodetree = [Node(chr(i+65), i, branches) for i in range(30,50)]

Results (Python 2.2):

C:\Docs>python flattrial.py
Tree Tests (100 reps):
02.113544 fa
03.129147 po
02.845054 po2
02.587387 am
00.643718 dict
00.648371 fastcond
00.724689 itercond
00.791277 dictcond
01.006224 dictdef
00.833452 trydictdef
00.776937 fastdictdef
Node Tests (10 reps):
02.877818 po
02.633231 itercond
00.878554 dictcond
01.040838 dictdef
00.897504 trydictdef
00.864411 fastdictdef

I'd post flattrial.py, but it's about 500 lines and I don't have any web
space to put it up. Besides, I'm not sure anyone is interested. :)

A mystery, though. I did not expect dictdef (my magnum opus) to be as slow
as it was, so I investigated. I started with the obvious: using dict.get
is quite a bit slower than using try: x = dict[key]; except KeyError: x =
default (that change is trydictdef). This is rather inexplicable....
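
For anyone who wants to check the two lookup styles in isolation, here is a minimal sketch using the standard timeit module. It is not taken from flattrial.py: the names cache, DEFAULT, lookup_get, lookup_try and time_lookup are made up for illustration, and the numbers will differ between interpreter versions.

```python
import timeit

# Hypothetical class -> iterability cache, similar in spirit to the post's dict.
cache = {str: False, list: True}
DEFAULT = None  # hypothetical stand-in for the default handler


def lookup_get(key):
    # dict.get: an attribute lookup plus a method call on every lookup.
    return cache.get(key, DEFAULT)


def lookup_try(key):
    # Plain subscript; the except branch only runs when the key is missing.
    try:
        return cache[key]
    except KeyError:
        return DEFAULT


def time_lookup(func, key, number=100_000):
    # Total seconds for `number` calls of func(key).
    return timeit.timeit(lambda: func(key), number=number)


for func in (lookup_get, lookup_try):
    for key in (str, int):  # str is in the cache, int is not
        label = "hit" if key in cache else "miss"
        print(f"{func.__name__:11s} {label:4s} {time_lookup(func, key):.4f}")
```

Timing hits and misses separately matters here, because the try version pays for raising and catching KeyError only on a miss, and a warm cache in a flatten run sees mostly hits.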

Even with that change, trydictdef was still noticeably slower than dictcond.
So I made fastdictdef, which emulates dictcond more closely by not allowing
the default handler to modify the sequence passed to it:

def defaulthandler(seq):
    try:
        it = iter(seq)
    except TypeError:
        return False, seq
    else:
        return True, it

def flatten_fastdictdef(iterable, get_iterbility=None):
    #Note defaulthandler is no longer an argument.
    if get_iterbility is None:
        get_iterbility = {''.__class__:False, u''.__class__:False}

    # In dictdef, the following try-except is:
    #
    #     iterbility = get_iterbility.get(iterable.__class__, defaulthandler)

    try:
        iterbility = get_iterbility[iterable.__class__]
    except KeyError:
        #Following added to avoid a function call
        #Was:
        #
        #     iterbility = defaulthandler
        #
        # if iterbility is defaulthandler:
        #     iterbility, iterable = defaulthandler(iterable)
        #     get_iterbility[iterable.__class__] = iterbility

        #Now:
        t = iterable.__class__
        try:
            iterable = iter(iterable)
        except TypeError:
            iterbility = get_iterbility[t] = False
        else:
            iterbility = get_iterbility[t] = True

    if callable(iterbility):
        iterbility, iterable = iterbility(iterable)

    if not iterbility:
        yield iterable
    else:
        for elem in iterable:
            for subelem in flatten_fastdictdef(elem, get_iterbility):
                yield subelem

This gave the results you see above: even faster than dictcond. The thing
is, I don't have any idea why. The function call overhead doesn't seem to
be enough to explain the difference between the try version of dictdef and
fastdictdef. Nor the name rebinding (which is very fast). Anyone have any
ideas?
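
One way to probe the call-overhead question is to time the classification step on its own, once with a cold cache (every call misses) and once with a warm one (every call hits). The sketch below is an assumption-laden stand-in, not the code from flattrial.py: classify_via_call, classify_inline and time_variant are hypothetical names, and only defaulthandler follows the shape shown above.

```python
import timeit


def defaulthandler(seq):
    # Same shape as the handler quoted in the post.
    try:
        it = iter(seq)
    except TypeError:
        return False, seq
    else:
        return True, it


def classify_via_call(obj, cache):
    # On a cache miss, delegate to defaulthandler (one extra Python-level call).
    cls = obj.__class__
    try:
        return cache[cls]
    except KeyError:
        flag, _ = defaulthandler(obj)
        cache[cls] = flag
        return flag


def classify_inline(obj, cache):
    # On a cache miss, do the iter() test inline instead of calling a handler.
    cls = obj.__class__
    try:
        return cache[cls]
    except KeyError:
        try:
            iter(obj)
        except TypeError:
            flag = False
        else:
            flag = True
        cache[cls] = flag
        return flag


def time_variant(func, warm, number=100_000):
    sample = [1, 2, 3]
    if warm:
        cache = {}
        func(sample, cache)  # populate the cache once, then time only hits
        return timeit.timeit(lambda: func(sample, cache), number=number)
    # Cold: a fresh empty cache on every call, so every call misses.
    return timeit.timeit(lambda: func(sample, {}), number=number)


for func in (classify_via_call, classify_inline):
    for warm in (False, True):
        state = "warm" if warm else "cold"
        print(f"{func.__name__:18s} {state}: {time_variant(func, warm):.4f}")
```

If the two variants only differ noticeably in the cold case, the handler call cannot explain much in a run where the cache is warm almost immediately. One more thing visible in the snippet itself: the "Was" code, as quoted, stores the cache entry under iterable.__class__ after iterable has been rebound to the handler's return value, while the "Now" code saves the class in t first. If the real dictdef did the same, iterable classes such as list would be cached under their iterator's class and would miss on every lookup.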

Second, is the speed gain of fastdictdef over trydictdef worth the loss of
specifying a defaulthandler that can dictate what goes into the cache
dictionary and modify sequences? (I know, "it depends", but in the general
case, is that a feature anyone would ever *need*? I can't see how.)
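
To make the trade-off concrete, here is a rough Python 3 sketch of the same cache-by-class idea. It is a port written for illustration, not the code that was timed: flatten_cached, node_handler and rules are hypothetical names, and it uses yield from, which did not exist in Python 2.2. It shows that a per-class handler stored in the dictionary can still rewrite a sequence before it is iterated, even though the default handler is gone.

```python
def flatten_cached(iterable, get_iterability=None):
    """Yield the leaves of a nested iterable, caching iterability per class."""
    if get_iterability is None:
        # Treat text and bytes as leaves, as the post does for str and unicode.
        get_iterability = {str: False, bytes: False}
    cls = iterable.__class__
    try:
        iterability = get_iterability[cls]
    except KeyError:
        # First time this class is seen: test it once and remember the answer.
        try:
            iter(iterable)
        except TypeError:
            iterability = get_iterability[cls] = False
        else:
            iterability = get_iterability[cls] = True
    if callable(iterability):
        # A per-class handler may replace the sequence before iteration.
        iterability, iterable = iterability(iterable)
    if not iterability:
        yield iterable
    else:
        for elem in iterable:
            yield from flatten_cached(elem, get_iterability)


class Node:
    def __init__(self, label=None, data=None, children=()):
        self.children = children
        self.label = label
        self.data = data

    def __iter__(self):
        return iter(self.children)


def node_handler(node):
    # Hypothetical handler: emit the node's label, then descend into children.
    return True, [node.label, *node.children]


leaves = [Node(chr(i + 65), i) for i in range(3)]
root = Node("root", 0, leaves)
rules = {str: False, bytes: False, Node: node_handler}
print(list(flatten_cached(root, rules)))  # ['root', 'A', 'B', 'C']
```

So what fastdictdef gives up is only the ability of a catch-all handler to rewrite sequences of classes that were never registered; anything registered by class keeps full control.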

--
Francis Avila

Jul 18 '05 #1
