Iterating across a filtered list

All -

I'm currently writing a toy program as I learn python that acts as a
simple address book. I've run across a situation in my search function
where I want to iterate across a filtered list. My code is working
just fine, but I'm wondering if this is the most "elegant" way to do
this. Essentially, I'm searching the dict self.contacts for a key that
matches the pattern entered by the user. If so, I print the value
associated with that key. A pastie to the method is below, any help/
advice is appreciated:

http://pastie.caboo.se/46647

Side note: I'm learning python after ruby experience. In ruby I would
do something like:

contacts.find_all{|name,contact| name =~ /search/}.each{|name,contact|
puts contact}

Mar 13 '07 #1
"Drew" <ol*****@gmail.com> writes:
I'm currently writing a toy program as I learn python that acts as a
simple address book. I've run across a situation in my search function
where I want to iterate across a filtered list. My code is working
just fine, but I'm wondering if this is the most "elegant" way to do
this. Essentially, I'm searching the dict self.contacts for a key that
matches the pattern entered by the user. If so, I print the value
associated with that key. A pastie to the method is below, any help/
advice is appreciated:
If I can decipher your Ruby example (I don't know Ruby), I think you
want:

for name,contact in contacts.iteritems():
    if re.search('search', name):
        print contact

If you just want to filter the dictionary inside an expression, you
can use a generator expression:

d = ((name,contact) for (name,contact) in contacts.iteritems()
     if re.search('search', name))

# prints contacts from the filtered dict, one per line
print '\n'.join(str(contact) for (name,contact) in d)

Note that d is an iterator, which means it is consumed as you step
through it and can only be iterated once.
Mar 13 '07 #2
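For readers on Python 3, the two approaches in this reply can be sketched roughly as follows. This is a minimal sketch, not the original poster's code: the `contacts` data and the pattern "ali" are made up for illustration, `dict.items()` stands in for `iteritems()`, and `print` is a function.

```python
import re

# Hypothetical sample data standing in for the address book.
contacts = {
    "Alice Smith": "alice@example.com",
    "Bob Jones": "bob@example.com",
    "Alicia Keys": "alicia@example.com",
}

pattern = re.compile("ali", re.IGNORECASE)

# Approach 1: filter inside the loop body.
loop_results = []
for name, contact in contacts.items():
    if pattern.search(name):
        loop_results.append(contact)

# Approach 2: build a lazy generator expression, then consume it once.
matches = (contact for name, contact in contacts.items() if pattern.search(name))
genexp_results = list(matches)

print("\n".join(loop_results))
```

Both approaches visit the dictionary once and select the same contacts; the difference is mostly one of style.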
On Mar 13, 2:42 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
If I can decipher your Ruby example (I don't know Ruby), I think you
want:

for name,contact in contacts.iteritems():
    if re.search('search', name):
        print contact

If you just want to filter the dictionary inside an expression, you
can use a generator expression:

d = ((name,contact) for (name,contact) in contacts.iteritems()
     if re.search('search', name))

# prints contacts from the filtered dict, one per line
print '\n'.join(str(contact) for (name,contact) in d)

Note that d is an iterator, which means it is consumed as you step
through it and can only be iterated once.
Paul -

You're exactly on the mark. I guess I was just wondering if your first
example (that is, breaking the if statement away from the iteration)
was preferred rather than initially filtering and then iterating.
However, your examples make a lot of sense and are quite helpful.

Thanks,
Drew

Mar 13 '07 #3
On Mar 13, 6:04 pm, "Drew" <olso...@gmail.com> wrote:
All -
Hi!

[snip]
http://pastie.caboo.se/46647
There is no need for such a convoluted list comprehension as you
iterate over it immediately! It is clearer to put the filtering logic
in the for loop. Moreover you recalculate the regexp for each element
of the list. Instead I would do something like this:

def find(self, search_str, flags=re.IGNORECASE):
    print "Contact(s) found:"
    # compile once and reuse the bound search method
    search = re.compile(search_str, flags).search
    for name, contact in self.contacts.items():
        if search(name):
            print contact
    print

Although I would rather have one function that returns the list of all
found contacts:

def find(self, search_str, flags=re.IGNORECASE):
    search = re.compile(search_str, flags).search
    for name, contact in self.contacts.items():
        if search(name):
            yield contact

And then another one that prints it.
Side note: I'm learning python after ruby experience. In ruby I would
do something like:

contacts.find_all{|name,contact| name =~ /search/}.each{|name,contact|
puts contact}
And that's why you're right to learn Python ;)

HTH

--
Arnaud

Mar 13 '07 #4
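The split suggested in this reply (one function that yields matches, another that prints them) might look like this in Python 3. It is only a sketch: the `AddressBook` class, the `print_found` name, and the sample data are assumptions made for illustration; only `self.contacts` and `find` come from the thread.

```python
import re


class AddressBook:
    """Hypothetical container; only `contacts` and `find` come from the thread."""

    def __init__(self, contacts):
        self.contacts = contacts

    def find(self, search_str, flags=re.IGNORECASE):
        # Compile once, then reuse the bound search method for every name.
        search = re.compile(search_str, flags).search
        for name, contact in self.contacts.items():
            if search(name):
                yield contact

    def print_found(self, search_str):
        # Presentation is kept separate from the lookup itself.
        print("Contact(s) found:")
        for contact in self.find(search_str):
            print(contact)


book = AddressBook({
    "Drew": "drew@example.com",
    "Andrew": "andrew@example.com",
    "Paul": "paul@example.com",
})
found = list(book.find("drew"))
book.print_found("drew")
```

Keeping `find` as a generator means callers can print, count, or further filter the results without the search code knowing about any of that.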
"Drew" <ol*****@gmail.com> writes:
You're exactly on the mark. I guess I was just wondering if your first
example (that is, breaking the if statement away from the iteration)
was preferred rather than initially filtering and then iterating.
I think the multiple statement version is more in Python tradition.
Python is historically an imperative, procedural language with some OO
features. Iterators like that are a new Python feature and they have
some annoying characteristics, like the way they mutate when you touch
them. It's usually safest to create and consume them in the same
place, e.g. creating some sequence and passing it through map, filter, etc.
Mar 13 '07 #5
"Arnaud Delobelle" <ar*****@googlemail.com> writes:
in the for loop. Moreover you recalculate the regexp for each element
of the list.
The re library caches the compiled regexp, I think.
Mar 13 '07 #6
Paul Rubin wrote:
"Drew" <ol*****@gmail.com> writes:
You're exactly on the mark. I guess I was just wondering if your first
example (that is, breaking the if statement away from the iteration)
was preferred rather than initially filtering and then iterating.


I think the multiple statement version is more in Python tradition.
I don't know if I qualify as a Python traditionalist, but I've been using
Python since the 1.5.2 days, and I usually favor list comps or generator
expressions over old-style loops when it comes to this kind of operation.
Python is historically an imperative, procedural language with some OO
features.
Python has had functions as first class objects and (quite-limited-but)
anonymous functions, map(), filter() and reduce() as builtin funcs at
least since 1.5.2 (quite some years ago).
Iterators like that are a new Python feature
List comps are not that new (2.0 or 2.1 ?):
print "\n".join([contact for name, contact in contacts.items() \
if search.match(name)])

and they have
some annoying characteristics, like the way they mutate when you touch
them.
While sequences are iterables, not all iterables are sequences. Know
what you use, and you'll be fine.
It's usually safest to create and consume them in the same
place, e.g. creating some sequence and passing it through map, filter, etc.
Safest ? Why so ?
Mar 13 '07 #7
On Mar 13, 7:36 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
"Arnaud Delobelle" <arno...@googlemail.com> writes:
in the for loop. Moreover you recalculate the regexp for each element
of the list.

The re library caches the compiled regexp, I think.
That would surprise me.
How can re.search know that string.lower(search) is the same each
time? Or else there is something that I misunderstand.

Moreover:

In [49]: from timeit import Timer
In [50]: Timer('for i in range(1000): search("abcdefghijk")', 'import re; search=re.compile("ijk").search').timeit(100)
Out[50]: 0.36964607238769531

In [51]: Timer('for i in range(1000): re.search("ijk", "abcdefghijk")', 'import re; search=re.compile("ijk").search').timeit(100)
Out[51]: 1.4777300357818604

--
Arnaud

Mar 13 '07 #8
Bruno Desthuilliers <bd*****************@free.quelquepart.fr> writes:
I don't know if I qualify as a Python traditionalist, but I'm using
Python since the 1.5.2 days, and I usually favor list comps or
generator expressions over old-style loops when it comes to this kind
of operations.
I like genexps when they're nested inside other expressions so they're
consumed as part of the evaluation of the outer expression. They're a
bit scary when the genexp-created iterator is saved in a variable.

Listcomps are different, they allocate storage for the entire list, so
they're just syntax sugar for a loop. They have an annoying
misfeature of their
Python has had functions as first class objects and
(quite-limited-but) anonymous functions, map(), filter() and reduce()
as builtin funcs at least since 1.5.2 (quite some years ago).
True, though no iterators so you couldn't easily use those functions
on lazily-evaluated streams like you can now.
Iterators like that are a new Python feature
List comps are not that new (2.0 or 2.1 ?):
print "\n".join([contact for name, contact in contacts.items() \
if search.match(name)])
Well you could do it that way but it allocates the entire filtered
list in memory. In this example "\n".join() also builds up a string
in memory, but you could do something different, like run the sequence
through another filter or print out one element at a time, in which
case lazy evaluation can be important (imagine that contacts.iteritems
chugs through a billion row table in an SQL database).
It's usually safest to create and consume them in the same
place, e.g. creating some sequence and passing it through map, filter, etc.
Safest ? Why so ?
Just that things can get confusing if you're consuming the iterator in
more than one place. It can get to be like those old languages where
you had to do your own storage management ;-).
Mar 13 '07 #9
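The confusion described in this reply, consuming one iterator in more than one place, can be shown with a small Python 3 sketch. The `contacts` data and the `startswith("A")` filter are made up for illustration.

```python
# Hypothetical sample data.
contacts = {"Ann": "ann@example.com", "Bob": "bob@example.com"}

# A generator expression is lazy and can be consumed only once.
lazy = (contact for name, contact in contacts.items() if name.startswith("A"))
first_pass = list(lazy)   # consumes the generator
second_pass = list(lazy)  # nothing left: the generator is exhausted

# A list comprehension allocates the whole result and can be reused.
eager = [contact for name, contact in contacts.items() if name.startswith("A")]
first_list_pass = list(eager)
second_list_pass = list(eager)

print(first_pass, second_pass)
print(first_list_pass, second_list_pass)
```

This is the trade-off the thread is circling: the generator avoids building the filtered list in memory, while the list can be safely handed to several consumers.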
On Mar 13, 8:53 pm, Bruno Desthuilliers
<bdesth.quelquech...@free.quelquepart.fr> wrote:
Paul Rubin wrote:
[snip]
Iterators like that are a new Python feature

List comps are not that new (2.0 or 2.1 ?):
print "\n".join([contact for name, contact in contacts.items() \
if search.match(name)])
You can write this, but:
* it is difficult to argue that it is more readable than Paul's (or
my) 'imperative' version;
* it has no obvious performance benefit, in fact it creates a list
unnecessarily (I know you could use a generator with recent python).
and they have
some annoying characteristics, like the way they mutate when you touch
them.

While sequences are iterables, not all iterables are sequences. Know
what you use, and you'll be fine.
....And know when to use for statements :)

--
Arnaud

Mar 13 '07 #10
On Tue, 13 Mar 2007 15:04:50 -0300, Drew <ol*****@gmail.com> wrote:
I'm currently writing a toy program as I learn python that acts as a
simple address book. I've run across a situation in my search function
where I want to iterate across a filtered list. My code is working
just fine, but I'm wondering if this is the most "elegant" way to do
this. Essentially, I'm searching the dict self.contacts for a key that
matches the pattern entered by the user. If so, I print the value
associated with that key. A pastie to the method is below, any help/
advice is appreciated:

http://pastie.caboo.se/46647

Side note: I'm learning python after ruby experience. In ruby I would
do something like:

contacts.find_all{|name,contact| name =~ /search/}.each{|name,contact|
puts contact}
Just a few changes:

def find(self, search):
    search_re = re.compile(search, re.IGNORECASE)
    for result in [self.contacts[name] for name in self.contacts
                   if search_re.match(name)]:
        print result

- you can iterate directly over a dictionary's keys using: for key in dict
- you can compile a regexp to re-use it in all loops; using re.IGNORECASE,
you don't need to explicitly convert everything to lowercase before comparing
- if all you want to do is to print the results, you can even avoid the
for loop:

print '\n'.join('%s' % self.contacts[name] for name in self.contacts
                if search_re.match(name))

--
Gabriel Genellina

Mar 13 '07 #11
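One detail worth checking in this reply's snippet: it calls `match`, while the earlier replies call `search`. The two differ in where the pattern may match, which changes which contacts are found. A minimal Python 3 sketch, with made-up names:

```python
import re

# Hypothetical contact names.
names = ["Drew Olson", "Andrew Lee", "Paul Rubin"]
pattern = re.compile("drew", re.IGNORECASE)

# match() only succeeds if the pattern matches at the start of the string.
with_match = [n for n in names if pattern.match(n)]

# search() succeeds if the pattern matches anywhere in the string.
with_search = [n for n in names if pattern.search(n)]

print(with_match)
print(with_search)
```

Which behaviour is wanted depends on the address book; the Ruby `=~` in the original question behaves like `search`.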
On Mar 13, 8:59 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
wrote:
[snip]
def find(self, search):
    search_re = re.compile(search, re.IGNORECASE)
    for result in [self.contacts[name] for name in self.contacts
                   if search_re.match(name)]:
        print result
I do not see how

for y in [f(x) for x in L if g(x)]:
    do stuff with y

can be preferable to

for x in L:
    if g(x):
        do stuff with f(x)

What can be the benefit of creating a list by comprehension for the
sole purpose of iterating over it?

--
Arnaud

Mar 13 '07 #12
On Tue, 13 Mar 2007 17:19:53 -0300, Arnaud Delobelle
<ar*****@googlemail.com> wrote:
On Mar 13, 7:36 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
The re library caches the compiled regexp, I think.

That would surprise me.
How can re.search know that string.lower(search) is the same each
time? Or else there is something that I misunderstand.
It does.

py> import re
py> x = re.compile("ijk")
py> y = re.compile("ijk")
py> x is y
True

Both, separate calls, returned identical results. You can show the cache:

py> re._cache
{(<type 'str'>, '%(?:\\((?P<key>.*?)\\))?(?P<modifiers>[-#0-9 +*.hlL]*?)[eEfFgGdiouxXcrs%]', 0): <_sre.SRE_Pattern object at 0x00A786A0>,
 (<type 'str'>, 'ijk', 0): <_sre.SRE_Pattern object at 0x00ABB338>}

--
Gabriel Genellina

Mar 13 '07 #13
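The comparison discussed here can be rerun on a current interpreter with a sketch like the one below (Python 3). The function names are made up for illustration, and no timing figures are claimed because they depend on the machine and the Python version; the point is only that both routes find the same match, while the module-level call goes through the cache lookup each time.

```python
import re
import timeit

text = "abcdefghijk"
compiled_search = re.compile("ijk").search


def with_precompiled():
    # Uses the pattern object compiled once above.
    return compiled_search(text)


def with_module_function():
    # re.search looks the pattern up in the module's internal cache on each call.
    return re.search("ijk", text)


# Both approaches find the same match.
same_span = with_precompiled().span() == with_module_function().span()

# Timings vary by machine and Python version, so they are printed, not asserted.
t_pre = timeit.timeit(with_precompiled, number=100_000)
t_mod = timeit.timeit(with_module_function, number=100_000)
print(f"precompiled: {t_pre:.4f}s  module-level: {t_mod:.4f}s")
```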
On Tue, 13 Mar 2007 18:16:32 -0300, Arnaud Delobelle
<ar*****@googlemail.com> wrote:
On Mar 13, 8:59 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
wrote:
[snip]
def find(self, search):
    search_re = re.compile(search, re.IGNORECASE)
    for result in [self.contacts[name] for name in self.contacts
                   if search_re.match(name)]:
        print result

I do not see how

for y in [f(x) for x in L if g(x)]:
    do stuff with y

can be preferable to

for x in L:
    if g(x):
        do stuff with f(x)

What can be the benefit of creating a list by comprehension for the
sole purpose of iterating over it?
No benefit...

--
Gabriel Genellina

Mar 13 '07 #14
On Mar 13, 9:31 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
wrote:
On Tue, 13 Mar 2007 17:19:53 -0300, Arnaud Delobelle
<arno...@googlemail.com> wrote:
On Mar 13, 7:36 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
The re library caches the compiled regexp, I think.
That would surprise me.
How can re.search know that string.lower(search) is the same each
time? Or else there is something that I misunderstand.

It does.

py> import re
py> x = re.compile("ijk")
py> y = re.compile("ijk")
py> x is y
True

Both, separate calls, returned identical results. You can show the cache:
OK I didn't realise this. But even so each time there is the cost of
looking up the regexp string in the cache dictionary.

--
Arnaud

Mar 13 '07 #15
Hi,

On Tuesday, 13 March 2007 22:16:32 Arnaud Delobelle wrote:
for x in L:
    if g(x):
        do stuff with f(x)
for x in itertools.ifilter(g, L):
    do stuff with f(x)

Maybe this would be even better?

L
Mar 13 '07 #16
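In Python 3, `itertools.ifilter` and `itertools.ifilterfalse` became the built-in `filter` and `itertools.filterfalse`. The sketch below shows which items each one yields, since the two are easy to mix up; `g` and `L` are the placeholder names from the pseudocode in this thread, and the even-number predicate is hypothetical.

```python
from itertools import filterfalse


def g(x):
    # Hypothetical predicate: true for even numbers.
    return x % 2 == 0


L = [1, 2, 3, 4, 5, 6]

kept = list(filter(g, L))           # items where g(x) is true
rejected = list(filterfalse(g, L))  # items where g(x) is false

print(kept)
print(rejected)
```

To mirror `for x in L: if g(x): ...`, the variant that keeps items where the predicate is true is the one needed.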
On Tue, 13 Mar 2007 19:12:12 -0300, Arnaud Delobelle
<ar*****@googlemail.com> wrote:
py> import re
py> x = re.compile("ijk")
py> y = re.compile("ijk")
py> x is y
True

Both, separate calls, returned identical results. You can show the
cache:

OK I didn't realise this. But even so each time there is the cost of
looking up the regexp string in the cache dictionary.
Sure, it's much better to create the regex only once. Just to note that
calling re.compile is not soooooooo bad as it could be.

--
Gabriel Genellina

Mar 13 '07 #17
Paul Rubin wrote:
Bruno Desthuilliers <bd*****************@free.quelquepart.fr> writes:
(snip)
Python has had functions as first class objects and
(quite-limited-but) anonymous functions, map(), filter() and reduce()
as builtin funcs at least since 1.5.2 (quite some years ago).

True, though no iterators so you couldn't easily use those functions
on lazily-evaluated streams like you can now.
Obviously. But what I meant is that Python may not be *so* "historically
imperative" !-)

FWIW, I first learned FP concepts with Python.
Iterators like that are a new Python feature
List comps are not that new (2.0 or 2.1 ?):
print "\n".join([contact for name, contact in contacts.items() \
if search.match(name)])

Well you could do it that way but it allocates the entire filtered
list in memory.
Of course. But then nothing prevents you from using a genexp instead of
the list comp - same final result, and the syntax is quite close:

print "\n".join(contact for name, contact in contacts.items() \
if search.match(name))

So the fact that genexps are still a bit "new" is not a problem here
IMHO - this programming style is not new in Python.
It's usually safest to create and consume them in the same
place, e.g. creating some sequence and passing it through map, filter, etc.
Safest ? Why so ?

Just that things can get confusing if you're consuming the iterator in
more than one place.
Indeed. But that's not what we have here. And FWIW, in programming, lots
of things tend to be confusing at first.
Mar 14 '07 #18
Arnaud Delobelle wrote:
On Mar 13, 8:53 pm, Bruno Desthuilliers
<bdesth.quelquech...@free.quelquepart.fr> wrote:
Paul Rubin wrote:

[snip]
Iterators like that are a new Python feature
List comps are not that new (2.0 or 2.1 ?):
print "\n".join([contact for name, contact in contacts.items() \
if search.match(name)])

You can write this, but:
* it is difficult to argue that it is more readable than Paul's (or
my) 'imperative' version;
I personally find it more readable. To me, it tells what, not how.
* it has no obvious performance benefit,
No.
and they have
some annoying characteristics, like the way they mutate when you touch
them.
While sequences are iterables, not all iterables are sequences. Know
what you use, and you'll be fine.

...And know when to use for statements :)
Don't worry, I still use them when appropriate.
Mar 14 '07 #19
