Anyway to clarify this code? (dictionaries)

I've been searching through the library documentation, and this is the
best code I can produce for this algorithm:

I'd like to return a dictionary which is a copy of the dictionary 'another',
keeping only the keys in 'keys' whose values are bigger than 'x':

def my_search (another, keys, x):
    temp = another.fromkeys(keys)
    return dict([[k,v] for k in temp.keys() for v in temp.values()
                 if v>=x])

Is there any way to improve this code?
I want to avoid converting the dictionary to a list and then to a
dictionary. Are there speed penalties for such a conversion?

Bye.

Nov 23 '05 #1
"javuchi" <ja*****@gmail.com> writes:
> I've been searching through the library documentation, and this is the
> best code I can produce for this algorithm:
>
> I'd like to return a dictionary which is a copy of the dictionary 'another',
> keeping only the keys in 'keys' whose values are bigger than 'x':
>
> def my_search (another, keys, x):
>     temp = another.fromkeys(keys)
>     return dict([[k,v] for k in temp.keys() for v in temp.values()
>                  if v>=x])
>
> Is there any way to improve this code?

Lots of them. Let's start by pointing out two bugs:

You're creating a list that's the "cross product" of keys and values,
then handing that to dict. You're handing it a list with entries for
the same key. That behavior may be defined, but I wouldn't count on
it.

fromkeys sets the values in temp to the same value - None. So
temp.values() is a list of None's, so v is None every time through the
loop.
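
Both bugs are easy to see on a small example. The sketch below uses made-up data (the dictionary contents and the key list are illustrative, not from the original post) and Python 3 syntax. It leaves out the `v >= x` filter so that only the two effects described above are shown.

```python
# Illustrative data; not from the original post.
another = {"a": 1, "b": 5, "c": 9}
keys = ["a", "b"]

# fromkeys() copies only the keys; every value becomes None.
temp = another.fromkeys(keys)
print(temp)  # {'a': None, 'b': None}

# Two "for" clauses pair every key with every value (a cross product),
# so two keys and two values produce four pairs, with each key repeated.
pairs = [[k, v] for k in temp.keys() for v in temp.values()]
print(len(pairs))  # 4
```

In the Python 2 of this thread, None compared as smaller than any number, so the `v >= x` filter would silently reject every pair for a numeric x. In Python 3 the same comparison raises TypeError.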

So you could do what that code does with:

def my_search(another, keys, x):
    if None >= x:
        return another.fromkeys(keys)
    else:
        return dict()

You probably want something like:

def my_search(another, keys, x):
    return dict([[k,another[k]] for k in keys if another[k] >= x])

If you really want to avoid indexing another twice, you could do:

def my_search(another, keys, x):
    return dict([[k, v] for k, v in another.items() if v >= x and k in keys])

But then you're looking through all the keys in another, and searching
through keys multiple times, which probably adds up to a lot more
wasted work than indexing another twice.

> I want to avoid converting the dictionary to a list and then to a
> dictionary. Are there speed penalties for such a conversion?

No. But it does take time to do the conversion. I think I'd write it
out "longhand":

def my_search(another, keys, x):
    new = dict()
    for k in keys:
        if another[k] >= x:
            new[k] = another[k]
    return new

This makes it clear that you only index another twice if you actually
use it. The list comprehension will do the loop in C, but it means you
have to scan two lists instead of one. If you're worried about which
is faster, measure it on your target platform.
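
On Python 2.7 and later (so not the Python of this thread), a dict comprehension builds the result directly, without an intermediate list of pairs. A minimal sketch, assuming every key in keys is present in another; the sample data is made up:

```python
def my_search(another, keys, x):
    # Build the dictionary in one pass; no temporary list of pairs is created.
    # Assumes every k in keys exists in another (otherwise KeyError is raised).
    return {k: another[k] for k in keys if another[k] >= x}


# Illustrative data; not from the original post.
prices = {"a": 1, "b": 5, "c": 9}
print(my_search(prices, ["a", "b", "c"], 5))  # {'b': 5, 'c': 9}
```

It still indexes another twice for every key that passes the test, just like the longhand loop.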

<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
Nov 23 '05 #2

Mike Meyer wrote:
> def my_search(another, keys, x):
>     return dict([[k, v] for k, v in another.items() if v >= x and k in keys])
>
> But then you're looking through all the keys in another, and searching
> through keys multiple times, which probably adds up to a lot more
> wasted work than indexing another twice.

Would you mind clarifying? Do you mean "k in keys" is a scan rather than
a lookup? I find it pretty clean and straightforward.

I think one way or another, one needs to loop through one of them, then
index search the other. It may help a bit to take the len() and loop
through the shorter one.

This seems like the equivalent of this SQL:

select a.* from a, b where a.key = b.key and a.v >= x

Nov 23 '05 #3
On 22 Nov 2005 17:58:28 -0800, "javuchi" <ja*****@gmail.com> wrote:
> I've been searching through the library documentation, and this is the
> best code I can produce for this algorithm:
>
> I'd like to return a dictionary which is a copy of the dictionary 'another',
> keeping only the keys in 'keys' whose values are bigger than 'x':
>
> def my_search (another, keys, x):
>     temp = another.fromkeys(keys)
>     return dict([[k,v] for k in temp.keys() for v in temp.values()
>                  if v>=x])
>
> Is there any way to improve this code?
> I want to avoid converting the dictionary to a list and then to a
> dictionary. Are there speed penalties for such a conversion?
>
> Bye.

 >>> import random
 >>> another = dict(zip('abcd', iter(random.random, 2)))
 >>> for k,v in another.items(): print k,v
 ...
 a 0.606494662034
 c 0.273998760342
 b 0.358066029098
 d 0.774406432218

If keys are few compared to the number of keys in another, this may be preferable:

 >>> def my_search(another, keys, x): return dict((k,another[k]) for k in keys if another[k]>x)
 ...
 >>> my_search(another, 'cb', .3)
 {'b': 0.35806602909756235}
 >>> my_search(another, 'abcd', .4)
 {'a': 0.60649466203365532, 'd': 0.77440643221840166}

This sounds like homework though ... ?

Regards,
Bengt Richter
Nov 23 '05 #4

Mike Meyer wrote:
> def my_search(another, keys, x):
>     new = dict()
>     for k in keys:
>         if another[k] >= x:
>             new[k] = another[k]
>     return new

BTW, this would raise a KeyError if k is not in another.
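
One way to guard against that is to test membership before indexing. The sketch below assumes that missing keys should simply be skipped, which the original question does not actually specify; the sample data is made up.

```python
def my_search(another, keys, x):
    new = {}
    for k in keys:
        # Skip keys that are missing from another instead of raising KeyError.
        if k in another and another[k] >= x:
            new[k] = another[k]
    return new


# Illustrative data; "zzz" is deliberately absent from the dictionary.
data = {"a": 1, "b": 5}
print(my_search(data, ["a", "b", "zzz"], 2))  # {'b': 5}
```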

Nov 23 '05 #5

Bengt Richter wrote:
> >>> def my_search(another, keys, x): return dict((k,another[k]) for k in keys if another[k]>x)
> ...
> >>> my_search(another, 'cb', .3)
> {'b': 0.35806602909756235}
> >>> my_search(another, 'abcd', .4)
> {'a': 0.60649466203365532, 'd': 0.77440643221840166}

Do you need to guard the case "k not in another"?

Nov 23 '05 #6
"bo****@gmail.com" <bo****@gmail.com> writes:
> Mike Meyer wrote:
>> def my_search(another, keys, x):
>>     return dict([[k, v] for k, v in another.items() if v >= x and k in keys])
>> But then you're looking through all the keys in another, and searching
>> through keys multiple times, which probably adds up to a lot more
>> wasted work than indexing another twice.
> Would you mind clarifying? Do you mean "k in keys" is a scan rather than
> a lookup? I find it pretty clean and straightforward.


I assumed keys was a simple sequence of some kind, because you passed
it to fromkeys. I guess it could be a set or a dictionary, in which case
"k in keys" would be a lookup. Were you trying to force a lookup by
creating a dict from keys via fromkeys? If so, using a set
would have the same effect, and be a lot clearer:

    temp = set(keys)
    return dict([[k, v] for k, v in another.items() if v >= x and k in temp])

> I think one way or another, one needs to loop through one of them, then
> index search the other. It may help a bit to take the len() and loop
> through the shorter one.


First, remember the warnings about premature optimization. The
following might be worth looking into:

    use = set(another) & set(keys)
    return dict([[k, another[k]] for k in use if another[k] >= x])

Though I still think I prefer the longhand version:

    out = dict()
    for k in set(another) & set(keys):
        if another[k] >= x:
            out[k] = another[k]

The set intersection is still going to loop through one and do lookups
in the other, but it'll happen in C instead of Python.

Unless your lists are *very* long, the performance differences will
probably be negligible, and are liable to change as you change the
underlying platform. So I'd recommend you choose the form that's most
readable to you, and go with that.

<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
Nov 23 '05 #7

Mike Meyer wrote:
> First, remember the warnings about premature optimization.

Which is why I said the one-liner (your first one) is clean and clear,
and bug-free in one go.

>     use = set(another) & set(keys)
>     return dict([[k, another[k]] for k in use if another[k] >= x])
>
> Though I still think I prefer the longhand version:
>
>     out = dict()
>     for k in set(another) & set(keys):
>         if another[k] >= x:
>             out[k] = another[k]

This is definitely better than the other longhand version, as the set
operation removes the potential problem of another[k] raising KeyError.

Nov 23 '05 #8
javuchi wrote:
> I want to avoid converting the dictionary to a list and then to a
> dictionary. Are there speed penalties for such a conversion?


You mean, is it faster to write, test, debug and
execute slow Python code rather than letting Python's
built-in routines written in fast C do the job?

I have no idea. Perhaps you should try it and see.
Write some code to do it all manually, and time it.

Make sure you use realistic test data: if your users
will be using dictionaries with 10,000 items, there is
no point in testing only dictionaries with 10 items.
For accuracy, run (say) 20 tests, and look at the
average speed. Or better still, use the timeit module.
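
A minimal sketch of such a measurement with timeit, written for Python 3. The function names search_loop and search_pairs, the 10,000-item dictionary, and the 0.5 threshold are all assumptions made for illustration; substitute data that resembles your own.

```python
import random
import timeit


def search_loop(another, keys, x):
    # Longhand version: explicit loop, one lookup per key.
    new = {}
    for k in keys:
        v = another[k]
        if v >= x:
            new[k] = v
    return new


def search_pairs(another, keys, x):
    # Builds a list of pairs first, then converts it to a dict.
    return dict([(k, another[k]) for k in keys if another[k] >= x])


# Hypothetical test data: 10,000 items, every second key selected.
another = {i: random.random() for i in range(10000)}
keys = list(range(0, 10000, 2))

for func in (search_loop, search_pairs):
    # repeat() returns one total time per run; the minimum is the least noisy.
    times = timeit.repeat(lambda: func(another, keys, 0.5), number=20, repeat=5)
    print(func.__name__, min(times))
```

The absolute numbers depend on the machine and the Python version, so only the comparison between the two lines of output is meaningful.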

--
Steven.

Nov 23 '05 #9
On 22 Nov 2005 19:52:40 -0800, "bo****@gmail.com" <bo****@gmail.com> wrote:

> Bengt Richter wrote:
>> >>> def my_search(another, keys, x): return dict((k,another[k]) for k in keys if another[k]>x)
>> ...
>> >>> my_search(another, 'cb', .3)
>> {'b': 0.35806602909756235}
>> >>> my_search(another, 'abcd', .4)
>> {'a': 0.60649466203365532, 'd': 0.77440643221840166}
>
> Do you need to guard the case "k not in another"?

Good catch ;-)
What did the OP want as a value if any for that case? None? or no entry at all?
Taking a cue from Mike, I like the set method of getting the common keys, to eliminate the entry (untested)

def my_search(another, keys, x):
    return dict((k,another[k]) for k in (set(another)&set(keys)) if another[k]>x)

otherwise, to get Nones, maybe (untested)

def my_search(another, keys, x):
    return dict((k,another.get(k)) for k in keys if k not in another or another[k]>x)
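
A quick check of the two untested variants above, run under Python 3. The names search_common and search_with_none and the sample data are made up here, only to keep the two behaviours apart.

```python
def search_common(another, keys, x):
    # Only keys present in both; keys missing from another are dropped.
    return dict((k, another[k]) for k in (set(another) & set(keys)) if another[k] > x)


def search_with_none(another, keys, x):
    # Keys missing from another are kept, with the value None.
    return dict((k, another.get(k)) for k in keys if k not in another or another[k] > x)


# Illustrative data; "z" is deliberately absent from the dictionary.
data = {"a": 0.6, "b": 0.3, "c": 0.8}
wanted = ["a", "c", "z"]

print(search_common(data, wanted, 0.5) == {"a": 0.6, "c": 0.8})                  # True
print(search_with_none(data, wanted, 0.5) == {"a": 0.6, "c": 0.8, "z": None})   # True
```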

Regards,
Bengt Richter
Nov 23 '05 #10
