473,385 Members | 1,506 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

Testing for membership speed

I just ran this stuff for my own knowledge. Though it might be
useful to some other people to know and maybe spark a discussion.

I needed a fast way to test for membership, so naturally the
choices were the builtin containers: lists, dictionaries, and
tuples. The following is the test code and results:

import timeit

lst_i=timeit.Timer('random.randrange(10000) in l','import
random; l=range(10000)')
dct_i=timeit.Timer('l.has_key(random.randrange(100 00))','import
random; l=dict([(i,None) for i in xrange(10000)])')
tup_i=timeit.Timer('random.randrange(10000) in l','import
random; l=tuple(range(10000))')

lst_str=timeit.Timer('md5.md5(str(random.randrange (10000))).hexdigest()
in l','import random,md5; l=[md5.md5(str(i)).hexdigest() for i
in xrange(10000)]')
dct_str=timeit.Timer('l.has_key(md5.md5(str(random .randrange(10000))).hexdigest())','import
random,md5; l=dict([(md5.md5(str(i)).hexdigest(),None) for i
in xrange(10000)])')
tup_str=timeit.Timer('md5.md5(str(random.randrange (10000))).hexdigest()
in l','import random,md5; l=tuple([md5.md5(str(i)).hexdigest()
for i in xrange(10000)])')

print 'Integer lookup'
r=lst_i.repeat(100,100); print 'list:',min(r),max(r);
r=dct_i.repeat(100,100); print 'dict:',min(r),max(r);
r=tup_i.repeat(100,100); print 'tupl:',min(r),max(r);
print 'String lookup'
r=lst_str.repeat(100,100); print 'list:',min(r),max(r);
r=dct_str.repeat(100,100); print 'dict:',min(r),max(r);
r=tup_str.repeat(100,100); print 'tupl:',min(r),max(r);

[[[Ran on IRIX64 6.5, 24 processors, 12G Memory, 4G Swap, this
code only uses one processor at %100 the full length of the run]]]
Python 2.2.3 (#1, Nov 25 2003, 18:58:21) [C] on irix646-n32
Type "help", "copyright", "credits" or "license" for more
information.
## working on region in file /usr/tmp/python-119959673PMu.py...
Integer lookup
list: 0.126830816269 0.160212993622
dict: 0.00362300872803 0.00385618209839
tupl: 0.119297981262 0.161748170853
String lookup
list: 0.142526865005 0.188524961472
dict: 0.00711393356323 0.00760197639465
tupl: 0.143892049789 0.186873912811


The results are conclusive, go for dictionaries. But this
surprised me a little, anyone have insight as to why?

I was also surprised that tuples and lists scored exactly the
same. I was hoping that immutable tuples might earn it some
speed over lists.

I will eventually need this for testing strings. So the
doubling of speed for strings over integers for dictionaries
is a little alarming. Lists and tuples only saw a modest increase.

Thank you in advance for any clever tricks you suggest.

-Brian

Jul 18 '05 #1
1 2044
The dictionaries are faster because they are indexed.
Lists/tuples must be searched serially from the
beginning.

Dictionary indexes are hashed and the
hashing algorithm has more to do on a string
than on an integer (see first article link
below for explanation).

You could check into pre-hashing your strings and using
these as integer keys. It will take longer to build
the dictionary, but searching should be faster.

Here are some articles that might interest you:

http://effbot.org/zone/python-hash.htm

http://mail.python.org/pipermail/pyt...ne/036556.html

http://www.egenix.com/files/python/mxTextTools.html

Searching list or tuple serially from the beginning
should take approximately the same time. Nothing
about the mutability can help the search speed.

HTH,
Larry Bates
Syscon, Inc.
<br****@temple.edu> wrote in message
news:ma*************************************@pytho n.org...
I just ran this stuff for my own knowledge. Though it might be
useful to some other people to know and maybe spark a discussion.

I needed a fast way to test for membership, so naturally the
choices were the builtin containers: lists, dictionaries, and
tuples. The following is the test code and results:

import timeit

lst_i=timeit.Timer('random.randrange(10000) in l','import
random; l=range(10000)')
dct_i=timeit.Timer('l.has_key(random.randrange(100 00))','import
random; l=dict([(i,None) for i in xrange(10000)])')
tup_i=timeit.Timer('random.randrange(10000) in l','import
random; l=tuple(range(10000))')

lst_str=timeit.Timer('md5.md5(str(random.randrange (10000))).hexdigest()
in l','import random,md5; l=[md5.md5(str(i)).hexdigest() for i
in xrange(10000)]')
dct_str=timeit.Timer('l.has_key(md5.md5(str(random .randrange(10000))).hexdig
est())','import random,md5; l=dict([(md5.md5(str(i)).hexdigest(),None) for i
in xrange(10000)])')
tup_str=timeit.Timer('md5.md5(str(random.randrange (10000))).hexdigest()
in l','import random,md5; l=tuple([md5.md5(str(i)).hexdigest()
for i in xrange(10000)])')

print 'Integer lookup'
r=lst_i.repeat(100,100); print 'list:',min(r),max(r);
r=dct_i.repeat(100,100); print 'dict:',min(r),max(r);
r=tup_i.repeat(100,100); print 'tupl:',min(r),max(r);
print 'String lookup'
r=lst_str.repeat(100,100); print 'list:',min(r),max(r);
r=dct_str.repeat(100,100); print 'dict:',min(r),max(r);
r=tup_str.repeat(100,100); print 'tupl:',min(r),max(r);

[[[Ran on IRIX64 6.5, 24 processors, 12G Memory, 4G Swap, this
code only uses one processor at %100 the full length of the run]]]
Python 2.2.3 (#1, Nov 25 2003, 18:58:21) [C] on irix646-n32
Type "help", "copyright", "credits" or "license" for more
information.
## working on region in file /usr/tmp/python-119959673PMu.py...
Integer lookup
list: 0.126830816269 0.160212993622
dict: 0.00362300872803 0.00385618209839
tupl: 0.119297981262 0.161748170853
String lookup
list: 0.142526865005 0.188524961472
dict: 0.00711393356323 0.00760197639465
tupl: 0.143892049789 0.186873912811


The results are conclusive, go for dictionaries. But this
surprised me a little, anyone have insight as to why?

I was also surprised that tuples and lists scored exactly the
same. I was hoping that immutable tuples might earn it some
speed over lists.

I will eventually need this for testing strings. So the
doubling of speed for strings over integers for dictionaries
is a little alarming. Lists and tuples only saw a modest increase.

Thank you in advance for any clever tricks you suggest.

-Brian

Jul 18 '05 #2

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

5
by: Brian | last post by:
Hello all.. Am working on an Air Hockey game... have an table loaded into a picture box. The borders of the table are slightly slanted. Am using hit testing lines with GDI+ to manipulate the...
9
by: Paul Keegstra | last post by:
Hi, I am currently working on an asp.net 2.0 web site that is a replacement of a classic asp web site. The current web site uses a Commerce Server 2002 database for storing user information. ...
2
by: Balaji | last post by:
Hi All, Can I use more than one membership provider for a given website? I understand only one of them could be default one. If yes, then how to programmatically access the other membership...
0
by: Cooper Blake | last post by:
Hello, I'm trying to unit test an asp.net application. We are using the .net 2..0 authentication / loginView, etc. web controls, with a customized MembershipProvider (MP) class to connect...
2
by: gnewsgroup | last post by:
I am working on a membership web application, The logic is like this: 1. All logged-in users can visit Folder1, Folder2, Folder3, but only administrators can access the Admin folder. ...
1
by: =?Utf-8?B?Q2FtaWxvIE9yb3pjbw==?= | last post by:
How can I test a membership provider using a test project within VS 2005? Basically I am stuck trying to read the custom app.config file I added to the project with a <system.websection. I think...
5
by: DotNetNewbie | last post by:
Hi, I am developing an application that has to scale and be very efficient, and I am using asp.net membership in my application. I set things up in my Users table (it has extra columns that I...
0
by: mike222 | last post by:
Hi All, This is a bit off-topic but may be of interest to some people. I have just found a website that looks interesting. It lets companies register their products for beta testing, and testers...
1
by: Jeff | last post by:
Hey ASP.NET 2.0 At work my boss have given me the task of developing a new website. Users will be able to register at the website and gain exclusive access to some information etc... Some...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.