473,586 Members | 2,639 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Which is faster? (if not b in m) or (if m.count(b) > 0)


Which is Faster in Python and Why?

jc = {}; m = []
x = [ [1,2,3,4,5,6,7,8 ,9],[..],.......] # upwards of 10000 entries

def mcountb():
for item in x:
b = item[:]; b.sort(); bc = 0
for bitem in b: bc += int(bitem)
try: m = jc[bc]
except: m = []
if m.count(b) == 0:
m.append(b); jc[bc]=m

def binm()
for item in x:
b = item[:]; b.sort(); bc = 0
for bitem in b: bc += int(bitem)
try: m = jc[bc]
except: m = []
if not b in m:
m.append(b); jc[bc]=m

Feb 15 '06 #1
11 1525
In <11************ **********@z14g 2000cwz.googleg roups.com>, Farel wrote:
Which is Faster in Python and Why?


``if not b in m`` looks at each element of `m` until it finds `b` in it
and stops then. Assuming `b` is in `m`, otherwise all elements of `m` are
"touched".

``if m.count(b) > 0`` will always goes through all elements of `m` in the
`count()` method.

Ciao,
Marc 'BlackJack' Rintsch
Feb 15 '06 #2
Em Ter, 2006-02-14 Ã*s 20:14 -0800, Farel escreveu:
Which is Faster in Python and Why?

jc = {}; m = []
x = [ [1,2,3,4,5,6,7,8 ,9],[..],.......] # upwards of 10000 entries
def binm()
for item in x:
b = item[:]; b.sort(); bc = 0
for bitem in b: bc += int(bitem)
try: m = jc[bc]
except: m = []
if not b in m:
m.append(b); jc[bc]=m


Why do you copy the list and sort it if you're just summing its
elements? Instead of

b = item[:]; b.sort(); bc = 0
for bitem in b: bc += int(bitem)

you can do just

bc = sum(int(i) for i in item)

or, if the order *really* matters, what is very strange, you can do

bc = sum(int(i) for i in sorted(item))

Another thing, if you want to have in the values of the dictionary jc
only unique elements, you can use sets:
a = set()
print a set([]) a.add(10)
print a set([10]) a.add(10)
print a set([10]) a.add(10)
print a

set([10])

So, the final example would be (assuming that you don't need to sum in
order):

def binm():
for item in x:
bc = sum(int(i) for i in item)
try:
jc[bc].add(b)
except KeyError:
jc[bc] = set([b])

Looks prettier, have less statements and is (or should be) faster. This
only works in Python 2.4, but you can adapt it to run on other versions,
too.

Cheers,
Felipe.

--
"Quem excele em empregar a força militar subjulga os exércitos dos
outros povos sem travar batalha, toma cidades fortificadas dos outros
povos sem as atacar e destrói os estados dos outros povos sem lutas
prolongadas. Deve lutar sob o Céu com o propósito primordial da
'preservação' . Desse modo suas armas não se embotarão, e os ganhos
poderão ser preservados. Essa é a estratégia para planejar ofensivas."

-- Sun Tzu, em "A arte da guerra"

Feb 15 '06 #3
Farel wrote:
Which is Faster in Python and Why?


Why don't you try by yourself ?
hint :
from timeit import Timer
help(Timer)
--
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
p in 'o****@xiludom. gro'.split('@')])"
Feb 15 '06 #4
On Wed, 15 Feb 2006 08:44:10 +0100, Marc 'BlackJack' Rintsch wrote:
In <11************ **********@z14g 2000cwz.googleg roups.com>, Farel wrote:
Which is Faster in Python and Why?


``if not b in m`` looks at each element of `m` until it finds `b` in it
and stops then. Assuming `b` is in `m`, otherwise all elements of `m` are
"touched".

``if m.count(b) > 0`` will always goes through all elements of `m` in the
`count()` method.


But the first technique executes in (relatively slow) pure Python, while
the count method executes (relatively fast) C code. So even though count
may do more work, it may do it faster.

The only way to tell for sure is to actually time the code, not try to
guess.

--
Steven.

Feb 15 '06 #5
Steven D'Aprano wrote:
On Wed, 15 Feb 2006 08:44:10 +0100, Marc 'BlackJack' Rintsch wrote:
In <11************ **********@z14g 2000cwz.googleg roups.com>, Farel wrote:
Which is Faster in Python and Why?


``if not b in m`` looks at each element of `m` until it finds `b` in it
and stops then. Assuming `b` is in `m`, otherwise all elements of `m` are
"touched".

``if m.count(b) > 0`` will always goes through all elements of `m` in the
`count()` method.


But the first technique executes in (relatively slow) pure Python, while
the count method executes (relatively fast) C code. So even though count
may do more work, it may do it faster.


Why does "not b in m" execute in pure Python? list.__contains __ is implemented
in C code as well as list.count.

Georg
Feb 15 '06 #6
Steven D'Aprano wrote:
On Wed, 15 Feb 2006 08:44:10 +0100, Marc 'BlackJack' Rintsch wrote:

``if not b in m`` looks at each element of `m` until it finds `b` in it
and stops then. Assuming `b` is in `m`, otherwise all elements of `m` are
"touched".

``if m.count(b) > 0`` will always goes through all elements of `m` in the
`count()` method.

But the first technique executes in (relatively slow) pure Python, while
the count method executes (relatively fast) C code. So even though count
may do more work, it may do it faster.


'a in b' actually takes fewer bytecodes because 'in' has it's own
bytecode but b.count requires an attribute lookup:
import dis
def foo(): ... a in b
... def bar(): ... b.count(a)
... dis.dis(foo) 2 0 LOAD_GLOBAL 0 (a)
3 LOAD_GLOBAL 1 (b)
6 COMPARE_OP 6 (in)
9 POP_TOP
10 LOAD_CONST 0 (None)
13 RETURN_VALUE dis.dis(bar)

2 0 LOAD_GLOBAL 0 (b)
3 LOAD_ATTR 1 (count)
6 LOAD_GLOBAL 2 (a)
9 CALL_FUNCTION 1
12 POP_TOP
13 LOAD_CONST 0 (None)
16 RETURN_VALUE

Not much difference in speed when the item is not in the list, a slight
edge to 'a in b' for a large list:

D:\>python -m timeit -s "lst = range(1000)" "1001 in lst"
10000 loops, best of 3: 55.5 usec per loop

D:\>python -m timeit -s "lst = range(1000)" "lst.count(1001 ) > 1"
10000 loops, best of 3: 55.5 usec per loop

D:\>python -m timeit -s "lst = range(1000000)" "lst.count(-1) > 1"
10 loops, best of 3: 62.2 msec per loop

D:\>python -m timeit -s "lst = range(1000000)" "(-1) in lst"
10 loops, best of 3: 60.8 msec per loop

Of course if the item appears in the list, 'a in b' has a huge advantage:

D:\>python -m timeit -s "lst = range(1000000)" "(1) in lst"
10000000 loops, best of 3: 0.171 usec per loop

Kent
Feb 15 '06 #7
Georg Brandl wrote:
Steven D'Aprano wrote:
On Wed, 15 Feb 2006 08:44:10 +0100, Marc 'BlackJack' Rintsch wrote:

In <11************ **********@z14g 2000cwz.googleg roups.com>, Farel wrote:
Which is Faster in Python and Why?

``if not b in m`` looks at each element of `m` until it finds `b` in it
and stops then. Assuming `b` is in `m`, otherwise all elements of `m` are
"touched".

``if m.count(b) > 0`` will always goes through all elements of `m` in the
`count()` method.


But the first technique executes in (relatively slow) pure Python, while
the count method executes (relatively fast) C code. So even though count
may do more work, it may do it faster.

Why does "not b in m" execute in pure Python? list.__contains __ is implemented
in C code as well as list.count.


Thank you to Kent Johnson for actually *timing* the two
pieces of code with various sets of data, cutting
through all the wild guesses (including mine). Which
was my point.

To answer Georg's question, perhaps I was wrong -- I
was under the impression that "target in source"
executes more or less like:

for obj in source:
if target == obj: return True
return False

Perhaps this was only true in older versions of Python,
or perhaps I was misinformed, or perhaps I imagined the
whole thing.

The important lesson is that your intuitions about what
will be fast are not necessarily correct, and all the
theorizing in the world is no substitute for good
testing. (On the other hand, lousy testing is
practically worthless.)
--
Steven.

Feb 16 '06 #8
Steven D'Aprano wrote:
[...]
(On the other hand, lousy testing is practically worthless.)

+1 QOTW
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

Feb 16 '06 #9
Great observations... Changing for lists to tuples would make sorting
unnecessary...

Feb 17 '06 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

14
15014
by: Bob | last post by:
I have a function that takes in a list of IDs (hundreds) as input parameter and needs to pass the data to another step as a comma delimited string. The source can easily create this list of IDs in a comma-delimited string or string array. I don't want it to be a string because I want to overload this function, and it's sister already uses a...
3
28633
by: Ricardo Quintanilla | last post by:
i had a problem whom i do not know how to explain. i was using a TcpClient (System.Net.Sockets.TcpClient) object to send and receive data to an AS400 socket. Two months ago it started to work slowly, about 4 seconds between send and receive. In our production environment with hundreds of transactions it was truly costly. a while ago i...
8
3256
by: Scott Emick | last post by:
I am using the following to compute distances between two lat/long coordinates for a store locator - (VB .NET 2003) it seems to take a long time to iterate through like 100-150 locations - about 10-15 seconds...I want to make the code faster. I changed it to be multi-threaded, and it doesn't really make it any faster. The bottleneck seems...
4
2646
by: Sonnich | last post by:
Hi I have a costum function for a special search, which sort strings. This is currently the place where I can save a lot of time (~70%) if possible. So, which is faster: for($j = 0;$j<count($array);$j++)
16
5648
by: Brian Tkatch | last post by:
Is there a way to check the order in which SET INTEGRITY needs to be applied? This would be for a script with a dynamic list of TABLEs. B.
52
5083
by: Nomad.C | last post by:
Hi I've been thinking of learning Fortran as number crunching kinda language for my Physics degree......but then looking around the internet, people are saying that the libraries/ Algorithms once used for number crunching is now slowly converting into C, so do you think I should stick with C, since I know C already, or should I proceed...
6
10804
by: mgcclx | last post by:
For loop and while loop. which one is faster? I see many articles fighting over it and different people come up with different results.
7
2538
by: Chris | last post by:
I am trying to increase/decrease the value of $_SESSION by 1 after clicking on a link e.g index.php?gotoWk=nxtWk and index.php? gotoWk=lstWk. I'm sure you will get the drift if you look at the code below. However, this code seems to be unreliable. Is there a more robust way of achieving the same thing? Many thanks,
11
1252
by: Mahdi | last post by:
Hi This seems like a crazy question but I wanted to find out if anyone knows which one of those two conditions are better (faster, smaller memory footprint, can be optimized by the compiler, etc..) 1- if(i 0) {// do stuff} 2- if(i >=1) {// do stuff} Thanks.
0
7912
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main...
0
7839
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language...
0
8338
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that...
1
7959
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For...
0
6614
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then...
1
5710
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes...
0
3837
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in...
1
2345
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
1
1449
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.