473,320 Members | 2,177 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,320 software developers and data experts.

Which is faster? (if not b in m) or (if m.count(b) > 0)


Which is Faster in Python and Why?

jc = {}; m = []
x = [ [1,2,3,4,5,6,7,8,9],[..],.......] # upwards of 10000 entries

def mcountb():
for item in x:
b = item[:]; b.sort(); bc = 0
for bitem in b: bc += int(bitem)
try: m = jc[bc]
except: m = []
if m.count(b) == 0:
m.append(b); jc[bc]=m

def binm()
for item in x:
b = item[:]; b.sort(); bc = 0
for bitem in b: bc += int(bitem)
try: m = jc[bc]
except: m = []
if not b in m:
m.append(b); jc[bc]=m

Feb 15 '06 #1
11 1509
In <11**********************@z14g2000cwz.googlegroups .com>, Farel wrote:
Which is Faster in Python and Why?


``if not b in m`` looks at each element of `m` until it finds `b` in it
and stops then. Assuming `b` is in `m`, otherwise all elements of `m` are
"touched".

``if m.count(b) > 0`` will always goes through all elements of `m` in the
`count()` method.

Ciao,
Marc 'BlackJack' Rintsch
Feb 15 '06 #2
Em Ter, 2006-02-14 Ã*s 20:14 -0800, Farel escreveu:
Which is Faster in Python and Why?

jc = {}; m = []
x = [ [1,2,3,4,5,6,7,8,9],[..],.......] # upwards of 10000 entries
def binm()
for item in x:
b = item[:]; b.sort(); bc = 0
for bitem in b: bc += int(bitem)
try: m = jc[bc]
except: m = []
if not b in m:
m.append(b); jc[bc]=m


Why do you copy the list and sort it if you're just summing its
elements? Instead of

b = item[:]; b.sort(); bc = 0
for bitem in b: bc += int(bitem)

you can do just

bc = sum(int(i) for i in item)

or, if the order *really* matters, what is very strange, you can do

bc = sum(int(i) for i in sorted(item))

Another thing, if you want to have in the values of the dictionary jc
only unique elements, you can use sets:
a = set()
print a set([]) a.add(10)
print a set([10]) a.add(10)
print a set([10]) a.add(10)
print a

set([10])

So, the final example would be (assuming that you don't need to sum in
order):

def binm():
for item in x:
bc = sum(int(i) for i in item)
try:
jc[bc].add(b)
except KeyError:
jc[bc] = set([b])

Looks prettier, have less statements and is (or should be) faster. This
only works in Python 2.4, but you can adapt it to run on other versions,
too.

Cheers,
Felipe.

--
"Quem excele em empregar a força militar subjulga os exércitos dos
outros povos sem travar batalha, toma cidades fortificadas dos outros
povos sem as atacar e destrói os estados dos outros povos sem lutas
prolongadas. Deve lutar sob o Céu com o propósito primordial da
'preservação'. Desse modo suas armas não se embotarão, e os ganhos
poderão ser preservados. Essa é a estratégia para planejar ofensivas."

-- Sun Tzu, em "A arte da guerra"

Feb 15 '06 #3
Farel wrote:
Which is Faster in Python and Why?


Why don't you try by yourself ?
hint :
from timeit import Timer
help(Timer)
--
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
p in 'o****@xiludom.gro'.split('@')])"
Feb 15 '06 #4
On Wed, 15 Feb 2006 08:44:10 +0100, Marc 'BlackJack' Rintsch wrote:
In <11**********************@z14g2000cwz.googlegroups .com>, Farel wrote:
Which is Faster in Python and Why?


``if not b in m`` looks at each element of `m` until it finds `b` in it
and stops then. Assuming `b` is in `m`, otherwise all elements of `m` are
"touched".

``if m.count(b) > 0`` will always goes through all elements of `m` in the
`count()` method.


But the first technique executes in (relatively slow) pure Python, while
the count method executes (relatively fast) C code. So even though count
may do more work, it may do it faster.

The only way to tell for sure is to actually time the code, not try to
guess.

--
Steven.

Feb 15 '06 #5
Steven D'Aprano wrote:
On Wed, 15 Feb 2006 08:44:10 +0100, Marc 'BlackJack' Rintsch wrote:
In <11**********************@z14g2000cwz.googlegroups .com>, Farel wrote:
Which is Faster in Python and Why?


``if not b in m`` looks at each element of `m` until it finds `b` in it
and stops then. Assuming `b` is in `m`, otherwise all elements of `m` are
"touched".

``if m.count(b) > 0`` will always goes through all elements of `m` in the
`count()` method.


But the first technique executes in (relatively slow) pure Python, while
the count method executes (relatively fast) C code. So even though count
may do more work, it may do it faster.


Why does "not b in m" execute in pure Python? list.__contains__ is implemented
in C code as well as list.count.

Georg
Feb 15 '06 #6
Steven D'Aprano wrote:
On Wed, 15 Feb 2006 08:44:10 +0100, Marc 'BlackJack' Rintsch wrote:

``if not b in m`` looks at each element of `m` until it finds `b` in it
and stops then. Assuming `b` is in `m`, otherwise all elements of `m` are
"touched".

``if m.count(b) > 0`` will always goes through all elements of `m` in the
`count()` method.

But the first technique executes in (relatively slow) pure Python, while
the count method executes (relatively fast) C code. So even though count
may do more work, it may do it faster.


'a in b' actually takes fewer bytecodes because 'in' has it's own
bytecode but b.count requires an attribute lookup:
import dis
def foo(): ... a in b
... def bar(): ... b.count(a)
... dis.dis(foo) 2 0 LOAD_GLOBAL 0 (a)
3 LOAD_GLOBAL 1 (b)
6 COMPARE_OP 6 (in)
9 POP_TOP
10 LOAD_CONST 0 (None)
13 RETURN_VALUE dis.dis(bar)

2 0 LOAD_GLOBAL 0 (b)
3 LOAD_ATTR 1 (count)
6 LOAD_GLOBAL 2 (a)
9 CALL_FUNCTION 1
12 POP_TOP
13 LOAD_CONST 0 (None)
16 RETURN_VALUE

Not much difference in speed when the item is not in the list, a slight
edge to 'a in b' for a large list:

D:\>python -m timeit -s "lst = range(1000)" "1001 in lst"
10000 loops, best of 3: 55.5 usec per loop

D:\>python -m timeit -s "lst = range(1000)" "lst.count(1001) > 1"
10000 loops, best of 3: 55.5 usec per loop

D:\>python -m timeit -s "lst = range(1000000)" "lst.count(-1) > 1"
10 loops, best of 3: 62.2 msec per loop

D:\>python -m timeit -s "lst = range(1000000)" "(-1) in lst"
10 loops, best of 3: 60.8 msec per loop

Of course if the item appears in the list, 'a in b' has a huge advantage:

D:\>python -m timeit -s "lst = range(1000000)" "(1) in lst"
10000000 loops, best of 3: 0.171 usec per loop

Kent
Feb 15 '06 #7
Georg Brandl wrote:
Steven D'Aprano wrote:
On Wed, 15 Feb 2006 08:44:10 +0100, Marc 'BlackJack' Rintsch wrote:

In <11**********************@z14g2000cwz.googlegroups .com>, Farel wrote:
Which is Faster in Python and Why?

``if not b in m`` looks at each element of `m` until it finds `b` in it
and stops then. Assuming `b` is in `m`, otherwise all elements of `m` are
"touched".

``if m.count(b) > 0`` will always goes through all elements of `m` in the
`count()` method.


But the first technique executes in (relatively slow) pure Python, while
the count method executes (relatively fast) C code. So even though count
may do more work, it may do it faster.

Why does "not b in m" execute in pure Python? list.__contains__ is implemented
in C code as well as list.count.


Thank you to Kent Johnson for actually *timing* the two
pieces of code with various sets of data, cutting
through all the wild guesses (including mine). Which
was my point.

To answer Georg's question, perhaps I was wrong -- I
was under the impression that "target in source"
executes more or less like:

for obj in source:
if target == obj: return True
return False

Perhaps this was only true in older versions of Python,
or perhaps I was misinformed, or perhaps I imagined the
whole thing.

The important lesson is that your intuitions about what
will be fast are not necessarily correct, and all the
theorizing in the world is no substitute for good
testing. (On the other hand, lousy testing is
practically worthless.)
--
Steven.

Feb 16 '06 #8
Steven D'Aprano wrote:
[...]
(On the other hand, lousy testing is practically worthless.)

+1 QOTW
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

Feb 16 '06 #9
Great observations... Changing for lists to tuples would make sorting
unnecessary...

Feb 17 '06 #10
Well, I could, but I was looking for answers from persons more
experienced in the ways of Python then myself....

Feb 17 '06 #11
Execellent, Kent. Thank you... and everyone too! I've learned much!

Feb 17 '06 #12

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

14
by: Bob | last post by:
I have a function that takes in a list of IDs (hundreds) as input parameter and needs to pass the data to another step as a comma delimited string. The source can easily create this list of IDs in...
3
by: Ricardo Quintanilla | last post by:
i had a problem whom i do not know how to explain. i was using a TcpClient (System.Net.Sockets.TcpClient) object to send and receive data to an AS400 socket. Two months ago it started to work...
8
by: Scott Emick | last post by:
I am using the following to compute distances between two lat/long coordinates for a store locator - (VB .NET 2003) it seems to take a long time to iterate through like 100-150 locations -...
4
by: Sonnich | last post by:
Hi I have a costum function for a special search, which sort strings. This is currently the place where I can save a lot of time (~70%) if possible. So, which is faster: for($j =...
16
by: Brian Tkatch | last post by:
Is there a way to check the order in which SET INTEGRITY needs to be applied? This would be for a script with a dynamic list of TABLEs. B.
52
by: Nomad.C | last post by:
Hi I've been thinking of learning Fortran as number crunching kinda language for my Physics degree......but then looking around the internet, people are saying that the libraries/ Algorithms once...
6
by: mgcclx | last post by:
For loop and while loop. which one is faster? I see many articles fighting over it and different people come up with different results.
7
by: Chris | last post by:
I am trying to increase/decrease the value of $_SESSION by 1 after clicking on a link e.g index.php?gotoWk=nxtWk and index.php? gotoWk=lstWk. I'm sure you will get the drift if you look at the code...
11
by: Mahdi | last post by:
Hi This seems like a crazy question but I wanted to find out if anyone knows which one of those two conditions are better (faster, smaller memory footprint, can be optimized by the compiler,...
0
by: DolphinDB | last post by:
The formulas of 101 quantitative trading alphas used by WorldQuant were presented in the paper 101 Formulaic Alphas. However, some formulas are complex, leading to challenges in calculation. Take...
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
0
by: ryjfgjl | last post by:
ExcelToDatabase: batch import excel into database automatically...
0
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: jfyes | last post by:
As a hardware engineer, after seeing that CEIWEI recently released a new tool for Modbus RTU Over TCP/UDP filtering and monitoring, I actively went to its official website to take a look. It turned...
0
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
1
by: Shællîpôpï 09 | last post by:
If u are using a keypad phone, how do u turn on JavaScript, to access features like WhatsApp, Facebook, Instagram....
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.