473,508 Members | 2,227 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

list comprehention

Hi,

Python beginner here and very much enjoying it. I'm looking for a
pythonic way to find how many listmembers are also present in a reference
list. Don't count duplicates (eg. if you already found a matching member
in the ref list, you can't use the ref member anymore).

Example1:
ref=[2, 2, 4, 1, 1]
list=[2, 3, 4, 5, 3]
solution: 2

Example2:
ref=[2, 2, 4, 1, 1]
list=[2, 2, 5, 2, 4]
solution: 3 (note that only the first two 2's count, the third 2 in the
list should not be counted)

Any suggestions or comments?

Thanks.
M.
#my failing effort:
sum([min(r.count(n)-l[:i].count(n),l.count(n)) for i,n in enumerate(l)])

#test lists
import random
#reference list
r=[random.randint(1,5) for n in range(5)]
#list
l=[random.randint(1,5) for n in range(5)]

Jan 19 '06 #1
29 1497
> Python beginner here and very much enjoying it. I'm looking
for a pythonic way to find how many listmembers are also
present in a reference list. Don't count duplicates (eg. if
you already found a matching member in the ref list, you can't
use the ref member anymore).

Example1:
ref=[2, 2, 4, 1, 1]
list=[2, 3, 4, 5, 3]
solution: 2

Example2:
ref=[2, 2, 4, 1, 1]
list=[2, 2, 5, 2, 4]
solution: 3 (note that only the first two 2's count, the third
2 in the list should not be counted)


It sounds like you're looking for "set" operations: (using "ell"
for clarity)
from sets import Set
a = [2,2,4,1,1]
b = [2,3,4,5,3]
setA = Set(a)
setB = Set(b)
results = setA.intersection(setB)
results Set([2,4]) intersection = [x for x in results]
intersection

[2,4]
I'm a tad confused by the help, as it sounds like sets are
supposed to be first-class citizens, but in ver2.3.5 that I'm
running here (or rather "there", on a friend's box), I have to
"import sets" which I didn't see mentioned in the reference manual.

-one of many tims on the list
tim = Set(["bald", "vegetarian", "loving husband"])

:)


Jan 19 '06 #2
def reference(alist1,alist2):
counter = 0
for x in lis:
if x in ref:
ref.pop(ref.index(x))
counter += 1
return counter

this works I think for your examples, but you should check it against
them and other cases.
good luck

Jan 19 '06 #3
revision of previous:

def reference(reflist,alist2):
counter = 0
for x in alist2:
if x in reflist:
reflist.pop(reflist.index(x))
counter += 1
return counter

Jan 19 '06 #4
another approach:

ref = [2,2,4,1,1]
lis = [2,2,5,2,4]

len([ref.pop(ref.index(x)) for x in lis if x in ref])

Jan 19 '06 #5

Tim Chase wrote:
Python beginner here and very much enjoying it. I'm looking
> for a pythonic way to find how many listmembers are also
> present in a reference list. Don't count duplicates (eg. if
> you already found a matching member in the ref list, you can't
> use the ref member anymore).
>
> Example1:
> ref=[2, 2, 4, 1, 1]
> list=[2, 3, 4, 5, 3]
> solution: 2
>
> Example2:
> ref=[2, 2, 4, 1, 1]
> list=[2, 2, 5, 2, 4]
> solution: 3 (note that only the first two 2's count, the third
> 2 in the list should not be counted)


It sounds like you're looking for "set" operations: (using "ell"
for clarity)
>>> from sets import Set
>>> a = [2,2,4,1,1]
>>> b = [2,3,4,5,3]
>>> setA = Set(a)
>>> setB = Set(b)
>>> results = setA.intersection(setB)
>>> results Set([2,4]) >>> intersection = [x for x in results]
>>> intersection

[2,4]

won't set remove duplicates which he wants to preserve ? He is not just
looking for the 'values' that is in common, but the occurence as well,
if I understand the requirement correctly.

I would just build a dict with the value as the key and occurence as
the value then loop the list and lookup.

Jan 19 '06 #6
Mathijs wrote:
Python beginner here and very much enjoying it. I'm looking for a
pythonic way to find how many listmembers are also present in a reference
list. Don't count duplicates (eg. if you already found a matching member
in the ref list, you can't use the ref member anymore).

Example1:
ref=[2, 2, 4, 1, 1]
list=[2, 3, 4, 5, 3]
solution: 2

Example2:
ref=[2, 2, 4, 1, 1]
list=[2, 2, 5, 2, 4]
solution: 3 (note that only the first two 2's count, the third 2 in the
list should not be counted)

Any suggestions or comments?


sum(min(list.count(n), ref.count(n)) for n in set(ref))

Is that it?

Peter
Jan 19 '06 #7
Tim Chase wrote:
I'm a tad confused by the help, as it sounds like sets are
supposed to be first-class citizens, but in ver2.3.5 that I'm
running here (or rather "there", on a friend's box), I have to
"import sets" which I didn't see mentioned in the reference manual.


set and frozenset are a builtins starting with Python 2.4.

http://www.python.org/2.4/highlights.html

"""
New or upgraded built-ins

built-in sets - the sets module, introduced in 2.3, has now been implemented
in C, and the set and frozenset types are available as built-in types (PEP
218)
"""

Peter

Jan 19 '06 #8

Tim Chase wrote:
<snip>

I'm a tad confused by the help, as it sounds like sets are
supposed to be first-class citizens, but in ver2.3.5 that I'm
running here (or rather "there", on a friend's box), I have to
"import sets" which I didn't see mentioned in the reference manual.

-one of many tims on the list
tim = Set(["bald", "vegetarian", "loving husband"])

:)


The builtin "set" was added in 2.4, I believe (note lowercase "s"):

http://docs.python.org/whatsnew/node2.html

print "vegetarian".replace("etari","")
:-)

Jan 19 '06 #9
>> >Python beginner here and very much enjoying it. I'm looking
> for a pythonic way to find how many listmembers are also
> present in a reference list. Don't count duplicates (eg. if

[snipped]
won't set remove duplicates which he wants to preserve ?


My reading was that the solution shouldn't count duplicates
("Don't count duplicates"). However, once you mentioned it, and
I saw other folks' responses that looked very diff. from my own,
I re-read the OP's comments and found that I missed this key bit:

"note that only the first *two* 2's count, the third 2
in the list should not be counted"
(*emphasis* mine)

and that the desired result was a count (though a len(Set())
would return the right count if the OP wanted the true
intersection, but that's beside the point).

My error. Sorry, ladies and gentlemen :)

I'm partial to the elegance of markscala's suggestion of:

len([ref.pop(ref.index(x)) for x in lis if x in ref])

though it might need to come with a caveat that it doesn't leave
"ref" in the same state is it was originally, so it should be
copied and then manipulated thusly:

r = ref[:]
len([r.pop(r.index(x)) for x in (lis if x in r])

which would then leave ref undisturbed. As tested:

import random
ref = [random.randint(1,5) for n in range(5)]
for x in range(1,10):
lis = [random.randint(1,5) for n in range(5)]
r = ref[:]
print repr((r,lis))
print len([r.pop(r.index(x)) for x in lis if x in r])

seems to give the results the OP describes.

-tim


Jan 19 '06 #10
Hi,
I liked the twist at the end when you state that only the first two 2's
count. It reminded me
of my maths O'level revision where you always had to read the question
thoroughly.

Here is what I came up with:
ref [2, 2, 4, 1, 1] lst [2, 2, 5, 2, 4] tmp = [ [val]*min(lst.count(val), ref.count(val)) for val in set(ref)]
tmp [[], [2, 2], [4]] answer = [x for y in tmp for x in y]
answer [2, 2, 4]
I took a lot from Peter Ottens reply to generate tmp then flattened the
inner lists.

After a bit more thought, the intermediate calculation of tmp can be
removed with a
little loss in clarity though, to give:
answer = [ val for val in set(ref) for x in range(min(lst.count(val), ref.count(val)))]
answer

[2, 2, 4]

- Cheers, Paddy.

Jan 19 '06 #11
Mathijs wrote:
Python beginner here and very much enjoying it. I'm looking for a
pythonic way to find how many listmembers are also present in a
reference list. Don't count duplicates (eg. if you already found a
matching member in the ref list, you can't use the ref member
anymore).

Example1:
ref=[2, 2, 4, 1, 1]
list=[2, 3, 4, 5, 3]
solution: 2

Example2:
ref=[2, 2, 4, 1, 1]
list=[2, 2, 5, 2, 4]
solution: 3 (note that only the first two 2's count, the third 2 in
the list should not be counted)


Here's the way I would do it:
def occurrences(it): res = {}
for item in it:
if item in res:
res[item] += 1
else:
res[item] = 1
return res
ref=[2, 2, 4, 1, 1]
lst=[2, 2, 5, 2, 4]
oref = occurrences(ref)
sum(min(v,oref.get(k,0)) for (k,v) in occurrences(lst).iteritems()) 3 lst=[2, 3, 4, 5, 3]
sum(min(v,oref.get(k,0)) for (k,v) in occurrences(lst).iteritems()) 2


Or in other words, define a function to return a dictionary containing
a count of the number of occurrences of each element in the list (this
assumes that the list elements are hashable). Then you just add up the
values in the test list making sure each count is limited to no higher than
the reference count.
Jan 20 '06 #12
Duncan Booth wrote:
Here's the way I would do it:
def occurrences(it):

res = {}
for item in it:
if item in res:
res[item] += 1
else:
res[item] = 1
return res


I slightly prefer:

def occurrences(it):
res = {}
res[item] = res.get(item, 0) + 1
return res
[...] Or in other words, define a function to return a dictionary containing
a count of the number of occurrences of each element in the list (this
assumes that the list elements are hashable). Then you just add up the
values in the test list making sure each count is limited to no higher than
the reference count.


Resulting in a linear-time average case, where the posted
list-comprehension-based solutions are quadratic. The title
of the thread is unfortunate.

The generalized problem is multiset (AKA "bag") intersection:

http://en.wikipedia.org/wiki/Bag_(mathematics)
--
--Bryan

Jan 20 '06 #13
Op 19 jan 2006 vond "ma*******@gmail.com" :
another approach:

ref = [2,2,4,1,1]
lis = [2,2,5,2,4]

len([ref.pop(ref.index(x)) for x in lis if x in ref])


This is the type of solution I was hoping to see: one-liners, with no use
of local variables. As Tim Chase already wrote, it has only one less
elegant side: it alters the original ref list.

Thanks for your suggestion.
Jan 23 '06 #14
Op 19 jan 2006 vond Peter Otten <__*******@web.de> :
sum(min(list.count(n), ref.count(n)) for n in set(ref))

Is that it?


Seems like this is it! Thanks.
Jan 23 '06 #15
Op 19 jan 2006 vond "Paddy" <pa*******@netscape.net>:
answer = [ val for val in set(ref) for x in
range(min(lst.count(val), ref.count(val)))] answer

[2, 2, 4]


I don't think it's correct. Your algoritm with the ref and lst below gives
3 as answer. The answer should have been 2 (1,3).

ref=[3, 3, 1, 1, 3]
lst=[5, 1, 4, 5, 3]

Jan 23 '06 #16
Op 20 jan 2006 vond Duncan Booth <du**********@invalid.invalid>:
Or in other words, define a function to return a dictionary containing
a count of the number of occurrences of each element in the list (this
assumes that the list elements are hashable). Then you just add up the
values in the test list making sure each count is limited to no higher
than the reference count.


Thanks. Though I don't know much about python (yet), this is more or less
the way I'de do it the language I'm more comfortable with (object pascal),
and I wouldn't label this as a pythonic solution. I could be wrong,
though:)
Jan 23 '06 #17
Mathijs wrote:
Op 20 jan 2006 vond Duncan Booth <du**********@invalid.invalid>:
Or in other words, define a function to return a dictionary
containing a count of the number of occurrences of each element in
the list (this assumes that the list elements are hashable). Then you
just add up the values in the test list making sure each count is
limited to no higher than the reference count.


Thanks. Though I don't know much about python (yet), this is more or
less the way I'de do it the language I'm more comfortable with (object
pascal), and I wouldn't label this as a pythonic solution. I could be
wrong, though:)

I'm curious what you think isn't pythonic about this solution?
Jan 23 '06 #18
Mathijs wrote:
Op 20 jan 2006 vond Duncan Booth <du**********@invalid.invalid>:
Or in other words, define a function to return a dictionary containing
a count of the number of occurrences of each element in the list (this
assumes that the list elements are hashable). Then you just add up the
values in the test list making sure each count is limited to no higher
than the reference count.


Thanks. Though I don't know much about python (yet), this is more or less
the way I'de do it the language I'm more comfortable with (object pascal),
and I wouldn't label this as a pythonic solution. I could be wrong,
though:)


You *are* wrong. Also, as the lists grow, Duncan's approach scales *much*
better than, e. g., mine.

If picking a better algorithm were unpythonic there would not be much value
in striving for pythonic solutions.

Peter
Jan 23 '06 #19
On Mon, 23 Jan 2006 17:41:55 +0000, Mathijs wrote:
Op 19 jan 2006 vond "ma*******@gmail.com" :
another approach:

ref = [2,2,4,1,1]
lis = [2,2,5,2,4]

len([ref.pop(ref.index(x)) for x in lis if x in ref])


This is the type of solution I was hoping to see: one-liners, with no use
of local variables.


Because you like unreadable, incomprehensible, unmaintainable code?

*wink*

--
Steven.

Jan 23 '06 #20
Duncan Booth showed how to solve a problem posed by Mathijs. This is
very similar to Duncan's solution, except I (ab)use setdefault on a
regular basis...
def occurrences(t): .... res = {}
.... for item in t:
.... res.setdefault(item,[0])[0] += 1
.... return res
.... ref = [2,2,4,1,1]
lst = [2,2,5,2,4]
oref = occurrences(ref)
sum(min(v[0],oref.get(k,[0])[0]) for (k,v) in occurrences(lst).iteritems()) 3
Actually, this brings up an interesting point:
{}.setdefault('foo',0) += 1

SyntaxError: can't assign to function call

I understand why this doesn't work, but it would sure be nice if it
did. I think somebody was mentioning "mutable ints" at one point,
which is basically what I abuse [0] to provide. If I were doing a lot
of this in one place, I would code a mutable integer class, and then
the rest of the code would get simpler.

Regards,
Pat

Jan 23 '06 #21
Bryan Olson wrote:
Duncan Booth wrote:
Here's the way I would do it:
> def occurrences(it):

res = {}
for item in it:
if item in res:
res[item] += 1
else:
res[item] = 1
return res

I slightly prefer:

def occurrences(it):
res = {}
res[item] = res.get(item, 0) + 1
return res


Arg! In illustrating a trivial point, I dropped something essential.
This version has almost three minutes of testing behind it:

def occurrences(it):
res = {}
for item in it:
res[item] = res.get(item, 0) + 1
return res
--
--Bryan
Jan 24 '06 #22
On 23/01/2006 18:41, Mathijs uttered:
len([ref.pop(ref.index(x)) for x in lis if x in ref])
This is the type of solution I was hoping to see: one-liners, with no
use of local variables. As Tim Chase already wrote, it has only one
less elegant side: it alters the original ref list.

Thanks for your suggestion.


May I suggest another one-liner:

len(set(ref).intersection(lis))

I have no idea how this scales/performs compared to the other
suggestions you've received, but I find it immediately comprehensible.
-- Morten
Jan 24 '06 #23
[Mathijs]
Example2:
ref=[2, 2, 4, 1, 1]
list=[2, 2, 5, 2, 4]
solution: 3 (note that only the first two 2's count, the third 2 in the
list should not be counted)
[Morten Vold]
May I suggest another one-liner:

len(set(ref).intersection(lis))

I have no idea how this scales/performs compared to the other
suggestions you've received, but I find it immediately comprehensible.

len(set([2, 2, 4, 1, 1]).intersection([2, 2, 5, 2, 4]))

2

Close, but no cigar.

Peter
Jan 24 '06 #24
On 24/01/2006 09:46, Peter Otten uttered:
len(set([2, 2, 4, 1, 1]).intersection([2, 2, 5, 2, 4]))

2
Close, but no cigar.


I need more coffee! (note to self: Always read entire problem set first)

-- Morten
Jan 24 '06 #25
Patrick Maupin wrote:
def occurrences(t):
... res = {}
... for item in t:
... res.setdefault(item,[0])[0] += 1
... return res

.... I think somebody was mentioning "mutable ints" at one point,
which is basically what I abuse [0] to provide. If I were doing a lot
of this in one place, I would code a mutable integer class, and then
the rest of the code would get simpler.


Or you could use the facilities built in to Python: use the get method as
Bryan did, and the code gets simpler without using mutable integers.

I prefer writing an 'if' statement here, Bryan prefers 'get', that's just a
choice of style. But 'setdefault' here, that has no style.
Jan 24 '06 #26
Duncan Booth wrote:
I prefer writing an 'if' statement here, Bryan prefers 'get', that's just a
choice of style. But 'setdefault' here, that has no style.


Well, I'm often told I have no style, and I _did_ admit that it's an
abuse of setdefault. However, I often use setdefault to populate
nested dictionaries, or dictionaries of sets or lists, e.g.:

for x, y in somebiglist:
bigdict.setdefault(x,set()).add(y) # Strips duplicates
for x, y in somebiglist:
bigdict.setdefault(x,[]).append(y) # Preserves duplicates

To my mind, this latter is so much cleaner and clearer than any of the
alternatives that it just isn't funny:

for x, y in somebiglist:
if x in bigdict:
bigdict[x].append(y)
else:
bigdict[x] = [y]

for x, y in somebiglist:
if x not in bigdict:
bigdict[x] = []
bigdict[x].append(y)

for x, y in somebiglist:
try:
bigdict[x].append(y)
except KeyError:
bigdict[x] = [y]

etc.

Since I use setdefault in this manner quite often, I am very
comfortable with it. On a single line I know what I am creating (list,
dict, set, etc.), what the default value is, and how it is being
modified.

Because I am NOT a believer in the Perl idiom of TMTOWTDI, and because,
TO ME, setdefault screams "This line is creating and subsequently
modifying a dictionary entry", setdefault is absolutely my first choice
for this particular task.

Having said that, my enthusiasm for setdefault (for the given problem!)
IS tempered somewhat by having to abuse a list as a mutable integer.
That is the only reason that I concede that my solution is an abuse of
setdefault, but the setdefault idiom is (to me) so clear and compelling
that I view it as a toss-up whether to use a single line setdefault or
an if/else in this case.

And, as I mentioned earlier, if I had to do this same thing LOTS of
times in a program, I most likely would code a class (perhaps not as
fancy as a true mutable int, but certainly something with an increment
method), because, IMO, it would be a lot cleaner (hmmm, maybe there's
no style points in that) than tons and tons of if/else statements.

Regards,
Pat

Jan 24 '06 #27
Patrick Maupin wrote:
Duncan Booth wrote:
I prefer writing an 'if' statement here, Bryan prefers 'get', that's
just a choice of style. But 'setdefault' here, that has no style.


Well, I'm often told I have no style, and I _did_ admit that it's an
abuse of setdefault. However, I often use setdefault to populate
nested dictionaries, or dictionaries of sets or lists, e.g.:

for x, y in somebiglist:
bigdict.setdefault(x,set()).add(y) # Strips duplicates
for x, y in somebiglist:
bigdict.setdefault(x,[]).append(y) # Preserves duplicates

To my mind, this latter is so much cleaner and clearer than any of the
alternatives that it just isn't funny:


Yes, but storing a mutable is a not unreasonable use of setdefault. What I
objected to was storing an immutable just to overwrite it immediately.

Also, while I agree it is shorter, I'm not convinced that it is much
cleaner, and it is likely to be slower if there are a lot of duplicates,
possibly a lot slower if the mutable object has much overhead on its
construction.

Jan 24 '06 #28
Umm,
My answer is not correct, but for a different reason; it seems you need
the length of my
previous answer, thus:

PythonWin 2.4 (#60, Feb 9 2005, 19:03:27) [MSC v.1310 32 bit (Intel)]
on win32.
Portions Copyright 1994-2004 Mark Hammond (mh******@skippinet.com.au) -
see 'Help/About PythonWin' for further copyright information.
ref = [3, 3, 1, 1, 3]
lst=[5, 1, 4, 5, 3]
answer = len([ val for val in set(ref) for x in range(min(lst.count(val), ref.count(val)))])
answer 2


- Paddy.

Jan 24 '06 #29
23 jan 2006 ta (Steven D'Aprano) shuo le:
This is the type of solution I was hoping to see: one-liners, with no
use of local variables.
Because you like unreadable, incomprehensible, unmaintainable code?


For practical use: no! But I'm just learning python and to understand
sets/lists/dicts and it's functions, this seemed like good practise:) and
besides that: it's fun! A solution with local vaiables, if-statemenst and a
for loop wouldn't be much different from what I know with my current
programming skills.
*wink*


Yeah, yeah, I know, but I felt a response was on its place. :)
Jan 25 '06 #30

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

6
3070
by: massimo | last post by:
Hey, I wrote this program which should take the numbers entered and sort them out. It doesnąt matter what order, if decreasing or increasing. I guess I'm confused in the sorting part. Anyone...
10
15099
by: Kent | last post by:
Hi! I want to store data (of enemys in a game) as a linked list, each node will look something like the following: struct node { double x,y; // x and y position coordinates struct enemy...
24
5716
by: Robin Cole | last post by:
I'd like a code review if anyone has the time. The code implements a basic skip list library for generic use. I use the following header for debug macros: /* public.h - Public declarations and...
4
3585
by: JS | last post by:
I have a file called test.c. There I create a pointer to a pcb struct: struct pcb {   void *(*start_routine) (void *);   void *arg;   jmp_buf state;   int    stack; }; ...
3
2698
by: chellappa | last post by:
hi this simple sorting , but it not running...please correect error for sorting using pointer or linked list sorting , i did value sorting in linkedlist please correct error #include<stdio.h>...
0
1808
by: drewy2k12 | last post by:
Heres the story, I have to create a doubly linked list for class, and i have no clue on how to do it, i can barely create a single linked list. It has to have both a head and a tail pointer, and...
10
6553
by: AZRebelCowgirl73 | last post by:
This is what I have so far: My program! import java.util.*; import java.lang.*; import java.io.*; import ch06.lists.*; public class UIandDB {
0
8611
by: Atos | last post by:
SINGLE-LINKED LIST Let's start with the simplest kind of linked list : the single-linked list which only has one link per node. That node except from the data it contains, which might be...
12
3999
by: kalyan | last post by:
Hi, I am using Linux + SysV Shared memory (sorry, but my question is all about offset + pointers and not about linux/IPC) and hence use offset's instead on pointers to store the linked list in...
0
7231
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
7401
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
7504
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
5640
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
1
5059
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...
0
3196
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
0
1568
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated ...
1
773
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
0
432
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.