Hi,
Python beginner here and very much enjoying it. I'm looking for a
pythonic way to find how many listmembers are also present in a reference
list. Don't count duplicates (eg. if you already found a matching member
in the ref list, you can't use the ref member anymore).
Example1:
ref=[2, 2, 4, 1, 1]
list=[2, 3, 4, 5, 3]
solution: 2
Example2:
ref=[2, 2, 4, 1, 1]
list=[2, 2, 5, 2, 4]
solution: 3 (note that only the first two 2's count, the third 2 in the
list should not be counted)
Any suggestions or comments?
Thanks.
M.
#my failing effort:
sum([min(r.count(n)-l[:i].count(n),l.count(n)) for i,n in enumerate(l)])
#test lists
import random
#reference list
r=[random.randint(1,5) for n in range(5)]
#list
l=[random.randint(1,5) for n in range(5)] 29 1497
> Python beginner here and very much enjoying it. I'm looking for a pythonic way to find how many listmembers are also present in a reference list. Don't count duplicates (eg. if you already found a matching member in the ref list, you can't use the ref member anymore).
Example1: ref=[2, 2, 4, 1, 1] list=[2, 3, 4, 5, 3] solution: 2
Example2: ref=[2, 2, 4, 1, 1] list=[2, 2, 5, 2, 4] solution: 3 (note that only the first two 2's count, the third 2 in the list should not be counted)
It sounds like you're looking for "set" operations: (using "ell"
for clarity) from sets import Set a = [2,2,4,1,1] b = [2,3,4,5,3] setA = Set(a) setB = Set(b) results = setA.intersection(setB) results
Set([2,4]) intersection = [x for x in results] intersection
[2,4]
I'm a tad confused by the help, as it sounds like sets are
supposed to be first-class citizens, but in ver2.3.5 that I'm
running here (or rather "there", on a friend's box), I have to
"import sets" which I didn't see mentioned in the reference manual.
-one of many tims on the list
tim = Set(["bald", "vegetarian", "loving husband"])
:)
def reference(alist1,alist2):
counter = 0
for x in lis:
if x in ref:
ref.pop(ref.index(x))
counter += 1
return counter
this works I think for your examples, but you should check it against
them and other cases.
good luck
revision of previous:
def reference(reflist,alist2):
counter = 0
for x in alist2:
if x in reflist:
reflist.pop(reflist.index(x))
counter += 1
return counter
another approach:
ref = [2,2,4,1,1]
lis = [2,2,5,2,4]
len([ref.pop(ref.index(x)) for x in lis if x in ref])
Tim Chase wrote: Python beginner here and very much enjoying it. I'm looking > for a pythonic way to find how many listmembers are also > present in a reference list. Don't count duplicates (eg. if > you already found a matching member in the ref list, you can't > use the ref member anymore). > > Example1: > ref=[2, 2, 4, 1, 1] > list=[2, 3, 4, 5, 3] > solution: 2 > > Example2: > ref=[2, 2, 4, 1, 1] > list=[2, 2, 5, 2, 4] > solution: 3 (note that only the first two 2's count, the third > 2 in the list should not be counted)
It sounds like you're looking for "set" operations: (using "ell" for clarity)
>>> from sets import Set >>> a = [2,2,4,1,1] >>> b = [2,3,4,5,3] >>> setA = Set(a) >>> setB = Set(b) >>> results = setA.intersection(setB) >>> results Set([2,4]) >>> intersection = [x for x in results] >>> intersection
[2,4]
won't set remove duplicates which he wants to preserve ? He is not just
looking for the 'values' that is in common, but the occurence as well,
if I understand the requirement correctly.
I would just build a dict with the value as the key and occurence as
the value then loop the list and lookup.
Mathijs wrote: Python beginner here and very much enjoying it. I'm looking for a pythonic way to find how many listmembers are also present in a reference list. Don't count duplicates (eg. if you already found a matching member in the ref list, you can't use the ref member anymore).
Example1: ref=[2, 2, 4, 1, 1] list=[2, 3, 4, 5, 3] solution: 2
Example2: ref=[2, 2, 4, 1, 1] list=[2, 2, 5, 2, 4] solution: 3 (note that only the first two 2's count, the third 2 in the list should not be counted)
Any suggestions or comments?
sum(min(list.count(n), ref.count(n)) for n in set(ref))
Is that it?
Peter
Tim Chase wrote: I'm a tad confused by the help, as it sounds like sets are supposed to be first-class citizens, but in ver2.3.5 that I'm running here (or rather "there", on a friend's box), I have to "import sets" which I didn't see mentioned in the reference manual.
set and frozenset are a builtins starting with Python 2.4. http://www.python.org/2.4/highlights.html
"""
New or upgraded built-ins
built-in sets - the sets module, introduced in 2.3, has now been implemented
in C, and the set and frozenset types are available as built-in types (PEP
218)
"""
Peter
Tim Chase wrote:
<snip> I'm a tad confused by the help, as it sounds like sets are supposed to be first-class citizens, but in ver2.3.5 that I'm running here (or rather "there", on a friend's box), I have to "import sets" which I didn't see mentioned in the reference manual.
-one of many tims on the list tim = Set(["bald", "vegetarian", "loving husband"])
:)
The builtin "set" was added in 2.4, I believe (note lowercase "s"): http://docs.python.org/whatsnew/node2.html
print "vegetarian".replace("etari","")
:-)
>> >Python beginner here and very much enjoying it. I'm looking > for a pythonic way to find how many listmembers are also > present in a reference list. Don't count duplicates (eg. if
[snipped]
won't set remove duplicates which he wants to preserve ?
My reading was that the solution shouldn't count duplicates
("Don't count duplicates"). However, once you mentioned it, and
I saw other folks' responses that looked very diff. from my own,
I re-read the OP's comments and found that I missed this key bit:
"note that only the first *two* 2's count, the third 2
in the list should not be counted"
(*emphasis* mine)
and that the desired result was a count (though a len(Set())
would return the right count if the OP wanted the true
intersection, but that's beside the point).
My error. Sorry, ladies and gentlemen :)
I'm partial to the elegance of markscala's suggestion of:
len([ref.pop(ref.index(x)) for x in lis if x in ref])
though it might need to come with a caveat that it doesn't leave
"ref" in the same state is it was originally, so it should be
copied and then manipulated thusly:
r = ref[:]
len([r.pop(r.index(x)) for x in (lis if x in r])
which would then leave ref undisturbed. As tested:
import random
ref = [random.randint(1,5) for n in range(5)]
for x in range(1,10):
lis = [random.randint(1,5) for n in range(5)]
r = ref[:]
print repr((r,lis))
print len([r.pop(r.index(x)) for x in lis if x in r])
seems to give the results the OP describes.
-tim
Hi,
I liked the twist at the end when you state that only the first two 2's
count. It reminded me
of my maths O'level revision where you always had to read the question
thoroughly.
Here is what I came up with: ref
[2, 2, 4, 1, 1] lst
[2, 2, 5, 2, 4] tmp = [ [val]*min(lst.count(val), ref.count(val)) for val in set(ref)] tmp
[[], [2, 2], [4]] answer = [x for y in tmp for x in y] answer
[2, 2, 4]
I took a lot from Peter Ottens reply to generate tmp then flattened the
inner lists.
After a bit more thought, the intermediate calculation of tmp can be
removed with a
little loss in clarity though, to give:
answer = [ val for val in set(ref) for x in range(min(lst.count(val), ref.count(val)))] answer
[2, 2, 4]
- Cheers, Paddy.
Mathijs wrote: Python beginner here and very much enjoying it. I'm looking for a pythonic way to find how many listmembers are also present in a reference list. Don't count duplicates (eg. if you already found a matching member in the ref list, you can't use the ref member anymore).
Example1: ref=[2, 2, 4, 1, 1] list=[2, 3, 4, 5, 3] solution: 2
Example2: ref=[2, 2, 4, 1, 1] list=[2, 2, 5, 2, 4] solution: 3 (note that only the first two 2's count, the third 2 in the list should not be counted)
Here's the way I would do it: def occurrences(it):
res = {}
for item in it:
if item in res:
res[item] += 1
else:
res[item] = 1
return res
ref=[2, 2, 4, 1, 1] lst=[2, 2, 5, 2, 4] oref = occurrences(ref) sum(min(v,oref.get(k,0)) for (k,v) in occurrences(lst).iteritems())
3 lst=[2, 3, 4, 5, 3] sum(min(v,oref.get(k,0)) for (k,v) in occurrences(lst).iteritems())
2
Or in other words, define a function to return a dictionary containing
a count of the number of occurrences of each element in the list (this
assumes that the list elements are hashable). Then you just add up the
values in the test list making sure each count is limited to no higher than
the reference count.
Duncan Booth wrote: Here's the way I would do it:
def occurrences(it): res = {} for item in it: if item in res: res[item] += 1 else: res[item] = 1 return res
I slightly prefer:
def occurrences(it):
res = {}
res[item] = res.get(item, 0) + 1
return res
[...] Or in other words, define a function to return a dictionary containing a count of the number of occurrences of each element in the list (this assumes that the list elements are hashable). Then you just add up the values in the test list making sure each count is limited to no higher than the reference count.
Resulting in a linear-time average case, where the posted
list-comprehension-based solutions are quadratic. The title
of the thread is unfortunate.
The generalized problem is multiset (AKA "bag") intersection: http://en.wikipedia.org/wiki/Bag_(mathematics)
--
--Bryan
Op 19 jan 2006 vond "ma*******@gmail.com" : another approach:
ref = [2,2,4,1,1] lis = [2,2,5,2,4]
len([ref.pop(ref.index(x)) for x in lis if x in ref])
This is the type of solution I was hoping to see: one-liners, with no use
of local variables. As Tim Chase already wrote, it has only one less
elegant side: it alters the original ref list.
Thanks for your suggestion.
Op 19 jan 2006 vond Peter Otten <__*******@web.de> : sum(min(list.count(n), ref.count(n)) for n in set(ref))
Is that it?
Seems like this is it! Thanks.
Op 19 jan 2006 vond "Paddy" <pa*******@netscape.net>: answer = [ val for val in set(ref) for x in range(min(lst.count(val), ref.count(val)))] answer
[2, 2, 4]
I don't think it's correct. Your algoritm with the ref and lst below gives
3 as answer. The answer should have been 2 (1,3).
ref=[3, 3, 1, 1, 3]
lst=[5, 1, 4, 5, 3]
Op 20 jan 2006 vond Duncan Booth <du**********@invalid.invalid>: Or in other words, define a function to return a dictionary containing a count of the number of occurrences of each element in the list (this assumes that the list elements are hashable). Then you just add up the values in the test list making sure each count is limited to no higher than the reference count.
Thanks. Though I don't know much about python (yet), this is more or less
the way I'de do it the language I'm more comfortable with (object pascal),
and I wouldn't label this as a pythonic solution. I could be wrong,
though:)
Mathijs wrote: Op 20 jan 2006 vond Duncan Booth <du**********@invalid.invalid>: Or in other words, define a function to return a dictionary containing a count of the number of occurrences of each element in the list (this assumes that the list elements are hashable). Then you just add up the values in the test list making sure each count is limited to no higher than the reference count.
Thanks. Though I don't know much about python (yet), this is more or less the way I'de do it the language I'm more comfortable with (object pascal), and I wouldn't label this as a pythonic solution. I could be wrong, though:)
I'm curious what you think isn't pythonic about this solution?
Mathijs wrote: Op 20 jan 2006 vond Duncan Booth <du**********@invalid.invalid>: Or in other words, define a function to return a dictionary containing a count of the number of occurrences of each element in the list (this assumes that the list elements are hashable). Then you just add up the values in the test list making sure each count is limited to no higher than the reference count.
Thanks. Though I don't know much about python (yet), this is more or less the way I'de do it the language I'm more comfortable with (object pascal), and I wouldn't label this as a pythonic solution. I could be wrong, though:)
You *are* wrong. Also, as the lists grow, Duncan's approach scales *much*
better than, e. g., mine.
If picking a better algorithm were unpythonic there would not be much value
in striving for pythonic solutions.
Peter
On Mon, 23 Jan 2006 17:41:55 +0000, Mathijs wrote: Op 19 jan 2006 vond "ma*******@gmail.com" :
another approach:
ref = [2,2,4,1,1] lis = [2,2,5,2,4]
len([ref.pop(ref.index(x)) for x in lis if x in ref])
This is the type of solution I was hoping to see: one-liners, with no use of local variables.
Because you like unreadable, incomprehensible, unmaintainable code?
*wink*
--
Steven.
Duncan Booth showed how to solve a problem posed by Mathijs. This is
very similar to Duncan's solution, except I (ab)use setdefault on a
regular basis... def occurrences(t):
.... res = {}
.... for item in t:
.... res.setdefault(item,[0])[0] += 1
.... return res
.... ref = [2,2,4,1,1] lst = [2,2,5,2,4] oref = occurrences(ref) sum(min(v[0],oref.get(k,[0])[0]) for (k,v) in occurrences(lst).iteritems())
3
Actually, this brings up an interesting point:
{}.setdefault('foo',0) += 1
SyntaxError: can't assign to function call
I understand why this doesn't work, but it would sure be nice if it
did. I think somebody was mentioning "mutable ints" at one point,
which is basically what I abuse [0] to provide. If I were doing a lot
of this in one place, I would code a mutable integer class, and then
the rest of the code would get simpler.
Regards,
Pat
Bryan Olson wrote: Duncan Booth wrote:
Here's the way I would do it:
> def occurrences(it):
res = {} for item in it: if item in res: res[item] += 1 else: res[item] = 1 return res
I slightly prefer:
def occurrences(it): res = {} res[item] = res.get(item, 0) + 1 return res
Arg! In illustrating a trivial point, I dropped something essential.
This version has almost three minutes of testing behind it:
def occurrences(it):
res = {}
for item in it:
res[item] = res.get(item, 0) + 1
return res
--
--Bryan
On 23/01/2006 18:41, Mathijs uttered: len([ref.pop(ref.index(x)) for x in lis if x in ref])
This is the type of solution I was hoping to see: one-liners, with no use of local variables. As Tim Chase already wrote, it has only one less elegant side: it alters the original ref list.
Thanks for your suggestion.
May I suggest another one-liner:
len(set(ref).intersection(lis))
I have no idea how this scales/performs compared to the other
suggestions you've received, but I find it immediately comprehensible.
-- Morten
[Mathijs] Example2: ref=[2, 2, 4, 1, 1] list=[2, 2, 5, 2, 4] solution: 3 (note that only the first two 2's count, the third 2 in the list should not be counted)
[Morten Vold]
May I suggest another one-liner:
len(set(ref).intersection(lis))
I have no idea how this scales/performs compared to the other suggestions you've received, but I find it immediately comprehensible. len(set([2, 2, 4, 1, 1]).intersection([2, 2, 5, 2, 4]))
2
Close, but no cigar.
Peter
On 24/01/2006 09:46, Peter Otten uttered: len(set([2, 2, 4, 1, 1]).intersection([2, 2, 5, 2, 4]))
2 Close, but no cigar.
I need more coffee! (note to self: Always read entire problem set first)
-- Morten
Patrick Maupin wrote: def occurrences(t): ... res = {} ... for item in t: ... res.setdefault(item,[0])[0] += 1 ... return res
.... I think somebody was mentioning "mutable ints" at one point, which is basically what I abuse [0] to provide. If I were doing a lot of this in one place, I would code a mutable integer class, and then the rest of the code would get simpler.
Or you could use the facilities built in to Python: use the get method as
Bryan did, and the code gets simpler without using mutable integers.
I prefer writing an 'if' statement here, Bryan prefers 'get', that's just a
choice of style. But 'setdefault' here, that has no style.
Duncan Booth wrote: I prefer writing an 'if' statement here, Bryan prefers 'get', that's just a choice of style. But 'setdefault' here, that has no style.
Well, I'm often told I have no style, and I _did_ admit that it's an
abuse of setdefault. However, I often use setdefault to populate
nested dictionaries, or dictionaries of sets or lists, e.g.:
for x, y in somebiglist:
bigdict.setdefault(x,set()).add(y) # Strips duplicates
for x, y in somebiglist:
bigdict.setdefault(x,[]).append(y) # Preserves duplicates
To my mind, this latter is so much cleaner and clearer than any of the
alternatives that it just isn't funny:
for x, y in somebiglist:
if x in bigdict:
bigdict[x].append(y)
else:
bigdict[x] = [y]
for x, y in somebiglist:
if x not in bigdict:
bigdict[x] = []
bigdict[x].append(y)
for x, y in somebiglist:
try:
bigdict[x].append(y)
except KeyError:
bigdict[x] = [y]
etc.
Since I use setdefault in this manner quite often, I am very
comfortable with it. On a single line I know what I am creating (list,
dict, set, etc.), what the default value is, and how it is being
modified.
Because I am NOT a believer in the Perl idiom of TMTOWTDI, and because,
TO ME, setdefault screams "This line is creating and subsequently
modifying a dictionary entry", setdefault is absolutely my first choice
for this particular task.
Having said that, my enthusiasm for setdefault (for the given problem!)
IS tempered somewhat by having to abuse a list as a mutable integer.
That is the only reason that I concede that my solution is an abuse of
setdefault, but the setdefault idiom is (to me) so clear and compelling
that I view it as a toss-up whether to use a single line setdefault or
an if/else in this case.
And, as I mentioned earlier, if I had to do this same thing LOTS of
times in a program, I most likely would code a class (perhaps not as
fancy as a true mutable int, but certainly something with an increment
method), because, IMO, it would be a lot cleaner (hmmm, maybe there's
no style points in that) than tons and tons of if/else statements.
Regards,
Pat
Patrick Maupin wrote: Duncan Booth wrote:
I prefer writing an 'if' statement here, Bryan prefers 'get', that's just a choice of style. But 'setdefault' here, that has no style.
Well, I'm often told I have no style, and I _did_ admit that it's an abuse of setdefault. However, I often use setdefault to populate nested dictionaries, or dictionaries of sets or lists, e.g.:
for x, y in somebiglist: bigdict.setdefault(x,set()).add(y) # Strips duplicates
for x, y in somebiglist: bigdict.setdefault(x,[]).append(y) # Preserves duplicates
To my mind, this latter is so much cleaner and clearer than any of the alternatives that it just isn't funny:
Yes, but storing a mutable is a not unreasonable use of setdefault. What I
objected to was storing an immutable just to overwrite it immediately.
Also, while I agree it is shorter, I'm not convinced that it is much
cleaner, and it is likely to be slower if there are a lot of duplicates,
possibly a lot slower if the mutable object has much overhead on its
construction.
Umm,
My answer is not correct, but for a different reason; it seems you need
the length of my
previous answer, thus:
PythonWin 2.4 (#60, Feb 9 2005, 19:03:27) [MSC v.1310 32 bit (Intel)]
on win32.
Portions Copyright 1994-2004 Mark Hammond (mh******@skippinet.com.au) -
see 'Help/About PythonWin' for further copyright information. ref = [3, 3, 1, 1, 3] lst=[5, 1, 4, 5, 3] answer = len([ val for val in set(ref) for x in range(min(lst.count(val), ref.count(val)))]) answer
2
- Paddy.
23 jan 2006 ta (Steven D'Aprano) shuo le: This is the type of solution I was hoping to see: one-liners, with no use of local variables. Because you like unreadable, incomprehensible, unmaintainable code?
For practical use: no! But I'm just learning python and to understand
sets/lists/dicts and it's functions, this seemed like good practise:) and
besides that: it's fun! A solution with local vaiables, if-statemenst and a
for loop wouldn't be much different from what I know with my current
programming skills.
*wink*
Yeah, yeah, I know, but I felt a response was on its place. :) This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics |
by: massimo |
last post by:
Hey,
I wrote this program which should take the numbers entered and sort them
out. It doesnąt matter what order, if decreasing or increasing.
I guess I'm confused in the sorting part. Anyone...
|
by: Kent |
last post by:
Hi!
I want to store data (of enemys in a game) as a linked list, each node will
look something like the following:
struct node
{
double x,y; // x and y position coordinates
struct enemy...
|
by: Robin Cole |
last post by:
I'd like a code review if anyone has the time. The code implements a basic
skip list library for generic use. I use the following header for debug
macros:
/*
public.h - Public declarations and...
|
by: JS |
last post by:
I have a file called test.c. There I create a pointer to a pcb struct:
struct pcb {
void *(*start_routine) (void *);
void *arg;
jmp_buf state;
int stack;
};
...
|
by: chellappa |
last post by:
hi
this simple sorting , but it not running...please correect error for
sorting using pointer or linked list sorting , i did value sorting in
linkedlist
please correct error
#include<stdio.h>...
| |
by: drewy2k12 |
last post by:
Heres the story, I have to create a doubly linked list for
class, and i have no clue on how to do it, i can barely create a
single linked list. It has to have both a head and a tail pointer,
and...
|
by: AZRebelCowgirl73 |
last post by:
This is what I have so far:
My program!
import java.util.*;
import java.lang.*;
import java.io.*;
import ch06.lists.*;
public class UIandDB {
|
by: Atos |
last post by:
SINGLE-LINKED LIST
Let's start with the simplest kind of linked list : the single-linked list which only has one link per node. That node except from the data it contains, which might be...
|
by: kalyan |
last post by:
Hi,
I am using Linux + SysV Shared memory (sorry, but my question is all
about offset + pointers and not about linux/IPC) and hence use
offset's instead on pointers to store the linked list in...
|
by: marktang |
last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
|
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
| |
by: tracyyun |
last post by:
Dear forum friends,
With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
|
by: agi2029 |
last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
|
by: isladogs |
last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM).
In this session, we are pleased to welcome a new...
|
by: adsilva |
last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
|
by: 6302768590 |
last post by:
Hai team
i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated ...
|
by: muto222 |
last post by:
How can i add a mobile payment intergratation into php mysql website.
| |
by: bsmnconsultancy |
last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence...
| |