473,379 Members | 1,283 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,379 software developers and data experts.

ordered sets operations on lists..

Hello, Is there a *direct* way of doing set operations on lists which
preserve the order of the input lists ?
For Ex. l1 = [1, 5, 3, 2, 4, 7]
l2 = [3, 5, 10]

and (l1 intersect l2) returns [5, 3] .... (and (l2 intersect l1)
returns [3, 5])

thanks in advance,
amit.

--
----
Amit Khemka -- onyomo.com
Endless the world's turn, endless the sun's Spinning, Endless the quest;
I turn again, back to my own beginning, And here, find rest.
Feb 10 '06 #1
13 1828

Amit Khemka wrote:
Hello, Is there a *direct* way of doing set operations on lists which
preserve the order of the input lists ?
For Ex. l1 = [1, 5, 3, 2, 4, 7]
l2 = [3, 5, 10]

and (l1 intersect l2) returns [5, 3] .... (and (l2 intersect l1)
returns [3, 5])

what do you mean by "direct" way ? ugly(some said) one liner ?

filter(set(l1).intersection(set(l2)).__contains__, l1)
filter(set(l1).intersection(set(l2)).__contains__, l2)

Feb 10 '06 #2
Amit Khemka wrote:
Hello, Is there a *direct* way of doing set operations on lists which
preserve the order of the input lists ? Nope
For Ex. l1 = [1, 5, 3, 2, 4, 7]
l2 = [3, 5, 10]

and (l1 intersect l2) returns [5, 3] .... (and (l2 intersect l1)
returns [3, 5])


However:
intersection = set(list1) & set(list2)
[element for element in list1 if element in intersection]
or
[element for element in list2 if element in intersection]
Give you the result you'd like.

--Scott David Daniels
sc***********@acm.org
Feb 10 '06 #3
[Amit Khemka]
Hello, Is there a *direct* way of doing set operations on lists which
preserve the order of the input lists ?
For Ex. l1 = [1, 5, 3, 2, 4, 7]
l2 = [3, 5, 10]

and (l1 intersect l2) returns [5, 3] .... (and (l2 intersect l1)
[bonono]
what do you mean by "direct" way ? ugly(some said) one liner ?

filter(set(l1).intersection(set(l2)).__contains__, l1)
filter(set(l1).intersection(set(l2)).__contains__, l2)


The intersection step is unnecessary, so the answer can be simplified a
bit:
filter(set(l2).__contains__, l1) [5, 3] filter(set(l1).__contains__, l2)

[3, 5]

Feb 10 '06 #4

Raymond Hettinger wrote:
The intersection step is unnecessary, so the answer can be simplified a
bit:
filter(set(l2).__contains__, l1) [5, 3] filter(set(l1).__contains__, l2)

[3, 5]


stand corrected.

Feb 11 '06 #5
Raymond Hettinger <py****@rcn.com> wrote:
...
The intersection step is unnecessary, so the answer can be simplified a
bit:
filter(set(l2).__contains__, l1) [5, 3] filter(set(l1).__contains__, l2)

[3, 5]


....and if one has time to waste, "setification" being only an
optimization, it can also be removed: filter(l2.__contains__, l1) etc
(very slow for long lists, of course).

Personally, I'd always use (depending on guesses regarding lengths of
lists) [x for x in l1 if x in l2] or the setified equivalent, of course.
Alex
Feb 11 '06 #6
On Sat, 11 Feb 2006 10:24:04 -0800, al*****@yahoo.com (Alex Martelli) wrote:
Raymond Hettinger <py****@rcn.com> wrote:
...
The intersection step is unnecessary, so the answer can be simplified a
bit:
>>> filter(set(l2).__contains__, l1)

[5, 3]
>>> filter(set(l1).__contains__, l2)

[3, 5]


...and if one has time to waste, "setification" being only an
optimization, it can also be removed: filter(l2.__contains__, l1) etc
(very slow for long lists, of course).

Personally, I'd always use (depending on guesses regarding lengths of
lists) [x for x in l1 if x in l2] or the setified equivalent, of course.

Perhaps newbies should be advised that

[x for x in l1 if x in set(l2)]

is not a (well) setified equivalent? I could see them being tempted.

Regards,
Bengt Richter
Feb 12 '06 #7
Bengt Richter <bo**@oz.net> wrote:
...
Personally, I'd always use (depending on guesses regarding lengths of
lists) [x for x in l1 if x in l2] or the setified equivalent, of course.

Perhaps newbies should be advised that

[x for x in l1 if x in set(l2)]

is not a (well) setified equivalent? I could see them being tempted.


You mean, newbies should be advised that Python does NOT hoist any
computations whatsoever from the body of a loop (including LCs and
genexps), so if you want anything hoisted you need to hoist it yourself?
Yes, it is a point worth making, since the lack of hoisting is a
frequent cause of performance loss.
Alex
Feb 12 '06 #8
Alex Martelli wrote:
Bengt Richter <bo**@oz.net> wrote:
...
Personally, I'd always use (depending on guesses regarding lengths of
lists) [x for x in l1 if x in l2] or the setified equivalent, of course.


Perhaps newbies should be advised that

[x for x in l1 if x in set(l2)]

is not a (well) setified equivalent? I could see them being tempted.

You mean, newbies should be advised that Python does NOT hoist any
computations whatsoever from the body of a loop (including LCs and
genexps), so if you want anything hoisted you need to hoist it yourself?
Yes, it is a point worth making, since the lack of hoisting is a
frequent cause of performance loss.

Of course now 2.5 is planning to include the AST parser there's fruitful
ground for optimization studies that will perform hoisting automatically
(should anyone be looking for a project ;-)

Given that Python 2.4 doesn't even perform simple constant folding for
arithmetic expressions
dis.dis(compile("print 1+2", '', 'exec'))

1 0 LOAD_CONST 0 (1)
3 LOAD_CONST 1 (2)
6 BINARY_ADD
7 PRINT_ITEM
8 PRINT_NEWLINE
9 LOAD_CONST 2 (None)
12 RETURN_VALUE

even hoisting could be seen as an "advanced" optimization.

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

Feb 13 '06 #9
Em Dom, 2006-02-12 Ã*s 23:15 -0500, Steve Holden escreveu:
Given that Python 2.4 doesn't even perform simple constant folding for
arithmetic expressions
[snip]


May I ask why doesn't it perform such optimization? Is there any special
difficulties in doing so with the Python compiler?

Also, IIRC Psyco does optimize these constant expressions. Or am I
wrong?

Cheers,
Felipe.

--
"Quem excele em empregar a força militar subjulga os exércitos dos
outros povos sem travar batalha, toma cidades fortificadas dos outros
povos sem as atacar e destrói os estados dos outros povos sem lutas
prolongadas. Deve lutar sob o Céu com o propósito primordial da
'preservação'. Desse modo suas armas não se embotarão, e os ganhos
poderão ser preservados. Essa é a estratégia para planejar ofensivas."

-- Sun Tzu, em "A arte da guerra"

Feb 13 '06 #10
Felipe Almeida Lessa wrote:
Em Dom, 2006-02-12 Ã*s 23:15 -0500, Steve Holden escreveu:
Given that Python 2.4 doesn't even perform simple constant folding for
arithmetic expressions
[snip]

May I ask why doesn't it perform such optimization? Is there any special
difficulties in doing so with the Python compiler?

As well to ask why the sky is blue, and has those little white things in
it (unless you live in Arizona) :-)

The basic answer is that so far no developer has felt it worthwhile to
expend time on adding these optimizations.
Also, IIRC Psyco does optimize these constant expressions. Or am I
wrong?

Psyco does some very advanced things, but it does them all at run-time.
Unless I misunderstand (not unheard of), there are no circumstances
under which Psyco will improve run-time for a piece of code that is only
executed once.

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

Feb 13 '06 #11
Em Dom, 2006-02-12 Ã*s 23:51 -0500, Steve Holden escreveu:
The basic answer is that so far no developer has felt it worthwhile to
expend time on adding these optimizations.


I always thought these small optimizations could lead Python to be
faster overall. I remember about this every time I see CPython vs.
IronPython benchmarks (.NET and Mono do some nice optimizations at
compile and run times).
Also, IIRC Psyco does optimize these constant expressions. Or am I
wrong?

Psyco does some very advanced things, but it does them all at run-time.
Unless I misunderstand (not unheard of), there are no circumstances
under which Psyco will improve run-time for a piece of code that is only
executed once.


Sorry, I think I should have been clearer. Yes, Psyco only helps at
runtime (when the function is called), but those constant folds only
practically help on parts of the code that are called many times anyway,
right?

--
"Quem excele em empregar a força militar subjulga os exércitos dos
outros povos sem travar batalha, toma cidades fortificadas dos outros
povos sem as atacar e destrói os estados dos outros povos sem lutas
prolongadas. Deve lutar sob o Céu com o propósito primordial da
'preservação'. Desse modo suas armas não se embotarão, e os ganhos
poderão ser preservados. Essa é a estratégia para planejar ofensivas."

-- Sun Tzu, em "A arte da guerra"

Feb 13 '06 #12
Felipe Almeida Lessa wrote:
Em Dom, 2006-02-12 Ã*s 23:51 -0500, Steve Holden escreveu:
The basic answer is that so far no developer has felt it worthwhile to
expend time on adding these optimizations.

I always thought these small optimizations could lead Python to be
faster overall. I remember about this every time I see CPython vs.
IronPython benchmarks (.NET and Mono do some nice optimizations at
compile and run times).

Indeed it is true that on some benchmarks IronPython is faster than
CPython. I suspect this was helped by the fact that IronPython is Jim's
third implementation of Python (I believe he was familiar with CPython
before he started on Jython, formerly known as JPython).

The fact remains that until someone writes the code it can't be included
in the implementation. Performance optimization, unfortunately, doesn't
have the same glamorous cachet as more esoteric language features.
Also, IIRC Psyco does optimize these constant expressions. Or am I
wrong?


Psyco does some very advanced things, but it does them all at run-time.
Unless I misunderstand (not unheard of), there are no circumstances
under which Psyco will improve run-time for a piece of code that is only
executed once.

Sorry, I think I should have been clearer. Yes, Psyco only helps at
runtime (when the function is called), but those constant folds only
practically help on parts of the code that are called many times anyway,
right?

Right. Technically constant folding is a win for a single execution, but
only if you don't take the slightly-increased compile time into account :-)

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

Feb 13 '06 #13
Bengt Richter wrote:
Perhaps newbies should be advised that

[x for x in l1 if x in set(l2)]


But the resulting list is a representative of bag not a set ( contains
multiple occurrences of elements ):
[x for x in [3, 3] if s in Set([3])] [3,3]

Same with Raymonds solution:
filter(Set([3]).__contains__, [3,3])

[3, 3]

Kay

Feb 13 '06 #14

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

33
by: Jim Cobban | last post by:
I cannot get Netscape 4.79 to properly display the ordered list in the following fragment. <P>Get a specific portion of the date. Depending upon the value of index: <ol start=0> <li>complete...
2
by: John Trunek | last post by:
I need to obtain all the ordered subsets of containing 7 elements from a list of 15 elements. From what I remember, the number of ordered sets is given by the formula numOfSets = 15! / (15-8)! =...
41
by: Odd-R. | last post by:
I have to lists, A and B, that may, or may not be equal. If they are not identical, I want the output to be three new lists, X,Y and Z where X has all the elements that are in A, but not in B, and...
210
by: Christoph Zwerschke | last post by:
This is probably a FAQ, but I dare to ask it nevertheless since I haven't found a satisfying answer yet: Why isn't there an "ordered dictionary" class at least in the standard list? Time and again...
11
by: John Salerno | last post by:
I'd like to compare the values in two different sets to test if any of the positions in either set share the same value (e.g., if the third element of each set is an 'a', then the test fails). I...
22
by: bearophileHUGS | last post by:
>From this interesting blog entry by Lawrence Oluyede: http://www.oluyede.org/blog/2006/07/05/europython-day-2/ and the Py3.0 PEPs, I think the people working on Py3.0 are doing a good job, I am...
12
by: David Isaac | last post by:
Is it expected for access to set elements to be much slower than access to list elements? Explanation? Thanks, Alan Isaac 9.806250235714316 3.9823075279120701
11
by: Prateek | last post by:
I have 3 variable length lists of sets. I need to find the common elements in each list (across sets) really really quickly. Here is some sample code: # Doesn't make sense to union the sets -...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.