Diez B. Roggisch wrote:

Both of these techniques are O(n^2). You can reduce it to O(n log n)

by using sets:

>set2 = set(list2)

>[x for x in list1 if x not in set2]

Checking to see if an item is in a set is much more efficient than a

list.

Is the set-lookup reliably O(log n)? I was under the impression that it is

hash-based, and this should be O(1) usually, but couldbve O(n) worst-case

(hash the same for _all_ entries).

That's largely a theoretical concern. Google for something like

'''dict worst-case performance "tim peters"'''

to learn more. (The third article there (no doubt obsolete in some

ways, given that it was in 2000) says that Python "keeps at least 1/3 of

the internal hash table entries unused, making collisions very rarely a

problem... It's possible to contrive keys that will cause collisions

systematically ... but unlikely to happen by accident in 2.0")

-Peter