
finding items that occur more than once in a list

Is there a more efficient way to do this?

def f(L):
    '''Return a set of the items that occur more than once in L.'''
    L = list(L)
    for item in set(L):
        L.remove(item)
    return set(L)
>>> f([0, 0, 1, 1, 2, 2, 3])
set([0, 1, 2])

Mar 18 '08 #1
22 Replies


On Mar 18, 11:57 am, Simon Forman <sajmik...@gmail.com> wrote:
> Is there a more efficient way to do this?
>
> def f(L):
>     '''Return a set of the items that occur more than once in L.'''
>     L = list(L)
>     for item in set(L):
>         L.remove(item)
>     return set(L)
>
> >>> f([0, 0, 1, 1, 2, 2, 3])
> set([0, 1, 2])
def f(L):
    D = dict()
    for item in L:
        if item in D:
            D[item] += 1
        else:
            D[item] = 1
    return [i for i, j in D.items() if j > 1]

That would be my way to do it; you'd need to time both over several
thousand iterations to see which one is more efficient, though.
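
For anyone who wants to actually run that comparison, a minimal timeit
harness along these lines should do it (a sketch only; `f_remove` and
`f_count` are just illustrative names for the two versions above, and the
test data is made up):

import timeit

def f_remove(L):
    # original version: one list.remove() per distinct item
    L = list(L)
    for item in set(L):
        L.remove(item)
    return set(L)

def f_count(L):
    # dict-counting version from this post
    D = dict()
    for item in L:
        if item in D:
            D[item] += 1
        else:
            D[item] = 1
    return [i for i, j in D.items() if j > 1]

data = range(1000) * 2  # made-up workload: every value occurs twice

for name in ("f_remove", "f_count"):
    t = timeit.Timer("%s(data)" % name, "from __main__ import %s, data" % name)
    print name, min(t.repeat(3, 10))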
Mar 18 '08 #2

Simon Forman <sa*******@gmail.com> writes:
> Is there a more efficient way to do this?
http://aspn.activestate.com/ASPN/Coo.../Recipe/502263
Mar 18 '08 #3

Simon Forman wrote:
> Is there a more efficient way to do this?
>
> def f(L):
>     '''Return a set of the items that occur more than once in L.'''
>     L = list(L)
>     for item in set(L):
>         L.remove(item)
>     return set(L)
That's neat, but quadratic time because list.remove() requires
a linear search. We can make an efficient variant by using
remove on a set rather than a list:

def multiples(lst):
    singles = set(lst)
    mults = set()
    for x in lst:
        if x in singles:
            singles.remove(x)
        else:
            mults.add(x)
    return mults

Though probably better is:

def multiples(lst):
    seen = set()
    mults = set()
    for x in lst:
        if x in seen:
            mults.add(x)
        else:
            seen.add(x)
    return mults
I've typically used dicts for such things, as in:

def multiples(lst):
    h = {}
    for x in lst:
        h[x] = h.get(x, 0) + 1
    return set([x for x in h if h[x] > 1])
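
The same counting idea reads a little cleaner with
collections.defaultdict (new in Python 2.5); a rough sketch:

from collections import defaultdict

def multiples(lst):
    h = defaultdict(int)  # missing keys start at 0
    for x in lst:
        h[x] += 1
    return set(x for x in h if h[x] > 1)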
--
--Bryan
Mar 18 '08 #4

On Mar 18, 2:57 am, Simon Forman <sajmik...@gmail.com> wrote:
> Is there a more efficient way to do this?
>
> def f(L):
>     '''Return a set of the items that occur more than once in L.'''
>     L = list(L)
>     for item in set(L):
>         L.remove(item)
>     return set(L)
>
> >>> f([0, 0, 1, 1, 2, 2, 3])
> set([0, 1, 2])

def f(L):
    seen = set()
    dups = set()
    for e in L:
        if e in seen:
            dups.add(e)
        else:
            seen.add(e)
    return dups

Raymond
Mar 18 '08 #5

Ninereeds <st*************@aol.com> writes:
> As for the growth pattern, each time you grow the table you have to
> redistribute all the items previously inserted to new locations.
> Resizes would get rarer as more items are added due to the
> exponential growth, but every table resize would take longer too
> since there are more items to move. Inserting n items still
> intuitively looks like O(n^2) to me.
>
> That said, it does remind me of that old exponential realloc trick
> for array resizing. Same thing, I suppose, since a hash table is
> basically an array. Maybe my math "intuition" is just wrong.
That's it. While table resizes grow linearly in complexity, they
become geometrically rarer. This is exactly what happens when
resizing dynamic arrays such as Python lists. Applying your
intuition, appending n elements to a Python list would also have
O(n^2) complexity, which it doesn't. See, for example,
http://en.wikipedia.org/wiki/Dynamic...amortized_cost
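
A toy sketch of the accounting (assuming plain capacity doubling, which
is not CPython's exact growth factor) makes the amortized claim concrete:
the total number of element moves stays proportional to n.

def total_moves(n, initial=8):
    """Count element copies performed by doubling resizes over n appends."""
    moves, capacity, size = 0, initial, 0
    for _ in xrange(n):
        if size == capacity:
            moves += size    # a resize copies every existing element
            capacity *= 2
        size += 1
    return moves

for n in (1000, 10000, 100000):
    # moves/n stays bounded, i.e. amortized O(1) per append
    print n, total_moves(n), float(total_moves(n)) / n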
Mar 18 '08 #6

On 18 Mar, 10:57, Simon Forman <sajmik...@gmail.com> wrote:
> def f(L):
>     '''Return a set of the items that occur more than once in L.'''
>     L = list(L)
>     for item in set(L):
>         L.remove(item)
>     return set(L)

def nonunique(lst):
    slst = sorted(lst)
    return list(set([s[0] for s in
        filter(lambda t: not (t[0] - t[1]), zip(slst[:-1], slst[1:]))]))


Mar 18 '08 #7

On 18 Mar, 22:22, sturlamolden <sturlamol...@yahoo.no> wrote:
> def nonunique(lst):
>     slst = sorted(lst)
>     return list(set([s[0] for s in
>         filter(lambda t: not (t[0] - t[1]), zip(slst[:-1], slst[1:]))]))

Or perhaps better:

def nonunique(lst):
    slst = sorted(lst)
    return list(set([s[0] for s in
        filter(lambda t: t[0] != t[1], zip(slst[:-1], slst[1:]))]))

Mar 18 '08 #8

On 18 Mar, 22:25, sturlamolden <sturlamol...@yahoo.no> wrote:
> def nonunique(lst):
>     slst = sorted(lst)
>     return list(set([s[0] for s in
>         filter(lambda t: t[0] != t[1], zip(slst[:-1], slst[1:]))]))
Obviously that should be 'lambda t : t[0] == t[1]'. Instead of using
the set function, there is more structure to exploit when the list is
sorted:
def nonunique(lst):
    slst = sorted(lst)
    dups = [s[0] for s in
        filter(lambda t: t[0] == t[1], zip(slst[:-1], slst[1:]))]
    return [dups[0]] + [s[1] for s in
        filter(lambda t: t[0] != t[1], zip(dups[:-1], dups[1:]))]

Mar 18 '08 #9

On Mar 18, 9:56 pm, sturlamolden <sturlamol...@yahoo.no> wrote:
> On 18 Mar, 22:25, sturlamolden <sturlamol...@yahoo.no> wrote:
> > def nonunique(lst):
> >     slst = sorted(lst)
> >     return list(set([s[0] for s in
> >         filter(lambda t: t[0] != t[1], zip(slst[:-1], slst[1:]))]))
>
> Obviously that should be 'lambda t : t[0] == t[1]'. Instead of using
> the set function, there is more structure to exploit when the list is
> sorted:
>
> def nonunique(lst):
>     slst = sorted(lst)
>     dups = [s[0] for s in
>         filter(lambda t: t[0] == t[1], zip(slst[:-1], slst[1:]))]
>     return [dups[0]] + [s[1] for s in
>         filter(lambda t: t[0] != t[1], zip(dups[:-1], dups[1:]))]
Argh! What's wrong with something like:

def duplicates(l):
    i = j = object()
    for k in sorted(l):
        if i != j == k: yield k
        i, j = j, k

>>> list(duplicates([1, 2, 3, 2, 2, 4, 1]))
[1, 2]

--
Arnaud

Mar 18 '08 #10

On 18 Mar, 23:45, Arnaud Delobelle <arno...@googlemail.com> wrote:
> > def nonunique(lst):
> >     slst = sorted(lst)
> >     dups = [s[0] for s in
> >         filter(lambda t: t[0] == t[1], zip(slst[:-1], slst[1:]))]
> >     return [dups[0]] + [s[1] for s in
> >         filter(lambda t: t[0] != t[1], zip(dups[:-1], dups[1:]))]
>
> Argh! What's wrong with something like:
>
> def duplicates(l):
>     i = j = object()
>     for k in sorted(l):
>         if i != j == k: yield k
>         i, j = j, k

Nice, and more readable. But I'd use Paul Robin's solution. It is O(N)
as opposed to ours which are O(N log N).


Mar 18 '08 #11

On Mar 19, 10:08 am, sturlamolden <sturlamol...@yahoo.no> wrote:
> On 18 Mar, 23:45, Arnaud Delobelle <arno...@googlemail.com> wrote:
> > > def nonunique(lst):
> > >     slst = sorted(lst)
> > >     dups = [s[0] for s in
> > >         filter(lambda t: t[0] == t[1], zip(slst[:-1], slst[1:]))]
> > >     return [dups[0]] + [s[1] for s in
> > >         filter(lambda t: t[0] != t[1], zip(dups[:-1], dups[1:]))]
> >
> > Argh! What's wrong with something like:
> >
> > def duplicates(l):
> >     i = j = object()
> >     for k in sorted(l):
> >         if i != j == k: yield k
> >         i, j = j, k
>
> Nice, and more readable. But I'd use Paul Robin's solution. It is O(N)
> as opposed to ours which are O(N log N).
I'd use Raymond Hettinger's solution. It is as much O(N) as Paul's,
and is IMHO more readable than Paul's.
Mar 19 '08 #12

On 19 Mar, 22:48, John Machin <sjmac...@lexicon.net> wrote:
> I'd use Raymond Hettinger's solution. It is as much O(N) as Paul's,
> and is IMHO more readable than Paul's.
Is a Python set implemented using a hash table?
Mar 19 '08 #13

On Mar 19, 2:48 pm, John Machin <sjmac...@lexicon.net> wrote:
> On Mar 19, 10:08 am, sturlamolden <sturlamol...@yahoo.no> wrote:
> > On 18 Mar, 23:45, Arnaud Delobelle <arno...@googlemail.com> wrote:
> > > > def nonunique(lst):
> > > >     slst = sorted(lst)
> > > >     dups = [s[0] for s in
> > > >         filter(lambda t: t[0] == t[1], zip(slst[:-1], slst[1:]))]
> > > >     return [dups[0]] + [s[1] for s in
> > > >         filter(lambda t: t[0] != t[1], zip(dups[:-1], dups[1:]))]
> > >
> > > Argh! What's wrong with something like:
> > >
> > > def duplicates(l):
> > >     i = j = object()
> > >     for k in sorted(l):
> > >         if i != j == k: yield k
> > >         i, j = j, k
> >
> > Nice, and more readable. But I'd use Paul Robin's solution. It is O(N)
> > as opposed to ours which are O(N log N).
>
> I'd use Raymond Hettinger's solution. It is as much O(N) as Paul's,
> and is IMHO more readable than Paul's.
It's not as much O(N)... Paul Robin's uses a sort first which is
definitely not O(N). Paul's could be prettied up a bit but the general
principle is sound.
Mar 19 '08 #14

On Mar 20, 9:57 am, Justin Bozonier <darkxant...@gmail.com> wrote:
> On Mar 19, 2:48 pm, John Machin <sjmac...@lexicon.net> wrote:
> > On Mar 19, 10:08 am, sturlamolden <sturlamol...@yahoo.no> wrote:
> > > On 18 Mar, 23:45, Arnaud Delobelle <arno...@googlemail.com> wrote:
> > > > > def nonunique(lst):
> > > > >     slst = sorted(lst)
> > > > >     dups = [s[0] for s in
> > > > >         filter(lambda t: t[0] == t[1], zip(slst[:-1], slst[1:]))]
> > > > >     return [dups[0]] + [s[1] for s in
> > > > >         filter(lambda t: t[0] != t[1], zip(dups[:-1], dups[1:]))]
> > > >
> > > > Argh! What's wrong with something like:
> > > >
> > > > def duplicates(l):
> > > >     i = j = object()
> > > >     for k in sorted(l):
> > > >         if i != j == k: yield k
> > > >         i, j = j, k
> > >
> > > Nice, and more readable. But I'd use Paul Robin's solution. It is O(N)
> > > as opposed to ours which are O(N log N).
> >
> > I'd use Raymond Hettinger's solution. It is as much O(N) as Paul's,
> > and is IMHO more readable than Paul's.
>
> It's not as much O(N)... Paul Robin's uses a sort first which is
> definitely not O(N). Paul's could be prettied up a bit but the general
> principle is sound.
"""
from collections import defaultdict
def nonunique(lst):
d = defaultdict(int)
for x in lst: d[x] += 1
return [x for x,n in d.iterkeys() if n 1]
"""

I see no sort here.
Mar 19 '08 #15

On Mar 20, 9:14 am, sturlamolden <sturlamol...@yahoo.no> wrote:
> On 19 Mar, 22:48, John Machin <sjmac...@lexicon.net> wrote:
> > I'd use Raymond Hettinger's solution. It is as much O(N) as Paul's,
> > and is IMHO more readable than Paul's.
>
> Is a Python set implemented using a hash table?

What don't you understand about the comments in the first two
screenfuls of Objects/setobject.c?

Mar 19 '08 #16

On 20 Mar, 00:16, John Machin <sjmac...@lexicon.net> wrote:
> What don't you understand about the comments in the first two
> screenfuls of Objects/setobject.c?
I had not looked at it, but now I have. It seems Hettinger is the
author :) OK, so sets are implemented as hash tables. Then I agree:
use Raymond Hettinger's solution.
Mar 20 '08 #17

On Wed, 19 Mar 2008 20:16:36 -0300, John Machin <sj******@lexicon.net>
wrote:
> On Mar 20, 9:14 am, sturlamolden <sturlamol...@yahoo.no> wrote:
> > Is a Python set implemented using a hash table?
>
> What don't you understand about the comments in the first two
> screenfuls of Objects/setobject.c?

That comment was unnecessarily harsh, don't you think?

--
Gabriel Genellina

Mar 20 '08 #18

On Mar 20, 12:50 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
wrote:
> On Wed, 19 Mar 2008 20:16:36 -0300, John Machin <sjmac...@lexicon.net>
> wrote:
> > On Mar 20, 9:14 am, sturlamolden <sturlamol...@yahoo.no> wrote:
> > > Is a Python set implemented using a hash table?
> >
> > What don't you understand about the comments in the first two
> > screenfuls of Objects/setobject.c?
>
> That comment was unnecessarily harsh, don't you think?
No, otherwise I wouldn't have made it.
Mar 20 '08 #19

On Mar 20, 2:07 am, John Machin <sjmac...@lexicon.net> wrote:
> On Mar 20, 12:50 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
> wrote:
> > On Wed, 19 Mar 2008 20:16:36 -0300, John Machin <sjmac...@lexicon.net>
> > wrote:
> > > On Mar 20, 9:14 am, sturlamolden <sturlamol...@yahoo.no> wrote:
> > > > Is a Python set implemented using a hash table?
> > >
> > > What don't you understand about the comments in the first two
> > > screenfuls of Objects/setobject.c?
> >
> > That comment was unnecessarily harsh, don't you think?
>
> No, otherwise I wouldn't have made it.
Screenfuls is an awful lot, the way you guys write. Besides, what's
the answer? What don't you understand?
Mar 21 '08 #20

On Mar 21, 3:39 pm, John Machin <sjmac...@lexicon.net> wrote:
> On Mar 22, 1:11 am, castiro...@gmail.com wrote:
> > A collision sequence is not so rare.
> > >>> [hash(2**i) for i in range(0, 256, 32)]
> > [1, 1, 1, 1, 1, 1, 1, 1]
>
> Bryan did qualify his remarks: "If we exclude the case where an
> adversary is choosing the keys, ..."
Some adversary. What, you mean, my boss or my customers?
Mar 22 '08 #21

ca********@gmail.com wrote:
> John Machin wrote:
> > On Mar 22, 1:11 am, castiro...@gmail.com wrote:
> > > A collision sequence is not so rare.
> > > >>> [hash(2**i) for i in range(0, 256, 32)]
> > > [1, 1, 1, 1, 1, 1, 1, 1]
> >
> > Bryan did qualify his remarks: "If we exclude the case where an
> > adversary is choosing the keys, ..."
>
> Some adversary. What, you mean, my boss or my customers?
We mean that the party supplying the keys deliberately chose
them to make the hash table inefficient. In this thread the goal
is efficiency; a party working toward an opposing goal is an
adversary.

If you find real-world data sets that tend to induce bad-case
behavior in Python's hash, please do tell. It would be reason
enough to adjust the hash function. The hashes in popular
software such as Python are already quite well vetted.

Even a hash function that behaves as a random oracle has
worst-case quadratic-time in the algorithm here, but that's
an issue in theory and not in serving customers.
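
For the curious, here is a small sketch (a contrived class, not anything
from this thread) of what an adversary could do: keys that all hash alike
push set and dict inserts toward their quadratic worst case.

import time

class Evil(object):
    """Contrived adversarial key: every instance hashes to the same value."""
    def __init__(self, n):
        self.n = n
    def __hash__(self):
        return 1                  # force every key into one bucket
    def __eq__(self, other):
        return self.n == other.n  # resolving collisions needs equality tests

for n in (500, 1000, 2000):
    keys = [Evil(i) for i in range(n)]
    t0 = time.time()
    s = set(keys)                 # each insert probes all prior colliding keys
    print n, round(time.time() - t0, 3)  # time roughly quadruples as n doubles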
--
--Bryan
Mar 22 '08 #22

On 22 Mar, 09:31, Bryan Olson <fakeaddr...@nowhere.org> wrote:
> Even a hash function that behaves as a random oracle has
> worst-case quadratic-time in the algorithm here
In which case inserts are not amortized to O(1).
Mar 22 '08 #23
