Is there an easy way to grab the Unique elements from a list?
For Example:
data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]
what I am looking for is the unique elements 0.4 and 0.9 with their
index from the list.
Probably something like a Hash Table approach!!
I would like to get this done without unnecessary overhead.And the list
could be essentially anything strings,floats,int etc...
Or is it already avaliable as an attr to a list or an array?
I dont remember seeing anything like that. 24 4097
OK I need to be more clear I guess!Unique Elements I mean, elements
that are non repeating. so in the above list 0.4, 0.9 are unique as
they exist only once in the list.
This is not the most beautiful idiom, but it works...
d = {}
for k in data:
try:
d[k] += 1
except:
d[k] = 1
for k,v in d.items():
if v == 1:
print k su*******@gmail.com wrote: Is there an easy way to grab the Unique elements from a list?
Yes.
The set data type.
Look in the reference of python 2.4.

Best regards
Roie Kerstein
On Monday 09 May 2005 03:15 pm, su*******@gmail.com wrote: Is there an easy way to grab the Unique elements from a list? For Example: data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]
what I am looking for is the unique elements 0.4 and 0.9 with their index from the list. Probably something like a Hash Table approach!! I would like to get this done without unnecessary overhead.And the list could be essentially anything strings,floats,int etc...
Or is it already avaliable as an attr to a list or an array? I dont remember seeing anything like that.
from sets import Set
data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]
[x for x in Set(data) if data.count(x) == 1]

James Stroud
UCLADOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095 http://www.jamesstroud.com/
runes: d = {} for k in data: try: d[k] += 1 except: d[k] = 1
for k,v in d.items(): if v == 1: print k
For this problem, if duplicated keys are common enough, this version is
probably the faster.
With this *different* problem, it seems that sometimes the faster
solution is using "in": http://pyalot.blogspot.com/2005/04/d...aryspeed.html
Bye,
Bearophile
In article <11**********************@z14g2000cwz.googlegroups .com>,
"su*******@gmail.com" <su*******@gmail.com> wrote: Is there an easy way to grab the Unique elements from a list? For Example: data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]
what I am looking for is the unique elements 0.4 and 0.9 with their index from the list. Probably something like a Hash Table approach!! I would like to get this done without unnecessary overhead.And the list could be essentially anything strings,floats,int etc...
Or is it already avaliable as an attr to a list or an array? I dont remember seeing anything like that.
From your comments downthread, it seems you want to find those elements
of the input sequence which occur exactly once, and return not only
these elements, but also their positions.
One reasonable solution might be as follows:
def unique_elts(seq):
elts = {}
for pos, elt in enumerate(seq):
elts.setdefault(elt, []).append(pos)
return [ (x, p[0]) for (x, p) in elts.iteritems()
if len(p) == 1 ]
This returns a list of tuples of the form (x, pos), where x is an
element of seq that occurs exactly once, and pos is its index in the
original sequence. This implementation traverses the input sequence
exactly once, and requires storage proportional to the length of the
input.
M

Michael J. Fromberger  Lecturer, Dept. of Computer Science http://www.dartmouth.edu/~sting/  Dartmouth College, Hanover, NH, USA
In article <11**********************@f14g2000cwb.googlegroups .com>,
runes <ru*********@gmail.com> wrote: This is not the most beautiful idiom, but it works...
d = {} for k in data: try: d[k] += 1 except: d[k] = 1
for k,v in d.items(): if v == 1: print k
Only if "works" does not include "order preserving".

Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/
"And if that makes me an elitist...I couldn't be happier." JMS
"su*******@gmail.com" <su*******@gmail.com> writes: Is there an easy way to grab the Unique elements from a list? For Example: data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]
Untested: here's an iterator that gives you the index and value,
without reordering. Uses dicts instead of sets for backwards compatibility.
def unique_elements(data):
seen = {}
for n,x in data:
if x not in seen:
seen[x] = 1
yield n,x
You're right. I somehow missed the index part :o. It's easy to fix
though.
Paul Rubin wrote: Is there an easy way to grab the Unique elements from a list? For Example: data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9] Untested: here's an iterator that gives you the index and value, without reordering. Uses dicts instead of sets for backwards compatibility.
def unique_elements(data): seen = {} for n,x in data: if x not in seen: seen[x] = 1 yield n,x
you forgot enumerate()
and if you fix that, you'll notice that the output doesn't quite match
the OP's spec:
For Example: data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]
what I am looking for is the unique elements 0.4 and 0.9 with their index from the list.
here's a straightforward variation that gives the specified output,
in the original order:
def unique_elements(data):
count = {}
data = list(enumerate(data))
for n,x in data:
count[x] = count.setdefault(x, 0) + 1
for n,x in data:
if count[x] == 1:
yield x, n # value with index
depending on the data, it might be more efficient to store the
"last seen index" in a dictionary, and sort the result on the way
out (if necessary). something like
from operator import itemgetter
def unique_elements(data):
seen = {}; index = {}
for n, x in enumerate(data):
if x in seen:
del index[x]
else:
index[x] = seen[x] = n
index = index.items()
index.sort(key=itemgetter(1)) # leave this out if order doesn't matter
return index
could work.
</F>
James Stroud <js*****@mbi.ucla.edu> writes: from sets import Set
data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]
[x for x in Set(data) if data.count(x) == 1]
Um.
....I must have missed something, but I'll post nevertheless:
wouldn't just
[x for x in data if data.count(x) == 1]
suffice? it is also "stable" preserving order of items. Lemme demo: [x for x in Set(data) if data.count(x) == 1]
[0.90000000000000002, 0.40000000000000002]
[x for x in data if data.count(x) == 1]
[0.40000000000000002, 0.90000000000000002]
Though I'll admit I also thought of Sets first, because I didn't remember
there was such a nice method list.count().

# Edvard Majakari Software Engineer
# PGP PUBLIC KEY available Soli Deo Gloria!
One day, when he was naughty, Mr Bunnsy looked over the hedge into Farmer
Fred's field and it was full of fresh green lettuces. Mr Bunnsy, however, was
not full of lettuces. This did not seem fair. Mr Bunnsy has an adventure
Fredrik Lundh wrote: depending on the data, it might be more efficient to store the "last seen index" in a dictionary, and sort the result on the way out (if necessary). something like
form sets import Set
data = list(Set([0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]))

hilsen/regards Max M, Denmark http://www.mxm.dk/
IT's Mad Science
"Michael J. Fromberger" <Mi******************@Clothing.Dartmouth.EDU> writes: One reasonable solution might be as follows:
def unique_elts(seq): elts = {} for pos, elt in enumerate(seq): elts.setdefault(elt, []).append(pos)
return [ (x, p[0]) for (x, p) in elts.iteritems() if len(p) == 1 ]
This is neat. Being lazy (the wrong way) I've never payed much attention why
would I need dict.setdefault() when I have things as dict.get(k, [def]). This
was a nice example of good use for setdefault() because it works like
dict.get() except it also sets the value if it didn't exist.
+1 IOTW (idiom of the week).

# Edvard Majakari Software Engineer
# PGP PUBLIC KEY available Soli Deo Gloria!
$_ = '456476617264204d616a616b6172692c20612043687269737 469616e20'; print
join('',map{chr hex}(split/(\w{2})/)),uc substr(crypt(60281449,'es'),2,4),"\n";
Max M wrote: depending on the data, it might be more efficient to store the "last seen index" in a dictionary, and sort the result on the way out (if necessary). something like
form sets import Set
data = list(Set([0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]))
read the OP's spec again.
</F>
On 9 May 2005 15:36:37 0700, "su*******@gmail.com" <su*******@gmail.com> wrote: OK I need to be more clear I guess!Unique Elements I mean, elements that are non repeating. so in the above list 0.4, 0.9 are unique as they exist only once in the list.
You want to be careful of your definitions, especially with floating point values,
which may surprise the uninitiated. Dicts and sets hash numerically equal values
to the same hash, and then do equality test, so you get legitimate results like: set([9*.18*.1, .1])
set([0.10000000000000001, 0.099999999999999978]) set([123, 123.0, 123L])
set([123]) set([123.0, 123, 123L])
set([123.0]) set([123L, 123, 123.0])
set([123L])
You may want to consider creating representations other than the original data
to use in your uniqueness testing, depending on your definition.
If you are happy with the way dicts and sets compare elements, you could do
a dict that keeps counts, or FTHOI something different (hot off untested griddle,
so test before relying ;)
data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9] once=set() more=set() for el in data:
... if el in once: more.add(el)
... else: once.add(el)
... oncemore
set([0.90000000000000002, 0.40000000000000002])
Not the most efficient spacewise, not sure about speed.
Regards,
Bengt Richter
Fredrik Lundh wrote: Max M wrote:
depending on the data, it might be more efficient to store the "last seen index" in a dictionary, and sort the result on the way out (if necessary). something like
form sets import Set
data = list(Set([0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]))
read the OP's spec again.
Well if you want to haggle over minor details like a spec, it's easy to
be critical!
data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]
freqs = {}
for position in xrange(len(data)):
freqs.setdefault(data[position], []).append(position)
unique = [(value, positions[0]) for (value, positions) in freqs.items()
if len(positions) == 1]

hilsen/regards Max M, Denmark http://www.mxm.dk/
IT's Mad Science
"Fredrik Lundh" <fr*****@pythonware.com> writes: you forgot enumerate()
Oh, oops.
and if you fix that, you'll notice that the output doesn't quite match the OP's spec:
> what I am looking for is the unique elements 0.4 and 0.9 with their > index from the list.
Oh, I see, he wanted the elements that only occur once. I misunderstood.
In article <87************@titan.staselog.com>,
Edvard Majakari <ed*********@majakari.net> wrote: James Stroud <js*****@mbi.ucla.edu> writes: from sets import Set
data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]
[x for x in Set(data) if data.count(x) == 1]
Um.
...I must have missed something, but I'll post nevertheless:
wouldn't just
[x for x in data if data.count(x) == 1]
suffice? it is also "stable" preserving order of items. Lemme demo:
Only for small datasets  this is an O(N^2) algorithm.

Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/
"And if that makes me an elitist...I couldn't be happier." JMS su*******@gmail.com wrote: Is there an easy way to grab the Unique elements from a list? For Example: data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]
what I am looking for is the unique elements 0.4 and 0.9 with their index from the list.
This is basically the same as Fredrik Lundh's solution, but here it is
anyway:
py> def f(lst):
.... counts = dicttools.counts(lst)
.... return [(i, elem)
.... for i, elem in enumerate(lst)
.... if counts[elem] == 1]
....
py> f([0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9])
[(3, 0.40000000000000002), (7, 0.90000000000000002)]
Where dicttools.counts is defined as:
def counts(iterable, key=None):
result = {}
for item in iterable:
# apply key function if necessary
if key is None:
k = item
else:
k = key(item)
# increment key's count
try:
result[k] += 1
except KeyError:
result[k] = 1
return result
(I use this so often that I have a module called dicttools to hold it.
You don't actually need the key= argument for your problem, but I just
copied the code directly.)
STeVe aa**@pythoncraft.com (Aahz) writes: [x for x in data if data.count(x) == 1]
suffice? it is also "stable" preserving order of items. Lemme demo:
Only for small datasets  this is an O(N^2) algorithm.
I realized that, but maybe I should've pointed it out too. For the OP if
he/she is unaware  notation O(N^2) (big O n squared) means the computing time
of the algorithm increases exponentially (where exponent is 2) relative to the
size of the input.
Eg. if the input size is i and it takes p seconds to compute it, then given
input size 10*i, the computing time would be 100*p. These notions can apply
for memory usage as well, but the problem in here is the computing time:
list.count() must iterate through the list each time, and as such the loop
[x for x in data if data.count(x) == 1]
iterates through each item in data (x for x in data), and for each item it
will again iterate through each item in data to see how many times it
occurred. If data contains 200 items, this idiom would iterate the structure
40 000 times. With today's computers one wouldn't notice it, unless each item
requires heavy processing (eg. launching a separate process per item
etc). However, if the length of the data can be thousands or even tens of
thousands, this idiom would become unusable. If data contained 75 000 items,
the loop would do 25 625 000 000 iterations, effectively bringing cpu to
halt..
So, generally one should avoid using exponential O(n^k) (where k > 1)
algorithms, unless faster O(n) or O(n*lg(n)) solution is either impossible (or
too complex, and inputs are known to be small etc).
Wikipedia has good resources and pointers to related things, see http://en.wikipedia.org/wiki/Analysis_of_algorithms

#!/usr/bin/perl w
$h={23,69,28,'6e',2,64,3,76,7,20,13,61,8,'4d',24,7 3,10,'6a',12,'6b',21,68,14,
72,16,'2c',17,20,9,61,11,61,25,74,4,61,1,45,29,20, 5,72,18,61,15,69,20,43,26,
69,19,20,6,64,27,61,22,72};$_=join'',map{chr hex $h>{$_}}sort{$a<=>$b}
keys%$h;m/(\w).*\s(\w+)/x;$_.=uc substr(crypt(join('',60,28,14,49),join'',
map{lc}($1,substr $2,4,1)),2,4)."\n"; print;
Edvard Majakari wrote: I realized that, but maybe I should've pointed it out too. For the OP if he/she is unaware  notation O(N^2) (big O n squared) means the computing time of the algorithm increases exponentially (where exponent is 2) relative to the size of the input.
Normally this is called a polynomial, rather than exponential increase.
Exponential increases are typically of the form (C^N) (they are all
equivalent).
Polynomial times are hallways characterized by their largest exponent,
So you never call something O(N^3  N^2) Since, as N gets large enough,
The N^2 term shrinks to nonexistence. http://en.wikipedia.org/wiki/Exponential_time
... generally one should avoid using exponential O(n^k) (where k > 1)
Again polynomial, not exponential time. Note that there is no
polynomial time algorithm with (k < 1), since it takes O(n) time
to read the problem.
Scott David Daniels Sc***********@Acm.Org
Scott David Daniels <Sc***********@Acm.Org> writes: Normally this is called a polynomial, rather than exponential increase. Exponential increases are typically of the form (C^N) (they are all equivalent). Polynomial times are hallways characterized by their largest exponent, So you never call something O(N^3  N^2) Since, as N gets large enough, The N^2 term shrinks to nonexistence.
Yup, you are of course, completely correct. I was thinking of "exponent here
is two" and mistakenly named in exponential.
my_text.replace('exponent','polynom'), there :)
Reminding of ignoring terms with smaller exponent was good, too.

# Edvard Majakari Software Engineer
# PGP PUBLIC KEY available Soli Deo Gloria!
$_ = '456476617264204d616a616b6172692c20612043687269737 469616e20'; print
join('',map{chr hex}(split/(\w{2})/)),uc substr(crypt(60281449,'es'),2,4),"\n";
Scott David Daniels wrote: Again polynomial, not exponential time. Note that there is no polynomial time algorithm with (k < 1), since it takes O(n) time to read the problem.
Being a stickler (I develop software after all :) quantum computers
can do better than that. For example, Grover's algorithm http://en.wikipedia.org/wiki/Grover%27s_algorithm
for searching an unsorted list solves the problem in O(N**0.5) time.
Being even more picky, I think the largest N that's been tested
so far is on the order of 5.
Andrew da***@dalkescientific.com
> One reasonable solution might be as follows: def unique_elts(seq): elts = {} for pos, elt in enumerate(seq): elts.setdefault(elt, []).append(pos)
return [ (x, p[0]) for (x, p) in elts.iteritems() if len(p) == 1 ]
Minor tweak to conserve space:
def bachelor_filter(iter_over_hashables):
B={}
for index, elem in enumerate(iter_over_hashables):
if B.setdefault(elem, index) != index:
B[elem]=None
return [pair for pair in B.items() if pair[1] is not None] This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics
by: kevin parks 
last post by:
hi. I've been banging my head against this one a while and have asked
around, and i
am throwing this one out there in the hopes that some one can shed
some light on
what has turned out to be a...

by: les_ander 
last post by:
Hi,
I have many set objects some of which can contain same group of object
while others can be subset of the other. Given a list of sets,
I need to get a list of unique sets such that non of the...

by: James Stroud 
last post by:
On Monday 09 May 2005 03:15 pm, superprad@gmail.com wrote:
> Is there an easy way to grab the Unique elements from a list?
from sets import Set
data =


by: Claudio Jolowicz 
last post by:
Is it possible to store unique objects in an STL container?
Suppose an object of class C is unique:
class C
{
public:
C() {}
~C() {}
private:

by: Paulers 
last post by:
Hello all,
I have a string array with duplicate elements. I need to create a new
string array containing only the unique elements. Is there an easy way
to do this? I have tried looping through...

by: Brian Tkatch 
last post by:
I'm looking for a simple way to unique an array of strings. I came up
with this. Does it make sense? Am i missing anything? (Testing seems to
show it to work.)
Public Function Unique(ByVal...

by: Tekkaman 
last post by:
I have a list of lists and I want to define an iterator (let's call
that uniter) over all unique elements, in any order. For example,
calling:
sorted(uniter(, , ]))
must return . I tried the...

by: Asko Telinen 
last post by:
Hi all.
I´m a bit newbie writing xml schemas.
Is it possible to define xml element that must have unique attribute
values in same level. For example if i have a xml  document:
<list>...

by: =?Utf8?B?UGF1bA==?= 
last post by:
Hi I have a list of type object. The object has an ID as one of the elements
and I would like to create another list that just has objects with unique IDs.
For example
in the list if I have...

by: DJRhino 
last post by:
Was curious if anyone else was having this same issue or not....
I was just Up/Down graded to windows 11 and now my access combo boxes are not acting right. With win 10 I could start typing...

by: Aliciasmith 
last post by:
In an age dominated by smartphones, having a mobile app for your business is no longer an option; it's a necessity. Whether you're a startup or an established enterprise, finding the right mobile app...

by: giovanniandrean 
last post by:
The energy model is structured as follows and uses excel sheets to give input data:
1Utility.py contains all the functions needed to calculate the variables and other minor things (mentions...

by: NeoPa 
last post by:
Introduction
For this article I'll be using a very simple database which has Form (clsForm) & Report (clsReport) classes that simply handle making the calling Form invisible until the Form, or all...

by: Teri B 
last post by:
Hi, I have created a subform Roles. In my course form the user selects the roles assigned to the course.
0netomany. One course many roles.
Then I created a report based on the Course form and...

by: nia12 
last post by:
Hi there,
I am very new to Access so apologies if any of this is obvious/not clear.
I am creating a data collection tool for health care employees to complete. It consists of a number of...

by: NeoPa 
last post by:
Introduction
For this article I'll be focusing on the Report (clsReport) class. This simply handles making the calling Form invisible until all of the Reports opened by it have been closed, when it...

by: isladogs 
last post by:
The next online meeting of the Access Europe User Group will be on Wednesday 6 Dec 2023 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM).
In this month's session, Mike...

by: GKJR 
last post by:
Does anyone have a recommendation to build a standalone application to replace an Access database? I have my bookkeeping software I developed in Access that I would like to make available to other...
 