Finding duplicates in an array

Thekid
145 100+
I'm trying to figure out a way to find if there are duplicates in an array. My idea was to take the array as 'a', make a second array 'b' with the duplicates removed using 'set', and then compare a to b. If they're different, it prints 'duplicates found'. The problem is that with every array I've tried, some with duplicates and some without, 'b' comes out with the numbers rearranged. Here's an example:

a='1934, 2311, 1001, 4056, 1001, 3459, 9078'
b=list(set(a))
if a != b:
    print "duplicates found"
else:
    print "nothing found"
Is there a simpler way to find if there are duplicates?
Thanks
Oct 21 '09 #1
bvdet
2,851 Expert Mod 2GB
In your code, you have assigned a string to the variable 'a', so set(a) collects the individual characters of that string rather than the numbers:
>>> list(set(a))
[' ', ',', '1', '0', '3', '2', '5', '4', '7', '6', '9', '8']
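If the numbers really do arrive as one comma-separated string, as in the original post, one option is to split it into a list of integers first. This is only a minimal sketch: it assumes Python 3 (the snippets in this thread are Python 2), and `parse_numbers` is a hypothetical helper name, not something from the thread.

```python
def parse_numbers(text):
    """Split a comma-separated string into a list of ints (hypothetical helper)."""
    # int() ignores surrounding whitespace, so ' 2311' parses as 2311
    return [int(part) for part in text.split(",")]

a = parse_numbers('1934, 2311, 1001, 4056, 1001, 3459, 9078')
print(a)
```

The result is a real list of numbers, so set(a) then works on whole values instead of characters.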
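To see if there are any duplicates, let's start with a list. Sets are unordered, so comparing the list to the set directly won't work, but you can compare the length of the list to the length of the set. If the set is shorter, the list contains duplicates.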
>>> a=[1934, 2311, 1001, 4056, 1001, 3459, 9078]
>>> b = set(a)
>>> len(b)
6
>>> len(a)
7
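The length comparison above could be wrapped in a small function along these lines. It is a sketch under a few assumptions: Python 3 syntax, hashable items, and a function name (`has_duplicates`) that is made up for illustration.

```python
def has_duplicates(items):
    """Return True if any value appears more than once (items must be hashable)."""
    # A set keeps one copy of each value, so it is shorter than the
    # list exactly when the list contains a repeated value.
    return len(set(items)) != len(items)

a = [1934, 2311, 1001, 4056, 1001, 3459, 9078]
if has_duplicates(a):
    print("duplicates found")
else:
    print("nothing found")
```

Because only the lengths are compared, the order in which the set stores its elements doesn't matter.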
Oct 21 '09 #2
bvdet
2,851 Expert Mod 2GB
To find the items that have duplicates, count how many times each item occurs in a dictionary:
>>> dd = {}
>>> for item in a:
...     dd[item] = dd.get(item, 0) + 1
...     
>>> dd
{3459: 1, 2311: 1, 1001: 2, 1934: 1, 9078: 1, 4056: 1}
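The same counting idea can probably be written more compactly with `collections.Counter` from the standard library, which builds the item-to-count mapping in one call. A minimal sketch, assuming Python 3; the variable names are illustrative only.

```python
from collections import Counter

a = [1934, 2311, 1001, 4056, 1001, 3459, 9078]

# Counter maps each distinct item to the number of times it occurs
counts = Counter(a)

# Keep only the items that occur more than once
duplicates = [item for item, n in counts.items() if n > 1]
print(duplicates)  # [1001]
```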
OR (less efficient, because a.count() rescans the whole list for every distinct item)
>>> for item in set(a):
...     if a.count(item) > 1:
...         print "Duplicate found: %s" % (item)
...         
Duplicate found: 1001
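If the repeated rescanning of the list is a concern, one alternative is a single pass that remembers what it has already seen and stops at the first repeat. This is a sketch only, assuming Python 3 and hashable items; `first_duplicate` is a hypothetical name.

```python
def first_duplicate(items):
    """Return the first value that repeats, or None if all values are unique."""
    seen = set()
    for item in items:
        if item in seen:
            # Second sighting of this value: stop scanning right away
            return item
        seen.add(item)
    return None

a = [1934, 2311, 1001, 4056, 1001, 3459, 9078]
print(first_duplicate(a))
```

Each item is looked at once, and the set membership test takes roughly constant time on average, so the loop does not rescan the list the way a.count() does.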
Oct 21 '09 #3
Thekid
145 100+
Thanks! I went with your first suggestion, which was along the lines of what I was thinking but I didn't consider comparing the lengths since set() is unordered.
Oct 29 '09 #4
