Finding duplicates in an array

Thekid
Member
 
Join Date: Feb 2007
Posts: 110
#1: Oct 21 '09
I'm trying to figure out how to tell whether an array contains duplicates. My idea was to take the array as 'a', make a second array 'b' from it with the duplicates removed using 'set', and then compare a to b. If they're different, it prints 'duplicates found'. The problem is that with every array I try, some with duplicates and some without, 'b' comes back with the numbers rearranged. Here's an example:

    a='1934, 2311, 1001, 4056, 1001, 3459, 9078'
    b=list(set(a))
    if a != b:
        print "duplicates found"
    else:
        print "nothing found"

Is there a simpler way to find if there are duplicates?
Thanks

bvdet
Moderator
 
Join Date: Oct 2006
Location: Nashville, TN
Posts: 1,569
#2: Oct 21 '09

re: Finding duplicates in an array


In your code, you have assigned a string to the variable 'a', not a list, so set(a) collects the individual characters of that string:
    >>> list(set(a))
    [' ', ',', '1', '0', '3', '2', '5', '4', '7', '6', '9', '8']
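If the numbers really do arrive as one comma-separated string, as in the first post, here is a minimal sketch of turning it into a list of integers first. The name `raw` is only illustrative, and the sketch assumes every piece between commas is a valid integer. `print(...)` with a single argument behaves the same in Python 2 and Python 3.

```python
# Assumption: the values arrive as one comma-separated string.
raw = '1934, 2311, 1001, 4056, 1001, 3459, 9078'

# split(',') yields pieces such as ' 2311'; int() ignores the
# surrounding whitespace when it converts each piece.
a = [int(piece) for piece in raw.split(',')]

print(a)
```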
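To see if there are any duplicates, let's start with a list. Sets are unordered, so comparing the list to list(set(a)) element by element won't work, but you can compare the length of the list to the length of the set. If the set is shorter, the list contains duplicates.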
    >>> a=[1934, 2311, 1001, 4056, 1001, 3459, 9078]
    >>> b = set(a)
    >>> len(b)
    6
    >>> len(a)
    7
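The length comparison can be wrapped in a small helper. This is only a sketch: `has_duplicates` is a made-up name, and it assumes every element is hashable (numbers, strings, tuples), since that is what `set()` requires.

```python
def has_duplicates(items):
    """Return True if any value occurs more than once in items.

    Assumes every element is hashable, because set() requires it.
    """
    items = list(items)  # copy, so len() also works for iterators
    return len(set(items)) != len(items)


a = [1934, 2311, 1001, 4056, 1001, 3459, 9078]
if has_duplicates(a):
    print("duplicates found")
else:
    print("nothing found")
```

Because only the two lengths are compared, the order in which the set stores its elements no longer matters.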
bvdet
Moderator
 
Join Date: Oct 2006
Location: Nashville, TN
Posts: 1,569
#3: Oct 21 '09

To find which items are duplicated, count the occurrences of each item in a dictionary. Any item with a count greater than 1 is a duplicate:
    >>> dd = {}
    >>> for item in a:
    ...     dd[item] = dd.get(item, 0) + 1
    ...
    >>> dd
    {3459: 1, 2311: 1, 1001: 2, 1934: 1, 9078: 1, 4056: 1}
Or, less efficiently, since a.count() rescans the whole list for every distinct item:
    >>> for item in set(a):
    ...     if a.count(item) > 1:
    ...         print "Duplicate found: %s" % (item)
    ...
    Duplicate found: 1001
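For comparison, here is a sketch of the counting approach using `collections.Counter` from the standard library (available from Python 2.7 onwards, so newer than this thread). The function name `find_duplicates` is illustrative only, and hashable elements are assumed. It counts everything in one pass instead of calling `a.count()` once per distinct item.

```python
from collections import Counter


def find_duplicates(items):
    """Return the values that occur more than once, in first-seen order."""
    items = list(items)
    counts = Counter(items)  # one pass: maps each value to its count
    seen = set()
    duplicates = []
    for item in items:
        # Report each duplicated value once, the first time it is met.
        if counts[item] > 1 and item not in seen:
            seen.add(item)
            duplicates.append(item)
    return duplicates


a = [1934, 2311, 1001, 4056, 1001, 3459, 9078]
for item in find_duplicates(a):
    print("Duplicate found: %s" % item)
```

Walking the original list rather than the set keeps the results in the order the values first appear, which sidesteps the rearranging that set() does.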
Thekid
Member
 
Join Date: Feb 2007
Posts: 110
#4: 4 Weeks Ago

Thanks! I went with your first suggestion, which was along the lines of what I was thinking, but I hadn't considered comparing the lengths to get around set() being unordered.