473,387 Members | 1,757 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,387 software developers and data experts.

Finding duplicates in an array

Thekid
145 100+
I'm trying to figure out a way to find if there are duplicates in an array. My idea was to take the array as 'a' and make a second array as 'b' and remove the duplicates from 'b' using 'set' and then compare a to b. If they're different then it will print out 'duplicates found'. The problem is that even after trying different arrays, some with duplicates some without, that 'b' rearranges the numbers. Here's an example:

Expand|Select|Wrap|Line Numbers
  1.  
  2. a='1934, 2311, 1001, 4056, 1001, 3459, 9078'
  3. b=list(set(a))
  4. if a != b:
  5.     print "duplicates found"
  6. else:
  7.    print "nothing found"
  8.  
  9.  
Is there a simpler way to find if there are duplicates?
Thanks
Oct 21 '09 #1
3 25036
bvdet
2,851 Expert Mod 2GB
In your code, you have assigned variable 'a' to a string.
Expand|Select|Wrap|Line Numbers
  1. >>> list(set(a))
  2. [' ', ',', '1', '0', '3', '2', '5', '4', '7', '6', '9', '8']
  3. >>> 
To see if there are any duplicates, let's start a list. Sets are unordered, but you can compare the length of the list to the length of the set.
Expand|Select|Wrap|Line Numbers
  1. >>> a=[1934, 2311, 1001, 4056, 1001, 3459, 9078]
  2. >>> b = set(a)
  3. >>> len(b)
  4. 6
  5. >>> len(a)
  6. 7
  7. >>> 
Oct 21 '09 #2
bvdet
2,851 Expert Mod 2GB
To find the items that have duplicates:
Expand|Select|Wrap|Line Numbers
  1. >>> for item in a:
  2. ...     dd[item] = dd.get(item, 0) + 1
  3. ...     
  4. >>> dd
  5. {3459: 1, 2311: 1, 1001: 2, 1934: 1, 9078: 1, 4056: 1}
  6. >>> 
OR (less efficient)
Expand|Select|Wrap|Line Numbers
  1. >>> for item in set(a):
  2. ...     if a.count(item) > 1:
  3. ...         print "Duplicate found: %s" % (item)
  4. ...         
  5. Duplicate found: 1001
  6. >>> 
Oct 21 '09 #3
Thekid
145 100+
Thanks! I went with your first suggestion, which was along the lines of what I was thinking but I didn't consider comparing the lengths since set() is unordered.
Oct 29 '09 #4

Sign in to post your reply or Sign up for a free account.

Similar topics

8
by: Eric Linders | last post by:
Hi, I'm trying to figure out the most efficient method for taking the first character in a string (which will be a number), and use it as a variable to check to see if the other numbers in the...
4
by: PhilC | last post by:
Hi Folks, If I have an array holding a pair of numbers, and that pairing is unique, is there a way that I can find the array index number for that pair? Thanks, PhilC
3
by: Erich | last post by:
I have a company table and I would like to write a query that will return to me any duplicate companies. However, it is a little more complicated then just matching on exact company names. I would...
0
by: Timo Nentwig | last post by:
Hi! <node> <name>foo</name> <id>1</id> <age>7</age> </node> <node.....> <node> <name>foo</name>
6
by: Maxi | last post by:
I have 100 tabes in an Access database, every table has 1 filed with 100 names (records), no primary key assigned. I would like to find duplicates. Here is the criteria: The computer should...
11
by: paradox | last post by:
Basically I have an ArrayList of strings; I need a fast way of searching through and comparing the elements of the ArrayList and the return an ArrayList of items that have 3 Duplicates. For...
4
by: Mokita | last post by:
Hello, I am working with Taverna to build a workflow. Taverna has a beanshell where I can program in java. I am having some problems in writing a script, where I want to eliminate the duplicates...
5
by: limperger | last post by:
Hello everyone! Is out there any way to search for duplicate entries without using the "Find duplicates" option? In the Access 97 installed in my workplace, the Find duplicates option is disabled...
17
by: Gayatree | last post by:
I have to concatenate 2 colimns in a table and find duplicates in them. I already used the method select * from tableA a where (select count(*) from TableA b where acol1+ +col2 =...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.