473,327 Members | 1,896 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,327 software developers and data experts.

Count all unique elements in an array.

3
Hello everyone,

I am a biologist that is a relatively new to programming. This isn't for homework, however I would like to efficiently write a piece of a program that finds the number of unique elements in a sorted array. My array has over 200,000+ elements in it, so it would be silly to brute force the whole thing, especially because the array is sorted.

Example:
@array = (0,0,0,1,1,1,1,2,3,3);
Would return:
0: 3
1: 4
2: 1
3: 2


Any suggestions? I don't need the entire code if it is against the rules, but I would like an idea of how to program this.
Jul 26 '11 #1

✓ answered by Rabbit

I'm not sure what you mean by brute force but if it's sorted, you could just start counting and when it changes, insert into a new array with the value and count. In peudocode, it would be
Expand|Select|Wrap|Line Numbers
  1. var arr1, arr2, group, count, i
  2.  
  3. load arr1
  4. group = arr1(0)
  5. count = 0
  6.  
  7. for i = 0 to upperbound(arr1)
  8.    if arr1(i) != group
  9.       insert into arr2 (group, count)
  10.       count = 0
  11.       group = arr1(i)
  12.    end if
  13.  
  14.    count = count + 1
  15. end for

6 4940
Rabbit
12,516 Expert Mod 8TB
I'm not sure what you mean by brute force but if it's sorted, you could just start counting and when it changes, insert into a new array with the value and count. In peudocode, it would be
Expand|Select|Wrap|Line Numbers
  1. var arr1, arr2, group, count, i
  2.  
  3. load arr1
  4. group = arr1(0)
  5. count = 0
  6.  
  7. for i = 0 to upperbound(arr1)
  8.    if arr1(i) != group
  9.       insert into arr2 (group, count)
  10.       count = 0
  11.       group = arr1(i)
  12.    end if
  13.  
  14.    count = count + 1
  15. end for
Jul 26 '11 #2
Lisa L
3
Thanks, by brute force I mean nested loops that go through the whole array in order to count.

Thanks for your post, appreciate it!
-Lisa
Jul 26 '11 #3
Rabbit
12,516 Expert Mod 8TB
Not a problem. Since it's sorted, you don't need nested loops but you will need the one loop to iterate through the whole array.
Jul 26 '11 #4
chaarmann
785 Expert 512MB
If you have a rough estimate on how often each number is in the array in average, you don't need to iterate through the whole array by single steps, because it's sorted. This speeds up your program very much.

Here is the idea:
Let's say you know that each number is in the array 10 times on average.
First look at the first element.
Then look at the tenth element.
Case 1: If it's the same, go on forward by single steps until you reach a different element. (For example if the thirteenth element is different, you have used only 5 lookups for 13 elements, which is more than 2 and a half times faster than going by single step).
Case 2: If it's not the same, then go backward by single steps until you find one that's the same. (For example if the seventh element is the same, you have used only 5 lookups for 10 elements, which is more than 2 times faster than going by single step).
Then use the greatest position (11 or 14 as new starting point and repeat the whole procedure.

If you can't estimate the average, you can use a statistic from the first elements to guess the average of further elements ( the jump size) while going through the array. This works well if you know that there is an equal distribution among the different numbers in the array.

If there is no such distribution, that means among 200000 elements there can be 10000 times one element and only 2 times another, you can use binary jump steps: Jump by 2, 4, 8, 16, 32, 64 etc. That means look at the first, third, seventh, fifteenthe etc. element. With this strategy you only need 10 lookups to count more than 1000 elements of a row.

If you know in advance how many different elements are there in your array, or if you know all names of all elements that occur in your array, you can even optimize more.
Jul 27 '11 #5
chaarmann
785 Expert 512MB
Just one more idea:
Use an algorithm like the binary search: always cut the jump size in half when going forward or backward, don't go by single steps.
For example compare element 1 with 1024. If they are not the same, compare one with 512. If they are not the same, compare one with 256, if they are the same, compare one with 256+128=384etc.
Jul 27 '11 #6
toolic
70 Expert
200k will not take up much memory, so just loop through the array storing the values in a hash. This works whether the values are already sorted or not.

Expand|Select|Wrap|Line Numbers
  1. use warnings;
  2. use strict;
  3. use Data::Dumper;
  4.  
  5. my @array = (0,0,0,1,1,1,1,2,3,3);
  6. my %hash;
  7. for (@array) {
  8.     $hash{$_}++;
  9. }
  10. print Dumper(\%hash);
  11.  
  12. __END__
  13.  
  14. $VAR1 = {
  15.           '1' => 4,
  16.           '3' => 2,
  17.           '0' => 3,
  18.           '2' => 1
  19.         };
  20.  
Aug 15 '11 #7

Sign in to post your reply or Sign up for a free account.

Similar topics

24
by: superprad | last post by:
Is there an easy way to grab the Unique elements from a list? For Example: data = what I am looking for is the unique elements 0.4 and 0.9 with their index from the list. Probably something...
2
by: jjl | last post by:
I have an array, and try to find out the unique elements (case insentitive) of array. Could someone help me out. Thanks a lot.
3
by: gouqizi.lvcha | last post by:
Hi all I have a large vector with float point numbers in it, for example (1.1, 2.1 , 3.2 , 3.3 , 4, 6, 3.2, 8). Is there an easy way to determine how many uique elements in the array? Thanks...
5
by: Paulers | last post by:
Hello all, I have a string array with duplicate elements. I need to create a new string array containing only the unique elements. Is there an easy way to do this? I have tried looping through...
9
by: Brian Tkatch | last post by:
I'm looking for a simple way to unique an array of strings. I came up with this. Does it make sense? Am i missing anything? (Testing seems to show it to work.) Public Function Unique(ByVal...
6
by: Tekkaman | last post by:
I have a list of lists and I want to define an iterator (let's call that uniter) over all unique elements, in any order. For example, calling: sorted(uniter(, , ])) must return . I tried the...
1
newnewbie
by: newnewbie | last post by:
Desperately need help in creating a query to count unique values in a table. I am a Business analyst with limited knowledge of Access….My boss got me ODBC connection to the underlying tables for our...
5
by: aszia787 | last post by:
Hi guys, I need to count the unique elements in an array. For example, if i have an array, array. The number of unique elements in the array is 3. I am sorry if this message has been posted...
1
by: dez5000 | last post by:
I'm trying to get a report by location that would list the number of visits to the location for the month but also count the number of unique visitors to that location. I have a table of data with...
1
newnewbie
by: newnewbie | last post by:
Hi, Could somebody please help me with VBA code to count unique values in a Report? Report is based on a query that has grouped values. Basically, I would like to use formula...
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: Vimpel783 | last post by:
Hello! Guys, I found this code on the Internet, but I need to modify it a little. It works well, the problem is this: Data is sent from only one cell, in this case B5, but it is necessary that data...
0
by: ArrayDB | last post by:
The error message I've encountered is; ERROR:root:Error generating model response: exception: access violation writing 0x0000000000005140, which seems to be indicative of an access violation...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
1
by: Defcon1945 | last post by:
I'm trying to learn Python using Pycharm but import shutil doesn't work
1
by: Shællîpôpï 09 | last post by:
If u are using a keypad phone, how do u turn on JavaScript, to access features like WhatsApp, Facebook, Instagram....
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.