Dirty Arrays and how to clean them up!

I have an array with duplicate entries. Hundreds of them, in fact.

For example: Array = 17177 9661 9661 9535 9533 9533 9533 9533 9533 9533
9533 9533 9533 9533 9533 9533 9533 9533 9532 9532 9532 9532 9531 9096
9096 9096 9095 9095 9095 8345 8345 8344 8344 8226 8226 8225 8225 8198
8198 8198 8198 8198 8198 8198 8198 8198 8198 8198 8198 8198 8198 8198
8198 8198 8198 8198 8198 8198 8198 8198 8198 8198 ...

This is what I am doing. I am reading file1, which contains these entries
multiple times, and comparing it against file2. If I do not find a match, I
write the record to file3 and then to an array. When I read file1 again and
compare it against file2, I check whether the record has already been written
by rereading the array and looking for it. If the same entry is in the array,
I read the next record and do not write to file3. If not, I write the record
to file3 and to the array.

I either want to write each record to the array only once, or to remove
duplicates before I check.

Here is some pseudocode:
while file1
    while file2
        if record1 != record2
            if record2 is not in array
                write to file3
                write new record to array
                last
            else
                next
        else
            next
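
A minimal sketch of the "write each record only once" idea, offered as one possible approach rather than a drop-in fix. It assumes the records fit in memory and that a match means an exact match; the array references stand in for the lines read from file1 and file2, and the names `unmatched_once`, `%in_file2` and `%written` are made up for illustration. A hash lookup replaces rereading the array on every record.

```perl
use strict;
use warnings;

# Hypothetical helper: returns each record of @$file1 that has no match in
# @$file2, once only, in the order first seen. The array refs stand in for
# the lines of file1 and file2.
sub unmatched_once {
    my ($file1, $file2) = @_;

    # Index file2 once so each check is a hash lookup, not a rescan.
    my %in_file2 = map { $_ => 1 } @$file2;

    my %written;    # records already sent to "file3"
    my @file3;
    for my $record (@$file1) {
        next if $in_file2{$record};     # matched in file2: skip
        next if $written{$record}++;    # already written: skip
        push @file3, $record;
    }
    return @file3;
}

my @file1 = (9661, 9661, 9535, 9533, 9533, 9532, 8198, 8198);
my @file2 = (9535, 8198);
print join(' ', unmatched_once(\@file1, \@file2)), "\n";    # 9661 9533 9532
```

Because `%written` is checked before anything is stored, duplicates never reach the output, so no clean-up pass is needed afterwards.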

Here is another example. It is a poor one, but you should get the idea. At
least I hope!

@SORTED = ();
@CLEANED = ();
$X = 0;
$R = 0;
while ($R < 4)
{
    while ($X <= 100)
    {
        $X++;
        # print "I am in X\n";
        # I really do not want to write to the array if
        # it already exists in the array.
        # If the array already has X then do not write to the
        # array.
        push @CHECKED, $X;
        @SORTED = sort {$a <=> $b} @CHECKED;
    }
    # print "*********I am in R\n";
    $R++;
    $X = 0;
    next;
}

print "Array = @SORTED\n";

foreach $Row (@SORTED)
{
    $Hold = $Row;
    if ($Hold == $Row)
    {
        @CLEANED = shift @SORTED;
    }
    else
    {
        next;
    }
    # How do I remove duplicates from the array?
    # I know this is wrong but here is my dilemma!
}

print "Array = @CLEANED\n";

Jul 19 '05 #1
Bob
I have been working on the solution myself, but I am still having
problems. Here is where I am so far!

Still lost!

@SORTED = ();
@CLEANED = ();
$X = 0;
$R = 0;
while ($R < 4)
{
    while ($X <= 100)
    {
        $X++;
        # print "I am in X\n";
        # I really do not want to write to the array if
        # it already exists in the array.
        # If the array already has X then do not write to the
        # array.
        push @CHECKED, $X;
        @SORTED = sort {$a <=> $b} @CHECKED;
    }
    # print "*********I am in R\n";
    $R++;
    $X = 0;
    next;
}

print "Array = @SORTED\n";

$Element2 = 0;

for ($Element1 = 0; $Element1 < @SORTED; $Element++)
{
    $Element2 = ($Element1 + 1);
    if ($SORTED[$Element1] == $SORTED[$Element2])
    {
        print "$SORTED[$Element1] == $SORTED[$Element2]\n";
        delete $SORTED[$Element2];
        $Element1--;
        next;
    }
    else
    {
        next;
    }
}
# How do I remove duplicates from the array?
# I know this is wrong but here is my dilemma!

print "Array = @CLEANED\n";
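
Two things seem to get in the way of the loop above: it increments `$Element` rather than `$Element1`, so the index never advances, and `delete` on an array element leaves a hole instead of closing the gap (`splice` is the call that removes an element and shifts the rest down). A sketch of the same adjacent-compare idea that builds a second array instead of editing the first, assuming the input is already sorted numerically; the lowercase names `@sorted` and `@cleaned` are used only for this illustration:

```perl
use strict;
use warnings;

# Same shape as the test data above: 1..101 repeated four times, sorted.
my @sorted = sort { $a <=> $b } ((1 .. 101) x 4);

my @cleaned;
for my $value (@sorted) {
    # In a sorted list duplicates sit next to each other, so comparing
    # with the last value kept is enough.
    push @cleaned, $value if !@cleaned || $cleaned[-1] != $value;
}

print "Array = @cleaned\n";
```

This relies on the sort; on unsorted input only adjacent repeats would be dropped.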

Jul 19 '05 #2
Bob
Sorry, I cut and pasted only half of the code!

@SORTED = ();
@CLEANED = ();
$X = 0;
$R = 0;
while ($R < 4)
{
    while ($X <= 100)
    {
        $X++;
        # print "I am in X\n";
        # I really do not want to write to the array if
        # it already exists in the array.
        # If the array already has X then do not write to the
        # array.
        push @CHECKED, $X;
        @SORTED = sort {$a <=> $b} @CHECKED;
    }
    # print "*********I am in R\n";
    $R++;
    $X = 0;
    next;
}

print "Array = @SORTED\n";

$Element2 = 0;

for ($Element1 = 0; $Element1 < @SORTED; $Element++)
{
    $Element2 = ($Element1 + 1);
    if ($SORTED[$Element1] == $SORTED[$Element2])
    {
        print "$SORTED[$Element1] == $SORTED[$Element2]\n";
        delete $SORTED[$Element2];
        $Element1--;
        next;
    }
    else
    {
        next;
    }
}
# How do I remove duplicates from the array?
# I know this is wrong but here is my dilemma!

print "Array = @CLEANED\n";

Jul 19 '05 #3
Bob,

The easiest way to do this is to use all elements of your dirty array as
keys to a hash. There are shorter forms of this solution, but I'll leave it
verbose for clarity.

Hope this helps.

-r

##################################################################
# see
# http://perlmonks.com/index.pl?node=609
# for reference
##################################################################

use strict;
use warnings;

my @dirty = ( 1, 2, 3, 2, 3, 4, 9, 8, 7, 6, 5, 4, 3, 2, 1 );
#
# the hash we will use to track unique elements in @dirty
#
my %occurred;
foreach my $dirtyElement ( @dirty )
{
    # a repeated element just overwrites its own key
    $occurred{$dirtyElement} = 1;
}
#
# we can now retrieve the unique elements from %occurred
#
my @unique = keys %occurred;

print "@unique\n";

##################################################################
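
One thing worth knowing about this approach: `keys` returns the hash keys in no particular order, so the unique values will not come back in the order they appeared in `@dirty`. If order matters, a numeric sort on the way out is one option. A short sketch using the same sample data:

```perl
use strict;
use warnings;

my @dirty = (1, 2, 3, 2, 3, 4, 9, 8, 7, 6, 5, 4, 3, 2, 1);

my %occurred;
$occurred{$_} = 1 for @dirty;

# keys() makes no ordering promise, so sort numerically for stable output.
my @unique = sort { $a <=> $b } keys %occurred;

print "@unique\n";    # 1 2 3 4 5 6 7 8 9
```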

Jul 19 '05 #4
Here is my attempt:

my %temp = ();
print grep !$temp{$_}++, ((1..10) x 5);

...which is essentially the same concept as you described, just a
different approach.
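
For anyone puzzled by the one-liner, here is a spelled-out sketch of the same idiom. The sample list and the names `%seen`, `@first_seen` and `@via_uniq` are made up for illustration, and the `uniq` import assumes a List::Util recent enough to export it (older installs may not have it).

```perl
use strict;
use warnings;
use List::Util qw(uniq);    # assumption: List::Util new enough to export uniq

my @dirty = (3, 1, 3, 2, 1);

# Post-increment returns the old count: 0 (false) the first time a value
# is seen, so !$seen{$_}++ is true exactly once per distinct value.
my %seen;
my @first_seen = grep { !$seen{$_}++ } @dirty;

# uniq does the same job and also keeps first-seen order.
my @via_uniq = uniq @dirty;

print "@first_seen\n";    # 3 1 2
print "@via_uniq\n";      # 3 1 2
```

Unlike pulling `keys` out of a hash, both forms keep the elements in the order they first appeared.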

Jul 19 '05 #5
