compare two file contents

Hi
I have two text files and each file contains 2 tab separated strings as below:

File R.txt

class1 12345
class2 26789
class1 4567

Feb 2 '09 #1


9743

lilly07

Hi Sorry, I submitted my post before finishing the draft.

I have two text files and each file contains 2 tab separated strings as below:

File testR.txt
class1 12345
class2 26789
class1 4567
class5 567
class3 987

and another file as below:
File testP.txt
class5 525
class7 728
class1 670
class8 34
class3 567

I need to compare the two files as follows. For every record in R.txt, I have to check whether column 1 equals column 1 of any record in P.txt, and then process the match further. I tried the script below, but somehow the search is not complete: every record in R.txt should be compared against every record in P.txt.

I have posted my script below. There seems to be some flaw in the logic, and I also want to know whether this kind of search is optimal, because the files hold around 5000 records for R.txt and 3000 for P.txt. Thanks, and please let me know the problem in my script.

#!/usr/bin/perl

$file1 = 'testR.txt';
$file2 = 'testP.txt';

open (R, $file1) || die ("Could not open $file!");
open (P, $file2) || die ("Could not open $file!");

$counter = 0;

while ($Rline = <R>)
{
        chomp $_;
        my @R = split(/\s+/,$Rline);

        while ($Pline = <P>)
        {
                chomp $_;
                my @P = split(/\s+/,$Pline);

                if($R[0] eq $P[0]) {
                        print "$R[0]\t$R[1]\t$P[0]\t$P[1]\n";
                }
        }

        close (P);

        print "$counter\n";
        $counter++;
}

close (R);

Thanks.

Feb 2 '09 #2

KevinADC

4,059

Expert 2GB

Your code looks like it should not work, since you are closing file P after searching it for only the first line of file R. I'm not sure what to suggest, though, because what you are trying to do is not clear to me. Most likely you want to use a hash and search the hash instead of searching the file over and over.

Feb 3 '09 #3

lilly07

Hi Kevin,

I cannot add the file contents into a hash. In the above example, I need to check the first column of file1 against the first column of file2, and if they are the same, I have to process further. Basically I have to check every element in file1 against every element in file2.

Thanks.

Feb 3 '09 #4

KevinADC

4,059

Expert 2GB

Why can't you add the file contents into a hash? This way is very inefficient, but see if it works:

#!/usr/bin/perl

$file1 = 'testR.txt';
$file2 = 'testP.txt';

open (R, $file1) || die ("Could not open $file1!");
open (P, $file2) || die ("Could not open $file2!");

$counter = 0;

while ($Rline = <R>)
{
        chomp $Rline; # strip the newline from the line just read
        my @R = split(/\s+/,$Rline);

        seek P,0,0; # return to beginning of the P file

        while ($Pline = <P>)
        {
                chomp $Pline;
                my @P = split(/\s+/,$Pline);

                if($R[0] eq $P[0]) {
                        print "$R[0]\t$R[1]\t$P[0]\t$P[1]\n";
                }
        }

        print "$counter\n";
        $counter++;
}

close (P);
close (R);

Feb 3 '09 #5

lilly07

Hi Kevin,
As a beginner, I always find hashes confusing. My basic objective is to check whether the 1st column in the R file equals the 1st column in the P file, and then take the difference between their 2nd columns to check whether they differ by just 100. For example, with

class1 12345

from R and

class1 670

from P, the 1st columns are the same and the diff is the mod (absolute) value of (12345 - 670). I have to check every record in R against every record in P. Since it is confusing to think in terms of hashes, I did the search in a very primitive way.

Anyway, as you suggested, I will try to put the values in a hash and compare the array values. I understand that this brings my search time down, but it is still a bit confusing to compare values between hashes.
Thanks again.

Feb 4 '09 #6

KevinADC

4,059

Expert 2GB

I understand. It will be even harder because you have many duplicate "keys" in the files. Hash keys are unique, so you would actually have to use something like a hash of arrays. I will see what I can come up with.
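To illustrate the duplicate-key problem, here is a minimal sketch using three records from testR.txt. The helper name group_by_key is made up for this example and is not part of any script posted in the thread. A plain hash keeps only the last value stored under a key, whereas a hash of arrays keeps them all.

```perl
use strict;
use warnings;

# Sample records taken from testR.txt in this thread.
my @records = ("class1 12345", "class2 26789", "class1 4567");

# group_by_key is a hypothetical helper: it maps each key to an
# array reference holding every value seen for that key.
sub group_by_key {
    my @lines = @_;
    my %grouped;
    for my $line (@lines) {
        my ($key, $value) = split ' ', $line;
        next unless defined $value;        # skip blank or malformed lines
        push @{ $grouped{$key} }, $value;  # array is created on first use
    }
    return \%grouped;
}

# A plain hash overwrites the earlier value when a key repeats.
my %plain;
for my $line (@records) {
    my ($key, $value) = split ' ', $line;
    $plain{$key} = $value;
}
print "plain hash:     class1 => $plain{class1}\n";

# The hash of arrays keeps both class1 values.
my $grouped = group_by_key(@records);
print "hash of arrays: class1 => @{ $grouped->{class1} }\n";
```

With the plain hash, class1 ends up holding only 4567; the hash of arrays holds both 12345 and 4567.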

Feb 4 '09 #7

KevinADC

4,059

Expert 2GB

Here's a rather quick write-up of some code. It does what you want, I think. The output is probably much more verbose than you want, but that can be changed to display only the results you want, such as when the diff is 100. I would run this on a small set of data first, since it will print out a lot of results. If it appears to work correctly, the output can be modified.

use strict;
use warnings;
#use Data::Dumper;

my $file1 = 'c:/perl_test/testR.txt';
my $file2 = 'c:/perl_test/testP.txt';

my %HoA;

open (R, $file1) or  die ("Could not open $file1!");
while(<R>){
   chomp;
   my ($k, $v) = split(/\s+/);
   push @{$HoA{'R'}{$k}},$v;
}
close(R);

open (P, $file2) or die ("Could not open $file2!");
while(<P>){
   chomp;
   my ($k, $v) = split(/\s+/);
   push @{$HoA{'P'}{$k}},$v;
}
close(P);

#print Dumper \%HoA;

foreach my $R (keys %{ $HoA{'R'} }) {
   if (exists $HoA{'P'}{$R}) {
      print "$R\ntestR     testP     diff\n------------------------------\n";
      foreach my $classR ( @{$HoA{'R'}{$R}} ) {
         foreach my $classP ( @{$HoA{'P'}{$R}} ) {
            printf "%-10s%-10s%s\n",$classR,$classP,$classR-$classP;
         }
      }
      print "\n";
   }
   else {
      print "\n$R has no match in testP\n\n";
   }
}
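The nested loops above print every R/P pair. If only pairs with a particular difference are wanted, a filter along these lines might work. This is a sketch under assumptions, not part of the script above: matching_pairs is a hypothetical helper, and it reads "100 in difference" as an absolute difference of exactly the target value (change == to <= for "within the target"). The demo uses a target of 42 only because no pair in the sample data differs by 100.

```perl
use strict;
use warnings;

# matching_pairs is a hypothetical helper. $r and $p are hash references
# that map a class name to an array reference of values, the same shape
# as $HoA{'R'} and $HoA{'P'}. $target is the wanted absolute difference.
sub matching_pairs {
    my ($r, $p, $target) = @_;
    my @found;
    for my $class (sort keys %$r) {
        next unless exists $p->{$class};   # class is missing from P
        for my $r_val (@{ $r->{$class} }) {
            for my $p_val (@{ $p->{$class} }) {
                push @found, [$class, $r_val, $p_val]
                    if abs($r_val - $p_val) == $target;
            }
        }
    }
    return @found;
}

# Sample data from the thread, already grouped by class.
my %r = (class1 => [12345, 4567], class2 => [26789],
         class5 => [567],         class3 => [987]);
my %p = (class5 => [525], class7 => [728], class1 => [670],
         class8 => [34],  class3 => [567]);

# Print class, R value and P value for each pair that differs by 42.
printf "%s\t%s\t%s\n", @$_ for matching_pairs(\%r, \%p, 42);
```

On the sample data this reports the single class5 pair (567 and 525). Sorting the keys also makes the order of the results repeatable, which plain keys does not guarantee.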

Feb 4 '09 #8

KevinADC

4,059

Expert 2GB

Output with your small sample data is:

class5
testR     testP     diff
------------------------------
567       525       42

class1
testR     testP     diff
------------------------------
12345     670       11675
4567      670       3897

class2 has no match in testP

class3
testR     testP     diff
------------------------------
987       567       420

Feb 4 '09 #9

lilly07

Hi Kevin,

Thank you so much. It works.

Feb 4 '09 #10

KevinADC

4,059

Expert 2GB

You're welcome. Hopefully it helps you learn how to use hashes and more complex data for future needs.

Feb 4 '09 #11

lilly07

Hi Kevin,
It was very helpful, especially with hashes, and saved lots of time compared with the primitive way of searching. Thanks a lot again for your time.

push @{$HoA{'P'}{$k}},$v;

is a bit tricky. Could you please explain?
Regards
Lilly

Feb 4 '09 #12

KevinADC

4,059

Expert 2GB

You are already familiar with the push function, I assume:

push @array,$var;

This is really the same thing, albeit with more brackets:

push @{$HoA{'P'}{$k}},$v;

It's a hash of hashes of arrays:

$HoA{'P'} <-- top level of the hash
$HoA{'P'}{$k} <-- second level of the hash; the value stored here is a reference to an array
@{ $HoA{'P'}{$k} } <-- this dereferences that reference so it can be used as an array
push @{$HoA{'P'}{$k}},$v; <-- this adds $v to the end of the array @{$HoA{'P'}{$k}}

All the bracketing makes it look more complicated than it is. But notice the sigil is the same: @ for array.
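The steps above can be tried in isolation with a short sketch. The values are borrowed from the sample files, and add_value is a hypothetical wrapper added only to show the same push working through a hash reference.

```perl
use strict;
use warnings;

my %HoA;
my ($k, $v) = ('class1', 670);

# $HoA{'P'}{$k} holds an array reference. Neither the inner hash nor the
# array exists yet; Perl creates both on first use (autovivification).
push @{ $HoA{'P'}{$k} }, $v;
push @{ $HoA{'P'}{$k} }, 34;    # a second value under the same key

# The same operation written the long way, with an explicit reference.
$HoA{'R'}{$k} = [] unless exists $HoA{'R'}{$k};
my $array_ref = $HoA{'R'}{$k};
push @$array_ref, 12345;

# add_value is a hypothetical wrapper around the same push.
sub add_value {
    my ($hoa, $file, $key, $value) = @_;
    push @{ $hoa->{$file}{$key} }, $value;
    return scalar @{ $hoa->{$file}{$key} };   # number of values now stored
}
add_value(\%HoA, 'R', $k, 4567);

print "P values: @{ $HoA{'P'}{$k} }\n";   # whole array, space separated
print "first P:  $HoA{'P'}{$k}[0]\n";     # a single element
print "R values: @{ $HoA{'R'}{$k} }\n";
```

The short form and the long form build the same structure; the short form just relies on Perl creating the intermediate hash and array automatically.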

Feb 4 '09 #13
