Bytes | Software Development & Data Engineering Community

How to compare two files in perl?

Srijith B
I have two files. I want to compare the first line of file1 to each line of file2 and, once that is done, compare the second line of file1 to each line of file2, and so on. When a match is found, print that line to a new text file.

I tried this, but I am getting weird results.
This is my code:
#!/usr/bin/perl
$file1 = 'log.txt';
$file2 = 'log1.txt';
open(FILE1,"<$file1") || die ("Could not open $file!");
open(FILE2,"<$file2") || die ("Could not open $file!");
my $match = 0;
my $odd=0;
my $even=0;
my $n=0;

while (($file1line =~ /MISR_AUX_ODD/) && ($file2line=~ /MISR_AUX_ODD/))
{
       chomp $_;

      $match++;
      print "$file1line===$file2line";
}
print "$match";

close(FILE1);

I have also attached the two files here.
Attached Files
File Type: txt log1.txt (43.8 KB, 506 views)
File Type: txt log.txt (48.4 KB, 398 views)
Jan 20 '11 #1
toolic
70 Expert
One problem is that you are not reading in the lines of your files after you opened them. Read perlintro.

Another problem is that you need to use nested while loops to loop through all lines of both files.
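
A minimal sketch of the nested-loop idea, assuming "match" means exact string equality after chomp and that both files fit in memory. The helper names matching_lines and read_lines are made up for this illustration; they do not come from the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return every line of @$lines1 that is equal to some line of @$lines2.
sub matching_lines {
    my ($lines1, $lines2) = @_;
    my @matches;
    for my $line1 (@$lines1) {          # outer loop: each line of file1
        for my $line2 (@$lines2) {      # inner loop: each line of file2
            if ($line1 eq $line2) {
                push @matches, $line1;
                last;                   # one hit is enough for this line
            }
        }
    }
    return @matches;
}

# Read a whole file into a list of lines with the newlines removed.
sub read_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Could not open $path: $!";
    chomp(my @lines = <$fh>);
    close($fh);
    return @lines;
}

# Only touch the file system when two file names are given on the command line.
if (@ARGV == 2) {
    my @file1 = read_lines($ARGV[0]);
    my @file2 = read_lines($ARGV[1]);
    print "$_\n" for matching_lines(\@file1, \@file2);
}
```

It could be run as `perl compare.pl log.txt log1.txt` (the script name is arbitrary). Reading file2 into an array once avoids having to re-read the file for every line of file1.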
Jan 21 '11 #2
rovf
41
Maybe

perldoc File::Compare

helps?
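
For reference, a short sketch of what File::Compare offers. Its compare() function reports whether two files have the same content as a whole (0 for equal, 1 for different, -1 on error); it does not report which individual lines match, so it may or may not fit the question. The helper name files_status is hypothetical.

```perl
use strict;
use warnings;
use File::Compare;   # core module; exports compare() by default

# Turn the return value of compare() into a readable word.
sub files_status {
    my ($path1, $path2) = @_;
    my $result = compare($path1, $path2);
    return $result == 0 ? 'identical'
         : $result == 1 ? 'different'
         :                'error';
}

# Only compare real files when two names are given on the command line.
if (@ARGV == 2) {
    print files_status(@ARGV), "\n";
}
```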
Jan 26 '11 #3
chorny
80 Expert
By "compare lines" do you mean equality or some other operation?
Jan 26 '11 #4
I mean I want to compare two lines for equality. I want to compare each line in file1 to all the lines of file2.

Here is my new code:
#!/usr/bin/perl

$file1 = 'log.txt';
$file2 = 'log1.txt';
$odd_file1 = 'output_odd1.txt';
$even_file1 = 'output_even1.txt';
$odd_file2 = 'output_odd2.txt';
$even_file2 = 'output_even2.txt';
$output_file= 'output1.txt';

open(FILE1,"<$file1") || die ("Could not open $file!");
open(FILE2,"<$file2") || die ("Could not open $file!");

my $odd=0;
my $odd2=0;
my $even=0;
my $even2=0;
my $i=0;
my $match=0;

#creates 2 files with odd and evn values by reading the log.txt file
while($log1line = <FILE1>)
{
    foreach($log1line =~ /MISR_AUX_ODD/)
    {
        $odd++;
        #print "$odd odd-times\n";
        #print "$log1line";
        my @FILE1 = split(/:/,$log1line);
        #print "$FILE1[2]";
        @data= $FILE1[2];
        #$length=@data;
        #print "@data";
        #open(ODD_FILE1,">> $odd_file1");
        open(ODD_FILE1,"< $odd_file1");
        for($i=0;$i<1;$i++)
        {
            print ODD_FILE1 $data[i]."\n";
        }
        close(ODD_FILE1);
    }

    foreach($log1line =~ /MISR_AUX_EVEN/)
    {
        $even++;
        my @FILE1 = split(/:/,$log1line);
        #print "$FILE1[2]";
        @data= $FILE1[2];
        open(EVEN_FILE1,"< $even_file1");
        for($i=0;$i<1;$i++)
        {
            print EVEN_FILE1 $data[i]."\n";
        }
        close(EVEN_FILE1);
    }
}

#creates 2 files with odd and evn values by reading the 1log.txt file
while($log2line = <FILE2>)
{
    #my @pattern= <FILE2>;

    #remove all the spaces and tabs from the file
    #my $log2line =~ s/\n\r\t\s//g;
    #foreach($log2line = grep(/MISR_AUX_ODD/,@pattern))
    foreach($log2line =~ /MISR_AUX_ODD/)
    {
        $odd2++;
        my @FILE2 = split(/:/ , $log2line);
        #print "$FILE2[2]";
        @data2= $FILE2[2];
        open(ODD_FILE2,"< $odd_file2");
        for($i=0;$i<1;$i++)
        {
            print ODD_FILE2 $data2[i]."\n";
        }
        close(ODD_FILE2);
    }

    foreach($log2line =~ /MISR_AUX_EVEN/)
    {
        $even2++;
        my @FILE2 = split(/:/ , $log2line);
        #print "$FILE2[2]";
        @data2= $FILE2[2];
        open(EVEN_FILE2,"< $even_file2");
        for($i=0;$i<1;$i++)
        {
            print EVEN_FILE2 $data2[i]."\n";
        }
        close(EVEN_FILE2);
    }
}

open F1, "< output_odd1.txt";
open F2, "< output_odd2.txt";

while($line = <F1>)
    while($line1=<F2>)
    {
        foreach($line1)
        {
            #for($a=0;$a<$line1;$a++)
            #{
            if($line eq $line1)
            {
                $match++;
                print "match:$match";
                #for($a=0;$a<$line1;$a++)
                #{
                print "$line";
                #}
            }
            #}
        }
    }
}

close F1;
close F2;

close(FILE2);
close(FILE1);

The problem with this code is that it prints only one matching value instead of all the common values.
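
One likely reason for this symptom: F2 is read to end-of-file during the first pass of the inner loop, so for every later line of F1 the inner while has nothing left to read. Two other things in the listing also look wrong. The output_*.txt handles are opened with "<", which is read mode, so the print calls to them cannot write anything, and the outer while over F1 has no opening brace. The sketch below shows only the rewinding part, using seek. It uses in-memory filehandles and a made-up helper count_matches so that it runs on its own; real code would open the log files instead.

```perl
use strict;
use warnings;

# Count equal line pairs, rewinding the inner handle for every outer line.
sub count_matches {
    my ($outer, $inner) = @_;
    my $match = 0;
    while (my $line = <$outer>) {
        # Without this seek, the inner loop would find data only on the first pass.
        seek($inner, 0, 0) or die "Cannot rewind: $!";
        while (my $line1 = <$inner>) {
            $match++ if $line eq $line1;
        }
    }
    return $match;
}

# In-memory "files" (assumed sample data, not the poster's logs).
my $text1 = "aa\nbb\ncc\n";
my $text2 = "cc\nbb\nzz\n";
open(my $fh1, '<', \$text1) or die "Cannot open in-memory file: $!";
open(my $fh2, '<', \$text2) or die "Cannot open in-memory file: $!";
print "matches: ", count_matches($fh1, $fh2), "\n";   # prints "matches: 2"
```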
Feb 3 '11 #5
rovf
41
> I mean I want to compare two lines for equality. I want to compare each line in file1 to all the lines of file2.

Sorry, I don't get it. These are different things. "To compare two files for equality" means that you just want to know whether two files have the same content. "To compare each line in file1 to all the lines of file2" means you take the first line of file1 and compare it to all lines in file2, then you take the second line of file1 and compare it again to all lines in file2, and so on. Even leaving aside whether or not this makes sense, it still doesn't tell us anything about the desired outcome of this comparison.

I suggest that, before discussing your code, you should define precisely what you want to achieve.
Feb 3 '11 #6
OK. Sorry for the inconvenience.
Here is what I am trying to do:
Compare line1 of file1 with all the lines of file2. If a match is found (e.g. the line "hello, how ru?" in file1 matches a line of file2), print that matched line to a different file.
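
A sketch of one way to do this, assuming "match" means the two lines are exactly equal once the trailing newline is removed. Instead of rescanning file2 for every line of file1, it stores the lines of file2 in a hash and looks each line of file1 up in it. The sub name write_common_lines and the three command-line arguments are choices made for this example only.

```perl
use strict;
use warnings;

# Copy every line of $in1 that also occurs in $in2 to $out; return the count.
sub write_common_lines {
    my ($in1, $in2, $out) = @_;

    open(my $fh2, '<', $in2) or die "Could not open $in2: $!";
    my %seen;
    while (my $line = <$fh2>) {
        chomp $line;
        $seen{$line} = 1;            # remember every line of file2
    }
    close($fh2);

    open(my $fh1, '<', $in1) or die "Could not open $in1: $!";
    open(my $fh3, '>', $out) or die "Could not open $out: $!";
    my $count = 0;
    while (my $line = <$fh1>) {
        chomp $line;
        next unless $seen{$line};    # skip lines that file2 does not contain
        print {$fh3} "$line\n";
        $count++;
    }
    close($fh1);
    close($fh3) or die "Could not close $out: $!";
    return $count;
}

# Expected usage: perl common.pl file1 file2 file3 (the script name is arbitrary).
if (@ARGV == 3) {
    my $n = write_common_lines(@ARGV);
    print "$n matching line(s) written to $ARGV[2]\n";
}
```

The hash lookup gives the same result as the nested loops for exact equality, but it reads each file only once.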
Feb 3 '11 #7
rovf
41
OK, I understand more, though the specification is still incomplete. What I understand is this: You want to extract all the lines in file1 which match some line in file2, and write those lines (from file1) to a new file (say, file3). Correct?

What you still haven't specified is the following:

(1) What does "lineX matches lineY" mean? Do you mean for example:
- lineX is identical to lineY
- lineX is a regular expression supposed to match lineY
- lineY is a regular expression supposed to match lineX
- lineX is a glob pattern supposed to match lineY
- lineX (with leading and trailing blanks removed) should be equal to lineY
etc.

(2) (only if "to match" doesn't mean "equality"): If we find that lineX matches lineY, should we write lineX to file3, or lineY?

(3) If you find two lines in file1 (lineM and lineN), which BOTH match some line in file2, do you want to have lineM AND lineN put into file3, or only lineM?
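
To make question (1) concrete, here is a sketch of three of the possible meanings written as small Perl subs, plus a selector that applies whichever rule is chosen. All names (is_identical, is_equal_trimmed, regex_matches, select_matches, and the rule keys) are invented for this illustration, and the lines are assumed to be chomped already.

```perl
use strict;
use warnings;

# lineX is identical to lineY
sub is_identical {
    my ($x, $y) = @_;
    return $x eq $y ? 1 : 0;
}

# lineX (with leading and trailing blanks removed) is equal to lineY
sub is_equal_trimmed {
    my ($x, $y) = @_;
    $x =~ s/^\s+|\s+$//g;
    return $x eq $y ? 1 : 0;
}

# lineX is a regular expression supposed to match lineY
sub regex_matches {
    my ($x, $y) = @_;
    my $re = eval { qr/$x/ };        # an invalid pattern gives undef instead of dying
    return (defined($re) && $y =~ $re) ? 1 : 0;
}

my %match_rule = (
    identical => \&is_identical,
    trimmed   => \&is_equal_trimmed,
    regex     => \&regex_matches,
);

# Lines of @$lines1 that match at least one line of @$lines2 under the chosen rule.
sub select_matches {
    my ($rule, $lines1, $lines2) = @_;
    my $test = $match_rule{$rule} or die "Unknown rule: $rule";
    return grep { my $x = $_; grep { $test->($x, $_) } @$lines2 } @$lines1;
}

print "$_\n" for select_matches('trimmed', ['a', ' b '], ['a', 'b']);
```

In this sketch every line of file1 that matches is returned, which corresponds to answering question (3) with "lineM AND lineN", and it is always the line from file1 that is kept, which is one possible answer to question (2).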

Finally, one more remark. This is a forum dedicated to solving Perl problems. Its purpose is not primarily to discuss how to learn to program in general. This doesn't mean that you won't get help if you are a novice in programming. However, if you are new to programming (and not only to Perl), as your way of asking questions suggests, it would help if you clearly said so, because you might get more helpful answers. For example, when you started the thread, I assumed that you were a programmer who is new to Perl, and my response focused on Perl-specific issues. However, I now get the impression that you don't know much about programming in general, so you would first need help in getting the algorithm right, instead of just having the Perl part explained. If you could clarify this, we can help you in a better way.
Feb 3 '11 #8
numberwhun
3,509 Expert Mod 2GB
Actually, @rovf, I have to stop you right there. This is not just a forum dedicated to solving Perl problems, it is also a learning forum. Everyone needs a place to go to learn and we should not be discouraging them or turning them away. If you have an issue with a user, then please, feel free to send a PM (private message) to me or one of the other moderators on this site, but please, do not turn away our users from getting help (unless it's homework, of course).

Now, @Srijith B, what I do agree with is that you need to take a step back and maybe go through a Perl tutorial. Something that will give you a good grounding in the language. @rovf suggested a module earlier, but I wonder if you looked into it. If you need a good tutorial, send me a Private Message (PM) with your email address and I will gladly send you one. DO NOT put your email in this thread as it will be quickly removed and is against site policy.

Regards,

Jeff
Feb 4 '11 #9
rovf
41
> Actually, @rovf, I have to stop you right there. This is not just a forum dedicated to solving Perl problems, it is also a learning forum.

I absolutely agree. That's why I think we *need* more information. If I know from the outset that someone is a beginner in programming, I would focus in my answer on the algorithm (for example, by providing an outline of the solution in pseudocode). If I know someone is a programmer, but doesn't know Perl, I can focus on the different ways to solve the problem in Perl.

In addition, I think it *is* necessary that the poster defines his or her problem, because we can then help better.

Don't get me wrong: I don't mind helping learners even on a basic level. I just want to make sure that I understand what level the learner is at, and what problem s/he has to solve.
Feb 4 '11 #10
