How to compare two files in perl?

Srijith B
I have two files. I want to compare the first line of file1 to each line of file2, then compare the second line of file1 to each line of file2, and so on. When a match is found, I want to print that line to a new text file.

I tried this and am getting weird results.
This is my code:
    #!/usr/bin/perl
    $file1 = 'log.txt';
    $file2 = 'log1.txt';
    open(FILE1,"<$file1") || die ("Could not open $file!");
    open(FILE2,"<$file2") || die ("Could not open $file!");
    my $match = 0;
    my $odd=0;
    my $even=0;
    my $n=0;

    while (($file1line =~ /MISR_AUX_ODD/) && ($file2line=~ /MISR_AUX_ODD/))
    {
        chomp $_;

        $match++;
        print "$file1line===$file2line";
    }
    print "$match";

    close(FILE1);

I have attached the two files also here.
Attached Files
File Type: txt log1.txt (43.8 KB, 504 views)
File Type: txt log.txt (48.4 KB, 397 views)
Jan 20 '11 #1
toolic
70 Expert
One problem is that you are not reading in the lines of your files after you opened them. Read perlintro.

Another problem is that you need to use nested while loops to loop through all lines of both files.
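A minimal sketch of the nested-loop idea, assuming "compare" means exact equality after removing the newline. The sub name `matching_lines` and the sample data are made up for illustration, and the in-memory filehandles only stand in for `log.txt` and `log1.txt`. Note the `seek`: without rewinding the second file, the inner loop reads to end-of-file on the first pass and has nothing left to compare on later passes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return every line of file 1 that is identical to some line of file 2.
# Both arguments are open, seekable filehandles.
sub matching_lines {
    my ($fh1, $fh2) = @_;
    my @matches;
    while (my $line1 = <$fh1>) {
        chomp $line1;
        # Rewind file 2; otherwise the inner loop finds it exhausted
        # after the first pass and compares nothing on later passes.
        seek($fh2, 0, 0) or die "Cannot rewind file 2: $!";
        while (my $line2 = <$fh2>) {
            chomp $line2;
            if ($line1 eq $line2) {
                push @matches, $line1;
                last;    # one match is enough for this line of file 1
            }
        }
    }
    return @matches;
}

# Hypothetical sample data held in memory so the sketch runs without files.
my $data1 = "alpha\nbeta\ngamma\n";
my $data2 = "gamma\ndelta\nalpha\n";
open(my $fh1, '<', \$data1) or die "Cannot open in-memory file 1: $!";
open(my $fh2, '<', \$data2) or die "Cannot open in-memory file 2: $!";

print "$_\n" for matching_lines($fh1, $fh2);
```

With real files, the two `open` calls would take the file names instead of the scalar references. The rest stays the same.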
Jan 21 '11 #2
rovf
41
Maybe

perldoc File::Compare

helps?
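For reference, a small sketch of what File::Compare does. Its `compare` function answers only whether two whole files have the same content (0 if equal, 1 if different, -1 on error), so it does not report which individual lines match. The helper `make_temp` and the sample contents are made up for this example.

```perl
use strict;
use warnings;
use File::Compare qw(compare);
use File::Temp qw(tempfile);

# Hypothetical helper: write $content to a temporary file, return its path.
sub make_temp {
    my ($content) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} $content;
    close($fh) or die "Cannot close $path: $!";
    return $path;
}

my $same_a = make_temp("line one\nline two\n");
my $same_b = make_temp("line one\nline two\n");
my $other  = make_temp("line one\nline 2\n");

# compare() returns 0 if the files are equal, 1 if they differ, -1 on error.
for my $pair ([$same_a, $same_b], [$same_a, $other]) {
    my $result = compare($pair->[0], $pair->[1]);
    die "compare failed: $!" if $result < 0;
    print $result == 0 ? "identical\n" : "different\n";
}
```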
Jan 26 '11 #3
chorny
80 Expert
By "compare lines" do you mean equality or some other operation?
Jan 26 '11 #4
I mean I want to compare two lines for equality. I want to compare each line in file1 to all the lines of file2.

Here is my new code:
    #!/usr/bin/perl

    $file1 = 'log.txt';
    $file2 = 'log1.txt';
    $odd_file1 = 'output_odd1.txt';
    $even_file1 = 'output_even1.txt';
    $odd_file2 = 'output_odd2.txt';
    $even_file2 = 'output_even2.txt';
    $output_file= 'output1.txt';

    open(FILE1,"<$file1") || die ("Could not open $file!");
    open(FILE2,"<$file2") || die ("Could not open $file!");

    my $odd=0;
    my $odd2=0;
    my $even=0;
    my $even2=0;
    my $i=0;
    my $match=0;

    #creates 2 files with odd and even values by reading the log.txt file
    while($log1line = <FILE1>)
    {
        foreach($log1line =~ /MISR_AUX_ODD/)
        {
            $odd++;
            #print "$odd odd-times\n";
            #print "$log1line";
            my @FILE1 = split(/:/,$log1line);
            #print "$FILE1[2]";
            @data= $FILE1[2];
            #$length=@data;
            #print "@data";
            #open(ODD_FILE1,">> $odd_file1");
            open(ODD_FILE1,"< $odd_file1");
            for($i=0;$i<1;$i++)
            {
                print ODD_FILE1 $data[i]."\n";
            }
            close(ODD_FILE1);
        }

        foreach($log1line =~ /MISR_AUX_EVEN/)
        {
            $even++;
            my @FILE1 = split(/:/,$log1line);
            #print "$FILE1[2]";
            @data= $FILE1[2];
            open(EVEN_FILE1,"< $even_file1");
            for($i=0;$i<1;$i++)
            {
                print EVEN_FILE1 $data[i]."\n";
            }
            close(EVEN_FILE1);
        }
    }

    #creates 2 files with odd and even values by reading the log1.txt file
    while($log2line = <FILE2>)
    {
        #my @pattern= <FILE2>;

        #remove all the spaces and tabs from the file
        #my $log2line =~ s/\n\r\t\s//g;
        #foreach($log2line = grep(/MISR_AUX_ODD/,@pattern))
        foreach($log2line =~ /MISR_AUX_ODD/)
        {
            $odd2++;
            my @FILE2 = split(/:/ , $log2line);
            #print "$FILE2[2]";
            @data2= $FILE2[2];
            open(ODD_FILE2,"< $odd_file2");
            for($i=0;$i<1;$i++)
            {
                print ODD_FILE2 $data2[i]."\n";
            }
            close(ODD_FILE2);
        }

        foreach($log2line =~ /MISR_AUX_EVEN/)
        {
            $even2++;
            my @FILE2 = split(/:/ , $log2line);
            #print "$FILE2[2]";
            @data2= $FILE2[2];
            open(EVEN_FILE2,"< $even_file2");
            for($i=0;$i<1;$i++)
            {
                print EVEN_FILE2 $data2[i]."\n";
            }
            close(EVEN_FILE2);
        }
    }

    open F1, "< output_odd1.txt";
    open F2, "< output_odd2.txt";

    while($line = <F1>)
    {
        while($line1=<F2>)
        {
            foreach($line1)
            {
                #for($a=0;$a<$line1;$a++)
                #{
                if($line eq $line1)
                {
                    $match++;
                    print "match:$match";
                    #for($a=0;$a<$line1;$a++)
                    #{
                    print "$line";
                    #}
                }
                #}
            }
        }
    }

    close F1;
    close F2;

    close(FILE2);
    close(FILE1);

The problem with this code is that it prints only one match, not all the common lines.
Feb 3 '11 #5
rovf
41
> I mean I want to compare two lines for equality. I want to compare each line in file1 to all the lines of file2.

Sorry, I don't get it. These are different things. "To compare two files for equality" means that you just want to know whether two files have the same content. "To compare each line in file1 to all the lines of file2" means you take the first line of file1 and compare it to all lines in file2, then you take the second line of file1 and compare it again to all lines in file2, and so on. Leaving aside whether this makes sense, it still doesn't tell us anything about the desired outcome of the comparison.

I suggest that, before discussing your code, you should define precisely what you want to achieve.
Feb 3 '11 #6
OK. Sorry for the inconvenience.
Here is what I am trying to do:
Compare line 1 of file1 with all the lines of file2. If a match is found (e.g. the line "hello, how ru?" in file1 matches a line in file2), print that matched line to a different file.
Feb 3 '11 #7
rovf
41
OK, I understand more, though the specification is still incomplete. What I understand is this: you want to extract all the lines in file1 which match some line in file2, and write those lines (from file1) to a new file (say, file3). Correct?
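Under that reading, and assuming that "match" means exact equality of the text once the newline is removed, one possible sketch loads the lines of file2 into a hash and then makes a single pass over file1. The sub name `write_common_lines` and the in-memory sample data are made up for illustration. Real code would open `log.txt`, `log1.txt` and an output file instead.

```perl
use strict;
use warnings;

# Copy every line read from $in1 that also occurs in $in2 to $out.
# Returns the number of lines written.
sub write_common_lines {
    my ($in1, $in2, $out) = @_;

    # Remember each line of file 2 once, so file 2 is read a single time.
    my %seen;
    while (my $line = <$in2>) {
        chomp $line;
        $seen{$line} = 1;
    }

    my $count = 0;
    while (my $line = <$in1>) {
        chomp $line;
        next unless $seen{$line};
        print {$out} "$line\n";
        $count++;
    }
    return $count;
}

# Hypothetical in-memory data standing in for file1, file2 and file3.
my $file1 = "hello, how ru?\nfoo\nbar\n";
my $file2 = "bar\nbaz\nhello, how ru?\n";
my $file3 = '';
open(my $in1, '<', \$file1) or die "Cannot open file1: $!";
open(my $in2, '<', \$file2) or die "Cannot open file2: $!";
open(my $out, '>', \$file3) or die "Cannot open file3: $!";

my $count = write_common_lines($in1, $in2, $out);
close($out) or die "Cannot close file3: $!";
print "$count matching line(s):\n$file3";
```

The hash lookup avoids re-reading file2 for every line of file1. That matters little for files of a few dozen kilobytes, but it also sidesteps having to rewind the second filehandle.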

What you still haven't specified is the following:

(1) What does "lineX matches lineY" mean? Do you mean for example:
- lineX is identical to lineY
- lineX is a regular expression supposed to match lineY
- lineY is a regular expression supposed to match lineX
- lineX is a glob pattern supposed to match lineY
- lineX (with leading and trailing blanks removed) should be equal to lineY
etc.

(2) (only if "to match" doesn't mean "equality"): If we find that lineX matches lineY, should we write lineX or lineY to file3?

(3) If you find two lines in file1 (lineM and lineN) which BOTH match some line in file2, do you want both lineM AND lineN to be put into file3, or only lineM?
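To make question (1) concrete, here is a rough sketch of three of those readings as Perl predicates. The sub names and the sample pairs are invented for this example. The glob reading is left out because it would need a module outside the core distribution.

```perl
use strict;
use warnings;

# Reading 1: lineX is identical to lineY.
sub match_identical {
    my ($x, $y) = @_;
    return $x eq $y ? 1 : 0;
}

# Reading 2: lineX is a regular expression that should match lineY.
# A line that is not a valid pattern would make this die.
sub match_as_regex {
    my ($x, $y) = @_;
    return $y =~ /$x/ ? 1 : 0;
}

# Reading 3: equal once leading and trailing whitespace is removed.
sub match_trimmed {
    my ($x, $y) = @_;
    s/^\s+|\s+$//g for $x, $y;    # trims the local copies only
    return $x eq $y ? 1 : 0;
}

# Hypothetical sample pairs (lineX, lineY).
my @pairs = (['abc', 'abc'], ['a.c', 'abc'], [' abc ', 'abc']);
for my $pair (@pairs) {
    my ($x, $y) = @$pair;
    printf "%-7s vs %-5s identical=%d regex=%d trimmed=%d\n",
        "'$x'", "'$y'",
        match_identical($x, $y), match_as_regex($x, $y), match_trimmed($x, $y);
}
```

The three pairs give different answers under each reading, which is why the definition of "match" has to be settled before the comparison loop is written.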

Finally, one more remark. This is a forum dedicated to solving Perl problems. Its primary purpose is not to discuss how to learn programming in general. This doesn't mean that you won't get help if you are a novice in programming. However, if you are new to programming (and not only to Perl), as your way of asking questions suggests, it would help if you clearly said so, because you might get more helpful answers. For example, when you started the thread, I assumed that you were a programmer who is new to Perl, and my response focused on Perl-specific issues. However, I now get the impression that you don't know much about programming in general, so you would first need help in getting the algorithm right, instead of just having the Perl part explained. If you could clarify this, we can help you in a better way.
Feb 3 '11 #8
numberwhun
3,509 Expert Mod 2GB
Actually, @rovf, I have to stop you right there. This is not just a forum dedicated to solving Perl problems, it is also a learning forum. Everyone needs a place to go to learn and we should not be discouraging them or turning them away. If you have an issue with a user, then please, feel free to send a PM (private message) to me or one of the other moderators on this site, but please, do not turn away our users from getting help (unless it's homework, of course).

Now, @Srijith B, what I do agree with is that you need to take a step back and maybe go through a Perl tutorial. Something that will give you a good grounding in the language. @rovf suggested a module earlier, but I wonder if you looked into it. If you need a good tutorial, send me a Private Message (PM) with your email address and I will gladly send you one. DO NOT put your email in this thread as it will be quickly removed and is against site policy.

Regards,

Jeff
Feb 4 '11 #9
rovf
41
> Actually, @rovf, I have to stop you right there. This is not just a forum dedicated to solving Perl problems, it is also a learning forum.

I absolutely agree. That's why I think we *need* more information. If I know from the outset that someone is a beginner in programming, I would focus my answer on the algorithm (for example, by providing an outline of the solution in pseudocode). If I know someone is a programmer but doesn't know Perl, I can focus on the different ways to solve the problem in Perl.

In addition, I think it *is* necessary that the poster defines his or her problem, because we can then help better.

Don't get me wrong: I don't mind helping learners even on a basic level. I just want to make sure that I understand what level the learner is at, and what problem s/he has to solve.
Feb 4 '11 #10
