I found some code on the internet for comparing two files and showing the differences in their contents. I then tried the same code on two files written in Japanese (kanji). If I save the two Japanese .txt files in ANSI format, it works fine, but if I save them in formats like 'UTF-8', 'Unicode', or 'Unicode big endian', it doesn't show the differences properly: it keeps showing odd symbols instead of the Japanese characters.
I would be glad if someone could suggest a simple way of making it work for all of these formats.
The code I am using is pasted below:
#!C:\perl\bin\perl.exe
# file_compare.pl
# Purpose: compare two files (file_1, file_2) and show differences
use strict;
use warnings;

my $file1 = 'E:\perl_folder\file_1.txt';
my $file2 = 'E:\perl_folder\file_2.txt';

open (FILE1, "< $file1") or die "Can not read file $file1: $! \n";
my @file1_contents = <FILE1>; # read entire contents of file
close (FILE1);

open (FILE2, "< $file2") or die "Can not read file $file2: $! \n";
my @file2_contents = <FILE2>; # read entire contents of file
close (FILE2);

my $length1 = $#file1_contents; # index of the last line in first file
my $length2 = $#file2_contents; # index of the last line in second file

if ($length1 > $length2) {
    # first file contains more lines than second file
    my $counter2 = 0;
    foreach my $line_file1 (@file1_contents) {
        chomp ($line_file1);
        if (defined ($file2_contents[$counter2])) {
            # line exists in second file
            chomp (my $line_file2 = $file2_contents[$counter2]);
            if ($line_file1 ne $line_file2) {
                print "\nline " . ($counter2 + 1) . " \n";
                print "< $line_file1 \n" if ($line_file1 ne "");
                print "--- \n";
                print "> $line_file2 \n\n" if ($line_file2 ne "");
            }
        }
        else {
            # there is no line in second file
            print "\nline " . ($counter2 + 1) . " \n";
            print "< $line_file1 \n" if ($line_file1 ne "");
            print "--- \n";
            print "> \n"; # this line does not exist in file2
        }
        $counter2++; # point to the next line in file2
    }
}
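
One direction that might be worth trying is sketched below. The script above reads and prints raw bytes without saying which encoding they are in, which is probably where the odd symbols come from. The sketch is not part of the code I found. It assumes the files were saved by an editor that writes a byte-order mark (BOM) for 'UTF-8', 'Unicode' (UTF-16 little endian) and 'Unicode big endian', and that "ANSI" on a Japanese Windows system means cp932. The helper name `read_lines_decoded` and the fallback value are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: read a file as raw bytes, guess its encoding from
# the byte-order mark (BOM), and return the decoded lines without newlines.
sub read_lines_decoded {
    my ($path, $fallback) = @_;
    $fallback ||= 'cp932';    # assumption: "ANSI" on a Japanese Windows system

    open(my $fh, '<:raw', $path) or die "Can not read file $path: $!\n";
    my $bytes = do { local $/; <$fh> };    # slurp the whole file as bytes
    close($fh);
    $bytes = '' unless defined $bytes;

    # Strip a leading BOM, if present, and remember what it implies.
    my $encoding = $fallback;
    if    ($bytes =~ s/\A\xEF\xBB\xBF//) { $encoding = 'UTF-8' }
    elsif ($bytes =~ s/\A\xFF\xFE//)     { $encoding = 'UTF-16LE' }
    elsif ($bytes =~ s/\A\xFE\xFF//)     { $encoding = 'UTF-16BE' }

    # Turn bytes into Perl characters, then normalise Windows line endings.
    my $text = decode($encoding, $bytes);
    $text =~ s/\r\n/\n/g;

    my @lines = split /\n/, $text, -1;
    pop @lines if @lines && $lines[-1] eq '';    # drop the piece after the final newline
    return @lines;
}
```

With something like this, `my @file1_contents = read_lines_decoded($file1);` could replace the open/read/close lines, and the `ne` comparison would then work on characters rather than bytes. The output handle would also need an encoding, for example `binmode(STDOUT, ':encoding(UTF-8)');`, and whether the console then displays kanji correctly depends on its code page and font. Files without a BOM would fall back to cp932 here, which is only a guess.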