Hi,
Can anyone help me write a Perl script to compare two CSV files and pick out the records that differ?
Any responses would be appreciated.
Thanks,
Vasuki
Post your current code and someone will probably help.
I tried and got the entire script working. It works fine now. Please find the script below.
$f1 = 'C:\Vasuki\chm_dirx_bud_28.csv';
open FILE1, "$f1" or die "Could not open file chm_dirx_bud_28.csv \n";

$f2 = 'C:\Vasuki\chm_dirx_bud_29.csv';
open FILE2, "$f2" or die "Could not open file chm_dirx_bud_29.csv \n";

$outfile = 'C:\Vasuki\chm_dirx_bud.csv';

my @outlines;

foreach (<FILE1>) {
    $y = 0;
    $outer_text = $_;

    seek(FILE2,0,0);

    foreach (<FILE2>) {
        $inner_text = $_;

        if($outer_text eq $inner_text) {
            $y = 1;
            print "Match Found \n";
            last;
        }
    }

    if($y != 1) {
        print "No Match Found \n";
        push(@outlines, $outer_text);
    }
}

open (OUTFILE, ">$outfile") or die "Cannot open $outfile for writing \n";
print OUTFILE @outlines;
close OUTFILE;

close FILE1;
close FILE2;
This script runs very slowly when there are a large number of records. Can anyone suggest ways to tune it? Thanks in advance.
Well, of course it's slow. You're scanning through a large portion of file2 for every line in file1. This means that your execution time grows with the product of the two file sizes, which is roughly the square of the file size when the files are about the same length.
Ignoring your current algorithm for now though, I would suggest that you look into a CPAN module to do this for you.
cpan Text::Diff
The fact that your files are CSV files is irrelevant for what you're trying to do, so just go back to simple file comparison. I don't know what type of output this module will provide, but I'm almost certain that it can be adapted in such a way as to achieve the results you desire.
- Miller
If the file isn't too large, I would try reading the first file into a hash and just incrementing the hash value for each line while reading the second file. I think Text::Diff might be overkill if it's just a simple comparison of matching lines between the two files. Text::Diff also has the unfortunate behavior of slurping all files into memory, which may or may not be a problem.
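A minimal sketch of the hash idea, for illustration only. It is a variant of what is described above: it loads the second file into the hash and then reports the lines of the first file that never appear there, which is what the original script writes to its output file. The sub name lines_only_in_first and the command-line usage are hypothetical, and it assumes the second file fits in memory. Line endings are stripped before comparing, so a final line without a trailing newline still matches.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the lines of $file_a that do not occur
# anywhere in $file_b. Each file is read exactly once.
sub lines_only_in_first {
    my ($file_a, $file_b) = @_;

    # Pass 1: remember every line of the second file as a hash key.
    my %seen;
    open my $fh_b, '<', $file_b or die "Could not open $file_b: $!\n";
    while (my $line = <$fh_b>) {
        $line =~ s/\r?\n\z//;    # drop the line ending (LF or CRLF)
        $seen{$line}++;
    }
    close $fh_b;

    # Pass 2: keep the lines of the first file that were never seen.
    my @unmatched;
    open my $fh_a, '<', $file_a or die "Could not open $file_a: $!\n";
    while (my $line = <$fh_a>) {
        $line =~ s/\r?\n\z//;
        push @unmatched, $line unless exists $seen{$line};
    }
    close $fh_a;

    return @unmatched;
}

# Assumed usage: perl compare.pl first.csv second.csv > differences.csv
if (@ARGV == 2) {
    print "$_\n" for lines_only_in_first(@ARGV);
}
```

Because a hash lookup takes roughly constant time, the total work grows with the combined size of the two files rather than with their product.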
The easiest way is to use something that is already made.
Try using diff. It is a Unix utility and is designed for this sort of work.
Of course it will not work if the records are not in the same order. In which case, you would have to go back to perl.
Adrian
Rethinking this, if the key is at the beginning of the line, you could sort and then use diff.
Adrian
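If diff is not at hand, the sort-then-compare idea can be sketched in Perl as well. This is only an illustration under stated assumptions: the sub name sorted_diff is hypothetical, records are compared as whole strings, and both lists fit in memory. After sorting, the two lists are walked in step, so each record is examined once.

```perl
use strict;
use warnings;

# Hypothetical helper: sort both record lists, then walk them together.
# Returns two array refs: records found only in the first list and
# records found only in the second list.
sub sorted_diff {
    my ($first, $second) = @_;
    my @left  = sort @$first;
    my @right = sort @$second;

    my (@only_left, @only_right);
    my ($i, $j) = (0, 0);
    while ($i < @left && $j < @right) {
        my $order = $left[$i] cmp $right[$j];
        if    ($order < 0) { push @only_left,  $left[$i++];  }  # missing from second
        elsif ($order > 0) { push @only_right, $right[$j++]; }  # missing from first
        else               { $i++; $j++; }                      # present in both
    }

    # Whatever remains in either list has no counterpart in the other.
    push @only_left,  @left[$i .. $#left];
    push @only_right, @right[$j .. $#right];

    return (\@only_left, \@only_right);
}
```

Called with the lines of each file, for example sorted_diff(\@lines1, \@lines2), it reports differences in both directions regardless of the original record order.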
Why are you assuming Unix? Looks like Windows to me.
$f1 = 'C:\Vasuki\chm_dirx_bud_28.csv';
I'm not assuming Unix. There are GNU ports of Unix utilities all over the place.
Adrian
True enough
(filler for message too short)