Compare Two csv files using perl

Vasuki Masilamani
18 New Member
Hi,

Can anyone help me write a Perl script that compares two CSV files and picks out the records that differ?

Any responses would be appreciated.

Thanks,
Vasuki
May 17 '07 #1
KevinADC
4,059 Recognized Expert Specialist
Post your current code and someone will probably help.
May 17 '07 #2
Vasuki Masilamani
18 New Member
I kept trying and got the whole script written. It works fine now. Please find the script below.

    use strict;
    use warnings;

    my $f1 = 'C:\Vasuki\chm_dirx_bud_28.csv';
    my $f2 = 'C:\Vasuki\chm_dirx_bud_29.csv';
    my $outfile = 'C:\Vasuki\chm_dirx_bud.csv';

    open my $file1, '<', $f1 or die "Could not open $f1: $!\n";
    open my $file2, '<', $f2 or die "Could not open $f2: $!\n";

    my @outlines;

    # For every record in the first file, rescan the second file for an identical line.
    while (my $outer_text = <$file1>) {
        my $found = 0;

        # Rewind the second file before each scan.
        seek($file2, 0, 0);

        while (my $inner_text = <$file2>) {
            if ($outer_text eq $inner_text) {
                $found = 1;
                print "Match Found \n";
                last;
            }
        }

        # Keep the records that have no identical line in the second file.
        if (!$found) {
            print "No Match Found \n";
            push @outlines, $outer_text;
        }
    }

    open my $out, '>', $outfile or die "Cannot open $outfile for writing: $!\n";
    print {$out} @outlines;
    close $out;

    close $file1;
    close $file2;

This script runs very slowly when there is a large number of records. Can anyone suggest ways to tune it? Thanks in advance.
May 17 '07 #3
miller
1,089 Recognized Expert Top Contributor
Well, of course it's slow. You're scanning through a large portion of file2 for every line in file1. This means that your execution time grows with the product of the two file sizes, roughly the square of the file size when the files are similar in length.

Ignoring your current algorithm for now though, I would suggest that you look into a cpan module to do this for you.

cpan Text::Diff


The fact that your files are CSV files is irrelevant to what you're trying to do, so just treat this as a plain file comparison. I don't know what type of output this module will provide, but I'm almost certain that it can be adapted to achieve the results you desire.

- Miller
May 17 '07 #4
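A minimal sketch of the Text::Diff suggestion in post #4, assuming the module has been installed from CPAN (it is not part of core Perl). The sample records are invented for illustration and are passed as string references so the example runs without any files; file names can be passed to diff() in the same positions.

```perl
use strict;
use warnings;
use Text::Diff;    # CPAN module; install with: cpan Text::Diff

# Invented sample data standing in for the two CSV files.
my $old = "1,alice,10\n2,bob,20\n3,carol,30\n";
my $new = "1,alice,10\n2,bob,25\n3,carol,30\n";

# diff() accepts file names, or references to strings as used here.
# STYLE selects the report layout; 'Unified' marks removed lines with "-"
# and added lines with "+".
my $report = diff \$old, \$new, { STYLE => 'Unified' };

print $report;
```

Like the diff utility, this compares the inputs line by line in order, so records that merely moved position would also be reported as changes.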
KevinADC
4,059 Recognized Expert Specialist
If the files aren't too large, I would try reading the first file into a hash and then incrementing the matching hash entry while reading the second file. I think Text::Diff might be overkill if it's just a simple comparison of matching lines between the two files. Text::Diff also has the unfortunate behavior of slurping all files into memory, which may or may not be a problem.
May 17 '07 #5
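The hash approach from post #5 could look roughly like the sketch below. The sub name unmatched_lines and the sample records are assumptions made for illustration, and in-memory filehandles stand in for the real CSV files. Unlike the original script, a line that is repeated in the first file is reported only once.

```perl
use strict;
use warnings;

# Return the lines of the first file that never appear in the second.
# unmatched_lines() and its arguments are hypothetical names for this sketch.
sub unmatched_lines {
    my ($fh_first, $fh_second) = @_;

    # Pass 1: load every line of the first file into a hash with a count of 0.
    # @order remembers first-seen order, because hash keys are unordered.
    my (%count, @order);
    while (my $line = <$fh_first>) {
        chomp $line;
        push @order, $line unless exists $count{$line};
        $count{$line} = 0;
    }

    # Pass 2: increment the count of each line that also occurs in the second file.
    while (my $line = <$fh_second>) {
        chomp $line;
        $count{$line}++ if exists $count{$line};
    }

    # Lines whose count is still 0 were never matched.
    return grep { $count{$_} == 0 } @order;
}

# Demo on in-memory "files" (invented data) so the sketch runs without touching disk.
my $old = "1,alice,10\n2,bob,20\n3,carol,30\n";
my $new = "3,carol,30\n1,alice,10\n2,bob,25\n";
open my $fh_old, '<', \$old or die "Cannot open in-memory file: $!";
open my $fh_new, '<', \$new or die "Cannot open in-memory file: $!";

my @changed = unmatched_lines($fh_old, $fh_new);
print "$_\n" for @changed;
```

Each file is read once, so the running time grows with the total number of lines instead of their product, at the cost of holding the first file's lines in memory. Record order does not matter.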
AdrianH
1,251 Recognized Expert Top Contributor
The easiest way is to use something that is already made.

Try using diff. It is a Unix utility and is designed for this sort of work.

Of course, it will not work if the records are not in the same order, in which case you would have to go back to Perl.


Adrian
May 18 '07 #6
AdrianH
1,251 Recognized Expert Top Contributor
Rethinking this: if the key is at the beginning of each line, you could sort both files and then use diff.


Adrian
May 18 '07 #7
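For readers who want to stay in Perl, the sort-then-compare idea from post #7 can be approximated as below. This is a stand-in for running sort and diff, not those utilities themselves: both inputs are sorted and then walked in step. The sub name sorted_diff and the sample records are assumptions made for illustration.

```perl
use strict;
use warnings;

# Sort both lists of lines, then walk them in step.
# Returns two array refs: lines only in the first list, lines only in the second.
# sorted_diff() is a hypothetical name for this sketch.
sub sorted_diff {
    my ($lines_left, $lines_right) = @_;
    my @left  = sort @$lines_left;
    my @right = sort @$lines_right;

    my (@only_left, @only_right);
    my ($i, $j) = (0, 0);
    while ($i < @left && $j < @right) {
        my $cmp = $left[$i] cmp $right[$j];
        if ($cmp == 0) {
            $i++;
            $j++;                                   # same line in both lists
        }
        elsif ($cmp < 0) {
            push @only_left, $left[$i++];           # missing from the second list
        }
        else {
            push @only_right, $right[$j++];         # missing from the first list
        }
    }

    # Whatever remains in either list has no partner in the other.
    push @only_left,  @left[$i .. $#left];
    push @only_right, @right[$j .. $#right];
    return (\@only_left, \@only_right);
}

# Invented sample records, deliberately in different orders.
my @old = ("3,carol,30", "1,alice,10", "2,bob,20");
my @new = ("2,bob,25", "1,alice,10", "3,carol,30");

my ($only_old, $only_new) = sorted_diff(\@old, \@new);
print "Only in first:  $_\n" for @$only_old;
print "Only in second: $_\n" for @$only_new;
```

Because the lines are sorted as plain strings, a key at the start of each line keeps the old and new versions of a changed record next to each other in the walk.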
KevinADC
4,059 Recognized Expert Specialist
Why are you assuming unix? Looks like windows to me.

$f1 = 'C:\Vasuki\chm_dirx_bud_28.csv';
May 18 '07 #8
AdrianH
1,251 Recognized Expert Top Contributor
I'm not assuming Unix. There are GNU ports of Unix utilities all over the place.


Adrian
May 18 '07 #9
KevinADC
4,059 Recognized Expert Specialist
True enough

(filler for message too short)
May 18 '07 #10
