
perl string comparison

89
Hi,
I have one column of strings in the 1st file, and a 2nd file that consists of 5 columns in each line. My basic objective is to find whether each item/line of the 1st file is present in the 3rd column of the 2nd file.
I tried the following logic. It might be a bit of a roundabout way, but as a beginner I am trying as follows.

The column in the 1st file has data like this, for example:

    NS008_456_R0030_3008

The 2nd file's data is as follows:

    +   test   NS008_456_R0030_3008   67   223

My logic is as follows:

I read the 1st file into an array, and for each item I open the second file, scan through each line, and check whether the array element is equal to $v[2] of the second file. The logic seems to work, even though the search takes time.

But I treated

    NS008_456_R0030_3008

as a string literal, and my if statement is as below:

    if ($rawdata[0] eq $v[2]) {
        # do something here
    }

But it does not seem to work. Is there anything wrong with treating the data as a string literal, or is some further manipulation needed for string comparison when I read the file contents into an array? Please let me know. Regards
Jan 12 '09 #1
numberwhun
3,509 Expert Mod 2GB
@lilly07
Can you please post the rest of your code so that we can see how you got to this point? That will give us a better understanding all around.

Regards,

Jeff
Jan 12 '09 #2
KevinADC
4,059 Expert 2GB
If you compare the strings using "eq" they must be an exact match, including spaces, control characters, and upper/lower case of any alpha characters. My guess is that you need to chomp() the records in the first file before comparing to records in the second file, but why make us guess? Post the code. ;)
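
To illustrate the point about exact matches, here is a minimal sketch. The helper name visible() and the sample strings are made up for illustration; they are not the poster's actual data handling. It shows how a trailing newline defeats "eq" and how printing the strings inside delimiters exposes the difference:

```perl
use strict;
use warnings;

# Hypothetical debugging helper: make invisible characters visible.
sub visible {
    my ($s) = @_;
    $s =~ s/\n/\\n/g;
    $s =~ s/\r/\\r/g;
    $s =~ s/\t/\\t/g;
    return "[$s]";
}

my $from_file = "NS008_456_R0030_3008\n";   # as read by <FH>: newline still attached
my $field     = "NS008_456_R0030_3008";     # as produced by split on whitespace

print visible($from_file), " vs ", visible($field), "\n";
print "equal before chomp: ", ($from_file eq $field ? "yes" : "no"), "\n";

chomp $from_file;                            # removes the trailing newline in place
print "equal after chomp:  ", ($from_file eq $field ? "yes" : "no"), "\n";
```

Printing both sides of a failing comparison this way is usually the quickest check for stray whitespace or line endings.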
Jan 12 '09 #3
lilly07
89
Thx, Kevin. I tried chomping the data from the first file, but it is still not working. Please find my code below. It does not report any compilation error. Note that I did not copy and paste it; I retyped it.

    #!/usr/bin/perl
    $first_data = "first.txt";
    open(DAT, $first_data) || die("Could not open file!");
    @search_data = <DAT>;
    $searchSize = scalar(@search_data);

    $second_file = "second.txt";
    for ($count = 0; $count < $searchSize; $count++) {
        open(RF, $second_file) || die("Could not open file!");

        $find_raw = @search_data[$count];
        $find = chomp $find_raw;

        while ($line = <RF>) {
            chomp $line;
            @v = split(/\s+/, $line);

            if ($v[2] eq $find) {
                print "$line \n";
            }
        }

        close RF;
    }
Jan 12 '09 #4
lilly07
89
Actually the program works if I modify the following code

    $find_raw = @search_data[$count];
    $find = chomp $find_raw;

as below:

    $find_raw = @search_data[$count];
    chomp $find_raw;

Is there any trick or shorter way to do this kind of search, as it takes a long time? Thanks.
Jan 13 '09 #5
KevinADC
4,059 Expert 2GB
When you assign the return value of chomp to a scalar, you get the number of characters chomp() removed, not the chomped string. So in your case $find was probably either a 0 or 1.

This line:

$find_raw = @search_data[$count];

should be:

$find_raw = $search_data[$count];

Using @ for a single array element has long been discouraged: it is really a one-element array slice, and perl warns about it when warnings are enabled. Use $ for a single array element and @ for multiple array elements.
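
As a small illustration of the chomp() behaviour described above, the following sketch uses made-up sample data (the array contents are assumptions, not the poster's file) to show what the return value holds and how to use chomp in place:

```perl
use strict;
use warnings;

# Sample records as they would look straight after reading a file.
my @search_data = ("NS008_456_R0030_3008\n", "NS008_456_R0030_3678\n");

# Pitfall: the return value of chomp is a count, not the cleaned string.
my $find_raw = $search_data[0];
my $removed  = chomp $find_raw;
print "return value of chomp: $removed\n";
print "chomped string:        $find_raw\n";

# Fix: call chomp for its side effect and use the variable afterwards.
my $find = $search_data[1];
chomp $find;
print "value to compare with: $find\n";

# chomp also accepts a list, so a whole array can be cleaned in one call.
chomp @search_data;
```

Comparing $v[2] against the count instead of the string explains why the original "eq" test never matched.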
Jan 13 '09 #6
KevinADC
4,059 Expert 2GB
If neither file is too big you can do something like this:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Read all search strings into memory and strip their newlines.
    my $first_data = "first.txt";
    open(DAT, $first_data) or die "Could not open file: $!";
    my @search_data = <DAT>;
    close DAT;
    chomp @search_data;

    # Read the second file only once, line by line.
    my $second_file = "second.txt";
    open(RF, $second_file) or die "Could not open file: $!";
    while (my $line = <RF>) {
        chomp $line;
        foreach my $find (@search_data) {
            my $v = (split(/\s+/, $line))[2];
            if ($v eq $find) {
                print "Found '$find' in second.txt at line number $. : [$line] \n";
                last;   # this line matched; move on to the next line
            }
        }
    }
    close RF;
Jan 13 '09 #7
lilly07
89
Yes Kevin, you are right. Initially the value after chomping was 1, and I worked around that as I described.

My objective is to find all the 1st file entries that are present in the second file and print them, and hence

    last;

may not work in my case. I just wondered whether I am doing this in a roundabout way. Thanks again.
Cheers
Jan 13 '09 #8
KevinADC
4,059 Expert 2GB
Try the code I posted. "last" ends the "foreach" loop after an element in the array is found in the file. It then goes to the next line in the file and searches the entire array again. Now this entire process could probably be sped up considerably using a hash and/or the Memoize module.

Memoize - perldoc.perl.org
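
A minimal sketch of the hash idea mentioned above, under some assumptions: the sub name matching_lines() is hypothetical, the data is held in arrays rather than read from first.txt and second.txt so the example is self-contained, and it splits with ' ' (which ignores leading whitespace) rather than /\s+/. Loading the search strings into a hash replaces the inner loop over the array with a single lookup per line:

```perl
use strict;
use warnings;

# Return the lines whose third whitespace-separated field is one of the keys.
sub matching_lines {
    my ($keys, $lines) = @_;
    my %wanted = map { $_ => 1 } @$keys;     # one hash entry per search string
    my @found;
    for my $line (@$lines) {
        my $third = (split ' ', $line)[2];
        next unless defined $third;           # skip blank or short lines
        push @found, $line if exists $wanted{$third};
    }
    return @found;
}

my @search_data = ('NS008_456_R0030_3008', 'NS008_456_R0030_3690');
my @second_file = (
    '+   test   NS008_456_R0030_3008   67   223',
    '+   ghi    NS008_456_R0030_3678   17   678',
    '+   ghi    NS008_456_R0030_3690   17   280',
    '+   ghi    NS008_456_R0030_3690   15   267',
);

print "$_\n" for matching_lines(\@search_data, \@second_file);
```

With real files, the same hash would be filled from the first file once, and the second file would then be read line by line in a single pass.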
Jan 13 '09 #9
lilly07
89
Hi Kevin, Thanks for your help.

Basically my data file (the second file) looks as follows:

    +   test   NS008_456_R0030_3008   67   223
    +   ghi    NS008_456_R0030_3678   17   678
    +   ggl    NS008_456_R0030_3678   17   270
    +   ghi    NS008_456_R0030_3672   17   209
    +   ghi    NS008_456_R0030_3690   17   280
    +   ghi    NS008_456_R0030_3690   15   267

My objective is to find the records that have multiple entries in the 3rd column. For example, in the above case,

    +   ghi    NS008_456_R0030_3678   17   678
    +   ggl    NS008_456_R0030_3678   17   270

and

    +   ghi    NS008_456_R0030_3690   17   280
    +   ghi    NS008_456_R0030_3690   15   267

are the candidate records I am interested in.

And my logic is as follows:
I added the 3rd column and the 4th to a hash and checked all the values in the hash. If the value in the hash is more than 1, then I collect those as multiple records and store the 3rd column

    NS008_456_R0030_3690

in a file ($first_file). Then I search for the records in the second_file as I explained before. But this takes an enormous amount of time, as the file is huge and the search is therefore extensive.
Is there any way to pick these up from second_file directly? I need the records that show multiple entries in the 3rd column. Please let me know.
I also tried your code and the script is still executing, so I thought I should explain the whole picture to you.
Thanks.
Jan 13 '09 #10
lilly07
89
I would like to know whether any shell script would do?
Jan 13 '09 #11
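
One possible way to pick the repeated records straight from the second file, as asked in #10, is sketched below. It is an illustration under stated assumptions, not a reply from the thread: the sub name duplicated_third_column() is hypothetical, the lines are held in an array instead of being read from second.txt, and fields are assumed to be whitespace-separated. The first pass counts each 3rd-column value, and the second pass keeps only the lines whose value was counted more than once:

```perl
use strict;
use warnings;

# Return the lines whose third field occurs more than once in the input.
sub duplicated_third_column {
    my (@lines) = @_;
    my %count;
    # Pass 1: count how often each third-column value occurs.
    for my $line (@lines) {
        my $key = (split ' ', $line)[2];
        $count{$key}++ if defined $key;
    }
    # Pass 2: keep only the lines whose value was seen more than once.
    return grep {
        my $key = (split ' ', $_)[2];
        defined $key && $count{$key} > 1;
    } @lines;
}

my @records = (
    '+   test   NS008_456_R0030_3008   67   223',
    '+   ghi    NS008_456_R0030_3678   17   678',
    '+   ggl    NS008_456_R0030_3678   17   270',
    '+   ghi    NS008_456_R0030_3672   17   209',
    '+   ghi    NS008_456_R0030_3690   17   280',
    '+   ghi    NS008_456_R0030_3690   15   267',
);

print "$_\n" for duplicated_third_column(@records);
```

For a file too large to hold in memory, the same two passes could be done by reading the file twice: count on the first read, print on the second. That keeps only the counts in memory and avoids the intermediate first file altogether.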
