Searching for Keywords in Files in a Dir

154 New Member
script to search for keywords in files in a dir

Basically, the script works from a file containing part numbers.

I have a dir with files. I want to search the files line by line, and if a line contains a part number, print that line to an output file.

I wrote this script, but it does not find matching products.
I added an else statement, and then only the else output gets printed; when I commented out the else statement, nothing was printed to the output file.

This is the script I am working with. Hopefully someone can show me where I am going wrong. Thanks in advance.

    #!/usr/bin/perl
    $| = 1;

    use File::Copy;
    # Contains a list of product numbers
    my $prodfile = "d:\\prod\\automated_Script\\testfile\\prod.txt";

    #Contains a bunch of flat files where i would take a product number and check line in each file of that dir and if they have a line containing the prod number it will print to the output file.
    my $srcdir = "d:\\prod\\automated_Script\\archivetest";
    my $finalfile = "d:\\prod\\automated_Script\\testfile\\output.txt";

    ##### Listing files in archive dir #####
    opendir(DIR, $srcdir) or die "Can't open $srcdir: $!";
    @files = grep {!/^\.+$/} readdir(DIR);
    close(DIR);
    print @files;

    if (!@files) {
        print "No files in dir.\n\n";
        last;
    }

    ##### Reading Line from File #####
    open(PRODFILE, "< $prodfile") or die (" Could not open Product File $!");
    while ($product = <PRODFILE>) {
        chomp($product);
        foreach $files (@files) {
            print " This product is : $product \n";

            ### Open file and searching for matching products ###

            open(INFILE2,"< $srcdir\\$files") || die(" Could not open INFILE2 File!");
            $findprod=<INFILE2>;
            chomp($findprod);

            open(OUTFILE2, ">> $finalfile") || die ("Could not OUTFILE2 open file $!");

            while (<INFILE2>) {
                if ($product =~ /$findprod/) {
                    print OUTFILE2 "$findprod";
                #} else {
                #    print OUTFILE2 "Product not found \n";
                }

                close OUTFILE2;
            }
            close INFILE2;

            print "Product search completed";
        }
    }
    close PRODFILE;

Feb 27 '07 #1
KevinADC
4,059 Recognized Expert Specialist
Is this a one-line file?

    open(INFILE2,"< $srcdir\\$files") || die(" Could not open INFILE2 File!");
    $findprod=<INFILE2>;
    chomp($findprod);

Feb 27 '07 #2
jonathan184
154 New Member
No, it has multiple lines.
Feb 27 '07 #3
KevinADC
4,059 Recognized Expert Specialist
Try this:

    #!/usr/bin/perl
    use strict;
    use warnings;
    $| = 1;

    use File::Copy;
    # Contains a list of product numbers
    my $prodfile = "d:\\prod\\automated_Script\\testfile\\prod.txt";

    # Contains a bunch of flat files. For each product number, every line of
    # every file in this dir is checked, and lines containing the product
    # number are printed to the output file.
    my $srcdir = "d:\\prod\\automated_Script\\archivetest";
    my $finalfile = "d:\\prod\\automated_Script\\testfile\\output.txt";

    ##### Listing files in archive dir #####
    opendir(DIR, $srcdir) or die "Can't open $srcdir: $!";
    my @files = grep {!/^\.+$/} readdir(DIR);
    close(DIR);
    print @files;

    if (!@files) {
        print "No files in dir.\n\n";
    }

    ##### Reading Line from File #####
    open(PRODFILE, "< $prodfile") or die (" Could not open Product File $!");
    while (my $product = <PRODFILE>) {
        chomp($product);
        foreach my $files (@files) {
            print " This product is : $product \n";

            ### Open file and searching for matching products ###

            open(INFILE2,"< $srcdir\\$files") || die(" Could not open INFILE2 File!");
            open(OUTFILE2, ">> $finalfile") || die ("Could not OUTFILE2 open file $!");

            while (my $findprod = <INFILE2>) {
                chomp($findprod);
                if ($product =~ /$findprod/) {
                    print OUTFILE2 "$findprod\n";
                #} else {
                #    print OUTFILE2 "Product not found \n";
                }
            }
        }
        print "Product search completed for: $product\n ";
    }
    close OUTFILE2;
    close INFILE2;
    close PRODFILE;

You can use forward slashes in your path statements even if you are using Windows:

    my $prodfile = 'd:/prod/automated_Script/testfile/prod.txt';
Feb 27 '07 #4
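
On the directory listing in the scripts above: a handle opened with opendir is normally closed with closedir rather than close, and the filter !/^\.+$/ only drops names made of dots, so any subdirectory stays in the list. A minimal sketch of one way to list only plain files, using forward slashes in the paths as suggested in post #4. The helper name list_files and the throwaway demo directory are assumptions for illustration, not part of the original script.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: return the names of the plain files in $dir, sorted.
sub list_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't open $dir: $!";
    # -f keeps plain files only, which also drops "." and ".."
    my @files = sort grep { -f "$dir/$_" } readdir($dh);
    closedir($dh);    # directory handles are closed with closedir, not close
    return @files;
}

# Demo on a throwaway directory so the sketch runs anywhere.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(b.txt a.txt)) {
    open(my $fh, '>', "$dir/$name") or die "Can't create $dir/$name: $!";
    print $fh "sample\n";
    close($fh);
}
print join(', ', list_files($dir)), "\n";    # prints "a.txt, b.txt"
```

In the real script, $dir would be the archive directory, for example 'd:/prod/automated_Script/archivetest'.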
jonathan184
154 New Member
The script ran, but it did not print the lines containing the matching part numbers.
When I searched the output file, the part numbers could not be found in any of the lines. The output contains products other than the ones I was searching for.

Pretty strange.
Feb 27 '07 #5
jonathan184
154 New Member
Hi Kevin, it looks like the comparison is not working. I added a print statement, and the lines it prints have nothing to do with the product number I am trying to find. Is there something wrong with my comparison?

    while (my $findprod = <INFILE2>) {
        chomp($findprod);
        if ($product =~ /$findprod/) {
            print OUTFILE2 "$findprod \n";
            print "******* Match Found: $findprod ********* \n\n"
        }
    }
Feb 27 '07 #6
KevinADC
4,059 Recognized Expert Specialist
Your best bet is to add as many "print" lines as you can to "watch" the data and script as it's being processed. This often results in catching the error in the logic of code that runs but does not do what you expect it to.
Feb 27 '07 #7
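
One way to follow the print-debugging advice in post #7 is to print values in a form that makes invisible characters visible, since a stray carriage return or trailing space can make two strings that look identical compare as different. A minimal sketch using the core Data::Dumper module; the helper name visible and the sample value are made up for illustration.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: return $value in Data::Dumper's double-quoted form,
# where control characters (such as a stray "\r") are written as escapes.
sub visible {
    my ($value) = @_;
    local $Data::Dumper::Useqq = 1;    # double-quoted output with escapes
    local $Data::Dumper::Terse = 1;    # omit the "$VAR1 = " prefix
    my $text = Dumper($value);
    chomp $text;
    return $text;
}

# Made-up value: a product number with a trailing carriage return,
# as might be left behind after chomp on a file with CRLF line endings.
my $product = "3CRTP0400C96C-ME\r";
print "product = ", visible($product), "\n";
```

Printing both sides of a comparison this way, right before the if, shows exactly what is being compared.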
jonathan184
154 New Member
After adding the print statements, I think I have narrowed it down to the if statement not working: it just prints all the lines from the extract files instead of doing what the if statement says. Is there something I am doing wrong?

    while (my $findprod = <INFILE2>) {
        chomp($findprod);
        if ($product =~ /$findprod/) {
            print OUTFILE2 "$findprod \n";
            print "******* Match Found: $findprod ********* \n\n"}
    }
Feb 27 '07 #8
KevinADC
4,059 Recognized Expert Specialist
Do this:

    while (my $findprod = <INFILE2>) {
        chomp($findprod);
        print qq~if ($product =~ /$findprod/)\n~;
        if ($product =~ /$findprod/) {
            print OUTFILE2 "$findprod \n";
            print "******* Match Found: $findprod ********* \n\n"}
    }

and post some of the lines that get printed.
Feb 27 '07 #9
jonathan184
154 New Member
Hi Kevin


OK, this is crazy. I ran the code you gave; I just put the print statement in quotes.

    qq~if (3CRTP0400C96C-ME =~ /EA|0223-000-151-1|PKG,END CAP,PROTEUS||||0.000|0.000|KG|0.000||0.000|0.000|0.000||90|N/A||| | ||/)
    ~******* Match Found: EA|0223-000-151-1|PKG,END CAP,PROTEUS||||0.000|0.000|KG|0.000||0.000|0.000|0.000||90|N/A||| | || ********* 

This is one of the lines it is printing from the extract file.

The number in that line is 0223-000-151-1, and as you can see, 3CRTP0400C96C-ME does not match 0223-000-151-1. For some reason it comes up as a match. The only line that did not match was the header; every other line showed as a match, which is wrong. Only about two lines in the extract should match, but all lines come up as above.

I created a test extract file with dummy lines such as the 3CRTP0400C96C-ME ones, without using the original extract format, and the script worked, as you can see below. I am puzzled as to what in the format of the original file could be causing it to print more than the correct records. Any ideas would be greatly appreciated.

    qq~if (3CRTP0400CF96C-KR =~ /This is the products header/)
    ~qq~if (3CRTP0400CF96C-KR =~ /EA|3CRTP0400C96C-ME|vfdggfgdfg|gfdgdfgf/)
    ~qq~if (3CRTP0400CF96C-KR =~ /EA|34RTP0400CF96C-KR|fdgfgdfgdfgdfgfd|fdgdfgdfgf/)
    ~qq~if (3CRTP0400CF96C-KR =~ /EA|3SRTP0400C96C-ME|vfdggfgdfg|gfdgdfgf/)
    ~qq~if (3CRTP0400CF96C-KR =~ /EA|3LRTP0400CF96C-KR|fdgfgdfgdfgdfgfd|fdgdfgdfgf/)
    ~qq~if (3CRTP0400CF96C-KR =~ /EA|3URTP0400C96C-ME|vfdggfgdfg|gfdgdfgf/)
    ~qq~if (3CRTP0400CF96C-KR =~ /EA|3HRTP0400CF96C-KR|fdgfgdfgdfgdfgfd|fdgdfgdfgf/)
    ~qq~if (3CRTP0400CF96C-KR =~ /EA|3YRTP0400C96C-ME|vfdggfgdfg|gfdgdfgf/)
    ~qq~if (3CRTP0400CF96C-KR =~ /EA|3HRTP0400CF96C-KR|fdgfgdfgdfgdfgfd|fdgdfgdfgf/)
    ~qq~if (3CRTP0400CF96C-KR =~ /EA|3DRTP0400C96C-ME|vfdggfgdfg|gfdgdfgf/)
    ~qq~if (3CRTP0400CF96C-KR =~ /EA|3CRTP0400CF96C-KR|fdgfgdfgdfgdfgfd|fdgdfgdfgf/)
    ~******* Match Found: EA|3CRTP0400CF96C-KR|fdgfgdfgdfgdfgfd|fdgdfgdfgf ********* 
Feb 28 '07 #10
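
The output in post #10 is consistent with two things in the test $product =~ /$findprod/. First, the operands are reversed: the line from the extract file is being used as the pattern and the product number as the text to search. Second, inside a pattern the | character means "or", so a line with empty fields (||) contains an empty alternative, and an empty alternative matches any string. That would explain why every pipe-delimited line of the real extract matched, why the header (no pipes) did not, and why the dummy file (no empty fields) only matched when the product number was one of the fields. A minimal sketch of a literal search, assuming the goal is "does this line contain this product number"; the helper name line_has_product is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 when $line contains $product as literal
# text, 0 otherwise. \Q...\E quotes any regex metacharacters in $product.
sub line_has_product {
    my ($line, $product) = @_;
    return $line =~ /\Q$product\E/ ? 1 : 0;
}

# A shortened copy of the extract line shown in post #10.
my $line = 'EA|0223-000-151-1|PKG,END CAP,PROTEUS||||0.000|0.000|KG';

# The reversed test from the thread: the line is the pattern, and its
# empty alternative (from "||") matches any product string.
print "reversed test matches\n" if '3CRTP0400C96C-ME' =~ /$line/;

# Literal search of the line for each product number.
print "0223-000-151-1: ",   line_has_product($line, '0223-000-151-1'),   "\n";  # prints 1
print "3CRTP0400C96C-ME: ", line_has_product($line, '3CRTP0400C96C-ME'), "\n";  # prints 0
```

In the loop from the thread, the equivalent change would be to test $findprod =~ /\Q$product\E/ instead of $product =~ /$findprod/. The built-in index($findprod, $product) >= 0 does the same literal search without a regex.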
