Perl: scan a file and count the matches for each regular expression

123 100+
I am using the code below to search "search.txt" for any string that matches the regular expressions listed in the file "keywords.txt", but it fails. If I put plain keywords in keywords.txt, like

keyword1
keyword2

then it works and displays the line number and the keyword, but if I use regular expressions instead, it doesn't work. Please help.

What I actually want is to run the Perl script and count the number of matches for each regular expression, but even the simple version above doesn't work.

thanks.

  1.  
  2. open my $keywords,    '<', 'c:\perl\csv\keywords.txt' or die "Can't open keywords: $!";
  3. open my $search_file, '<', 'c:\perl\csv\search.txt'   or die "Can't open search file: $!";
  4.  
  5. my $keyword_or = join '|', map {chomp;qr/\Q$_\E/} <$keywords>;
  6. #my $regex = qr|\b($keyword_or)\b|;
  7. #my $regex = qr/$keyword_or/;
  8. my $regex = qr|($keyword_or)|;
  9.  
  10.  
  11. while (<$search_file>)
  12. {
  13.     while (/$regex/g)
  14.     {
  15.         print "$.: $1\n";
  16.     }
  17. }
Nov 6 '14 #1
nithinpes
410 Expert 256MB
If I understood your question correctly, I'm guessing the entries in your keywords.txt look something like this:
keywords.txt
  keyword\d
  \d\d\d
  p\w{7}\d
If that's the case, line 5 of your code (the $keyword_or assignment) should be
  my $keyword_or = join '|', map {chomp;qr/$_/} <$keywords>;
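
To see why the two forms of that line behave differently, here is a minimal sketch that compiles the same pattern both ways and tests it against one line of text. The sample line and the variable names are made up for illustration; only the pattern keyword\d comes from the example above.

```perl
use strict;
use warnings;

my $line    = 'keyword7 and 123';   # sample text, invented for this sketch
my $pattern = 'keyword\d';          # one line as it might appear in keywords.txt

# With \Q...\E the metacharacters are escaped, so the regex looks for the
# literal characters k-e-y-w-o-r-d-\-d.
my $literal = qr/\Q$pattern\E/;

# Without \Q the pattern is compiled unchanged, so \d matches any digit.
my $as_is = qr/$pattern/;

print "with \\Q:    ", ($line =~ $literal ? "match" : "no match"), "\n";
print "without \\Q: ", ($line =~ $as_is   ? "match" : "no match"), "\n";

# quotemeta() applies the same escaping as \Q and shows what is compiled.
print "escaped pattern: ", quotemeta($pattern), "\n";
```

With this sample line only the version without \Q should report a match, because the text contains "keyword7" but not the literal string "keyword\d".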

\Q escapes every regex metacharacter that follows it (up to \E), so each line of the file is matched as literal text. That is probably not what you want if you are trying to use the regexes from keywords.txt as they are.
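
For the per-regex counting mentioned in the question, one possible approach is to keep the patterns separate instead of joining them with '|', and to count each one on its own. The sketch below is only an illustration under some assumptions: count_matches is a hypothetical helper, and the patterns and lines are held in arrays here, whereas in the real script they would be read from keywords.txt and search.txt (with chomp applied to each pattern).

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash reference mapping each pattern string
# to the number of times it matches across all of the given lines.
sub count_matches {
    my ($patterns, $lines) = @_;

    my (%count, %compiled);
    for my $p (@$patterns) {
        $count{$p}    = 0;
        $compiled{$p} = qr/$p/;   # no \Q, so \d, \w etc. keep their meaning
    }

    for my $line (@$lines) {
        for my $p (@$patterns) {
            my $re = $compiled{$p};
            # In scalar context /g finds one match per iteration and resumes
            # where the previous match ended, so the loop runs once per match.
            $count{$p}++ while $line =~ /$re/g;
        }
    }
    return \%count;
}

# Sample data, invented for this sketch; the patterns are those from the reply.
my @patterns = ('keyword\d', '\d\d\d', 'p\w{7}\d');
my @lines    = (
    'keyword1 and keyword2',
    'order 123456',
    'password1 keyword9',
);

my $count = count_matches(\@patterns, \@lines);
printf "%-12s %d\n", $_, $count->{$_} for @patterns;
```

Counting each pattern separately means that overlapping patterns are each counted in full. With a single joined alternation, only the first alternative that matches at a given position would be reported.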
Feb 26 '15 #2
