filter something from file or string

hi,
I'm dealing with a problem in which I have to scan through a short text and print out all IP addresses within it.

Assume I have a text stored in a file named 'test', and I wrote a small Perl program which reads all lines of the text and prints out any IP address:

    open FILE, "test" or die $!;

    print "IP addresses found:\n";
    while ($line = <FILE>) {
        if ($line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/) {
            print $1;
            print "\n";
        }
    }

There is a problem, however: the Perl program prints only one IP per line. That's OK if all the IP addresses on a line are the same, but if there are different IPs on one line, we only get the first one and miss the others.

example:
"Hello this is the msg from 192.168.1.1 blah blah... 192.168.1.2
This is the second text from 192.168.1.3 blah blah..."

will print out only 192.168.1.1 and 192.168.1.3, and we'll miss the 192.168.1.2

any suggestion :-?

thanx :D
Sep 15 '08 #1
Ganon11
3,652 Expert 2GB
Try the /g modifier on your regexp to progressively search. You'll have to fiddle with your code a bit, but it shouldn't be too bad.
Sep 15 '08 #2
Try the /g modifier on your regexp to progressively search. You'll have to fiddle with your code a bit, but it shouldn't be too bad.
err, how is it? Perl is pretty new to me, though people say it's just like PHP. Is there a more optimal way to read through the whole text and pick up what we want?
Sep 15 '08 #3
numberwhun
3,509 Expert Mod 2GB
err, how is it? Perl is pretty new to me, though people say it's just like PHP. Is there a more optimal way to read through the whole text and pick up what we want?

What Ganon is referring to is taking your regular expression and making it look like this:

    $line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g

The /g modifier makes the match global: instead of stopping at the first match, the regex can go on to find every match in the string. Any Perl regex tutorial should explain the characters that come after the regex; they are simply options that apply to the regex as a whole.

Also, it looks like you are cycling through the file fine, line by line. It's a slightly slower way to do it, but slurping the entire file into an array or something similar tends to be RAM-heavy for larger files.

Regards,

Jeff
Sep 15 '08 #4
It still does not work: the new regular expression only prints the first thing that matches the pattern, then it ignores the rest of that line. Does anybody know how to force it to read until the end of the line and print everything that matches?

Is there any way to read the whole text but not line by line? Can we read word by word instead? Does a string work in this case?
Sep 17 '08 #5
nithinpes
410 Expert 256MB
It still does not work: the new regular expression only prints the first thing that matches the pattern, then it ignores the rest of that line. Does anybody know how to force it to read until the end of the line and print everything that matches?

Is there any way to read the whole text but not line by line? Does a string work in this case?
That should work. But remember that although /g lets the pattern match all occurrences, an if() statement runs its body only once. You should be using a while() loop.

    while ($line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g) {
        print $1;
        print "\n";
    }

- Nithin
Sep 17 '08 #6
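
As an alternative to the while loop in the post above, a /g match in list context returns every match at once. The following is only a minimal sketch: it reuses the loose pattern from this thread (which also accepts out-of-range values such as 999.999.999.999), and find_ips is a hypothetical helper name that does not appear in the original posts.

```perl
use strict;
use warnings;

# Hypothetical helper: return every dotted-quad-looking substring in a string.
# In list context, a /g match returns all of the captures in one go.
sub find_ips {
    my ($text) = @_;
    my @ips = $text =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g;
    return @ips;
}

# Sample line taken from the original question.
my $line = "Hello this is the msg from 192.168.1.1 blah blah... 192.168.1.2";
print "$_\n" for find_ips($line);
```

This form is handy when the matches are needed as a list (for example, to remove duplicates later) rather than printed one by one.
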
nithinpes
410 Expert 256MB
Is there any way to read the whole text but not line by line? Can we read word by word instead? Does a string work in this case?
In order to read the whole text from the file at once, you need to undefine the input record separator ($/), which is "\n" by default, before reading from the file.
    undef $/;         ## undefine it (an empty string "" would mean paragraph mode)
    while (<FILE>) {  ## reads the entire file in one go, into $_
        print "$1\n" while /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g;
    }
Sep 17 '08 #7
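
Changing $/ globally affects every later read in the program, so a common way to limit the effect is to localize it inside a small sub. This is a sketch under assumptions, not the only way to do it: slurp_file is a hypothetical helper name, and the file name 'test' is the one from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file into one string.
# "local $/" leaves the record separator undefined only until this sub
# returns, so line-by-line reads elsewhere keep working as usual.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;
    my $text = <$fh>;
    close $fh;
    return $text;
}

# 'test' is the file name used in the original question; skip if it is absent.
if (-e 'test') {
    my $text = slurp_file('test');
    print "$1\n" while $text =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g;
}
```

As noted earlier in the thread, reading the whole file at once can use a lot of memory for large files, so the line-by-line loop remains a reasonable default.
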
It works like a charm when a while loop is used instead of if. I see the problem: 'if' quits as soon as the first match is found, yet 'while' continues until it reaches the end of the line :D

Actually, I see there's no need to change $/; it still works well.

thx so much!
Sep 17 '08 #8
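
Some background on why the while loop behaves this way: in scalar context, a /g match resumes from the offset where the previous successful match on the same string ended, and Perl exposes that offset through pos(). An if runs the match once per line, while a while loop keeps re-running it until it fails. A small sketch to make this visible; match_positions is a hypothetical helper name used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: collect [match, offset-after-match] pairs to show how
# a scalar-context /g match picks up from pos() on each iteration.
sub match_positions {
    my ($text) = @_;
    my @steps;
    while ($text =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g) {
        push @steps, [$1, pos($text)];
    }
    return @steps;
}

for my $step (match_positions("from 192.168.1.1 to 192.168.1.2")) {
    printf "%s ends at offset %d\n", @$step;
}
```
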
nithinpes
410 Expert 256MB
Actually, I see there's no need to change $/; it still works well.

thx so much!
Yes, of course. That was just to answer your second question.
Sep 17 '08 #9
errr, another problem has come up:
Assume that some IP addresses appear more than once in the text: ip1, ip1, ip2, ip2, ip2, ip3... How can we filter them so that each IP address is printed only once (i.e. eliminate the duplicated IPs)?
Sep 17 '08 #10
asedt
125 100+
errr, another problem has come up:
Assume that some IP addresses appear more than once in the text: ip1, ip1, ip2, ip2, ip2, ip3... How can we filter them so that each IP address is printed only once (i.e. eliminate the duplicated IPs)?
I know nothing about Perl, but I think the solution is to put them in a list or array and then do something like:

http://www.perlmonks.org/?node_id=604547

http://perldoc.perl.org/perlfaq4.htm...st-or-array%3f
Sep 17 '08 #11
numberwhun
3,509 Expert Mod 2GB
asedt is correct. See the Perlmonks link and you will see the following code in one of the responses there:

    my @array  = ("abc","def","abc","ghi","ghi","abc","jklm","abc","def");
    my %hash   = map { $_ => 1 } @array;
    my @unique = keys %hash;

So, you would simply create an array that contains all of the IPs, then build the hash from that array. Since the keys in a hash have to be unique, a repeated IP just overwrites its existing key instead of creating a new one. In the end, print the list of keys (note that keys returns them in no particular order).

Regards,

Jeff
Sep 17 '08 #12
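
If the IPs should come out in the order they first appear in the text, a %seen hash combined with grep is a common variation on the hash approach above. This is a minimal sketch rather than the code from the linked Perlmonks node; unique_in_order is a hypothetical helper name, and the sample text is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the distinct items of a list in first-seen order.
# $seen{$_}++ is 0 (false) the first time an item appears, so grep keeps it;
# on every later appearance the count is already non-zero and it is dropped.
sub unique_in_order {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Made-up sample text with repeated addresses.
my $text = "ip 192.168.1.1 again 192.168.1.1 then 192.168.1.2 and 192.168.1.1";
my @ips  = $text =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g;
print "$_\n" for unique_in_order(@ips);
```
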
