Bytes | Developer Community

filter something from file or string

hi,
I'm dealing with a problem in which I have to scan through a short text and print out all ip addresses within it.

Assume I have a text stored in a file named 'test', and I write a small Perl program which reads all lines of the text and prints out any IP address:

    open FILE, "test" or die $!;

    print "IP addresses found:\n";
    while ($line = <FILE>){

            if ($line=~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/) {
                    print $1;
                    print "\n";
            }
    }

There is a problem, however: the Perl program prints only one IP per line. That's fine if all the IP addresses on a line are the same, but if a line contains different IPs, we get only the first one and miss the others.

example:
"Hello this is the msg from 192.168.1.1 blah blah... 192.168.1.2
This is the second text from 192.168.1.3 blah blah..."

will print out only 192.168.1.1 and 192.168.1.3, and we'll miss the 192.168.1.2

any suggestion :-?

thanx :D
Sep 15 '08 #1
11 Replies
Ganon11
Expert 2GB
Try the /g modifier on your regexp to progressively search. You'll have to fiddle with your code a bit, but it shouldn't be too bad.
Sep 15 '08 #2
err, how do I do that? Perl is pretty new to me, though people say it's just like PHP. Is there a more efficient way to read through the whole text and pick out what we want?
Sep 15 '08 #3
numberwhun
Expert Mod 2GB
What Ganon is referring to is taking your regular expression and making it look like this:

    $line=~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g

The /g makes the match global, so it can find every occurrence in the string rather than just the first. If you look at any Perl regex tutorial, it should explain the characters after the regex; they are simply modifiers (options) that apply to the regex as a whole.
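
As a minimal sketch of what /g changes: in list context, a /g match returns every captured substring at once. The sub name find_ips and the sample line are made up for illustration; they are not from the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: return every dotted-quad-looking substring in a string.
sub find_ips {
    my ($text) = @_;
    # In list context, /g returns the captures from all matches, not just the first.
    my @ips = $text =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g;
    return @ips;
}

my $line = "Hello this is the msg from 192.168.1.1 blah blah... 192.168.1.2";
print "$_\n" for find_ips($line);
```

Note that the pattern only checks the shape of an address (four groups of one to three digits), so it would also accept something like 999.1.1.1.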

Also, it looks like you are cycling through the file fine, line by line. It's a slightly slower way to do it, but slurping the entire file into an array or something tends to be RAM-heavy for larger files.

Regards,

Jeff
Sep 15 '08 #4
it still does not work: the new regular expression prints only the first thing that matches the pattern, then ignores the rest of that line. Does anybody know how to force it to read until the end of the line and print everything that matches?

is there any way to read the whole text but not line by line? can we read word by word instead? does a string work in this case?
Sep 17 '08 #5
nithinpes
Expert 256MB
That should work. But remember that although /g lets the pattern match every occurrence, an if() block is not a loop: it evaluates the match once and stops after the first hit. In scalar context, each evaluation of a /g match finds the next occurrence, so you should use a while() loop instead:

    while ($line=~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g) {
            print $1;
            print "\n";
    }

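
For anyone who wants to try this end to end, here is a self-contained sketch of the same while-with-/g idea. It assumes an in-memory filehandle standing in for the 'test' file, and the sub name ips_in_handle is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every match, line by line, from an open filehandle.
sub ips_in_handle {
    my ($fh) = @_;
    my @found;
    while (my $line = <$fh>) {
        # A scalar-context /g match resumes where the previous match on $line
        # ended, so the inner loop visits each match on the line in turn.
        while ($line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g) {
            push @found, $1;
        }
    }
    return @found;
}

# An in-memory filehandle stands in for the 'test' file from the question.
my $sample = "Hello this is the msg from 192.168.1.1 blah blah... 192.168.1.2\n"
           . "This is the second text from 192.168.1.3 blah blah...\n";
open my $fh, '<', \$sample or die $!;
print "IP addresses found:\n";
print "$_\n" for ips_in_handle($fh);
close $fh;
```

Collecting the matches into an array first, rather than printing inside the loop, makes it easier to post-process the list later.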
- Nithin
Sep 17 '08 #6
nithinpes
Expert 256MB
In order to read the whole text from the file at once, you need to undefine the input record separator ($/), which is "\n" by default, before reading from the file.

    undef $/;           ## undefine the input record separator
    while (<FILE>) {    ## runs once; $_ now holds the entire file
            print "$1\n" while /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g;
    }

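
A slightly more defensive sketch of the same slurp idea, assuming a lexical filehandle and a hypothetical helper named slurp_handle. Using local limits the change to $/ to the enclosing block. Note that it is undefining $/ that makes one read return the whole file; assigning the empty string selects paragraph mode instead.

```perl
use strict;
use warnings;

# Hypothetical helper: read everything left on a filehandle into one string.
sub slurp_handle {
    my ($fh) = @_;
    # local restores the previous value of $/ when this sub returns,
    # so other reads in the program are unaffected.
    local $/ = undef;
    my $text = <$fh>;
    return defined $text ? $text : '';
}

# An in-memory filehandle stands in for a real file here.
my $sample = "from 192.168.1.1 and 192.168.1.2\nthen 192.168.1.3\n";
open my $fh, '<', \$sample or die $!;
my $text = slurp_handle($fh);
close $fh;

# One list-context match over the whole text finds addresses on every line.
my @ips = $text =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g;
print "$_\n" for @ips;
```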
Sep 17 '08 #7
it works like a charm when a while loop is used instead of if. I see the problem: 'if' tests the match only once and stops after the first hit, whereas 'while' keeps matching until it reaches the end of the line :D

actually I see there's no need to change $/; it still works well without it;

thx so much!
Sep 17 '08 #8
nithinpes
Expert 256MB
Yes, of course. That was just to answer your second question.
Sep 17 '08 #9
errr, another problem came up:
assume that some IP addresses appear more than once in the text: ip1, ip1, ip2, ip2, ip2, ip3... How can we filter them so that each IP address is printed only once (i.e. eliminate the duplicates)?
Sep 17 '08 #10
100+
I know nothing about Perl, but I think the solution is to put them in a list or array and then do something like:

http://www.perlmonks.org/?node_id=604547

http://perldoc.perl.org/perlfaq4.htm...st-or-array%3f
Sep 17 '08 #11
numberwhun
Expert Mod 2GB
asedt is correct. See the Perlmonks link and you will see the following code in one of the responses there:

    my @array  = ("abc","def","abc","ghi","ghi","abc","jklm","abc","def");
    my %hash   = map { $_ => 1 } @array;
    my @unique = keys %hash;

So, you would simply create an array that contains all of the IPs, then build the hash from that array. Since the keys in a hash have to be unique, a repeated IP will not create a new key. In the end, print the list of keys (note that keys returns them in no particular order).
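
If the order in which the IPs first appear matters, a common variation is to filter with grep and a "seen" hash instead of calling keys. This is only a sketch; the sub name unique_in_order and the sample text are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: drop repeats while keeping first-seen order.
sub unique_in_order {
    my %seen;
    # $seen{$_}++ is 0 (false) the first time a value appears, so only the
    # first occurrence of each value passes the filter.
    return grep { !$seen{$_}++ } @_;
}

my $text = "192.168.1.1 192.168.1.1 192.168.1.2 192.168.1.2 192.168.1.3";
my @ips  = $text =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g;
print "$_\n" for unique_in_order(@ips);
```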

Regards,

Jeff
Sep 17 '08 #12
