string match/discard issue

P: 2
Hello,

I am very new to Perl. I have been trying to modify an existing script that is used to read tcpdump files and then eventually generate Top Talker stats from it. If my tcpdump file from System-A looks like this, it works fine:
    02:02:28.578439 IP 10.56.252.32.ssh > 10.56.252.66.3722: tcp 116
    02:02:28.544248 IP 10.56.252.66.3722 > 10.56.252.32.ssh: tcp 0
    02:02:28.550995 IP 10.69.252.2.hsrp > 224.0.0.2.hsrp: UDP, length 20

On System-B and newer systems, the tcpdump output has two extra columns appended, as shown here:
    12:50:23.671341 10.62.58.112.ssh > 10.62.119.87.1397: tcp 68 (DF) [tos 0x10]
    12:50:23.672143 10.62.119.87.1397 > 10.62.58.112.ssh: tcp 0 (DF) tail-type  255 len 255 f5 type 255 len 255
    12:50:23.696168 0:50:56:47:7c:64 Broadcast 74:
    12:50:23.696404 127.2.0.1.4401 > 127.2.0.2.32838: tcp 264 (DF)
These two extra fields are the (DF) flag and another optional field. I need to find out how to modify my script so that it reads everything it normally does but, once it hits that first "(", ignores the rest of the line. Here is my current code snippet, which breaks on the newer tcpdump files. The line in question is the one that begins "if ($rest =~ /^IP\s+(\d+\.".

    open(INFILE,"$infile");
    while (<INFILE>) {
        $line=$_;
        chop($line);
        if ($line=~ /^(\d+)\:(\d+)\:(\d+)\.\d+\s+(.*)/) {
            $newhour=$1; $min=$2; $sec=$3; $rest=$4;
            # check time; if hour goes DOWN, must be a new day
            if ($hour > $newhour) { $newhour=$newhour+24; }
            $hour=$newhour;
            if ($time eq 0) { $inittime = 60*$hour+$min; }
            $time=int((60*$hour+$min)/$timeblock);
            #new event
            # look for IP addresses
            if ($rest =~ /^IP\s+(\d+\.\d+\.\d+\.\d+)\.(\S+)\s+\>\s+(\d+\.\d+\.\d+\.\d+)\.(\S+)\:\s+(\S+)\s+(\S+)/) {
                $srcip=$1;
                $srcport=$2;
                $dstip=$3;
                $dstport=$4;
                $proto=$5;

                # find replies
                if ($con{"$proto,$dstip,$srcip,$srcport"} > 0) {
                    if (($dstport =~ /^\d+$/) && ($dstport > 1023)) {
                        if (($srcport =~ /\D/) || ($dstport < 1150)) {
                        # is probably reply packet; switch src & dst
                            #print "$srcport, $dstport\n" unless ($srcport =~ /netbios/);
                            $temp=$srcport;
                            $srcport=$dstport;
                            $dstport=$temp;
                            $temp=$srcip;
                            $srcip=$dstip;
                            $dstip=$temp;
                        }
                    }
                }
                #cleaning up formatting
Thank you,
Feb 28 '08 #1
3 Replies


KevinADC
Expert 2.5K+
P: 4,059
I'm confused: is the code you posted meant to parse the old format or the new one? Your regex requires 'IP' at the start of the match, but there is no 'IP' in the new format.

Keep in mind that parentheses are used to capture patterns in memory in a regexp, so if you have to match literal parentheses in the search string you must escape them:

    /\( foo\) /;
If you want to match and capture the parentheses:

    /(\( foo\) )/;
    print $1;
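
As a minimal sketch of that escaping rule applied to this thread's data: the sample line is copied from the System-B capture in the question, and the variable names `$flag` and `$trimmed` are made up for illustration. An escaped `\(` matches a literal parenthesis, while an unescaped pair captures whatever sits between them.

```perl
use strict;
use warnings;

# Sample line copied from the System-B capture in the question.
my $line = '12:50:23.671341 10.62.58.112.ssh > 10.62.119.87.1397: tcp 68 (DF) [tos 0x10]';

# \( and \) match literal parentheses; the unescaped inner pair captures the text between them.
my ($flag) = $line =~ /\((\w+)\)/;

# Delete everything from the first literal "(" onward, plus the whitespace in front of it.
(my $trimmed = $line) =~ s/\s*\(.*//;

print "flag:    $flag\n";       # flag:    DF
print "trimmed: $trimmed\n";    # trimmed: 12:50:23.671341 10.62.58.112.ssh > 10.62.119.87.1397: tcp 68
```

The substitution is one possible way to drop the trailing fields before the line reaches the existing regexes; it is an illustration, not part of the original script.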
Feb 28 '08 #2

nithinpes
Expert 100+
P: 410
I can see that 'IP' is missing from the tcpdump output on System-B. Make 'IP' optional in your pattern with the ? quantifier. Other than that, the regex you have used will meet your objective. Note that the extra group shifts the capture numbers by one:
    if ($rest =~ /^(IP\s+)?(\d+\.\d+\.\d+\.\d+)\.(\S+)\s+\>\s+(\d+\.\d+\.\d+\.\d+)\.(\S+)\:\s+(\S+)\s+(\S+)/)  {
                $srcip=$2;
                $srcport=$3;
                $dstip=$4;
                $dstport=$5;
                $proto=$6;

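A possible variation, sketched here under assumptions rather than taken from the original script: a non-capturing group `(?:IP\s+)?` makes 'IP' optional without renumbering the captures, so `$1` to `$5` keep their original meaning. The helper name `parse_rest` and the hash keys are hypothetical; the sample strings are the text after the timestamp from the captures in the question.

```perl
use strict;
use warnings;

# parse_rest is a hypothetical helper; it takes the part of a line after the timestamp.
sub parse_rest {
    my ($rest) = @_;
    # (?:IP\s+)? makes the leading "IP " optional without using up a capture group.
    if ($rest =~ /^(?:IP\s+)?(\d+\.\d+\.\d+\.\d+)\.(\S+)\s+>\s+(\d+\.\d+\.\d+\.\d+)\.(\S+):\s+(\S+)\s+(\S+)/) {
        return { srcip => $1, srcport => $2, dstip => $3, dstport => $4, proto => $5 };
    }
    return;    # no match, e.g. the "Broadcast" line
}

my @samples = (
    'IP 10.56.252.32.ssh > 10.56.252.66.3722: tcp 116',               # System-A format
    '10.62.58.112.ssh > 10.62.119.87.1397: tcp 68 (DF) [tos 0x10]',   # System-B format
    '0:50:56:47:7c:64 Broadcast 74:',                                 # not an IP packet
);

for my $rest (@samples) {
    my $p = parse_rest($rest);
    if ($p) {
        print "$p->{srcip}.$p->{srcport} -> $p->{dstip}.$p->{dstport} ($p->{proto})\n";
    } else {
        print "skipped: $rest\n";
    }
}
```

Because the pattern has no end anchor, the trailing "(DF)" and "[tos 0x10]" fields are simply left unmatched.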
If you deliberately want to ignore everything after the first "(", modifying your first regex as below would do that:
    if ($line=~ /(\d+)\:(\d+)\:(\d+)\.\d+\s+([^(]*)/) {  ## match anything, not "("

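To see what the negated character class captures, here is a small self-contained sketch. The wrapper `split_line` is a hypothetical name, the `^` anchor is kept from the original script, and the trailing-whitespace trim is an addition that the reply does not include.

```perl
use strict;
use warnings;

# split_line is a hypothetical wrapper around the timestamp regex.
sub split_line {
    my ($line) = @_;
    # [^(]* matches a run of characters that are not "(", so the capture stops before "(DF)".
    return unless $line =~ /^(\d+):(\d+):(\d+)\.\d+\s+([^(]*)/;
    my ($hour, $min, $sec, $rest) = ($1, $2, $3, $4);
    $rest =~ s/\s+$//;    # assumption: also trim the space left in front of the dropped "("
    return ($hour, $min, $sec, $rest);
}

# Sample line copied from the System-B capture in the question.
my @fields = split_line('12:50:23.672143 10.62.119.87.1397 > 10.62.58.112.ssh: tcp 0 (DF) tail-type  255 len 255 f5 type 255 len 255');
print join(' | ', @fields), "\n";    # 12 | 50 | 23 | 10.62.119.87.1397 > 10.62.58.112.ssh: tcp 0
```

Lines without any "(" pass through unchanged, since `[^(]*` then runs to the end of the line.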
Feb 28 '08 #3

P: 2
I actually didn't even notice that the word "IP" was missing on the new captures. Either way I appreciate the help.

Thank you
Feb 28 '08 #4
