Problem with regex in the script ???

P: 65
Hi All,

Thanks in advance.

Problem statement: I need to display a report of the raw stats files, but the script also counts stats files that have an .xls extension.

I am reading the raw stats file names from a directory, comparing each name against the stats types, pushing matches onto one array per type, and then counting the entries in each array.

The problem is that if the directory contains an Excel copy of a raw stats file, the script reads that file name too and increments the count, which is not correct.

All I need is to print the count of raw stats files only (files that don't have any file type extension).

How can I achieve this?

I know the problem is with the regex "if" statements in the script.

The directory content looks like this:
prstat-Ls-20080118-1800
prstat-Ls-20080118-1900
prstat-Ls-20080118-1900.xls
prstat-Lvs-20080118-1800
prstat-Lvs-20080118-1900


The output should look like this:
There are totally 4 files present in C:\Performance_svap\INPUT_FILES\
There are totally 2 prstat_Ls files present (even though there is a prstat*.xls file, the script should discard it)
There are totally 2 prstat_Lvs files present


But I get this output:
There are totally 5 files present in C:\Performance_svap\INPUT_FILES\
There are totally 3 prstat_Ls files present (the script counted the *.xls file too)
There are totally 2 prstat_Lvs files present

Given how the script is written, the output I am getting is correct, but I want the script to count only the raw files and not the *.xls file.
How can I do this?

Can anyone please help me with this?

The script goes like this:
my $dir = "C:\\Performance_svap\\INPUT_FILES\\";

&raw_stats_report;

#######function to display the raw stats type report ###################

sub raw_stats_report(){
    my $f;
    opendir(D, "$dir") || die "Can't opendir $dir: $!\n";
    my @list = readdir(D);
    closedir(D);
    foreach my $f (@list){
        if ($f =~ /sar-d/){
            push (@sar_d,$f);
        }
        if ($f =~ /sar-g/){
            push (@sar_g,$f);
        }
        if ($f =~ /sar-u/){
            push (@sar_u,$f);
        }
        if ($f =~ /sar-r/){
            push (@sar_r,$f);
        }
        if ($f =~ /vmstat/){
            push (@vmstat,$f);
        }
        if ($f =~ /mpstat/){
            push (@mpstat,$f);
        }
        if ($f =~ /prstat-mLV/){
            push (@prstat_mLV,$f);
        }
        if ($f =~ /prstat-Ls/){
            push (@prstat_Ls,$f);
        }
        if ($f =~ /prstat-Lvs/){
            push (@prstat_Lvs,$f);
        }
        if ($f =~ /netstat/){
            push (@netstat,$f);
        }
        if ($f =~ /iostat/){
            push (@iostat,$f);
        }
    }
    chdir $dir;
    @files =<*>;
    print "############# Raw_Stats_Report ################\n \n ";
    print "There are totally ",scalar(@files)," files present in $dir \n";
    print "\n \n There are totally ",scalar(@iostat)," iostat files present \n";
    print "\n There are totally ",scalar(@netstat)," netstat files present \n";
    print "\n There are totally ",scalar(@prstat_Ls)," prstat_Ls files present \n";
    print "\n There are totally ",scalar(@prstat_Lvs)," prstat_Lvs files present \n";
    print "\n There are totally ",scalar(@sar_d)," sar-d files present \n";
    print "\n There are totally ",scalar(@sar_g)," sar-g files present \n";
    print "\n There are totally ",scalar(@sar_u)," sar-u files present \n";
    print "\n There are totally ",scalar(@sar_r)," sar-r files present \n";
    print "\n There are totally ",scalar(@prstat_mLV)," prstat_mLV files present \n";
    print "\n There are totally ",scalar(@mpstat),"  mpstat files present \n";
    print "\n There are totally ",scalar(@vmstat)," vmstat files present \n\n";
    print "################################################\n \n ";
    print "Do you want me to continue ? Y | N \n";
    chomp(my $pick = <STDIN>);
    if($pick =~/y/){
        print "Process execution will continue !!! \n";
    }
    else{
        print "Process execution stopped !!! \n";die;
    }
}

Regards,
Vijayarl
Oct 13 '08 #1
10 Replies


KevinADC
Expert 2.5K+
P: 4,059
        foreach my $f (@list){
            next if ($f =~ /\.xls$/i); #<-- skip files with a .xls extension

As a side note, a hash would be better to count the files instead of using arrays for each different filetype.
Oct 13 '08 #2

P: 65
Thanks Kevin!

I would like to implement a hash as you suggested, but I am still learning Perl. Jeff also told me to look into the hash approach in another post of mine.

I would be very grateful if you could show me how to use a hash to count the files. One example or a step-by-step explanation would be sufficient.

I would like to try it myself, so just tell me how to go ahead. I hope you won't mind.

Anyway, thanks once again.

Regards,
Vijayarl
Oct 13 '08 #3

KevinADC
Expert 2.5K+
P: 4,059
Here is a general rewrite of your code, including a hash to store the counts and some other changes, notably if/elsif/elsif instead of if/if/if. When a string or line can satisfy only one of the conditions, don't use if/if/if, because Perl has to evaluate every 'if' condition even after it finds the only true one. if/elsif lets Perl stop evaluating conditions as soon as the first true one is found. If you ever needed a fall-through condition, you would add an 'else' at the end to catch the exceptions. In your case I can't see any need for one.

I also cleaned up your regexps, mostly to show you ways of writing them to check for patterns. You were really checking for substrings rather than patterns, in which case index() would have been a better choice than regular expressions. But since we want to capture the value of the pattern match and use it as the hash key, I went with pure regexps instead of index() and predefined keys, which is also a good way to do what you are doing.

Untested code:

use strict;
use warnings;
my $dir = 'C:/Performance_svap/INPUT_FILES';#<-- windows supports forward slashes in directory paths
my %count = (); #<-- hash to store counts
raw_stats_report($dir);#<-- call the function with $dir as its argument

#######function to display the raw stats type report ###################

sub raw_stats_report {
    my $dir = $_[0] or die "No start directory defined\n";
    chdir($dir) or die "Can't chdir to $dir: $!\n";
    opendir(D, '.') or die "Can't opendir $dir: $!\n";
    my @list = readdir(D);
    closedir(D);
    foreach my $f (@list){
        next if ($f =~ /\.xls$/i);
        if ($f =~ /(sar-[dgur])/){
            $count{$1}++;
        }
        elsif ($f =~ /(vmstat|mpstat)/){
            $count{$1}++;
        }
        elsif ($f =~ /(prstat-(?:mLV|Ls|Lvs))/){
            $count{$1}++;
        }
        elsif ($f =~ /((?:net|io)stat)/){
            $count{$1}++;
        }
    }
    print "############# Raw_Stats_Report ################\n \n ";
    print 'There are total ',scalar(@list)," files present in $dir\n";
    foreach my $c (sort keys %count) {
        print "There are total $count{$c} $c files present in $dir\n";
    }
    print "################################################\n   \n ";
    print "Do you want me to continue ? Y | N \n";
    chomp(my $pick = <STDIN>);
    if ($pick =~ /y/i){
        print "Process execution will continue !!! \n";
    }
    else{
        print "Process execution stopped !!! \n";
        exit(0); # <-- use exit instead of 'die' to end a script early
    }
}

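For comparison, the index() approach with predefined keys mentioned above might look roughly like the sketch below. It is only an illustration: the @types list and the helper name count_by_type are assumptions, not part of the original script, and the function takes a list of names instead of reading a directory so it can be tried on its own.

```perl
use strict;
use warnings;

# Assumed list of stats types, taken from the patterns used in the thread.
my @types = qw(sar-d sar-g sar-u sar-r vmstat mpstat
               prstat-mLV prstat-Ls prstat-Lvs netstat iostat);

# Hypothetical helper: count names per type using plain substring checks.
sub count_by_type {
    my @names = @_;
    my %count;
    NAME: foreach my $name (@names) {
        next if $name =~ /\.xls$/i;           # skip Excel copies
        foreach my $type (@types) {
            if (index($name, $type) != -1) {  # substring found
                $count{$type}++;
                next NAME;                    # stop at the first matching type
            }
        }
    }
    return %count;
}

my %count = count_by_type(qw(
    prstat-Ls-20080118-1800 prstat-Ls-20080118-1900
    prstat-Ls-20080118-1900.xls
    prstat-Lvs-20080118-1800 prstat-Lvs-20080118-1900
));
print "$_: $count{$_}\n" for sort keys %count;
```

With the five sample names from the first post, this prints a count of 2 for prstat-Ls and 2 for prstat-Lvs.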
Oct 13 '08 #4

KevinADC
Expert 2.5K+
P: 4,059
Another thing to keep in mind is that the scalar value of @list:

scalar(@list)

will include '.' and '..' in the count/length of the array. If you don't want those you can subtract 2 from the length:

scalar(@list)-2
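
An alternative to subtracting 2 is to filter the list before counting. The sketch below keeps only regular files, which drops '.', '..' and any subdirectories in one step. The helper name plain_files is hypothetical, and the example reads the current directory just to have something to run against.

```perl
use strict;
use warnings;

# Hypothetical helper: names of regular files in $dir.
# '.' and '..' are directories, so the -f test filters them out.
sub plain_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't opendir $dir: $!\n";
    my @files = grep { -f "$dir/$_" } readdir($dh);
    closedir($dh);
    return @files;
}

my @files = plain_files('.');
print 'There are total ', scalar(@files), " files present\n";
```

Note that the path is built as "$dir/$_" because readdir returns bare names, not full paths.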
Oct 13 '08 #5

P: 65
Thanks Kevin!

It worked successfully. Thank you very much.

One last question: can we skip any file type extension instead of only skipping *.xls?

I changed this part of the script from

next if ($f =~ /\.xls$/i);

to

next if ($f =~ /\.*$/i);

but didn't get the desired result.

My thinking was: the script skips files ending in .xls, but what if the directory contains files with other extensions, like this:

prstat-Ls-20080118-1800
prstat-Ls-20080118-1800.doc
prstat-Ls-20080118-1900
prstat-Ls-20080118-1900.txt
prstat-Ls-20080118-1900.xls
prstat-Lvs-20080118-1800
prstat-Lvs-20080118-1900
prstat-Lvs-20080118-1900.txt

The script still counts all the *.txt and *.doc entries. So instead of telling the script to skip only *.xls, can't we skip every file that has a file type extension?

I know it's a lot to ask; I just want to know whether this can be done.

Anyway, thanks for your patient reply. You are just too good. There is lots and lots left to learn from you people :-)

Regards,
Vijayarl
Oct 14 '08 #6

nithinpes
Expert 100+
P: 410
Quoting Vijayarl: "One last question: can we skip any file type extension instead of only skipping *.xls? I changed this part of the script from

next if ($f =~ /\.xls$/i);

to

next if ($f =~ /\.*$/i);"

This does not work because the character before the '*' quantifier is \. (a literal '.'). The pattern therefore means zero or more occurrences of '.' at the end of the name, so it matches every file name, with or without an extension.

Quoting Vijayarl: "The script still counts all the *.txt and *.doc entries. So instead of telling the script to skip only *.xls, can't we skip every file that has a file type extension?"

You may use:

next if ($f =~ /\..+$/i);

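To see the difference between the two patterns, they can be tried against a few of the sample names from the question. This is a small sketch; the helper name matching is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: the subset of @names that the regex matches.
sub matching {
    my ($re, @names) = @_;
    return grep { $_ =~ $re } @names;
}

my @names = qw(prstat-Ls-20080118-1800 prstat-Ls-20080118-1800.doc
               prstat-Ls-20080118-1900.xls prstat-Lvs-20080118-1900.txt);

my @all  = matching(qr/\.*$/,  @names);  # zero dots is allowed, so every name matches
my @exts = matching(qr/\..+$/, @names);  # needs a dot plus at least one more character

print scalar(@all),  ' names match /\.*$/',  "\n";
print scalar(@exts), ' names match /\..+$/', "\n";
```

The first pattern matches all four names, while the second matches only the three that have an extension, so only those would be skipped.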
Oct 14 '08 #7

P: 65
Thanks nithinpes !!!!

It worked fine...

Regards,
Vijayarl
Oct 14 '08 #8

P: 65
Another one:

We are printing the total number of files present in the directory with:

print 'There are total ',scalar(@list)-2," files present in $dir\n \n";

This line gives the correct value, but I also want to print the total number of raw stats files only. The line above prints the count of all the files.

I changed the script like this:

    print "############# Raw_Stats_Report ################\n \n ";
    print 'There are total ',scalar(@list)-2," files present in $dir\n \n";
    print "#############***********################\n \n";
    my @rawstat;
    foreach my $c (sort keys %count) {
        @rawstat = %count;
        print "\n There are total $count{$c} $c files present in $dir\n \n ";
    }
    print 'There are totally ',scalar(@rawstat)," raw stats files present in $dir\n \n";

But I get an incorrect count: it reports only 18 raw stats files even though there are 36. Is this the correct way to do it?

@rawstat = %count; ## added inside the for loop

print 'There are totally ',scalar(@rawstat)," raw stats files present in $dir\n \n"; ## kept outside the for loop

The output looks like this:
C:\Performance_svap\misc>perl chkempty.pl
############# Raw_Stats_Report ################

 There are total 46 files present in C:/Performance_svap/INPUT_FILES

#############***********################


 There are total 4 iostat files present in C:/Performance_svap/INPUT_FILES


 There are total 4 netstat files present in C:/Performance_svap/INPUT_FILES


 There are total 4 prstat-Ls files present in C:/Performance_svap/INPUT_FILES


 There are total 4 prstat-Lvs files present in C:/Performance_svap/INPUT_FILES


 There are total 4 sar-d files present in C:/Performance_svap/INPUT_FILES


 There are total 4 sar-g files present in C:/Performance_svap/INPUT_FILES


 There are total 4 sar-r files present in C:/Performance_svap/INPUT_FILES


 There are total 4 sar-u files present in C:/Performance_svap/INPUT_FILES


 There are total 4 vmstat files present in C:/Performance_svap/INPUT_FILES

 There are totally 18 raw stats files present in C:/Performance_svap/INPUT_FILES


################################################

Regards,
Vijayarl
Oct 14 '08 #9

nithinpes
Expert 100+
P: 410
In your script, the %count hash has the type of file (vmstat, ...) as its keys and the counts as its values. Hence, to get the total count of files, you should sum up the values of all the keys in the hash.

my $raw_count = 0;
foreach (keys %count) {
    $raw_count += $count{$_}; ## sum up the values
}

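The 18 in the earlier output is consistent with how Perl flattens a hash in list context: assigning %count to an array yields key, value, key, value, ..., so nine types give 18 elements. The sketch below shows that effect and one way to total the values with sum from the core List::Util module. The sample counts and the helper name total_count are made up for illustration.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Made-up counts, shaped like %count in the script above.
my %count = (iostat => 4, netstat => 4, vmstat => 4);

# A hash assigned to an array is flattened into key/value pairs.
my @flat = %count;
print scalar(@flat), " elements in the flattened list\n";  # two per key

# Hypothetical helper: total of the hash values.
# The leading 0 keeps the result defined when the hash is empty.
sub total_count {
    my (%h) = @_;
    return sum(0, values %h);
}

print total_count(%count), " raw stats files in total\n";
```

With the three sample keys the flattened list has 6 elements, while the total of the values is 12.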
Oct 14 '08 #10

P: 65
Thanks nithinpes !!!!

It worked fine...thanks once again...

Working Code:
use strict;
use warnings;
my $dir = 'C:/Performance_svap/INPUT_FILES';#<-- windows supports forward slashes in directory paths
my %count = (); #<-- hash to store counts
raw_stats_report($dir);#<-- call the function with $dir as its argument

#######function to display the raw stats type report ###################
sub raw_stats_report {
    my $dir = $_[0] or die "No start directory defined\n";
    chdir($dir) or die "Can't chdir to $dir: $!\n";
    opendir(D, '.') or die "Can't opendir $dir: $!\n";
    my @list = readdir(D);
    closedir(D);
    foreach my $f (@list){
        next if ($f =~ /\..+$/i); #<-- skip any name that has an extension
        if ($f =~ /(sar-[dgur])/){
            $count{$1}++;
        }
        elsif ($f =~ /(vmstat|mpstat)/){
            $count{$1}++;
        }
        elsif ($f =~ /(prstat-(?:mLV|Ls|Lvs))/){
            $count{$1}++;
        }
        elsif ($f =~ /((?:net|io)stat)/){
            $count{$1}++;
        }
    }
    print "############# Raw_Stats_Report ################\n \n ";
    print 'There are total ',scalar(@list)-2," files present in $dir\n \n";
    print "#############***********################\n \n";

    foreach my $c (sort keys %count) {
        print "\n There are total $count{$c} $c files present in $dir\n \n ";
    }
    my $raw_count=0;
    foreach (keys %count){
        $raw_count+= $count{$_} ; ## sum up the values
    }
    print 'There are totally ',scalar($raw_count)," raw stats files present in $dir\n \n";
    print "################################################\n \n ";
    print "Do you want me to continue ? Y | N \n";
    chomp(my $pick = <STDIN>);
    if ($pick =~ /y/i){
        print "Process execution will continue !!! \n";
    }
    else{
        print "Process execution stopped !!! \n";
        exit(0); # <-- use exit instead of 'die' to end a script early
    }
} #<-- closing brace of the sub, missing from the posted listing

Regards,
Vijayarl
Oct 14 '08 #11
