Problem with regex in the script ???

Hi All,

Thanks in advance.

Problem statement: I need to display a report of the raw stats files, but the script also counts stats files that have a .xls extension.

I am reading the raw stats input files from a directory, comparing the stats type, pushing each file name onto an array, and later counting the occurrences.

The problem is that if the directory contains a raw stats Excel file, the script reads that file too and increments the count, which is not correct.

All I need is to print the count of raw stats files only (files which don't have any file type extension).

How can I achieve this?

I know the problem is with the regex "if" statements in the script.

The directory content looks like this:
prstat-Ls-20080118-1800
prstat-Ls-20080118-1900
prstat-Ls-20080118-1900.xls
prstat-Lvs-20080118-1800
prstat-Lvs-20080118-1900


Output should look like this:
There are totally 4 files present in C:\Performance_svap\INPUT_FILES\
There are totally 2 prstat_Ls files present (even though we have prstat*.xls file..script should discard this file)
There are totally 2 prstat_Lvs files present


But i get this output:
There are totally 5 files present in C:\Performance_svap\INPUT_FILES\
There are totally 3 prstat_Ls files present (even though we have prstat*.xls file..script counted *.xls file too)
There are totally 2 prstat_Lvs files present

As per the script, the output I am getting is correct, but I want the script to count only the raw files and not the *.xls ones.

Can anyone please help me with this?

The script goes like this:
my $dir = "C:\\Performance_svap\\INPUT_FILES\\";

&raw_stats_report;

#######function to display the raw stats type report ###################

sub raw_stats_report(){
    my $f;
    opendir(D, "$dir") || die "Can't opendir $dir: $!\n";
    my @list = readdir(D);
    closedir(D);
    foreach my $f (@list){
        if ($f =~ /sar-d/){
            push (@sar_d,$f);
        }
        if ($f =~ /sar-g/){
            push (@sar_g,$f);
        }
        if ($f =~ /sar-u/){
            push (@sar_u,$f);
        }
        if ($f =~ /sar-r/){
            push (@sar_r,$f);
        }
        if ($f =~ /vmstat/){
            push (@vmstat,$f);
        }
        if ($f =~ /mpstat/){
            push (@mpstat,$f);
        }
        if ($f =~ /prstat-mLV/){
            push (@prstat_mLV,$f);
        }
        if ($f =~ /prstat-Ls/){
            push (@prstat_Ls,$f);
        }
        if ($f =~ /prstat-Lvs/){
            push (@prstat_Lvs,$f);
        }
        if ($f =~ /netstat/){
            push (@netstat,$f);
        }
        if ($f =~ /iostat/){
            push (@iostat,$f);
        }
    }
    chdir $dir;
    @files =<*>;
    print "############# Raw_Stats_Report ################\n \n ";
    print "There are totally ",scalar(@files)," files present in $dir \n";
    print "\n \n There are totally ",scalar(@iostat)," iostat files present \n";
    print "\n There are totally ",scalar(@netstat)," netstat files present \n";
    print "\n There are totally ",scalar(@prstat_Ls)," prstat_Ls files present \n";
    print "\n There are totally ",scalar(@prstat_Lvs)," prstat_Lvs files present \n";
    print "\n There are totally ",scalar(@sar_d)," sar-d files present \n";
    print "\n There are totally ",scalar(@sar_g)," sar-g files present \n";
    print "\n There are totally ",scalar(@sar_u)," sar-u files present \n";
    print "\n There are totally ",scalar(@sar_r)," sar-r files present \n";
    print "\n There are totally ",scalar(@prstat_mLV)," prstat_mLV files present \n";
    print "\n There are totally ",scalar(@mpstat),"  mpstat files present \n";
    print "\n There are totally ",scalar(@vmstat)," vmstat files present \n\n";
    print "################################################\n \n ";
    print "Do you want me to continue ? Y | N \n";
    chomp(my $pick = <STDIN>);
    if($pick =~/y/){
        print "Process execution will continue !!! \n";
    }
    else{
        print "Process execution stopped !!! \n";
        die;
    }
}

Regards,
Vijayarl
Oct 13 '08 #1
KevinADC
foreach my $f (@list){
    next if ($f =~ /\.xls$/i); #<-- skip files with a .xls extension

As a side note, a hash would be better to count the files instead of using arrays for each different filetype.
Oct 13 '08 #2
Thanks Kevin!

I would like to implement a hash as you said, but I am still learning Perl. Jeff also told me to go through the hash method in another post of mine.

I would be very grateful if you could show me how to use a hash to count the files. One example, or just a step-by-step explanation, would be sufficient.

I would like to try it myself, so just tell me how to go ahead. Hope you won't mind.

Anyway, thanks once again.

Regards,
Vijayarl
Oct 13 '08 #3
KevinADC
Here is a general rewrite of your code, including a hash to store the counts and a few other changes. Most notably, it uses if/elsif/elsif instead of if/if/if. When a string or line can satisfy only one of the conditions, don't use if/if/if, because Perl has to evaluate every 'if' condition even after it has found the only true one. if/elsif lets Perl stop evaluating conditions as soon as the first true one is found. If you ever needed a fall-through condition, you would add an 'else' at the end to catch the exceptions. In your case I can't see any need for one.

I also cleaned up your regexps, mostly to show you ways of writing them to check for patterns. You were really checking for substrings rather than patterns, in which case index() would have been a better choice than regular expressions. But since we want to capture the value of the pattern match and use it as the hash key, I went with pure regexps instead of index() and predefined keys, which is also a perfectly good way to do what you are doing.

Untested code:

use strict;
use warnings;
my $dir = 'C:/Performance_svap/INPUT_FILES';#<-- windows supports forward slashes in directory paths
my %count = (); #<-- hash to store counts
raw_stats_report($dir);#<-- call the function with $dir as its argument

#######function to display the raw stats type report ###################

sub raw_stats_report {
    my $dir = $_[0] or die "No start directory defined\n";
    chdir($dir) or die "Can't chdir to $dir: $!\n";
    opendir(D, '.') or die "Can't opendir $dir: $!\n";
    my @list = readdir(D);
    closedir(D);
    foreach my $f (@list){
        next if ($f =~ /\.xls$/i);
        if ($f =~ /(sar-[dgur])/){
            $count{$1}++;
        }
        elsif ($f =~ /(vmstat|mpstat)/){
            $count{$1}++;
        }
        elsif ($f =~ /(prstat-(?:mLV|Ls|Lvs))/){
            $count{$1}++;
        }
        elsif ($f =~ /((?:net|io)stat)/){
            $count{$1}++;
        }
    }
    print "############# Raw_Stats_Report ################\n \n ";
    print 'There are total ',scalar(@list)," files present in $dir\n";
    foreach my $c (sort keys %count) {
        print "There are total $count{$c} $c files present in $dir\n";
    }
    print "################################################\n   \n ";
    print "Do you want me to continue ? Y | N \n";
    chomp(my $pick = <STDIN>);
    if ($pick =~ /y/i){
        print "Process execution will continue !!! \n";
    }
    else{
        print "Process execution stopped !!! \n";
        exit(0); # <-- use exit instead of 'die' to end a script early
    }
}

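For comparison, the index() alternative with predefined keys mentioned in the post might look roughly like the sketch below. The helper name count_by_type and the sample file list are assumptions made for illustration; the type names are the ones from the original script.

```perl
use strict;
use warnings;

# Predefined keys: the stat types the original script looks for.
my @types = qw(sar-d sar-g sar-u sar-r vmstat mpstat
               prstat-mLV prstat-Ls prstat-Lvs netstat iostat);

# count_by_type() is a hypothetical helper name, not part of the thread's script.
sub count_by_type {
    my @files = @_;
    my %count;
    FILE: foreach my $f (@files) {
        next FILE if $f =~ /\.xls$/i;        # skip Excel copies, as in the post
        foreach my $type (@types) {
            if (index($f, $type) >= 0) {     # plain substring test, no regex needed
                $count{$type}++;
                next FILE;                   # stop at the first hit, like if/elsif
            }
        }
    }
    return %count;
}

my %count = count_by_type(qw(
    prstat-Ls-20080118-1800  prstat-Ls-20080118-1900
    prstat-Ls-20080118-1900.xls
    prstat-Lvs-20080118-1800 prstat-Lvs-20080118-1900
));
print "$_: $count{$_}\n" for sort keys %count;
# prints:
# prstat-Ls: 2
# prstat-Lvs: 2
```

The trade-off is the one described above: index() avoids the regex engine, but the keys have to be listed up front instead of being captured from the match.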
Oct 13 '08 #4
KevinADC
Another thing to keep in mind is that the scalar value of @list:

scalar(@list)

will include '.' and '..' in the count/length of the array. If you don't want those you can subtract 2 from the length:

scalar(@list)-2
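An alternative to subtracting 2 is to filter the directory listing before counting. The sketch below is only one way to do it; list_plain_files is a hypothetical helper name, and it uses the -f file test so that '.', '..' and any subdirectories are left out.

```perl
use strict;
use warnings;

# list_plain_files() is a hypothetical helper, not part of the thread's script.
# It returns only regular files in $dir, so '.' and '..' never reach the count.
sub list_plain_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't opendir $dir: $!\n";
    my @files = grep { -f "$dir/$_" } readdir($dh);
    closedir($dh);
    return @files;
}

my @files = list_plain_files('.');
print 'There are total ', scalar(@files), " files present in .\n";
```

This keeps the count correct even if the directory later gains subdirectories, which the fixed "-2" adjustment would not account for.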
Oct 13 '08 #5
Thanks Kevin!

It worked successfully, thank you very much.

One last question: can we skip files with any file type extension, instead of skipping only *.xls?

I changed this part of the script from

next if ($f =~ /\.xls$/i);

to

next if ($f =~ /\.*$/i);

but didn't get the desired result.

My thinking was: the script skips files ending in .xls, but what if the directory has other extensions as well, like this:

prstat-Ls-20080118-1800
prstat-Ls-20080118-1800.doc
prstat-Ls-20080118-1900
prstat-Ls-20080118-1900.txt
prstat-Ls-20080118-1900.xls
prstat-Lvs-20080118-1800
prstat-Lvs-20080118-1900
prstat-Lvs-20080118-1900.txt

The script still counts all the *.txt and *.doc entries. So instead of telling the script to skip *.xls, can't we just skip every file that has a file type extension?

I know it's a lot to ask, I just want to know whether this can be done.

Anyway, thanks for your patient replies. You're just too good, lots and lots left to learn from you people :-)

Regards,
Vijayarl
Oct 14 '08 #6
nithinpes
> last one question:
> can we skip for any filetype extension instead of only skipping *.xls
>
> i did change this part in the script
> next if ($f =~ /\.xls$/i); to next if ($f =~ /\.*$/i);

This does not work because the character before the '*' quantifier is \. (a literal '.'), so the pattern means zero or more occurrences of '.' at the end of the name. Zero occurrences is always possible, hence it matches every file, with or without an extension.

> script still count all the *.txt & *.doc entry. so thought instead of telling the script to skip *.xls can't we just skip all the occurance of file which has file type extension ???

You may use:

next if ($f =~ /\..+$/i);

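To see the difference between the two patterns, they can be run against a few of the sample names. This is only an illustrative sketch: has_extension is a hypothetical helper that uses a slightly stricter pattern (a dot followed by non-dot characters at the end of the name), which is an assumption about what "has an extension" should mean and not something the thread settled on.

```perl
use strict;
use warnings;

# has_extension() is a hypothetical helper, not part of the thread's script.
# True when the name ends in a dot followed by one or more non-dot characters.
sub has_extension {
    my ($name) = @_;
    return $name =~ /\.[^.]+$/ ? 1 : 0;
}

foreach my $name (qw(prstat-Ls-20080118-1800
                     prstat-Ls-20080118-1900.xls
                     prstat-Lvs-20080118-1900.txt)) {
    my $star = ($name =~ /\.*$/)  ? 'skip' : 'keep';  # zero dots suffice, so always 'skip'
    my $plus = ($name =~ /\..+$/) ? 'skip' : 'keep';  # needs a dot and at least one more character
    my $ext  = has_extension($name) ? 'skip' : 'keep';
    print "$name: star=$star plus=$plus has_extension=$ext\n";
}
```

With /\.*$/ every name is skipped, which is why all counts dropped to zero; /\..+$/ and the stricter variant both keep the extension-less names and skip the .xls and .txt ones.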
Oct 14 '08 #7
Thanks nithinpes !!!!

It worked fine...

Regards,
Vijayarl
Oct 14 '08 #8
Another one:

We are printing the total number of files present in the directory:

print 'There are total ',scalar(@list)-2," files present in $dir\n \n";

This line gives the correct value, but what I want is to print only the total count of raw stats files. The line above prints the count of all the files.

I changed the script like this:

print "############# Raw_Stats_Report ################\n \n ";
print 'There are total ',scalar(@list)-2," files present in $dir\n \n";
print "#############***********################\n \n";
my @rawstat;
foreach my $c (sort keys %count) {
    @rawstat = %count;
    print "\n There are total $count{$c} $c files present in $dir\n \n ";
}
print 'There are totally ',scalar(@rawstat)," raw stats files present in $dir\n \n";

but I get an incorrect count: it reports only 18 raw stats files even though there are 36.
Is this the correct way to do it?

@rawstat = %count; ## added inside the for loop

print 'There are totally ',scalar(@rawstat)," raw stats files present in $dir\n \n"; ## kept outside the for loop

The output looks like this:

C:\Performance_svap\misc>perl chkempty.pl
############# Raw_Stats_Report ################

 There are total 46 files present in C:/Performance_svap/INPUT_FILES

#############***********################

 There are total 4 iostat files present in C:/Performance_svap/INPUT_FILES

 There are total 4 netstat files present in C:/Performance_svap/INPUT_FILES

 There are total 4 prstat-Ls files present in C:/Performance_svap/INPUT_FILES

 There are total 4 prstat-Lvs files present in C:/Performance_svap/INPUT_FILES

 There are total 4 sar-d files present in C:/Performance_svap/INPUT_FILES

 There are total 4 sar-g files present in C:/Performance_svap/INPUT_FILES

 There are total 4 sar-r files present in C:/Performance_svap/INPUT_FILES

 There are total 4 sar-u files present in C:/Performance_svap/INPUT_FILES

 There are total 4 vmstat files present in C:/Performance_svap/INPUT_FILES

 There are totally 18 raw stats files present in C:/Performance_svap/INPUT_FILES

################################################
Regards,
Vijayarl
Oct 14 '08 #9
nithinpes
In your script, the %count hash has the file type (vmstat, ...) as its keys and the corresponding counts as its values. Hence, to get the total count of files, you should sum up the values of all the keys in the hash.

my $raw_count = 0;
foreach (keys %count) {
    $raw_count += $count{$_}; ## sum up the values
}

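The reason the earlier attempt printed 18 is that assigning a hash to an array flattens it into a key, value, key, value list, so nine keys give 18 elements. The sketch below shows that next to a sum of the values; it uses sum from the core List::Util module as an alternative to the loop, and total_files plus the three sample counts are made up for illustration.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# total_files() is a hypothetical helper, not part of the thread's script.
# sum() returns undef for an empty list, hence the "|| 0".
sub total_files {
    my (%count) = @_;
    return sum(values %count) || 0;
}

my %count = (iostat => 4, netstat => 4, vmstat => 4);

my @flat = %count;   # key, value, key, value, ... so two elements per key
print scalar(@flat), " elements in the flattened hash\n";   # prints 6
print total_files(%count), " raw stats files\n";            # prints 12
```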
Oct 14 '08 #10
Thanks nithinpes !!!!

It worked fine...thanks once again...

Working Code:
use strict;
use warnings;
my $dir = 'C:/Performance_svap/INPUT_FILES';#<-- windows supports forward slashes in directory paths
my %count = (); #<-- hash to store counts
raw_stats_report($dir);#<-- call the function with $dir as its argument

#######function to display the raw stats type report ###################
sub raw_stats_report {
    my $dir = $_[0] or die "No start directory defined\n";
    chdir($dir) or die "Can't chdir to $dir: $!\n";
    opendir(D, '.') or die "Can't opendir $dir: $!\n";
    my @list = readdir(D);
    closedir(D);
    foreach my $f (@list){
        next if ($f =~ /\..+$/i); #<-- skip any name with a file type extension
        if ($f =~ /(sar-[dgur])/){
            $count{$1}++;
        }
        elsif ($f =~ /(vmstat|mpstat)/){
            $count{$1}++;
        }
        elsif ($f =~ /(prstat-(?:mLV|Ls|Lvs))/){
            $count{$1}++;
        }
        elsif ($f =~ /((?:net|io)stat)/){
            $count{$1}++;
        }
    }
    print "############# Raw_Stats_Report ################\n \n ";
    print 'There are total ',scalar(@list)-2," files present in $dir\n \n";
    print "#############***********################\n \n";

    foreach my $c (sort keys %count) {
        print "\n There are total $count{$c} $c files present in $dir\n \n ";
    }
    my $raw_count=0;
    foreach (keys %count){
        $raw_count+= $count{$_} ; ## sum up the values
    }
    print 'There are totally ',scalar($raw_count)," raw stats files present in $dir\n \n";
    print "################################################\n \n ";
    print "Do you want me to continue ? Y | N \n";
    chomp(my $pick = <STDIN>);
    if ($pick =~ /y/i){
        print "Process execution will continue !!! \n";
    }
    else{
        print "Process execution stopped !!! \n";
        exit(0); # <-- use exit instead of 'die' to end a script early
    }
} #<-- closing brace of raw_stats_report (missing in the posted version)

Regards,
Vijayarl
Oct 14 '08 #11
