
Regular Expression Question

crystal2005
P: 44
Hi all,

My short program below reads directory names from the command line. Each name should start with an alphanumeric character and contain nothing other than alphanumerics, spaces, underscores and dashes.

So far, what I have written below accepts only alphanumeric characters and underscores. My question is: how do I allow additional characters, instead of relying on "\W", which only lets alphanumerics and underscores through?

Thanks for any help.

    #!/usr/bin/perl -w
    use strict;

    my $path1=$ARGV[0];
    my $path2=$ARGV[1];
    my @dir1;
    my @dir2;

    #Read The First Directory
    if($path1 =~ /\W/) {
        print "Invalid directory!\n";
        exit;
    }
    else {
        # Report why the directory could not be opened instead of exiting silently
        opendir DIR1, $path1 or die "Cannot open directory '$path1': $!\n";
        @dir1 = sort readdir DIR1;
        closedir DIR1;
    }

    #Read The Second Directory
    if($path2 =~ /\W/) {
        print "Invalid directory!\n";
        exit;
    }
    else {
        opendir DIR2, $path2 or die "Cannot open directory '$path2': $!\n";
        @dir2 = sort readdir DIR2;
        closedir DIR2;
    }

Apr 15 '08 #1
5 Replies


nithinpes
Expert 100+
P: 410
Apart from not allowing spaces and dashes, your check also does not verify that the name begins with an alphanumeric character.
\W matches spaces as well; in other words, \w does not include spaces. To allow spaces and dashes, write the character class out in expanded form.

Replace:

    if($path1 =~ /\W/)

with

    if($path1 =~ /[^A-Za-z0-9_ -]/ || $path1 =~ /^[^A-Za-z0-9]/)

The pattern /^[^A-Za-z0-9]/ matches names that begin with a non-alphanumeric character. Note that each pattern has to be bound to $path1 with =~; a bare /.../ would be tested against $_ instead. Make the same change for $path2.
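The two negative checks above can also be folded into one positive, anchored pattern. The following is only a minimal sketch of that idea, not the code from the original program: the sub name is_valid_dir_name is made up for illustration, and it assumes plain ASCII names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of the original program.
# Accepts a name that starts with an ASCII letter or digit and then
# contains only letters, digits, spaces, underscores or dashes.
sub is_valid_dir_name {
    my ($name) = @_;
    return 0 unless defined $name;
    # \A and \z anchor the whole string; the dash is last in the class,
    # so it is taken literally rather than as a range.
    return $name =~ /\A[A-Za-z0-9][A-Za-z0-9 _-]*\z/ ? 1 : 0;
}

for my $name ('my dir', 'logs_2008', 'a-b', '_hidden', 'bad/name', '') {
    printf "%-12s => %s\n", "'$name'",
        is_valid_dir_name($name) ? 'ok' : 'invalid';
}
```

With a check like this, the caller would print "Invalid directory!" whenever the sub returns 0. A single anchored pattern also rejects the empty string, which the two negative patterns let through.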
Apr 15 '08 #2

crystal2005
P: 44
Hi, thanks for the help.

The pattern /^[^A-Za-z0-9]/: does it mean it will accept all special characters?
From what I have read about directory naming, a name cannot contain \ / : * ? " < > |.
Apr 15 '08 #3

eWish
Expert 100+
P: 971
Check out the article in the Howto's section titled "Character Classes and Special Characters"; maybe that will help clear things up.

--Kevin
Apr 15 '08 #4

nithinpes
Expert 100+
P: 410
In the pattern, the caret (^) denotes the beginning of the line/string. But when it is the first character inside a character class ([ ]), the caret stands for negation. In this case, [^A-Za-z0-9] matches any character other than A-Z, a-z and 0-9.
Hence, /^[^A-Za-z0-9]/ matches any string that begins with a non-alphanumeric character. Because the if statement treats a match as invalid, such names are rejected, not accepted.

For better understanding of character class and special characters, go through the link that Kevin has posted.
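The two roles of the caret described above can be tried out with a short script. This is just an illustrative sketch; the sub names starts_with_alnum and starts_with_non_alnum are invented for the example and do not come from the original program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers for illustration only.

# ^ outside a character class: anchor at the start of the string.
sub starts_with_alnum {
    my ($s) = @_;
    return $s =~ /^[A-Za-z0-9]/ ? 1 : 0;
}

# ^ as the first character inside [...]: negate the class.
sub starts_with_non_alnum {
    my ($s) = @_;
    return $s =~ /^[^A-Za-z0-9]/ ? 1 : 0;
}

for my $s ('abc', '9lives', '_abc', ' abc', '*star') {
    printf "%-9s alnum start: %d, non-alnum start: %d\n",
        "'$s'", starts_with_alnum($s), starts_with_non_alnum($s);
}
```

Note that an empty string matches neither pattern, since both require a first character to be present.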
Apr 16 '08 #5

crystal2005
P: 44
Thanks a lot, I found it useful.
Apr 16 '08 #6
