Negative regular expression not working at end of string

nithinpes
410 Expert 256MB
I tried the following regular expression in a script to extract lines that neither begin with a capital letter nor end with a digit.
    if ($str =~ /^[^A-Z].+[^0-9]$/)       ## similarly /^[^A-Z].+\D$/
    {
        ######
    }
This didn't work. The pattern condition was applied only to the beginning of the string. The lines returned included both lines ending in digits and lines ending in non-digits.

However, I could achieve the task as follows:
    if (($str !~ /^[A-Z]/) && ($str !~ /[0-9]$/))
    {
        ######
    }
But I need to know why the previous expression didn't work. I understand that negation in pattern matching can be tricky, but I am unable to find any conceptual or logical error in the first expression.
To boil down the above query,
    if (/^[^A-Z]/)
    { print $_; }
will successfully return lines not beginning with capital letters, while
    if (/[^0-9]$/)
    { print $_; }
fails to return only the lines not ending with digits. It returns all lines.

Any help would be greatly appreciated.
Jan 21 '08 #1
KevinADC
4,059 Expert 2GB
The end-of-string anchor "$" is supposed to match at the end of the string, or just before a newline at the end of the string/line. And it does in this example:

    while(<DATA>){
       print if ($_ !~ /[0-9]$/);
    }

    __DATA__
    This is a test
    This is a test 96
    this is a test
    this is a test 99
    this is another test
The lines with digits on the end do not get printed. But when I change it to a negated character class the newline is not being ignored:

    while(<DATA>){
       print if ($_ =~ /[^0-9]$/);
    }

    __DATA__
    This is a test
    This is a test 96
    this is a test
    this is a test 99
    this is another test
and all the lines are printed. But if I chomp() each line the ones with digits on the end are not printed:

    while(<DATA>){
       chomp;
       # chomp removed the newline, so add it back when printing
       print "$_\n" if ($_ =~ /[^0-9]$/);
    }

    __DATA__
    This is a test
    This is a test 96
    this is a test
    this is a test 99
    this is another test
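Evidently a negated character class tied to the end of the line/string does not ignore the record separator (a newline in this case): [^0-9] matches the newline itself, "$" then matches at the very end of the string, and so the pattern matches every unchomped line. I could not find this spelled out anywhere though.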
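To see which character the negated class actually consumes, here is a minimal sketch. The helper names last_char_matched and ends_without_digit are made up for illustration and are not from the code above. It assumes the only thing going on is that "\n" is just another non-digit as far as [^0-9] is concerned, so one possible alternative to chomp() is to exclude the newline from the class as well.

```perl
use strict;
use warnings;

# Hypothetical helper: return the character that [^0-9] consumed right
# before `$`, or undef if the pattern did not match at all.
sub last_char_matched {
    my ($line) = @_;
    return $line =~ /([^0-9])$/ ? $1 : undef;
}

# Hypothetical helper: true when the last character before any trailing
# newline is not a digit. Adding \n to the class stops it from matching
# the line terminator, so no chomp is needed.
sub ends_without_digit {
    my ($line) = @_;
    return $line =~ /[^0-9\n]$/ ? 1 : 0;
}

for my $line ("This is a test\n", "This is a test 96\n") {
    (my $shown = $line) =~ s/\n\z//;    # copy without the newline, for display
    my $char = last_char_matched($line);
    printf "%-18s [^0-9] matched %s; ends_without_digit = %d\n",
        $shown,
        ($char eq "\n" ? 'the newline' : "'$char'"),
        ends_without_digit($line);
}
```

For the line ending in 96 the captured character is the newline, which is why the unchomped loop prints everything. With [^0-9\n] the same line is rejected.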
Jan 21 '08 #2
nithinpes
410 Expert 256MB
Thanks a lot!
This concept is not documented in any of the books that I referred to or in the online references I checked.
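For anyone who still wants the original test as a single pattern, one possible sketch uses two negative lookaheads instead of negated character classes. This is only an assumption about what the first expression was meant to do, not something taken from the replies above, and the name wanted_line is hypothetical. It should accept and reject the same lines as the two-pattern !~ version, whether or not the line has been chomped.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the line neither begins with a capital
# letter nor ends with a digit (a trailing newline is tolerated).
sub wanted_line {
    my ($line) = @_;
    # (?![A-Z])       the first character is not A-Z
    # (?!.*[0-9]$)    no digit sits at the end, or just before a final newline
    # /s lets . match newlines so the second check always reaches the end
    return $line =~ /^(?![A-Z])(?!.*[0-9]$)/s ? 1 : 0;
}

my @lines = (
    "This is a test\n",
    "This is a test 96\n",
    "this is a test\n",
    "this is a test 99\n",
    "this is another test\n",
);

# Print only the lines that pass both conditions.
print grep { wanted_line($_) } @lines;
```

Because both conditions are expressed as "must not match here", the newline never gets a chance to satisfy a negated class.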
Jan 21 '08 #3
