help with regular expressions

Newbie
 
Join Date: Dec 2007
Posts: 7
#1: Dec 21 '07
For a program I'm making, I need to be able to analyze lines like the following one:

1 waarschijnlijk neiging tot suikerziekte ccaggcccac ctggagactc

and split it up into a part that contains the number,
a part that contains the text,
and, as the last part, the DNA sequence ccaggcccac ctggagactc.

Now I can do that with the following code:

    # extract the disease codes
    if($line =~ m/(\b[ctga]+\b)(.*)/)
    {
        $part3 = $1.$2;

        # strip the end-of-line character
        chop $part3;
    }

    # extract the number and the name from the string
    if($line =~ m/(\d+)(.*?)(\b[gtac]+\b)/)
    {
        # $1 = the numeric part, $2 the text
        # now we build a hash with the number as the key
        $ziektenaamhash{$1} = $2;

        # we also build a hash in which the number refers to the disease codes
        $ziektehash{$1} = $part3;
    }

But I'm not really happy with the $1.$2 part, as it forces me to use the chop operation every time.
Is there a more efficient way to split the lines into those parts, or is my code just fine?

KevinADC
Expert
 
Join Date: Jan 2007
Location: Southern California USA
Posts: 4,091
#2: Dec 21 '07

Just move the \b parts outside the capturing parentheses:
    if($line =~ m/\b([ctga]+)\b(.*)/)
and you shouldn't have to use chop.
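
For reference, here is a small sketch of how that pattern behaves on the sample line from the first post. The variable names ($first, $rest, $sequence) and the trailing newline on the sample string are assumptions made for illustration. Two things are worth keeping in mind when reading it: \b is a zero-width assertion, so it never adds characters to a capture, and . does not match a newline by default, so $2 stops before the line ending.

```perl
use strict;
use warnings;

# Sample line from the first post, with an assumed trailing newline.
my $line = "1 waarschijnlijk neiging tot suikerziekte ccaggcccac ctggagactc\n";

my ($first, $rest) = ('', '');
if ($line =~ m/\b([ctga]+)\b(.*)/) {
    # $1 is the first whole word made only of c/t/g/a,
    # $2 is everything after it up to (not including) the newline.
    ($first, $rest) = ($1, $2);
}

my $sequence = $first . $rest;
print "sequence: '$sequence'\n";
```

Note that the pattern picks the first whole word consisting only of c, t, g and a, so a disease name that happens to contain such a word would be matched too early.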
Newbie
 
Join Date: Dec 2007
Posts: 7
#3: Dec 21 '07

Quote:

Originally Posted by KevinADC

Just move the \b parts outside the capturing parentheses:
    if($line =~ m/\b([ctga]+)\b(.*)/)
and you shouldn't have to use chop.

I have to chop because Perl somehow adds a character after the concatenation, and that breaks my search expression when I try to find the captured text in a text file.
KevinADC
Expert
 
Join Date: Jan 2007
Location: Southern California USA
Posts: 4,091
#4: Dec 21 '07

Quote:

Originally Posted by adriaan

I have to chop because Perl somehow adds a character after the concatenation, and that breaks my search expression when I try to find the captured text in a text file.

If you are reading the lines in from a file, you should just chomp each line before doing anything else to it; that removes the newline from the end. Perl will not add any characters that are not already there.
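
Putting the chomp advice together with the original question, a one-pass version might look like the sketch below. It is only an illustration under a few assumptions: the @lines array stands in for the file being read, the hash names are taken from the first post, and the pattern assumes the sequence is one or more whitespace-separated runs of a, c, g and t at the end of the line. The thread never says what the stray character was; if it was a carriage return from Windows line endings (a guess), the trailing \s* in the pattern would absorb it.

```perl
use strict;
use warnings;

my (%ziektenaamhash, %ziektehash);

# Stand-in for lines read from a file (assumption for this sketch).
my @lines = (
    "1 waarschijnlijk neiging tot suikerziekte ccaggcccac ctggagactc\n",
);

for my $line (@lines) {
    chomp $line;    # remove the trailing newline before matching

    # $1: the number
    # $2: the name, matched as short as possible
    # $3: one or more whitespace-separated runs of a/c/g/t up to the end
    if ($line =~ m/^(\d+)\s+(.*?)\s+([acgt]+(?:\s+[acgt]+)*)\s*$/) {
        $ziektenaamhash{$1} = $2;
        $ziektehash{$1}     = $3;
    }
}

printf "%s: %s => %s\n", $_, $ziektenaamhash{$_}, $ziektehash{$_}
    for sort keys %ziektenaamhash;
```

With all three parts captured in a single match, there is no $1.$2 concatenation and no chop. One limitation: if the last word of a name consists only of a, c, g and t, it would end up in the sequence rather than the name.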