
how to extract $1, $2, etc on same line?

BeemerBiker (Member)
#1: Feb 16 '09
I spent some time on this but need help. I do not know how to organize my match pattern so as to extract several strings at once.

    $a = "<a href=blahblah?tnow=12345&amp;ttok=ce9ea78>go</a>";
    $a =~ m|tnow=(.*?)&|;
    print "$1\n";
    $a =~ m|ttok=(.*?)>|;
    print "$1\n";

The above prints
12345
ce9ea78

But how can I capture both values with a single match, so that I can write (for example):

    print "$1\n$2\n" if ...

thanks for looking

KevinADC (Expert)
#2: Feb 16 '09

Not sure I understand your question, but maybe:

    $a = "<a href=blahblah?tnow=12345&amp;ttok=ce9ea78>go</a>";
    $a =~ m|tnow=(.*?)&|;
    $var1 = $1;    # save $1 before the next match overwrites it
    $a =~ m|ttok=(.*?)>|;
    $var2 = $1;
    print "$var1\n$var2\n";

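One caveat with the two-step approach: after a failed match, $1 keeps whatever the last successful match put there, so it is safer to test each match before using its capture. A minimal sketch of that check follows; the helper name first_capture and the variable $link are made up for this illustration and are not from the thread.

```perl
use strict;
use warnings;

my $link = "<a href=blahblah?tnow=12345&amp;ttok=ce9ea78>go</a>";

# Hypothetical helper: return the first capture of $pattern in $text,
# or undef if the pattern does not match at all.
sub first_capture {
    my ($text, $pattern) = @_;
    return $text =~ $pattern ? $1 : undef;
}

my $tnow = first_capture($link, qr/tnow=(.*?)&/);
my $ttok = first_capture($link, qr/ttok=(.*?)>/);

# Only print when both matches actually succeeded.
if (defined $tnow && defined $ttok) {
    print "$tnow\n$ttok\n";
}
else {
    print "no match\n";
}
```
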

BeemerBiker (Member)
#3: Feb 16 '09

Quote:
Originally posted by KevinADC
Not sure I understand your question, but maybe:

Hi Kevin - look at example 5 here. I spent a couple of hours looking at it before I gave up. The author managed to extract four strings at once with this single line of Perl code:

    $url =~ m|(\w+)://([^/:]+)(:\d+)?/(.*)|;

i.e. $1, $2, $3 and $4 were all assigned by that one statement.

I was trying to do the same thing, i.e. everything in one line of code. However, most of the Perl examples I see are what I consider cryptic. Extracting one variable per line (as in my original post) is easier to read, and it only took a few minutes to code once I understood how (.*?) worked.

The problem I had with example 5 was figuring out how to arrange the multiple () groups: I need two of them, to extract tnow and ttok into $1 and $2.

For now I find it a lot easier to use split to extract the strings I want, but I would like to get up to speed on regular expressions.
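
For readers puzzling over the same URL pattern: capture groups are numbered left to right by their opening parenthesis. The sketch below spreads the pattern out with the /x modifier so each group can carry a comment. The URL and the variable names are invented for the illustration; only the pattern itself comes from the quoted example.

```perl
use strict;
use warnings;

# Hypothetical URL, chosen only so that all four groups match something.
my $url = "http://www.example.com:8080/docs/index.html";

# Same pattern as the one-liner, laid out with /x (whitespace and
# #-comments inside the pattern are ignored).
my ($scheme, $host, $port, $path) = $url =~ m{
    (\w+) ://      # group 1: scheme
    ([^/:]+)       # group 2: host, up to the next "/" or ":"
    (:\d+)?        # group 3: optional port, leading ":" included
    /
    (.*)           # group 4: everything after the first "/"
}x;

# Group 3 is optional, so it is undef when the URL has no port.
my $port_text = defined $port ? $port : '(none)';
print "scheme=$scheme host=$host port=$port_text path=$path\n";
```
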

numberwhun (Site Moderator)
#4: Feb 16 '09

First, please use code tags when posting code; they are required.

Second, regular expressions are certainly cryptic if you haven't learned them. Once you learn what each element does, you will be able to read them. It's really quite simple once you apply yourself.

Putting the whole match on one line, as the other author did, is exactly what you should do. It's a matter of writing a regex that matches not only the elements you want to capture, but also the text around them, without capturing it.

    $a = "<a href=blahblah?tnow=12345&amp;ttok=ce9ea78>go</a>";
    $a =~ /^.*tnow=(\w*)&.*ttok=(\w*)>.*$/;

It's quick and dirty, but I tested the above code and it works. I really recommend that you learn all about regular expressions. Perl's regexes are among the most powerful around and are an incredible tool to have in your coding arsenal.
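
To tie this back to the original `print ... if ...` question, here is a minimal sketch of two ways to use a single match with two captures. It uses a slightly shortened form of the pattern above (the leading and trailing .* are not needed for an unanchored match), and the variable names $link, $tnow and $ttok are chosen for the illustration.

```perl
use strict;
use warnings;

my $link = "<a href=blahblah?tnow=12345&amp;ttok=ce9ea78>go</a>";

# One match, two captures, printed only if the match succeeds.
print "$1\n$2\n" if $link =~ /tnow=(\w*)&.*ttok=(\w*)>/;

# In list context a match returns its captures, so they can be
# assigned to named variables directly instead of using $1 and $2.
my ($tnow, $ttok) = $link =~ /tnow=(\w*)&.*ttok=(\w*)>/;
print "tnow=$tnow ttok=$ttok\n" if defined $tnow;
```

If the match fails, the list assignment leaves both variables undefined, which is why the second print is guarded as well.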

Regards,

Jeff

KevinADC (Expert)
#5: Feb 16 '09


If you know the patterns in the string are always in this order:

<a href=blahblah?tnow=12345&amp;ttok=ce9ea78>go</a>

it is pretty straightforward to write a regexp that searches for and captures the data you want. Jeff posted one; here is another:

    $a = "<a href=blahblah?tnow=12345&amp;ttok=ce9ea78>go</a>";
    $a =~ /<a href=[^?]+\?tnow=([^&]+)&amp;ttok=([^>]+)>/;
    print "$1\n$2";

Regexps are probably the most cryptic code there is in Perl. They are all symbols whose meaning you need to learn, along with how that meaning changes depending on the context they are used in. You can learn the basics pretty quickly, but it takes a long time to master them.
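
If the order of the parameters is not guaranteed, one alternative is to collect name/value pairs with a global match instead of relying on fixed positions. This is only a sketch under that assumption: the sample string below has the two parameters swapped on purpose, and the hash name %param is made up for the illustration.

```perl
use strict;
use warnings;

# Same link as above, but with ttok and tnow deliberately swapped.
my $link = "<a href=blahblah?ttok=ce9ea78&amp;tnow=12345>go</a>";

# With /g in list context, the match returns every (name, value) pair
# it finds, which can be assigned straight into a hash.
my %param = $link =~ /\b(tnow|ttok)=([^&>]+)/g;

for my $name (sort keys %param) {
    print "$name=$param{$name}\n";
}
```
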