Bytes IT Community

How can I get the information between the <tr><td> and </td></tr>?

P: 6
Hi, everyone:

I have a problem: I can't extract the text between <tr><td> and </td></tr>.
I use the regular expression below, but it doesn't match, and I don't know why:
$test=~/<tr><td>(.*)<\/td><\/td>(.*)<\/td><\/tr>/ms;
The input looks like this:
<tr><td>station</td> <td>station number/identification, see chart above: <br>
B = GoMoos buoy B location<br>
S = New Scantum at the southern edge Jefferys Ledge</td></tr>

I want to get the following output:
station: station number/identification, see chart above: <br>
B = GoMoos buoy B location<br>
S = New Scantum at the southern edge Jefferys Ledge

Can you help me resolve this problem? Thank you!
Apr 7 '08 #1
7 Replies


KevinADC
Expert 2.5K+
P: 4,059
You need to use stingy (non-greedy) matching: instead of (.*), use (.*?).
Apr 7 '08 #2
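
To illustrate the advice above, here is a minimal sketch of how greedy and non-greedy quantifiers differ. The sample string is hypothetical and is not taken from the page in the question.

```perl
use strict;
use warnings;

# Hypothetical sample: two table rows on a single line.
my $row = '<tr><td>a</td><td>b</td></tr><tr><td>c</td><td>d</td></tr>';

# Greedy: the first (.*) grabs as much as it can while still allowing a match,
# so it swallows everything up to the LAST "</td><td>" in the string.
my ($g1, $g2) = $row =~ /<td>(.*)<\/td><td>(.*)<\/td>/;

# Non-greedy: each (.*?) stops at the first point where the rest can match.
my ($s1, $s2) = $row =~ /<td>(.*?)<\/td><td>(.*?)<\/td>/;

print "greedy:     [$g1] [$g2]\n";
print "non-greedy: [$s1] [$s2]\n";
```

With the greedy pattern the first capture spans both rows; with the non-greedy pattern the captures are just the two cells of the first row.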

P: 6
I tried it, but it still doesn't work, and I don't know why.
Thank you.
Apr 10 '08 #3

KevinADC
Expert 2.5K+
P: 4,059
Me either. Post your code and the data for more help.
Apr 10 '08 #4

P: 6
#! /usr/bin/perl -w

use strict;
use CGI;
use LWP::Simple;

my $contest = get("http://nec.whoi.edu/jg/info/NEC/Habitat_Ecology/ctd_JR{dir=nec.whoi.edu/jg/dir/NEC/Habitat_Ecology/,data=www.pulse.unh.edu/jg/serv/NEC/CTD_Hydrography/ctd_JR.html0");
my @contest = split("\n", $contest);    # split the page into lines

my $tmp;
my ($flag_tbl_beg, $flag_tbl_end) = (0, 0);

for (my $k = 0; $k <= $#contest; $k++) {
    $contest[$k] =~ s/^\s+//;
    $contest[$k] =~ s/\s+$//;
    if ($contest[$k] =~ /<h1>/) {    # the dataset name follows <h1>
        $tmp = $k;                   # line of the dataset name, used to skip the two control buttons
    }
}

for (my $k = $tmp; $k <= $#contest; $k++) {
    $contest[$k] =~ s/^\s+//;
    $contest[$k] =~ s/\s+$//;

    if ($contest[$k] =~ /^<tr><th>(.*)<\/th><th>(.*)<\/th><\/tr>$/) {
        $flag_tbl_beg = $k;
        next;
    }

    if ($contest[$k] =~ /<\/table>/i) {
        $flag_tbl_end = $k;
        next;
    }
}

my $test;
for (my $m = $flag_tbl_beg + 1; $m < $flag_tbl_end; $m++) {
    $contest[$m] =~ s/^\s+//;
    $contest[$m] =~ s/\s+$//;

    # get the table of parameters
    my @test;
    if ($contest[$m] =~ /<tr><td>(.*)?<\/td><td>(.*)?<\/td><\/tr>/) {
        my $test1 = $1;
        my $test2 = $2;
        print "#################################", "\n";
        print $test1, "\n";
        print "#################################", "\n";
        print $test2, "\n";
        next;
    }
    else {
        push @test, $contest[$m];
        my $test = join("\n", @test);
        $test =~ s/\n//g;

        if ($test =~ /<tr><td>(.*)?<\/td><td>(.*)?<\/td><\/tr>/ms) {
            $test =~ /<tr><td>(.*)?<\/td><td>(.*)?<\/td><\/tr>/ms;
            my $test3 = $1;
            my $test4 = $2;
            print "#################################", "\n";
            print $test3, "\n";
            print "#################################", "\n";
            print $test4, "\n";
            #print $test,"\n";
            @test = ();
        }
    }
}

This is the script. If you run it, "visit" and "station" are missing from the output. That is the problem. I have tried several times and spent a lot of time on it, and I still don't know how to resolve it.
Apr 11 '08 #5

KevinADC
Expert 2.5K+
P: 4,059
This is not what I recommended you do:

(.*)?

Look at my post. I said to use:

(.*?)

Correct that and retry your code. I am not going to run your code and do all your debugging. You run it and narrow down the scope of the problem. Or maybe someone else will do that for you.
Apr 11 '08 #6
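
The position of the question mark matters here. As a rough sketch with a hypothetical sample string: in (.*)? the question mark makes the whole group optional while .* stays greedy, whereas in (.*?) it makes the .* itself non-greedy.

```perl
use strict;
use warnings;

# Hypothetical sample: two cells on one line.
my $cells = '<td>a</td><td>b</td>';

# (.*)? is an optional group around a still-greedy .*
my ($optional_group) = $cells =~ /<td>(.*)?<\/td>/;

# (.*?) is a non-greedy .* inside a normal group
my ($lazy) = $cells =~ /<td>(.*?)<\/td>/;

print "(.*)? captured: [$optional_group]\n";
print "(.*?) captured: [$lazy]\n";
```

The first pattern still runs to the last closing tag; only the second stops at the first one.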

P: 6
I tried it, but it still doesn't work.
Apr 14 '08 #7

P: 6
Thank you.
I found my problem: I had given a variable the wrong scope.
Apr 14 '08 #8
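
The poster does not say which variable was involved. One plausible candidate in the posted script is @test, which is declared with my inside the loop and is therefore emptied on every iteration, so a row that spans several lines can never be reassembled. The sketch below assumes that reading; the name @pending and the "visit" row content are made up for illustration.

```perl
use strict;
use warnings;

# Sample lines modelled on the question; the last row is hypothetical.
my @lines = (
    '<tr><td>station</td> <td>station number/identification, see chart above: <br>',
    'B = GoMoos buoy B location<br>',
    'S = New Scantum at the southern edge Jefferys Ledge</td></tr>',
    '<tr><td>visit</td><td>visit number</td></tr>',
);

my @rows;       # collected [name, description] pairs
my @pending;    # declared OUTSIDE the loop so it survives between iterations

for my $line (@lines) {
    push @pending, $line;
    my $joined = join ' ', @pending;

    # /s lets "." cross newlines; \s* tolerates the space between the cells.
    if ($joined =~ /<tr><td>(.*?)<\/td>\s*<td>(.*?)<\/td><\/tr>/s) {
        push @rows, [$1, $2];
        @pending = ();    # row complete, start a fresh buffer
    }
}

print "$_->[0]: $_->[1]\n" for @rows;
```

Moving the declaration of @pending inside the for loop reproduces the symptom: the three-line "station" row is never matched.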
