Trying to extract a string from HTTP::Request object

I'm trying to fetch the HTML of a web page as a string, and then extract particular parts of that string using the substr function. Here is the sample code I have so far:

    use HTTP::Request::Common;
    use LWP::UserAgent;
    use LWP::Simple;

    $ua = LWP::UserAgent->new;

    $request = HTTP::Request->new(GET => 'http://www.cnn.com');
    $response = $ua->request($request);
    $content = $response->content();

    my $result2 = substr $content, index($content, 'Headlines');

So, the variable $content seems to be an HTML object or something that is NOT a string. How can I convert $content to a string, so that I can use the substr function?

I have tried other methods including simpler code:

    my $content = get('http://securities.stanford.edu//1014/TCHC00');

However, I am still not able to process $content as a string.

I have even tried writing the contents to a text file, but I am not able to extract a string from the text file either.

Any help is appreciated!
Mar 27 '07 #1
I ran your program through the Perl debugger and confirmed that $response->content() definitely contains HTML, so $content is already a string. When I pasted the output into an editor, I found that the content does NOT contain "Headlines". Try another keyword, like "Weather".
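One likely reason the original call looked broken: when index() cannot find the keyword it returns -1, and substr with an offset of -1 returns only the last character of the string. Below is a minimal sketch of guarding against that. The helper name extract_from and the sample markup are made up for illustration; they are not part of the original program or of the real CNN page.

```perl
use strict;
use warnings;

# Return everything from the first occurrence of $keyword onward,
# or undef when the keyword is not present in $content.
# (extract_from is a hypothetical helper name.)
sub extract_from {
    my ($content, $keyword) = @_;
    my $pos = index($content, $keyword);
    return if $pos < 0;    # index() returns -1 when nothing matches
    return substr($content, $pos);
}

# Stand-in for the downloaded page (hypothetical sample, not real CNN markup).
my $content = '<html><body><a href="/weather">Weather</a>, Entertainment</body></html>';

for my $keyword ('Headlines', 'Weather') {
    my $result = extract_from($content, $keyword);
    if (defined $result) {
        print "$keyword: found, ", length($result), " characters from the match onward\n";
    }
    else {
        print "$keyword: not found\n";
    }
}

# The unguarded form quietly returns the last character when the keyword is missing.
print "unguarded: '", substr($content, index($content, 'Headlines')), "'\n";
```

Checking the return value of index() before calling substr makes a missing keyword show up as an explicit "not found" instead of a one-character result.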

If you want to examine variables without the debugger, use this code (provided your Perl has the Dumpvalue module):
    use HTTP::Request::Common;
    use LWP::UserAgent;
    use LWP::Simple;
    use Dumpvalue;

    # Dumpvalue prints a readable dump of nested data structures
    $dumper = Dumpvalue->new;

    $ua = LWP::UserAgent->new;

    $request = HTTP::Request->new(GET => 'http://www.cnn.com');
    $response = $ua->request($request);
    $content = $response->content();

    # Dump the whole response object, including its content
    $dumper->dumpValue(\$response);

    my $result2 = substr $content, index($content, 'Headlines');

Then when you run it, save the output to a text file. On my Windows box, with ActiveState Perl, I used this:
    C:\cygwin\home\Rick\perl>perl getreq.pl > output.txt

In the Perl debugger, this is what I saw when I used 'Weather':
    main::(getreq.pl:15):   my $result2 = substr $content, index($content, 'Weather');
      DB<1>
    main::(getreq.pl:17):   print $result2;
      DB<1> print length($result2)
    105335
      DB<2> print substr $result2, 0, 20
    Weather, Entertainme

You're on the right track. Prove what each line does, and learn the Perl debugger to get interactive.
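If you would rather prove what a variable holds without the debugger, a rough sketch like the following may help. It relies on ref() returning an empty string for an ordinary scalar, and on blessed() from the core Scalar::Util module returning the class name of an object. The helper name describe_value and the class My::Fake::Response are hypothetical, used only for illustration. Note also that get() from LWP::Simple returns undef when the fetch fails, which is one way to end up with something that is not a usable string.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Report what kind of value was passed in.
# (describe_value is a hypothetical helper name.)
sub describe_value {
    my ($value) = @_;
    return 'undef' unless defined $value;
    if (defined(my $class = blessed($value))) {
        return "object of class $class";
    }
    return 'reference of type ' . ref($value) if ref $value;
    return 'plain string, ' . length($value) . ' characters';
}

my $content = '<html><title>Example</title></html>';    # hypothetical page body
my $object  = bless({}, 'My::Fake::Response');           # hypothetical stand-in object

print describe_value($content), "\n";
print describe_value($object),  "\n";
print describe_value(undef),    "\n";
```

Calling describe_value($content) right after the fetch in the original script would show whether $content is a plain string, an object, or undef before any substr call is attempted.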
Mar 29 '07 #2
