Trying to extract a string from HTTP::Request object

I'm trying to fetch a web page's HTML as a string and then extract particular parts of that string with the substr function. Here is the code I have so far:

    use HTTP::Request::Common;
    use LWP::UserAgent;
    use LWP::Simple;

    $ua = LWP::UserAgent->new;

    $request = HTTP::Request->new(GET => 'http://www.cnn.com');
    $response = $ua->request($request);
    $content = $response->content();

    my $result2 = substr $content, index($content, 'Headlines');

So, the variable $content seems to be an HTML object or something that is NOT a string. How can I convert $content to a string, so that I can use the substr function?

I have tried other methods including simpler code:

    my $content = get('http://securities.stanford.edu//1014/TCHC00');

However, I am not able to process $content as a string.

I have even tried writing the contents to a text file, but I am not able to extract a string from the text file either.

Any help is appreciated!
Mar 27 '07 #1
I ran your program through the Perl debugger and confirmed that $response->content() does contain the page's HTML. When I pasted the output into an editor, I found that the content does NOT contain the word "Headlines". Try another keyword, such as "Weather".
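
A likely reason the original substr call looks broken: index() returns -1 when the keyword is not found, and substr with an offset of -1 returns only the last character of the string. Here is a minimal sketch of guarding against that, using only core Perl. The sample string and the text_from_keyword helper are made up for illustration and are not part of the original script:

```perl
use strict;
use warnings;

# Stand-in for the page body; the real script would use $response->content().
my $content = '<html><body>News, Weather, Entertainment</body></html>';

# Hypothetical helper (not from the original post): return everything from
# the first occurrence of $keyword onward, or undef if the keyword is absent.
sub text_from_keyword {
    my ($text, $keyword) = @_;
    my $pos = index($text, $keyword);
    return if $pos < 0;    # index() returns -1 when there is no match
    return substr($text, $pos);
}

for my $keyword ('Headlines', 'Weather') {
    my $found = text_from_keyword($content, $keyword);
    if (defined $found) {
        print "$keyword: found, starts with '", substr($found, 0, 20), "'\n";
    }
    else {
        print "$keyword: not found\n";
    }
}
```

Checking the position before calling substr makes a missing keyword show up as an explicit "not found" instead of a one-character result.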

If you want to examine variables without the debugger, use this code (provided your Perl has the Dumpvalue module):
    use HTTP::Request::Common;
    use LWP::UserAgent;
    use LWP::Simple;
    use Dumpvalue;

    $dumper = Dumpvalue->new;

    $ua = LWP::UserAgent->new;

    $request = HTTP::Request->new(GET => 'http://www.cnn.com');
    $response = $ua->request($request);
    $content = $response->content();

    $dumper->dumpValue(\$response);

    my $result2 = substr $content, index($content, 'Headlines');

Then when you run it, save the output to a text file. On my Windows box, with ActiveState Perl, I used this:
    C:\cygwin\home\Rick\perl>perl getreq.pl > output.txt

In the Perl debugger, this is what I saw when I used 'Weather':
    main::(getreq.pl:15):   my $result2 = substr $content, index($content, 'Weather');
      DB<1>
    main::(getreq.pl:17):   print $result2;
      DB<1> print length($result2)
    105335
      DB<2> print substr $result2, 0, 20
    Weather, Entertainme

You're on the right track. Verify what each line does, and learn the Perl debugger so you can inspect values interactively.
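
For a quick check outside the debugger that content() hands back an ordinary string rather than an object, something along these lines may help. It is only a sketch: the response is built by hand with HTTP::Response->new so it runs without network access, and the body text is invented for illustration. In the real script, $response would come from $ua->request($request).

```perl
use strict;
use warnings;
use HTTP::Response;

# Build a response by hand so the sketch runs offline.
# The body below is made up; it is not what the real site returns.
my $response = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html; charset=UTF-8' ],
    '<html><body>News, Weather, Entertainment</body></html>',
);

# Stop early if the request did not succeed.
die 'Request failed: ' . $response->status_line . "\n"
    unless $response->is_success;

my $content = $response->content;

# ref() returns the class name for an object and an empty string
# for a plain (non-reference) scalar such as a string.
print 'response is a ', ref($response), " object\n";
print 'content is ', (ref($content) eq '' ? 'a plain string' : 'a reference'), "\n";
print 'content length: ', length($content), "\n";
```

If the page uses a non-ASCII charset or a compressed transfer encoding, $response->decoded_content is usually the better call, since it applies the Content-Encoding and charset from the headers before returning the string.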
Mar 29 '07 #2
