Downloading data from online site

Hi,

I wanted to download some data from a website, so I used the Perl LWP module.

The code I wrote was like this:

    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new;
    # Route HTTP and FTP requests through the proxy.
    $ua->proxy(['http', 'ftp'], 'http://group:guest@proxy.ibab.ac.in:3128');

    open(my $fh, '>', 'datafile.txt') or die "Cannot open datafile.txt: $!";
    chmod 0777, 'datafile.txt';
    my $url = 'http://xyz.com';
    # Pause 5 seconds before the request.
    sleep(5);
    my $response = $ua->get($url);
    my $result   = $response->content;
    print $fh $result;
    close $fh;

The data is written to datafile.txt. I connect to that URL 3 times with different query terms. The problem I am facing is the speed, i.e. the time taken to get the data.

Is there any other method that can reduce the time taken to get the data?

I tried FastCGI (fcgi) but saw no improvement.

Is the LWP module itself slow?

Any suggestions?

with regards
Archana
Dec 29 '08 #1
9 Replies


KevinADC
The time it takes the module to compile and connect to the remote site should not be significant. How long does it take? How long do you think it should take?
Dec 29 '08 #2
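
To answer the "how long does it take" question with numbers, the request itself can be timed separately from the rest of the script. The following is a minimal sketch, not the original poster's code: the `timed` helper is a hypothetical name, and the URL is taken from the command line so nothing is fetched unless one is supplied. It does not set the proxy from the original script.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);
use LWP::UserAgent;

# Run a code reference and return (elapsed seconds, its return value).
sub timed {
    my ($code) = @_;
    my $start  = time();
    my $value  = $code->();
    return (time() - $start, $value);
}

# Only hit the network when a URL is given on the command line.
if (my $url = shift @ARGV) {
    my $ua = LWP::UserAgent->new(timeout => 30);
    my ($elapsed, $response) = timed(sub { $ua->get($url) });
    printf "%s: %s in %.2f s\n", $url, $response->status_line, $elapsed;
}
```

If the reported time is close to the total run time, the delay is in the network, proxy, or remote server rather than in compiling or loading LWP.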

@KevinADC
Hi,

One connection, i.e. fetching the data once, takes 1 minute, so the total is 3 minutes.

I am working on a Linux platform.

It should take about a minute for all 3 connections (since I am connecting 3 times).

Any suggestions?

with regards
Archana
Dec 30 '08 #3

KevinADC
If your script takes a full minute to fetch a short webpage, then I would assume you just have a poor internet connection. On the other hand, if it is fetching a few megabytes in a minute, that doesn't seem too bad. But I have no idea how it could be sped up.
Dec 30 '08 #4

@KevinADC
Hi,

Actually, the data is text data, and my connection speed is about 100.0 Mbps.

Is the LWP module itself slow?

Is there any other alternative to speed up the process?

with regards
Archana
Dec 30 '08 #5
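
One thing that may be worth trying, independent of which module is used, is to make all three requests from a single script run with one shared agent and without the `sleep(5)` before each request. The sketch below is only an illustration under assumptions: the query parameter name `q`, the helper names `query_url` and `fetch_all`, and the command-line interface are hypothetical, and whether keep-alive helps depends on the proxy and the remote server.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use URI;

# Build the request URL for one query term ("q" is a hypothetical parameter name).
sub query_url {
    my ($base, $term) = @_;
    my $uri = URI->new($base);
    $uri->query_form(q => $term);
    return $uri->as_string;
}

# Fetch every term with the same agent; return a hash of term => body.
sub fetch_all {
    my ($ua, $base, @terms) = @_;
    my %body;
    for my $term (@terms) {
        my $response = $ua->get(query_url($base, $term));
        if ($response->is_success) {
            $body{$term} = $response->decoded_content;
        }
        else {
            warn "$term failed: ", $response->status_line, "\n";
        }
    }
    return %body;
}

# Only hit the network when a base URL (and terms) are given on the command line.
if (my $base = shift @ARGV) {
    # One agent for all requests; keep_alive lets it reuse open connections.
    my $ua = LWP::UserAgent->new(keep_alive => 1, timeout => 30);
    $ua->proxy(['http', 'ftp'], 'http://group:guest@proxy.ibab.ac.in:3128');
    my %body = fetch_all($ua, $base, @ARGV);
    printf "%s: %d characters\n", $_, length($body{$_}) for sort keys %body;
}
```

If each request still takes about a minute with this arrangement, the time is most likely spent at the proxy or the remote site, not in LWP.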

numberwhun
Even if you are using a fast connection, if the site you are fetching from is slow, that will certainly affect the time as well, so please take that into account.

You could, as an alternative, try using the WWW::Mechanize module. It is quite the little gem and I have used it to fetch files from the internet.

Regards,

Jeff
Dec 30 '08 #6

@numberwhun
Hi,

I installed the WWW::Mechanize module. The problem I am facing is that the script works from the command prompt but not in the browser.

Here is the code:

    #!/usr/bin/perl

    use CGI;

    use CGI qw(:standard);

    use WWW::Mechanize;

    print header;

    my $m = WWW::Mechanize->new;

    $m->get("http://www.google.com");

    print $m->content();

It gives an error when I run it in the browser.

The error is:

    Error GETing http://www.google.com: Can't connect to www.google.com:80 (connect: timeout) at  line 13

How can I solve this problem?

Regards
Archana
Jan 17 '09 #7

numberwhun
I noticed in your script you didn't start off the CGI with something like this:

    print "Content-type: text/html\r\n\r\n";

You should probably check out a CGI tutorial before continuing.

Regards,

Jeff
Jan 17 '09 #8

KevinADC
@numberwhun
Jeff,

They have "print header;" in the script, which essentially does what you have suggested: it prints the HTTP header.
Jan 17 '09 #9
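
To see the equivalence directly, the two headers can be printed side by side with their line endings made visible. This is a small sketch that assumes CGI.pm is installed; the variable names are illustrative only.

```perl
use strict;
use warnings;
use CGI qw(:standard);

# Header as produced by CGI.pm's header() with no arguments.
my $from_cgi = header();

# Header written by hand, as in the suggestion quoted in the thread.
my $by_hand = "Content-type: text/html\r\n\r\n";

# Make control characters visible so the two strings can be compared.
for my $h ($from_cgi, $by_hand) {
    (my $visible = $h) =~ s/\r/\\r/g;
    $visible =~ s/\n/\\n/g;
    print "$visible\n";
}
```

Both strings declare a text/html content type and end with a blank line, which is what separates the HTTP header from the page body.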

KevinADC
The code as posted works for me. It could be that a firewall is blocking Perl from accessing the internet.
Jan 17 '09 #10
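
Another possibility, offered only as a guess: the first script in this thread went through a proxy, while the CGI script sets none. A script started by the web server may not see the same environment as one started from the command prompt, so any proxy settings picked up there could be missing. WWW::Mechanize is a subclass of LWP::UserAgent, so the same `proxy` method is available. The sketch below assumes the proxy URL from the first script still applies; `proxied_agent` is a hypothetical helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI qw(:standard);
use WWW::Mechanize;

# Build a Mechanize agent that sends requests through an explicit proxy.
sub proxied_agent {
    my ($proxy_url) = @_;
    # autocheck => 0 so a failed request returns a response instead of dying.
    my $m = WWW::Mechanize->new(autocheck => 0, timeout => 30);
    $m->proxy(['http', 'ftp'], $proxy_url);
    return $m;
}

# Act as a CGI script only when the web server has set GATEWAY_INTERFACE.
if ($ENV{GATEWAY_INTERFACE}) {
    my $m = proxied_agent('http://group:guest@proxy.ibab.ac.in:3128');
    my $response = $m->get('http://www.google.com');
    print header;
    print $response->is_success
        ? $m->content
        : 'Fetch failed: ' . $response->status_line;
}
```

With `autocheck` turned off, a connection failure shows up as a status line in the page instead of an uncaught error, which makes it easier to tell a proxy or firewall problem from a script problem.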
