Downloading data from online site

Archanak
79 New Member
Hi,

I wanted to download some data from a website, so I used the Perl LWP module.

The code I wrote was like this:

    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new;
    # Send HTTP and FTP requests through the proxy
    $ua->proxy(['http', 'ftp'], 'http://group:guest@proxy.ibab.ac.in:3128');

    open(my $fh, '>', 'datafile.txt') or die "Cannot open datafile.txt: $!";
    chmod 0777, 'datafile.txt';
    my $url = 'http://xyz.com';
    sleep(5);
    my $response = $ua->get($url);
    my $result   = $response->content;
    print $fh $result;    # the posted code was missing this semicolon
    close $fh;

The data is written to datafile.txt. I connect to that URL three times with different query terms. The problem I am facing is speed, i.e. the time taken to get the data.

Is there any other method which can reduce the time taken to get the data?

I tried FastCGI, but it brought no improvement.

Is the LWP module itself slow?

Any suggestions?

with regards
Archana
Dec 29 '08 #1
KevinADC
4,059 Recognized Expert Specialist
The time it takes the module to compile and connect to the remote site should not be significant. How long does it take? How long do you think it should take?
Dec 29 '08 #2
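One way to answer the "how long does it take?" question is to time each request on its own. The following is a minimal sketch, not the poster's code: the `timed` helper is a hypothetical name, and the URLs are taken from the command line. Note that the script posted above also calls sleep(5) before its request, which adds five seconds per run regardless of network speed.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);    # sub-second resolution for time()
use LWP::UserAgent;

# Run a code reference and return (elapsed seconds, its return value).
# 'timed' is a helper name invented for this sketch.
sub timed {
    my ($code) = @_;
    my $start = time;
    my $value = $code->();
    return (time - $start, $value);
}

my $ua = LWP::UserAgent->new(timeout => 30);
$ua->env_proxy;    # use http_proxy etc. from the environment, if set

# URLs come from the command line; nothing is fetched without arguments.
for my $url (@ARGV) {
    my ($elapsed, $response) = timed(sub { $ua->get($url) });
    printf "%-40s %s  %.2fs  %d bytes\n",
        $url, $response->status_line, $elapsed, length($response->content);
}
```

Run with the three query URLs as arguments, this shows whether one particular request, or every request, accounts for the delay.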
Archanak
79 New Member
@KevinADC
Hi,

One connection, i.e. fetching the data once, takes 1 minute, so 3 minutes in total.

I am working on a Linux platform.

It should take at most about a minute for all 3 connections (since I am connecting 3 times).

Any suggestions??

with regards
Archana
Dec 30 '08 #3
KevinADC
4,059 Recognized Expert Specialist
If your script takes a full minute to fetch a short web page, then I would assume you just have a poor internet connection. On the other hand, if it is fetching a few megabytes in a minute, that doesn't seem too bad. But I have no idea how it could be sped up.
Dec 30 '08 #4
Archanak
79 New Member
@KevinADC
Hi,

Actually, the data is plain text, and my connection speed is about 100 Mbps.

Is the LWP module itself slow?

Is there any alternative that would speed up the process?

with regards
Archana
Dec 30 '08 #5
numberwhun
3,509 Recognized Expert Moderator Specialist
Even if you are using a fast connection, if the site you are fetching from is slow, that will certainly affect the time as well, so please take that into account.

You could, as an alternative, try using the WWW::Mechanize module. It is quite the little gem and I have used it to fetch files from the internet.

Regards,

Jeff
Dec 30 '08 #6
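If the remote server is the slow part, switching modules may not change much, since WWW::Mechanize is itself built on LWP::UserAgent. One client-side thing that can be tried is reusing a single agent with keep-alive for all three queries, so an open connection can be reused. This is a sketch under assumptions: the `query_url` helper and the `term` parameter name are hypothetical, and whether it helps depends on where the time is actually spent.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use URI;

# Build a URL carrying one query parameter.
# 'term' is a hypothetical parameter name; use whatever the site expects.
sub query_url {
    my ($base, $term) = @_;
    my $uri = URI->new($base);
    $uri->query_form(term => $term);
    return $uri->as_string;
}

# One agent for all requests, with a connection cache enabled.
my $ua = LWP::UserAgent->new(keep_alive => 1, timeout => 30);
$ua->env_proxy;    # use http_proxy etc. from the environment, if set

# Usage: perl fetch.pl http://xyz.com/ term1 term2 term3
my ($base, @terms) = @ARGV;
my $n = 0;
for my $term (@terms) {
    $n++;
    my $response = $ua->get(query_url($base, $term));
    if ($response->is_success) {
        open(my $fh, '>', "datafile_$n.txt") or die "Cannot write file: $!";
        print {$fh} $response->content;
        close $fh;
    }
    else {
        warn "Fetch for '$term' failed: ", $response->status_line, "\n";
    }
}
```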
Archanak
79 New Member
@numberwhun
Hi,

I installed the WWW::Mechanize module. The problem I am facing is that the script works from the command prompt but not in the browser.

Here is the code:

  1. #!/usr/bin/perl
  2.  
  3. use CGI;
  4.  
  5. use CGI qw(:standard);
  6.  
  7. use WWW::Mechanize;
  8.  
  9. print header;
  10.  
  11. my $m = WWW::Mechanize->new;
  12.  
  13. $m->get("http://www.google.com");
  14.  
  15. print $m->content();

It gives an error when I run it in the browser.

Error is:

    Error GETing http://www.google.com: Can't connect to www.google.com:80 (connect: timeout) at  line 13

How can I solve this problem?

Regards
Archana
Jan 17 '09 #7
numberwhun
3,509 Recognized Expert Moderator Specialist
I noticed in your script you didn't start off the CGI with something like this:

    print "Content-type: text/html\r\n\r\n";

You should probably check out a CGI tutorial before continuing.

Regards,

Jeff
Jan 17 '09 #8
KevinADC
4,059 Recognized Expert Specialist
@numberwhun
Jeff,

They have "print header;" in the script, which essentially does what you have suggested: it prints the HTTP header.
Jan 17 '09 #9
KevinADC
4,059 Recognized Expert Specialist
The code as posted works for me. It could be that a firewall is blocking Perl from accessing the internet.
Jan 17 '09 #10
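A script that connects from the command prompt but times out under the web server often points to a difference in environment. One possibility, offered as a guess: WWW::Mechanize reads proxy settings from environment variables such as http_proxy by default, and a web server process usually does not inherit the shell's variables. The sketch below sets the proxy explicitly and reports failures instead of dying. The `make_agent` and `fetch_or_error` helpers are hypothetical names, and it assumes the proxy from the first script is also needed here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI qw(:standard);
use WWW::Mechanize;

# Build an agent that does not die on HTTP errors and, when a proxy URL
# is given, sends HTTP requests through it.
sub make_agent {
    my ($proxy_url) = @_;
    my $m = WWW::Mechanize->new(autocheck => 0, timeout => 20);
    $m->proxy('http', $proxy_url) if defined $proxy_url;
    return $m;
}

# Fetch a page and return either its content or a readable error message.
sub fetch_or_error {
    my ($m, $url) = @_;
    my $response = $m->get($url);
    return $response->is_success
        ? $m->content
        : 'Fetch failed: ' . $response->status_line;
}

# Web servers set GATEWAY_INTERFACE for CGI scripts, so the page is only
# produced when the script really runs under a server.
if ($ENV{GATEWAY_INTERFACE}) {
    # Assumption: the proxy used in the first script is required here too.
    my $m = make_agent('http://group:guest@proxy.ibab.ac.in:3128');
    print header;
    print fetch_or_error($m, 'http://www.google.com');
}
```

If the request still times out with the proxy set, the web server's error log and any firewall rules that apply to the server's user account would be the next things to check.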
