Downloading data from online site

Archanak
79 New Member
Hi,

I wanted to download some data from a website, so I used the Perl LWP module.

The code I wrote looks like this:

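    use LWP;
    use LWP::UserAgent;

    # Create the user agent and route HTTP/FTP requests through the proxy
    $ua = LWP::UserAgent->new;
    $ua->proxy(['http','ftp'], 'http://group:guest@proxy.ibab.ac.in:3128');

    # Open the output file (fail loudly if it cannot be created)
    open(FH, ">datafile.txt") or die "Cannot open datafile.txt: $!";
    chmod 0777, "datafile.txt";
    $url = "http://xyz.com";
    sleep(5);    # pauses 5 seconds before every request

    # Fetch the page and write its body to the file
    $response = $ua->get($url);
    $result = $response->content;
    print FH $result;
    close FH;
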
The data is written to the data file. I connect to that URL three times with different query terms. The problem I am facing is speed, i.e. the time taken to get the data.

Is there any other method which can reduce the time taken to get the data?

I tried fcgi, but there was no improvement!

Is the LWP module itself slow?

Any suggestions?

with regards
Archana
Dec 29 '08 #1
KevinADC
4,059 Recognized Expert Specialist
The time it takes the module to compile and connect to the remote site should not be significant. How long does it take? How long do you think it should take?
Dec 29 '08 #2
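
One way to answer those two questions is to time each stage separately. The following is a minimal sketch, assuming the core Time::HiRes module and a URL passed on the command line. The timed helper is a name invented here, and the proxy setup from the original script is left out for brevity.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);    # sub-second wall-clock timer (core module)

# Run a code reference; return (elapsed seconds, the code's return value).
sub timed {
    my ($code) = @_;
    my $start  = time();
    my $value  = $code->();
    return (time() - $start, $value);
}

# Only touch the network when a URL is given on the command line.
if (my $url = shift @ARGV) {
    my ($t_load)           = timed(sub { require LWP::UserAgent; 1 });
    my ($t_new, $ua)       = timed(sub { LWP::UserAgent->new(timeout => 60) });
    my ($t_get, $response) = timed(sub { $ua->get($url) });

    printf "load module : %.3f s\n", $t_load;
    printf "create agent: %.3f s\n", $t_new;
    printf "fetch page  : %.3f s (%s, %d bytes)\n",
        $t_get, $response->status_line, length($response->content);
}
```

If nearly all of the time shows up in the fetch line, the delay is in the network, the proxy, or the remote site rather than in loading LWP.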
Archanak
79 New Member
@KevinADC
Hi,

For one connection, i.e. fetching the data once, it takes 1 minute, so 3 minutes in total.

I am working on a Linux platform.

It should take about a minute at most for all 3 connections (since I am connecting 3 times).

Any suggestions??

with regards
Archana
Dec 30 '08 #3
KevinADC
4,059 Recognized Expert Specialist
If your script takes a full minute to fetch a short web page, then I would assume you just have a poor internet connection. On the other hand, if it is fetching a few megabytes in a minute, that doesn't seem too bad. But I have no idea how it could be sped up.
Dec 30 '08 #4
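
Two client-side changes are cheap to try, though neither helps if the remote site itself needs a minute per query. First, the sleep(5) in the original script adds five seconds to every fetch on its own. Second, one user agent created with keep_alive can be reused for all three requests, so they may share a connection when the server or proxy allows it. A minimal sketch under those assumptions: fetch_all and the datafile_N.txt names are invented here, the URLs come from the command line, and env_proxy assumes the proxy is configured through environment variables such as http_proxy.

```perl
use strict;
use warnings;

# Fetch every URL with the same agent, streaming each body straight to
# its own file. Returns a hash reference of URL => HTTP status line.
sub fetch_all {
    my ($ua, @urls) = @_;
    my %status;
    my $n = 0;
    for my $url (@urls) {
        my $file     = sprintf 'datafile_%d.txt', ++$n;
        my $response = $ua->get($url, ':content_file' => $file);
        $status{$url} = $response->status_line;
    }
    return \%status;
}

# Only touch the network when URLs are given on the command line.
if (@ARGV) {
    require LWP::UserAgent;
    # keep_alive => 1 enables a small connection cache for reuse.
    my $ua = LWP::UserAgent->new(keep_alive => 1, timeout => 60);
    $ua->env_proxy;    # read proxy settings from the environment
    my $status = fetch_all($ua, @ARGV);
    print "$_: $status->{$_}\n" for sort keys %$status;
}
```

Passing the agent in as an argument keeps the loop independent of how the agent is configured, so the explicit proxy call from the original script can be used instead of env_proxy.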
Archanak
79 New Member
@KevinADC
Hi,

Actually, the data is text, and my connection speed is about 100 Mbps.

Is the LWP module itself slow?

Is there any alternative to speed up the process?

with regards
Archana
Dec 30 '08 #5
numberwhun
3,509 Recognized Expert Moderator Specialist
Even if you are using a fast connection, a slow server at the other end will certainly affect the time as well, so please take that into account.

You could, as an alternative, try using the WWW::Mechanize module. It is quite the little gem and I have used it to fetch files from the internet.

Regards,

Jeff
Dec 30 '08 #6
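
A minimal sketch of the WWW::Mechanize approach, assuming a URL and an output file name are passed on the command line. The save_page helper is a name invented here. WWW::Mechanize is a subclass of LWP::UserAgent, so requests travel through the same underlying code; the gain is likely convenience rather than raw speed.

```perl
use strict;
use warnings;

# Fetch $url with a Mechanize-style object and save the body to $file.
sub save_page {
    my ($mech, $url, $file) = @_;
    $mech->get($url);              # dies on HTTP errors when autocheck is on
    $mech->save_content($file);    # writes the last response body to disk
    return $file;
}

# Only touch the network when both arguments are supplied.
if (@ARGV == 2) {
    require WWW::Mechanize;
    my $mech = WWW::Mechanize->new(autocheck => 1, timeout => 60);
    print "saved ", save_page($mech, @ARGV), "\n";
}
```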
Archanak
79 New Member
@numberwhun
Hi,

I installed the WWW::Mechanize module. The problem I am facing is that it works from the command prompt but not in the browser.

Here is the code:

    #!/usr/bin/perl

    use CGI;

    use CGI qw(:standard);

    use WWW::Mechanize;

    print header;

    my $m = WWW::Mechanize->new;

    $m->get("http://www.google.com");

    print $m->content();

It gives an error when I run it in the browser.

The error is:

    Error GETing http://www.google.com: Can't connect to www.google.com:80 (connect: timeout) at  line 13

How can I solve this problem?

Regards
Archana
Jan 17 '09 #7
numberwhun
3,509 Recognized Expert Moderator Specialist
I noticed in your script you didn't start off the CGI with something like this:

    print "Content-type: text/html\r\n\r\n";

You should probably check out a CGI tutorial before continuing.

Regards,

Jeff
Jan 17 '09 #8
KevinADC
4,059 Recognized Expert Specialist
@numberwhun
Jeff,

They have "print header;" in the script, which essentially does what you have suggested: it prints the HTTP header.
Jan 17 '09 #9
KevinADC
4,059 Recognized Expert Specialist
The code as posted works for me. It could be that a firewall is blocking Perl from accessing the internet.
Jan 17 '09 #10
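
One possible explanation, not confirmed in the thread: the first script reached the internet through an explicit proxy, while the CGI script sets none. From a shell, proxy settings may come from environment variables such as http_proxy, but a web server usually runs CGI scripts with a much smaller environment, which would match a connect timeout only in the browser. A minimal sketch that sets the proxy explicitly, assuming the proxy URL from the first script still applies. The fetch_via_proxy helper is a name invented here. The proxy method is inherited from LWP::UserAgent.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Point the agent at an explicit proxy, fetch the URL, and return
# (content, undef) on success or (undef, status line) on failure.
sub fetch_via_proxy {
    my ($agent, $proxy, $url) = @_;
    $agent->proxy(['http', 'ftp'], $proxy);
    my $response = $agent->get($url);
    return $response->is_success
        ? ($response->decoded_content, undef)
        : (undef, $response->status_line);
}

# GATEWAY_INTERFACE is set by the web server for CGI requests, so this
# part runs only when the script is invoked through the server.
if ($ENV{GATEWAY_INTERFACE}) {
    require CGI;
    require WWW::Mechanize;
    my $q = CGI->new;
    # autocheck => 0 so a failed fetch is reported instead of dying.
    my $m = WWW::Mechanize->new(autocheck => 0, timeout => 30);
    my ($content, $error) = fetch_via_proxy(
        $m, 'http://group:guest@proxy.ibab.ac.in:3128', 'http://www.google.com');
    print $q->header;
    print defined $content ? $content : "Fetch failed: $error";
}
```

Printing the status line in the page makes it visible in the browser whether the failure is still a timeout or something different, such as a proxy authentication error.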
