Hi guys and gals,
I'm wrestling with the urlretrieve function in the urllib module. I
want to download a file from a web server and save it locally with the
same name. The problem is the URL - it has the form http://www.page.com/?download=12345. It doesn't reveal the file name.
Some hints to point me in the right direction are greatly appreciated.
Sven
At Thursday 9/11/2006 19:11, Sven wrote:
>I'm wrestling with the urlretrieve function in the urllib module. I want to download a file from a web server and save it locally with the same name. The problem is the URL - it's on the form http://www.page.com/?download=12345. It doesn't reveal the file name. Some hints to point me in the right direction are greatly appreciated.
The file name *may* come in the Content-Disposition header (e.g.
Content-Disposition: attachment; filename="budget.xls").
Use urlopen to obtain a file-like object; its info() method gives you
those headers.
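
A minimal sketch of that suggestion, assuming Python 3 (where urlopen lives in urllib.request; the thread itself predates it). The helper names filename_from_headers and suggested_filename are made up for illustration. The object returned by info() is an email-style message, so its get_filename() method can parse the header:

```python
import os
from urllib.request import urlopen


def filename_from_headers(headers):
    """Return the file name from Content-Disposition, or None if absent.

    `headers` is an email.message.Message-like object, such as the one
    returned by response.info().
    """
    name = headers.get_filename()
    # Keep only the last path component in case the server sends a path.
    return os.path.basename(name) if name else None


def suggested_filename(url):
    """Open `url` and return the file name the server suggests, if any."""
    with urlopen(url) as response:
        return filename_from_headers(response.info())
```

If the server sends no Content-Disposition header, suggested_filename() returns None and the name has to come from somewhere else.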
--
Gabriel Genellina
Softlab SRL
__________________________________________________
Correo Yahoo!
Space for all your messages, with antivirus and antispam, free!
Open your account now! - http://correo.yahoo.com.ar
Hello Gabriel,
Thanks for your help, but I'm a guy with no luck. :-) I can't get the
file name from the response header...
At Thursday 9/11/2006 20:52, Sven wrote:
>Thanks for your help, but I'm a guy with no luck. :-) I can't get the file name from response header...
Try using a browser and "Save as..."; if it suggests a file name, it
*must* be in the headers - so look again carefully.
If it does not suggest a file name, the server is not providing one
(there is no obligation to do so).
--
Gabriel Genellina
Softlab SRL
Yes the browser suggests a file name, but I did a little research using http://web-sniffer.net/. The Response Header contains roughly this:
HTTP Status Code: HTTP/1.1 302 Found
Location: http://page.com/filename.zip
Content-Length: 0
Connection: close
Content-Type: text/html
The status code 302 tells the browser where to find the file. The funny
thing is that calling the info() function, on the file-like response
object, in Python doesn't return the same header. I'm so stuck. :-)
Thanks for your help.
At Friday 10/11/2006 16:58, Sven wrote:
>Yes the browser suggests a file name, but I did a little research using http://web-sniffer.net/. The Response Header contains roughly this:
>HTTP Status Code: HTTP/1.1 302 Found
>Location: http://page.com/filename.zip
>Content-Length: 0
>Connection: close
>Content-Type: text/html
>The status code 302 tells the browser where to find the file. The funny thing is that calling the info() function, on the file-like response object, in Python doesn't return the same header. I'm so stuck. :-) Thanks for your help.
Because urlopen is smart enough to detect the redirection and do a
second request.
You can use the geturl() method to obtain the true URL used (that
would be http://page.com/filename.zip) and then rename the file.
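
A rough sketch of the geturl() approach, assuming Python 3's urllib.request. The names filename_from_url and download_with_server_name, and the fallback name "download.bin", are hypothetical:

```python
import posixpath
import shutil
from urllib.parse import unquote, urlsplit
from urllib.request import urlopen


def filename_from_url(url):
    """Return the last path component of `url` ('' if the path has none)."""
    path = urlsplit(url).path
    return posixpath.basename(unquote(path))


def download_with_server_name(url, fallback="download.bin"):
    """Download `url`, naming the local file after the final, redirected URL."""
    with urlopen(url) as response:
        # urlopen has already followed any redirect; geturl() reports
        # the URL that was actually fetched.
        name = filename_from_url(response.geturl()) or fallback
        with open(name, "wb") as out:
            shutil.copyfileobj(response, out)
    return name
```

Writing straight to the final name avoids the separate rename step; the fallback only applies when the final URL has no file name in its path.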
Or, you can install your own URLopener (I think a FancyURLopener with
retries=0 would be OK) and process the Location header yourself. See
the urllib documentation.
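
One way to process the Location header yourself, sketched for Python 3, where URLopener and FancyURLopener are deprecated. This swaps in a different technique from the one named above: a custom HTTPRedirectHandler from urllib.request that declines to follow redirects. The names NoRedirect and redirect_target are made up for illustration:

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Decline every redirect so the 3xx response surfaces as an HTTPError."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None tells urllib not to issue a follow-up request.
        return None


def redirect_target(url):
    """Return the Location header if `url` redirects, else None."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url):
            return None  # no redirect happened
    except urllib.error.HTTPError as err:
        location = err.headers.get("Location")
        if location is None:
            raise  # a genuine HTTP error, not a redirect
        return location
```

The returned URL can then be fetched in a second request and its last path component used as the local file name.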
--
Gabriel Genellina
Softlab SRL
>You can use the geturl() method to obtain the true URL used (that
>would be http://page.com/filename.zip) and then rename the file.
Thanks mate, this was exactly what I needed. A really clean and simple
solution to my problem. :-)