httplib/socket problems reading 404 Not Found response

I am attempting to use a HEAD request against Amazon S3 to check
whether a file exists and, if it does, parse the md5 hash from
the ETag in the response to verify the contents of the file, so as to
save the bandwidth of uploading files when it is not necessary.

If the file exists, the HEAD works as expected and I get valid headers
back that I can parse, pulling the ETag out of the dictionary using
getheader('ETag')[1:-1] (using the slice to trim off the double quotes
in the string).
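
A minimal sketch of the comparison described above, assuming (as the post does) that the ETag is the hex md5 of the stored object. The helper names etag_to_md5, local_md5 and needs_upload are made up for illustration and are not part of any S3 library.

```python
import hashlib


def etag_to_md5(etag):
    """Strip the surrounding double quotes from an ETag header value."""
    return etag.strip('"')


def local_md5(path, block_size=64 * 1024):
    """Hex md5 of a local file, read in blocks to keep memory use flat."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def needs_upload(path, etag):
    """Return True when the remote copy is missing or differs.

    etag is the raw ETag header value, or None when the object
    does not exist (for example after a 404 response).
    """
    if etag is None:
        return True
    return etag_to_md5(etag) != local_md5(path)
```

strip('"') does the same job as the [1:-1] slice but leaves an unquoted value untouched.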

The problem arises when I send a HEAD request for a file that does not
exist. As expected, a 404 Not Found response is sent back from
Amazon; however, my test scripts seem to hang. I ran Python with
trace.py and it hangs here:

--- modulename: httplib, funcname: _read_chunked
httplib.py(536): assert self.chunked != _UNKNOWN
httplib.py(537): chunk_left = self.chunk_left
httplib.py(538): value = ''
httplib.py(542): while True:
httplib.py(543): if chunk_left is None:
httplib.py(544): line = self.fp.readline()
--- modulename: socket, funcname: readline
socket.py(321): data = self._rbuf
socket.py(322): if size < 0:
socket.py(324): if self._rbufsize <= 1:
socket.py(326): assert data == ""
socket.py(327): buffers = []
socket.py(328): recv = self._sock.recv
socket.py(329): while data != "\n":
socket.py(330): data = recv(1)

It eventually completes with an exception here:

File "C:\Python25\lib\httplib.py", line 509, in read
return self._read_chunked(amt)
File "C:\Python25\lib\httplib.py", line 548, in _read_chunked
chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: ''
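
The trace and traceback above suggest that the reader waits for a chunk-size line that never arrives (a HEAD response carries no body, whatever Transfer-Encoding says) and, once the connection closes, tries to parse an empty line as hex. The sketch below reproduces that failure in isolation and shows one way a reader can avoid it by never reading a body for HEAD. It is a simplified illustration written for Python 3, not the httplib source; read_chunked and read_body are hypothetical names, and trailers after the last chunk are ignored.

```python
def read_chunked(fp):
    """Decode a chunked body from a binary file-like object.

    Raises ValueError if the chunk-size line is empty, which is
    what happens when there is no body to read at all.
    """
    body = b""
    while True:
        line = fp.readline()
        # Drop any chunk extension after ';' and parse the hex size.
        size = int(line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break
        body += fp.read(size)
        fp.readline()  # consume the CRLF that follows the chunk data
    return body


def read_body(method, headers, fp):
    """Return the response body, never attempting to read one for HEAD."""
    if method.upper() == "HEAD":
        return b""
    if headers.get("transfer-encoding", "").lower() == "chunked":
        return read_chunked(fp)
    return fp.read()
```

With this guard, the 404 response shown in the capture below would yield an empty body instead of blocking on the socket.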

For reference, ethereal captured the following request and response:

HEAD <REMOVED> HTTP/1.1
Host: s3.amazonaws.com
Accept-Encoding: identity
Date: Tue, 13 Mar 2007 02:54:12 GMT
Authorization: AWS <REMOVED>

HTTP/1.1 404 Not Found
x-amz-request-id: E20B4C0D0C48B2EF
x-amz-id-2: <REMOVED>
Content-Type: application/xml
Transfer-Encoding: chunked
Date: Tue, 13 Mar 2007 02:54:16 GMT
Server: AmazonS3

Am I doing something wrong? Is this a known issue? I am an
experienced developer, but pretty new to Python and dynamic languages
in general.

Thanks,
Patrick

Mar 13 '07 #1
On Tue, 13 Mar 2007 00:07:55 -0300, Patrick Altman <pa*****@gmail.com>
wrote:
> I am attempting to use a HEAD request against Amazon S3 to check
> whether a file exists or not and if it does parse the md5 hash from
> the ETag in the response to verify the contents of the file so as to
> save on bandwidth of uploading files when it is not necessary.
>
> The problem lies when I attempt to send a HEAD request when no file
> exists. As expected, a 404 Not Found response is sent back from
> Amazon however, my test scripts seem to hang. I run python with
> trace.py and it hangs here:

Yes, it's a known problem. See this message with a self-response:
http://mail.python.org/pipermail/pyt...ch/375087.html

--
Gabriel Genellina

Mar 13 '07 #2
> Yes, it's a known problem. See this message with a self-response:
> http://mail.python.org/pipermail/pyt...ch/375087.html

Are there plans to include this fix in the standard Python libraries
or must I make the modifications myself (I'm running Python 2.5)?
Mar 13 '07 #3
On Tue, 13 Mar 2007 10:38:24 -0300, Patrick Altman <pa*****@gmail.com>
wrote:
>> Yes, it's a known problem. See this message with a self-response:
>> http://mail.python.org/pipermail/pyt...ch/375087.html
> Are there plans to include this fix in the standard Python libraries
> or must I make the modifications myself (I'm running Python 2.5)?

Submit a bug report, if not already done.
http://sourceforge.net/tracker/?group_id=5470

--
Gabriel Genellina

Mar 13 '07 #4
On Mar 13, 3:16 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
wrote:
> On Tue, 13 Mar 2007 10:38:24 -0300, Patrick Altman <palt...@gmail.com>
> wrote:
>>> Yes, it's a known problem. See this message with a self-response:
>>> http://mail.python.org/pipermail/pyt...ch/375087.html
>> Are there plans to include this fix in the standard Python libraries
>> or must I make the modifications myself (I'm running Python 2.5)?
>
> Submit a bug report, if not already done.
> http://sourceforge.net/tracker/?group_id=5470
>
> --
> Gabriel Genellina

Bug already exists at:
https://sourceforge.net/tracker/inde...70&atid=105470

In the meantime, I implemented a workaround for my specific case in
the Amazon S3 library: I added a head() method that actually issues
a GET request with a very small byte range. This yields all the
header data that I need (md5 hash in the ETag if the file exists,
404 Not Found if it doesn't).
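
For anyone wanting to try the same workaround, here is a rough sketch of the idea, written for Python 3's http.client rather than the 2.5 httplib used in this thread. The function name probe_etag is hypothetical, the one-byte range "bytes=0-0" is just one possible choice of "very small", and S3 request signing is left out: the caller is assumed to pass any Date and Authorization headers in.

```python
def probe_etag(conn, path, headers=None):
    """Check for an object with a one-byte ranged GET instead of HEAD.

    conn is an http.client.HTTPConnection (or anything with the same
    request/getresponse interface). Returns the unquoted ETag, or None
    when the server answers 404 Not Found.
    """
    request_headers = dict(headers or {})
    request_headers["Range"] = "bytes=0-0"  # ask for the first byte only
    conn.request("GET", path, headers=request_headers)
    response = conn.getresponse()
    response.read()  # drain the small body so the connection can be reused
    if response.status == 404:
        return None
    if response.status not in (200, 206):
        raise RuntimeError("unexpected status %d" % response.status)
    etag = response.getheader("ETag")
    return etag.strip('"') if etag else None
```

Because a GET response really does carry a body, the chunked 404 is read normally and nothing hangs. One case this sketch does not handle: a zero-length object may be answered with a range error rather than 200 or 206, which would surface here as the RuntimeError.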

Mar 14 '07 #5
