httplib/socket problems reading 404 Not Found response

I am attempting to use a HEAD request against Amazon S3 to check
whether a file exists and, if it does, to parse the MD5 hash from
the ETag in the response. Comparing that hash with the local file lets
me skip uploads that are not necessary and so save bandwidth.

If the file exists, the HEAD works as expected and I get valid headers
back. I can pull the ETag out of them using
getheader('ETag')[1:-1] (the slice trims off the double quotes
in the string).
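
The comparison described above could be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: the helper names md5_from_etag, file_md5 and needs_upload are made up for the example, and it assumes the ETag is simply the quoted MD5 hex digest of the object, as the post describes. Fetching the header (and signing the request) is left out; the etag argument would be whatever getheader('ETag') returned.

```python
import hashlib


def md5_from_etag(etag):
    """Strip the surrounding double quotes from an ETag header value."""
    if etag.startswith('"') and etag.endswith('"'):
        return etag[1:-1]
    return etag


def file_md5(path, block_size=65536):
    """Return the MD5 hex digest of a local file, read in blocks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def needs_upload(path, etag):
    """True if there is no remote ETag or it does not match the local file."""
    if etag is None:
        return True
    return md5_from_etag(etag).lower() != file_md5(path)
```

Checking for the quotes explicitly, instead of slicing blindly with [1:-1], avoids chopping characters off a value that happens to arrive unquoted.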

The problem arises when I send a HEAD request for a file that does
not exist. As expected, Amazon sends back a 404 Not Found response;
however, my test scripts seem to hang. When I run Python under
trace.py, it hangs here:

--- modulename: httplib, funcname: _read_chunked
httplib.py(536): assert self.chunked != _UNKNOWN
httplib.py(537): chunk_left = self.chunk_left
httplib.py(538): value = ''
httplib.py(542): while True:
httplib.py(543): if chunk_left is None:
httplib.py(544): line = self.fp.readline()
--- modulename: socket, funcname: readline
socket.py(321): data = self._rbuf
socket.py(322): if size < 0:
socket.py(324): if self._rbufsize <= 1:
socket.py(326): assert data == ""
socket.py(327): buffers = []
socket.py(328): recv = self._sock.recv
socket.py(329): while data != "\n":
socket.py(330): data = recv(1)

It eventually completes with an exception here:

File "C:\Python25\lib\httplib.py", line 509, in read
return self._read_chunked(amt)
File "C:\Python25\lib\httplib.py", line 548, in _read_chunked
chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: ''

For reference, ethereal captured the following request and response:

HEAD <REMOVED> HTTP/1.1
Host: s3.amazonaws.com
Accept-Encoding: identity
Date: Tue, 13 Mar 2007 02:54:12 GMT
Authorization: AWS <REMOVED>

HTTP/1.1 404 Not Found
x-amz-request-id: E20B4C0D0C48B2EF
x-amz-id-2: <REMOVED>
Content-Type: application/xml
Transfer-Encoding: chunked
Date: Tue, 13 Mar 2007 02:54:16 GMT
Server: AmazonS3
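
One way to read the trace and the capture together (an interpretation, not something confirmed in this thread): the 404 advertises Transfer-Encoding: chunked, but a response to a HEAD request never carries a body, so a reader that goes looking for the first chunk-size line finds nothing. The sketch below imitates that single step in Python 3 with a hypothetical read_chunk_size helper; it is not the httplib code itself.

```python
import io


def read_chunk_size(fp):
    """Read one chunk-size line and parse it as hexadecimal."""
    line = fp.readline()
    # Drop any chunk extension after ';' and surrounding whitespace.
    size_field = line.split(b";", 1)[0].strip()
    return int(size_field, 16)


# A normal chunked body starts with a hex length such as "1a".
print(read_chunk_size(io.BytesIO(b"1a\r\nsome data...")))

# A HEAD response has no body: readline() returns an empty string,
# and parsing that as hex raises ValueError, as in the traceback.
try:
    read_chunk_size(io.BytesIO(b""))
except ValueError as error:
    print("ValueError:", error)
```

On a live socket the empty read presumably only arrives once the server closes the connection, which would account for the long pause before the exception.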

Am I doing something wrong? Is this a known issue? I am an
experienced developer, but pretty new to Python and dynamic languages
in general.

Thanks,
Patrick

Mar 13 '07 #1
On Tue, 13 Mar 2007 00:07:55 -0300, Patrick Altman <pa*****@gmail.com>
wrote:
> I am attempting to use a HEAD request against Amazon S3 to check
> whether a file exists or not and if it does parse the md5 hash from
> the ETag in the response to verify the contents of the file so as to
> save on bandwidth of uploading files when it is not necessary.
>
> The problem lies when I attempt to send a HEAD request when no file
> exists. As expected, a 404 Not Found response is sent back from
> Amazon however, my test scripts seem to hang. I run python with
> trace.py and it hangs here:

Yes, it's a known problem. See this message with a self-response:
http://mail.python.org/pipermail/pyt...ch/375087.html

--
Gabriel Genellina

Mar 13 '07 #2
> Yes, it's a known problem. See this message with a self-response:
> http://mail.python.org/pipermail/pyt...ch/375087.html

Are there plans to include this fix in the standard Python libraries
or must I make the modifications myself (I'm running Python 2.5)?
Mar 13 '07 #3
On Tue, 13 Mar 2007 10:38:24 -0300, Patrick Altman <pa*****@gmail.com>
wrote:
>> Yes, it's a known problem. See this message with a self-response:
>> http://mail.python.org/pipermail/pyt...ch/375087.html
> Are there plans to include this fix in the standard Python libraries
> or must I make the modifications myself (I'm running Python 2.5)?

Submit a bug report, if not already done.
http://sourceforge.net/tracker/?group_id=5470

--
Gabriel Genellina

Mar 13 '07 #4
On Mar 13, 3:16 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
wrote:
> On Tue, 13 Mar 2007 10:38:24 -0300, Patrick Altman <palt...@gmail.com>
> wrote:
>>> Yes, it's a known problem. See this message with a self-response:
>>> http://mail.python.org/pipermail/pyt...ch/375087.html
>> Are there plans to include this fix in the standard Python libraries
>> or must I make the modifications myself (I'm running Python 2.5)?
>
> Submit a bug report, if not already done.
> http://sourceforge.net/tracker/?group_id=5470
>
> --
> Gabriel Genellina

Bug already exists at:
https://sourceforge.net/tracker/inde...70&atid=105470

In the meantime, I implemented a workaround for my specific case in
the Amazon S3 library: I added a head() method that actually issues
a GET request with a very small byte range. This yields all the
header data that I need (MD5 hash in the ETag if the file exists,
404 Not Found if it doesn't).
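
A workaround along these lines might look something like the sketch below. It is only an illustration under assumptions: it uses Python 3's http.client (the module that was called httplib in Python 2.5), the names head_via_ranged_get and check_object are hypothetical, the one-byte range "bytes=0-0" is one possible choice of "very small", and any Authorization or Date headers that S3 requires would have to be supplied by the caller.

```python
import http.client


def head_via_ranged_get(conn, path, headers=None):
    """Emulate HEAD with a one-byte ranged GET.

    Returns (status, md5_hex_or_None). conn is any object with the
    request()/getresponse() interface of http.client.HTTPConnection.
    """
    request_headers = dict(headers or {})
    request_headers["Range"] = "bytes=0-0"  # ask for the first byte only
    conn.request("GET", path, headers=request_headers)
    response = conn.getresponse()
    response.read()  # drain the small body so the connection stays usable
    etag = response.getheader("ETag")
    if response.status == 404 or etag is None:
        return response.status, None
    # Assumes the ETag is the quoted MD5 hex digest, as in the post.
    return response.status, etag.strip('"')


def check_object(host, path, headers=None):
    """Open an HTTPS connection to host and run the ranged GET once."""
    conn = http.client.HTTPSConnection(host)
    try:
        return head_via_ranged_get(conn, path, headers)
    finally:
        conn.close()
```

Because the request is a GET, the chunked 404 body is actually present and can be read normally, which sidesteps the HEAD problem. A successful ranged request would normally come back as 206 Partial Content; a zero-length object may be answered differently, so the status is returned to the caller as well.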

Mar 14 '07 #5
