httplib continuation packets

After a long debugging session while scripting my webmail,
I believe I have traced the problem to the way httplib sends
POST requests.

I have compared tcpdump listings from Python 2.4.3 and 2.5.0's
httplib (via urllib/urllib2), Perl's LWP::UserAgent 2.033 and
Firefox 2.0. Only Python sends the request in such a way that
the mailserver closes the connection before I get any data from
the POST request (immediate FIN packet after the POST request).

httplib always sends the urlencoded POST data in a separate packet
from the HTTP headers, and this seems to cause problems with
the web interface in Ipswitch-IMail/8.05 (the software running on
Doteasy's webmail).

Firefox 2.0 has most of the headers in a single packet, but unlike
httplib, it always places a couple of headers in the continuation
packet as well (usually the content-length and content-type
headers).

LWP::UserAgent 2.033 doesn't use continuation at all, and sends
everything in a single packet.

Is this a bug in httplib or the web server? Is there a
workaround, or should I use Perl for this?

--
Haakon
Nov 11 '06 #1
Haakon Riiser wrote:
> Is this a bug in httplib or the web server?

it could be that they're blocking requests from Python's urllib, of
course. have you tried overriding the user-agent string?

</F>

Nov 11 '06 #2
[Fredrik Lundh]
> Haakon Riiser wrote:
>> Is this a bug in httplib or the web server?
>
> it could be that they're blocking requests from Python's urllib, of
> course. have you tried overriding the user-agent string?

Yes, and it doesn't help.

By the way, this is the closest thing I've found in the bug tracker:
https://sourceforge.net/tracker/?fun...&group_id=5470
The bug was closed in 2002 with this comment:

"I changed httplib to send requests as a single packet in rev
1.60. The change was made to address a performance problem,
but happens to fix the problem you had with the bogus
server, too."

Has someone changed it back since then?

--
Haakon
Nov 11 '06 #3
Haakon Riiser wrote:
> Yes, and it doesn't help.

then the server is most likely broken beyond repair.

> By the way, this is the closest thing I've found in the bug tracker:
> https://sourceforge.net/tracker/?fun...&group_id=5470
> The bug was closed in 2002 with this comment:
>
> "I changed httplib to send requests as a single packet in rev
> 1.60. The change was made to address a performance problem,
> but happens to fix the problem you had with the bogus
> server, too."
>
> Has someone changed it back since then?

nope; that change buffers the *header* part of the request to avoid
problems with certain TCP/IP mechanisms; see

http://svn.python.org/view?rev=27644&view=rev

for a discussion. note that there's still no guarantee that the entire
header is sent in a single TCP packet.

to see if this really is the problem, you could try moving the call to
self._send_output() from the end of the endheaders() method to the end
of the _send_request() method (around line 870 in httplib.py, at least
in 2.5).
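
One way to test the single-write hypothesis without patching httplib.py is to bypass it: build the whole request, headers and body, as one byte string and hand it to the socket in a single sendall() call. The sketch below is only an illustration under stated assumptions: it targets Python 3 (where httplib is named http.client) and uses nothing but the socket module; build_post_request and post_in_one_write are hypothetical helper names, and the host, path and form fields in the usage comment are placeholders. A single write still does not guarantee a single TCP packet; it only gives the TCP stack the chance to send one.

```python
import socket
from urllib.parse import urlencode


def build_post_request(host, path, fields):
    """Return a complete HTTP/1.0 POST request (headers + body) as bytes."""
    body = urlencode(fields).encode("ascii")
    header_lines = [
        "POST %s HTTP/1.0" % path,
        "Host: %s" % host,
        "Content-Type: application/x-www-form-urlencoded",
        "Content-Length: %d" % len(body),
        "Connection: close",
        "",  # these two empty strings produce the blank line
        "",  # that terminates the header block
    ]
    return "\r\n".join(header_lines).encode("ascii") + body


def post_in_one_write(host, port, request, timeout=10.0):
    """Send the request with one sendall() call and return the raw response."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the server closed its side
                break
            chunks.append(data)
    return b"".join(chunks)


# Usage (placeholder host and form fields, not taken from the thread):
#   req = build_post_request("mail.example.com", "/login",
#                            [("user", "me"), ("password", "secret")])
#   print(post_in_one_write("mail.example.com", 80, req)[:200])
```

If the server answers normally to this but not to the same request sent through httplib, that points at how the request is split rather than at its content.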

</F>

Nov 11 '06 #4
[Fredrik Lundh]
> Haakon Riiser wrote:
>> Yes, and it doesn't help.
>
> then the server is most likely broken beyond repair.

It's not in my power to upgrade the server, unfortunately.
Guess I'll have to use Perl.

> to see if this really is the problem, you could try moving the call to
> self._send_output() from the end of the endheaders() method to the end
> of the _send_request() method (around line 870 in httplib.py, at least
> in 2.5).

Tried this, but the tcpdump still looks the same (two packets: one
with the headers, one with the body), and now it fails with

urllib2.HTTPError: HTTP Error 501: Not Implemented

Nevertheless, I'm fairly sure that the packet fragmentation is
the culprit. It works perfectly with Perl, even when I make
no effort at all to spoof the browser (no user-agent, referer,
cookies, etc.).

--
Haakon
Nov 11 '06 #5
Haakon Riiser wrote:
> [Fredrik Lundh]
>> Haakon Riiser wrote:
>>> Yes, and it doesn't help.
>>
>> then the server is most likely broken beyond repair.
>
> It's not in my power to upgrade the server, unfortunately.
> Guess I'll have to use Perl.
>
>> to see if this really is the problem, you could try moving the call to
>> self._send_output() from the end of the endheaders() method to the end
>> of the _send_request() method (around line 870 in httplib.py, at least
>> in 2.5).
>
> Tried this, but the tcpdump still looks the same (two packets: one
> with the headers, one with the body), and now it fails with
>
> urllib2.HTTPError: HTTP Error 501: Not Implemented
>
> Nevertheless, I'm fairly sure that the packet fragmentation is
> the culprit. It works perfectly with Perl, even when I make
> no effort at all to spoof the browser (no user-agent, referer,
> cookies, etc.).

It really does seem quite bizarre that a server should respond
differently to the same TCP request when it is split differently into IP
datagrams.

There really is nothing wrong (from a standards point of view) with
sending FIN with your last data segment. FIN means "I guarantee to send
no more data, and will continue to acknowledge your segments until I see
your FIN".
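
The half-close that FIN implies can be observed on a loopback connection. This is a minimal sketch, assuming Python 3 and the standard socket module; half_close_demo is a hypothetical name and the payload strings are made up. The client shuts down its sending side, which makes the stack send FIN; the server then reads end-of-file but can still send a reply, which the client receives.

```python
import socket
import threading


def half_close_demo():
    """Show that a peer can still receive data after it has sent FIN."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    seen_by_server = []

    def serve():
        conn, _ = listener.accept()
        with conn:
            while True:
                data = conn.recv(4096)
                if not data:  # b"" means the client's FIN has arrived
                    break
                seen_by_server.append(data)
            # The client has promised to send nothing more, but the
            # server-to-client direction is still open.
            conn.sendall(b"reply after client FIN")

    worker = threading.Thread(target=serve)
    worker.start()

    with socket.create_connection(("127.0.0.1", port), timeout=5.0) as client:
        client.sendall(b"last data from client")
        client.shutdown(socket.SHUT_WR)  # half-close: sends FIN
        reply = b""
        while True:
            data = client.recv(4096)
            if not data:  # server closed its side as well
                break
            reply += data

    worker.join()
    listener.close()
    return b"".join(seen_by_server), reply
```

Calling half_close_demo() returns the bytes the server read before end-of-file and the reply the client received after its own FIN.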

Are you planning to report this bug to Ipswitch? It certainly sounds
like someone should.

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Nov 13 '06 #6
[Steve Holden]
> It really does seem quite bizarre that a server should respond
> differently to the same TCP request when it is split differently into IP
> datagrams.
>
> There really is nothing wrong (from a standards point of view) with
> sending FIN with your last data segment. FIN means "I guarantee to send
> no more data, and will continue to acknowledge your segments until I see
> your FIN".

It is the server that sends the FIN. What happens is this (each
line corresponds to one packet):

client: POST request headers
client: POST request body
server: FIN + ACK

On receiving the FIN + ACK packet, Python gets immediate
end-of-file on the POST request. Unless the order of the
parameters in the POST request matters (I haven't yet tested this),
I have no other explanation than the fragmentation. If Ipswitch
bothers to reply to my bug report, I'll look into it. Otherwise,
I'm not wasting any more time on this -- it's not that big a deal
for me personally, since I have already scripted the stuff I needed
with Perl.
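
To compare how a request arrives when the client writes it in one piece versus two, a throwaway loopback server that logs every recv() result can help. This is only a sketch under assumptions: Python 3, the standard socket module, and a hypothetical helper named record_request. The boundaries recv() reports are not packet boundaries, so tcpdump remains the authoritative view. One plausible but unconfirmed failure mode is a server that treats the first read as the complete request; nothing here shows that IMail does this.

```python
import socket
import threading


def record_request(writes):
    """Send each item of `writes` with its own sendall() call and
    return the list of chunks the receiving side got from recv()."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    chunks = []

    def serve():
        conn, _ = listener.accept()
        with conn:
            while True:
                data = conn.recv(65536)
                if not data:  # end-of-file: the client is done sending
                    break
                chunks.append(data)

    worker = threading.Thread(target=serve)
    worker.start()

    with socket.create_connection(("127.0.0.1", port), timeout=5.0) as client:
        # Ask the stack not to hold back small writes (Nagle's algorithm).
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for piece in writes:
            client.sendall(piece)
        client.shutdown(socket.SHUT_WR)

    worker.join()
    listener.close()
    return chunks


# Usage: headers and body written separately, then as one buffer.
#   headers = b"POST /login HTTP/1.0\r\nContent-Length: 3\r\n\r\n"
#   body = b"a=1"
#   print(record_request([headers, body]))
#   print(record_request([headers + body]))
```

However the chunks come out, joining them must give back the full request, which is what a robust server has to rely on.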

> Are you planning to report this bug to Ipswitch? It certainly sounds
> like someone should.

I quickly browsed through ipswitch.com, but couldn't find any good
place to submit bugs. I ended up using the product feedback web
form. Wrote a one-line summary, and referred to this thread on
Google Groups.

--
Haakon
Nov 13 '06 #7
