
Making a socket connection via a proxy server

In a nutshell - the question I'm asking is, how do I make a socket
connection go via a proxy server?
All our internet traffic has to go through a proxy-server at location
'dav-serv:8080' and I need to make a socket connection through it.

The reason (with code example) is as follows :

I am hacking "Tiny HTTP Proxy" by SUZUKI Hisao to make an http proxy
that modifies URLs. I haven't got very far - having started from zero
knowledge of the 'hyper text transfer protocol'.

It looks like the Tiny HTTP Proxy (using BaseHTTPServer as its
foundation) intercepts all requests to local addresses and then
re-implements the request (whether it is CONNECT, GET, PUT or
whatever). It logs everything that goes through it - I will simply
edit it to amend the URL that is being asked for.

It looks like the CONNECT and GET requests are just implemented using
simple socket commands. (I say simple because there isn't a lot of
code - I'm not familiar with the actual behaviour of sockets, but it
doesn't look too complicated).

What I need to do is rewrite the soc.connect(host_port) line in the
following example so that it connects *via* my proxy-server. (which it
doesn't by default).

I think the current format of host_port is a tuple : (host_domain,
port_no)

Below is a summary of the GET command (I've inlined all the method
calls - this example starts from the do_GET method) :

soc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
soc.connect(host_port)
soc.send("%s %s %s\r\n" % (
    self.command,
    urlparse.urlunparse(('', '', path, params, query, '')),
    self.request_version))
self.headers['Connection'] = 'close'
del self.headers['Proxy-Connection']
for key_val in self.headers.items():
    soc.send("%s: %s\r\n" % key_val)
soc.send("\r\n")

max_idling = 20  # this is really part of a self._read_write method
iw = [self.connection, soc]
ow = []
count = 0
while 1:
    count += 1
    (ins, _, exs) = select.select(iw, ow, iw, 3)
    if exs: break
    if ins:
        for i in ins:
            if i is soc:
                out = self.connection
            else:
                out = soc
            data = i.recv(8192)
            if data:
                out.send(data)
                count = 0
    else:
        print "\t" "idle", count
    if count == max_idling: break

print "\t" "bye"
soc.close()
self.connection.close()

Regards,
Fuzzy

http://www.voidspace.org.uk/atlantib...thonutils.html
Jul 18 '05 #1
4 Replies


> Fuzzyman wrote:
In a nutshell - the question I'm asking is, how do I make a socket
connection go via a proxy server?
All our internet traffic has to go through a proxy-server at location
'dav-serv:8080' and I need to make a socket connection through it.

On Friday, 30 July 2004 at 18:51, Diez B. Roggisch wrote: Short answer: It's not possible.


Longer answer: it is possible if you use DNAT on some router between the
computer which opens the request and the destination machine. Check out squid
transparent proxy howtos you can find on the net. The protocol will need
HTTP/1.1 for this, though.

Small example, which clarifies why this is possible:

Computer 1 opens http (port 80) connection to computer 2.

Router 1 sits in the middle, sees a port 80 connection is opened to some
computer 2, and rewrites the incoming packet to have a new destination
address/port (DNAT), namely proxy 1 with port 3128 (standard http-proxy port,
at least for squid), and a new source address/port (SNAT), namely router 1
with some port.

Proxy 1 gets the following (from router 1):

GET /foo.html HTTP/1.1
Host: www.foo.com:80
<other headers>

Proxy 1 opens the connection to www.foo.com port 80 (the router sees that
this connection comes from the proxy and must not rewrite its address), gets
the result, and stores it locally.

Proxy 1 then sends the packets back to router 1, because the proxy request
seems to have come from the router. (If you leave out SNAT in the rewriting
step, the request will seem to have come from the actual computer. That is
fine too, but then you have to make sure that the return packets also go
through the router.) Router 1 now reverses the DNAT and SNAT to return the
packet to computer 1, which will see a source address of computer 2 and port
80 on the packet.

Computer 1 sees the result and thinks it came from the outside machine,
although through the SNAT/DNAT the packets actually originated from the
proxy.

This is basically it.

If you want to implement this, as I said, read up on transparent proxy howtos
for squid. Pretty much every proxy can be made to support this, as with
HTTP/1.1 the Host: header is a required header, and thus the proxy can always
extract the host which was queried from the request, even when it isn't
passed as the others have suggested.
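
As a rough illustration of the point about the Host: header, here is a minimal Python 3 sketch of how a transparent proxy might recover the destination from an intercepted request. The function name target_from_host_header and the default_port parameter are made up for this example and are not part of squid or Tiny HTTP Proxy; IPv6 literal hosts are not handled.

```python
def target_from_host_header(raw_request, default_port=80):
    """Return (host, port) taken from the Host header of a raw HTTP request.

    Hypothetical helper for illustration only; it does not handle
    IPv6 literals such as [::1]:8080.
    """
    # Only the header section (before the blank line) is of interest.
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    for line in lines[1:]:  # lines[0] is the request line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            host, _, port = value.strip().partition(":")
            return (host, int(port) if port else default_port)
    raise ValueError("request has no Host header")


if __name__ == "__main__":
    request = (b"GET /foo.html HTTP/1.1\r\n"
               b"Host: www.foo.com:80\r\n"
               b"\r\n")
    print(target_from_host_header(request))
```

With the request shown in the example above, this yields the host www.foo.com and port 80, which is all the proxy needs to open its own connection.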

On another note: I assumed you wanted to transparently relay/rewrite HTTP
through the proxy. If you need to open some form of socket connection to the
proxy which is not HTTP, the proxy protocol supports the method CONNECT,
which will simply open up a socket connection which is relayed by the proxy.
But: This cannot be made transparent, except by some deeper magic in the
router.
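
For the CONNECT case, a sketch along the following lines (Python 3) may help. It assumes the proxy permits CONNECT to the requested port and needs no authentication, neither of which is known for the proxy in this thread. The names build_connect_request, connect_succeeded and open_tunnel are hypothetical.

```python
import socket


def build_connect_request(host, port):
    """Build the bytes of a CONNECT request for host:port."""
    target = "%s:%d" % (host, port)
    return ("CONNECT %s HTTP/1.1\r\nHost: %s\r\n\r\n"
            % (target, target)).encode("ascii")


def connect_succeeded(status_line):
    """True if the proxy's status line carries a 2xx status code."""
    parts = status_line.split(None, 2)
    return (len(parts) >= 2
            and parts[0].startswith(b"HTTP/")
            and parts[1][:1] == b"2")


def open_tunnel(proxy_addr, host, port, timeout=10.0):
    """Connect to the proxy and ask it to relay a raw socket to host:port.

    Returns the connected socket on success. Any bytes the proxy sends
    after its response headers are discarded here, which is a
    simplification.
    """
    soc = socket.create_connection(proxy_addr, timeout=timeout)
    try:
        soc.sendall(build_connect_request(host, port))
        reply = b""
        while b"\r\n\r\n" not in reply:
            chunk = soc.recv(4096)
            if not chunk:
                raise OSError("proxy closed the connection during CONNECT")
            reply += chunk
        status_line = reply.split(b"\r\n", 1)[0]
        if not connect_succeeded(status_line):
            raise OSError("proxy refused CONNECT: %r" % status_line)
        return soc
    except Exception:
        soc.close()
        raise
```

A call might look like open_tunnel(("dav-serv", 8080), "example.org", 443), where example.org stands in for whatever host is actually wanted; after that the returned socket behaves like a direct connection to that host.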

HTH!

Heiko.
Jul 18 '05 #2

[snip..]
It looks like the CONNECT and GET requests are just implemented using
simple socket commands. (I say simple because there isn't a lot of
code - I'm not familiar with the actual behaviour of sockets, but it
doesn't look too complicated).

What I need to do is rewrite the soc.connect(host_port) line in the
following example so that it connects *via* my proxy-server. (which it
doesn't by default).

I think the current format of host_port is a tuple : (host_domain,
port_no)

Below is a summary of the GET command (I've inlined all the method
calls - this example starts from the do_GET method) :

soc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
soc.connect(host_port)


What is the value of host_port at this point? It *should* be the
address of your external access proxy, i.e. dav-serv:8080

soc.send("%s %s %s\r\n" % (
    self.command,
    urlparse.urlunparse(('', '', path, params, query, '')),
    self.request_version))


And you're not sending an absoluteURI: this should be amended to
contain the details of the server that is finally going to
service the request. For the python.org example above, this code would be

soc.send("%s %s %s\r\n" % (
    self.command,
    urlparse.urlunparse(('http', 'www.python.org:80', path, params,
                         query, '')),
    self.request_version))

though of course, these values should be made available to you by
TinyHTTPProxy. Taking a brief look at the code, these values should be
available through the variables "scm" and "netloc". So your outgoing
connection code from TinyHTTPProxy should look something like this

soc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
soc.connect(('dav-serv', 8080))
soc.send("%s %s %s\r\n" % (
    self.command,
    urlparse.urlunparse((scm, netloc, path, params, query, '')),
    self.request_version))
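
For anyone trying the same thing on Python 3, where urlparse lives in urllib.parse and sockets carry bytes, the request head could be assembled roughly as follows. This is only a sketch: proxy_request_head is a made-up helper, and its arguments mirror the variables used in the snippets above rather than any real TinyHTTPProxy API.

```python
from urllib.parse import urlunparse


def proxy_request_head(command, scm, netloc, path, params, query,
                       version, headers):
    """Build the request line and headers to send to an upstream proxy.

    Hypothetical helper: the request line uses the absoluteURI form
    (scheme and server included) so the upstream proxy knows where to
    forward the request.
    """
    url = urlunparse((scm, netloc, path, params, query, ""))
    lines = ["%s %s %s" % (command, url, version)]
    lines += ["%s: %s" % (key, val) for key, val in headers.items()]
    # A blank line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("iso-8859-1")


if __name__ == "__main__":
    head = proxy_request_head("GET", "http", "www.python.org:80",
                              "/index.html", "", "", "HTTP/1.0",
                              {"Connection": "close"})
    print(head.decode("iso-8859-1"))
```

The resulting bytes would be passed to soc.sendall() on a socket connected to ('dav-serv', 8080).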

HTH,

Thanks to all of you who replied.
I think I understand enough to have a go - I need to make the
connection to the proxy and the request for the absolute URI. That at
least gives me something to go at and it shouldn't be too hard.

Many Thanks for your help.

Fuzzyman

http://www.voidspace.org.uk/atlantib...thonutils.html
Jul 18 '05 #3

[snip..]

It looks like the Tiny HTTP Proxy (using BaseHTTPServer as its
foundation) intercepts all requests to local addresses and then
re-implements the request (whether it is CONNECT, GET, PUT or
whatever). It logs everything that goes through it - I will simply
edit it to amend the URL that is being asked for.


Yes, that is exactly what the proxy should do. It relays requests
between client and server. However, there is one vital detail you're
probably missing that is preventing you from chaining client + proxy*N
+ server together.

When sending a HTTP GET request to a server, a client sends a request
line containing a URI without a server component. This is because the
socket connection to the server is already formed, therefore the
server connection details do not need to be repeated. So a standard
GET will look like this

GET /index.html HTTP/1.1

However, it's different when a client connects to a proxy, because the
socket no longer connects directly to the server, but to the proxy
instead. The proxy still needs to know to which server it should send
the request. So the correct format for sending requests to a proxy is
to use the "absoluteURI" form, which includes the server details, e.g.

GET http://www.python.org:80/index.html HTTP/1.1

Any proxy that receives such a request now knows that the server to
forward to is "www.python.org:80". It will open a connection to
www.python.org:80, and send it a GET request for the URI.

Since you want your proxy to forward to another proxy, i.e. your proxy
is a client from your external-access-proxy's point of view, you
should also use the absoluteURI form when making requests from your
python proxy to your external proxy.
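
Seen from the receiving side, a proxy has to take the absoluteURI apart again before it can forward the request. A minimal Python 3 sketch of that step, assuming plain http URLs (hence the default port of 80); split_proxy_request_line is a hypothetical name used only here.

```python
from urllib.parse import urlsplit


def split_proxy_request_line(request_line):
    """Split an absoluteURI request line into its forwarding details.

    Returns (method, host, port, path, version), where path is the
    server-relative form to send on to the origin server. Assumes an
    http URL, so a missing port defaults to 80.
    """
    method, target, version = request_line.split()
    parts = urlsplit(target)
    if not parts.scheme or not parts.hostname:
        raise ValueError("not an absoluteURI request: %r" % target)
    port = parts.port or 80
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return method, parts.hostname, port, path, version


if __name__ == "__main__":
    line = "GET http://www.python.org:80/index.html HTTP/1.1"
    print(split_proxy_request_line(line))
```

A request line without the server part, such as the plain "GET /index.html HTTP/1.1" form, raises ValueError here, which matches the situation described above: the proxy has no way of telling where to send it.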


Well the two minor changes you suggested worked straight away for
normal HTML pages - great.
It's not fetching images and there are a couple of other problems (possibly
because that proxy server can only handle HTTP/1.0 - but I have a more
advanced one called TcpWatch from Zope that I might hack around).

But there's more than enough for me to go on and get it working.

MANY THANKS

Regards,

Fuzzy

http://www.voidspace.org.uk/atlantib...thonutils.html
Jul 18 '05 #4

[snip..]
On another note: I assumed you wanted to transparently relay/rewrite HTTP
through the proxy. If you need to open some form of socket connection to the
proxy which is not HTTP, the proxy protocol supports the method CONNECT,
which will simply open up a socket connection which is relayed by the proxy.
But: This cannot be made transparent, except by some deeper magic in the
router.

HTH!

Heiko.


Thanks for your help.
It's only http that I'll be relaying and I only need it to be
transparent to the user - I'm not using this for anonymity.

I don't yet understand the detail of what you've said, but I am
following the resources you've suggested and now have enough to get to
the next stage of my work.

Thanks

Fuzzyman

http://www.voidspace.org.uk/atlantib...thonutils.html
Jul 18 '05 #5
