Making a socket connection via a proxy server

In a nutshell - the question I'm asking is, how do I make a socket
connection go via a proxy server?
All our internet traffic has to go through a proxy-server at location
'dav-serv:8080' and I need to make a socket connection through it.

The reason (with code example) is as follows :

I am hacking "Tiny HTTP Proxy" by SUZUKI Hisao to make an http proxy
that modifies URLs. I haven't got very far - having started from zero
knowledge of the 'hyper text transfer protocol'.

It looks like the Tiny HTTP Proxy (using BaseHTTPServer as its
foundation) intercepts all requests to local addresses and then
re-implements the request (whether it is CONNECT, GET, PUT or
whatever). It logs everything that goes through it - I will simply
edit it to amend the URL that is being asked for.

It looks like the CONNECT and GET requests are just implemented using
simple socket commands. (I say simple because there isn't a lot of
code - I'm not familiar with the actual behaviour of sockets, but it
doesn't look too complicated).

What I need to do is rewrite the soc.connect(host_port) line in the
following example so that it connects *via* my proxy-server. (which it
doesn't by default).

I think the current format of host_port is a tuple : (host_domain,
port_no)

Below is a summary of the GET command (I've inlined all the method
calls - this example starts from the do_GET method) :

    soc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    soc.connect(host_port)
    soc.send("%s %s %s\r\n" % (
        self.command,
        urlparse.urlunparse(('', '', path, params, query, '')),
        self.request_version))
    self.headers['Connection'] = 'close'
    del self.headers['Proxy-Connection']
    for key_val in self.headers.items():
        soc.send("%s: %s\r\n" % key_val)
    soc.send("\r\n")

    max_idling = 20    # this is really part of a self._read_write method
    iw = [self.connection, soc]
    ow = []
    count = 0
    while 1:
        count += 1
        (ins, _, exs) = select.select(iw, ow, iw, 3)
        if exs: break
        if ins:
            for i in ins:
                if i is soc:
                    out = self.connection
                else:
                    out = soc
                data = i.recv(8192)
                if data:
                    out.send(data)
                    count = 0
        else:
            print "\t" "idle", count
        if count == max_idling: break

    print "\t" "bye"
    soc.close()
    self.connection.close()
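
For anyone reading this on Python 3, the relay loop above could be sketched roughly as below. This is a variation, not the Tiny HTTP Proxy code: the function name relay and its parameters are made up for illustration, and unlike the loop above it returns as soon as either side closes, which may drop data still in flight from the other side.

```python
import select
import socket


def relay(client, upstream, max_idling=20, poll_seconds=3):
    """Copy bytes both ways until one side closes or the link stays idle."""
    socks = [client, upstream]
    idle = 0
    while idle < max_idling:
        readable, _, errored = select.select(socks, [], socks, poll_seconds)
        if errored:
            break
        if not readable:
            idle += 1            # nothing arrived within poll_seconds
            continue
        for src in readable:
            dst = client if src is upstream else upstream
            data = src.recv(8192)
            if not data:
                return           # peer closed its end of the connection
            dst.sendall(data)    # sendall retries until everything is written
            idle = 0
```

Note that sendall is used here instead of send, since send may write only part of the buffer.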

Regards,
Fuzzy

http://www.voidspace.org.uk/atlantib...thonutils.html
Jul 18 '05 #1
> Fuzzyman wrote:
In a nutshell - the question I'm asking is, how do I make a socket
connection go via a proxy server?
All our internet traffic has to go through a proxy-server at location
'dav-serv:8080' and I need to make a socket connection through it.

On Friday, 30 July 2004 at 18:51, Diez B. Roggisch wrote:
Short answer: It's not possible.


Longer answer: it is possible if you use DNAT on some router between the
computer which opens the request and the destination machine. Check out squid
transparent proxy howtos you can find on the net. The protocol will need
HTTP/1.1 for this, though.

Small example, which clarifies why this is possible:

Computer 1 opens http (port 80) connection to computer 2.

Router 1 sits in the middle, sees a port 80 connection is opened to some
computer 2, and rewrites the incoming packet to have a new destination
address/port (DNAT), namely proxy 1 with port 3128 (standard http-proxy port,
at least for squid), and a new source address/port (SNAT), namely router 1
with some port.

Proxy 1 gets the following (from router 1):

GET /foo.html HTTP/1.1
Host: www.foo.com:80
<other headers>

Proxy 1 opens a connection to www.foo.com port 80 (the router sees that
this connection comes from the proxy, so it must not rewrite the address
this time), gets the result, and stores it locally.

Proxy 1 then sends the packets back to router 1, because the request
appeared to come from the router. (If you leave out SNAT in the rewriting
step, the request will appear to come from the actual computer. This is
fine too, but then you have to make sure that the return packets also go
through the router.) Router 1 now reverses the DNAT and SNAT and returns
the packets to computer 1, which sees a source address of computer 2 and
port 80 on them.

Computer 1 sees the result, and thinks it came from the outside machine,
although through some SNAT/DNAT the packets actually originated from the
proxy.

This is basically it.

If you want to implement this, as I said, read up on transparent proxy howtos
for squid. Pretty much every proxy can be made to support this, as with
HTTP/1.1 the Host: header is a required header, and thus the proxy can always
extract the host which was queried from the request, even when it isn't
passed as the others have suggested.
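
To illustrate the point about the Host: header, here is a rough Python 3 sketch of how a proxy might work out the destination from an intercepted request. The function name destination_from_request is invented for this example, and the parsing is deliberately naive (no IPv6 literals, no folded headers).

```python
def destination_from_request(raw_head, default_port=80):
    """Return (host, port) taken from the Host header of a raw request head."""
    lines = raw_head.split(b"\r\n")
    for line in lines[1:]:              # skip the request line itself
        if not line:
            break                       # blank line marks the end of the headers
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"host":
            host, _, port = value.strip().decode("ascii").partition(":")
            return host, int(port) if port else default_port
    raise ValueError("no Host header; cannot tell where to forward")


head = b"GET /foo.html HTTP/1.1\r\nHost: www.foo.com:80\r\n\r\n"
print(destination_from_request(head))   # ('www.foo.com', 80)
```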

On another note: I assumed you wanted to transparently relay/rewrite HTTP
through the proxy. If you need to open some form of socket connection to the
proxy which is not HTTP, the proxy protocol supports the method CONNECT,
which will simply open up a socket connection which is relayed by the proxy.
But: This cannot be made transparent, except by some deeper magic in the
router.
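
For the non-HTTP case, a CONNECT tunnel could be opened roughly like this in Python 3. It is only a sketch under some assumptions: the helper names (connect_request, parse_status, open_tunnel) are made up here, the proxy is assumed to need no authentication, and any bytes the proxy sends right after its response headers are discarded.

```python
import socket


def connect_request(host, port):
    """Build the CONNECT request a client sends to an HTTP proxy."""
    return ("CONNECT %s:%d HTTP/1.1\r\nHost: %s:%d\r\n\r\n"
            % (host, port, host, port)).encode("ascii")


def parse_status(response_head):
    """Return the numeric status code from the proxy's first response line."""
    status_line = response_head.split(b"\r\n", 1)[0]
    return int(status_line.split(b" ", 2)[1])


def open_tunnel(proxy_addr, host, port, timeout=10):
    """Connect to the proxy and ask it to relay a raw connection to host:port."""
    soc = socket.create_connection(proxy_addr, timeout=timeout)
    soc.sendall(connect_request(host, port))
    head = b""
    while b"\r\n\r\n" not in head:      # read until the end of the headers
        chunk = soc.recv(4096)
        if not chunk:
            break
        head += chunk
    if not head or parse_status(head) // 100 != 2:
        soc.close()
        raise OSError("proxy refused CONNECT: %r" % head[:80])
    return soc                          # from here on, soc talks to host:port
```

With the proxy from the original question, the call would be something like open_tunnel(('dav-serv', 8080), 'www.python.org', 443), provided that proxy allows CONNECT to the port in question.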

HTH!

Heiko.
Jul 18 '05 #2
[snip..]
It looks like the CONNECT and GET requests are just implemented using
simple socket commands. (I say simple because there isn't a lot of
code - I'm not familiar with the actual behaviour of sockets, but it
doesn't look too complicated).

What I need to do is rewrite the soc.connect(host_port) line in the
following example so that it connects *via* my proxy-server. (which it
doesn't by default).

I think the current format of host_port is a tuple : (host_domain,
port_no)

Below is a summary of the GET command (I've inlined all the method
calls - this example starts from the do_GET method) :

    soc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    soc.connect(host_port)


What is the value of host_port at this point? It *should* be the
address of your external access proxy, i.e. dav-serv:8080

    soc.send("%s %s %s\r\n" % (
        self.command,
        urlparse.urlunparse(('', '', path, params, query, '')),
        self.request_version))


And you're not sending an absoluteURI: this should be amended to
contain the details of the server that is finally going to
service the request. For the www.python.org example, this code would be

    soc.send("%s %s %s\r\n" % (
        self.command,
        urlparse.urlunparse(('http', 'www.python.org:80', path, params,
                             query, '')),
        self.request_version))

though of course, these values should be made available to you by
TinyHTTPProxy. Taking a brief look at the code, these values should be
available through the variables "scm" and "netloc". So your outgoing
connection code from TinyHTTPProxy should look something like this

    soc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    soc.connect(('dav-serv', 8080))
    soc.send("%s %s %s\r\n" % (
        self.command,
        urlparse.urlunparse((scm, netloc, path, params, query, '')),
        self.request_version))

HTH,

Thanks to all of you who replied.
I think I understand enough to have a go - I need to make the
connection to the proxy and the request for the absolute URI. That at
least gives me something to go at and it shouldn't be too hard.

Many Thanks for your help.

Fuzzyman

http://www.voidspace.org.uk/atlantib...thonutils.html
Jul 18 '05 #3
[snip..]

It looks like the Tiny HTTP Proxy (using BaseHTTPServer as its
foundation) intercepts all requests to local addresses and then
re-implements the request (whether it is CONNECT, GET, PUT or
whatever). It logs everything that goes through it - I will simply
edit it to amend the URL that is being asked for.


Yes, that is exactly what the proxy should do. It relays requests
between client and server. However, there is one vital detail you're
probably missing that is preventing you from chaining client + proxy*N
+ server together.

When sending a HTTP GET request to a server, a client sends a request
line containing a URI without a server component. This is because the
socket connection to the server is already formed, therefore the
server connection details do not need to be repeated. So a standard
GET will look like this

GET /index.html HTTP/1.1

However, it's different when a client connects to a proxy, because the
socket no longer connects directly to the server, but to the proxy
instead. The proxy still needs to know to which server it should send
the request. So the correct format for sending requests to a proxy is
to use the "absoluteURI" form, which includes the server details, e.g.

GET http://www.python.org:80/index.html HTTP/1.1

Any proxy that receives such a request now knows that the server to
forward to is "www.python.org:80". It will open a connection to
www.python.org:80, and send it a GET request for the URI.

Since you want your proxy to forward to another proxy, i.e. your proxy
is a client from your external-access-proxy's point of view, you
should also use the absoluteURI form when making requests from your
python proxy to your external proxy.
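
A minimal sketch of the two request forms described above, written for Python 3 (the code elsewhere in this thread is Python 2). The function name build_request and its parameters are made up for illustration; they are not part of Tiny HTTP Proxy.

```python
from urllib.parse import urlsplit, urlunsplit


def build_request(method, url, via_proxy, version="HTTP/1.1"):
    """Return the request line plus Host header, as bytes ready to send."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if via_proxy:
        # Talking to a proxy: absoluteURI form, server details included.
        target = urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))
    else:
        # Talking to the server itself: path (and query) only.
        target = urlunsplit(("", "", path, parts.query, ""))
    head = "%s %s %s\r\nHost: %s\r\n\r\n" % (method, target, version, parts.netloc)
    return head.encode("ascii")


url = "http://www.python.org:80/index.html"
print(build_request("GET", url, via_proxy=False))
print(build_request("GET", url, via_proxy=True))
```

The first call produces the "GET /index.html HTTP/1.1" form and the second the "GET http://www.python.org:80/index.html HTTP/1.1" form; only the second is meant to be sent over a socket connected to a proxy such as dav-serv:8080.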


Well, the two minor changes you suggested worked straight away for
normal HTML pages - great.
It's not fetching images, and there are a couple of other problems
(possibly because that proxy server can only handle HTTP/1.0 - but I
have a more advanced one called TcpWatch from Zope that I might hack
around).

But there's more than enough for me to go on and get it working.

MANY THANKS

Regards,

Fuzzy

http://www.voidspace.org.uk/atlantib...thonutils.html
Jul 18 '05 #4
[snip..]
On another note: I assumed you wanted to transparently relay/rewrite HTTP
through the proxy. If you need to open some form of socket connection to the
proxy which is not HTTP, the proxy protocol supports the method CONNECT,
which will simply open up a socket connection which is relayed by the proxy.
But: This cannot be made transparent, except by some deeper magic in the
router.

HTH!

Heiko.


Thanks for your help.
It's only HTTP that I'll be relaying and I only need it to be
transparent to the user - I'm not using this for anonymity.

I don't yet understand the detail of what you've said, but I am
following the resources you've suggested and now have enough to get to
the next stage of my work.

Thanks

Fuzzyman

http://www.voidspace.org.uk/atlantib...thonutils.html
Jul 18 '05 #5
