Transparent (redirecting) proxy with BaseHTTPServer

Hi list,

My ultimate goal is to have a small HTTP proxy which is able to show a
message specific to the client's name/IP/status and then handle the original
request normally, either by redirecting the client or by acting as a proxy.

I started with a modified[1] version of TinyHTTPProxy, posted by Suzuki
Hisao to this list sometime in 2003, and tried to extend it to my needs.
It works quite well if I configure my client to use it, but using the
iptables REDIRECT feature to point the clients transparently to the
proxy caused some issues.

Precisely, the "self.path" member variable of BaseHTTPRequestHandler is
missing the <command> and the host (i.e. www.python.org) part of the
request line for REDIRECTed connections:

without iptables REDIRECT:
self.path -> GET http://www.python.org/ftp/python/contrib/ HTTP/1.1

with REDIRECT:
self.path -> GET /ftp/python/contrib/ HTTP/1.1

I asked about this on the squid mailing list and was told this is normal
and I have to reconstruct the request line from the real destination IP,
the URL path and the Host header (if any). If the Host header is sent
it's an (unsafe) no-brainer, but I cannot for the life of me figure out
where to get the "real destination IP". Any ideas?

thanks
Paul

[1] HTTP Debugging Proxy
Modified by Xavier Defrang (http://defrang.com/)
Jul 18 '05 #1
If you actually want the IP, resolving the Host header would give you that.

In the redirect case you should get a host header like

Host: www.python.org

From that you can reconstruct the original URL as
http://www.python.org/ftp/python/contrib/. With that you can open it using
urllib and proxy the data to the client.
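
A minimal sketch of that reconstruction, written for Python 3 (where
BaseHTTPServer lives in http.server). The helper name reconstruct_url and
the default scheme are assumptions made for illustration, not part of
TinyHTTPProxy, and the Host header is trusted exactly as the client sent it.

```python
from urllib.parse import urlsplit


def reconstruct_url(path, host_header, scheme="http"):
    """Rebuild an absolute URL from a request target and a Host header.

    `path` is the value a BaseHTTPRequestHandler exposes as self.path.
    """
    parts = urlsplit(path)
    if parts.scheme and parts.netloc:
        # Absolute-form target: the client was configured to use a proxy,
        # so the URL is already complete.
        return path
    if not host_header:
        raise ValueError("no Host header; cannot determine the target site")
    # Origin-form target (the REDIRECT case): prepend scheme and host.
    return "%s://%s%s" % (scheme, host_header.strip(), path)
```

Inside a handler this could be called as
reconstruct_url(self.path, self.headers.get("Host")), and the result
handed to urllib.request.urlopen to fetch the page.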

The second form of HTTP request, without the host part, is there for
compatibility with the pre-HTTP/1.1 standard. All modern web browsers should
send the Host header.
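
As for the "real destination IP" from the original question: on Linux, a
connection that went through an iptables REDIRECT rule still has its
pre-NAT destination recorded by netfilter, and it can be read from the
accepted socket with the SO_ORIGINAL_DST socket option. The sketch below
assumes Linux and IPv4; the function names are made up for illustration,
and the constant is defined by hand because the socket module does not
export it.

```python
import socket
import struct

# SO_ORIGINAL_DST as defined in <linux/netfilter_ipv4.h>; not exported by
# Python's socket module, so it is spelled out here.
SO_ORIGINAL_DST = 80


def parse_sockaddr_in(raw):
    """Decode a 16-byte struct sockaddr_in into an (ip, port) tuple."""
    # Layout: 2 bytes address family (skipped), 2 bytes port and 4 bytes
    # address in network byte order, 8 bytes of padding.
    port, packed_ip = struct.unpack("!2xH4s8x", raw[:16])
    return socket.inet_ntoa(packed_ip), port


def original_destination(conn):
    """Return the pre-REDIRECT (ip, port) of an accepted IPv4 connection.

    Linux only. For a connection that was not redirected, the getsockopt
    call is expected to fail with OSError.
    """
    raw = conn.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, 16)
    return parse_sockaddr_in(raw)
```

In a request handler the accepted socket is available as self.connection,
so the call would be original_destination(self.connection). That gives an
address to fall back on when no Host header is present.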

Jul 18 '05 #2

Thanks, aurora ;),

aurora wrote:
> If you actually want the IP, resolving the Host header would give you that.

I'm only interested in the hostname.

> The second form of HTTP request, without the host part, is there for
> compatibility with the pre-HTTP/1.1 standard. All modern web browsers
> should send the Host header.

How safe is the assumption that the Host header will be there? Is it part
of the HTTP/1.1 spec? And does it mean all "pre-1.1" clients will fail?
Hmm, maybe I should look on the wire to see what's really happening...

thanks again
Paul
Jul 18 '05 #3
It should be very safe to count on the Host header. Maybe some really,
really old browsers would not support it, but they probably won't work on
today's WWW anyway. The majority of today's web sites are likely to be
virtually hosted; one Apache may be hosting 50 web addresses. If a client
strips the host name and doesn't send the Host header either, the web
server wouldn't know what address it is really looking for. If you catch a
request that doesn't have a Host header, it is a good idea to redirect it
to a browser upgrade page.
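
On the spec question: HTTP/1.1 does require clients to send a Host header.
The redirect idea could be sketched roughly as below, using Python 3's
http.server (the successor of BaseHTTPServer). The handler name and
UPGRADE_URL are hypothetical, and the 200 branch only reports what a real
proxy would fetch instead of fetching it.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical location of the "please upgrade your browser" page.
UPGRADE_URL = "http://proxy.example/upgrade.html"


class HostCheckingHandler(BaseHTTPRequestHandler):
    """Redirect requests that lack a Host header; acknowledge the rest."""

    def do_GET(self):
        host = self.headers.get("Host")
        if not host:
            # Without a Host header we cannot tell which site was meant.
            self.send_response(302)
            self.send_header("Location", UPGRADE_URL)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        # A real proxy would fetch http://<host><path> at this point.
        body = ("would fetch http://%s%s\n" % (host, self.path)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


# To try it out (port chosen arbitrarily):
#   HTTPServer(("127.0.0.1", 8080), HostCheckingHandler).serve_forever()
```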

Jul 18 '05 #4
