Are the built-in HTTP servers production quality?

I've heard it said that web servers built upon the standard library's
SimpleHTTPServer or CGIHTTPServer aren't really suitable for use in a
production system. Is this still the case? And if so, why?

Is it primarily a performance issue? If so, aren't there a number of
things that can easily be done to improve web server performance:
caching proxy front-ends (e.g. Apache's mod_proxy), faster hardware,
server clusters, etc.? Also, it seems fairly straightforward to make
them serve requests concurrently (via the ForkingMixIn and ThreadingMixIn)...
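
For concreteness, here's the sort of thing I have in mind (a minimal,
untested sketch using the stock mixins and Python 2-era module names):

import SocketServer
import BaseHTTPServer
import SimpleHTTPServer

class ThreadingHTTPServer(SocketServer.ThreadingMixIn,
                          BaseHTTPServer.HTTPServer):
    # Each request gets handled in its own thread.
    pass

if __name__ == '__main__':
    httpd = ThreadingHTTPServer(
        ('', 8000), SimpleHTTPServer.SimpleHTTPRequestHandler)
    httpd.serve_forever()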

Is it that they're not secure, i.e. that they can be easily
compromised/cracked? If so, wouldn't hiding them behind trusted
front-ends (like Apache) help there too?

Are they simply too buggy to be relied upon? They leak memory, don't
correctly handle certain types of requests, or perhaps they're not
standards compliant enough?

They're so easy to work with that I'd really love to understand what we
believe their shortcomings to be.

Thanks.

Paul

Jul 18 '05 #1
Paul Morrow wrote:
> I've heard it said that web servers built upon the standard library's
> SimpleHTTPServer or CGIHTTPServer aren't really suitable for use in a
> production system. Is this still the case? And if so, why?


For starters, the SimpleHTTPServer reports the wrong Content-Length.
See my patch at http://tinyurl.com/56frb
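
To illustrate my reading of the problem (a throwaway demo, not the
patch itself): the handler opens text files in text mode but advertises
the file's on-disk size, and on Windows a text-mode read translates
CRLF to LF, so fewer bytes go out than promised.

# Demonstrate the mismatch between the on-disk size and a text-mode read.
# (Only Windows shows the difference; on Unix 'r' and 'rb' read the same bytes.)
import os

path = 'crlf_demo.txt'                     # hypothetical throwaway file
open(path, 'wb').write('one\r\ntwo\r\n')   # 10 bytes on disk

on_disk = os.stat(path).st_size            # the advertised Content-Length: 10
as_read = len(open(path, 'r').read())      # a text-mode read on Windows: 8
print on_disk, as_read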

--Irmen
Jul 18 '05 #2
Irmen de Jong wrote:

> For starters, the SimpleHTTPServer reports the wrong Content-Length.
> See my patch at http://tinyurl.com/56frb


Yes, something is wrong there. I wonder though if it makes sense to
continue to open text files in 'text' mode, so that the treatment of
newlines is normalized, but then adjust the content-length to be the
length of the (newline converted) string as read from the file.
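
I.e. something along these lines (an untested sketch):

def text_body_and_length(path):
    # Open in text mode so newlines are normalized to '\n', then take
    # the Content-Length from the converted data, not from os.stat().
    f = open(path, 'r')
    body = f.read()
    f.close()
    return body, len(body)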

Jul 18 '05 #3
Paul Morrow wrote:
> Irmen de Jong wrote:
>> For starters, the SimpleHTTPServer reports the wrong Content-Length.
>> See my patch at http://tinyurl.com/56frb
>
> Yes, something is wrong there. I wonder though if it makes sense to
> continue to open text files in 'text' mode, so that the treatment of
> newlines is normalized, but then adjust the content-length to be the
> length of the (newline converted) string as read from the file.


Well, I think that could be done too.
But why not ditch the newline conversion altogether and
treat text files just like any other file (which is what my patch does)?

-Irmen
Jul 18 '05 #4
Irmen de Jong wrote:
> Paul Morrow wrote:
>> Irmen de Jong wrote:
>>> For starters, the SimpleHTTPServer reports the wrong Content-Length.
>>> See my patch at http://tinyurl.com/56frb
>
>> Yes, something is wrong there. I wonder though if it makes sense to
>> continue to open text files in 'text' mode, so that the treatment of
>> newlines is normalized, but then adjust the content-length to be the
>> length of the (newline converted) string as read from the file.
>
> Well, I think that could be done too.
> But why not ditch the newline conversion altogether and
> treat text files just like any other file (which is what my patch does)?
>
> -Irmen


I'm not sure that we can rely on the client browser doing the right
thing with the newlines. I'm not aware of an RFC that really covers
this, but do all browsers convert text files as needed to their native
format? If so, sending them as binary (unaltered) would be fine. But
if not, maybe the solution is to detect (or make a guess at) the OS of
the client, then adjust the newlines accordingly (yuck!)...

Jul 18 '05 #5

"Paul Morrow" <pm****@yahoo.com> wrote in message
news:ma*************************************@pytho n.org...
I'm not sure that we can rely on the client browser doing the right
thing with the newlines. I'm not aware of an rfc that really covers
this,


HTTP/1.1 covers this topic:

http://www.w3.org/Protocols/rfc2616/....html#sec3.7.1

Still, that doesn't help if your server's text format isn't one of those.
Jul 18 '05 #6
Richard Brodie wrote:
"Paul Morrow" <pm****@yahoo.com> wrote in message
news:ma*************************************@pytho n.org...

I'm not sure that we can rely on the client browser doing the right
thing with the newlines. I'm not aware of an rfc that really covers
this,

HTTP 1.1 covers this topic:

http://www.w3.org/Protocols/rfc2616/....html#sec3.7.1

Still, that doesn't help if your server text format isn't one of these.


Hmm, well, reading that particular section of the HTTP spec makes
me think that the solution I programmed in the patch isn't the optimal
one.
Rather than treating text files (with content-type text/...) as any
other -binary- file, I now think that it's actually better to read them
in and convert the line endings to CR LF. But this requires:

- buffering of the entire text file in memory during the CR LF conversion
- fixing the content-type at the end because it's not the same as
the filesize (this was the original problem I solved).
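
Concretely, the buffered version might look something like this (untested):

def read_text_as_crlf(path):
    # Buffer the whole file in memory, normalizing every line-ending
    # convention (CRLF, bare CR, bare LF) to CRLF, and return it along
    # with its corrected Content-Length.
    data = open(path, 'rb').read()
    data = data.replace('\r\n', '\n').replace('\r', '\n')
    data = data.replace('\n', '\r\n')
    return data, len(data)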

What do you think?
Bye,
Irmen.
Jul 18 '05 #7
On 2004-07-19, Irmen de Jong <irmen@-nospam-remove-this-xs4all.nl> wrote:
> Rather than treating text files (with content-type text/...) as any
> other -binary- file, I now think that it's actually better to read them
> in and convert the line endings to CR LF. But this requires:
>
> - buffering of the entire text file in memory during the CR LF conversion

Why do you have to buffer the entire file? You could make two
almost-identical conversion passes through the file: the first
time, just count the "output" bytes. Finish sending the HTTP
headers out, then make the second pass actually writing the
output bytes.

If the OS has enough spare RAM sitting around, it will buffer
the file for you and the second pass won't even hit the disk.

> - fixing the content-type at the end because it's not the same as
> the filesize (this was the original problem I solved).
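
Something like this, perhaps (untested; it assumes every line,
including the last, should go out terminated with CRLF):

def converted_length(path):
    # Pass 1: run the conversion, but only count the output bytes.
    n = 0
    for line in open(path, 'rb'):
        n += len(line.rstrip('\r\n')) + 2      # each line becomes ...CRLF
    return n

def send_converted(path, wfile):
    # Pass 2: the identical conversion, this time actually writing the bytes.
    for line in open(path, 'rb'):
        wfile.write(line.rstrip('\r\n') + '\r\n')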


--
Grant Edwards grante Yow! Here I am at the flea
at market but nobody is buying
visi.com my urine sample bottles...
Jul 18 '05 #8
Grant Edwards wrote:
> Why do you have to buffer the entire file? You could make two
> almost-identical conversion passes through the file: the first
> time, just count the "output" bytes. Finish sending the HTTP
> headers out, then make the second pass actually writing the
> output bytes.


That would slow things down quite a bit, because you now
have to do the same CPU-intensive task twice.

But it saves memory.

Oh, the choices...

--Irmen
Jul 18 '05 #9
On Mon, 19 Jul 2004, Irmen de Jong wrote:
> Grant Edwards wrote:
>> Why do you have to buffer the entire file? You could make two
>> almost-identical conversion passes through the file: the first
>> time, just count the "output" bytes. Finish sending the HTTP
>> headers out, then make the second pass actually writing the
>> output bytes.
>
> That would slow things down quite a bit, because you now
> have to do the same CPU-intensive task twice.
>
> But it saves memory.
>
> Oh, the choices...


Choices, indeed:

from StringIO import StringIO
from gzip import GzipFile

# Buffer the converted data compressed in memory instead of raw.
# slurp_data, mangle and spew_data are placeholders for the real I/O.
gzipdata = StringIO()
gzipfile = GzipFile(mode='w', fileobj=gzipdata)

length = 0
for line in slurp_data():
    line = mangle(line)
    length += len(line)        # count the uncompressed output bytes
    gzipfile.write(line)

gzipfile.close()

gzipdata.seek(0)
print length                   # this is the Content-Length to send
for line in GzipFile(fileobj=gzipdata):
    spew_data(line)
;)
Jul 18 '05 #10
Irmen de Jong wrote:
> Grant Edwards wrote:
>> Why do you have to buffer the entire file? You could make two
>> almost-identical conversion passes through the file: the first
>> time, just count the "output" bytes. Finish sending the HTTP
>> headers out, then make the second pass actually writing the
>> output bytes.
>
> That would slow things down quite a bit, because you now
> have to do the same CPU-intensive task twice.
>
> But it saves memory.
>
> Oh, the choices...
>
> --Irmen


Or, to improve it slightly, you could do a find loop on the characters
you have to convert [LF for Unix], work out how many characters you're
going to have to add, compute the Content-Length, send the headers, and
then use the cached list of LF positions to send the strings plus the
inserted characters.

Now that would have been easier to say in Python...
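
For instance (untested; assumes the input uses bare LF endings, with
wfile standing in for whatever the handler writes to):

def find_lfs(data):
    # Cache the position of every LF so we know how many CRs get inserted.
    positions = []
    i = data.find('\n')
    while i != -1:
        positions.append(i)
        i = data.find('\n', i + 1)
    return positions

def send_with_crlf(data, wfile):
    positions = find_lfs(data)
    # Content-Length is the original size plus one CR per LF found.
    wfile.write('Content-Length: %d\r\n\r\n' % (len(data) + len(positions)))
    start = 0
    for pos in positions:
        wfile.write(data[start:pos] + '\r\n')  # the string up to the LF, plus CRLF
        start = pos + 1
    wfile.write(data[start:])                  # trailing data after the last LF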

David
Jul 18 '05 #11
Irmen de Jong wrote:
> Richard Brodie wrote:
>> HTTP/1.1 covers this topic:
>>
>> http://www.w3.org/Protocols/rfc2616/....html#sec3.7.1
>>
>> Still, that doesn't help if your server's text format isn't one of those.
>
> Hmm, well, reading that particular section of the HTTP spec makes
> me think that the solution I programmed in the patch isn't the optimal
> one.
> Rather than treating text files (with content-type text/...) as any
> other -binary- file, I now think that it's actually better to read them
> in and convert the line endings to CR LF. But this requires:
>
> - buffering of the entire text file in memory during the CR LF conversion
> - fixing the content-type at the end because it's not the same as
> the filesize (this was the original problem I solved).
>
> What do you think?


I think that it might be best if these files were already in the correct
format on the disk -- maybe via a daemon that periodically 'fixes'
those in need, a special ftp server, or a file-system plugin/hook
(mmm...!) -- so that nothing special has to be done to them by the web
server. In which case, reading them all as binary (as your patch takes
care of) works fine.

Jul 18 '05 #12
Irmen de Jong wrote:
> - fixing the content-type at the end because it's not the same as
> the filesize (this was the original problem I solved).

Whoops, of course I meant: fixing the Content-Length.

--Irmen
Jul 18 '05 #13
[Irmen de Jong]
> in and convert the line endings to CR LF. But this requires:
>
> - buffering of the entire text file in memory during the CR LF conversion
> - fixing the content-type at the end because it's not the same as
> the filesize (this was the original problem I solved).
>
> What do you think?


Or send it with "Transfer-Encoding: chunked", which is purpose-built
for dealing with content of unknown length.

http://www.cse.ohio-state.edu/cgi-bi...html#sec-3.6.1

Each chunk is preceded by a header indicating the size of the
forthcoming chunk. An algorithm for reading chunked encoding is given
in one of the appendices; it's pretty simple to understand:

http://www.cse.ohio-state.edu/cgi-bi...tml#sec-19.4.6

So on the server side, simply keep buffering as much of the textual
content as you want to send in each chunk, do your line-ending
translation on the buffer, send the size of the buffer in ASCII hex,
and then send the buffer contents. Repeat until EOF.
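
A rough sketch of that loop (untested; readlines() with a size hint
stands in for "as much as you want to send per chunk", and the last
line is assumed to want a CRLF too):

def send_chunked_text(f, wfile):
    # Emit the body as HTTP/1.1 chunks ('<hex size>CRLF<data>CRLF'),
    # converting line endings as we go; no Content-Length header needed.
    while 1:
        lines = f.readlines(8192)              # buffer roughly 8K of lines
        if not lines:
            break
        chunk = ''.join([l.rstrip('\r\n') + '\r\n' for l in lines])
        wfile.write('%x\r\n%s\r\n' % (len(chunk), chunk))
    wfile.write('0\r\n\r\n')                   # a zero-size chunk ends the body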

This is how robust production servers like Apache and Tomcat do it:
they don't try to buffer the content, for a whole load of good reasons.

Each chunk can also carry chunk extensions, and trailing headers can be
appended after the final chunk, so it should be relatively simple to
layer compression on top as well (e.g. gzip as an additional
transfer-coding), for that extra bandwidth saving.

Just one more way to do it.

--
alan kennedy
------------------------------------------------------
check http headers here: http://xhaus.com/headers
email alan: http://xhaus.com/contact/alan
Jul 18 '05 #14
[Alan Kennedy]

[snip spiel about chunked transfer encoding]
> This is how robust production servers like Apache and Tomcat do it: they
> don't try to buffer the content, for a whole load of good reasons.


I should have stressed: this strategy is most often used when there is
some form of dynamic content generation going on, when it's not
possible to know when the output of the user's code (e.g. CGI, PHP,
ASP, etc.) is going to stop.

--
alan kennedy
------------------------------------------------------
check http headers here: http://xhaus.com/headers
email alan: http://xhaus.com/contact/alan
Jul 18 '05 #15
