Are the built-in HTTP servers production quality?

I've heard it said that web servers built upon the standard library's
SimpleHTTPServer or CGIHTTPServer aren't really suitable for use in a
production system. Is this still the case? And if so, why?

Is it primarily a performance issue? If so, aren't there a number of
things that can easily be done to improve web server performance:
caching proxy front-ends (e.g. Apache's mod_proxy), faster hardware,
server clusters, etc.? Also, it seems fairly straightforward to make
them serve requests concurrently (via the ForkingMixIn and ThreadingMixIn)...
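
For concreteness, here's the sort of thing I have in mind (a minimal,
untested sketch using the stock mixins and Python 2-era module names):

import SocketServer
import BaseHTTPServer
import SimpleHTTPServer

class ThreadingHTTPServer(SocketServer.ThreadingMixIn,
                          BaseHTTPServer.HTTPServer):
    # Each request gets handled in its own thread.
    pass

if __name__ == '__main__':
    httpd = ThreadingHTTPServer(
        ('', 8000), SimpleHTTPServer.SimpleHTTPRequestHandler)
    httpd.serve_forever()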

Is it that they're not secure, i.e. that they can be easily
compromised/cracked? If so, wouldn't hiding them behind trusted
front-ends (like Apache) help there too?

Are they simply too buggy to be relied upon? They leak memory, don't
correctly handle certain types of requests, or perhaps they're not
standards compliant enough?

They're so easy to work with that I'd really love to understand what we
believe their shortcomings to be.

Thanks.

Paul

Jul 18 '05 #1
Paul Morrow wrote:
> I've heard it said that web servers built upon the standard library's
> SimpleHTTPServer or CGIHTTPServer aren't really suitable for use in a
> production system. Is this still the case? And if so, why?


For starters, the SimpleHTTPServer reports the wrong Content-Length.
See my patch at http://tinyurl.com/56frb
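
To illustrate my reading of the problem (a throwaway demo, not the
patch itself): the handler opens text files in text mode but advertises
the file's on-disk size, and on Windows a text-mode read translates
CRLF to LF, so fewer bytes go out than promised.

# Demonstrate the mismatch between the on-disk size and a text-mode read.
# (Only Windows shows the difference; on Unix 'r' and 'rb' read the same bytes.)
import os

path = 'crlf_demo.txt'                     # hypothetical throwaway file
open(path, 'wb').write('one\r\ntwo\r\n')   # 10 bytes on disk

on_disk = os.stat(path).st_size            # the advertised Content-Length: 10
as_read = len(open(path, 'r').read())      # a text-mode read on Windows: 8
print on_disk, as_read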

--Irmen
Jul 18 '05 #2
Irmen de Jong wrote:

> For starters, the SimpleHTTPServer reports the wrong Content-Length.
> See my patch at http://tinyurl.com/56frb


Yes, something is wrong there. I wonder though if it makes sense to
continue to open text files in 'text' mode, so that the treatment of
newlines is normalized, but then adjust the content-length to be the
length of the (newline converted) string as read from the file.
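
I.e. something along these lines (an untested sketch):

def text_body_and_length(path):
    # Open in text mode so newlines are normalized to '\n', then take
    # the Content-Length from the converted data, not from os.stat().
    f = open(path, 'r')
    body = f.read()
    f.close()
    return body, len(body)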

Jul 18 '05 #3
Paul Morrow wrote:
> Irmen de Jong wrote:
>> For starters, the SimpleHTTPServer reports the wrong Content-Length.
>> See my patch at http://tinyurl.com/56frb
>
> Yes, something is wrong there. I wonder though if it makes sense to
> continue to open text files in 'text' mode, so that the treatment of
> newlines is normalized, but then adjust the content-length to be the
> length of the (newline converted) string as read from the file.


Well, I think that could be done too.
But why not ditch the newline conversion altogether and
treat text files just like any other file (which is what my patch does)?

-Irmen
Jul 18 '05 #4
Irmen de Jong wrote:
> Paul Morrow wrote:
>> Irmen de Jong wrote:
>>> For starters, the SimpleHTTPServer reports the wrong Content-Length.
>>> See my patch at http://tinyurl.com/56frb
>
>> Yes, something is wrong there. I wonder though if it makes sense to
>> continue to open text files in 'text' mode, so that the treatment of
>> newlines is normalized, but then adjust the content-length to be the
>> length of the (newline converted) string as read from the file.
>
> Well, I think that could be done too.
> But why not ditch the newline conversion altogether and
> treat text files just like any other file (which is what my patch does)?
>
> -Irmen


I'm not sure that we can rely on the client browser doing the right
thing with the newlines. I'm not aware of an RFC that really covers
this, but do all browsers convert text files as needed to their native
format? If so, sending them as binary (unaltered) would be fine. But
if not, maybe the solution is to detect (or make a guess at) the OS of
the client, then adjust the newlines accordingly (yuck!)...

Jul 18 '05 #5

"Paul Morrow" <pm****@yahoo.com> wrote in message
news:ma*************************************@pytho n.org...
I'm not sure that we can rely on the client browser doing the right
thing with the newlines. I'm not aware of an rfc that really covers
this,


HTTP/1.1 covers this topic:

http://www.w3.org/Protocols/rfc2616/....html#sec3.7.1

Still, that doesn't help if your server's text format isn't one of those.
Jul 18 '05 #6
Richard Brodie wrote:
"Paul Morrow" <pm****@yahoo.com> wrote in message
news:ma*************************************@pytho n.org...

I'm not sure that we can rely on the client browser doing the right
thing with the newlines. I'm not aware of an rfc that really covers
this,

HTTP 1.1 covers this topic:

http://www.w3.org/Protocols/rfc2616/....html#sec3.7.1

Still, that doesn't help if your server text format isn't one of these.


Hmm, well, reading that particular section of the HTTP spec makes
me think that the solution I programmed in the patch isn't the optimal
one.
Rather than treating text files (with content-type text/...) as any
other -binary- file, I now think that it's actually better to read them
in and convert the line endings to CR LF. But this requires:

- buffering of the entire text file in memory during the CR LF conversion
- fixing the content-type at the end because it's not the same as
the filesize (this was the original problem I solved).
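
Concretely, the buffered version might look something like this (untested):

def read_text_as_crlf(path):
    # Buffer the whole file in memory, normalizing every line-ending
    # convention (CRLF, bare CR, bare LF) to CRLF, and return it along
    # with its corrected Content-Length.
    data = open(path, 'rb').read()
    data = data.replace('\r\n', '\n').replace('\r', '\n')
    data = data.replace('\n', '\r\n')
    return data, len(data)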

What do you think?
Bye,
Irmen.
Jul 18 '05 #7
On 2004-07-19, Irmen de Jong <irmen@-nospam-remove-this-xs4all.nl> wrote:
> Rather than treating text files (with content-type text/...) as any
> other -binary- file, I now think that it's actually better to read them
> in and convert the line endings to CR LF. But this requires:
>
> - buffering of the entire text file in memory during the CR LF conversion

Why do you have to buffer the entire file? You could make two
almost-identical conversion passes through the file: the first
time, just count the "output" bytes. Finish sending the HTTP
headers out, then make the second pass actually writing the
output bytes.

If the OS has enough spare RAM sitting around, it will buffer
the file for you and the second pass won't even hit the disk.

> - fixing the content-type at the end because it's not the same as
> the filesize (this was the original problem I solved).
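
Something like this, perhaps (untested; it assumes every line,
including the last, should go out terminated with CRLF):

def converted_length(path):
    # Pass 1: run the conversion, but only count the output bytes.
    n = 0
    for line in open(path, 'rb'):
        n += len(line.rstrip('\r\n')) + 2      # each line becomes ...CRLF
    return n

def send_converted(path, wfile):
    # Pass 2: the identical conversion, this time actually writing the bytes.
    for line in open(path, 'rb'):
        wfile.write(line.rstrip('\r\n') + '\r\n')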


--
Grant Edwards grante Yow! Here I am at the flea
at market but nobody is buying
visi.com my urine sample bottles...
Jul 18 '05 #8
Grant Edwards wrote:
> Why do you have to buffer the entire file? You could make two
> almost-identical conversion passes through the file: the first
> time, just count the "output" bytes. Finish sending the HTTP
> headers out, then make the second pass actually writing the
> output bytes.


That would slow things down quite a bit, because you now
have to do the same CPU-intensive task twice.

But it saves memory.

Oh, the choices...

--Irmen
Jul 18 '05 #9
On Mon, 19 Jul 2004, Irmen de Jong wrote:
> Grant Edwards wrote:
>> Why do you have to buffer the entire file? You could make two
>> almost-identical conversion passes through the file: the first
>> time, just count the "output" bytes. Finish sending the HTTP
>> headers out, then make the second pass actually writing the
>> output bytes.
>
> That would slow things down quite a bit, because you now
> have to do the same CPU-intensive task twice.
>
> But it saves memory.
>
> Oh, the choices...


Choices, indeed:

from StringIO import StringIO
from gzip import GzipFile

# Buffer the converted data compressed in memory instead of raw.
# slurp_data, mangle and spew_data are placeholders for the real I/O.
gzipdata = StringIO()
gzipfile = GzipFile(mode='w', fileobj=gzipdata)

length = 0
for line in slurp_data():
    line = mangle(line)
    length += len(line)        # count the uncompressed output bytes
    gzipfile.write(line)

gzipfile.close()

gzipdata.seek(0)
print length                   # this is the Content-Length to send
for line in GzipFile(fileobj=gzipdata):
    spew_data(line)
;)
Jul 18 '05 #10
Irmen de Jong wrote:
> Grant Edwards wrote:
>> Why do you have to buffer the entire file? You could make two
>> almost-identical conversion passes through the file: the first
>> time, just count the "output" bytes. Finish sending the HTTP
>> headers out, then make the second pass actually writing the
>> output bytes.
>
> That would slow things down quite a bit, because you now
> have to do the same CPU-intensive task twice.
>
> But it saves memory.
>
> Oh, the choices...
>
> --Irmen


Or, to improve it slightly, you could do a find loop on the characters
you have to convert [LF for Unix], work out how many characters you're
going to have to add, compute the Content-Length, send the headers, and
then use the cached list of LF positions to send the strings plus the
inserted characters.

Now that would have been easier to say in Python...
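
For instance (untested; assumes the input uses bare LF endings, with
wfile standing in for whatever the handler writes to):

def find_lfs(data):
    # Cache the position of every LF so we know how many CRs get inserted.
    positions = []
    i = data.find('\n')
    while i != -1:
        positions.append(i)
        i = data.find('\n', i + 1)
    return positions

def send_with_crlf(data, wfile):
    positions = find_lfs(data)
    # Content-Length is the original size plus one CR per LF found.
    wfile.write('Content-Length: %d\r\n\r\n' % (len(data) + len(positions)))
    start = 0
    for pos in positions:
        wfile.write(data[start:pos] + '\r\n')  # the string up to the LF, plus CRLF
        start = pos + 1
    wfile.write(data[start:])                  # trailing data after the last LF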

David
Jul 18 '05 #11
Irmen de Jong wrote:
> Richard Brodie wrote:
>> HTTP/1.1 covers this topic:
>>
>> http://www.w3.org/Protocols/rfc2616/....html#sec3.7.1
>>
>> Still, that doesn't help if your server's text format isn't one of those.
>
> Hmm, well, reading that particular section of the HTTP spec makes
> me think that the solution I programmed in the patch isn't the optimal
> one.
> Rather than treating text files (with content-type text/...) as any
> other -binary- file, I now think that it's actually better to read them
> in and convert the line endings to CR LF. But this requires:
>
> - buffering of the entire text file in memory during the CR LF conversion
> - fixing the content-type at the end because it's not the same as
> the filesize (this was the original problem I solved).
>
> What do you think?


I think that it might be best if these files were already in the correct
format on the disk -- maybe via a daemon that periodically 'fixes'
those in need, a special ftp server, or a file-system plugin/hook
(mmm...!) -- so that nothing special has to be done to them by the web
server. In which case, reading them all as binary (as your patch takes
care of) works fine.

Jul 18 '05 #12
Irmen de Jong wrote:
> - fixing the content-type at the end because it's not the same as
> the filesize (this was the original problem I solved).

Whoops, of course I meant: fixing the Content-Length.

--Irmen
Jul 18 '05 #13
[Irmen de Jong]
> in and convert the line endings to CR LF. But this requires:
>
> - buffering of the entire text file in memory during the CR LF conversion
> - fixing the content-type at the end because it's not the same as
> the filesize (this was the original problem I solved).
>
> What do you think?


Or send it with "Transfer-Encoding: chunked", which is purpose-built
for dealing with content of unknown length.

http://www.cse.ohio-state.edu/cgi-bi...html#sec-3.6.1

Each chunk is preceded by a header indicating the size of the
forthcoming chunk. An algorithm for reading chunked encoding is given
in one of the appendices; it's pretty simple to understand:

http://www.cse.ohio-state.edu/cgi-bi...tml#sec-19.4.6

So on the server side, simply keep buffering as much of the textual
content as you want to send in each chunk, do your line-ending
translation on the buffer, send the size of the buffer in ASCII hex,
and then send the buffer contents. Repeat until EOF.
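
A rough sketch of that loop (untested; readlines() with a size hint
stands in for "as much as you want to send per chunk", and the last
line is assumed to want a CRLF too):

def send_chunked_text(f, wfile):
    # Emit the body as HTTP/1.1 chunks ('<hex size>CRLF<data>CRLF'),
    # converting line endings as we go; no Content-Length header needed.
    while 1:
        lines = f.readlines(8192)              # buffer roughly 8K of lines
        if not lines:
            break
        chunk = ''.join([l.rstrip('\r\n') + '\r\n' for l in lines])
        wfile.write('%x\r\n%s\r\n' % (len(chunk), chunk))
    wfile.write('0\r\n\r\n')                   # a zero-size chunk ends the body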

This is how robust production servers like Apache and Tomcat do it:
they don't try to buffer the content, for a whole load of good reasons.

Each chunk can also carry chunk extensions, and trailing headers can be
appended after the final chunk, so it should be relatively simple to
layer compression on top as well (e.g. gzip as an additional
transfer-coding), for that extra bandwidth saving.

Just one more way to do it.

--
alan kennedy
------------------------------------------------------
check http headers here: http://xhaus.com/headers
email alan: http://xhaus.com/contact/alan
Jul 18 '05 #14
[Alan Kennedy]

[snip spiel about chunked transfer encoding]
> This is how robust production servers like Apache and Tomcat do it: they
> don't try to buffer the content, for a whole load of good reasons.


I should have stressed: this strategy is most often used when there is
some form of dynamic content generation going on, when it's not
possible to know when the output of the user's code (e.g. CGI, PHP,
ASP, etc.) is going to stop.

--
alan kennedy
------------------------------------------------------
check http headers here: http://xhaus.com/headers
email alan: http://xhaus.com/contact/alan
Jul 18 '05 #15
