Python's CGI and JavaScript's encodeURI: A disconnect.

It's all Netscape's fault.

RFC 2396 (URI Specifications) specifies that a space shall be encoded
using %20 and the plus symbol is always safe. Netscape (and possibly
even earlier browsers like Mosaic) used the plus symbol '+' as a
substitute for the space in the last part of the URI, arguments to the
object referenced (you know, all the stuff after the question mark in
a URL).

The ECMA-262 "Javascript" standard now supported by both Netscape and
Internet Explorer honors RFC 2396, translating spaces into their hex
equivalent %20 and leaving pluses alone.

The Python library cgi.FieldStorage decodes it backwards, expecting
pluses to be spaces and %2b to represent pluses. This behavior is
present even in Python 2.2, and arguably helps support older browsers.
But when web applications are heavily JavaScript-dependent, this can
cause major headaches.

Other than overriding cgi.FieldStorage's parse_qsl, is there any way to
fix this disconnect?

Elf
Jul 18 '05 #1

Elf M. Sternberg <el*@drizzle.com> wrote:

> Netscape (and possibly even earlier browsers like Mosaic) used the
> plus symbol '+' as a substitute for the space in the last part of
> the URI

This is correct in a query parameter, e.g. in ...?foo=abc+def, where the
'+' represents a space.

This is part of the specification for the media type
application/x-www-form-urlencoded, defined by HTML itself (section
17.13.4.1 of the 4.01 spec). It states that spaces are encoded as '+'.
Using '%20' instead decodes to the same thing and causes less
confusion, so that's what newer browsers (and I) do.

Elsewhere, spaces should not be encoded as '+'.
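
To make the distinction concrete, here is a minimal Python sketch. It assumes Python 3 and uses `urllib.parse.parse_qsl` as a stand-in for the query-string decoding that cgi.FieldStorage performs; the parameter name `foo` is just the one from the example above.

```python
from urllib.parse import parse_qsl, unquote

# Query-string (form-urlencoded) decoding: '+' and '%20' both mean a space.
plus_form = parse_qsl("foo=abc+def")
hex_form = parse_qsl("foo=abc%20def")

# Plain percent-decoding, as applied outside the query part: '+' stays a plus.
path_segment = unquote("abc+def")

print(plus_form)     # [('foo', 'abc def')]
print(hex_form)      # [('foo', 'abc def')]
print(path_segment)  # abc+def
```

Both spellings of the space arrive at the server as the same value; only a literal plus needs special care.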

The reasoning for this initial decision is unclear - presumably it is
intended to improve readability, but URIs with query parts are
generally not going to be very readable anyway.

> The ECMA-262 "Javascript" standard now supported by both Netscape and
> Internet Explorer honors RFC 2396, translating spaces into their hex
> equivalent %20 and leaving pluses alone.

Depends which function you are talking about. The 'escape' and 'encodeURI'
built-in functions are not designed to encode single URI query parameter
values; they're designed to encode larger chunks of URI. As such they do
not need to encode plus characters.

The encodeURIComponent function *does*, and it is this function that you
should use if you want some JavaScript code to submit a query parameter.

The only drawback is that encodeURIComponent is relatively new, so you
won't find it on medium-old browsers like Netscape 4 and IE 5.0. (The
same goes for encodeURI - you only get 'escape' in older browsers.)
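
The difference between encoding a whole URI and encoding a single parameter value can be sketched in Python with `urllib.parse.quote`. This is only an analogy to the JavaScript functions, not a reimplementation: `encode_whole_uri`, `encode_component`, and the example.com URL are hypothetical names chosen for illustration, and the reserved-character list is taken from RFC 2396.

```python
from urllib.parse import quote

# Characters a whole-URI encoder has to leave intact because they
# delimit the parts of the URI (RFC 2396 reserved set, plus '#').
URI_RESERVED = ";/?:@&=+$,#"

def encode_whole_uri(uri):
    """Hypothetical analogue of encodeURI: reserved characters survive."""
    return quote(uri, safe=URI_RESERVED)

def encode_component(value):
    """Hypothetical analogue of encodeURIComponent: reserved characters are escaped."""
    return quote(value, safe="")

value = "C++ rocks"
base = "http://example.com/search?q="

whole = encode_whole_uri(base + value)
component = base + encode_component(value)

print(whole)      # http://example.com/search?q=C++%20rocks
print(component)  # http://example.com/search?q=C%2B%2B%20rocks
```

Only the second URL carries the pluses in a form that a form-urlencoded decoder will read back as pluses.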

> The Python library cgi.FieldStorage decodes it backwards, expecting
> pluses to be spaces and %2b to represent pluses.

The Python library is correct per spec. If your scripts are not encoding
plus symbols in query parameters to %2B, they are at fault (and will go
equally wrong in any other language).
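
Seen from the server, the failure looks like this. The sketch assumes Python 3, again with `urllib.parse.parse_qsl` standing in for cgi.FieldStorage; the parameter name `expr` and the sample value are made up for the example.

```python
from urllib.parse import parse_qsl, quote

sent = "2+2"

# A client that leaves '+' unescaped, as escape() and encodeURI() do.
broken = dict(parse_qsl("expr=" + sent))

# A client that escapes '+' to %2B, as encodeURIComponent() does.
fixed = dict(parse_qsl("expr=" + quote(sent, safe="")))

print(broken)  # {'expr': '2 2'}
print(fixed)   # {'expr': '2+2'}
```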

Possible solutions:

a. use encodeURIComponent() instead. This is best, but won't work
universally.
b. use escape(), then replace any pluses in its output with %2B. This
is OK, but won't handle Unicode properly or predictably. (note: in IE,
encodeURI() also fails to handle Unicode predictably.)
c. roll your own encodeURIComponent function.

It's a bit off-topic for c.l.py, but here's a (c.)-style solution I've used
before:

function encPar(wide) {
  // Convert to a string of UTF-8 byte values, then escape byte by byte.
  var narrow= encUtf8(wide);
  var enc= '';
  for (var i= 0; i<narrow.length; i++) {
    if (encPar_OK.indexOf(narrow.charAt(i))==-1)
      enc= enc+encHex2(narrow.charCodeAt(i));
    else
      enc= enc+narrow.charAt(i);
  }
  return enc;
}
// Characters passed through unescaped; space and '+' are deliberately absent.
var encPar_OK= 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'+
               '0123456789*@-_./';

// Format a byte value (0-255) as %XX.
function encHex2(v) {
  return '%'+encHex2_DIGITS.charAt(v>>>4)+encHex2_DIGITS.charAt(v&0xF);
}
var encHex2_DIGITS= '0123456789ABCDEF';

// Return a string whose character codes are the UTF-8 bytes of 'wide'.
function encUtf8(wide) {
  var c, s;
  var enc= '';
  var i= 0;
  while(i<wide.length) {
    c= wide.charCodeAt(i++);
    // handle UTF-16 surrogates
    if (c>=0xDC00 && c<0xE000) continue;    // unpaired low surrogate: skip
    if (c>=0xD800 && c<0xDC00) {
      if (i>=wide.length) continue;         // high surrogate at end: skip
      s= wide.charCodeAt(i++);
      if (s<0xDC00 || s>=0xE000) continue;  // not a low surrogate: skip both
      c= ((c-0xD800)<<10)+(s-0xDC00)+0x10000;
    }
    // output value
    if (c<0x80) enc+=
      String.fromCharCode(c);
    else if (c<0x800) enc+=
      String.fromCharCode(0xC0+(c>>6),0x80+(c&0x3F));
    else if (c<0x10000) enc+=
      String.fromCharCode(0xE0+(c>>12),0x80+(c>>6&0x3F),0x80+(c&0x3F));
    else enc+=
      String.fromCharCode(0xF0+(c>>18),0x80+(c>>12&0x3F),
                          0x80+(c>>6&0x3F),0x80+(c&0x3F));
  }
  return enc;
}

if that's of any use.

Kind of sucks having to do this, eh?
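
For checking the server side against encPar, the same algorithm can be sketched in Python. This is an assumed port, not part of the original solution: `enc_par` and `ENC_PAR_OK` are names invented here, Python 3 is assumed, and unlike the JavaScript version it does not silently skip unpaired surrogates (`str.encode` raises on them instead).

```python
from urllib.parse import parse_qsl

# Same pass-through set as encPar_OK in the JavaScript version.
ENC_PAR_OK = (
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789*@-_./"
)

def enc_par(wide):
    """Percent-encode every UTF-8 byte that is not in ENC_PAR_OK."""
    out = []
    for byte in wide.encode("utf-8"):
        ch = chr(byte)
        if byte < 0x80 and ch in ENC_PAR_OK:
            out.append(ch)
        else:
            out.append("%%%02X" % byte)
    return "".join(out)

# Plus, space, slash and a non-ASCII character (e with acute accent).
encoded = enc_par("a+b c/\u00e9")
decoded = parse_qsl("q=" + encoded)

print(encoded)  # a%2Bb%20c/%C3%A9
print(decoded == [("q", "a+b c/\u00e9")])  # True
```

The round trip through `parse_qsl` returns the original string, including the plus, which is the behaviour the encoder is meant to guarantee.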

--
Andrew Clover
mailto:an*@doxdesk.com
http://www.doxdesk.com/
Jul 18 '05 #2
