A critique of cgi.escape

The "escape" function in the "cgi" module escapes characters with special
meanings in HTML. The ones that need escaping are '<', '&' and '"'.
However, cgi.escape only escapes the quote character if you pass a second
argument of True (the default is False):
>>> cgi.escape("the \"quick\" & <brown> fox")
'the "quick" &amp; &lt;brown&gt; fox'
>>> cgi.escape("the \"quick\" & <brown> fox", True)
'the &quot;quick&quot; &amp; &lt;brown&gt; fox'

This seems to me to be dumb. The default option should be the safe one: that
is, escape _all_ the potentially troublesome characters. The only time you
can get away with NOT escaping the quote character is outside of markup,
e.g.

<TEXTAREA>
unescaped "quotes" allowed here
</TEXTAREA>

Nevertheless, even in that situation, escaped quotes are acceptable.

So I think the default for the second argument to cgi.escape should be
changed to True. Or alternatively, the second argument should be removed
altogether, and quotes should always be escaped.

Can changing the default break existing scripts? I don't see how. It might
even fix a few lurking bugs out there.
Sep 23 '06 #1
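
To make the attribute problem from the post above concrete, here is a minimal sketch. cgi.escape is no longer available in current Python releases, so `escape_like_cgi` is a hypothetical stand-in written from the documented behaviour (escape "&", "<" and ">", plus '"' only when the flag is true); the `user_input` value and the `<input>` markup are made up for illustration. The standard library's `html.escape`, whose `quote` parameter defaults to True, is shown for comparison.

```python
import html


def escape_like_cgi(s, quote=False):
    """Hypothetical stand-in for the old cgi.escape(s[, quote])."""
    s = s.replace("&", "&amp;")  # must run first, or later entities get re-escaped
    s = s.replace("<", "&lt;").replace(">", "&gt;")
    if quote:
        s = s.replace('"', "&quot;")
    return s


# Made-up hostile value that tries to break out of a double-quoted attribute.
user_input = 'x" onmouseover="alert(1)'

# Default call: the quote survives and ends the attribute value early.
unsafe_attr = '<input value="%s">' % escape_like_cgi(user_input)

# Second argument True: the quote is escaped and the value stays contained.
safe_attr = '<input value="%s">' % escape_like_cgi(user_input, True)

# html.escape escapes quotes unless told otherwise.
modern_attr = '<input value="%s">' % html.escape(user_input)

print(unsafe_attr)
print(safe_attr)
print(modern_attr)
```

The first line printed carries a second, injected attribute; the other two do not.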
Lawrence D'Oliveiro wrote:
> So I think the default for the second argument to cgi.escape should be
> changed to True. Or alternatively, the second argument should be removed
> altogether, and quotes should always be escaped.

you're confused: cgi.escape(s) is designed to be used for ordinary text,
cgi.escape(s, True) is designed for attributes. if you use the code the
way it's intended to be used, it works perfectly fine.

> Can changing the default break existing scripts? I don't see how. It might
> even fix a few lurking bugs out there.

I'm not sure this "every time I don't immediately understand something,
I'll write a change proposal instead of reading the library reference"
approach is healthy, really.

</F>

Sep 23 '06 #2
In message <ma**************************************@python.org>, Fredrik
Lundh wrote:
> Lawrence D'Oliveiro wrote:
>> So I think the default for the second argument to cgi.escape should be
>> changed to True. Or alternatively, the second argument should be removed
>> altogether, and quotes should always be escaped.
>
> you're confused: cgi.escape(s) is designed to be used for ordinary text,
> cgi.escape(s, True) is designed for attributes.

What works for attributes also works for ordinary text.

Sep 23 '06 #3
In article <ma**************************************@python.org>, Fredrik Lundh wrote:
> Lawrence D'Oliveiro wrote:
>> So I think the default for the second argument to cgi.escape should be
>> changed to True. Or alternatively, the second argument should be removed
>> altogether, and quotes should always be escaped.
>
> you're confused: cgi.escape(s) is designed to be used for ordinary text,
> cgi.escape(s, True) is designed for attributes. if you use the code the
> way it's intended to be used, it works perfectly fine.

He's not confused, he's correct; the author of cgi.escape is the
confused one. The optional extra parameter is completely unnecessary
and achieves nothing except to make it easier for people to end up
with bugs in their code.

Making cgi.escape always escape the '"' character would not break
anything, and would probably fix a few bugs in existing code. Yes,
those bugs are not cgi.escape's fault, but that's no reason not to
be helpful. It's a minor improvement with no downside.

One thing that is flat-out wrong, by the way, is that cgi.escape()
does not encode the apostrophe (') character. This is essentially
identical to the quote character in HTML, so any code which is escaping
one should always be escaping the other.
Sep 24 '06 #4
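
The apostrophe point can be sketched the same way. As before, `escape_like_cgi` is a hypothetical stand-in written from the documented behaviour of cgi.escape, and the `name` value and `<a>` markup are invented for illustration. The comparison uses the standard library's `html.escape`, which escapes both quote characters when `quote` is true.

```python
import html


def escape_like_cgi(s, quote=False):
    """Hypothetical stand-in for the old cgi.escape(s[, quote])."""
    s = s.replace("&", "&amp;")  # must run first
    s = s.replace("<", "&lt;").replace(">", "&gt;")
    if quote:
        s = s.replace('"', "&quot;")  # only the double quote is handled
    return s


# Made-up value aimed at a template that quotes attributes with apostrophes.
name = "' onclick='alert(1)"

# Even with quote=True the apostrophes pass through untouched.
single_quoted = "<a title='%s'>link</a>" % escape_like_cgi(name, True)

# html.escape turns each apostrophe into the &#x27; character reference.
fixed = "<a title='%s'>link</a>" % html.escape(name, quote=True)

print(single_quoted)
print(fixed)
```

Whether this counts as a bug in the function or a misuse of it is exactly what the thread goes on to argue about; the sketch only shows the mechanics.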
In message <sl***********************@snowy.squish.net>, Jon Ribbens wrote:
> In article <ma**************************************@python.org>, Fredrik
> Lundh wrote:
>> Lawrence D'Oliveiro wrote:
>>> So I think the default for the second argument to cgi.escape should be
>>> changed to True. Or alternatively, the second argument should be removed
>>> altogether, and quotes should always be escaped.
>>
>> you're confused: cgi.escape(s) is designed to be used for ordinary text,
>> cgi.escape(s, True) is designed for attributes. if you use the code the
>> way it's intended to be used, it works perfectly fine.
>
> He's not confused, he's correct; the author of cgi.escape is the
> confused one.

Thanks for backing me up. :)

> One thing that is flat-out wrong, by the way, is that cgi.escape()
> does not encode the apostrophe (') character. This is essentially
> identical to the quote character in HTML, so any code which is escaping
> one should always be escaping the other.

I must confess I did a double-take on this. But I rechecked the HTML spec
(HTML 4.0, section 3.2.2, "Attributes"), and you're right--single quotes
ARE allowed as an alternative to double quotes. It's just I've never used
them as quotes. :)
Sep 24 '06 #5
Lawrence D'Oliveiro wrote:
> What works for attributes also works for ordinary text.

attributes and ordinary text are two different things in HTML and XML.
you're arguing that it's a good idea for *everyone* to bloat down
ordinary text just because you're too lazy to use a piece of code in the
intended way.

</F>

Sep 24 '06 #6
Jon Ribbens wrote:
> Making cgi.escape always escape the '"' character would not break
> anything, and would probably fix a few bugs in existing code. Yes,
> those bugs are not cgi.escape's fault, but that's no reason not to
> be helpful. It's a minor improvement with no downside.

the "improvement with no downside" would bloat down the output for
everyone who's using the function in the intended way, and will also
break unit tests.

> One thing that is flat-out wrong, by the way, is that cgi.escape()
> does not encode the apostrophe (') character.

it's intentional, of course: you're supposed to use " if you're using
cgi.escape(s, True) to escape attributes. again, punishing people who
actually read the docs and understand them is not a very good way to
maintain software.

btw, you're both missing that cgi.escape isn't good enough for general
use anyway, since it doesn't deal with encodings at all. if you want a
general purpose function that can be used for everything that can be put
in an HTML file, you need more than just a modified cgi.escape. feel
free to propose a general-purpose replacement (which should have a new
name), but make sure you think through *all* the issues before you do that.

</F>

Sep 24 '06 #7
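
The post above does not spell out what "deal with encodings" would involve. One possible reading, offered only as an assumption, is that characters outside the page's output encoding also need handling once the markup characters are escaped. A minimal sketch of that reading, using the standard library's `html.escape` and the `xmlcharrefreplace` error handler of `str.encode` (the sample text is made up):

```python
import html

# Made-up sample containing markup characters and non-ASCII letters.
text = 'café <b> & "naïve"'

# Step 1: escape the markup characters. quote=False mirrors the
# "ordinary text" use case discussed in the thread.
escaped = html.escape(text, quote=False)

# Step 2: encode for an ASCII-only page; characters that do not fit are
# replaced by numeric character references instead of raising an error.
ascii_bytes = escaped.encode("ascii", "xmlcharrefreplace")

print(ascii_bytes)
```

The order matters in this sketch: encoding first and escaping afterwards would turn the "&" of each character reference into "&amp;".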
In message <ma**************************************@python.org>, Fredrik
Lundh wrote:
> Jon Ribbens wrote:
>> Making cgi.escape always escape the '"' character would not break
>> anything, and would probably fix a few bugs in existing code. Yes,
>> those bugs are not cgi.escape's fault, but that's no reason not to
>> be helpful. It's a minor improvement with no downside.
>
> the "improvement with no downside" would bloat down the output for
> everyone who's using the function in the intended way, and will also
> break unit tests.

I don't understand this "bloat down" nonsense. Any tests that would break
are obviously testing the wrong thing.

>> One thing that is flat-out wrong, by the way, is that cgi.escape()
>> does not encode the apostrophe (') character.
>
> it's intentional, of course: you're supposed to use " if you're using
> cgi.escape(s, True) to escape attributes.

Attributes can be quoted with either single or double quotes. That's what
the HTML spec says. cgi.escape doesn't correctly allow for that. Ergo,
cgi.escape is broken. QED.

> btw, you're both missing that cgi.escape isn't good enough for general
> use anyway, since it doesn't deal with encodings at all.

Why does it need to?

Sep 24 '06 #8
Lawrence D'Oliveiro wrote:
> Attributes can be quoted with either single or double quotes. That's what
> the HTML spec says. cgi.escape doesn't correctly allow for that. Ergo,
> cgi.escape is broken. QED.

do you ever think before you post?

</F>

Sep 24 '06 #9
Lawrence D'Oliveiro wrote:
> In message <ma**************************************@python.org>, Fredrik
> Lundh wrote:
>> Jon Ribbens wrote:
>>> Making cgi.escape always escape the '"' character would not break
>>> anything, and would probably fix a few bugs in existing code. Yes,
>>> those bugs are not cgi.escape's fault, but that's no reason not to
>>> be helpful. It's a minor improvement with no downside.
>>
>> the "improvement with no downside" would bloat down the output for
>> everyone who's using the function in the intended way, and will also
>> break unit tests.
>
> I don't understand this "bloat down" nonsense. Any tests that would break
> are obviously testing the wrong thing.

&quot; is 5 characters more than ".

>>> One thing that is flat-out wrong, by the way, is that cgi.escape()
>>> does not encode the apostrophe (') character.
>>
>> it's intentional, of course: you're supposed to use " if you're using
>> cgi.escape(s, True) to escape attributes.
>
> Attributes can be quoted with either single or double quotes. That's what
> the HTML spec says. cgi.escape doesn't correctly allow for that. Ergo,
> cgi.escape is broken. QED.

A function is broken if its implementation doesn't match the documentation.

As a courtesy, I've pasted it below.

escape(s[, quote])
    Convert the characters "&", "<" and ">" in string s to HTML-safe sequences.
    Use this if you need to display text that might contain such characters in HTML.
    If the optional flag quote is true, the quotation mark character (") is also
    translated; this helps for inclusion in an HTML attribute value, as in <A
    HREF="...">. If the value to be quoted might include single- or double-quote
    characters, or both, consider using the quoteattr() function in the
    xml.sax.saxutils module instead.

Now, do you still think cgi.escape is broken?

Georg
Sep 24 '06 #10
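
For reference, a short sketch of the quoteattr() function that the pasted documentation points to. It lives in xml.sax.saxutils and returns the value already wrapped in quotes, choosing the quote character based on what the value contains. The sample values and the `<a title=...>` markup are made up for illustration.

```python
from xml.sax.saxutils import quoteattr

# Made-up attribute values covering the quoting cases discussed in the thread.
samples = [
    "plain",          # no quotes: wrapped in double quotes
    'say "hi"',       # double quotes only: wrapped in single quotes
    "it's",           # apostrophe only: wrapped in double quotes
    'both " and \'',  # both kinds: double-quoted, with " turned into &quot;
]

for value in samples:
    # quoteattr supplies the surrounding quotes, so the template adds none.
    print("<a title=%s>" % quoteattr(value))
```

Because the function picks the delimiters itself, the caller never has to decide between single- and double-quoted attributes.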
