Bytes | Software Development & Data Engineering Community

&amp; and &

How can one stop a browser from converting

&amp;

to

& ?

We have a textarea in our system where a user can type in some html code
and have it saved to the database. When the data is retrieved and
redisplayed it is displayed as simply &.

HTML snippet:

<TEXTAREA NAME="p_html" ROWS=6 COLS=70 ALIGN="VIRTUAL" WRAP="YES">
fred
&
&amp;
</TEXTAREA>

When displayed, the user predictably sees

fred
&
&

What workarounds are there for this - I am sure it's a problem for
others - is there a way of "escaping" the value before display?
--

jeremy

Mar 3 '06 #1

Jeremy wrote:
> We have a textarea in our system where a user can type in some html code
> and have it saved to the database.


Users can't type "HTML code" into a <textarea>. What they type _is_
plain text, which they might _intend_ to have interpreted later as if
it were HTML. To help them do this you must first convert their plain
text as entered into HTML - part of this process would be to encode
their plaintext "&" into the HTML "&amp;", probably just before storing
it.

Good variable naming in your server code will help too - try prefixing
variables with "strUserJunk", or "htmlUserJunk" as appropriate. Whenever
you see code that assigns variables with mis-matched names, be
suspicious that there's an encoding / decoding process missing.
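
A minimal sketch of the same idea, assuming a Python server side (the thread never shows the actual server code). It uses distinct types rather than name prefixes, so a missing encoding step shows up as a type mismatch. The names `RawText`, `HtmlText`, `encode_for_html` and `render_textarea` are hypothetical, made up for this illustration.

```python
import html
from typing import NewType

# Hypothetical marker types: one for text exactly as the user typed it,
# one for text that has already been encoded for inclusion in HTML.
RawText = NewType("RawText", str)
HtmlText = NewType("HtmlText", str)


def encode_for_html(raw: RawText) -> HtmlText:
    """Encode raw user text so it can be placed inside HTML content."""
    return HtmlText(html.escape(raw, quote=True))


def render_textarea(name: str, body: HtmlText) -> str:
    """Build a textarea; the body must already be HTML-encoded."""
    return f'<textarea name="{html.escape(name, quote=True)}">{body}</textarea>'


str_user_junk = RawText("fred & &amp;")          # plain text as typed
html_user_junk = encode_for_html(str_user_junk)  # encoded before it goes into a page
page_fragment = render_textarea("p_html", html_user_junk)
```

A static type checker would flag `render_textarea("p_html", str_user_junk)`, which is the kind of mismatch the naming convention is meant to make visible by eye.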

This stuff isn't hard to do, but it does require clarity of thought and
attention to detail. It's also very important to get right (there are
some interesting attacks you can make on blogs etc. if you let users
post arbitrary chunks of HTML).

Mar 3 '06 #2
Jeremy <je********@gmail.com> wrote:
> How can one stop a browser from converting
>
> &amp;
>
> to
>
> & ?

You can't if the document is served as text/html.

If you want a browser to display &amp; literally in a document served as
text/html use &amp;amp;

Depending on what you need it may be possible to serve it as text/plain
in which case you can use the literal. This can also be embedded into a
document served as text/html.

> We have a textarea in our system where a user can type in some html code
> and have it saved to the database. When the data is retrieved and
> redisplayed it is displayed as simply &.
>
> HTML snippet:
>
> <TEXTAREA NAME="p_html" ROWS=6 COLS=70 ALIGN="VIRTUAL" WRAP="YES">
> fred
> &
> &amp;
> </TEXTAREA>
>
> When displayed, the user predictably sees
>
> fred
> &
> &


You haven't made it clear what this is used for, but maybe your server
side data processing needs to convert character references (not just
&amp;) that the user enters to &amp;char_ref

--
Spartanicus
Mar 3 '06 #3
In article <11*********************@p10g2000cwp.googlegroups. com>, Andy
Dingley says...
> Users can't type "HTML code" into a <textarea>. What they type _is_
> plain text, which they might _intend_ to have interpreted later as if
> it were HTML. To help them do this you must first convert their plain
> text as entered into HTML - part of this process would be to encode
> their plaintext "&" into the HTML "&amp;", probably just before storing
> it.
>
> Good variable naming in your server code will help too - try prefixing
> variables with "strUserJunk", or "htmlUserJunk" as appropriate. Whenever
> you see code that assigns variables with mis-matched names, be
> suspicious that there's an encoding / decoding process missing.
>
> This stuff isn't hard to do, but it does require clarity of thought and
> attention to detail. It's also very important to get right (there are
> some interesting attacks you can make on blogs etc. if you let users
> post arbitrary chunks of HTML).


Yep I understand all of that. The user types &amp; into a field and
submits the form. The &amp; is stored in the database as typed by the
user. When the data is redisplayed for editing, the browser changes the
&amp; to simply &

So it really has nothing to do with variable naming and so on - the
question is how can we present back to the user the data that they
entered into the field?

--

jeremy

Mar 3 '06 #4
On Fri, 3 Mar 2006, Andy Dingley wrote:
> Jeremy wrote:
>> We have a textarea in our system where a user can type in some
>> html code and have it saved to the database.
>
> Users can't type "HTML code" into a <textarea>

I don't see for a moment why not. In fact I've been doing it for
ages. (Of course I would term it "markup", not "code").

And see http://www.htmlhelp.com/tools/validator/direct.html.en
for a practical use of such a thing.

> What they type _is_ plain text,

What they type is text. Whether it's plain or otherwise is determined
by what the server-side process is going to use it for. There's no
way to control this: whatever they type-in, be it plain text, HTML
markup, C++ code, raw PostScript, or Linear B, gets submitted to the
server-side in accordance with the rules for forms submission. HTML
markup plays no special role in this part of the action - but it's not
for a moment excluded.

It's all about what you *do* with it when it reaches the server side.

> which they might _intend_ to have interpreted later as if
> it were HTML. To help them do this you must first convert their plain
> text as entered into HTML - part of this process would be to encode
> their plaintext "&" into the HTML "&amp;", probably just before storing
> it.

*That* would certainly not be helpful if they were supplying HTML
markup.

> This stuff isn't hard to do, but it does require clarity of thought

That's very true.

> It's also very important to get right (there are some interesting
> attacks you can make on blogs etc. if you let users post arbitrary
> chunks of HTML).


Indeed; so block the raw-HTML options to untrusted contributors. But
that doesn't mean there's anything wrong in principle with the
existence of a raw-HTML option.
Mar 3 '06 #5
Jeremy wrote:
> How can one stop a browser from converting
>
> &amp;
>
> to
>
> & ?
>
> We have a textarea in our system where a user can type in some html code
> and have it saved to the database. When the data is retrieved and
> redisplayed it is displayed as simply &.
>
> HTML snippet:
>
> <TEXTAREA NAME="p_html" ROWS=6 COLS=70 ALIGN="VIRTUAL" WRAP="YES">
> fred
> &
> &amp;
> </TEXTAREA>
>
> When displayed, the user predictably sees
>
> fred
> &
> &


Easy--convert all the & to &amp; before displaying them. "&" will become
&amp; and will display as "&", and "&amp;" will become "&amp;amp;" and
will display as "&amp;".
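
The replacement described above can be sketched in a few lines. This assumes Python purely for illustration (Jeremy's real system turns out to be PL/SQL); `escape_ampersands` is a hypothetical helper name, and `html.unescape` is used only as a stand-in for the browser decoding character references when it displays the textarea.

```python
import html


def escape_ampersands(stored: str) -> str:
    """Replace every '&' with '&amp;' before writing text into HTML."""
    return stored.replace("&", "&amp;")


stored = "fred\n&\n&amp;"            # what the user typed and the database holds
markup = escape_ampersands(stored)   # what goes between the <textarea> tags
shown = html.unescape(markup)        # stand-in for what the browser displays

print(markup)
print(shown == stored)
```

The point of the round trip is that the text shown to the user matches what was stored, because the browser undoes exactly one level of encoding.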
Mar 3 '06 #6
In article <46************@individual.net>, Harlan Messinger says...
> Easy--convert all the & to &amp; before displaying them. "&" will become
> &amp; and will display as "&", and "&amp;" will become "&amp;amp;" and
> will display as "&amp;".


Brilliant - obvious but brilliant - thanks that is all I needed.

--

jeremy
Mar 3 '06 #7
hug
Jeremy <je********@gmail.com> wrote:
> In article <46************@individual.net>, Harlan Messinger says...
>> Easy--convert all the & to &amp; before displaying them. "&" will become
>> &amp; and will display as "&", and "&amp;" will become "&amp;amp;" and
>> will display as "&amp;".
>
> Brilliant - obvious but brilliant - thanks that is all I needed.


If you're working in PHP, see htmlentities(); it'll get them all in one
swell foop, and it has the speed advantage of being a builtin.

--
http://www.ren-prod-inc.com/hug_soft...action=contact
Mar 3 '06 #8
In article <s8********************************@4ax.com>, hug says...
>> Brilliant - obvious but brilliant - thanks that is all I needed.
>
> If you're working in PHP see htmlentities() it'll get them all in one
> swell foop and it has the speed advantage of being a builtin.


Thanks - actually working in Oracle pl/sql - 'tis a simple replace()
call.

--

jeremy
Mar 3 '06 #9
Jeremy wrote:
> In article <s8********************************@4ax.com>, hug says...
>>> Brilliant - obvious but brilliant - thanks that is all I needed.
>>
>> If you're working in PHP see htmlentities() it'll get them all in one
>> swell foop and it has the speed advantage of being a builtin.
>
> Thanks - actually working in Oracle pl/sql - 'tis a simple replace()
> call.


Hug makes a good general point--you might need to make other conversions
too, like < and > and possibly quotation marks. Server-side applications
generally have access in some way to an HTMLEncode function that handles
all of that.
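
As a rough sketch of what such a function does, here is Python's standard `html.escape` (Python is an assumption here, chosen only for illustration; the sample string is made up). With `quote=True` it encodes the ampersand, the angle brackets and both kinds of quotation mark.

```python
import html

# Made-up sample containing every character that needs encoding.
stored = '<b title="x">Tom & Jerry\'s</b>'

# '&' is replaced first, so the references produced for the other
# characters are not encoded a second time.
encoded = html.escape(stored, quote=True)

print(encoded)
```

Decoding the result once, as a browser does when displaying it, gives back the original string.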
Mar 3 '06 #10
Jeremy wrote:
> In article <11*********************@p10g2000cwp.googlegroups. com>, Andy
> Dingley says...
>> Good variable naming in your server code will help too - try prefixing
>> variables with "strUserJunk", or "htmlUserJunk" as appropriate. Whenever
>> you see code that assigns variables with mis-matched names, be
>> suspicious that there's an encoding / decoding process missing.
>
> So it really has nothing to do with variable naming and so on


See this article from Joel on Software to get a better idea of what Andy
was talking about.
http://www.joelonsoftware.com/articles/Wrong.html

In short, you're receiving unsafe content from the user, expecting it to
be plain text, failing to process it to make it safe by encoding it
in HTML syntax and then outputting it directly. I suspect this will
probably be another bug if the user happens to enter this:

Hello World!</textarea>
<script>//do something evil</script>

When you include that fragment within your document and have not
processed it, the markup received by the browser would look something
like this:

<TEXTAREA NAME="p_html" ROWS=6 COLS=70 ALIGN="VIRTUAL" WRAP="YES">
Hello World!</textarea>
<script>//do something evil</script>
</TEXTAREA>
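
The difference that encoding makes can be checked with a short sketch. It assumes a Python server side, which is not what the original system uses, and `TEMPLATE` is a hypothetical, simplified stand-in for the page markup above.

```python
import html

# Simplified, hypothetical template for the textarea in the page.
TEMPLATE = '<textarea name="p_html" rows="6" cols="70">\n{body}\n</textarea>'

user_input = "Hello World!</textarea>\n<script>//do something evil</script>"

# Unprocessed: the user's </textarea> closes the element early, so the
# script element that follows becomes part of the page.
unsafe_markup = TEMPLATE.format(body=user_input)

# Encoded: '<', '>' and '&' become character references, so the whole
# input stays inside the textarea as text.
safe_markup = TEMPLATE.format(body=html.escape(user_input))

print(unsafe_markup.count("</textarea>"))  # closing tags when unprocessed
print(safe_markup.count("</textarea>"))    # closing tags when encoded
```

In the encoded version only the template's own closing tag remains, and the user still sees the text exactly as typed when the textarea is displayed.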

Also, I'm not sure what that align="virtual" attribute in your markup is
supposed to do, I've never heard of it before. Neither align nor wrap
are valid attributes of the textarea element.

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox
Mar 7 '06 #11
In article <3s***************@news-server.bigpond.net.au>, Lachlan Hunt
says...
> Jeremy wrote:
>> In article <11*********************@p10g2000cwp.googlegroups. com>, Andy
>> Dingley says...
>>> Good variable naming in your server code will help too - try prefixing
>>> variables with "strUserJunk", or "htmlUserJunk" as appropriate. Whenever
>>> you see code that assigns variables with mis-matched names, be
>>> suspicious that there's an encoding / decoding process missing.
>>
>> So it really has nothing to do with variable naming and so on
>
> See this article from Joel on Software to get a better idea of what Andy
> was talking about.
> http://www.joelonsoftware.com/articles/Wrong.html
>
> In short, you're receiving unsafe content from the user, expecting it to
> be plain text, failing to process it to make it safe by encoding it
> in HTML syntax and then outputting it directly. I suspect this will
> probably be another bug if the user happens to enter this:
>
> Hello World!</textarea>
> <script>//do something evil</script>
>
> When you include that fragment within your document and have not
> processed it, the markup received by the browser would look something
> like this:
>
> <TEXTAREA NAME="p_html" ROWS=6 COLS=70 ALIGN="VIRTUAL" WRAP="YES">
> Hello World!</textarea>
> <script>//do something evil</script>
> </TEXTAREA>
>
> Also, I'm not sure what that align="virtual" attribute in your markup is
> supposed to do, I've never heard of it before. Neither align nor wrap
> are valid attributes of the textarea element.


Thanks for all your feedback on this. Andy was addressing another issue
- I guess something related to what I was asking about. I see and
understand the point about potentially unsafe content. This is part of
an administrative toolset used by experienced and responsible site
administrators.

--

jeremy
Mar 8 '06 #12
