Previewing user input HTML

I have a page that accepts user input, including HTML. I would like to
offer a preview of what the user's HTML will look like, but I'd also like
to avoid having to parse their HTML to ensure that it is valid.

The sorts of things that cause problems are unmatched quotes inside the
HTML and mismatched <>'s around the HTML. There are probably others
(thus demonstrating why I need to avoid parsing it).

The mismatched <>'s are not too difficult - I can add a ">" of my own,
but then it will be visible.

I realise we are into the land of handling invalid HTML, so all bets are
off, but is there any good approach to such a problem?

If I do end up parsing the user's HTML, do I need to worry about more
than mismatched <>'s and quotes (inside the <>'s)? Remember, I don't
actually care what it looks like, as long as it doesn't upset my own
HTML which follows the preview.

--
Steve Swift
http://www.swiftys.org.uk/swifty.html
http://www.ringers.org.uk
Sep 30 '08 #1
On 2008-09-30, Steve Swift <St***********@gmail.com> wrote:
<snip>
> Remember, I don't
> actually care what it looks like, as long as it doesn't upset my own
> HTML which follows the preview.

I think if you use innerHTML, your own HTML will probably be OK.

The browser will parse their garbage to create a subtree for the element
whose innerHTML you're setting, and then attach that subtree to your DOM
tree. It won't paste their garbage into your HTML and parse the whole
lot again.

To be absolutely sure, you could parse their input before attaching it
to your DOM tree.

Something like:

var div = document.createElement("div"); // unattached node
div.innerHTML = userGarbage;

Then use appendChild to attach the div into your DOM tree.

But I don't think that will be necessary.
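
Put together as a function, the approach above might look like the following. This is only a sketch: showPreview, doc, container and userHtml are made-up names, and doc is passed in (normally the page's document) purely so the function can be exercised outside a browser.

```javascript
// Sketch only: all names here are hypothetical, not from the original page.
function showPreview(doc, container, userHtml) {
  var holder = doc.createElement("div"); // unattached node
  holder.innerHTML = userHtml;           // the browser parses the fragment here

  // Clear whatever preview was shown before.
  while (container.firstChild) {
    container.removeChild(container.firstChild);
  }

  // Attach the already-parsed subtree; the rest of the page is not re-parsed.
  container.appendChild(holder);
  return holder;
}
```

In a page it would be called with something like showPreview(document, document.getElementById("preview"), textarea.value), where the "preview" id and the textarea are assumptions.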
Sep 30 '08 #2
Steve Swift <St***********@gmail.com> writes:
> I have a page that accepts user input, including HTML. I would like to
> offer a preview of what the user's HTML will look like, but I'd also
> like to avoid having to parse their HTML to ensure that it is valid.
<snip>
> ... Remember, I don't
> actually care what it looks like, as long as it doesn't upset my own
> HTML which follows the preview.

Can you side-step the problem by keeping the user HTML separate and
displaying it using an <object> element?
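
One way to try this without a second request is to hand the user's HTML to the object element as a data: URL. The sketch below only builds the markup; buildObjectPreview and the width and height values are made up for illustration, and whether a given browser will render a data: URL inside an object element is an assumption that needs testing.

```javascript
// Sketch only: the function name and the dimensions are hypothetical.
function buildObjectPreview(userHtml) {
  // encodeURIComponent escapes <, >, ", & and spaces, so nothing in the
  // user's text can terminate the double-quoted data attribute below.
  var dataUrl = "data:text/html;charset=utf-8," + encodeURIComponent(userHtml);
  return '<object type="text/html" data="' + dataUrl +
         '" width="600" height="300"></object>';
}
```

Because the user's markup ends up in a separate document, unmatched quotes or brackets in it cannot disturb the HTML that follows the preview.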

--
Ben.
Sep 30 '08 #3
In article <sl*********************@bowser.marioworld>,
Ben C <sp******@spam.eggs> wrote:
<snip>
> To be absolutely sure, you could parse their input before attaching it
> to your DOM tree.
<snip>
> But I don't think that will be necessary.

I don't know about that, but it seems to me that you will need to run the
user-provided HTML through something first, just to ensure that no
malicious code has been inserted that could pose a security risk. I
believe the Perl CGI module has a function or functions you can use to
do this, and I would be willing to bet you can find equivalent JS tools.

Which leads to the thought that, since you're going to have to
pre-process the user HTML anyway, maybe you could also pipe it through
something like htmlTidy (I think that's its name)?
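
As a rough illustration of what "running it through something" could mean on the JS side, here is an allowlist pass over an already-parsed tree. It is a sketch, not a vetted sanitizer: scrub and both allowlists are made up, and it assumes the tree came from an inert parse such as new DOMParser().parseFromString(html, "text/html").body, because as far as I know handlers like onerror on an image can fire even in a detached div filled through innerHTML.

```javascript
// Hypothetical allowlists; adjust to whatever markup the preview should permit.
var ALLOWED_TAGS = ["B", "I", "EM", "STRONG", "P", "BR", "UL", "OL", "LI", "A"];
var ALLOWED_ATTRS = ["href", "title"];

function scrub(root) {
  // Copy the child list first, because elements are removed while walking.
  Array.from(root.children).forEach(function (el) {
    if (ALLOWED_TAGS.indexOf(el.tagName) === -1) {
      el.remove(); // drop the element and everything inside it
      return;
    }
    Array.from(el.attributes).forEach(function (attr) {
      var name = attr.name.toLowerCase();
      var allowed = ALLOWED_ATTRS.indexOf(name) !== -1;
      // Only http, https and mailto links survive; relative links are dropped too.
      var safeHref = name !== "href" || /^(https?:|mailto:)/i.test(attr.value);
      if (!allowed || !safeHref) {
        el.removeAttribute(attr.name);
      }
    });
    scrub(el); // recurse into the children that were kept
  });
  return root;
}
```

Anything not explicitly listed is thrown away, which is easier to reason about than trying to enumerate everything dangerous.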
Sep 30 '08 #4
On 2008-09-30, David Stone <no******@domain.invalid> wrote:
<snip>
> I don't know about that, but it seems to me that you will need to run the
> user-provided html through something first, just to ensure that no
> malicious code has been inserted that could pose a security risk. I
> believe the perl CGI module has a function or functions you can use to
> do this, and I would be willing to bet you can find equivalent JS tools.
>
> Which leads to the thought that, since you're going to have to
> pre-process the user html anyway, maybe you could also pipe it through
> something like htmlTidy (I think that's its name)?

I think we're thinking about different things. Perhaps because he used
the word "preview" I got it into my head that this user HTML was not
going back to the server (wiki style) but being added to the page there
and then with JS on the client.

The idea of innerHTML is that you're using the browser's own normal
broken HTML handling to deal with things, and it's basically all you've
got on the client.

But it's much more likely that the data is going back to the server, in
which case, yes, you could easily run it through tidy and similar
checkers.
Oct 1 '08 #5
