Bytes IT Community

RegExp help needed!

Greetings, all...

I've got an issue that I'm trying to solve and RegExp looks to be my
only avenue.

Background: We are using a popular third party control for input into a
textarea. By default, the text is HTML but the UI does not display the
HTML tags. Because we are using this in a web-messaging scenario, the
user may choose to send their message in plain text and we want to
display that on the screen.

The third party control we're using does not have any built-in ability
to switch from their rich-text editing environment to a "plain-text"
environment.

Proposed Resolution: 1) Read the data input into the third-party
control, 2) strip all HTML markup from the data while maintaining
carriage return/line feed data, 3) Hide the third-party control and
reveal an HTML text area, and, 4) copy the reformatted text into the
text area.

Additional Feature: The user should be able to re-select HTML editing,
effectively reversing this process. While any previous HTML formatting
would be lost, the cr/lf data would be preserved when switching back to
HTML.
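
A minimal sketch of that reverse direction, assuming the plain text comes from a textarea and that line breaks should become <br> tags. The function name textToHtml is hypothetical, and escaping the special characters is an added assumption so that typed text is not parsed as markup:

```javascript
// Hypothetical helper: convert plain text back to HTML for the rich-text editor.
function textToHtml(text) {
  return text
    // Escape HTML-special characters first ("&" before "<" and ">",
    // so the entities produced below are not escaped a second time).
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    // Turn each CRLF pair, lone LF, or lone CR into a <br>.
    .replace(/\r\n|\n|\r/g, "<br>");
}
```

Matching \r\n before the single characters keeps a CR/LF pair from turning into two <br> tags.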

I have played around with some regular expressions with only moderate
luck. None of the freely available patterns on RegExLib.com seem to do
what is needed.

Here's the current pattern I'm working with: <[^>]+?>
This pattern strips all the HTML, but does not preserve the cr/lf data.
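
To make the symptom concrete, here is a small sketch of that pattern applied to some sample markup. The sample string and variable names are made up for illustration; the actual output of the editor control may look different:

```javascript
// Hypothetical sample of what a rich-text control might emit.
var sampleHtml = "<div>Hello <b>world</b></div><div>Second line<br>Third line</div>";

// The pattern from this post: "<", one or more non-">" characters, then ">".
var stripTags = /<[^>]+?>/g;

// Every tag is removed, including <br> and </div>, so nothing is left
// to mark where one line ended and the next began.
var stripped = sampleHtml.replace(stripTags, "");
console.log(stripped);
```

The three lines of the sample come out run together as a single line.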

I would really appreciate any assistance that can be offered. If you
know of a pattern that would reliably do the conversion from HTML to
plain-text, and the companion that would do the plain-text to HTML, I
would be very grateful!

Thanks in advance,
Ric Castagna

Oct 13 '05 #1
1 Reply


wrote on 13 Oct 2005 in comp.lang.javascript:
> 2) strip all HTML markup from the data while maintaining
> carriage return/line feed data


http://www.google.com/search?q=strip.html%20regex
and
http://groups.google.com/groups?q=strip.html+regex
will give you ample examples of the needed regex

However, which line breaks do you want to save?
Just the <br> and the <hr>, or also the <div>?

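You could very well detect those tags first, replace them with a placeholder,
and restore them as <br> at the end:
=====================================================
// Mark the tags that count as line breaks with the placeholder *\*
t = t.replace(/(<br>)|(<hr>)|(<\/div>)/gi, "*\\*")

// Strip the remaining tags; put your own stripping pattern here
// (this one is the pattern from the question)
t = t.replace(/<[^>]+?>/g, "")

// Turn each placeholder back into <br>
t = t.replace(/\*\\\*/g, "<br>")
=====================================================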
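
The mark, strip, restore idea above can also be sketched as a single function that produces cr/lf for a textarea instead of <br>. This is only a sketch under assumptions: the name htmlToText is hypothetical, the marker is assumed to be a control character that never occurs in the editor's output, and the tolerance for <br />, <hr /> and </p> is an addition that was not in the snippet above. Entities such as &amp; are not decoded here.

```javascript
// Hypothetical helper following the mark, strip, restore idea.
function htmlToText(html) {
  // Assumption: this control character never occurs in the editor's output.
  var MARK = "\u0001";
  return html
    // 1. Mark the tags that should count as line breaks.
    .replace(/<br\s*\/?>|<hr\s*\/?>|<\/div>|<\/p>/gi, MARK)
    // 2. Strip every remaining tag (the pattern from the original question).
    .replace(/<[^>]+?>/g, "")
    // 3. Turn each marker into a CR/LF pair for the textarea.
    .replace(/\u0001/g, "\r\n");
}
```

Note that a closing </div> at the very end of the input leaves a trailing cr/lf, which you may want to trim before copying the text into the textarea.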
--
Evertjan.
The Netherlands.
(Replace all crosses with dots in my email address)

Oct 13 '05 #2
