
Convert RTF to plain text

Ted
I have a SQL Server 2000 table with a few fields of "text" data type
that contain rich text. I have to downstream this data and the
recipient cannot handle rich text. I need to figure out a way to
convert it back to plain text. Any suggestions?

TIA

Jul 23 '05 #1
It's painful, but you can loop thru every character. Use the ASCII
function to identify and remove any non-alphanumerics (0-9 or a-z or
A-Z or space or period or comma ...).
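For illustration only, here is a minimal Python sketch of that kind of character filter (the thread is about T-SQL; Python is used just to keep the example short). The allowed set and the name `keep_allowed` are assumptions, not something from the post. Note what it does to real RTF: the backslashes and braces go away, but the control-word names themselves are plain letters and stay in the output.

```python
import string

# Characters to keep. The exact set is an assumption based on the
# "0-9 or a-z or A-Z or space or period or comma" list in the post.
ALLOWED = set(string.ascii_letters + string.digits + " .,")

def keep_allowed(text: str) -> str:
    """Drop every character that is not in ALLOWED."""
    return "".join(ch for ch in text if ch in ALLOWED)

sample = r"{\rtf1\ansi {\b Hello}, world.\par}"
print(keep_allowed(sample))  # prints: rtf1ansi b Hello, world.par
```

So a pure character filter leaves fragments such as "rtf1ansi" and "par" mixed into the text.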

Jul 23 '05 #2

"Ted" <te********@yahoo.com> wrote in message
news:11**********************@l41g2000cwc.googlegroups.com...
I have a SQL Server 2000 table with a few fields of "text" data type
that contain rich text. I have to downstream this data and the
recipient cannot handle rich text. I need to figure out a way to
convert it back to plain text. Any suggestions?

TIA


The best idea is probably to export the data to file and convert it
externally - the MSSQL string functions are extremely basic, and writing a
script in something like Perl, Python, C# or whatever will be much more
efficient. With a bit of Googling, you'll probably be able to find something
for your preferred language - there's already a Perl module, for example.

Simon
Jul 23 '05 #3
Ted
agreed. i was trying to avoid a front end process because the client
is going to be pulling data directly from a view in a production
environment. i'll have to throw a little .net app together to do the
conversion i suppose. thanks for all the feedback!!

Jul 23 '05 #4
"louis" <lo************@gmail.com> wrote:
It's painful, but you can loop thru every character. Use the ASCII
function to identify and remove any non-alphanumerics (0-9 or a-z or
A-Z or space or period or comma ...).


I've written software (as a standalone utility, not in the context of
SQL) that goes the other way, but can't offer anything that helps in
this direction. I can offer some advice though.

You need to be a bit more careful than outlined above. In RTF the
backslash "\" and brace characters "{}" are reserved. RTF is
essentially a markup language and the "tags" start with a backslash,
and can contain alphanumerics (typically alphas and then - optionally
- numerics). Braces are used to delimit sections. Some sections
such as those in the header (info and font tables) can be entirely
discarded from the visible output.

Braces and backslashes in the text are escaped - IIRC - with a
backslash. Using this, you could indeed convert most of the RTF
to text.

This approach would recover most text, but some features such as
lists might come out strange, and it wouldn't be formatted nicely,
unless you wanted to honour the \par and \line tags to give line
formatting, but in the context of insertion into a database I guess
that's the most you'd want to do.
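A rough sketch of the approach described above, written in Python as one of the external-script options mentioned earlier in the thread. It is not a full RTF parser. The names `rtf_to_text`, `TOKEN` and `SKIP_DESTINATIONS` are made up for this example, the list of discarded groups is an assumption, and `\'hh` escapes are decoded as Windows-1252, which is also an assumption about the source data. Things like `\uN` Unicode escapes, tables and lists are not handled.

```python
import re

# Groups whose contents never show up in the visible text.
# This list is an assumption; real documents may need more entries.
SKIP_DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info", "pict"}

# One token is: a \'hh hex escape, a control word (\word, optional number,
# optional trailing space), a control symbol (\ plus one character),
# a brace, or a run of ordinary text.
TOKEN = re.compile(
    r"\\'([0-9a-fA-F]{2})|\\([a-zA-Z]+)(-?\d+)? ?|\\(.)|([{}])|([^\\{}]+)",
    re.DOTALL,
)

def rtf_to_text(rtf: str) -> str:
    out = []
    stack = []        # skip state saved at each opening brace
    skipping = False  # True while inside a discarded group
    for hexcode, word, _num, symbol, brace, text in TOKEN.findall(rtf):
        if brace == "{":
            stack.append(skipping)
        elif brace == "}":
            skipping = stack.pop() if stack else False
        elif word in SKIP_DESTINATIONS or symbol == "*":
            # Header tables and {\*\...} groups are dropped entirely.
            skipping = True
        elif skipping:
            continue
        elif word in ("par", "line"):
            out.append("\n")
        elif hexcode:
            # Assumes a Windows-1252 code page for \'hh escapes.
            out.append(bytes.fromhex(hexcode).decode("cp1252", errors="replace"))
        elif symbol and symbol in "\\{}":
            # Escaped backslash or brace: keep the literal character.
            out.append(symbol)
        elif text:
            # Raw line breaks in the RTF source are not part of the text.
            out.append(text.replace("\r", "").replace("\n", ""))
    return "".join(out)

sample = (r"{\rtf1\ansi{\fonttbl{\f0 Arial;}}\f0 Hello, {\b world}.\par "
          r"Braces: \{ \} and \\.\line Done}")
print(rtf_to_text(sample))
```

All other control words (such as `\b` or `\f0`) are simply dropped, so formatting is lost but the text and the line breaks from `\par` and `\line` survive.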

Also, depending on the source of the RTF you may be dealing with easy
to parse snippets, as opposed to a fully-featured document.

I should imagine that programming the above in SQL would be - as you
rightly point out - quite painful.
--
HTML-to-text and markup removal with Detagger
http://www.jafsoft.com/detagger/
Jul 23 '05 #5
