I have a database that has been populated with content pasted out of MS
Word, and is full of special characters -- em dashes, curly quotes, curly
apostrophes, etc. Now I'm generating plain text email summaries out of the
database and of course those special chars appear as garbage chars in the
emails.
How can I filter the extracted text and transform these characters into
plain text equivalents? Is there a builtin function for this, external class
available, or do I need to try and hack it out from scratch?
Thanks,
Scot
On Wed, 24 Nov 2004 19:00:33 GMT, Scot Hacker <sh*****@birdhouse.org> wrote: How can I filter the extracted text and transform these characters into plain text equivalents? Is there a builtin function for this, external class available, or do I need to try and hack it out from scratch?
You might want to check out the wordDocumentHandler class. http://psbweb.mirrors.phpclasses.org...kage/1352.html
On 11/24/04 11:22 AM, in article 8n********************************@4ax.com,
"us****@isotopeREEMOOVEmedia.com" <us****@isotopeREEMOOVEmedia.com> wrote: You might want to check out the wordDocumentHandler class. http://psbweb.mirrors.phpclasses.org...kage/1352.html
Hmm... That sounded promising, but then I found this comment in the header:
// Of course, you need MsWord installed on the server, so Windows OS.
Not an option here. Also, that class seems to want an actual Word doc as
input, and outputs text or html files. All I want to do is examine the
contents of a variable for typical Word special characters and transform
them (IOW I don't need document I/O, just a quick filter to scan text for
the funky chars).
Thanks,
Scot
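A quick filter of the kind described above could be sketched roughly as follows. This is only a minimal sketch, assuming the stored text is single-byte Windows-1252 (the usual encoding of text pasted from Word at the time). The function name word_chars_to_ascii is hypothetical and not from any library, and the choice of ASCII replacements is a matter of taste.

```php
<?php
// Hypothetical helper (not part of PHP or any class mentioned in the thread).
// Assumes $text is single-byte Windows-1252; maps Word's "smart" punctuation
// bytes to plain ASCII equivalents and leaves everything else untouched.
function word_chars_to_ascii(string $text): string
{
    $map = [
        "\x85" => '...',  // horizontal ellipsis
        "\x91" => "'",    // left single curly quote
        "\x92" => "'",    // right single curly quote / curly apostrophe
        "\x93" => '"',    // left double curly quote
        "\x94" => '"',    // right double curly quote
        "\x95" => '*',    // bullet
        "\x96" => '-',    // en dash
        "\x97" => '--',   // em dash
    ];

    // strtr() with an array replaces every key with its value in one pass.
    return strtr($text, $map);
}

// Usage: a string containing Windows-1252 curly quotes, an em dash, an ellipsis.
echo word_chars_to_ascii("\x93It\x92s done\x94 \x97 almost\x85"), "\n";
// prints: "It's done" -- almost...
```

If the database holds a different encoding, the byte values in the map would need to change accordingly.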
Scot Hacker wrote: How can I filter the extracted text and transform these characters into plain text equivalents? Is there a builtin function for this, external class available, or do I need to try and hack it out from scratch?
Have a look at http://www.ph.net/htmlentities
--
Mail sent to my "From:" address is publicly readable at http://www.dodgeit.com/
== ** ## !! !! ## ** ==
TEXT-ONLY mail to the complete "Reply-To:" address ("My Name" <my@address>) may
bypass the spam filter. I will answer all pertinent mails from a valid address.
Pedro Graca wrote: Have a look at http://www.ph.net/htmlentities
Of course I meant http://www.php.net/htmlentities
Example, run on the command line:
php$ php -r 'echo htmlentities("João Graça"), "\n";'
Jo&atilde;o Gra&ccedil;a
php$
.oO(Pedro Graca)
Have a look at http://www.ph.net/htmlentities
The OP wants to print plain text, not HTML.
Micha
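For plain-text output, one option is to replace the characters directly instead of encoding them as entities. The following is a rough sketch, assuming the text is UTF-8 encoded and PHP 7.0 or later is available (for the "\u{...}" escapes). The function name utf8_word_chars_to_ascii is hypothetical, and the replacement choices are only one reasonable set.

```php
<?php
// Hypothetical helper (not a PHP builtin). Assumes $text is valid UTF-8.
// Replaces common Word punctuation with plain ASCII; no HTML entities involved.
function utf8_word_chars_to_ascii(string $text): string
{
    $map = [
        "\u{2018}" => "'",    // left single curly quote
        "\u{2019}" => "'",    // right single curly quote / curly apostrophe
        "\u{201C}" => '"',    // left double curly quote
        "\u{201D}" => '"',    // right double curly quote
        "\u{2013}" => '-',    // en dash
        "\u{2014}" => '--',   // em dash
        "\u{2026}" => '...',  // horizontal ellipsis
        "\u{2022}" => '*',    // bullet
        "\u{00A0}" => ' ',    // non-breaking space
    ];

    // strtr() matches the multi-byte keys as whole byte sequences.
    return strtr($text, $map);
}

// Usage
echo utf8_word_chars_to_ascii("\u{201C}It\u{2019}s done\u{201D} \u{2014} almost\u{2026}"), "\n";
// prints: "It's done" -- almost...
```

Characters outside the map (accented letters, for instance) pass through unchanged, so the email would still need a correct charset header for those.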
Scot Hacker wrote: How can I filter the extracted text and transform these characters into plain text equivalents? Is there a builtin function for this, external class available, or do I need to try and hack it out from scratch?
I've managed to do just that using a call to antiword. http://www.winfield.demon.nl/
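// Quote both paths so unusual filenames cannot break (or inject into) the shell command.
$cmd = "/usr/local/bin/antiword -t " . escapeshellarg($filename) . " > " . escapeshellarg($txt_file);
system($cmd);
// Read the converted text back in one call, then escape it for storage.
$document = mysql_real_escape_string(htmlentities("<pre>" .
file_get_contents($txt_file) . "</pre>"));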
then store $document in the database. Gets a bit ugly at times, but
works :-)
Sacs