turnitup wrote:
Dear all,
I have a problem with a form, and I have tried various permutations of
htmlentities() and html_entity_decode() to resolve it, but without success.
Here is the workflow.
1: User pastes MS Word formatted text into form field.
2: Server uses mail() to send input text to mail client.
3: Recipient pastes text into html file.
The problem is that MS Word contains peculiar characters for things like
bullets, which come out as tabs, which then come out as different, but
spurious, html characters in the html translation.
Does anyone know of a function (or functions) that can clean up MS Word
input into something that can simply be pasted as plain text without
spurious characters?
Turner
From a comment on the PHP documentation for the utf8_decode() function
http://us2.php.net/manual/en/function.utf8-decode.php
peter dot mescalchin at geemail dot com
27-Dec-2005 06:43
Adding to the comment below, I have a few more MS Word characters that
need replacing. I found this was required when "fixing" some phpMyAdmin
export scripts from a live server where MS Word characters were all
through the content, before importing them back into my local MySQL
database.
The code I wrote for this process also does a strpos for any extra
"\xe2\x80" strings, which are the tell-tale sign of any funny
characters I want removed.
Here are my updated arrays:
<?php
// UTF-8 byte sequences of the MS Word characters to replace.
// Double quotes are required so PHP interprets the \x escapes as bytes.
$badchr = array(
"\xe2\x80\xa6", // ellipsis
"\xe2\x80\x93", // en dash
"\xe2\x80\x94", // em dash
"\xe2\x80\x98", // single quote opening
"\xe2\x80\x99", // single quote closing
"\xe2\x80\x9c", // double quote opening
"\xe2\x80\x9d", // double quote closing
"\xe2\x80\xa2" // dot used for bullet points
);
// Plain ASCII replacements, in the same order as $badchr.
$goodchr = array(
'...',
'-',
'-',
'\'',
'\'',
'"',
'"',
'*'
);
?>
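The comment only gives the arrays, so here is a rough sketch of how they might be applied with str_replace(), plus the strpos() check for leftover "\xe2\x80" sequences it mentions. The function names clean_word_chars() and has_leftover_word_chars() are my own, not from the comment, and the sketch assumes the form data arrives as UTF-8. If the page is submitted as Windows-1252, the bytes are different and this will not match anything.

```php
<?php
// Hypothetical helper (name is illustrative): replace common MS Word
// "smart" punctuation, given as UTF-8 byte sequences, with plain ASCII.
function clean_word_chars($text)
{
    $map = array(
        "\xe2\x80\xa6" => '...', // ellipsis
        "\xe2\x80\x93" => '-',   // en dash
        "\xe2\x80\x94" => '-',   // em dash
        "\xe2\x80\x98" => "'",   // single quote opening
        "\xe2\x80\x99" => "'",   // single quote closing
        "\xe2\x80\x9c" => '"',   // double quote opening
        "\xe2\x80\x9d" => '"',   // double quote closing
        "\xe2\x80\xa2" => '*',   // bullet
    );
    return str_replace(array_keys($map), array_values($map), $text);
}

// Hypothetical helper: true if any "\xe2\x80" prefix is still present,
// i.e. a character from that range that the map above does not cover.
function has_leftover_word_chars($text)
{
    return strpos($text, "\xe2\x80") !== false;
}

// Usage: curly quotes, an en dash and an ellipsis pasted from Word.
$input = "\xe2\x80\x9cHello\xe2\x80\x9d \xe2\x80\x93 world\xe2\x80\xa6";
$clean = clean_word_chars($input);
echo $clean, "\n"; // prints: "Hello" - world...
```

Running the cleanup before calling mail() should mean the recipient gets plain ASCII punctuation. The leftover check is only useful for logging or for deciding whether the map needs more entries.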
--
Julien CROUZET - DSI Theoconcept
julien.crouzet@/enlever ca\theoconcept.com
http://www.theoconcept.com