
How to deal with all of those MS Word Funky characters

Put simply, I have a text box, and people commonly cut and paste
information into it from Microsoft Word. The problem is that
Word has all kinds of funky characters (smart quotes, em dashes) that
the system (PHP-based) doesn't understand. Does anyone know of a way to
filter out these Microsoft-specific characters? Does PHP have a special
function for this? Thanks a lot!

Aug 19 '05 #1
Hooray, I can actually be of use to this group for once. Yes: if you look
in the user notes on php.net for the htmlentities function, you will see
an entry from mail at britlinks dot com (19-May-2004 05:27). I've listed
it below for reference. Mind you, I'm sure the hardcore programmers on
this group will be able to formulate a one-line regexp for this, and we
look forward to seeing it.

In the meantime, I hope this helps.
<?php
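// Strips slashes and converts special characters in $var to their HTML
// entity equivalents, including the Windows-1252 bytes 128-159 that Word
// produces. Assumes $var is a single-byte (Windows-1252) string, not UTF-8.
function htmlfriendly($var, $nl2br = false)
{
    // Windows-1252 byte value => HTML entity
    $chars = array(
        128 => '&euro;',
        130 => '&sbquo;',
        131 => '&fnof;',
        132 => '&bdquo;',
        133 => '&hellip;',
        134 => '&dagger;',
        135 => '&Dagger;',
        136 => '&circ;',
        137 => '&permil;',
        138 => '&Scaron;',
        139 => '&lsaquo;',
        140 => '&OElig;',
        142 => '&#381;',
        145 => '&lsquo;',
        146 => '&rsquo;',
        147 => '&ldquo;',
        148 => '&rdquo;',
        149 => '&bull;',
        150 => '&ndash;',
        151 => '&mdash;',
        152 => '&tilde;',
        153 => '&trade;',
        154 => '&scaron;',
        155 => '&rsaquo;',
        156 => '&oelig;',
        158 => '&#382;',
        159 => '&Yuml;',
    );

    // Encode as ISO-8859-1 first: that leaves bytes 128-159 untouched, so
    // they can then be swapped for the entities above. The charset is passed
    // explicitly because the default has been UTF-8 since PHP 5.4.
    $var = htmlentities(stripslashes($var), ENT_COMPAT, 'ISO-8859-1');
    $var = str_replace(array_map('chr', array_keys($chars)), $chars, $var);

    return $nl2br ? nl2br($var) : $var;
}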
?>
Aug 19 '05 #2
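
For anyone who would rather filter the Word characters out than turn them into HTML entities, here is a rough sketch along the lines of the "one-line regexp" mentioned above. It is not part of the php.net note: the function name word_to_ascii and the choice of ASCII stand-ins are my own assumptions, and it only makes sense if the submitted text really is single-byte Windows-1252. On UTF-8 input it would mangle multibyte characters.

```php
<?php
// Hypothetical helper (not from the thread or the php.net note).
// Assumes $text is a single-byte Windows-1252 string, NOT UTF-8.
function word_to_ascii($text)
{
    // Swap the most common Word characters for plain ASCII stand-ins.
    $map = array(
        chr(133) => '...', // ellipsis
        chr(145) => "'",   // left single quote
        chr(146) => "'",   // right single quote
        chr(147) => '"',   // left double quote
        chr(148) => '"',   // right double quote
        chr(149) => '*',   // bullet
        chr(150) => '-',   // en dash
        chr(151) => '--',  // em dash
    );
    $text = strtr($text, $map);

    // The one-line regexp: drop whatever is left in the 0x80-0x9F byte range.
    return preg_replace('/[\x80-\x9F]/', '', $text);
}

// Simulated paste from Word: curly quotes, an em dash and a trademark sign.
$pasted = chr(147) . 'It' . chr(146) . 's done' . chr(148) . chr(151) . 'ok' . chr(153);
echo word_to_ascii($pasted), "\n";
```

The strtr() call does the readable substitutions in a single pass, and the preg_replace() afterwards is a blunt fallback that simply deletes any remaining byte in that range, so characters such as the euro sign are lost rather than converted.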
