Treating text copied from MS Word

I've built a MySQL database for a client and a web interface to be able to
add/edit/delete records in it. When he's adding stuff to the database he's
copying text from MS Word. I've tried various substitutions that I've found
hanging around the internet, but nothing's working for the "long dash" that
it insists on converting normal hyphens to.

This morning I did a bin2hex to see exactly what was being sent from $_POST:

A – long dash -.

41 20 >>>e2 80 93<<< 20 6c 6f 6e 67 20 64 61 73 68 20 2d 2e 20 20

The offending character is the one I've highlighted. As far as I can tell,
it should be getting found by this -

"\\xe2\\x80\\x93", // long dash

but it isn't, which makes me think there's something wrong with the code
I've copied. How to find the hex string? I've tried "\xe2\x80\x93" and
"\xe2x80x93" in addition, but to no avail.

Is driving me scatty!!!

Any help much appreciated.

$search = array(
    chr(145),
    chr(146),
    chr(147),
    chr(148),
    chr(151),
    chr(196),
    '?o', // left side double smart quote
    '?',  // right side double smart quote
    '?~', // left side single smart quote
    '?T', // right side single smart quote
    '?',  // ellipsis
    '?"', // em dash
    '?"', // en dash
    "\\xe2\\x80\\xa6", // ellipsis
    "\\xe2\\x80\\x93", // long dash
    "\\xe2\\x80\\x94", // long dash
    "\\xe2\\x80\\x9c", // double quote opening
    "\\xe2\\x80\\x9d", // double quote closing
    "\\xe2\\x80\\xa2"  // dot used for bullet points
);
$replace = array(
    "'",
    "'",
    '"',
    '"',
    '-',
    '-',
    '"',
    '"',
    "'",
    "'",
    "&hellip;",
    "-",
    "-",
    '&hellip;',
    '-',
    '-',
    '"',
    '"',
    '*'
);
echo '<p>' . bin2hex($_POST['short_desc']) . '</p>';
$short_desc = str_replace($search, $replace, $_POST['short_desc']);

+mrcakey
Jul 9 '08 #1
On Jul 9, 12:03 pm, "+mrcakey" <webmas...@listyblue.com> wrote:
<snip>

Not really a PHP question - configure your webserver to use a 7 bit
charset.

C.
Jul 10 '08 #2
I V
On Wed, 09 Jul 2008 12:03:57 +0100, +mrcakey wrote:
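> The offending character is the one I've highlighted. As far as I can
> tell, it should be getting found by this -
>
> "\\xe2\\x80\\x93", // long dash

You want to use one backslash here, not two. Inside a double-quoted string,
"\\x" is an escaped backslash followed by a literal "x", so your pattern is
the twelve characters \xe2\x80\x93 rather than the three bytes e2 80 93.
But rather than specifying the search-and-replace yourself, it's probably
easier to use htmlentities. You need to know what encoding your data has
been sent in (it looks, from your post, like you're receiving UTF-8), and
then pass that encoding as the third argument, like so: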

$short_desc = htmlentities($_POST['short_desc'], ENT_COMPAT, 'UTF-8');
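
To make the one-backslash point concrete, here is a minimal sketch. It assumes the submitted text really is UTF-8, as the bin2hex output in the question suggests. The names $input, $wrong and $right are made up for the illustration and are not from the original code.

```php
<?php
// Sample input built from raw bytes: "A", space, en dash (e2 80 93),
// space, "long dash". This mirrors the start of the hex dump in the question.
$input = "A \xe2\x80\x93 long dash";

// Two backslashes: "\\" is an escaped backslash, so this string holds the
// twelve literal characters \xe2\x80\x93 and not the three bytes e2 80 93.
$wrong = "\\xe2\\x80\\x93";

// One backslash: each \xNN escape in a double-quoted string is a single byte.
$right = "\xe2\x80\x93";

echo strlen($wrong), "\n";   // prints 12
echo bin2hex($right), "\n";  // prints e28093

// The double-backslash needle never occurs in the input, so nothing changes.
echo str_replace($wrong, '-', $input) === $input ? "no match\n" : "matched\n"; // prints "no match"

// The single-backslash needle matches the submitted bytes.
echo str_replace($right, '-', $input), "\n"; // prints "A - long dash"

// Alternative from the reply above: let htmlentities() do the mapping.
echo htmlentities($input, ENT_COMPAT, 'UTF-8'), "\n"; // prints "A &ndash; long dash"
```

The same difference applies to every "\\xe2..." entry in the $search array, so each of those would need the single-backslash form to match anything.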
Jul 10 '08 #3
On Jul 10, 5:07 pm, "C. (http://symcbean.blogspot.com/)"
<colin.mckin...@gmail.com> wrote:
<snip>

> Not really a PHP question - configure your webserver to use a 7 bit
> charset.
>
> C.

Sorry - bum steer. Apparently MSIE is (once again) completely broken
in this regard. There is a hack though - see
http://www.crazysquirrel.com/computi...-encoding.jspx

C.
Jul 13 '08 #4
