How to upload form data containing special characters correctly?

Hello,

I was wondering if there are a few good pages and/or examples on how to
process form data correctly for putting it in a MySQL DB.

Since I'm not used to using PHP a lot, I already found out that
addslashes() can be used to escape some characters, but I'm having some
more problems with, for instance, ä, å and µ (since the text is scientific).
Now some people also throw in htmlspecialchars() to convert those to
HTML entities, but some nest htmlspecialchars() in addslashes() and
others do the opposite.

Is there a good and error proof way of ensuring that what one puts in a
textarea gets stored and can be retrieved safe and sound?

Thanks in advance,

Wimmy

--
Being owned by someone used to be called slavery.
Now it's called commitment.
Sep 4 '06 #1
Try this:

- $input_string = 'some text with special characters';
- $encoded = base64_encode($input_string);
- write $encoded to the database,
- read it back from the database,
- $output_string = base64_decode($encoded);

Hope it helps.

Sep 4 '06 #2
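The base64 round trip suggested in #2 can be sketched as below. This is only an illustration: the sample text is made up and the database step is left as a comment. Keep in mind that base64 output is about a third larger than the input, and that the stored value can no longer be searched or sorted as text on the SQL side.

```php
<?php
// Round-trip a string through base64, as suggested in #2.
$input_string = "5 \u{00B5}m, \u{00E4} and \u{00E5}";   // "5 µm, ä and å" as UTF-8

$encoded = base64_encode($input_string);   // plain ASCII: A-Z, a-z, 0-9, +, /, =
// ... INSERT $encoded into the table, later SELECT it back ...
$output_string = base64_decode($encoded, true);   // strict mode: false on invalid input

var_dump($output_string === $input_string);   // bool(true)
```

The second argument to base64_decode() makes it return false instead of silently skipping characters outside the base64 alphabet, which is useful if the column may have been edited by hand.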
"Wim Cossement" <wc******@nospa m.bcol.bewrote in message
news:ed**********@snic.vub.ac.be...
> Is there a good and error proof way of ensuring that what one puts in a
> textarea gets stored and can be retrieved safe and sound?

Use Unicode for everything. Set your database to utf-8 encoding, save the
pages as utf-8, and tell the browsers in every way imaginable that you
are providing the content as utf-8. Not exactly an easy process, but I
recommend you try it.

--
"Ohjelmoija on organismi joka muuttaa kofeiinia koodiksi" - lpk
http://outolempi.net/ahdistus/ - Satunnaisesti päivittyvä nettisarjis
sp**@outolempi.net || Gedoon-S @ IRCnet || rot13(xv***@bhgbyrzcv.arg)
Sep 4 '06 #3
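On the PHP side, "telling the browser" from #3 usually comes down to sending the charset in the Content-Type header (and, in the form itself, an accept-charset="utf-8" attribute). A minimal sketch follows; is_valid_utf8() is a hypothetical helper name, not a built-in, and checking incoming text like this is an extra precaution that nobody in the thread asked for.

```php
<?php
// Hypothetical helper: true when $text is well-formed UTF-8.
// With the /u modifier PCRE validates the subject string before matching.
function is_valid_utf8(string $text): bool
{
    return preg_match('//u', $text) === 1;
}

// Declare the encoding in the HTTP header; this must happen before any output.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

$ok  = is_valid_utf8("\u{00E4} \u{00E5} \u{00B5}");   // ä å µ encoded as UTF-8
$bad = is_valid_utf8("\xE4 \xE5 \xB5");               // the same letters as Latin-1 bytes

var_dump($ok, $bad);   // bool(true) bool(false)
```

If the check fails for real form input, the page or the form was most likely served in a different encoding than the script assumes.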
Wim Cossement wrote:
> Is there a good and error proof way of ensuring that what one puts in a
> textarea gets stored and can be retrieved safe and sound?

You'll need to select the correct character set for MySQL. It might be
utf-8, as some have suggested, but you might find another charset more
applicable. See the MySQL docs and the comp.databases.mysql newsgroup for
more info on MySQL topics.

Also, rather than use addslashes() you should use
mysql_real_escape_string() to escape your characters.

You shouldn't use htmlspecialchars() for storing data into the database;
that's a display issue, not a storage issue. You should only use it
when displaying data (if necessary).

And also ensure you're using the correct character set on your HTML page
to display the data.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Sep 4 '06 #4
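The split described in #4 (escape for the database when storing, escape for HTML only when displaying) might look like the sketch below. It swaps in mysqli with a bound parameter in place of mysql_real_escape_string(), because the old mysql_* functions were removed in PHP 7. The table name "abstracts", the column "body", the function names and the utf8mb4 charset are all assumptions made for the illustration, not something taken from the thread.

```php
<?php
// Storage side: bind the raw text as a parameter; no addslashes(), no entities.
// "abstracts" and "body" are hypothetical table and column names.
function store_abstract(mysqli $db, string $text): bool
{
    $db->set_charset('utf8mb4');   // connection charset should match the data
    $stmt = $db->prepare('INSERT INTO abstracts (body) VALUES (?)');
    if ($stmt === false) {
        return false;
    }
    $stmt->bind_param('s', $text);
    return $stmt->execute();
}

// Display side: escape only at the moment the text goes into an HTML page.
function render_abstract(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

echo render_abstract("<b>5 \u{00B5}m</b> & \"more\""), "\n";
// &lt;b&gt;5 µm&lt;/b&gt; &amp; &quot;more&quot;
```

Because the text is stored unmodified, other clients (for example a report tool reading over ODBC) see the original characters, not HTML entities or stray backslashes.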
Jerry Stuckle wrote:
> You'll need to select the correct character set for MySQL. It might be
> utf-8, as some have suggested, but you might find another charset more
> applicable. See the MySQL doc and comp.databases.mysql newsgroup for
> more info on mysql topics.

Well, I've been hearing for a while that UTF-8 is the best for all that
stuff, so tables and DBs are all in utf8_general_ci (does anyone know
the difference between that and utf8_bin, and what utf8_unicode_ci is
doing in that list?)

> Also, rather than use addslashes() you should use
> mysql_real_escape_string() to escape your characters.

Some like the other better, there are still discussions going on... :-)
http://www.sitepoint.com/forums/showthread.php?t=337881

> You shouldn't use htmlspecialchars() for storing data into the database;
> that's a display issue, not a storage issue. You should only use it
> when displaying data (if necessary).

The fact is that the data does not really need to be displayed in a
webpage, this is just for uploading. I'd rather use OpenOffice with
MyODBC to edit the data when needed and use a report to display it.

> And also ensure you're using the correct character set on your html page
> to display the data.

I guess this is the case.
The header contains <meta http-equiv="content-type"
content="application/xhtml+xml; charset=utf-8" />

Now I'm going to try this and I'll let you know the outcome.

Thanks a bunch,

Wimmy
Sep 4 '06 #5
On Mon, 04 Sep 2006 11:24:04 +0200, Wim Cossement <wc******@nospam.bcol.be> wrote:
> Is there a good and error proof way of ensuring that what one puts in a
> textarea gets stored and can be retrieved safe and sound?


I found these user comments in the PHP manual under htmlspecialchars();
I think they might help.

Also, if you need to save special characters I suggest turning off magic
quotes with set_magic_quotes_runtime(0); that suppresses the backslashes
PHP normally adds.

After inspecting the non-native encoding problem, I noticed that, for example, if the encoding is
Cyrillic and I write Latin characters that are not part of the encoding (æ for example -
ae-ligature), the browser will send the real entity, such as &aelig; in this case.
Therefore, the only way I see to display multilingual text that is encoded with entities is by:
<?php
echo str_replace('&amp;', '&', htmlspecialchars($txt));
?>
The regex for numeric entities will skip the Latin-1 textual entities.



A sample function, if anybody wants to turn HTML entities (and special characters) back to plain characters
(e.g. "&egrave;", "&lt;" etc.):
function html2specialchars($str){
    $trans_table = array_flip(get_html_translation_table(HTML_ENTITIES));
    return strtr($str, $trans_table);
}


Quite often, on HTML pages that are not encoded as UTF-8, when people write in a non-native encoding,
some browsers (Internet Explorer for sure) will send the characters from the other charset as HTML entities,
such as &#1073; for the small Russian 'б'.
htmlspecialchars() will break this entity, since it changes every & to &amp;
What I usually do is either turn &amp; back to & so the correct characters appear in the
output, or use a regex to turn the escaped numeric entities back into their original form:
<?php
// treat this as pseudo-code, it hasn't been tested...
$result = preg_replace('/&amp;#(x[a-f0-9]+|[0-9]+);/i', '&#$1;', $source);
?>

Why &#039;? The HTML and XML DTDs proposed &apos; for this.
See http://www.w3.org/TR/html/dtds.html#...ial_characters
So better use this:
$text = htmlspecialchars($text, ENT_QUOTES);
$text = preg_replace('/&#0*39;/', '&apos;', $text);

Sep 4 '06 #6
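Two of the workarounds quoted in #6 have built-in counterparts in PHP 5.2.3 and later, so a shorter variant is possible. The sketch below assumes UTF-8 throughout and uses made-up sample strings. The fourth argument of htmlspecialchars() (double_encode) leaves entities that are already present untouched, and html_entity_decode() does the job of the html2specialchars() helper.

```php
<?php
// Text as a browser might submit it: a numeric entity plus raw markup.
$from_browser = '&#1073; & <b>';

// Fourth argument false: do not re-encode entities that are already there.
$safe = htmlspecialchars($from_browser, ENT_QUOTES, 'UTF-8', false);
echo $safe, "\n";    // &#1073; &amp; &lt;b&gt;

// Built-in counterpart of the html2specialchars() helper quoted in #6.
$plain = html_entity_decode('&egrave; &lt; &#1073;', ENT_QUOTES, 'UTF-8');
echo $plain, "\n";   // è < б
```

Passing 'UTF-8' explicitly matters on older PHP versions, where the default charset for these functions was ISO-8859-1.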
Wim Cossement wrote:
> Jerry Stuckle wrote:
>> You'll need to select the correct character set for MySQL. It might
>> be utf-8, as some have suggested, but you might find another charset
>> more applicable. See the MySQL doc and comp.databases.mysql newsgroup
>> for more info on mysql topics.
>
> Well, I've been hearing for a while UTF-8 is the best for all that
> stuff, so tables and DB's are all in utf8_general_ci (does anyone know
> the difference between that and utf8_bin, and what's utf8_unicode_ci
> doing in that list)

Those are some people's opinions. And remember, they are opinions. Some
people know what they're talking about, and some don't. Take anything
you get on the internet (including this) with a grain of salt.

Personally, I use the character set which matches my data. This may or
may not be utf-8.

>> Also, rather than use addslashes() you should use
>> mysql_real_escape_string() to escape your characters.
>
> Some like the other better, there are still discussions going on... :-)
> http://www.sitepoint.com/forums/showthread.php?t=337881

Not much discussion. addslashes() is a PHP construct which escapes
certain characters. mysql_real_escape_string() is a MySQL function to
escape the characters necessary to place the data in a MySQL database
using the current charset.

mysql_real_escape_string() needs no special processing when reading the
data out - the data is exactly as it was before mysql_real_escape_string()
was called. That is not the case for addslashes().

>> You shouldn't use htmlspecialchars() for storing data into the
>> database; that's a display issue, not a storage issue. You should
>> only use it when displaying data (if necessary).
>
> The fact is that the data does not really need to be displayed in a
> webpage, this is just for uploading. I'd rather use OpenOffice with
> MyODBC to edit the data when needed and use a report to display it.

That's fine. So don't use htmlspecialchars() at all then.

>> And also ensure you're using the correct character set on your html
>> page to display the data.
>
> I guess this is the case.
> The header contains <meta http-equiv="content-type"
> content="application/xhtml+xml; charset=utf-8" />
>
> Now I'm going to try this and I'll let you know the outcome.
>
> Thanks a bunch,
>
> Wimmy

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Sep 4 '06 #7
Hi again,

I must say I've tried all the suggested options but I still can't do a
proper upload.

There is one textarea where users must put in text about their subject
(more or less 2 formatted pages in a PDF/DOC document), so most (not to
say all) of them cut 'n' paste it from Acrobat/Word/OpenOffice into their
browser.

Most of them contain double quotes that are not escaped by addslashes() or
htmlspecialchars(). I've copied a few myself: "bla" "bla" "bla"

If I add an entry by hand in phpMyAdmin, for instance, and one field
contains these characters, they are stored and displayed OK.
When I store the resulting page and look at it in vi, those quoted bla's
are displayed as â~@~\blaâ~@~]

How do I get rid of those, since Thunderbird wants to convert the
message to UTF-8?

Is there a way to limit or convert the encoding used in a textarea?
Or is this more HTML related?

Regards,

Wimmy
Sep 5 '06 #8
Wim Cossement wrote:
> How do I get rid of those, since Thunderbird wants to convert the
> message to UTF-8?
>
> Is there a way to limit or convert the encoding used in a textarea?
> Or is this more HTML related?

Well, what Thunderbird does is completely client side and has nothing to
do with PHP. What charset do you have defined for the page?

And if they are cutting and pasting from a Word document or a PDF,
chances are the document itself has the special characters. For
instance, Word can use different characters for left and right double
quotes, depending on the version and release.

Nothing in PHP or MySQL would handle such characters; you'll have to
handle them yourself, e.g. with str_replace().

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Sep 5 '06 #9
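The str_replace() approach from #9 could look like the sketch below, assuming the submitted text arrives as UTF-8 and that plain ASCII quotes are really what is wanted. normalize_quotes() is a hypothetical helper name and the character list is illustrative, not exhaustive.

```php
<?php
// Map the typographic quotes that word processors insert to plain ASCII quotes.
function normalize_quotes(string $text): string
{
    $map = [
        "\u{201C}" => '"',   // left double quotation mark
        "\u{201D}" => '"',   // right double quotation mark
        "\u{201E}" => '"',   // double low-9 quotation mark
        "\u{2018}" => "'",   // left single quotation mark
        "\u{2019}" => "'",   // right single quotation mark
    ];
    return str_replace(array_keys($map), array_values($map), $text);
}

echo normalize_quotes("\u{201C}bla\u{201D}"), "\n";   // "bla"
```

If the page, the connection and the table all use UTF-8, these characters can also be stored as they are; the replacement only matters when the target really needs ASCII.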
Jerry Stuckle wrote:
> Well, what Thunderbird does is completely client side and has nothing to
> do with PHP. What charset do you have defined for the page?

The header contains <meta http-equiv="content-type"
content="application/xhtml+xml; charset=utf-8" />, so it should be UTF-8.

> And if they are cutting and pasting from a Word document or a PDF,
> chances are the document itself has the special characters. For
> instance, Word can use different characters for left and right double
> quotes, depending on the version and releases.

Well, when I save the text with those weird things in a text file with
UTF-8 encoding they are still there when I open it, so it must be a
character that exists in this character set.
But how do I determine which one it is specifically?

I've put an example here in case someone knows how to do it:
http://ultr23.vub.ac.be/~wcosseme/someFile.txt

> Nothing in PHP or MySQL would handle such characters; you'll have to
> handle them yourself, i.e. with str_replace().

Then I might be able to replace it, who knows...

Many cheers to the one that can do it!

Wimmy
Sep 5 '06 #10
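One way to find out which character is hiding in the text, as asked in #10, is to list the bytes of every non-ASCII character. The sketch below assumes the input is UTF-8; non_ascii_chars() is a hypothetical helper name. Vim shows the bytes 0x80 to 0x9F as ~@ to ~_, so the â~@~\ and â~@~] seen in vi would be consistent with the byte sequences e2 80 9c and e2 80 9d, which are the UTF-8 encodings of U+201C and U+201D (left and right double quotation marks).

```php
<?php
// List each non-ASCII character of a UTF-8 string with its bytes in hex.
function non_ascii_chars(string $text): array
{
    // With the /u modifier an empty pattern splits between characters, not bytes.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return [];   // input was not valid UTF-8
    }
    $found = [];
    foreach ($chars as $ch) {
        if (strlen($ch) > 1) {   // more than one byte means non-ASCII
            $found[$ch] = bin2hex($ch);
        }
    }
    return $found;
}

print_r(non_ascii_chars("\u{201C}bla\u{201D}"));
```

For the sample string this lists the two quotation marks with the hex values e2809c and e2809d, which can then be looked up in a Unicode table or fed into a replacement list.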
