How to upload form data containing special characters correctly?

Hello,

I was wondering if there are a few good pages and/or examples on how to
process form data correctly for putting it in a MySQL DB.

Since I'm not used to using PHP a lot, I have only found out so far that
addslashes() can be used to escape some characters, but I'm having some
more problems with for instance , and (since the text is scientific).
Now some people also throw in htmlspecialchars() to convert those to
HTML entities, but some nest htmlspecialchars() in addslashes() and
others do the opposite.

Is there a good and error proof way of ensuring that what one puts in a
textarea gets stored and can be retrieved safe and sound?

Thanks in advance,

Wimmy

--
Being owned by someone used to be called slavery.
Now it's called commitment.
Sep 4 '06 #1
Try this:

- $input_string = 'some text with special characters';
- $input_string = base64_encode($input_string);
- write $input_string to the database,
- read it back from the database into $output_string,
- $output_string = base64_decode($output_string);

Hope it helps.
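
A minimal sketch of that round trip, with the database write and read left out. The `$stored` variable only stands in for the column value and is an assumption for illustration. Keep in mind that Base64 makes the stored text about a third larger and useless for LIKE searches, so it sidesteps the encoding question rather than solving it.

```php
<?php
// Text with non-ASCII characters, as it might arrive from a textarea.
$input_string = "5 \u{00B5}m \u{00B1} 0.2, \u{201C}quoted\u{201D}";

// Encode before writing: the result contains only A-Z, a-z, 0-9, +, / and =.
$stored = base64_encode($input_string);

// ... write $stored to the database, read it back later ...

// Decode after reading; strict mode (second argument) returns false on invalid input.
$output_string = base64_decode($stored, true);

var_dump($output_string === $input_string); // bool(true)
```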

Sep 4 '06 #2
Use Unicode for everything. Set your database to utf-8 encoding, save the
pages as utf-8, and tell the browsers in every way imaginable that you
are serving the content as utf-8. It is not exactly an easy process, but I
recommend you try it.
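
A rough sketch of what "telling everyone it is utf-8" could look like on the PHP side. The helper name `is_valid_utf8()` is made up for this example, and the commented-out calls assume the mysqli extension and a hypothetical `$db` connection; treat it as an illustration under those assumptions, not a complete recipe.

```php
<?php
// Returns true when $text is well-formed UTF-8.
// With the /u modifier PCRE validates the subject string; on invalid
// UTF-8 preg_match() returns false instead of 1.
function is_valid_utf8(string $text): bool
{
    return preg_match('//u', $text) === 1;
}

// In a web request you would also declare the encoding explicitly:
//   header('Content-Type: text/html; charset=utf-8');
// and, with mysqli, set the connection charset right after connecting:
//   $db->set_charset('utf8mb4');   // 'utf8' on servers older than MySQL 5.5.3

var_dump(is_valid_utf8("caf\xC3\xA9")); // the UTF-8 bytes for "café"
var_dump(is_valid_utf8("caf\xE9"));     // a lone Latin-1 byte, not valid UTF-8
```

Rejecting (or converting) invalid input at the door means the database only ever sees one encoding.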

--
"A programmer is an organism that turns caffeine into code" - lpk
http://outolempi.net/ahdistus/ - An occasionally updated webcomic
sp**@outolempi.net || Gedoon-S @ IRCnet || rot13(xv***@bhgbyrzcv.arg)
Sep 4 '06 #3
You'll need to select the correct character set for MySQL. It might be
utf-8, as some have suggested, but you might find another charset more
applicable. See the MySQL doc and the comp.databases.mysql newsgroup for
more info on MySQL topics.

Also, rather than use addslashes() you should use
mysql_real_escape_string() to escape your characters.

You shouldn't use htmlspecialchars() for storing data into the database;
that's a display issue, not a storage issue. You should only use it
when displaying data (if necessary).

And also ensure you're using the correct character set on your html page
to display the data.
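
To make the storage/display split concrete, here is a hedged sketch. mysql_real_escape_string() belongs to the old mysql extension, which was removed in PHP 7, so this version swaps in a mysqli prepared statement, where a bound parameter does the escaping job. The table and column names (`submissions`, `body`) and both function names are hypothetical.

```php
<?php
// Storage: hand the raw text to the driver through a bound parameter.
// `submissions` and `body` are hypothetical table/column names.
function store_text(mysqli $db, string $text): bool
{
    $stmt = $db->prepare('INSERT INTO submissions (body) VALUES (?)');
    if ($stmt === false) {
        return false;
    }
    $stmt->bind_param('s', $text);
    return $stmt->execute();
}

// Display: escape only at the moment the text goes into an HTML page.
function html_out(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

echo html_out('a < b & "c"'), "\n"; // a &lt; b &amp; &quot;c&quot;
```

The text in the database stays exactly as the user typed it; only the HTML view is escaped.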

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Sep 4 '06 #4
Jerry Stuckle wrote:
> You'll need to select the correct character set for MySQL. It might be
> utf-8, as some have suggested, but you might find another charset more
> applicable. See the MySQL doc and comp.databases.mysql newsgroup for
> more info on mysql topics.

Well, I've been hearing for a while that UTF-8 is the best for all that
stuff, so the tables and DBs are all in utf8_general_ci (does anyone know
the difference between that and utf8_bin, and what utf8_unicode_ci is
doing in that list?)

> Also, rather than use addslashes() you should use
> mysql_real_escape_string() to escape your characters.

Some like the other better, there are still discussions going on... :-)
http://www.sitepoint.com/forums/showthread.php?t=337881

> You shouldn't use htmlspecialchars() for storing data into the database;
> that's a display issue, not a storage issue. You should only use it
> when displaying data (if necessary).

The fact is that the data does not really need to be displayed in a
webpage, this is just for uploading. I'd rather use OpenOffice with
MyODBC to edit the data when needed and use a report to display it.

> And also ensure you're using the correct character set on your html page
> to display the data.

I guess this is the case.
The header contains <meta http-equiv="content-type"
content="application/xhtml+xml; charset=utf-8" />

Now I'm going to try this and I'll let you know the outcome.

Thanks a bunch,

Wimmy
Sep 4 '06 #5
On Mon, 04 Sep 2006 11:24:04 +0200, Wim Cossement <wc******@nospam.bcol.be> wrote:
> Is there a good and error proof way of ensuring that what one puts in a
> textarea gets stored and can be retrieved safe and sound?

I found some user comments in the PHP manual under htmlspecialchars()
that I think might help.

Also, if you need to save special characters I suggest turning off magic quotes with
set_magic_quotes_runtime(0); that suppresses the backslashes PHP normally adds.

After inspecting the non-native encoding problem, I noticed that if, for example, the encoding is
Cyrillic and I write Latin characters that are not part of the encoding (for example, the
ae-ligature), the browser will send the real entity, such as &aelig; in this case.
Therefore, the only way I see to display multilingual text that is encoded with entities is:
<?php
echo str_replace('&amp;', '&', htmlspecialchars($txt));
?>
The regex for numeric entities will skip the Latin-1 textual entities.



A sample function, if anybody wants to turn HTML entities (and special characters) back into plain characters
(e.g. "&egrave;", "&lt;" etc.):
<?php
// Decode named HTML entities by inverting PHP's own translation table.
function html2specialchars($str){
    $trans_table = array_flip(get_html_translation_table(HTML_ENTITIES));
    return strtr($str, $trans_table);
}
?>


Quite often, on HTML pages that are not encoded as UTF-8, when people write in a non-native encoding,
some browsers (IExplorer for sure) will send the characters from the other charset as HTML entities,
such as &#1073; for the small Russian 'b'.
htmlspecialchars() will mangle this entity, since it changes every & to &amp;.
What I usually do is either turn &amp; back to & so the correct characters will appear in the
output, or use a regex to turn the escaped numeric entities back into their original form:
<?php
// treat this as pseudo-code, it hasn't been tested...
$result = preg_replace('/&amp;#(x[a-f0-9]+|[0-9]+);/i', '&#$1;', $source);
?>

Why &#039;? The HTML and XML DTDs proposed &apos; for this.
See http://www.w3.org/TR/html/dtds.html#...ial_characters
So better use this:
<?php
$text = htmlspecialchars($text, ENT_QUOTES);
$text = preg_replace('/&#0*39;/', '&apos;', $text);
?>

Sep 4 '06 #6
Wim Cossement wrote:
> Jerry Stuckle wrote:
>> You'll need to select the correct character set for MySQL. It might
>> be utf-8, as some have suggested, but you might find another charset
>> more applicable. See the MySQL doc and comp.databases.mysql newsgroup
>> for more info on mysql topics.
>
> Well, I've been hearing for a while that UTF-8 is the best for all that
> stuff, so the tables and DBs are all in utf8_general_ci (does anyone know
> the difference between that and utf8_bin, and what utf8_unicode_ci is
> doing in that list?)

That's some people's opinion. And remember, they are opinions. Some
people know what they're talking about, and some don't. Take anything
you get on the internet (including this) with a grain of salt.

Personally, I use the character set which matches my data. This may or
may not be utf-8.

>> Also, rather than use addslashes() you should use
>> mysql_real_escape_string() to escape your characters.
>
> Some like the other better, there are still discussions going on... :-)
> http://www.sitepoint.com/forums/showthread.php?t=337881

Not much discussion. addslashes() is a PHP construct which escapes
certain characters. mysql_real_escape_string() is a MySQL function to
escape the characters necessary to place the data in a MySQL database
using the current charset.

mysql_real_escape_string() needs no special processing when reading the
data out - the data is exactly as it was before mysql_real_escape_string()
was called. That is not the case for addslashes().

>> You shouldn't use htmlspecialchars() for storing data into the
>> database; that's a display issue, not a storage issue. You should
>> only use it when displaying data (if necessary).
>
> The fact is that the data does not really need to be displayed in a
> webpage, this is just for uploading. I'd rather use OpenOffice with
> MyODBC to edit the data when needed and use a report to display it.

That's fine. So don't use htmlspecialchars() at all then.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Sep 4 '06 #7
Hi again,

I must say I've tried all the suggested options but I still can't do a
proper upload.

There is one textarea where users must put in text about their subject
(more or less 2 formatted pages in a PDF/DOC document), so most (not to
say all) of them cut 'n' paste it from Acrobat/Word/OpenOffice into their
browser.

Most of them contain double quotes that are not escaped by addslashes() or
htmlspecialchars(); I've copied a few myself: "bla" "bla" "bla"

If I add an entry by hand in phpMyAdmin for instance and one field
contains these characters they are stored and displayed OK.
When I store the resulting page and look at it in vi those quoted bla's
are displayed as ~@~\bla~@~]

How do I get rid of those, since Thunderbird wants to convert the
message to UTF-8?

Is there a way to limit or convert the encoding used in a textarea?
Or is this more HTML related?

Regards,

Wimmy
Sep 5 '06 #8
Well, what Thunderbird does is completely client side and has nothing to
do with PHP. What charset do you have defined for the page?

And if they are cutting and pasting from a Word document or a PDF,
chances are the document itself contains the special characters. For
instance, Word can use different characters for left and right double
quotes, depending on the version and release.

Nothing in PHP or MySQL would handle such characters; you'll have to
handle them yourself, e.g. with str_replace().
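
A small sketch of the str_replace() approach, assuming the input is UTF-8 and that the troublesome characters are the usual typographic quotes (U+201C, U+201D, U+2018, U+2019). The function name `normalize_quotes()` is made up for this example, and the list would need extending for dashes, ellipses and so on.

```php
<?php
// Map common "smart" punctuation to plain ASCII. Input is assumed to be UTF-8.
function normalize_quotes(string $text): string
{
    return str_replace(
        ["\u{201C}", "\u{201D}", "\u{2018}", "\u{2019}"],
        ['"',        '"',        "'",        "'"],
        $text
    );
}

echo normalize_quotes("\u{201C}bla\u{201D}"), "\n"; // "bla"
```

str_replace() works on bytes, which is safe here because each search string is a complete UTF-8 sequence.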

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Sep 5 '06 #9
Jerry Stuckle wrote:
> Well, what Thunderbird does is completely client side and has nothing to
> do with PHP. What charset do you have defined for the page?

The header contains <meta http-equiv="content-type"
content="application/xhtml+xml; charset=utf-8" />, so it should be UTF-8.

> And if they are cutting and pasting from a Word document or a PDF,
> chances are the document itself contains the special characters. For
> instance, Word can use different characters for left and right double
> quotes, depending on the version and release.

Well, when I save the text with those weird things in a text file with
UTF-8 encoding they are still there when I open it, so it must be a
character that exists in this character set.
But how do I determine which one it is specifically?

I've put an example here in case someone knows how to do it:
http://ultr23.vub.ac.be/~wcosseme/someFile.txt

> Nothing in PHP or MySQL would handle such characters; you'll have to
> handle them yourself, e.g. with str_replace().

Then I might be able to replace it, who knows...

Many cheers to the one that can do it!
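
One way to find out which characters they are is to dump the bytes of every non-ASCII character in the text. The sketch below assumes the text is valid UTF-8; the function name `non_ascii_chars()` is invented for the example. The `~@~\` and `~@~]` that vi shows would be consistent with the bytes 80 9C and 80 9D, the tail ends of the UTF-8 sequences for the left and right double quotes (U+201C and U+201D), but that is a guess without seeing the file.

```php
<?php
// List every non-ASCII character in a UTF-8 string with its bytes in hex.
function non_ascii_chars(string $text): array
{
    $found = [];
    // Split into characters (not bytes) using the /u modifier.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return $found; // input was not valid UTF-8
    }
    foreach ($chars as $char) {
        if (strlen($char) > 1) { // more than one byte means non-ASCII
            $found[$char] = bin2hex($char);
        }
    }
    return $found;
}

// Prints each curly quote next to its UTF-8 bytes (e2809c and e2809d).
print_r(non_ascii_chars("\u{201C}bla\u{201D}"));
```

Once the byte sequences are known, they can be looked up in a Unicode table and fed into str_replace().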

Wimmy
Sep 5 '06 #10
