
Showing text weirdly with php

Hello,

I have an HTML form for the user to input some French text. The text goes into a MySQL
database.

* before: (mozilla-1.5)

All my pages were designed to show themselves with the iso-8859-15 charset
(accept attribute in forms, and Content-Type in html head).

The text displayed from MySQL was full of '?' ('&#65533') instead of '-' and "'".

But when I switched Mozilla->View->Coding->iso-8859-1, the '?' were replaced by
the correct '-' and "'".

Moreover, phpMyAdmin was showing the text correctly without touching View->Coding.

So I changed everything to use iso-8859-1 (form accept and Content-Type).

* Now:

Still the same for my pages.

phpMyAdmin now shows '&#65533' directly in the browser (no need to open the HTML
source code).

System:
* Mac OS X 10.2.8
* apache-1.3.28 (home-compiled)
* php-4.3.3 (home-compiled)
* mysql-4.1.0-alpha-standard (package)

Can somebody help me?

Thx

--
TheDD

Jul 17 '05 #1


david wrote:
> All my pages were designed to show themselves with the iso-8859-15 charset
> (accept attribute in forms, and Content-Type in html head).

The accept attribute takes a comma separated list of content
types. In W3C lingo, "content type" is synonymous with MIME type
or media type; text/plain is an example of a MIME type.

Peradventure, you meant to write "accept-charset attribute": a
space and/or comma separated list of "charsets" (character
encodings). ISO-8859-15 is an example of a charset. Anyway,
browser support for accept-charset is pathetic, I believe.

http://www.w3.org/TR/REC-html40/inte...ms.html#h-17.3

> The from-mysql-showed-text was full of '?' ('&#65533') instead of '-' and "'".
>
> But when i switched Mozilla->View->Coding->iso-8859-1, the '?' were replaced by
> the good '-' and "'".


Sounds awfully like you're not advertising the character encoding
properly -- got a URL?

Forget about that meta hack. What you should be doing is sending
the charset parameter with a bona fide Content-Type header. So if
your document is sent in Latin-9 encoded text, you should ensure
that your server is outputting:

Content-Type: text/html; charset=ISO-8859-15

This is plain sailing in your server, Apache; or you could use
PHP's header function to override the server-set Content-Type
field value.

--
Jock
Jul 17 '05 #2
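
To make the effect of the charset parameter concrete, here is a minimal sketch in Python (chosen only because the point is about bytes, not about PHP itself). The helper name charset_from_content_type and the sample word are made up for illustration. It reads the charset from a Content-Type value like the one above, then decodes the same bytes once with the advertised charset and once with a mismatched one.

```python
from email.message import Message


def charset_from_content_type(value, default="iso-8859-1"):
    """Return the charset parameter of a Content-Type value, lowercased."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset() or default


# "coeur" written with the oe ligature (U+0153), which Latin-9 has and Latin-1 lacks.
word = "c\u0153ur"
body = word.encode("iso-8859-15")        # bytes as a server would send them

advertised = charset_from_content_type("text/html; charset=ISO-8859-15")
right = body.decode(advertised)          # decoded with the advertised charset
wrong = body.decode("iso-8859-1")        # decoded with a mismatched charset

assert right == word
assert wrong == "c\u00bdur"              # the ligature turns into a one-half sign
```

The bytes never change; only the label does. That is why a wrong or missing charset in the header shows up as wrong characters rather than as an error.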

John Dunlop wrote:
> david wrote:
>> All my pages were designed to show themselves with the iso-8859-15 charset
>> (accept attribute in forms, and Content-Type in html head).
>
> The accept attribute takes a comma separated list of content
> types. In W3C lingo, "content type" is synonymous with MIME type
> or media type; text/plain is an example of a MIME type.
>
> Peradventure, you meant to write "accept-charset attribute": a
> space and/or comma separated list of "charsets" (character
> encodings). ISO-8859-15 is an example of a charset. Anyway,
> browser support for accept-charset is pathetic, I believe.
>
> http://www.w3.org/TR/REC-html40/inte...ms.html#h-17.3


yes, my mistake

>> The from-mysql-showed-text was full of '?' ('&#65533') instead of '-' and "'".
>>
>> But when i switched Mozilla->View->Coding->iso-8859-1, the '?' were replaced by
>> the good '-' and "'".
>
> Sounds awfully like you're not advertising the character encoding
> properly -- got a URL?

A URL? I'm developing the website internally; I can't make it public yet.

But I had forgotten one thing:
* the text I put in the textarea is copied from the original (static HTML)
website. If I replace each wrong character with the same one typed from the
keyboard (instead of copied), it works...
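
One possible explanation, which the thread does not confirm: the copied text may contain typographic punctuation (a curly apostrophe U+2019, an en dash U+2013) rather than the plain ASCII characters a keyboard produces. The Python sketch below only checks which characters a given charset can represent; the sample strings and the helper name unencodable are invented for illustration.

```python
def unencodable(text, charset):
    """Return the characters of text that charset cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode(charset)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


typed = "l'\u00e9t\u00e9 - fin"              # ASCII apostrophe and hyphen
copied = "l\u2019\u00e9t\u00e9 \u2013 fin"   # curly apostrophe and en dash (assumed)

typed_problems = unencodable(typed, "iso-8859-15")    # nothing to report
copied_problems = unencodable(copied, "iso-8859-15")  # the two typographic marks

# With a lossy conversion, each unrepresentable character becomes '?'.
lossy = copied.encode("iso-8859-15", errors="replace")
```

If that assumption holds, retyping the characters works simply because the typed versions exist in both ISO-8859-1 and ISO-8859-15.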

> Forget about that meta hack. What you should be doing is sending
> the charset parameter with a bona fide Content-Type header. So if
> your document is sent in Latin-9 encoded text, you should ensure
> that your server is outputting:
>
> Content-Type: text/html; charset=ISO-8859-15
>
> This is plain sailing in your server, Apache; or you could use
> PHP's header function to override the server-set Content-Type
> field value.


Well, I had added the following directive in httpd.conf:
AddDefaultCharset iso-8859-15

which leads to (Mozilla request and Apache 1.3.28 answer):

GET / HTTP/1.1
Host: localhost:8080
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.5) Gecko/20031007
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1
Accept-Language: fr-fr,fr;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

HTTP/1.1 200 OK
Date: Wed, 05 Nov 2003 09:37:29 GMT
Server: Apache/1.3.28 (Darwin) PHP/4.3.3
Content-Location: index.html.fr
Vary: negotiate,accept-language,accept-charset
TCN: choice
Last-Modified: Thu, 18 Oct 2001 04:25:28 GMT
ETag: "5882-5f5-3bce59b8;3e7d7b13"
Accept-Ranges: bytes
Content-Length: 1525
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-15
Content-Language: fr

So I changed it to iso-8859-1, and it works (everything I've tested this
morning)! Thank you so much! I had completely forgotten this directive.
Just a little explanation: I tried to use -15 mainly because of the euro sign;
but in HTML there is an "entity" for it, so I'm going to use -1 since it seems
better handled. Most websites are in -1, and since the text in the textarea is
not converted to the specified accept-charset, it's definitely a better choice
(I don't have php-iconv).

(sorry for my bad english)

--
David

Jul 17 '05 #3
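
For reference, the euro sign is one of the few places where the two charsets differ. A small Python sketch (illustrative only, variable names are arbitrary) of what happens to that one byte:

```python
euro = "\u20ac"                                  # EURO SIGN

latin9_bytes = euro.encode("iso-8859-15")        # a single byte, 0xA4
as_latin1 = latin9_bytes.decode("iso-8859-1")    # same byte read as Latin-1

# Latin-1 has no euro sign at all, so encoding it directly fails.
try:
    euro.encode("iso-8859-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False

assert latin9_bytes == b"\xa4"
assert as_latin1 == "\u00a4"                     # generic currency sign
assert latin1_has_euro is False
```

So a page served as iso-8859-1 needs another way to write the euro, such as a character reference or the word itself.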

david wrote:
> Just a little explanation: i tried to use -15 mainly because of the
> euro sign; but in html, there is an "entity" for it,


Yes, but in practice, browser support isn't universal. The EURO
SIGN can be presented using the entity reference &euro;, and the
numeric character references &#8364; and &#x20ac;. The numeric
character reference &#8364; has the most browser support.

You could just use the word "euro". No browser gets that wrong.

--
Jock
Jul 17 '05 #4
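
The three spellings mentioned above (named entity, decimal reference, hexadecimal reference) all denote the same character, U+20AC. A short Python sketch using the standard html module shows this; it says nothing about how any particular browser behaves.

```python
import html

references = ["&euro;", "&#8364;", "&#x20ac;"]
decoded = [html.unescape(ref) for ref in references]

# Build the numeric references from the code point itself.
code_point = ord("\u20ac")
decimal_ref = "&#{};".format(code_point)
hex_ref = "&#x{:x};".format(code_point)

assert decoded == ["\u20ac", "\u20ac", "\u20ac"]
assert decimal_ref == "&#8364;"
assert hex_ref == "&#x20ac;"
```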
