Andy Hassall <an**@andyh.co.uk> wrote in message news:<ko********************************@4ax.com>...
On 29 Sep 2004 21:13:30 GMT, Pedro Graca <he****@hotpop.com> wrote:
lawrence wrote:
The validator chokes on my pages now because I started sending a
character encoding header of UTF-8 but the pages are full of non-UTF-8
characters. Any quick way to convert them?
http://validator.w3.org/check?uri=ht...krubner.com%2F
http://www.php.net/utf8_decode
... and check your meta tags
Isn't that the wrong way around? He's sending non-UTF-8 data but flagging it as
UTF-8, resulting in errors. If the headers remain that way, then doesn't he
want to encode it to UTF-8, not decode it?
The other big question is: why did the OP start sending UTF-8 headers if he's
not actually sending UTF-8?
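
To make the encode-versus-decode point concrete, here is a minimal sketch. It is written in Python purely for illustration, and it assumes the old content is ISO-8859-1 (Latin-1), which is the input encoding PHP's utf8_encode() expects. The sample string and the helper names is_valid_utf8 and latin1_to_utf8 are hypothetical.

```python
# Assumed sample: "café" as it would be stored in ISO-8859-1 (Latin-1).
legacy_bytes = "caf\u00e9".encode("latin-1")  # b'caf\xe9'


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def latin1_to_utf8(data: bytes) -> bytes:
    """Rough analogue of utf8_encode(): reinterpret Latin-1 bytes as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


# The raw Latin-1 bytes are not valid UTF-8, which is what a validator
# trips over when the header claims UTF-8.
print(is_valid_utf8(legacy_bytes))

# Encoding (not decoding) is the direction that makes them valid.
fixed = latin1_to_utf8(legacy_bytes)
print(is_valid_utf8(fixed), fixed)
```

Going the other way (the utf8_decode() direction) would only help if the data were already UTF-8 and the page were being served as Latin-1.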
The current site was broken in the sense that it is a weblog and I'd
like to put an RSS feed on it, because all weblogs have RSS feeds
nowadays.
But RSS feeds won't validate if the feed is sent out without a
character encoding. So I have to give it a character encoding of some
kind. So I decided on UTF-8 after hashing it out some over on
comp.lang.php. And now that I'm forcing the issue, a lot of text
that was input previously is balking.
The site has been built in a hodgepodge way over the last 6 years and
has debris from previous incarnations. The weblog software I now use
started long before I knew what a character encoding was. Developing
the software has been a process of finding out about stuff and then
trying to make the existing content fit whatever the new issue is. No
doubt this process will continue for many more years, as there will
always be things I don't know, and then there will always be new
technologies or uses I want to try for.
The way the software now works is that any input gets hit with
utf8_encode() and therefore any output from now on should be UTF-8.
But in the meantime I've got to clean up the old stuff.
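
One hazard with a clean-up pass like this is double encoding: running a Latin-1-to-UTF-8 conversion over text that is already UTF-8 garbles it. Below is a rough sketch of a convert-only-if-needed pass, again in Python for illustration. It assumes the old rows are Latin-1 unless they already decode as UTF-8; the function name to_utf8_once and the sample rows are hypothetical. The check is a heuristic, since a few Latin-1 byte sequences also happen to be valid UTF-8.

```python
def to_utf8_once(data: bytes, legacy_encoding: str = "latin-1") -> bytes:
    """Convert legacy bytes to UTF-8, leaving already-valid UTF-8 untouched."""
    try:
        data.decode("utf-8")
        return data  # already valid UTF-8 (plain ASCII also lands here)
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is in the legacy encoding.
        return data.decode(legacy_encoding).encode("utf-8")


# Hypothetical mix of old rows: Latin-1, already-UTF-8, and plain ASCII.
old_rows = [b"caf\xe9", "caf\u00e9".encode("utf-8"), b"plain ascii"]
cleaned = [to_utf8_once(row) for row in old_rows]

# What blind re-encoding does to a row that was already UTF-8:
double_encoded = "caf\u00e9".encode("utf-8").decode("latin-1").encode("utf-8")

print(cleaned)
print(double_encoded)
```

Because the function leaves valid UTF-8 alone, running it twice over the same row gives the same result as running it once. The same caution may apply to new input: if browsers start submitting UTF-8 because the pages are now declared as UTF-8, an unconditional utf8_encode() on input would double-encode it.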