garbage characters are now on the site, although they weren't there originally
Question posted by: Lawrence Krubner (Guest) on June 27th, 2008 07:16 PM
Once upon a time, there were no garbage characters on this page:
http://www.teamlalala.com/blog/category/css/
Now there are. For instance:
The 2nd paragraph from page 114 of “The Zen Of CSS Design”
For me, there are garbage characters before "The" and after "Design".
The page has always, always been served as UTF-8.
I'm having trouble working out what might have changed, which would cause
these garbage characters. At a stretch, I think back to an incident a few
months ago, when our server was hacked, and we had to do a re-install,
with upgraded versions of stuff like Apache. So I could almost imagine
Apache sending new headers, except that, in my case, the meta tag
indicates UTF-8 and when I look at it in Firefox, Firefox correctly
reads it as UTF-8.
Anything else that could cause this?
I can not find a character encoding that renders this page without
garbage characters.
-- lawrence krubner
|
|
June 27th, 2008 07:16 PM
# 2
|
Re: garbage characters are now on the site, although they weren't there originally
On 2008-06-05, Lawrence Krubner <lawrence@krubner.com> wrote:
Quote:
Originally Posted by
Once upon a time, there were no garbage characters on this page:
http://www.teamlalala.com/blog/category/css/
Now there are. For instance:
The 2nd paragraph from page 114 of “The Zen Of CSS Design”
For me, there are garbage characters before "The" and after "Design".
The page has always, always been served as UTF-8.
I'm having trouble working out what might have changed, which would cause
these garbage characters. At a stretch, I think back to an incident a few
months ago, when our server was hacked, and we had to do a re-install,
with upgraded versions of stuff like Apache. So I could almost imagine
Apache sending new headers, except that, in my case, the meta tag
indicates UTF-8 and when I look at it in Firefox, Firefox correctly
reads it as UTF-8.
Anything else that could cause this?
I can not find a character encoding that renders this page without
garbage characters.
|
The page _is_ valid UTF-8, and the server header says it's UTF-8, and it
really does contain those characters (a with circumflex, euro symbol, oe
diphthong ligature thing), encoded in UTF-8.
How did they get there? Not sure, perhaps you "converted" the file from
Latin1 to UTF-8 when it already was UTF-8 or something.
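A minimal Python sketch of that kind of accidental double conversion, for illustration only. It is not taken from the site in question: `double_encode` is a hypothetical helper, and Windows-1252 is an assumption about which single-byte encoding the UTF-8 bytes were mistaken for. It shows how one left curly quote becomes the three characters described above (a with circumflex, euro sign, oe ligature).

```python
def double_encode(text: str) -> bytes:
    """Encode text as UTF-8, misread the bytes as Windows-1252, encode again."""
    once = text.encode("utf-8")        # correct UTF-8 bytes
    misread = once.decode("cp1252")    # the mistake: bytes treated as cp1252
    return misread.encode("utf-8")     # "converted" to UTF-8 a second time


# U+201C is the left curly quote that opens the book title.
garbled = double_encode("\u201cThe Zen").decode("utf-8")

# The single quote character has become three separate characters.
print([hex(ord(c)) for c in garbled[:3]])  # ['0xe2', '0x20ac', '0x153']
```

The result is still valid UTF-8, which would explain why no choice of character encoding in the browser makes the garbage go away.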
Anyway you should be OK if you just fix the page to contain instead the
UTF-8 representations of the characters you want (presumably quotation
marks).
Never mind the meta tag: the browser only uses that if the server fails
to say what the encoding is. In your case the server does say. The meta
tag might as well be correct, but it won't cause or solve a real problem
here.
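The precedence described here (HTTP header first, meta tag only as a fallback) can be sketched in Python roughly as follows. This is a simplified illustration, not how any particular browser is implemented: `effective_charset` and `charset_from_header` are hypothetical names, the meta tag is found with a crude regular expression, and byte order marks and browser sniffing are ignored.

```python
import re
from email.message import Message

# Crude pattern for <meta charset=...> and <meta ... content="...; charset=...">.
META_CHARSET = re.compile(r"<meta[^>]+charset=[\"']?([\w-]+)", re.IGNORECASE)


def charset_from_header(content_type):
    """Return the charset parameter of a Content-Type header value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def effective_charset(content_type, html):
    """Prefer the HTTP header; fall back to the meta tag only if it is silent."""
    header_charset = charset_from_header(content_type)
    if header_charset:
        return header_charset
    match = META_CHARSET.search(html)
    return match.group(1).lower() if match else None


# The header wins even when the meta tag disagrees.
print(effective_charset("text/html; charset=UTF-8",
                        '<meta charset="iso-8859-1">'))  # utf-8
```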
|
|
June 27th, 2008 07:16 PM
# 3
|
Re: garbage characters are now on the site, although they weren't there originally
On Thu, 05 Jun 2008 22:16:08 +0200, Lawrence Krubner
<lawrence@krubner.com> wrote:
Quote:
Originally Posted by
Once upon a time, there were no garbage characters on this page:
http://www.teamlalala.com/blog/category/css/
Now there are. For instance:
The 2nd paragraph from page 114 of “The Zen Of CSS Design”
For me, there are garbage characters before "The" and after "Design".
The page has always, always been served as UTF-8.
I'm having trouble working out what might have changed, which would cause
these garbage characters. At a stretch, I think back to an incident a few
months ago, when our server was hacked, and we had to do a re-install,
with upgraded versions of stuff like Apache. So I could almost imagine
Apache sending new headers, except that, in my case, the meta tag
indicates UTF-8 and when I look at it in Firefox, Firefox correctly
reads it as UTF-8.
Anything else that could cause this?
I can not find a character encoding that renders this page without
garbage characters.
|
Among the top reasons for double utf-8 encoding is an improper database
export/import.
--
Rik Wasmus
....spamrun finished
|
|
June 27th, 2008 07:16 PM
# 4
|
Re: garbage characters are now on the site, although they weren't there originally
On Jun 6, 12:16 am, Lawrence Krubner <lawre...@krubner.com> wrote:
Quote:
Originally Posted by
Once upon a time, there were no garbage characters on this page:
http://www.teamlalala.com/blog/category/css/
Now there are. For instance:
The 2nd paragraph from page 114 of The Zen Of CSS Design
For me, there are garbage characters before "The" and after "Design".
The page has always, always been served as UTF-8.
I'm having trouble working out what might have changed, which would cause
these garbage characters. At a stretch, I think back to an incident a few
months ago, when our server was hacked, and we had to do a re-install,
with upgraded versions of stuff like Apache. So I could almost imagine
Apache sending new headers, except that, in my case, the meta tag
indicates UTF-8 and when I look at it in Firefox, Firefox correctly
reads it as UTF-8.
Anything else that could cause this?
I can not find a character encoding that renders this page without
garbage characters.
|
Don't use "smart quotes" in any form other than HTML entities. Better
still, don't use them at all; if you really need them, use HTML entities
only.
For static documents, always check for damaged quotes after the document
has been opened in a rich text editor such as Microsoft Word. Better
still, don't open (X)HTML documents in a rich text editor at all.
These are some of the golden rules of successful web design. See also:
http://en.wikipedia.org/wiki/Smart_...nic_documents
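For anyone who wants to follow the entity advice mechanically, here is a small Python sketch. The mapping and the function name `smart_quotes_to_entities` are illustrative assumptions, and only the four common curly quote characters are covered.

```python
# Named HTML entities for the four common "smart" quote characters.
SMART_QUOTE_ENTITIES = {
    "\u2018": "&lsquo;",  # left single quotation mark
    "\u2019": "&rsquo;",  # right single quotation mark
    "\u201c": "&ldquo;",  # left double quotation mark
    "\u201d": "&rdquo;",  # right double quotation mark
}

_TABLE = str.maketrans(SMART_QUOTE_ENTITIES)


def smart_quotes_to_entities(text: str) -> str:
    """Replace curly quotes with named entities, leaving other text alone."""
    return text.translate(_TABLE)


print(smart_quotes_to_entities("\u201cThe Zen Of CSS Design\u201d"))
# &ldquo;The Zen Of CSS Design&rdquo;
```

The output is pure ASCII for these characters, so it survives a wrong encoding guess somewhere along the way.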
|
|
June 27th, 2008 07:16 PM
# 5
|
Re: garbage characters are now on the site, although they weren't there originally
Rik Wasmus wrote:
Quote:
Originally Posted by
On Thu, 05 Jun 2008 22:16:08 +0200, Lawrence Krubner
<lawrence@krubner.com> wrote:
Quote:
Originally Posted by
> Once upon a time, there were no garbage characters on this page:
> http://www.teamlalala.com/blog/category/css/
> Now there are. For instance:
> The 2nd paragraph from page 114 of “The Zen Of CSS Design”
> For me, there are garbage characters before "The" and after "Design".
> The page has always, always been served as UTF-8.
> I'm having trouble working out what might have changed, which would cause
> these garbage characters. At a stretch, I think back to an incident a few
> months ago, when our server was hacked, and we had to do a re-install,
> with upgraded versions of stuff like Apache. So I could almost imagine
> Apache sending new headers, except that, in my case, the meta tag
> indicates UTF-8 and when I look at it in Firefox, Firefox correctly
> reads it as UTF-8.
> Anything else that could cause this?
> I can not find a character encoding that renders this page without
> garbage characters.
|
Among the top reasons for double utf-8 encoding is an improper database
export/import.
|
That must be it, then. Is there an automated way to undo the damage? Or
do I have to fix every post by hand?
Also, any tips on import/export, for the next time I have to do this?
--lk
|
|
June 27th, 2008 07:16 PM
# 6
|
Re: garbage characters are now on the site, although they weren't there originally
On Jun 7, 7:44 pm, Lawrence Krubner <lawre...@krubner.com> wrote:
Quote:
Originally Posted by
Rik Wasmus wrote:
Quote:
Originally Posted by
On Thu, 05 Jun 2008 22:16:08 +0200, Lawrence Krubner
<lawre...@krubner.com> wrote:
Quote:
Originally Posted by
> Once upon a time, there were no garbage characters on this page:
> Now there are. For instance:
> The 2nd paragraph from page 114 of “The Zen Of CSS Design”
> For me, there are garbage characters before "The" and after "Design".
> The page has always, always been served as UTF-8.
> I'm having trouble working out what might have changed, which would cause
> these garbage characters. At a stretch, I think back to an incident a few
> months ago, when our server was hacked, and we had to do a re-install,
> with upgraded versions of stuff like Apache. So I could almost imagine
> Apache sending new headers, except that, in my case, the meta tag
> indicates UTF-8 and when I look at it in Firefox, Firefox correctly
> reads it as UTF-8.
> Anything else that could cause this?
> I can not find a character encoding that renders this page without
> garbage characters.
|
Among the top reasons for double utf-8 encoding is an improper database
export/import.
|
That must be it, then. Is there an automated way to undo the damage? Or
do I have to fix every post by hand?
Also, any tips on import/export, for the next time I have to do this?
--lk
|
Somewhat off-topic question, but, when you copy-and-paste text in
windows/unix, is the encoding included in that information?
I.e. if you saved a document in latin1 and wanted to get it to utf-8,
could you just copy and paste the text into a new document
and save it as utf-8?
|
|
June 27th, 2008 07:16 PM
# 7
|
Re: garbage characters are now on the site, although they weren't there originally
On Tue, 10 Jun 2008, Keith Hughitt wrote:
Quote:
Originally Posted by
Somewhat off-topic question, but, when you copy-and-paste text in
windows/unix, is the encoding included in that information?
|
What is "windows/unix"?
Quote:
Originally Posted by
I.e. if you saved a document in latin1 and wanted to get it to utf-8,
could you just copy and paste the text into a new document
and save it as utf-8?
|
It depends on the program you use.
On Unix, it depends also on your locale settings.
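A way to sidestep the clipboard question entirely is to convert the bytes explicitly. The following is a minimal Python sketch under the assumption that the source really is Latin-1 (ISO-8859-1); the function names and file paths are hypothetical.

```python
from pathlib import Path


def latin1_bytes_to_utf8(data: bytes) -> bytes:
    """Decode bytes as Latin-1 and re-encode the same text as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


def convert_file(src: str, dst: str) -> None:
    """Convert a Latin-1 file to UTF-8; src and dst are caller-supplied paths."""
    Path(dst).write_bytes(latin1_bytes_to_utf8(Path(src).read_bytes()))


# 0xE9 is e-acute in Latin-1; in UTF-8 it becomes the two bytes 0xC3 0xA9.
print(latin1_bytes_to_utf8(b"caf\xe9"))  # b'caf\xc3\xa9'
```

Running this on a file that is already UTF-8 would produce exactly the double encoding discussed earlier in the thread, so the source encoding should be checked first.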
--
In memoriam Alan J. Flavell
http://groups.google.com/groups/sea...:Alan.J.Flavell
|
|
June 27th, 2008 07:16 PM
# 8
|
Re: garbage characters are now on the site, although they weren't there originally
Andreas Prilop wrote:
Quote:
Originally Posted by
On Tue, 10 Jun 2008, Keith Hughitt wrote:
Quote:
Originally Posted by
> Somewhat off-topic question, but, when you copy-and-paste text in
> windows/unix, is the encoding included in that information?
|
What is "windows/unix"?
|
s/\// or /
--
Blinky
Killing all posts from Google Groups
The Usenet Improvement Project -- http://improve-usenet.org
Found 5/08: a free GG-blocking news *feed* -- http://usenet4all.se
|
|
June 27th, 2008 07:16 PM
# 9
|
Re: garbage characters are now on the site, although they weren't there originally
On Sun, 08 Jun 2008 01:44:50 +0200, Lawrence Krubner
<lawrence@krubner.com> wrote:
Quote:
Originally Posted by
Rik Wasmus wrote:
Quote:
Originally Posted by
On Thu, 05 Jun 2008 22:16:08 +0200, Lawrence Krubner
<lawrence@krubner.com> wrote:
Quote:
Originally Posted by
> Once upon a time, there were no garbage characters on this page:
> http://www.teamlalala.com/blog/category/css/
> Now there are. For instance:
> The 2nd paragraph from page 114 of “The Zen Of CSS Design”
> For me, there are garbage characters before "The" and after "Design".
> The page has always, always been served as UTF-8.
> I'm having trouble working out what might have changed, which would cause
> these garbage characters. At a stretch, I think back to an incident a few
> months ago, when our server was hacked, and we had to do a re-install,
> with upgraded versions of stuff like Apache. So I could almost imagine
> Apache sending new headers, except that, in my case, the meta tag
> indicates UTF-8 and when I look at it in Firefox, Firefox correctly
> reads it as UTF-8.
> Anything else that could cause this?
> I can not find a character encoding that renders this page without
> garbage characters.
|
Among the top reasons for double utf-8 encoding is an improper
database export/import.
|
That must be it, then. Is there an automated way to undo the damage? Or
do I have to fix every post by hand?
|
I am not aware of a general quick and easy fix. Ask in a group dedicated
to the database of your choice; it isn't an uncommon problem.
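That said, for the common case where UTF-8 bytes were misread as Windows-1252 and then encoded again, the damage can sometimes be reversed per string. The Python sketch below is only an illustration under that assumption; `undo_double_encoding` is a hypothetical helper, not a database feature, and it should be tried on a copy of the data first.

```python
def undo_double_encoding(text: str) -> str:
    """Reverse one round of UTF-8-read-as-cp1252 damage, if the text fits it.

    Strings that do not look double encoded are returned unchanged.
    """
    try:
        # Map each character back to the byte it was misread from, then
        # decode those bytes as the UTF-8 they originally were.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# a-circumflex, euro sign, oe ligature: a double-encoded left curly quote.
damaged = "\u00e2\u20ac\u0153The Zen"
print(undo_double_encoding(damaged) == "\u201cThe Zen")  # True
```

One known limitation of this sketch: Python's cp1252 codec has no mapping for a few bytes (0x9D among them, which occurs in the right curly quote), so strings containing those are left as they are and would need separate handling.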
Quote:
Originally Posted by
Also, any tips on import/export, for the next time I have to do this?
|
If it's MySQL, be sure to set the connection character set to the proper
value. In that case, the first statement in the file to be imported
should have been:
SET NAMES utf8;
HTH,
--
Rik Wasmus
....spamrun finished
|
|
June 27th, 2008 07:16 PM
# 10
|
Re: garbage characters are now on the site, although they weren't there originally
Hehe, what I meant was on either Windows or Unix (Linux). I'd be
interested to know how it works on both systems.
On Jun 10, 11:50 am, Andreas Prilop <prilop1...@trashmail.net> wrote:
Quote:
Originally Posted by
On Tue, 10 Jun 2008, Keith Hughitt wrote:
Quote:
Originally Posted by
Somewhat off-topic question, but, when you copy-and-paste text in
windows/unix, is the encoding included in that information?
|
What is "windows/unix"?
Quote:
Originally Posted by
I.e. if you saved a document in latin1 and wanted to get it to utf-8,
could you just copy and paste the text into a new document
and save it as utf-8?
|
It depends on the program you use.
On Unix, it depends also on your locale settings.
--
In memoriam Alan J. Flavell
http://groups.google.com/groups/search?q=author:Alan.J.Flavell
|
|
|
June 27th, 2008 07:16 PM
# 11
|
Re: garbage characters are now on the site, although they weren't there originally
Differently.
Quote:
Originally Posted by
interested to know how it works on both systems.
Hehe, what I meant was on either Windows or Unix (Linux). I'd be
>
Quote:
Originally Posted by
>What is "windows/unix"?
>>
Quote:
Originally Posted by
>>windows/unix, is the encoding included in that information?
>>Somewhat off-topic question, but, when you copy-and-paste text in
|
|
|
On Wed, 11 Jun 2008, Keith Hughitt wrote:
--
Top-posting.
What's the most irritating thing on Usenet?