Adobe GoLive 6 - Nasty feature with UTF-8 encoding

Recently I was editing a document in GoLive 6. I like GoLive because it has some nice
features such as:
* rewrite source code
* check syntax
* global search & replace (through several files at once)
* regular expression search & replace.

Normally my documents are encoded with the ISO setting.

Recently I was writing an XHTML document. After changing the encoding to UTF-8 I used the
GoLive 'rewrite source code' feature. Big mistake. It changed all my funny characters to
non-SGML compliant characters (e.g. &eacute; was converted to é) and I didn't notice until
after I'd saved the document. Nasty. It doesn't do that with ISO encoded documents.

Jul 20 '05 #1
Zenobia <5.**********@s pamgourmet.com> wrote:
> Recently I was writing an XHTML document. After changing the encoding to UTF-8 I used the
> GoLive 'rewrite source code' feature. Big mistake. It changed all my funny characters to
> non-SGML compliant characters (e.g. &eacute; was converted to é)

There's nothing non-compliant about é. It exists in UTF-8, so if
UTF-8 is the declared encoding it is perfectly okay to use it. The
same holds for any other encoding that contains é, e.g.
ISO-8859-1. This is true whether your document is HTML or XHTML.

However, an XHTML document that is "SGML compliant"? Surely that's an
oxymoron. ;-)

Steve

--
"My theories appal you, my heresies outrage you,
I never answer letters and you don't like my tie." - The Doctor

Steve Pugh <st***@pugh.net > <http://steve.pugh.net/>
Jul 20 '05 #2
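
Steve's point can be checked directly. The following is a minimal Python sketch (Python is not used anywhere in the thread; it is only a convenient way to look at bytes) showing that é is representable in both ISO-8859-1 and UTF-8, but as different byte sequences:

```python
# Encode the same character under two encodings and compare the bytes.
ch = "\u00e9"  # é, LATIN SMALL LETTER E WITH ACUTE

latin1_bytes = ch.encode("iso-8859-1")  # one byte
utf8_bytes = ch.encode("utf-8")         # two bytes

print(latin1_bytes.hex(), utf8_bytes.hex())  # e9 c3a9

# Each byte sequence decodes back to é only under the encoding that produced it.
assert latin1_bytes.decode("iso-8859-1") == ch
assert utf8_bytes.decode("utf-8") == ch
```

Either form is fine in a document as long as the declared encoding matches the one the file was actually saved in.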
On Sat, 10 Apr 2004 12:11:54 +0100, Steve Pugh <st***@pugh.net > wrote:
> Zenobia <5.**********@s pamgourmet.com> wrote:
>> Recently I was writing an XHTML document. After changing the encoding to UTF-8 I used the
>> GoLive 'rewrite source code' feature. Big mistake. It changed all my funny characters to
>> non-SGML compliant characters (e.g. &eacute; was converted to é)
>
> There's nothing non-compliant about é. It exists in UTF-8 therefore if
> UTF-8 is the declared encoding it is perfectly okay to use it. This
> would also be the case for any encoding that contained é, e.g.
> ISO-8859-1 etc. This is true whether your document is HTML or XHTML.

I entered these characters into an XHTML document:

&plusmn; &deg; &auml;

It validated OK with the W3C XHTML validator.

This is what GoLive 6 displays after 'rewrite source code':

± ° ä

These are the characters as rendered by IE6:

± ° ä

This is the error message I get when I validate the (modified) document with the W3C XHTML
validator:

Sorry, I am unable to validate this document because on line 11 it contained
one or more bytes that I cannot interpret as us-ascii (in other words, the bytes
found are not valid values in the specified Character Encoding). Please check
both the content of the file and the character encoding indication.

Well, the corresponding numeric codes are:

&#177; &#176; &#228;

(I suppose the W3C validator would also accept these.)

How can these characters (± ° ä, or the 2-char versions) be valid UTF-8 characters if the
W3C validator doesn't accept them? Or does the W3C validator not work correctly?

I'm also lost as to why ± is shown in the GoLive editor as ±, etc. That character is
± - surely this is just a one byte character. GoLive should display it as ±

How would you go about writing valid XHTML code using GoLive - (a) with the document set
to use UTF-8 encoding (but without the benefit of the 'rewrite source code' feature), or
(b) using ISO-8859-1, so that you are able to use the 'rewrite source code' feature?

Jul 20 '05 #4
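
The "2-char versions" and the validator error are both consistent with UTF-8 bytes being read under a different encoding. A minimal Python sketch, assuming the file was saved as UTF-8 and then interpreted as ISO-8859-1 or US-ASCII (the thread does not confirm which 8-bit encoding GoLive used for display, and another one would produce different pairs):

```python
text = "\u00b1 \u00b0 \u00e4"  # ± ° ä

raw = text.encode("utf-8")  # the bytes a UTF-8 save writes to disk
print(raw.hex(" "))         # c2 b1 20 c2 b0 20 c3 a4

# Read as ISO-8859-1, each two-byte sequence becomes two separate characters.
misread = raw.decode("iso-8859-1")
print(misread)              # Â± Â° Ã¤

# Read as US-ASCII, decoding fails because bytes above 0x7F are not ASCII.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print(ascii_ok)             # False
```

The bytes themselves are valid UTF-8 in both cases; only the encoding label applied to them is wrong.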
Zenobia wrote:
> On Sat, 10 Apr 2004 12:11:54 +0100, Steve Pugh <st***@pugh.net > wrote:
>> Zenobia <5.**********@s pamgourmet.com> wrote:
>>> Recently I was writing an XHTML document. After changing the encoding to
>>> UTF-8 I used the GoLive 'rewrite source code' feature. Big mistake. It
>>> changed all my funny characters to non-SGML compliant characters (e.g.
>>> &eacute; was converted to é)
>>
>> There's nothing non-compliant about é. It exists in UTF-8 therefore if
>> UTF-8 is the declared encoding it is perfectly okay to use it. This
>> would also be the case for any encoding that contained é, e.g.
>> ISO-8859-1 etc. This is true whether your document is HTML or XHTML.
>
> I entered these characters into an XHTML document:
>
> &plusmn; &deg; &auml;
>
> It validated OK with the W3C XHTML validator.

Okay, but there's nothing depending on UTF-8 there: those characters are all
present in US-ASCII (the characters '&', 'p', 'l', etc). Those are the
characters that are actually present in the file.

> This is what GoLive 6 displays after 'rewrite source code'
>
> ± ° ä

What a (possibly flawed) program displays isn't very relevant when trying to
determine where a bug lies. Could you provide a URL to a representative
example or two?

> These are the characters, as rendered, by IE6:
>
> ± ° ä

This is of little value, as Internet Explorer violates multiple
specifications to try and guess at the behaviour that the author intended.

> This is the error message I get when I validate the (modified) document
> with the W3C XHTML validator:
>
> Sorry, I am unable to validate this document because on line 11 it
> contained one or more bytes that I cannot interpret as us-ascii (in other
> words, the bytes found are not valid values in the specified Character
> Encoding). Please check both the content of the file and the character
> encoding indication.

From that error message, I would *guess* that there was an incorrect or
missing HTTP header and/or <meta> element in your document. The characters
you are talking about are not present in US-ASCII; if the validator thinks
the document is encoded in US-ASCII, it's probably because you have told it
so (which you shouldn't).

> Well, the corresponding numeric codes are:
>
> &#177; &#176; &#228;
>
> (I suppose the W3C validator would also accept these).

Yes, on account of those actual characters being present in the US-ASCII
character encoding ('&', '#', '1', etc).

> How can these characters (± ° ä, or the 2-char versions) be valid UTF-8
> characters if the W3C validator doesn't accept them?

If, when a user-agent requests a document, you are telling it that it is
encoded in US-ASCII, most user-agents will believe you, including
validators. Try supplying an appropriate HTTP header:

Content-Type: text/html; charset=UTF-8

> Or does the W3C validator not work correctly?

If I had to guess between Internet Explorer working correctly, and something
else working correctly, I'd put money on the something else.

> I'm also lost as to why ± is shown in the GoLive editor as ±, etc. That
> character is ± - surely this is just a one byte character.

No it isn't. How many bytes depends on the character encoding, and UTF-8
sometimes splits single characters up into multiple bytes. I'm pretty sure
that the byte sequence for &#177; in UTF-8 is the same as the byte sequence for
&#177; in US-ASCII.

> GoLive should display it as ±

Only if your document is correctly advertised as being UTF-8, which it
probably isn't.

> How would you go about writing valid XHTML code using GoLive - (a) with
> the document set to use UTF-8 encoding (but without the benefit of the
> 'rewrite source code' feature), or (b) using ISO-8859-1, so that you are
> able to use the 'rewrite source code' feature?

Ensure the server is sending the correct HTTP headers, and place a matching
<meta> element in each document.
--
Jim Dabell

Jul 20 '05 #6
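
Jim's advice amounts to making the HTTP header and the <meta> element name the same encoding. A rough Python sketch of such a check follows. The helper names `header_charset` and `meta_charset` are hypothetical, and the regular expression is a simplification that only handles the `http-equiv` form of the element:

```python
import re
from email.message import Message


def header_charset(content_type):
    """Return the charset parameter of a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lower-cased, None if absent


def meta_charset(html):
    """Return the charset named in a <meta ... content="...; charset=X">, or None."""
    m = re.search(
        r'<meta[^>]+content=["\'][^"\']*charset=([\w-]+)', html, re.IGNORECASE
    )
    return m.group(1).lower() if m else None


header = "text/html; charset=UTF-8"
page = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />'

print(header_charset(header), meta_charset(page))   # utf-8 utf-8
print(header_charset(header) == meta_charset(page))  # True
```

A header with no charset parameter makes `header_charset` return `None`, which is the situation where a client has to fall back on a default or on the <meta> element.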
Zenobia <5.**********@s pamgourmet.com> wrote in message news:<0f******* *************** **********@4ax. com>...
> Sorry, I am unable to validate this document because on line 11 it contained
> one or more bytes that I cannot interpret as us-ascii (in other words, the bytes
> found are not valid values in the specified Character Encoding). Please check
> both the content of the file and the character encoding indication.

Sounds like your server is sending the document with the character
encoding specified as us-ascii rather than utf-8, even though the
actual encoding is utf-8.

--
Dan
Jul 20 '05 #8
"Steve Pugh" <st***@pugh.net > wrote in
comp.infosystems.www.authoring.html:
> There's nothing non-compliant about é. It exists in UTF-8 therefore if
> UTF-8 is the declared encoding it is perfectly okay to use it.

But Alan Flavell advises against using any of 128-255 directly in
UTF-8, if I understand his "Checklist" page correctly.[1] Instead he
says characters above 127 should be expressed in &-notation.

There's another problem with UTF-8: when I "Save As" a UTF-8 page,
Mozilla 1.4 scrogs up the high-order characters so that the local
copy contains garbage sequences instead of e.g. –. I reported
this months ago; anybody know if it's been fixed in later versions?

[1] http://ppewww.ph.gla.ac.uk/~flavell/...checklist.html
See scenario 6.

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://OakRoadSystems.com/
HTML 4.01 spec: http://www.w3.org/TR/html401/
validator: http://validator.w3.org/
CSS 2 spec: http://www.w3.org/TR/REC-CSS2/
2.1 changes: http://www.w3.org/TR/CSS21/changes.html
validator: http://jigsaw.w3.org/css-validator/
Jul 20 '05 #10
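
The &-notation Stan mentions can be produced mechanically. As an illustration only (this uses Python's built-in `xmlcharrefreplace` error handler, not anything taken from the checklist page), the sketch below rewrites every non-ASCII character as a numeric character reference, so the resulting file is pure US-ASCII whatever encoding is declared:

```python
text = "\u00b1 \u00b0 \u00e4 \u2013"  # ± ° ä –

# Characters that ASCII cannot represent are replaced by &#NNN; references.
ascii_only = text.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(ascii_only)  # &#177; &#176; &#228; &#8211;
```

The trade-off is readability of the source: the file survives a wrong charset label, but the text is harder to edit by hand.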
