Encoding problems / Perl 5.8.0 / XML::LibXML / XML::LibXSLT

Iain
#1: Jul 20 '05
Folks,

I'm having a problem with charset encodings that I desperately need some
help with. I don't even pretend to know the basics about charsets, so
please forgive my ignorance.

I am transforming XML source into XHTML using an encoding of iso-8859-1
and when I browse (using Mozilla 1.x) I see strange, accented 'A'
characters preceding some characters generated from an entity
reference. If I use utf-8, things get a lot worse: even my &nbsp;
characters get prefixed with the accented junk.

My resultant XHTML source has the usual XML preamble at the top,
complete with encoding specification; however, it doesn't use <meta/> to
specify the charset -- could this be the cause of my problem?

Basically, because I don't understand this, and because I'd like to, can
someone recommend the practices I should be following when doing these
transforms, especially when using Perl and the XML::LibXML/XML::LibXSLT
modules to manage them?

Ideally, I'd like to use utf-8 (I'm guessing that's the best approach)
but it's been a bit of a non-starter for me.

Hoping someone in c.t.xml or c.l.perl.misc can point me in the best
direction.

Many thanks,
Iain.
--
Blow the smoke from my address if replying personally.




Martin Honnen
#2: Jul 20 '05
Iain wrote:
> I'm having a problem with charset encodings that I desperately need some
> help with. I don't even pretend to know the basics about charsets, so
> please forgive my ignorance.
>
> I am transforming XML source into XHTML using an encoding of iso-8859-1
> and when I browse (using Mozilla 1.x) I see strange, accented 'A'
> characters preceding some characters generated from an entity
> reference. If I use utf-8, things get a lot worse: even my &nbsp;
> characters get prefixed with the accented junk.
>
> My resultant XHTML source has the usual XML preamble at the top,
> complete with encoding specification; however, it doesn't use <meta/> to
> specify the charset -- could this be the cause of my problem?

What content-type do you send to the browser? If you have server-side
scripting then you don't need a meta element, but you should send an HTTP
header
Content-Type: text/html; charset=ISO-8859-1
to indicate the encoding if you send text/html, as the HTML parser of a
browser will hardly look at the XML declaration.
If you send the XHTML with an XML content type like
Content-Type: text/xml
then the browser will use the XML parser, and that should indeed process any
<?xml version="1.0" encoding="ISO-8859-1"?>
declaration.
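
To make the symptom concrete, here is a minimal sketch of what probably happens when the bytes sent and the charset the browser assumes disagree. It is written in Python purely for illustration (the thread itself concerns Perl), and it assumes the stray accented 'A' comes from UTF-8 bytes being read as ISO-8859-1. The helper name content_type_header is hypothetical.

```python
# U+00A0 is the no-break space that the &nbsp; entity expands to.
nbsp = "\u00a0"

# The same character serialises to different bytes in each encoding.
utf8_bytes = nbsp.encode("utf-8")          # two bytes: 0xC2 0xA0
latin1_bytes = nbsp.encode("iso-8859-1")   # one byte:  0xA0

# If the bytes are UTF-8 but the browser decodes them as ISO-8859-1,
# the lead byte 0xC2 is displayed as a character of its own (an accented 'A').
misread = utf8_bytes.decode("iso-8859-1")


def content_type_header(media_type, charset):
    """Build the HTTP header line that tells the browser which charset to use.

    Hypothetical helper for illustration; the charset passed here must match
    the encoding the transform actually wrote.
    """
    return f"Content-Type: {media_type}; charset={charset}"


header = content_type_header("text/html", "ISO-8859-1")
```

The point of the sketch is only that the charset named in the header has to agree with the bytes the transform emits; when the two agree, the no-break space decodes back to a single character.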
--

Martin Honnen
http://JavaScript.FAQTs.com/

Iain
#3: Jul 20 '05
Martin Honnen wrote:
>
-->8--
>
> What content-type do you send to the browser? If you have server-side
> scripting then you don't need a meta element, but you should send an HTTP
> header
> Content-Type: text/html; charset=ISO-8859-1
> to indicate the encoding if you send text/html, as the HTML parser of a
> browser will hardly look at the XML declaration.
> If you send the XHTML with an XML content type like
> Content-Type: text/xml
> then the browser will use the XML parser, and that should indeed process any
> <?xml version="1.0" encoding="ISO-8859-1"?>

Thanks Martin. The HTTP header did the trick.

Iain.
--
Clear the smoke from my address before replying directly to me.
