On Mon, 07 Feb 2005 21:48:59 +0000, Börni <b.******@onlinehome.de> wrote:
Andy Hassall wrote:
On Mon, 07 Feb 2005 20:19:05 +0000, Börni <b.******@onlinehome.de> wrote:
The user notes on:
http://uk2.php.net/manual/en/functio...-text-node.php
claim:
"all text methods in domxml expect utf-8 encoded strings as input."
... and recommend using utf8_encode() on the value.
Ok, but why then can I specify the encoding for the file?!
You apparently need to pass all text into the functions as UTF-8; only
when the XML is finally output is it re-encoded into the encoding you
specified.
Consider:
andyh@server:~/public_html$ cat test.php
<?php
$doc = new DOMDocument('1.0', 'iso-8859-1');
$doc->formatOutput = true;
$root = $doc->createElement('root');
$root = $doc->appendChild($root);
$head = $doc->createElement('head');
$head = $root->appendChild($head);
$title = $doc->createElement('title');
$title = $head->appendChild($title);
/* the real character will probably get mangled in transit,
so here is the HTML entity: &auml; (a-umlaut) */
$text = $doc->createTextNode(utf8_encode('ä'));
$text = $title->appendChild($text);
echo $doc->saveXML();
?>
andyh@server:~/public_html$ php -q test.php | hexdump -C
00000000 3c 3f 78 6d 6c 20 76 65 72 73 69 6f 6e 3d 22 31 |<?xml version="1|
00000010 2e 30 22 20 65 6e 63 6f 64 69 6e 67 3d 22 69 73 |.0" encoding="is|
00000020 6f 2d 38 38 35 39 2d 31 22 3f 3e 0a 3c 72 6f 6f |o-8859-1"?>.<roo|
00000030 74 3e 0a 20 20 3c 68 65 61 64 3e 0a 20 20 20 20 |t>.  <head>.    |
00000040 3c 74 69 74 6c 65 3e e4 3c 2f 74 69 74 6c 65 3e |<title>ä</title>|
00000050 0a 20 20 3c 2f 68 65 61 64 3e 0a 3c 2f 72 6f 6f |.  </head>.</roo|
00000060 74 3e 0a |t>.|
00000063
Note that the ä has indeed come out in ISO-8859-1, i.e. as a single byte (e4),
and not in UTF-8.
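
To make the same point without depending on how the script file itself is
saved, here is a rough sketch that writes the input byte explicitly and then
serialises the document twice, once per declared encoding. The variable names
are mine, and using mb_convert_encoding() (mbstring extension) instead of
utf8_encode() is my substitution, since utf8_encode() is deprecated as of
PHP 8.2. Treat it as an illustration, not as the one right way to do it.

```php
<?php
// Input text as a single ISO-8859-1 byte: 0xE4 is a-umlaut.
$latin1 = "\xE4";

// DOM text methods expect UTF-8, so convert before passing the value in.
// (Assumption: mbstring is available; utf8_encode() did the same job in 2005.)
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

$doc = new DOMDocument('1.0', 'iso-8859-1');
$title = $doc->appendChild($doc->createElement('title'));
$title->appendChild($doc->createTextNode($utf8));

// Serialised in the declared encoding: the umlaut comes out as one byte.
$asLatin1 = $doc->saveXML();

// Changing the declared encoding changes only the bytes written on output;
// the text stored in the tree is untouched.
$doc->encoding = 'UTF-8';
$asUtf8 = $doc->saveXML();

echo bin2hex($asLatin1), "\n";
echo bin2hex($asUtf8), "\n";
```

Comparing the two hex strings should show e4 between the title tags in the
first and c3 a4 in the second, which is the whole story: UTF-8 in, declared
encoding out.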
--
Andy Hassall / <an**@andyh.co.uk> / <http://www.andyh.co.uk>
<http://www.andyhsoftware.co.uk/space> Space: disk usage analysis tool