
HTML text area -> XML

Hi,

I have a strange problem and don't know where to look for the bug. OK, I
have an HTML form with a text area.
Very roughly, I do something like this (PHP 4.2.3, I can't update the
version):

------
$DomDocument = xmldoc('<'.'?xml version=\'1.0\' encoding=\'ISO-8859-1\'?'.'><pages/>');

$root = $DomDocument->root();

$element = $DomDocument->create_text_node(utf8_encode($content));
// $content is the text area value
$root->add_child($element);

// assumed intent: $handle1 is a file handle already opened on file.xml
fwrite($handle1, $DomDocument->dumpmem());
------

This works in every case. In particular, converting UTF-8 strings and then
getting a good ISO... file after the dump has always worked, until now with
my text area:

The file contains a lot of entities. The returns are displayed as etc.
But if I write $content into a file directly, everything is fine.
In the HTML form the character set is also set to ISO-8859-1.

Where do the entities come from and how can I avoid them?

Regards,
Frank
Jul 17 '05 #1
2 Replies


If you want to see a better method of dealing with multi-byte characters in
an XML file then take a look at
http://www.tonymarston.co.uk/php-mys...tml#multi-byte

You should also be aware that an XML file is only supposed to contain data,
not any HTML tags, which is why they are converted into entities.

When using an XML file in an XSL transformation, all the HTML tags are
supposed to be generated by the XSL stylesheet and not ported across from the
XML data.

Tony Marston
http://www.tonymarston.net/
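
To illustrate the point about tags being converted into entities: the sketch below uses the DOM extension from PHP 5 and later (DOMDocument). That is an assumption made so the example can be run today; the question itself is about the older PHP 4 DOM XML functions. It only shows that markup placed in a text node is treated as data and escaped when the document is serialized.

```php
<?php
// Assumes PHP 5+ with the bundled DOM extension (not the PHP 4 API from the question).
$doc = new DOMDocument('1.0', 'ISO-8859-1');
$root = $doc->createElement('pages');
$doc->appendChild($root);

// A text node holds character data, so "<" and ">" are escaped on output.
$root->appendChild($doc->createTextNode('<b>bold</b>'));

$xml = $doc->saveXML();
echo $xml;
```

In the serialized document the text appears as &lt;b&gt;bold&lt;/b&gt; rather than as a b element.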


Jul 17 '05 #2

Hi,

thanks for your reply!
> If you want to see a better method of dealing with multi-byte characters in an XML file then take a look at
> http://www.tonymarston.co.uk/php-mys...tml#multi-byte

Hm, OK, so does mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1')
work differently from utf8_encode($value)?
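
For ISO-8859-1 input the two calls should give the same bytes. A small check, assuming the mbstring extension is loaded; note that utf8_encode() is deprecated as of PHP 8.2, so the call is silenced with @ here:

```php
<?php
// "\xE4" is the single ISO-8859-1 byte for a-umlaut.
$latin1 = "\xE4";

// Deprecated as of PHP 8.2 but still available there; @ hides the notice.
$viaUtf8Encode = @utf8_encode($latin1);

// Requires the mbstring extension.
$viaMbstring = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// Both results should be the two-byte UTF-8 sequence C3 A4.
echo bin2hex($viaUtf8Encode), "\n";
echo bin2hex($viaMbstring), "\n";
var_dump($viaUtf8Encode === $viaMbstring);
```

The difference between the two is mainly that mb_convert_encoding() accepts other source and target encodings, while utf8_encode() is fixed to ISO-8859-1 as its source.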
> You should also be aware that an XML file is only supposed to contain data, not any HTML tags, which is why they are converted into entities.

That's clear, but why are the German umlauts encoded, and also the
returns? That's my main problem.

> When using an XML file in an XSL transformation all the HTML tags are
> supposed to be generated by the XSL stylesheet and not ported across from the XML data.


But I'm not using any XSL transformation. It's very strange. It seems to
have something to do with the data coming from the HTML form. :-/
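
One thing that may be worth checking for data that comes from an HTML form: browsers submit line breaks in a textarea as CR LF, and libxml2 (the library behind the DOM XML functions) writes a carriage return inside a text node as the character reference &#13;. A minimal sketch, assuming that this is where the entities for the returns come from; normalize_newlines is a hypothetical helper name, not part of PHP, and the sample string is made up:

```php
<?php
// Hypothetical helper: turn CR LF and lone CR line endings into plain LF.
function normalize_newlines($text)
{
    // CR LF is replaced first so it does not end up as two line feeds.
    return str_replace(array("\r\n", "\r"), "\n", $text);
}

// Sample value as a browser might submit it from a textarea.
$content = "line one\r\nline two\rline three";
$clean = normalize_newlines($content);

// Show the bytes so the removed carriage returns (0d) are visible.
echo bin2hex($content), "\n";
echo bin2hex($clean), "\n";
```

The cleaned string could then be passed on to utf8_encode() and create_text_node() as in the original snippet. This does not explain the encoded umlauts.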

Regards,
Frank
Jul 17 '05 #3
