HTML text area -> XML

Hi,

I have a strange problem and don't know where to look for the bug. I
have an HTML form with a text area.
Very roughly, I do something like this (PHP 4.2.3; I can't upgrade the
version):

------
$DomDocument = xmldoc('<'.'?xml version=\'1.0\'
encoding=\'ISO-8859-1\'?'.'><pages/>');

$root = $DomDocument->root();

$element = $DomDocument->create_text_node(utf8_encode($content));
// $content is the text area value
$root->add_child($element);

// assumption: $handle1 is an open file handle for "file.xml"
fwrite($handle1, $DomDocument->dumpmem());
------

This has worked in every case so far. In particular, converting the strings to
UTF-8 and then having a good ISO... file after the dump has always worked,
until now, with my text area:

The file contains a lot of entities. Even the returns (line breaks) are displayed as entities.
But if I write $content directly into a file, everything is fine.
In the HTML form the character set is also set to ISO-8859-1.

Where do the entities come from, and how can I avoid them?

Regards,
Frank
Jul 17 '05 #1
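
One place such entities can come from, sketched with the PHP 5+ DOM extension rather than the PHP 4 DOM XML functions used above (so treat it as an illustration, not a diagnosis of the 4.2.3 behaviour): when no output encoding is in effect, libxml2 writes non-ASCII characters as numeric character references, and a character that the declared encoding cannot represent is written the same way. The helper name `dump_text` is made up for this example.

```php
<?php
// Sketch only: PHP 5+ DOM extension, not the PHP 4 DOM XML API from the post.
// dump_text() is a hypothetical helper, not part of any library.
function dump_text(string $utf8Text, ?string $encoding): string
{
    $doc = new DOMDocument('1.0');
    if ($encoding !== null) {
        // Declares the output encoding used when the document is serialized.
        $doc->encoding = $encoding;
    }
    $root = $doc->appendChild($doc->createElement('pages'));
    // DOM text is always handed over as UTF-8, whatever the output encoding is.
    $root->appendChild($doc->createTextNode($utf8Text));
    return $doc->saveXML();
}

$umlaut = "\xC3\xA4";     // "ä" as UTF-8 bytes
$euro   = "\xE2\x82\xAC"; // euro sign, which ISO-8859-1 cannot represent

echo dump_text($umlaut, null);         // no encoding: written as a character reference
echo dump_text($umlaut, 'ISO-8859-1'); // written as the single byte 0xE4
echo dump_text($euro, 'ISO-8859-1');   // not representable: character reference again
```

If the umlauts in the dumped file appear as references, it may be worth checking whether the encoding declared in the XML prolog is really the one used at dump time.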
If you want to see a better method of dealing with multi-byte characters in
an XML file then take a look at
http://www.tonymarston.co.uk/php-mys...tml#multi-byte

You should also be aware that an XML file is only supposed to contain data,
not any HTML tags, which is why they are converted into entities.

When using an XML file in an XSL transformation, all the HTML tags are
supposed to be generated by the XSL stylesheet and not carried across from the
XML data.

Tony Marston
http://www.tonymarston.net/

Jul 17 '05 #2
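
The point about tags being converted can be illustrated with a minimal sketch. It assumes the PHP 5+ DOM extension (the thread itself uses the older PHP 4 DOM XML functions), and `text_to_xml` is a made-up helper name. Anything placed in a text node is character data, so markup characters in it are escaped when the document is serialized.

```php
<?php
// Sketch only: PHP 5+ DOM extension, not the PHP 4 DOM XML API from the thread.
// text_to_xml() is a hypothetical helper used for illustration.
function text_to_xml(string $text): string
{
    $doc = new DOMDocument('1.0', 'ISO-8859-1');
    $root = $doc->appendChild($doc->createElement('pages'));
    // A text node holds plain character data, so "<", ">" and "&"
    // are written out as &lt;, &gt; and &amp;.
    $root->appendChild($doc->createTextNode($text));
    return $doc->saveXML();
}

echo text_to_xml('<b>bold</b> & more');
```

This escaping only concerns `<`, `>` and `&`; it does not by itself explain escaped umlauts or line breaks.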
Hi,

thanks for your reply!
> If you want to see a better method of dealing with multi-byte characters in an XML file then take a look at
> http://www.tonymarston.co.uk/php-mys...tml#multi-byte

Hm, OK, so does mb_convert_encoding($value,'UTF-8','ISO-8859-1')
work differently from utf8_encode($value)?

> You should also be aware that an XML file is only supposed to contain data, not any HTML tags, which is why they are converted into entities.

That's clear, but why are the German umlauts (ä, ö) encoded, and also the
returns? That's my main problem.

> When using an XML file in an XSL transformation all the HTML tags are
> supposed to be generated by the XSL stylesheet and not ported across from the XML data.

But I'm not using any XSL transformation. It's very strange. It seems to
have something to do with the data coming from the HTML form. :-/

Regards,
Frank
Jul 17 '05 #3
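
On the returns: browsers submit line breaks in a text area as CR LF, and libxml2 writes a carriage return inside a text node as a numeric character reference (typically `&#13;`) so that it survives XML line-end normalization. Writing `$content` straight to a file skips that step, which would explain why the plain file looks fine. A minimal sketch, assuming the PHP 5+ DOM extension rather than the PHP 4 DOM XML functions from the thread; `textarea_to_xml` is a made-up helper name.

```php
<?php
// Sketch only: PHP 5+ DOM extension, not the PHP 4 DOM XML API from the thread.
// textarea_to_xml() is a hypothetical helper used for illustration.
function textarea_to_xml(string $content, bool $normalize): string
{
    if ($normalize) {
        // Collapse CR LF pairs and lone CRs to LF before building the node.
        $content = str_replace(["\r\n", "\r"], "\n", $content);
    }
    $doc = new DOMDocument('1.0', 'ISO-8859-1');
    $root = $doc->appendChild($doc->createElement('pages'));
    $root->appendChild($doc->createTextNode($content));
    return $doc->saveXML();
}

// Simulated form value with the CR LF line break a browser would send.
$submitted = "line one\r\nline two";

echo textarea_to_xml($submitted, false); // the CR comes out as a character reference
echo textarea_to_xml($submitted, true);  // plain line break, no reference
```

Normalizing the line breaks before creating the text node is usually enough, since an XML parser would turn CR LF into LF on reading anyway.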
