[DOM XML] How to encode node content (accented characters)

Hello,

I'm trying to generate an RSS newsfeed using the DOM XML functions. However, I
can't find a way to use accented characters. I even tried to specify a
character encoding, but that doesn't solve the problem.

The error I get:
XML Parsing Error: not well-formed
Location: news.php?action=syndicate&format=rss2
Line Number 1, Column 102:<?xml version="1.0" encoding="ISO-8859-15"?><rss
version="2.0"><channel><title>Website title - News (
----------------------------------------------------------------------------
-----------------------------------------------------------^

The character reported as not well-formed is an accented character, as you can
see in the following code sample:

<?php

header ('content-type: text/xml');

echo ('<?xml version="1.0" encoding="ISO-8859-15"?>');

$dom_doc = domxml_new_doc ('1.0');

$rss_el = $dom_doc->create_element ('rss');
$rss_el->set_attribute ('version', '2.0');
$rss_el = $dom_doc->append_child ($rss_el);

$channel_el = $dom_doc->create_element ('channel');
$channel_el = $rss_el->append_child ($channel_el);

$title_el = $dom_doc->create_element ('title');
$title_el = $channel_el->append_child ($title_el);
$title_el->set_content ('Website title - News (àéèù)');

echo ($dom_doc->html_dump_mem ());

?>

Removing the accented characters from the title generates a well-formed
XML file. Note that I also tried to encode the characters using the
htmlentities function, but it didn't change anything.

--
Jean-Marc.

Jul 17 '05 #1
Jean-Marc Molina <jm******@pasdepourriel-free.fr> wrote:
> Error I get :
> XML Parsing Error: not well-formed
> Location: news.php?action=syndicate&format=rss2
> Line Number 1, Column 102:<?xml version="1.0" encoding="ISO-8859-15"?><rss
> version="2.0"><channel><title>Website title - News (
> ----------------------------------------------------------------------------
> -----------------------------------------------------------^
>
> The not well-formed character is an accentuated character, as you can see on
> the following code sample :
> ....
> $title_el->set_content ('Website title - News (àéèù)');
> ....
> Removing the accentuated characters from the title generates a well formed
> XML file. Note that I also tried to encode the characters using the
> htmlentities function but it didn't change anything.


Disclaimer: I have never used the DOM XML functions in PHP.

If you are trying to create well-formed XML you should use UTF-8 as the
encoding, since UTF-8 and UTF-16 are the only encodings that every XML
parser must support. That is also your problem: your XML document is
treated as UTF-8, since you haven't told the library otherwise (and I can't
find how to do that in the domxml reference).

htmlentities doesn't work either, because XML predefines only 5 entities:
&amp; &apos; &quot; &gt; &lt;

IMHO the best solution would be to convert your string to UTF-8:
http://nl.php.net/manual/en/function...t-encoding.php

(Be sure to tell it the source is ISO-8859-15, or else the EUR symbol will
be converted to the generic currency symbol.)

Or use http://nl.php.net/manual/en/function.utf8-encode.php after
manually encoding EUR.

--

Daniel Tryba

Jul 17 '05 #2
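
A minimal sketch of the conversion suggested above, assuming the mbstring extension is loaded. The sample string and variable names are made up for illustration, and the bytes are written as hex escapes so the script file itself stays pure ASCII. It shows why the source encoding matters: byte 0xA4 is the euro sign in ISO-8859-15 but the generic currency sign in ISO-8859-1.

```php
<?php
// Hypothetical sample: "Prix : 10 € (àéèù)" stored as ISO-8859-15 bytes.
$latin9 = "Prix : 10 \xA4 (\xE0\xE9\xE8\xF9)";

// The raw bytes are not valid UTF-8, which is why a UTF-8 parser rejects them.
$isValidUtf8 = mb_check_encoding($latin9, 'UTF-8');

// Correct source encoding: 0xA4 becomes the euro sign (U+20AC).
$fromLatin9 = mb_convert_encoding($latin9, 'UTF-8', 'ISO-8859-15');

// Wrong source encoding: 0xA4 becomes the generic currency sign (U+00A4).
$fromLatin1 = mb_convert_encoding($latin9, 'UTF-8', 'ISO-8859-1');
```

Only the euro sign differs between the two results here; the four accented letters convert identically from either source encoding.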
Daniel Tryba wrote:
> htmlenties doesn't work either because there are only 5 xmlentities by
> default: &amp; &apos; &quot; &gt; &lt;

I didn't know that. I solved my problem by calling htmlentities twice, to
remove the junk :).

> IMHO the best solution would be to translate you string to UTF8:
> http://nl.php.net/manual/en/function...t-encoding.php

That's another solution, and a better one, thanks!

<?php

....
$title = mb_convert_encoding (stripslashes ($newsitem ['title']), 'UTF-8',
'ISO-8859-15');
....

?>

This converts the title from ISO-8859-15 to UTF-8.

--
Jean-Marc.

Jul 17 '05 #3
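
The same idea end to end, sketched with the DOM extension that ships with PHP 5 and later instead of the PHP 4 domxml functions used in this thread (that swap is an assumption of this sketch, not something the posters used). The title is converted to UTF-8 before it reaches the DOM, and the document declares UTF-8 itself, so no hand-written XML declaration has to be echoed. Variable names are illustrative.

```php
<?php
// Title as it might come from an ISO-8859-15 data source (bytes as hex escapes).
$latin9Title = "Website title - News (\xE0\xE9\xE8\xF9)";
$title = mb_convert_encoding($latin9Title, 'UTF-8', 'ISO-8859-15');

// Let the document carry its own version and encoding declaration.
$doc = new DOMDocument('1.0', 'UTF-8');

$rss = $doc->appendChild($doc->createElement('rss'));
$rss->setAttribute('version', '2.0');

$channel = $rss->appendChild($doc->createElement('channel'));
$titleEl = $channel->appendChild($doc->createElement('title'));

// A text node escapes &, < and > automatically; the text must be UTF-8.
$titleEl->appendChild($doc->createTextNode($title));

// Serialized XML, including the <?xml ... encoding="UTF-8"?> declaration.
$xml = $doc->saveXML();
```

When serving the result, a header such as `Content-Type: text/xml; charset=utf-8` keeps the HTTP layer consistent with the declaration inside the document.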
Daniel Tryba wrote:
> If you are trying to create wellformed XML you should use UTF8 as the
> encoding, since that is about the only encoding that XML utils must
> support. That is also your problem, your xml doc. is in UTF8 since you
> haven't told anyone otherwise (and I can't find how to do that in the
> domxml reference).

Do you know of any good editor with UTF-8 support? I develop using HTML-Kit
and jEdit, but they don't support UTF-8 editing. Also, I'm not sure
application servers can handle UTF-8 scripts. Can they?

Thanks again for all your help.

--
Jean-Marc.

Jul 17 '05 #4
Jean-Marc Molina <jm******@pasdepourriel-free.fr> wrote:
> Do you know of any good UTF-8 coding editor ?

I never use anything other than ASCII :)

> I develop using HTML-Kit and
> jEdit but they don't support UTF-8 editing. However I'm not sure application
> servers can handle UTF-8 scripts. Can they ?

I don't know about HTML-Kit. jEdit is written in Java, which uses 16-bit
Unicode (UCS-2/UTF-16) internally, and UTF-8 encodes the same character set,
so loading and saving UTF-8 data should not be a problem IMHO.

--

Daniel Tryba

Jul 17 '05 #5
Jean-Marc Molina <jm******@pasdepourriel-free.fr> wrote:
> However I'm not sure application servers can handle UTF-8 scripts. Can
> they ?

I forgot to answer this one.

http://nl.php.net/mb-string should make it possible to use various
encodings. mbstring.internal_encoding and mbstring.script_encoding
should be the key settings...

--

Daniel Tryba

Jul 17 '05 #6
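
A small sketch of what the internal encoding changes in practice, assuming mbstring is loaded. It sets the encoding at runtime with mb_internal_encoding() rather than through the ini settings named above, and the sample string is invented for illustration.

```php
<?php
// Tell mbstring to treat strings as UTF-8 unless an encoding is passed explicitly.
mb_internal_encoding('UTF-8');

// "àéèù" as UTF-8 bytes: each character takes two bytes.
$utf8 = "\xC3\xA0\xC3\xA9\xC3\xA8\xC3\xB9";

$byteCount = strlen($utf8);      // counts bytes
$charCount = mb_strlen($utf8);   // counts characters using the internal encoding
$upper     = mb_strtoupper($utf8); // case mapping that is aware of the encoding
```

The byte-oriented functions keep working on UTF-8 data, but anything that counts, cuts, or changes the case of characters needs the mb_* variant.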
Daniel Tryba <ne****************@canopus.nl> wrote:
> htmlenties doesn't work either because there are only 5 xmlentities by
> default: &amp; &apos; &quot; &gt; &lt;

However, you can still use numeric character references: &#...;

--
Simon Stienen <http://dangerouscat.net> <http://slashlife.de>
»What you do in this world is a matter of no consequence,
The question is, what can you make people believe that you have done.«
-- Sherlock Holmes in "A Study in Scarlet" by Sir Arthur Conan Doyle
Jul 17 '05 #7
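
A hedged sketch contrasting the two kinds of entities discussed in this thread, assuming the mbstring and DOM extensions are available (the DOM extension is the PHP 5+ one, not domxml). The named entities produced by htmlentities() are not defined in plain XML, while numeric character references are always allowed. The sample title and variable names are illustrative.

```php
<?php
// "News (àéèù)" as UTF-8 bytes.
$title = "News (\xC3\xA0\xC3\xA9\xC3\xA8\xC3\xB9)";

// HTML named entities: &agrave; and friends, which XML does not predefine.
$named = htmlentities($title, ENT_QUOTES, 'UTF-8');

// Numeric character references for every code point from U+0080 upward.
// The map is [first code point, last code point, offset, mask].
$numeric = mb_encode_numericentity($title, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');

// Collect parser errors quietly instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$namedDoc = new DOMDocument();
$namedParses = $namedDoc->loadXML("<title>$named</title>");

$numericDoc = new DOMDocument();
$numericParses = $numericDoc->loadXML("<title>$numeric</title>");

libxml_clear_errors();
```

Because the numeric form is pure ASCII, it survives whatever encoding the document ends up declaring, at the cost of a less readable source.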
