I must be missing something.
I am using org.apache.xml.serialize.XMLSerializer to save a DOM, but
non-ASCII characters are not being written out as UTF-8.
I create Text nodes in the DOM by, for example:
Document doc;
JTextArea textPrompt;
Text newTextNode;
Element descElt;
....
newTextNode = doc.createTextNode(textPrompt.getText());
descElt.appendChild(newTextNode);
The code to serialize the DOM is:
private void saveXml(Document document)
{
    // rename the existing layout file
    new File(fileName).renameTo(new File(fileName + "~"));
    // write the document out
    OutputFormat format = new OutputFormat(document);
    format.setIndenting(true);
    format.setLineWidth(0);
    format.setPreserveSpace(true);
    try {
        XMLSerializer serializer;
        serializer = new XMLSerializer(
            new FileWriter(fileName),
            format);
        serializer.asDOMSerializer();
        serializer.serialize(document);
    }
    catch (IOException ioe)
    {
        ....
    }
}
If I enter a character such as é (e with acute accent) into the JTextArea
and then look at the XML file in an editor that is not UTF-8-aware, I see
that the é has been written as a single byte, not as the two-byte UTF-8
sequence. If I subsequently try to read the XML file using Xerces, it fails
because that byte is not a valid UTF-8 sequence.
How do I get a valid serialization of this DOM into XML using UTF-8?
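To make the symptom concrete, here is a minimal sketch using only the JDK. It assumes the cause is that FileWriter(String) encodes with the platform default charset, and that this default is a single-byte encoding such as ISO-8859-1 on the machine in question; the post does not confirm either point. The class name EncodingSketch and the helper encode are hypothetical names used only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSketch {
    // Write the text through a Writer bound to an explicit charset
    // and return the raw bytes that would end up in the file.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } catch (IOException ioe) {
            throw new UncheckedIOException(ioe);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // e with acute accent

        // Assumed stand-in for a single-byte platform default encoding.
        System.out.println(encode(eAcute, StandardCharsets.ISO_8859_1).length); // prints 1

        // What a file declared as UTF-8 is expected to contain.
        System.out.println(encode(eAcute, StandardCharsets.UTF_8).length); // prints 2
    }
}
```

If that assumption holds, passing new OutputStreamWriter(new FileOutputStream(fileName), StandardCharsets.UTF_8) to the serializer in place of the FileWriter may be worth trying, so that the bytes on disk match the encoding the XML declaration announces.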
--
Jim Cobban jc*****@magma.ca
34 Palomino Dr.
Kanata, ON, CANADA
K2M 1M1
+1-613-592-9438