
org.apache.xml.serialize.XMLSerializer problem with UTF-8

I must be missing something.

I am using org.apache.xml.serialize.XMLSerializer to save a DOM, but
non-ASCII characters are not being written out as UTF-8.

I create Text nodes in the DOM like this, for example:

Document doc;
JTextArea textPrompt;
Text newTextNode;
Element descElt;
....
newTextNode = doc.createTextNode(textPrompt.getText());
descElt.appendChild(newTextNode);

The code to serialize the DOM is:

private void saveXml(Document document)
{
    // rename the existing layout file
    new File(fileName).renameTo(new File(fileName + "~"));
    // write the document out
    OutputFormat format = new OutputFormat(document);
    format.setIndenting(true);
    format.setLineWidth(0);
    format.setPreserveSpace(true);
    try {
        XMLSerializer serializer;
        serializer = new XMLSerializer(
            new FileWriter(fileName),
            format);
        serializer.asDOMSerializer();
        serializer.serialize(document);
    }
    catch (IOException ioe)
    {
        ....
    }
}

If I enter a character such as é (e with acute accent) into the JTextArea
and I look at the XML file using a non-UTF-8-aware editor, I see that the é
has been written as a single byte, not as the 2-byte UTF-8 sequence. If I
subsequently try to read the XML file using Xerces, it blows up because of
the invalid byte sequence.
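
The symptom can be reproduced without any XML at all. The following is a minimal sketch, assuming the platform default encoding is a single-byte one such as ISO-8859-1 (the post does not say which encoding was in effect). The class name AccentBytesDemo and the helper isValidUtf8 are made up for illustration. It shows that the accented character becomes one byte in ISO-8859-1 and two bytes in UTF-8, and that the lone byte is not well-formed UTF-8, which is why a parser expecting UTF-8 rejects the file.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original post.
final class AccentBytesDemo {

    // Returns true if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // e with acute accent

        // Assumption: the platform default behaves like ISO-8859-1.
        byte[] latin1 = eAcute.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = eAcute.getBytes(StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1 bytes: " + latin1.length
                + ", well-formed UTF-8: " + isValidUtf8(latin1));
        System.out.println("UTF-8 bytes: " + utf8.length
                + ", well-formed UTF-8: " + isValidUtf8(utf8));
    }
}
```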

How do I get a valid serialization of this DOM into XML using UTF-8?
--
Jim Cobban jc*****@magma.ca
34 Palomino Dr.
Kanata, ON, CANADA
K2M 1M1
+1-613-592-9438
Jul 20 '05 #1
Jim Cobban wrote:

> I must be missing something.
>
> XMLSerializer serializer;
> serializer = new XMLSerializer (
> new FileWriter(fileName),
> format);
> serializer.asDOMSerializer();
>
> If I enter a character such as é (e with acute accent) into the JTextArea
> and I look at the XML file using a non-UTF-8-aware editor, I see that the é
> has been written as a single byte, not as the 2-byte UTF-8 sequence. If I
> subsequently try to read the XML file using Xerces, it blows up because of
> the invalid byte sequence.
>
> How do I get a valid serialization of this DOM into XML using UTF-8?


As far as I know, it is the Writer that is responsible for the encoding.

From the FileWriter API doc:

    public class FileWriter
    extends OutputStreamWriter

    Convenience class for writing character files. The constructors of this
    class assume that the default character encoding and the default
    byte-buffer size are acceptable. To specify these values yourself,
    construct an OutputStreamWriter on a FileOutputStream.
- try that.
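
What the API doc suggests could look roughly like the sketch below. It uses only JDK classes; the class name Utf8WriterDemo and its helper methods are hypothetical, and the temporary file stands in for the real layout file. The point is that the encoding is named explicitly when the Writer is built, so the platform default no longer matters.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical demo class, not part of the original post.
final class Utf8WriterDemo {

    // Writes text to the file as UTF-8, whatever the platform default is.
    static void writeUtf8(File file, String text) {
        try (Writer out = new OutputStreamWriter(
                new FileOutputStream(file), StandardCharsets.UTF_8)) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes text to a temporary file and returns the raw bytes on disk.
    static byte[] writeAndReadBack(String text) {
        try {
            File file = File.createTempFile("utf8-demo", ".txt");
            try {
                writeUtf8(file, text);
                return Files.readAllBytes(file.toPath());
            } finally {
                file.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = writeAndReadBack("\u00e9"); // e with acute accent
        System.out.println("bytes written: " + bytes.length);
    }
}
```

A Writer built this way can be handed to any API that accepts a java.io.Writer in place of the FileWriter.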

Soren

--
Fjern de 4 bogstaver i min mailadresse som er indsat for at hindre s...
Remove the 4 letter word meaning "junk mail" in my mail address.

Jul 20 '05 #2

"Soren Kuula" <do**********@bitplanet.net> wrote in message
news:5K*********************@news000.worldonline.dk...

> As far as I know it is the Writer responsible for the encoding.
>
> From FileWriter API doc:
>
> public class FileWriter
> extends OutputStreamWriter
>
> Convenience class for writing character files. The constructors of this
> class assume that the default character encoding and the default
> byte-buffer size are acceptable. To specify these values yourself,
> construct an OutputStreamWriter on a FileOutputStream.


Thank you.

The problem was that I copied the code from one of the examples that came
with Xerces. It was that example which constructed the default FileWriter.
Since there is a version of the XMLSerializer constructor which takes an
OutputStream and internally constructs a Writer with the correct "utf-8"
encoding, that is the form of the constructor which I needed to use. I
should have read the documentation in more detail rather than trusting that
the example had been written correctly.
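
The same idea, handing the serializer an OutputStream so that it picks the encoding itself, can be sketched with JDK classes alone. Note the swap: this example uses javax.xml.transform rather than the Xerces XMLSerializer, so that it runs without extra jars. The class name DomUtf8RoundTrip and the element name "description" are made up for illustration. It builds a small DOM, serializes it as UTF-8 to a byte stream (a FileOutputStream would work the same way), and parses the result back to check that the accented text survives.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; uses the JDK Transformer, not XMLSerializer.
final class DomUtf8RoundTrip {

    // Serializes the document to bytes, letting the serializer do the encoding.
    static byte[] serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            // A StreamResult over an OutputStream (not a Writer) leaves the
            // character-to-byte conversion to the serializer.
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Puts text into a one-element DOM, serializes it, parses it back,
    // and returns the text found in the parsed document.
    static String roundTrip(String text) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element desc = doc.createElement("description"); // hypothetical name
            desc.appendChild(doc.createTextNode(text));
            doc.appendChild(desc);

            byte[] xml = serialize(doc);
            Document parsed = builder.parse(new ByteArrayInputStream(xml));
            return parsed.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "caf\u00e9";
        System.out.println("round trip preserved text: "
                + original.equals(roundTrip(original)));
    }
}
```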
Jul 20 '05 #3
