org.apache.xml.serialize.XMLSerializer problem with UTF-8

I must be missing something.

I am using org.apache.xml.serialize.XMLSerializer to save a DOM but I am not
getting non-basic characters converted to UTF-8.

I create Text nodes in the DOM by, for example:

Document doc;
JTextArea textPrompt;
Text newTextNode;
Element descElt;
....
newTextNode = doc.createTextNode(textPrompt.getText());
descElt.appendChild(newTextNode);

The code to serialize the DOM is:

private void saveXml(Document document)
{
    // rename the existing layout file
    new File(fileName).renameTo(new File(fileName + "~"));
    // write the document out
    OutputFormat format = new OutputFormat(document);
    format.setIndenting(true);
    format.setLineWidth(0);
    format.setPreserveSpace(true);
    try {
        XMLSerializer serializer;
        serializer = new XMLSerializer(
            new FileWriter(fileName),
            format);
        serializer.asDOMSerializer();
        serializer.serialize(document);
    }
    catch (IOException ioe)
    {
        ....
    }
}

If I enter a character such as é (e with acute accent) into the JTextArea
and look at the XML file using a non-UTF-8-aware editor, I see that the é
has been written as a single byte, not as the two-byte UTF-8 sequence.
If I subsequently try to read the XML file using Xerces, it blows up
because of the invalid byte sequence.
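
A minimal sketch of the symptom described above, assuming the platform default encoding is a single-byte one such as ISO-8859-1 (the post does not say which). The class and method names are hypothetical; the sketch only compares how the same character is encoded under the two charsets.

```java
import java.nio.charset.StandardCharsets;

class AccentBytes {
    /** Formats bytes as space-separated upper-case hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9"; // e with acute accent

        // One byte in a single-byte charset (assumed here to be the default).
        System.out.println("ISO-8859-1: " + hex(eAcute.getBytes(StandardCharsets.ISO_8859_1)));

        // Two bytes in UTF-8, which is what an XML parser expects
        // when the document declares encoding="UTF-8".
        System.out.println("UTF-8:      " + hex(eAcute.getBytes(StandardCharsets.UTF_8)));
    }
}
```

A lone E9 byte is not a valid UTF-8 sequence, which would explain why a parser that trusts the UTF-8 declaration rejects the file.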

How do I get a valid serialization of this DOM into XML using UTF-8?
--
Jim Cobban jc*****@magma.ca
34 Palomino Dr.
Kanata, ON, CANADA
K2M 1M1
+1-613-592-9438
Jul 20 '05 #1
Jim Cobban wrote:
I must be missing something.
XMLSerializer serializer;
serializer = new XMLSerializer(
    new FileWriter(fileName),
    format);
serializer.asDOMSerializer();
If I enter a character such as é (e with acute accent) into the JTextArea
and look at the XML file using a non-UTF-8-aware editor, I see that the é
has been written as a single byte, not as the two-byte UTF-8 sequence.
If I subsequently try to read the XML file using Xerces, it blows up
because of the invalid byte sequence.

How do I get a valid serialization of this DOM into XML using UTF-8?


As far as I know, it is the Writer that is responsible for the encoding.

From FileWriter API doc:

public class FileWriter
extends OutputStreamWriter

Convenience class for writing character files. The constructors of this
class assume that the default character encoding and the default
byte-buffer size are acceptable. To specify these values yourself,
construct an OutputStreamWriter on a FileOutputStream.
- try that.
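
A minimal sketch of that suggestion, with hypothetical class and method names and a temporary file standing in for the real one. It wraps a FileOutputStream in an OutputStreamWriter with an explicit UTF-8 charset, so the platform default no longer matters.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class Utf8WriterSketch {
    /** Opens a Writer that always encodes as UTF-8, whatever the platform default is. */
    static Writer openUtf8Writer(File file) throws IOException {
        return new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("utf8-writer", ".xml");
        file.deleteOnExit();

        // try-with-resources flushes and closes the writer.
        try (Writer out = openUtf8Writer(file)) {
            out.write("<desc>\u00E9</desc>");
        }

        // The accented character occupies two bytes on disk.
        byte[] bytes = Files.readAllBytes(file.toPath());
        System.out.println(bytes.length + " bytes written");
    }
}
```

A Writer built this way could be passed to the XMLSerializer constructor from the question in place of the FileWriter.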

Soren

--
Remove the 4 letter word meaning "junk mail" in my mail address.

Jul 20 '05 #2

"Soren Kuula" <do**********@bitplanet.net> wrote in message
news:5K*********************@news000.worldonline.dk...

As far as I know, it is the Writer that is responsible for the encoding.

From FileWriter API doc:

public class FileWriter
extends OutputStreamWriter

Convenience class for writing character files. The constructors of this
class assume that the default character encoding and the default
byte-buffer size are acceptable. To specify these values yourself,
construct an OutputStreamWriter on a FileOutputStream.


Thank you.

The problem was that I copied the code from one of the examples that came
with Xerces. It was that example which constructed the default FileWriter.
Since there is a version of the XMLSerializer constructor which takes an
OutputStream and internally constructs a Writer with the correct "utf-8"
encoding, that is the form of the constructor which I needed to use. I
should have read the documentation in more detail rather than trusting that
the example had been written correctly.
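
The idea behind that fix is to hand the serializer a byte stream and let it choose the encoder itself. The sketch below shows the same idea with the JDK's javax.xml.transform API rather than the Xerces XMLSerializer, so that it runs without Xerces on the classpath; it is a swapped-in technique, not the code from the thread, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class StreamSerializeSketch {
    /** Serializes the DOM to a byte stream; the serializer, not a Writer, picks the encoder. */
    static void save(Document document, OutputStream out) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.transform(new DOMSource(document), new StreamResult(out));
    }

    /** Builds a tiny document holding one accented character (illustrative data only). */
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element desc = doc.createElement("desc");
        desc.appendChild(doc.createTextNode("\u00E9"));
        doc.appendChild(desc);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        save(sampleDocument(), buffer);
        // Decode with the same charset the serializer was asked to use.
        System.out.println(buffer.toString("UTF-8"));
    }
}
```

With Xerces itself, the equivalent change would be to pass a FileOutputStream, rather than a FileWriter, to the XMLSerializer constructor.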
Jul 20 '05 #3
