Error while parsing local languages using SAX/DOM parser.

Hi,
I am facing a problem while parsing local language characters with a
SAX parser. We use DOM to parse and SAX to read the source. When our
application parses strings in a local language, especially Czech,
Polish or Turkish, something else comes out in place of the local
language character.

Eg:
Input string : ahoj, jak se máš
Output string: ahoj, jak se m&aacute;š
OS: Solaris.

We persist this XML in the database. The issue did not occur when we
used the IBM parser on NT. The local language character is getting
replaced by "&aacute;", which causes a problem when we translate it
back. Can anyone please help me?

Stack Trace

class org.xml.sax.SAXException message = Parser reported fatal error while parsing : Input Source/DTD
Stack Trace:
org.xml.sax.SAXParseException: The entity "aacute" was referenced, but not declared.
    at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
    at org.apache.xerces.impl.XMLScanner.scanAttributeValue(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanAttribute(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentScannerImpl$ContentDispatcher.scanRootElementHook(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
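
For anyone reproducing this: the error in the trace is what a conforming XML parser reports for any named entity that has no declaration. XML itself predefines only &lt;, &gt;, &amp;, &apos; and &quot;, while &aacute; comes from the HTML/XHTML DTDs. Below is a minimal sketch, assuming the JDK's built-in JAXP parser rather than the exact Xerces setup from the post. The class name EntityDemo, the helper parseError and the msg/text markup are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class EntityDemo {
    /** Returns null if the XML parses, otherwise the parser's error message. */
    static String parseError(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Rethrow parse problems instead of letting the default handler print them.
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            public void error(SAXParseException e) throws SAXException { throw e; }
            public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // Named HTML entity, no DTD: rejected as an undeclared entity.
        System.out.println(parseError("<msg text='m&aacute;&#353;'/>"));
        // Numeric character references need no declaration.
        System.out.println(parseError("<msg text='m&#225;&#353;'/>"));
        // Declaring the entity in an internal DTD subset also makes it legal.
        System.out.println(parseError(
                "<!DOCTYPE msg [<!ENTITY aacute \"&#225;\">]><msg text='m&aacute;'/>"));
    }
}
```

If this matches the failing case, the stored XML either has to carry a declaration for aacute or, more simply, should contain the character itself or a numeric reference such as &#225;.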

Thanks,
Sidhartha
Sep 15 '08 #1
It is rather odd that you get an XHTML entity reference '&aacute;' in
your XML. I am not sure why that happens. Are you using XSLT, for
instance, to serialize the XML?
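
To make the serialization question concrete, here is a rough sketch of writing a DOM out with the output method set to "xml" and an explicit UTF-8 encoding, then reading it back. It assumes the JDK's default JAXP Transformer, not whatever serializer the application actually uses. The class name RoundTripDemo and the msg/text markup are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class RoundTripDemo {
    /** Builds <msg text="..."/> and serializes it as UTF-8 XML bytes. */
    static byte[] serialize(String text) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element msg = doc.createElement("msg");
        msg.setAttribute("text", text);
        doc.appendChild(msg);

        Transformer t = TransformerFactory.newInstance().newTransformer();
        // The "xml" method never emits HTML named entities such as &aacute;.
        t.setOutputProperty(OutputKeys.METHOD, "xml");
        t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    /** Parses the bytes again and returns the attribute value. */
    static String readBack(byte[] xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(new ByteArrayInputStream(xml));
        return doc.getDocumentElement().getAttribute("text");
    }

    public static void main(String[] args) throws Exception {
        // Unicode escapes keep this source file independent of its own encoding.
        String input = "ahoj, jak se m\u00e1\u0161";
        System.out.println(readBack(serialize(input)).equals(input));
    }
}
```

Passing bytes (or an InputStream) to the parser lets it honour the encoding named in the XML declaration. If the output method were "html" instead, a serializer may write named entities like &aacute;, which would be one possible source of the problem described.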

--

Martin Honnen
http://JavaScript.FAQTs.com/
Sep 15 '08 #2
