
Error while parsing local languages using SAX/DOM parser.

  #1  
September 15th, 2008, 07:55 AM
Sidhartha
Hi,
I am facing a problem while parsing local-language characters with a
SAX parser. We use DOM to parse and SAX to read the source. When our
application parses strings in local languages, especially Czech,
Polish, and Turkish, the local-language characters are replaced by
other text.

E.g.:
Input string: ahoj, jak se máš
Output string: ahoj, jak se m&aacute;š
OS: Solaris.

We persist this XML in the database. This issue did not occur when we
used the IBM parser on NT. The local-language character is getting
replaced by "&aacute;". This causes a problem when we translate it
back. Can anyone please help me?

Stack Trace

class org.xml.sax.SAXException message = Parser reported fatal error while parsing : Input Source/DTD
Stack Trace:
org.xml.sax.SAXParseException: The entity "aacute" was referenced, but not declared.
    at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
    at org.apache.xerces.impl.XMLScanner.scanAttributeValue(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanAttribute(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentScannerImpl$ContentDispatcher.scanRootElementHook(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)

Thanks,
Sidhartha
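
For reference, the failure in the stack trace above can be reproduced in isolation. The sketch below is an illustration under stated assumptions, not the application's actual code: the class `EntityDemo`, the `msg` element, and its `text` attribute are hypothetical names, and only the standard JAXP `DocumentBuilder` is used. XML predefines just five named entities (`&lt;`, `&gt;`, `&amp;`, `&apos;`, `&quot;`), so `&aacute;` is rejected unless a DTD declares it, whereas a numeric character reference such as `&#225;` needs no declaration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; names are illustrative only.
public class EntityDemo {

    /**
     * Parses the XML string and returns the root element's "text" attribute,
     * or null when the parser reports the input as not well-formed.
     */
    public static String readTextAttribute(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("text");
        } catch (SAXException e) {
            // Includes SAXParseException, e.g. an undeclared entity reference.
            return null;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Named HTML entity in an attribute value: not declared in plain XML.
        String named = "<msg text=\"ahoj, jak se m&aacute;\u0161\"/>";
        // Numeric character references for U+00E1 and U+0161.
        String numeric = "<msg text=\"ahoj, jak se m&#225;&#353;\"/>";

        System.out.println("named entity parsed:   " + (readTextAttribute(named) != null));
        System.out.println("numeric refs parsed:   " + (readTextAttribute(numeric) != null));
    }
}
```

If the stored XML has to keep the named entity, the document would need a DTD that declares it; otherwise writing the character itself or a numeric reference avoids the fatal error.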


  #2  
September 15th, 2008, 11:55 AM
Martin Honnen

re: Error while parsing local languages using SAX/DOM parser.


Sidhartha wrote:
Quote:
Hi,
I am facing a problem while parsing local-language characters with a
SAX parser. We use DOM to parse and SAX to read the source. When our
application parses strings in local languages, especially Czech,
Polish, and Turkish, the local-language characters are replaced by
other text.

E.g.:
Input string: ahoj, jak se máš
Output string: ahoj, jak se m&aacute;š
OS: Solaris.

We persist this XML in the database. This issue did not occur when we
used the IBM parser on NT. The local-language character is getting
replaced by "&aacute;". This causes a problem when we translate it
back. Can anyone please help me?

It is rather odd that you get an XHTML entity reference '&aacute;' in
your XML. I am not sure why that happens. Are you using XSLT, for
instance, to serialize the XML?
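
If a serializer is involved, the output method is one thing worth checking: the XSLT `html` output method is allowed to write characters as HTML entity references, while the `xml` method is not. The sketch below is only an illustration under assumptions, since the thread does not show how the XML is actually written: `SerializeDemo`, the `msg` element, and the `text` attribute are hypothetical names. It serializes a DOM with the standard JAXP identity `Transformer`, with the `xml` method and UTF-8 encoding set explicitly.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; names are illustrative only.
public class SerializeDemo {

    /** Builds <msg text="..."/> in memory and serializes it as XML. */
    public static String serialize(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element root = doc.createElement("msg");
            root.setAttribute("text", text);
            doc.appendChild(root);

            // Identity transformer: copies the DOM to the result unchanged.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Ask explicitly for XML output rather than HTML output.
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u00e1 and \u0161 are the two non-ASCII letters from the example string.
        System.out.println(serialize("ahoj, jak se m\u00e1\u0161"));
    }
}
```

If the stored text still contains `&aacute;` after a step like this, the entity was probably introduced earlier, before the data reached the serializer.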

--

Martin Honnen
http://JavaScript.FAQTs.com/