
No result when parsing HTML into XHTML

Hi,

I have a big problem with parsing HTML into XHTML using CyberNeko to validate the HTML.

First I tried working with an HTML file. This solution works fine:

    String aHTMLFile = "file:\\C:/work/Eclipse3.1.1/html-file.html";
    org.xml.sax.InputSource pSource = new InputSource(aHTMLFile);

    org.cyberneko.html.HTMLConfiguration htmlConfig = new HTMLConfiguration();
    org.apache.xerces.parsers.DOMParser parser = new DOMParser(htmlConfig);

    // configure the DOMParser
    parser.setProperty(DruckKonstanten.CYBERNEKO_PROP_ELEMS, "lower");
    parser.setProperty(DruckKonstanten.CYBERNEKO_PROP_ATTRIB, "lower");

    // parse and validate the HTML into XHTML
    parser.parse(pSource);

    // convert the XHTML DOM document into a JDOM document
    org.jdom.input.DOMBuilder builder = new org.jdom.input.DOMBuilder();
    org.jdom.Document document = builder.build(parser.getDocument());

    ..


But when I pass the HTML as a string via a StringReader instead of a file, the parser ignores the HTML. No error occurs, but the resulting XHTML is missing. It seems the parser swallows the input:
    ..

    String aHTMLStr = "<p>Example</p>";
    org.xml.sax.InputSource pSource = new InputSource(new StringReader(aHTMLStr));

    org.cyberneko.html.HTMLConfiguration htmlConfig = new HTMLConfiguration();
    org.apache.xerces.parsers.DOMParser parser = new DOMParser(htmlConfig);

    // configure the DOMParser
    parser.setProperty(DruckKonstanten.CYBERNEKO_PROP_ELEMS, "lower");
    parser.setProperty(DruckKonstanten.CYBERNEKO_PROP_ATTRIB, "lower");

    // parse and validate the HTML into XHTML
    parser.parse(pSource);

    // convert the XHTML DOM document into a JDOM document
    org.jdom.input.DOMBuilder builder = new org.jdom.input.DOMBuilder();
    org.jdom.Document document = builder.build(parser.getDocument());

    ..
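
One way to check whether the DOM is really empty before it is handed to JDOM would be to serialize it back to text and look at the root element. The sketch below is only an illustration under some assumptions: it uses the JDK's built-in XML parser as a stand-in for the CyberNeko-configured DOMParser (so the input has to be well-formed), and the names DomDump, parse and dump are made up for this example. In the real code, the same kind of dump could presumably be applied to parser.getDocument().

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of CyberNeko, Xerces or JDOM.
public class DomDump {

    // Parse markup held in a String (via StringReader, no file involved) into a DOM document.
    // Assumption: the JDK's default parser stands in for the CyberNeko-configured DOMParser.
    static Document parse(String markup) {
        try {
            InputSource source = new InputSource(new StringReader(markup));
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(source);
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    // Serialize a DOM document back to text so that its content can be inspected.
    static String dump(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("serializing failed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<p>Example</p>");
        // Print the root element name and the serialized document.
        System.out.println("root element: " + doc.getDocumentElement().getNodeName());
        System.out.println(dump(doc));
    }
}
```

If the dump of parser.getDocument() already shows no content, the problem would lie in the parsing step; if it shows the expected markup, the conversion to JDOM would be the next place to look.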

Can anybody give me a hint on how to solve this problem?
I have run out of ideas and of places to look. :-((((

Intel/Windows NT/Java 5

Thanks for your help in advance.
Sep 8 '06 #1