I have a big problem converting HTML into XHTML using CyberNeko to validate the HTML.
First I tried to work with an HTML file. This solution works fine:
- String aHTMLFile = "file:\\C:/work/Eclipse3.1.1/html-file.html";
- org.xml.sax.InputSource pSource = new InputSource(aHTMLFile);
- org.cyberneko.html.HTMLConfiguration htmlConfig = new HTMLConfiguration();
- org.apache.xerces.parsers.DOMParser parser = new DOMParser(htmlConfig);
- //setting DOMParser
- parser.setProperty(DruckKonstanten.CYBERNEKO_PROP_ELEMS, "lower");
- parser.setProperty(DruckKonstanten.CYBERNEKO_PROP_ATTRIB, "lower");
- //parse and validating HTML into XHTML
- parser.parse(pSource);
- //XHTML-Doc into JDOM-Document
- org.jdom.input.DOMBuilder builder = new org.jdom.input.DOMBuilder();
- org.jdom.Document document = builder.build(parser.getDocument());
- ..
But when I pass the HTML as a string via a StringReader instead of a file, the parser seems to ignore it. No error occurs, but the resulting XHTML is missing, as if the parser swallowed the input:
- ..
- String aHTMLStr = "<p>Example</p>";
- org.xml.sax.InputSource pSource = new InputSource(new StringReader(aHTMLStr));
- org.cyberneko.html.HTMLConfiguration htmlConfig = new HTMLConfiguration();
- org.apache.xerces.parsers.DOMParser parser = new DOMParser(htmlConfig);
- //setting DOMParser
- parser.setProperty(DruckKonstanten.CYBERNEKO_PROP_ELEMS, "lower");
- parser.setProperty(DruckKonstanten.CYBERNEKO_PROP_ATTRIB, "lower");
- //parse and validating HTML into XHTML
- parser.parse(pSource);
- //XHTML-Doc into JDOM-Document
- org.jdom.input.DOMBuilder builder = new org.jdom.input.DOMBuilder();
- org.jdom.Document document = builder.build(parser.getDocument());
- ..
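To see what the parser actually hands over before JDOM gets involved, the DOM could be dumped to a string. This is only a minimal sketch under assumptions: the JDK's built-in DocumentBuilder stands in for the NekoHTML DOMParser (so it needs well-formed input), and the names DomDump, dump and parse are hypothetical, not part of my code above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class DomDump {

    // Serializes a W3C DOM document to a string so the parser output can be inspected.
    static String dump(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Assumption: the JDK parser is a stand-in for the NekoHTML DOMParser.
    // It reads the markup from a StringReader, as in the failing snippet.
    static Document parse(String markup) throws Exception {
        InputSource source = new InputSource(new StringReader(markup));
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(source);
    }

    public static void main(String[] args) throws Exception {
        // Print whatever the parser produced for the test string.
        System.out.println(dump(parse("<p>Example</p>")));
    }
}
```

With the real setup, the idea would be to call dump(parser.getDocument()) right after parser.parse(pSource). If the markup shows up there, the DOM is being built and the content is lost later, in the JDOM step or in how the result is read.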
Can anybody give me a hint on how to solve this?
I have run out of ideas and of places to look. :-((((
Environment: Intel/Windows NT/Java 5
Thanks for your help in advance.