* Anna wrote in comp.text.xml:
> If there is an HTML document, XML-formed using JTidy,
> is there any tool to convert it to valid XHTML?
> I.e. so that all the tags and attribute values will be XHTML compliant.
> For example, if the original document has the following snippet:
> <p><div>text</div></p> (which is not valid XHTML), the output
> would be something like <p><span>text</span></p> (which is valid XHTML).

Well, I am not that familiar with JTidy, but Tidy attempts to do that to
some extent. Current Tidy would turn that fragment into
<div>text</div>
<br />
<br />
which is bad but valid. If you specify the --drop-proprietary-attributes
config option, it will even drop all non-W3C attributes from all
elements. Tidy also tries to convert some known proprietary elements to
other elements, CSS, characters, or whatever else might fit. In fact, if
the result with --drop-proprietary-attributes enabled is not valid, that
is probably a bug and should be reported. This does not apply to invalid
attribute values for valid attributes: Tidy fixes some of them, but quite
a number cannot be fixed, only dropped, and there is currently no config
option to do that.
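
For what it is worth, here is a rough sketch of how one might drive the
command-line Tidy with that option from a script. It assumes a `tidy`
executable is on the PATH (JTidy's own API is not covered here), and the
helper names `tidy_command` and `tidy_to_xhtml` are made up for
illustration. `-q` and `-asxhtml` are Tidy's quiet and XHTML-output
switches.

```python
import shutil
import subprocess


def tidy_command(drop_proprietary=True):
    """Build the argument list for the tidy command-line tool (hypothetical helper)."""
    cmd = ["tidy", "-q", "-asxhtml"]
    if drop_proprietary:
        # The config option discussed above: drop non-W3C attributes.
        cmd += ["--drop-proprietary-attributes", "yes"]
    return cmd


def tidy_to_xhtml(html, drop_proprietary=True):
    """Run tidy on an HTML string; return its output, or None if unavailable or failed."""
    if shutil.which("tidy") is None:
        # Assumption: tidy is installed separately; without it we give up.
        return None
    result = subprocess.run(
        tidy_command(drop_proprietary),
        input=html,
        capture_output=True,
        text=True,
    )
    # Tidy exits with 0 (clean), 1 (warnings) or 2 (errors).
    if result.returncode >= 2:
        return None
    return result.stdout


if __name__ == "__main__":
    print(tidy_to_xhtml("<p><div>text</div></p>"))
```

The output still needs to be run through a validator; as said above,
invalid attribute values can survive this.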