Remove XML node before validating

Hello,

I need to remove the DTD reference from an XML document because we
want to validate against a schema instead (which we have locally).
It takes up to a minute to fetch all documents referred to in the DTD,
and as we have no use for them I want to remove the reference.

I'm using XmlReaderSettings to pass in the XML document and the
schema, but when I loop through the reader it tries to fetch the DTD
before I can remove it, so I'm assuming there's a better way to remove
it before doing the validation. I've tried using XPath but I don't know
how to find the doctype node. Is XPath what I should use?

I'd be very grateful if anyone could point me in the right direction.

Thanks,

AK
Oct 27 '08 #1
ag***********@gmail.com wrote:
> [...] I've tried using XPath but I don't know how to find the
> doctype node. Is it Xpath that I should use?

No, the XPath data model does not know about DTDs, so it certainly does
not help.
If you want the XmlReader (or XmlDocument) to ignore the referenced DTD
then you can try to set the XmlResolver property of the
XmlReaderSettings you create your XmlReader with
(http://msdn.microsoft.com/en-us/libr...lresolver.aspx)
to null. That way the reader will not fetch any external resources. That
will only work, however, if the XML document does not reference any
entities defined in the DTD.
A bit more work, but a more complete solution, is to set the XmlResolver
to your own implementation of XmlResolver, for instance by subclassing
XmlUrlResolver, that then uses a locally cached copy of the DTDs.
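
A minimal sketch of the subclassing approach, under stated assumptions: the class name CachedDtdResolver, the URI-to-file dictionary and the CreateReader helper are hypothetical and only illustrate the idea. ProhibitDtd is used because it is the property that appears in this thread; newer framework versions mark it obsolete in favour of DtdProcessing.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical resolver: maps known remote DTD/entity URIs to local files.
public class CachedDtdResolver : XmlUrlResolver
{
    private readonly Dictionary<string, string> localCopies;

    public CachedDtdResolver(Dictionary<string, string> localCopies)
    {
        this.localCopies = localCopies;
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        string localPath;
        if (localCopies.TryGetValue(absoluteUri.AbsoluteUri, out localPath))
        {
            // Serve the cached copy instead of going to the network.
            return File.OpenRead(localPath);
        }

        // Unknown URIs fall back to the default behaviour (may hit the network).
        return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}

public static class CachedDtdExample
{
    // Hypothetical helper: builds a reader that resolves DTDs from local copies.
    public static XmlReader CreateReader(string xmlPath, Dictionary<string, string> localCopies)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ProhibitDtd = false; // allow the DOCTYPE to be parsed at all
        settings.XmlResolver = new CachedDtdResolver(localCopies);
        // Alternative: settings.XmlResolver = null; to skip external resources
        // entirely (only safe if no entities from the DTD are referenced).
        return XmlReader.Create(xmlPath, settings);
    }
}
```

The dictionary keys would be the absolute URIs the DOCTYPE and its entity declarations resolve to; anything not listed still goes through the default XmlUrlResolver.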
--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
Oct 27 '08 #3
AK
On Oct 27, 2:26 pm, Martin Honnen <mahotr...@yahoo.de> wrote:
> If you want the XmlReader (or XmlDocument) to ignore the referenced DTD
> then you can try to set the XmlResolver property [...] to null.
> [...]
> A bit more work but a more complete solution is to set the XmlResolver
> to your own implementation [...]

Thanks for your answer, it took so long before my post came up that I
actually thought it had gone missing at first, only noticed it now!

This is the code I'm using at the moment:

XmlDocument xdoc = new XmlDocument();
bool docIsValid = false;

try
{
    // A null resolver stops Load from fetching the external DTD.
    xdoc.XmlResolver = null;
    xdoc.Load(scorePath);

    docIsValid = true;
}
catch (System.Exception ex)
{
    errorList.Add(ex.Message);
}

if (docIsValid)
{
    // XmlDocument exposes the DOCTYPE node directly, so there is no need
    // to loop over ChildNodes and remove from it while enumerating.
    if (xdoc.DocumentType != null)
    {
        xdoc.RemoveChild(xdoc.DocumentType);
    }

    // Re-serialize the document without the DOCTYPE.
    MemoryStream ms = new MemoryStream();
    xdoc.Save(ms);
    ms.Position = 0;
    XmlReader xmlDoc = XmlReader.Create(ms);

    XmlReaderSettings settings = new XmlReaderSettings();
    settings.ProhibitDtd = false;
    settings.XmlResolver = new LocalXmlResolver();

    settings.ValidationEventHandler +=
        new System.Xml.Schema.ValidationEventHandler(settings_ValidationEventHandler);

    XmlSchema x = XmlSchema.Read(
        Utilities.getSchemaFromResources(pvgschema),
        settings_ValidationEventHandler);
    settings.Schemas.Add(x);

    settings.ValidationType = ValidationType.Schema;

    // Wrap the plain reader in a validating reader.
    XmlReader reader = XmlReader.Create(xmlDoc, settings);

    while (reader.Read())
    {
        // Validation errors are reported through the event handler.
    }
}

Basically I want to validate against a locally saved schema (which is
an embedded resource), and never validate against the DTD. The code
above is not ideal as I'm reading the XML file twice, once to remove
the DTD reference and once to validate against the schema. However, it
does avoid having to fetch all the documents referenced in the DTD
(which could take up to a minute).

Also, I've saved all the schemas referenced in 'pvgschema' locally
and added them as embedded resources, but it doesn't seem like the
XmlResolver works as I thought, as it still does an HTTP GET for those
schemas on the line settings.Schemas.Add(x);.

Is there a simpler way of doing this?

Many thanks,

AK
Oct 28 '08 #4
AK
On Oct 28, 1:54 pm, AK <agda.karlb...@gmail.com> wrote:
> Also, I've saved all the schemas referenced to in 'pvgschema' locally
> and added them as embedded resources, but it doesn't seem like the
> XmlResolver works as I thought as it still does an HTTP get for those
> schemas on the line settings.Schemas.Add(x);.

For the second point, I had made a mistake in the resolver. It now
tries to get the embedded schema but fails, as the schema has an
"xs:redefine schemaLocation" in it and I get the error message
"schemaLocation must successfully resolve if <redefine> contains any
child other than <annotation>". Is it possible to solve this, or would
it be better to remove the redefine from the schema?
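
One thing that may be worth checking here: the XmlSchemaSet behind settings.Schemas has its own XmlResolver property, which is what Add and Compile use to follow include, import and redefine locations. The sketch below assumes the redefined schema is also an embedded resource; EmbeddedSchemaResolver, SchemaLoading and the resource prefix are hypothetical names, and matching on the file name alone is a simplification.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Xml;
using System.Xml.Schema;

// Hypothetical resolver: serves schemas from embedded resources by file name.
public class EmbeddedSchemaResolver : XmlUrlResolver
{
    private readonly Assembly assembly;
    private readonly string resourcePrefix; // e.g. "MyApp.Schemas." (assumed naming)

    public EmbeddedSchemaResolver(Assembly assembly, string resourcePrefix)
    {
        this.assembly = assembly;
        this.resourcePrefix = resourcePrefix;
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        // Match on the last path segment only, ignoring host and directory.
        string fileName = Path.GetFileName(absoluteUri.LocalPath);
        Stream stream = assembly.GetManifestResourceStream(resourcePrefix + fileName);
        if (stream != null)
        {
            return stream;
        }

        // Not embedded: fall back to the default behaviour (may hit the network).
        return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}

public static class SchemaLoading
{
    // Hypothetical helper: loads the main schema with the resolver in place
    // before Add, so referenced schema locations go through it.
    public static XmlSchemaSet Load(Stream mainSchema, XmlResolver resolver,
                                    ValidationEventHandler onError)
    {
        XmlSchemaSet schemas = new XmlSchemaSet();
        schemas.XmlResolver = resolver;
        schemas.ValidationEventHandler += onError;
        schemas.Add(XmlSchema.Read(mainSchema, onError));
        schemas.Compile();
        return schemas;
    }
}
```

In the code posted earlier in the thread, the equivalent would be to set settings.Schemas.XmlResolver before calling settings.Schemas.Add(x), or to assign the returned set to settings.Schemas.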

(Apologies if someone has already answered this. I've had trouble
seeing updates and only saw my own reply when I came in this morning,
even though I posted it yesterday afternoon.)

Many thanks,

AK
Oct 29 '08 #5
