Remove XML node before validating

Hello,

I need to remove the DTD reference from an XML document because we
want to validate against a schema instead (which we have locally). It
takes up to a minute to fetch all the documents referred to in the DTD,
and as we have no use for them I want to remove the reference.

I'm using XmlReaderSettings to pass in the XML document and the
schema, but when I loop through the reader it tries to fetch the DTD
before I can remove it, so I'm assuming there's a better way to remove
it before doing the validation. I've tried using XPath but I don't
know how to find the doctype node. Is XPath what I should use?

I'd be very grateful if anyone could point me in the right direction.

Thanks,

AK
Oct 27 '08 #1
ag***********@gmail.com wrote:
> I need to remove the DTD reference from an xml document, the reason
> for this is that we want to validate against a schema instead (which
> we have locally). It takes up to a minute to fetch all documents
> referred to in the DTD, and as we have no use for them I want to
> remove the reference.
>
> I'm using XmlReaderSettings to pass in the xml document and the
> schema, but when I loop through the reader it goes and tries to get
> the DTD before I can remove it, so I'm assuming there's a better way
> to remove it before doing the validation. I've tried using XPath but I
> don't know how to find the doctype node. Is it Xpath that I should
> use?

No, the XPath data model knows nothing about DTDs, so it certainly will
not help.
If you want the XmlReader (or XmlDocument) to ignore the referenced DTD
then you can try to set the XmlResolver property of the
XmlReaderSettings you create your XmlReader with
(http://msdn.microsoft.com/en-us/libr...lresolver.aspx)
to null. That way the reader will not fetch any resources. That will
only work, however, if the XML document does not reference any entities
defined in the DTD.
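
A minimal sketch of the null-resolver suggestion, assuming .NET 4 or later, where DtdProcessing.Parse takes the place of ProhibitDtd = false. The class and method names are made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

static class NullResolverSketch
{
    // Returns the root element name of a document whose DOCTYPE may point
    // at an external DTD; that DTD is never fetched.
    public static string ReadRootName(string xml)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.DtdProcessing = DtdProcessing.Parse; // ProhibitDtd = false before .NET 4
        settings.XmlResolver = null;                  // do not resolve external resources

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            reader.MoveToContent(); // skips the DOCTYPE node and stops at the root element
            return reader.LocalName;
        }
    }
}
```

As noted above, this is only expected to work when the document body uses no entities declared in the skipped DTD.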
A bit more work, but a more complete solution, is to set the XmlResolver
to your own implementation of XmlResolver, for instance by subclassing
XmlUrlResolver, that then uses a locally cached copy of the DTDs.
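
A rough sketch of such a resolver, assuming the cached DTDs live in local files and are looked up by their original absolute URI. CachedDtdResolver and its dictionary are hypothetical names, not part of the framework.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical resolver: serves known DTD URIs from local files.
class CachedDtdResolver : XmlUrlResolver
{
    // Maps an absolute DTD URI to the path of its local copy.
    private readonly Dictionary<string, string> localCopies;

    public CachedDtdResolver(Dictionary<string, string> localCopies)
    {
        this.localCopies = localCopies;
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        string path;
        if (localCopies.TryGetValue(absoluteUri.AbsoluteUri, out path))
        {
            // The reader takes ownership of the stream it is handed.
            return File.OpenRead(path);
        }

        // Anything not in the cache falls back to the default behaviour,
        // which may still go out to the network.
        return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}
```

An instance would be assigned to XmlReaderSettings.XmlResolver before creating the reader. Throwing instead of calling the base method is an option if network access should never happen.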
--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
Oct 27 '08 #3
AK
On Oct 27, 2:26 pm, Martin Honnen <mahotr...@yahoo.de> wrote:
> No, the XPath data model does not know any DTDs so it does certainly not
> help.
> If you want the XmlReader (or XmlDocument) to ignore the referenced DTD
> then you can try to set the XmlResolver property (of the
> XmlReaderSettings you create your XmlReader with http://msdn.microsoft.com/en-us/library/system.xml.xmlreadersettings....)
> to null. That way the reader will not fetch any resources. That will
> only work however if the XML document does not references any entities
> defined in the DTD.
> A bit more work but a more complete solution is to set the XmlResolver
> to your own implementation of UrlResolver, for instance by subclassing
> XmlUrlResolver, that then uses a locally cached copy of the DTDs.

Thanks for your answer, it took so long before my post came up that I
actually thought it had gone missing at first, only noticed it now!

This is the code I'm using at the moment:

XmlDocument xdoc = new XmlDocument();
bool docIsValid = false;

try
{
    // A null resolver stops Load from fetching the DTD named in the DOCTYPE.
    xdoc.XmlResolver = null;
    xdoc.Load(scorePath);

    docIsValid = true;
}
catch (System.Exception ex)
{
    errorList.Add(ex.Message);
}

if (docIsValid)
{
    // XmlDocument exposes the DOCTYPE node directly, so there is no need
    // to scan ChildNodes (and remove a node while iterating over them).
    if (xdoc.DocumentType != null)
    {
        xdoc.RemoveChild(xdoc.DocumentType);
    }

    // Serialise the DOCTYPE-free document so it can be read again.
    MemoryStream ms = new MemoryStream();
    xdoc.Save(ms);
    ms.Position = 0;
    XmlReader xmlDoc = XmlReader.Create(ms);

    XmlReaderSettings settings = new XmlReaderSettings();
    settings.ProhibitDtd = false;
    settings.XmlResolver = new LocalXmlResolver();

    settings.ValidationEventHandler +=
        new System.Xml.Schema.ValidationEventHandler(settings_ValidationEventHandler);

    XmlSchema x = XmlSchema.Read(
        Utilities.getSchemaFromResources(pvgschema),
        settings_ValidationEventHandler);
    settings.Schemas.Add(x);

    settings.ValidationType = ValidationType.Schema;

    XmlReader reader = XmlReader.Create(xmlDoc, settings);

    while (reader.Read())
    {
        // Reading to the end drives validation; problems go to the handler.
    }
}

Basically I want to validate against a locally saved schema (which is
set as an embedded resource), and never validate against the DTD. The
code above is not ideal as I'm reading the XML file twice, once to
remove the DTD reference and once to validate against the schema;
however, it does avoid having to fetch all the documents referenced in
the DTD (which could take up to a minute).
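
One way the two passes might be collapsed into a single read is sketched below. It assumes .NET 4 or later (DtdProcessing.Parse instead of ProhibitDtd = false) and takes the document and schema as strings purely to stay self-contained. SinglePassValidation is a made-up name. On those versions DtdProcessing.Ignore is another option when the document uses no DTD-defined entities.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

static class SinglePassValidation
{
    // Validates xml against xsd in one read and returns the error messages.
    // The external DTD named in the DOCTYPE is never fetched.
    public static List<string> Validate(string xml, string xsd)
    {
        List<string> errors = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.DtdProcessing = DtdProcessing.Parse; // allow a DOCTYPE to be present
        settings.XmlResolver = null;                  // but skip its external subset
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add(XmlSchema.Read(new StringReader(xsd), null));
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // Reading to the end is what triggers schema validation.
            while (reader.Read()) { }
        }
        return errors;
    }
}
```

With this shape there is no XmlDocument round trip and no MemoryStream; the DOCTYPE stays in the file but its DTD is simply not loaded.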

Also, I've saved all the schemas referenced in 'pvgschema' locally
and added them as embedded resources, but the XmlResolver doesn't seem
to work as I thought, as it still does an HTTP GET for those schemas on
the line settings.Schemas.Add(x);.

Is there a simpler way of doing this?

Many thanks,

AK
Oct 28 '08 #4
AK
On Oct 28, 1:54 pm, AK <agda.karlb...@gmail.com> wrote:
> Basically I want to validate against a locally saved schema (which is
> set to an embedded resource), and never validate against the DTD. The
> code above is not ideal as I'm validating the xml file twice, once to
> remove the DTD reference then once against the schema, however it does
> avoid me having to go get all the documents referenced in the DTD
> (which could take up to a minute).
>
> Also, I've saved all the schemas referenced to in 'pvgschema' locally
> and added them as embedded resources, but it doesn't seem like the
> XmlResolver works as I thought as it still does an HTTP get for those
> schemas on the line settings.Schemas.Add(x);.

For the second point, I had made a mistake in the resolver. It now
tries to get the embedded schema but fails, as the schema has an
"xs:redefine schemaLocation" in it and I get the error message
"schemaLocation must successfully resolve if <redefine> contains any
child other than <annotation>". Is it possible to solve this, or would
it be better to remove the redefine from the schema?
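
A possible direction, sketched under assumptions: the schema set has its own XmlResolver property, and the target of an xs:redefine is requested through it, so a resolver that can hand back the redefined schema as a stream may be enough. InMemorySchemaResolver, RedefineSketch and the file-name lookup are hypothetical; a real version would probably return Assembly.GetManifestResourceStream(...) for the embedded resource instead of an in-memory string.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Schema;

// Hypothetical resolver: serves schemas by file name from memory.
class InMemorySchemaResolver : XmlUrlResolver
{
    private readonly Dictionary<string, string> schemasByFileName;

    public InMemorySchemaResolver(Dictionary<string, string> schemasByFileName)
    {
        this.schemasByFileName = schemasByFileName;
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        // A relative schemaLocation arrives here already made absolute,
        // so only the last path segment is used as the lookup key.
        string fileName = Path.GetFileName(absoluteUri.LocalPath);
        string text;
        if (schemasByFileName.TryGetValue(fileName, out text))
        {
            return new MemoryStream(Encoding.UTF8.GetBytes(text));
        }
        throw new FileNotFoundException("No local copy of " + absoluteUri);
    }
}

static class RedefineSketch
{
    // Builds and compiles a schema set whose includes, imports and
    // redefines are loaded through the given resolver.
    public static XmlSchemaSet Build(string mainXsd, XmlResolver resolver)
    {
        XmlSchemaSet set = new XmlSchemaSet();
        set.XmlResolver = resolver; // set before Add, which loads external schemas
        set.Add(XmlSchema.Read(new StringReader(mainXsd), null));
        set.Compile();
        return set;
    }
}
```

The resulting set could then be assigned to XmlReaderSettings.Schemas. Whether this resolves the exact error above depends on how the redefined schema's location is written, which the thread does not show.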

(Apologies if someone has already answered this - I've had trouble
seeing updates and only saw my own answer to this when I came in this
morning, even though I posted it yesterday afternoon.)

Many thanks,

AK
Oct 29 '08 #5
