Schemas, imports and namespaces

I'm a real XML novice, but my ultimate goal here is to get a workable schema
for the GEDCOM XML format as spec'ed out here:

http://www.familysearch.org/GEDCOM/GedXML60.pdf

It's a proposed XML format for genealogical records. They include a DTD in
the spec, but sadly it's incomplete: the spec allows for "html"
(unlimited HTML or a subset, they don't say) in certain elements to allow for
formatting. There is also a sample GEDCOM file in the spec with some
<html:BR /> elements. There is no declaration for "html:BR" in the DTD, so
validation fails on their included GEDCOM file.

So I have a few things to do. I want a schema, not a DTD and I want HTML to
be allowed in certain elements. An approximate schema I can produce by
using VS to change the DTD to a schema. The HTML part I have no idea how to
do. I would love to be able to say in the schema that the citation element
(to name one example) can have any tag from the "html" namespace but I don't
see how to do this. So I scaled back my expectations and decided to just
try to allow for <html:BR />. You can't declare this directly because XML
gets hung up on the namespace. I couldn't find anything that talked about
this in all the schema docs I looked at so I decided to let VS handle it for
me. I produced a schema from the GEDCOM XML file and it produced two xsd
files. One is the gedcom.xsd and one is a schema file which solely defines
<html:BR />. It also placed an import element in the gedcom.xsd file:

<xs:import namespace="http//:www.w3c.org/TR/REC-html40/" />

This does, in fact, seem to reference the file with html:BR defined,
although I fail to see how the resolution works. It seems to work
through the target namespace of the imported schema matching the
namespace named in the import statement, which in turn matches
the xmlns attribute in the schema element, but I don't know how it finds the
html file to make these matches. It certainly works for me in VS, and the
gedcom XML file with <html:BR /> seems to validate properly according to the
VS XML editor. The base and html schemas are in the same directory,
which is the directory of my executable, so I figured that the
resolution was done by looking at every xsd file in the directory to see if
any have that namespace as a target and importing them if so.

In any event, since the XML editor was telling me that this file was valid
with this schema, I thought I would definitely get it to work when I run
my simple program which, currently, simply reads the XML file into an
XmlDocument. Now, in order to test the schema in the editor, I have to
explicitly reference the filename with a noNamespaceSchemaLocation attribute
in the root element of the XML file (is there some other way?). If I'm
running this on general GEDCOM files, obviously they're not going to
reference my schema file, so I have to remove the explicit filename link
formed by the noNamespaceSchemaLocation attribute and instead load the
schema manually and attempt to use it to validate, like so:

XmlSchemaSet sc = new XmlSchemaSet();
sc.Add(null, "gedcom.xsd");
XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.Schemas = sc;
settings.ValidationEventHandler += new ValidationEventHandler(ValidationCallBack);
XmlReader xrd = XmlReader.Create(strXmlFile, settings);
(_xdoc = new XmlDocument()).Load(xrd);

When I do this, it tells me that it can't load the schema file because
html:BR is not declared, so it would appear that my guess about the
resolution occurring because they were in the same directory is wrong, at
least when I'm doing this all programmatically.

So what do I need to do here to get it to do the import correctly? It
appears that importing is the only way to handle the html namespace since I
produced the schema from the GEDCOM file in another piece of software and it
also produced two files with one importing from the other.

Finally, and this is really a pure XML question: supposing I do get this to
work. That only allows me to use the html namespace in my base schema
declarations. In order to allow for all of HTML in certain elements as I'm
doing things now, that means I have to declare every HTML tag in each of the
elements I want to allow them in. Is that really what has to be done? Do I
have to put *all* potential HTML tags in my schema to declare them as legal?
Again, it would be *really* nice to just be able to say "anything from the
HTML namespace is legal here". Is there some good way of doing that? I
suppose I could just allow for some subset of HTML tags but it would be nice
to give an option where the user could just type in (or paste in) any HTML
he/she likes and I would store it mostly as a black box without ever
directly interpreting any of the contents but just passing them in to an
HTML viewer control.

Sorry for the long post. I don't know how to make it any more brief. If
you're still with me, thanks for any ideas you might have on this!

Darrell
Oct 30 '06 #1
When you used the Visual Studio editor to produce a schema for your XML, it
produced one schema per namespace encountered in your file. Your file
contains an element in the 'html' namespace, so it created a schema for that
namespace and imported that namespace into the parent schema. Internally, the
XML editor constructs an XmlSchemaSet itself and adds both of
these schemas to it. Since both schemas are in the set, it does not need an
explicit file location to fetch the imported schema from: it is already in
memory. But when you save these schema files to disk, you should put a
schemaLocation attribute on the import element so that the imported
schema can be resolved when you later use them in another application. Like:

<xs:import namespace="http//:www.w3c.org/TR/REC-html40/"
schemaLocation="HTMLSchema.xsd"/>

So you could either have an explicit location reference in your main schema
like above, or you could do what the editor did: just load all the schema
files you need into an XmlSchemaSet (not just gedcom.xsd), and the import will
resolve automatically since the imported schema is already in the set.
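
For the second option, a minimal sketch might look like the following. The file name HTMLSchema.xsd is taken from the import example above and gedcom.xsd from the question; both are assumptions about what the files are actually called on disk, and the class name is made up for illustration. Adding the imported schema first is just a cautious choice here, not something the reply states is required.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;

class ValidateGedcom
{
    static void Main(string[] args)
    {
        // Assumed file names; adjust to whatever the editor actually saved.
        XmlSchemaSet schemas = new XmlSchemaSet();
        // Passing null tells Add to use each schema's own targetNamespace.
        schemas.Add(null, "HTMLSchema.xsd"); // schema for the html namespace
        schemas.Add(null, "gedcom.xsd");     // main schema containing the import
        schemas.Compile();                   // surface unresolved references early

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas = schemas;
        settings.ValidationEventHandler += delegate(object sender, ValidationEventArgs e)
        {
            Console.WriteLine("{0}: {1}", e.Severity, e.Message);
        };

        // args[0] is the path of the GEDCOM XML file to validate.
        XmlDocument doc = new XmlDocument();
        using (XmlReader reader = XmlReader.Create(args[0], settings))
        {
            doc.Load(reader);
        }
    }
}
```

With both schemas in the set, the import in gedcom.xsd no longer needs a schemaLocation to be resolved, so the GEDCOM files themselves do not have to reference any schema file.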

Regarding having to allow all HTML elements in your content models: you
could use an element wildcard (xsd:any). You can read up on wildcards at
http://www.w3.org/TR/2004/PER-xmlsch...html#Wildcards
You can restrict the wildcard to a namespace (like
http//:www.w3c.org/TR/REC-html40/) so that only elements from the
html namespace are allowed in that location. Wildcards are pretty useful for
defining open content models like the one you need.
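
As a rough illustration of the wildcard, here is a small self-contained sketch. The cut-down "citation" schema, the class name and the CountErrors helper are all hypothetical, not the real GEDCOM schema. The namespace URI is copied verbatim from this thread and has to match whatever URI your documents bind to the html prefix. processContents is set to lax on the assumption that you want the HTML treated as a black box, so no declarations for the individual HTML elements are needed.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

class WildcardDemo
{
    // Copied verbatim from the thread; must match the instance documents.
    const string HtmlNs = "http//:www.w3c.org/TR/REC-html40/";

    // Hypothetical cut-down schema: "citation" has mixed content and accepts
    // any number of elements from the html namespace, validated laxly.
    const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='citation'>
    <xs:complexType mixed='true'>
      <xs:sequence>
        <xs:any namespace='" + HtmlNs + @"' processContents='lax'
                minOccurs='0' maxOccurs='unbounded'/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Validates the given XML text against the schema above and returns
    // the number of validation errors reported.
    static int CountErrors(string xml)
    {
        int errors = 0;
        XmlSchemaSet schemas = new XmlSchemaSet();
        schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas = schemas;
        settings.ValidationEventHandler += delegate(object sender, ValidationEventArgs e)
        {
            if (e.Severity == XmlSeverityType.Error) errors++;
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { } // reading drives the validation
        }
        return errors;
    }

    static void Main()
    {
        // An element from the html namespace inside citation.
        string withHtml = "<citation xmlns:html='" + HtmlNs +
                          "'>line one<html:BR/>line two</citation>";
        // An element from no namespace, which the wildcard does not cover.
        string withOther = "<citation>line one<other/>line two</citation>";

        Console.WriteLine("html:BR errors: " + CountErrors(withHtml));
        Console.WriteLine("other errors:   " + CountErrors(withOther));
    }
}
```

The first document should come back with no errors and the second with at least one, since the wildcard only admits elements from the html namespace. If you later want the HTML itself checked, switching processContents to strict would require declarations for those elements to be in the schema set.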

Thanks.
"Darrell Plank" <ja******@msn.com> wrote in message
news:uH****************@TK2MSFTNGP04.phx.gbl...
Oct 30 '06 #2
Ah, Zafar, you are a lifesaver and I am such a rookie. This was precisely
the right information and solved all my problems. I have always been a bit
puzzled about namespace resolution which most of my books don't talk about
too much, but now I'm enlightened! I'm glad to learn about the wildcard.
Thank goodness you told me about it because I can't find it mentioned in a
single one of my four books on XML.

Thanks very much!

Darrell
Oct 30 '06 #3
