include file problem: namespaces?

maxwell@ldc.upenn.edu
#1: May 1 '06
I'm having a problem using the XML "include" mechanism, which I think
has to do with namespaces.

I have an XML file that has a lot of repetition--a sequence of elements
that appear multiple times. My thought was to copy this sequence into
a separate file, then use an entity defn to refer to that separate
file:

<!ENTITY ThirdPersonAllButTense SYSTEM "3PersNotTense.xml">

and replace the sequences with &ThirdPersonAllButTense;. (There's no
particular reason this sequence of elements should go into an
*external* file, but I can't figure out how to define an entity as a
sequence of elements in any other way.)

But I get an error msg from xmllint:

Element 'FeatureValueSet': This element is not expected.
Expected is (
{http://lodl.ldc.upenn.edu/ParadigmDefn.xsd}FeatureValueSet ).

If I understand this correctly, it's saying it expected to find a
namespace identifier in front of the element name (and it's suggesting
a URL as the identifier). This namespace is defined as the default
namespace in the "calling" file, i.e.

<ParadigmDefns xmlns ="http://lodl.ldc.upenn.edu/ParadigmDefn.xsd"
...>

There is no namespace definition in the included file. (Should there
be? How?)

I'm unsure why it wants this identifier (if that is indeed what this
error msg means). My concept of what this external entity inclusion
does, is that it just copies the contents of the external file in place
of the entity. Obviously my concept is wrong, or I wouldn't be getting
an error msg, but I'm unsure why.

Apologies if the above is unclear/ uses the wrong terminology...


Richard Tobin
#2: May 1 '06

In article <1146452265.951392.240040@i40g2000cwc.googlegroups.com>,
<maxwell@ldc.upenn.edu> wrote:

>I have an XML file that has a lot of repetition--a sequence of elements
>that appear multiple times. My thought was to copy this sequence into
>a separate file, then use an entity defn to refer to that separate
>file:
>
> <!ENTITY ThirdPersonAllButTense SYSTEM "3PersNotTense.xml">
>
>and replace the sequences with &ThirdPersonAllButTense;. (There's no
>particular reason this sequence of elements should go into an
>*external* file, but I can't figure out how to define an entity as a
>sequence of elements in any other way.)

You can use an internal entity equally well, for example:

<!DOCTYPE foo [
<!ELEMENT foo (bar*)>
<!ELEMENT bar EMPTY>
<!ENTITY lots-of-bars "
<bar/>
<bar/>
<bar/>
<bar/>
<bar/>
">
]>
<foo>
&lots-of-bars;
&lots-of-bars;
</foo>
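
One way to check that such an entity really expands to elements is to parse the document and count the children. This is only a minimal sketch, assuming Python's standard xml.etree.ElementTree module (whose expat-based parser expands internal entities declared in the internal subset); the variable names are mine, not part of the example above.

```python
import xml.etree.ElementTree as ET

# The example document from above: an internal entity whose replacement
# text is a sequence of five <bar/> elements, referenced twice.
DOC = """<!DOCTYPE foo [
<!ELEMENT foo (bar*)>
<!ELEMENT bar EMPTY>
<!ENTITY lots-of-bars "
<bar/>
<bar/>
<bar/>
<bar/>
<bar/>
">
]>
<foo>
&lots-of-bars;
&lots-of-bars;
</foo>"""

root = ET.fromstring(DOC)

# Each reference contributes five <bar/> children to <foo>.
bar_count = len(root.findall("bar"))
print(root.tag, bar_count)
```

Note that this parser does not validate against the ELEMENT declarations; it only expands the entity references.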

> Element 'FeatureValueSet': This element is not expected.
> Expected is (
>{http://lodl.ldc.upenn.edu/ParadigmDefn.xsd}FeatureValueSet ).
>
>If I understand this correctly, it's saying it expected to find a
>namespace identifier in front of the element name (and it's suggesting
>a URL as the identifier). This namespace is defined as the default
>namespace in the "calling" file, i.e.
>
> <ParadigmDefns xmlns ="http://lodl.ldc.upenn.edu/ParadigmDefn.xsd"
> ...>
>
>There is no namespace definition in the included file. (Should there
>be? How?)
>
>I'm unsure why it wants this identifier (if that is indeed what this
>error msg means). My concept of what this external entity inclusion
>does, is that it just copies the contents of the external file in place
>of the entity.

That's right; entities work at a textual level before namespace processing.
I don't know why you're getting this error message. Perhaps you could
post a complete cut-down example so others can check it.

-- Richard



maxwell@ldc.upenn.edu
#3: May 2 '06

> Perhaps you could post a complete cut-down
> example so others can check it.

I was afraid you were going to say that :-). The file is (by my
standards at least) large and complex, so it will take me a while to
trim it down to something manageable. But I'll try, and thanks for the
suggestion!

maxwell@ldc.upenn.edu
#4: May 5 '06

OK, I trimmed it down. Unfortunately, I'm posting on Google groups, so
I can't attach files (maybe that's a no-no here anyway?). So, here
goes:

First, the schema, which I have in a file BugSchema.xsd:
---------------BugSchema.xsd-----------------
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://lodl.ldc.upenn.edu/ParadigmDefn.xsd"
            xmlns="http://lodl.ldc.upenn.edu/ParadigmDefn.xsd"
            elementFormDefault="qualified">

  <xsd:element name="ParadigmDefns">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Paradigm" minOccurs="1" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <xsd:element name="Paradigm">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="FeatureValueSet" minOccurs="1" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <xsd:element name="FeatureValueSet">
  </xsd:element>
</xsd:schema>
-----------------------------------------------------------

Next, the file to be validated against that schema, Bug.Defn.xml:
------------------Bug.Defn.xml--------------------
<?xml version="1.0" encoding="ISO-8859-1"?>

<!DOCTYPE ParadigmDefn
  [<!ENTITY Inclusion SYSTEM "IncludeBug.xml">]>

<ParadigmDefns xmlns="http://lodl.ldc.upenn.edu/ParadigmDefn.xsd"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:schemaLocation="http://lodl.ldc.upenn.edu ParadigmDefn.xsd">
  <Paradigm>
    <!--If the ff. '&Inclusion;' is commented out, this file validates.-->
    &Inclusion;
    <FeatureValueSet></FeatureValueSet>
  </Paradigm>
</ParadigmDefns>
-----------------------------------------

And finally, the "include" file, IncludeBug.xml:
--------------------IncludeBug.xml---------------
<FeatureValueSet></FeatureValueSet>
-----------------------------------------

You'll notice that this line is identical to the line that appears
immediately below the '&Inclusion;'; the latter line does not cause any
problem, if the '&Inclusion;' is commented out.

The error msg happens (if the inclusion is not commented out) when I
run the ff. cmd:

xmllint --noout --noent --schema BugSchema.xsd TestSuite/bug.Defn.xml

Specifically, the error msg is:

Element 'FeatureValueSet': This element is not expected. Expected is
( {http://lodl.ldc.upenn.edu/ParadigmDefn.xsd}FeatureValueSet ).
TestSuite/bug.Defn.xml fails to validate

I'm running this under CygWin. xmllint reports "using libxml version
20622 of the in-memory document" (whatever that means--sounds like an
odd version #, but...)

Thanks for any suggestions!

Joe Kesselman
#5: May 5 '06

Remember, inclusion in XML is semantic, NOT string based.

Your external entity doesn't define a default namespace, nor does it use
prefixes bound to a namespace. Ergo, its elements are NOT in any
namespace. Ergo, they are NOT the same as the ones in the namespace.
That's exactly what the error message is telling you:

> Element 'FeatureValueSet': This element is not expected. Expected is
> ( {http://lodl.ldc.upenn.edu/ParadigmDefn.xsd}FeatureValueSet ).
> TestSuite/bug.Defn.xml fails to validate

You provided a <FeatureValueSet>; it expected a
<FeatureValueSet xmlns="http://lodl.ldc.upenn.edu/ParadigmDefn.xsd">, or
<foo:FeatureValueSet
xmlns:foo="http://lodl.ldc.upenn.edu/ParadigmDefn.xsd">, or something
equivalent.

Fix your external entity.
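
To make the "{namespace}name" notation in that error message concrete, here is a small sketch, assuming Python's standard xml.etree.ElementTree, which reports element names in the same expanded form. Only the namespace URI and element name come from the thread; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

NS = "http://lodl.ldc.upenn.edu/ParadigmDefn.xsd"

# An element with no namespace declaration in scope is in no namespace.
no_ns = ET.fromstring("<FeatureValueSet/>")

# A default namespace declaration or a bound prefix both put the element
# in the namespace; the prefix itself is not part of the expanded name.
default_ns = ET.fromstring(f'<FeatureValueSet xmlns="{NS}"/>')
prefixed = ET.fromstring(f'<foo:FeatureValueSet xmlns:foo="{NS}"/>')

print(no_ns.tag)
print(default_ns.tag)
print(prefixed.tag)
```

The last two elements have the same expanded name, which is why either spelling would satisfy the schema, while the first one is a different element as far as a namespace-aware validator is concerned.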

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
maxwell@ldc.upenn.edu
#6: May 5 '06

> Remember, inclusion in XML is semantic, NOT string based.

That definitely seems to be what's happening, but it's not what I
expected from statements like
You can specify an entity that has text defined external to
the document by using the SYSTEM keyword...
In this case, the XML processor will parse the content of
that file as if its content had been typed at the location
of the entity reference.
(--http://www.javacommerce.com/displaypage.jsp?name=entities.sql&id=18238)

> You provided a <FeatureValueSet>; it expected a
> <FeatureValueSet xmlns="http://lodl.ldc.upenn.edu/ParadigmDefn.xsd">, or
> <foo:FeatureValueSet
> xmlns:foo="http://lodl.ldc.upenn.edu/ParadigmDefn.xsd">, or something
> equivalent.
>
> Fix your external entity.

In the actual file (not the cut-down one that I posted here), there is
a long series of these elements
<FeatureValueSet>foo</FeatureValueSet>
<FeatureValueSet>bar</FeatureValueSet>
<FeatureValueSet>baz</FeatureValueSet>
etc., and I want to splice that *sequence* in at several points. I can
put the xmlns in each one of these elements, but is there a way in
this external file of specifying the xmlns once, rather than repeatedly?

Joseph Kesselman
#7: May 5 '06

maxwell@ldc.upenn.edu wrote:
>>Remember, inclusion in XML is semantic, NOT string based.
>
> That definitely seems to be what's happening, but it's not what I
> expected from statements like

Hmmmm. Let me think about that.

Part of the problem here is that namespaces were introduced after the
XML spec, so you get into odd situations where DTD behavior is
non-namespace-aware but later layers *are* namespace aware.

So I think I may be wrong; the inclusion -- because external entities
are defined by the XML spec rather than XML-plus-namespaces -- probably
should have occurred syntactically, and probably should have picked up
the namespace context at the point where you brought it in.

Interesting. I need to dig into this one more deeply to really convince
myself, and I should try it in a parser I trust to see what it thinks
should happen.

Which parser are you using?



--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
maxwell@ldc.upenn.edu
#8: May 5 '06

Actually, I haven't tested this with any application/ parser, only with
xmllint.

Joseph Kesselman
#9: May 5 '06

maxwell@ldc.upenn.edu wrote:
> Actually, I haven't tested this with any application/ parser, only with
> xmllint.

Not familiar with it. It may be wrong...

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
maxwell@ldc.upenn.edu
#10: May 5 '06

>> Actually, I haven't tested this with any application/ parser, only with
>> xmllint.

> Not familiar with it. It may be wrong...

Could be. It uses libxml, for what that's worth.
http://xmlsoft.org/xmllint.html is a man page, but I get the executable
in the CygWin package.

Can you suggest another (better!) parser that I could use to validate
the xml file against my schema?

Richard Tobin
#11: May 5 '06

In article <1146797709.274196.126540@u72g2000cwu.googlegroups.com>,
<maxwell@ldc.upenn.edu> wrote:

>The error msg happens (if the inclusion is not commented out) when I
>run the ff. cmd:
>
> xmllint --noout --noent --schema BugSchema.xsd TestSuite/bug.Defn.xml
>
>Specifically, the error msg is:
>
> Element 'FeatureValueSet': This element is not expected. Expected is
> ( {http://lodl.ldc.upenn.edu/ParadigmDefn.xsd}FeatureValueSet ).
> TestSuite/bug.Defn.xml fails to validate

Your document is valid. xmllint is confused. The included element
is in the http://lodl.ldc.upenn.edu/ParadigmDefn.xsd namespace as you
intended.

-- Richard

Richard Tobin
#12: May 5 '06

In article <445b7a43$1@kcnews01>,
Joseph Kesselman <keshlam-nospam@comcast.net> wrote:

>the inclusion -- because external entities
>are defined by the XML spec rather than XML-plus-namespaces -- probably
>should have occurred syntactically, and probably should have picked up
>the namespace context at the point where you brought it in.

Yes. Namespace processing can be considered as a layer that happens
after parsing, including DTD processing and in particular entity
expansion.
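
For anyone who wants to see that layering in action, here is a rough sketch assuming Python's standard xml.etree.ElementTree. As far as I know that parser does not fetch external entities, so an internal entity stands in for IncludeBug.xml here; that substitution is my assumption, not part of the original test case.

```python
import xml.etree.ElementTree as ET

NS = "http://lodl.ldc.upenn.edu/ParadigmDefn.xsd"

# Internal entity used as a stand-in for the external IncludeBug.xml.
# Its replacement text carries no namespace declaration of its own.
DOC = f"""<!DOCTYPE ParadigmDefns [
  <!ENTITY Inclusion "<FeatureValueSet></FeatureValueSet>">
]>
<ParadigmDefns xmlns="{NS}">
  <Paradigm>
    &Inclusion;
    <FeatureValueSet></FeatureValueSet>
  </Paradigm>
</ParadigmDefns>"""

root = ET.fromstring(DOC)
paradigm = root[0]

# Expanded names of the two children: one from the entity, one literal.
tags = [child.tag for child in paradigm]
print(tags)
```

Both children should come out with the same expanded name, because the entity text is spliced in before the default namespace from <ParadigmDefns> is applied.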

-- Richard
maxwell@ldc.upenn.edu
#13: May 6 '06

So I think what I hear from both of you (Richard and Joseph) is that
xmllint (which I believe depends on libxml) is majorly broken, at least
the version I'm using. I'm surprised this hasn't come up before;
surely including external entities is common, and having namespaces is
common??

Is there a better choice than xmllint? For the time being, I've
resorted to producing an intermediate file with all the external
entities #included, using gpp, and only then running xmllint over it to
make sure everything is hunky dory. This certainly works, but it seems
like an odd way to be working, when the inclusion mechanism is supposed
to be built in to XML.
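
The same kind of intermediate-file workaround can be sketched in Python. This is not gpp and does not reproduce its syntax rules; the '#include "name"' line format, the in-memory FRAGMENTS table standing in for files on disk, and the function name are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

NS = "http://lodl.ldc.upenn.edu/ParadigmDefn.xsd"

# Hypothetical stand-in for fragment files on disk, keyed by file name.
FRAGMENTS = {
    "IncludeBug.xml": "<FeatureValueSet>foo</FeatureValueSet>\n"
                      "<FeatureValueSet>bar</FeatureValueSet>",
}

# Matches a whole line of the form:  #include "some-file"
INCLUDE_LINE = re.compile(r'^[ \t]*#include[ \t]+"([^"]+)"[ \t]*$', re.MULTILINE)


def splice_includes(text, fragments):
    """Replace each '#include "name"' line with the named fragment's text."""
    return INCLUDE_LINE.sub(lambda m: fragments[m.group(1)], text)


SOURCE = f"""<ParadigmDefns xmlns="{NS}">
  <Paradigm>
    #include "IncludeBug.xml"
    <FeatureValueSet>baz</FeatureValueSet>
  </Paradigm>
</ParadigmDefns>"""

# Purely textual splice first, then ordinary namespace-aware parsing.
expanded = splice_includes(SOURCE, FRAGMENTS)
root = ET.fromstring(expanded)
values = [fvs.text for fvs in root.iter(f"{{{NS}}}FeatureValueSet")]
print(values)
```

Because the splice happens on the text before any parsing, the fragment needs no xmlns of its own, and the expanded file can then be handed to whatever validator is in use.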

Joe Kesselman
#14: May 6 '06

maxwell@ldc.upenn.edu wrote:
> surely including external entities is common, and having namespaces is
> common??

External entities are uncommon in recent documents.
Namespaces are uncommon (nonexistent) in older documents.
Not entirely surprising if the two weren't tested together.



--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
maxwell@ldc.upenn.edu
#15: May 6 '06

> External entities are uncommon in recent documents.

I suppose there's a reason for that. What should I be using instead,
when I have a sequence of entities that I want to splice in at multiple
points? References?

Joe Kesselman
#16: May 6 '06

maxwell@ldc.upenn.edu wrote:
>> External entities are uncommon in recent documents.
> I suppose there's a reason for that.

They're supported only by DTDs, not schemas. And, as you reminded me,
they aren't namespace-aware, which is a Bad Thing.

> What should I be using instead,

Sigh. The closest equivalent is XInclude
(http://www.w3.org/TR/2004/REC-xinclude-20041220/), but widespread
support for that at the parser level has been slow in arriving because
there hasn't been a lot of demand for it.

It's not uncommon for folks to kluge up an implementation of XInclude by
using an XSLT stylesheet that recognizes the XInclude directives and
generates a new document that replaces them with the information they
reference. (Stylesheets which provide this function can be found on the
web, with a bit of searching.) Of course that requires that you
explicitly run the document through an XSLT processor to execute that
stylesheet, in addition to any other processing (including additional
stylesheets) you want to apply to it.
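
As an alternative to the XSLT route, some libraries expand XInclude directives themselves. The following is only a sketch, assuming Python's standard xml.etree.ElementInclude module; the in-memory FRAGMENTS table and the loader function are hypothetical stand-ins for real files.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

NS = "http://lodl.ldc.upenn.edu/ParadigmDefn.xsd"

# Hypothetical stand-in for files on disk. An XInclude target is parsed
# as a document in its own right, so it declares the namespace itself.
FRAGMENTS = {
    "IncludeBug.xml": f'<FeatureValueSet xmlns="{NS}"/>',
}


def loader(href, parse, encoding=None):
    """Serve XInclude targets from the in-memory table above."""
    if parse != "xml":
        raise ValueError("this sketch only handles parse='xml'")
    return ET.fromstring(FRAGMENTS[href])


DOC = f"""<ParadigmDefns xmlns="{NS}"
               xmlns:xi="http://www.w3.org/2001/XInclude">
  <Paradigm>
    <xi:include href="IncludeBug.xml"/>
    <FeatureValueSet/>
  </Paradigm>
</ParadigmDefns>"""

root = ET.fromstring(DOC)

# Replace every xi:include element, in place, with what the loader returns.
ElementInclude.include(root, loader=loader)

tags = [child.tag for child in root[0]]
print(tags)
```

Unlike an entity, the included file is parsed separately here, so it has to carry its own xmlns and have a single root element. A sequence of sibling elements would, I think, need one include per element or some wrapper.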


--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
maxwell@ldc.upenn.edu
#17: May 6 '06

Goodness, and I thought what I wanted to do was fairly routine...I may
just retain the #include and use gpp. Anyway, thank you both for all
the time you've put in answering my naive questions!

Joe Kesselman
#18: May 6 '06

One of the other reasons parsed entities are going away is that they
really aren't very useful. They're defined in the DTD, which means there's
only one set of them for all documents of that type -- as if you had to
hardwire all your possible #includes into the compiler. Usually when you
want to include something, you want the instance document to decide what
it wants to include.

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Joe Kesselman
#19: May 6 '06

(OK, yes, you can define external parsed entities in the Internal Subset.
It's still a sloppy solution.)

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
maxwell@ldc.upenn.edu
#20: May 6 '06

Well, let me describe what I'm trying to do, and maybe you can suggest
a better way to do it.

I am keeping "instructions" for generating inflectional paradigms of a
language (Nahuatl) in XML. I transform the XML files in various ways,
depending on what I'm trying to do.

The "instructions" take the form of features that I want to hold
constant for a given paradigm, and features I want to allow to vary.
So for example I might want to generate all the tenses of some set of
verbs, but holding subject person, object person, negation etc.
constant. Another time I might want to generate all the combinations
of subject and object person forms in the present tense.

I find that I re-use lots of combinations--e.g. hold tense constant at
'present', the negation at 'positive', not use any directional affixes,
and then generate all the person combos; then hold tense constant at
'future', generate person combos, etc. So a typical combination that I
want to re-use consists of the settings for tense, negation, and
directionals. That requires three XML elements. And I may re-use that
set of three elements in several different XML files, so I want to
collect the set into an external file where it can be defined as an
external entity, along with other commonly re-used entities. That file
can then be loaded into various XML files. Which entities a given XML
file actually makes use of is of course up to that XML file.

Basically, the idea strikes me as being parallel to the use of files of
library functions in a traditional programming language. You
(statically or dynamically) link in the library file whenever your
executable needs to make use of a pre-defined function. In my case,
what I want to do is more like static linking: I'm including in the
final XML file the pre-defined external entities.

Does that make sense? Right now, because of the problem I was having
with the XML way of doing things, I'm just #include-ing that external
file. Is there an XMLish way to do that, which avoids the problems
that triggered this thread?

Richard Tobin
#21: May 6 '06

In article <v7-dnZ_yOoHYs8HZRVn-vg@comcast.com>,
Joe Kesselman <keshlam-nospam@comcast.net> wrote:

>One of the other reasons parsed entities are going away is that they
>really aren't very useful. They're defined in the DTD, which means there's
>only one set of them for all documents of that type -- as if you had to
>hardwire all your possible #includes into the compiler. Usually when you
>want to include something, you want the instance document to decide what
>it wants to include.

One of the common uses of entities (external and internal) is for
boilerplate that's common across documents - copyright notices for
example - and for that purpose fixed values are what you want.

But the system ID of an external entity can be a relative URI, so it's
also possible to package up a document with a set of external entities
in the same directory.

-- Richard
Joe Kesselman
#22: May 6 '06

Interesting project!

My reflex is to suggest that you consider restructuring this as a
rules-driven process, using XSLT (which is, after all, a rules-based
nonprocedural language, and can do more intelligent processing than the
simple C preprocessor you've been playing with).

(There's definitely something to be said for writing your "library
functions" in a higher-level language that's explicitly designed for
manipulating XML rather than trying to do it at a lower level. And
applying inflection does sound very much like a styling problem, even if
it's in the temporal rather than spatial rendering domain.)

Caveat: Depending on exactly how your data is structured and what kinds
of rules you're trying to apply, XSLT may or may not be a reasonable
approach. I freely admit that I'm biased because I've been working on
XSLT processors for the past five years, and I do recognize that this
general-purpose tool isn't a good fit for all tasks. But this sounds
like a case where it might be entirely appropriate.

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry


maxwell@ldc.upenn.edu
#23: May 7 '06

Richard and Joe--

> Interesting project!

I think so :-).

Thanks for the ideas. I guess I could use xslt (which I am using
elsewhere in the project) to handle the inclusions, but it seems like
overkill at this point, when all I want to do is pull in some
boilerplate. But I'll think about it--there is some pre-processing,
although at this point it's done by Python, since there is some fairly
heavy looping involved (producing all possible combinations of a set of
sets of features). In fact the only reason for using xmllint was to
ensure that the file Python was going to read was valid (thereby making
it easier to debug).

The actual generation of inflected words is done by an entirely
different program, the Xerox Finite State toolkit, which converts forms
like compl+1sgS-3sgO-pale:wia-perfv.sg into o:+n-k-pale:wih-0 by
replacing glosses with Nahuatl affixes, and thence into o:nihpale:wih
by applying phonological rules. I don't think that's a job for xslt
:-).

Anyway, thanks to both of you for all the ideas, and the helpful
discussion!
