Excluding values in the xsd

I have an XML document that describes a COBOL copybook member. The XML is
validated against an XSD.
One of the XML attributes is the COBOL field name. The XSD constrains
the length of this attribute's value to be greater than zero and less than 31.
Now I want to add another check: the value must not be one of the
COBOL reserved words. I have a list of reserved words (such as SUM,
ACCEPT, COMPUTE, etc.).
How can I specify these excluded values in the XSD so that XML
validation returns an error if one of the reserved words is used as
the attribute value?
DickD

Mar 21 '07 #1
di*********@donkeydevelopment.com wrote:
How can I specify these excluded values in the XSD so that XML
validation returns an error if one of the reserved words is used as
the attribute value?
You can enumerate those reserved words, e.g.:
<xs:simpleType name="reserved-word">
  <xs:restriction base="xs:string">
    <xs:enumeration value="ACCEPT"/>
    <xs:enumeration value="COMPUTE"/>
    <xs:enumeration value="SUM"/>
    <!-- add further values here -->
  </xs:restriction>
</xs:simpleType>


--

Martin Honnen
http://JavaScript.FAQTs.com/
Mar 22 '07 #2
di*********@donkeydevelopment.com wrote:
How can I specify these excluded values in the XSD so that XML
validation returns an error if one of the reserved words is used as
the attribute value?
Use a regular expression to describe the attribute's acceptable values.

http://www.w3.org/TR/xmlschema-2/#rf-pattern
http://www.w3.org/TR/xmlschema-2/#regexs
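
For readers who have not used the pattern facet, here is a minimal Java sketch that validates an attribute against an inline schema with the JDK's javax.xml.validation API. The element name `field`, the class name and the pattern itself are assumptions made for illustration; the pattern is not the real COBOL naming rule, and it only describes what is accepted, so by itself it does not exclude any reserved word.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class PatternFacetDemo {
    // Hypothetical schema: <field name="..."/> where name must match the pattern facet.
    // The pattern (an upper-case letter followed by up to 29 letters, digits or hyphens)
    // is an illustrative assumption, not the actual COBOL rule.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "  <xs:element name='field'>"
        + "    <xs:complexType>"
        + "      <xs:attribute name='name' use='required'>"
        + "        <xs:simpleType>"
        + "          <xs:restriction base='xs:string'>"
        + "            <xs:pattern value='[A-Z][A-Z0-9\\-]{0,29}'/>"
        + "          </xs:restriction>"
        + "        </xs:simpleType>"
        + "      </xs:attribute>"
        + "    </xs:complexType>"
        + "  </xs:element>"
        + "</xs:schema>";

    static final Schema SCHEMA = compile();

    // Compiles the inline schema once; a broken schema is a programming error here.
    private static Schema compile() {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
    }

    // Returns true if <field name='fieldName'/> validates against the schema.
    // Sketch only: fieldName is not XML-escaped.
    static boolean isValid(String fieldName) {
        String xml = "<field name='" + fieldName + "'/>";
        try {
            SCHEMA.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the validator reported an error
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("TOTAL-AMOUNT"));
        System.out.println(isValid("9LIVES"));
        System.out.println(isValid("SUM")); // still accepted: nothing is excluded yet
    }
}
```
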
--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Mar 22 '07 #3
Martin Honnen wrote:
You can enumerate those reserved words e.g.
Sorry, I misread your request. An enumeration helps if you want to allow
only the reserved words, but not if you want to disallow them.
--

Martin Honnen
http://JavaScript.FAQTs.com/
Mar 22 '07 #4
On 22 Mar, 15:42, Joseph Kesselman <keshlam-nos...@comcast.net> wrote:
Use a regular expression to describe the attribute's acceptable values.

The problem is that any value should be accepted except those in the list of
reserved words.
In a regex it is not easy to negate an expression. There is nothing
like ^(SUM,COMPUTE,DATA).
After a long internet search I found an expression that matches my
needs.
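Here is the Java code:
public class RegexTest {
    public static void main(String[] args) {
        String s2 = "perfOrm";
        // The negative lookahead rejects the listed words; (?i) makes the comparison case-insensitive.
        String regex = "^(?:(?!^(?im:accept|accept-encoding|from|to|perform|sub)$)[\\w-])*$";
        System.out.println(s2 + " matches " + regex + " = " + s2.matches(regex));
    }
}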

The excluded values in this example are arbitrary.
However, this kind of expression (it relies on negative lookahead) is not
supported by Xerces or any other parser.
I found that the XML Schema specification talks about level 1 regex
support.

Does anyone have an idea how to solve this?
Regards
Dick Deneer

Mar 22 '07 #5
di*********@donkeydevelopment.com wrote:
So if anyone has a idea to solve this ??
I think Schema's supported regular expressions can be persuaded to do
it, though the expression may be painfully ugly.

If you aren't happy with that, implement the check in the application
rather than in schema.

Remember, the schema is only an initial sanity check on syntax and
overall structure of the document. It is NOT intended to capture and
check all possible semantic constraints. Some checking will still have
to be implemented in the application.
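
A minimal sketch of such an application-level check in Java, assuming the reserved words are held in a set. The class and method names are hypothetical, the three words are only the examples mentioned in this thread, and the case-insensitive comparison is an assumption about how the field names should be treated.

```java
import java.util.Locale;
import java.util.Set;

class FieldNameCheck {
    // Illustrative subset only; a real list would hold every COBOL reserved word.
    private static final Set<String> RESERVED = Set.of("ACCEPT", "COMPUTE", "SUM");

    // Returns true if the name has 1 to 30 characters and is not a reserved word.
    static boolean isAllowed(String fieldName) {
        if (fieldName.isEmpty() || fieldName.length() > 30) {
            return false;
        }
        // Assumption: names are compared case-insensitively.
        return !RESERVED.contains(fieldName.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("TOTAL-AMOUNT"));
        System.out.println(isAllowed("compute"));
    }
}
```

The schema keeps the length check, and a lookup like this runs on the attribute value after validation.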

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Mar 22 '07 #6
I think Schema's supported regular expressions can be persuaded to do
it, though the expression may be painfully ugly.
I think it is not possible. Please convince me :)

Regards
Dick Deneer
Mar 23 '07 #7
di*********@donkeydevelopment.com writes:
I think it is not possible. Please convince me :)
Regular languages are closed under complementation. So, you can be
sure it is possible: there _is_ a regular expression that matches
everything except a finite set of words. If you want to exclude, e.g.,
"if" and "else", you can go:

([^i]|i[^f]|if.|[^e]|e[^l]|el[^s]|els[^e]|else.).*

(I'm not sure about the regexp syntax for schemas). It may be a real
pain. I don't know if there's an easier way to get the same result.

-- Alain.
Mar 23 '07 #8
On 23 Mar, 13:42, Alain Ketterlin <a...@dpt-info.u-strasbg.fr> wrote:
If you want to exclude, e.g., "if" and "else", you can go:

([^i]|i[^f]|if.|[^e]|e[^l]|el[^s]|els[^e]|else.).*

Alain,

I tested your expression and it always returns true, whatever
(including if and else) I type.
Do I miss something?
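
The observation is easy to reproduce with a small Java check (the class and method names are hypothetical). `String.matches` requires the whole input to match, which is also how a schema pattern is applied. The word "if" is accepted because its first character is not "e", so the `[^e]` alternative matches and `.*` consumes the rest; "else" is accepted the same way through `[^i]`.

```java
import java.util.List;

class FirstAttemptCheck {
    // The expression exactly as quoted above.
    static final String FIRST_ATTEMPT = "([^i]|i[^f]|if.|[^e]|e[^l]|el[^s]|els[^e]|else.).*";

    // True if the whole string matches the expression.
    static boolean accepts(String s) {
        return s.matches(FIRST_ATTEMPT);
    }

    public static void main(String[] args) {
        for (String s : List.of("if", "else", "foo")) {
            System.out.println(s + " -> " + accepts(s));
        }
    }
}
```
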

Mar 23 '07 #9
di*********@donkeydevelopment.com writes:
> ([^i]|i[^f]|if.|[^e]|e[^l]|el[^s]|els[^e]|else.).*
I tested your expression and it always returns true, whatever
(including if and else) I type.
Do I miss something?
I did :-) I went too fast. You have to 1) include trailing chars in the
alternative, 2) group prefixes to exclude, 3) take care of strict
prefixes. Something like:

([^ie].*|i|i[^f].*|if.+|e|e[^l].*|el|el[^s].*|els|els[^e].*|else.+)

May get really hairy with lots of keywords. Be careful with common
prefixes, like "if" and "int":

([^i].*|i|i[^nf].*|if.+|in|in[^t].*|int.+)

I stop here, in fear of writing nonsense. The basic idea is simple:

1) draw a trie (lexicographic tree) containing all the words
2) add one alternative for each path to a non-leaf node that is not
itself one of the words (i, e, el, els)
3) add one alternative for each path out of a node (either
leaf or non-leaf), i.e., a path that starts "in" the tree and "exits"
the tree at some point (i[^f].*, if.+, etc.)

(It basically amounts to inverting the output of a deterministic
finite automaton, i.e. swapping its accepting and non-accepting states.)
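
The three steps above can be sketched in Java as follows. This is an illustration under stated assumptions, not a tested tool: the class and method names are hypothetical, the words are assumed to consist of letters, digits and hyphens only, and, like the hand-written expressions, the result does not match the empty string. Step 2 skips any node that is itself one of the words, so that a word which is a prefix of another word stays excluded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ExclusionPattern {
    // One trie node; children are kept sorted so the output is deterministic.
    private static final class Node {
        final Map<Character, Node> children = new TreeMap<>();
        boolean isWord;
    }

    // Escapes a character for use inside or outside a character class.
    // Assumption: words contain only letters, digits and hyphens.
    private static String esc(char c) {
        return Character.isLetterOrDigit(c) ? String.valueOf(c) : "\\" + c;
    }

    // Builds an expression matching any non-empty single-line string except the given words.
    static String build(List<String> words) {
        Node root = new Node();
        for (String w : words) { // step 1: build the trie
            Node n = root;
            for (char c : w.toCharArray()) {
                n = n.children.computeIfAbsent(c, k -> new Node());
            }
            n.isWord = true;
        }
        List<String> alternatives = new ArrayList<>();
        collect(root, "", alternatives);
        return "(" + String.join("|", alternatives) + ")";
    }

    private static void collect(Node node, String prefix, List<String> out) {
        // step 2: a strict prefix that is not itself an excluded word is accepted as-is
        if (!prefix.isEmpty() && !node.isWord) {
            out.add(prefix);
        }
        // step 3: every way of leaving the trie at this node
        if (node.children.isEmpty()) {
            out.add(prefix + ".+");
        } else {
            StringBuilder cls = new StringBuilder();
            for (char c : node.children.keySet()) {
                cls.append(esc(c));
            }
            out.add(prefix + "[^" + cls + "].*");
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            collect(e.getValue(), prefix + esc(e.getKey()), out);
        }
    }

    public static void main(String[] args) {
        String regex = build(List.of("if", "int"));
        System.out.println(regex);
        for (String s : List.of("if", "int", "i", "in", "iffy", "total")) {
            System.out.println(s + " -> " + s.matches(regex));
        }
    }
}
```

The expression grows with the number of trie nodes, which is why it gets unwieldy for a long keyword list; generating it keeps the tedious part out of hand-editing.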

-- Alain.

P/S: BTW, I just discovered grep --colour... Useful in such cases.
Mar 24 '07 #10
* Alain Ketterlin wrote in comp.text.xml:
Regular languages are closed under complementation. So, you can be
sure it is possible: there _is_ a regular expression that matches
everything except a finite set of words. If you want to exclude, e.g.,
"if" and "else", you can go:

([^i]|i[^f]|if.|[^e]|e[^l]|el[^s]|els[^e]|else.).*

(I'm not sure about the regexp syntax for schemas). It may be a real
pain. I don't know if there's an easier way to get the same result.
You created not(if) or not(else), which matches both if and else. You need
not(if) and not(else), i.e. the intersection of two regular expressions.
I suppose there is a painful way in XML Schema to specify multiple regular
expressions a string must match, and inverting a single word is simple
(abc -> not(a) .* or a not(b) .* or ab not(c) .* or abc .+). It would be
better to compute the intersection of the regular expressions. There may
be finite state automata tools that support that. I am about to release
a tool that can do it as well.
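
A rough Java sketch of this per-word approach: build not(word) for each excluded word and require the value to match all of them. The class and method names are hypothetical, words are assumed to contain only letters and digits, and the strict prefixes ("a", "ab") are added as extra alternatives because the formula above leaves them out. As far as I know, XML Schema 1.0 treats several pattern facets in one restriction as alternatives, while patterns inherited through successive restriction steps must all hold, so in a schema each not(word) would need its own derivation step.

```java
import java.util.ArrayList;
import java.util.List;

class PerWordExclusion {
    // Builds not(word): matches any non-empty single-line string other than the word itself.
    // Assumption: the word contains only letters and digits, so no escaping is needed.
    static String not(String word) {
        List<String> alts = new ArrayList<>();
        for (int i = 0; i < word.length(); i++) {
            String prefix = word.substring(0, i);
            if (i > 0) {
                alts.add(prefix);                              // strict prefix, e.g. "ab" for "abc"
            }
            alts.add(prefix + "[^" + word.charAt(i) + "].*");  // differs from the word at position i
        }
        alts.add(word + ".+");                                 // the word followed by more text
        return "(" + String.join("|", alts) + ")";
    }

    // Intersection: the value must match not(w) for every excluded word w.
    static boolean isAllowed(String value, List<String> excluded) {
        return excluded.stream().allMatch(w -> value.matches(not(w)));
    }

    public static void main(String[] args) {
        List<String> excluded = List.of("if", "else");
        System.out.println(not("abc"));
        for (String s : List.of("if", "else", "i", "ifs", "elsewhere")) {
            System.out.println(s + " -> " + isAllowed(s, excluded));
        }
    }
}
```
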
--
Björn Höhrmann · mailto:bj****@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Mar 24 '07 #11
On 24 Mar, 08:24, Alain Ketterlin <a...@dpt-info.u-strasbg.fr> wrote:
May get really hairy with lots of keywords.

Alain (and Bjoern),

I am convinced.
It is possible, but indeed very painful if your list of reserved words
is big, which is the case for me.
Thanks a lot,
Dick Deneer

Mar 24 '07 #12