473,779 Members | 2,015 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Hierarchical query

Reposting with more clarification (as Jan asked).

Suppose I have a BNFgrammar and a source text parsed into a tree. How
would I query an
identifier declaration?

All the XQuery tutorials (where I went to gather some ideas) start
with simpleminded examples like browsing all the descendants of /
bookstore/book (and the bookstore XML design is such wrong idea to
boot).

Perhaps some example is needed. A simplified grammar:

statement_block :
'declare'
declaration_ite m_list
'begin'
statements
'end'
;

statements:
(statement_bloc k | assignment) statements |
;

assignment:
identifier ':=' (identifier | number) ';'
;

declaration_ite m_list:
identifier 'integer' ';'
;

Suppose we parse the following text

declare -- token #1
i integer; -- tokens 2,3,4
begin
i := 1; -- tokens 6,7,8,9
end;

So that we get the derivation tree like this:

1 statement_block
1.1 'declare' (matches token #1)
1.2 declaration_ite m_list
1.2.1 identifier (matches token #2)
1.2.2 'integer' (matches token #3)
1.2.3 ';' (matches token #4)
1.3 'begin' (matches token #5)
1.4 statements
1.4.1 assignment
1.4.1.1 identifier
1.4.1.2 ':='
1.4.1.3 number
1.4.1.4 ';'
1.4.2 statements (matches empty token)
1.5 'end';
Now, given a derivation tree and the leaf node identifier 1.4.1.1
corresponding to the token #6 which is variable i, how do we find the
node 1.2.1 that declares it?

Jun 13 '07 #1
30 2369
On 13 jun, 18:11, Vadim Tropashko <vadimtro_inva. ..@yahoo.comwro te:
Reposting with more clarification (as Jan asked).

Suppose I have a BNFgrammar and a source text parsed into a tree. How
would I query an
identifier declaration?

All the XQuery tutorials (where I went to gather some ideas) start
with simpleminded examples like browsing all the descendants of /
bookstore/book (and the bookstore XML design is such wrong idea to
boot).

Perhaps some example is needed. A simplified grammar:

statement_block :
'declare'
declaration_ite m_list
'begin'
statements
'end'
;

statements:
(statement_bloc k | assignment) statements |
;

assignment:
identifier ':=' (identifier | number) ';'
;

declaration_ite m_list:
identifier 'integer' ';'
;

Suppose we parse the following text

declare -- token #1
i integer; -- tokens 2,3,4
begin
i := 1; -- tokens 6,7,8,9
end;

So that we get the derivation tree like this:

1 statement_block
1.1 'declare' (matches token #1)
1.2 declaration_ite m_list
1.2.1 identifier (matches token #2)
1.2.2 'integer' (matches token #3)
1.2.3 ';' (matches token #4)
1.3 'begin' (matches token #5)
1.4 statements
1.4.1 assignment
1.4.1.1 identifier
1.4.1.2 ':='
1.4.1.3 number
1.4.1.4 ';'
1.4.2 statements (matches empty token)
1.5 'end';

Now, given a derivation tree and the leaf node identifier 1.4.1.1
corresponding to the token #6 which is variable i, how do we find the
node 1.2.1 that declares it?
Ok. I'm going to assume the following DTD (in a notation of my own
making to make it a bit more readable) for the syntax tree. (It uses
no attributes to keep things simple):

<stat_bl--<decl_kw<decl_i tem_list<begin_ kw<statements>
<semicol_kw<e nd-kw>
<decl_kw--EMPTY
<begin_kw--EMPTY
<end_kw--EMPTY
<semicol_kw--EMPTY
<statements--( <stat_bl| <assignment)+
<assignment--<var<assign_k w( <var| <number) <semicol_kw>
<var--PCDATA
<assign_kw--EMPTY
<number--PCDATA
<decl_item_li st--( <var<type<semic ol_kw)+
<type--PCDATA

For starters I'll first do the reverse query, so I will assume there
is a variable $dvar that contains a <varelement that describes a
variable in a declaration. The XPath expression that walks to all the
<varnodes in an assignment that are in the scope of $dvar is as
follows:

(1) $dvar/(
(2) ../../statements//assignment/var[string() = $dvar/string()]
(3) minus
(4) ../../statements//stat_bl[decl_item_list/var/string() = $dvar/
string()]/statements//var
(5) )

The idea is quite simple: the path expression in line (2) walks to all
variables that are nested within the statement block of the
declaration $dvar, and the path expression in line (4) subtracts those
for which there is another intermediate declaration that supersedes
the one of $dvar.

Now the query that you asked for, which in some sense simply the
reverse. We assume now that $uvar contains the <varelement that is
used in an assignment and the path expression walks to the <var>
elements in a declaration that binds the <varelement in $uvar:

(1) $uvar/(
(2) ancestor::stat_ bl/decl_item_list/var[string() = $uvar/
string()]
(3) minus
(4) ancestor::stat_ bl[decl_item_list/var/string() = $uvar/
string()]/ancestor::stat_ bl/decl_item/var
(5) )

Of course there are many many different ways of expressing these
queries, but these are the most compact ones I could think of. Note
that I actually didn't have to take into account the order of the
declarations, just how they are nested. But this is because of the way
you defined the syntax and scoping rules. However, if you would extend
the syntax and the scoping rules such that order also maters then
extending the queries to take that into account should not be a big
problem.

-- Jan Hidders

Jun 13 '07 #2
On Jun 13, 9:19 am, Jan Hidders <hidd...@gmail. comwrote:
Ok. I'm going to assume the following DTD (in a notation of my own
making to make it a bit more readable) for the syntax tree. (It uses
no attributes to keep things simple):

<stat_bl--<decl_kw<decl_i tem_list<begin_ kw<statements>
<semicol_kw<e nd-kw>
<decl_kw--EMPTY
<begin_kw--EMPTY
<end_kw--EMPTY
<semicol_kw--EMPTY
<statements--( <stat_bl| <assignment)+
<assignment--<var<assign_k w( <var| <number) <semicol_kw>
<var--PCDATA
<assign_kw--EMPTY
<number--PCDATA
<decl_item_li st--( <var<type<semic ol_kw)+
<type--PCDATA
All right, the DTD is a bastardized [context free?] grammar describing
a language that XML document is an element of. Although the adjectives
"cumbersome " and "ugly" still apply, I grudgingly admit that the idea
that a grammar fits into DTD effortlessly is quite powerful (so I'm
removing the "XML sucks" image from my homepage:-)
For starters I'll first do the reverse query,
so I will assume there
is a variable $dvar that contains a <varelement that describes a
variable in a declaration. The XPath expression that walks to all the
<varnodes in an assignment that are in the scope of $dvar is as
follows:

(1) $dvar/(
(2) ../../statements//assignment/var[string() = $dvar/string()]
(3) minus
(4) ../../statements//stat_bl[decl_item_list/var/string() = $dvar/
string()]/statements//var
(5) )

The idea is quite simple: the path expression in line (2) walks to all
variables that are nested within the statement block of the
declaration $dvar, ...
I have trouble comprehending this line

(2) ../../statements//assignment/var[string() = $dvar/string()]

should I read it left to right? Then the ".." selects the parent of
the current node, and what is the current node? Or do we start from
some node that mathes "statements ", then go two levels up? The
statement block <stat_blis only one level up!
Jun 13 '07 #3
On Jun 13, 9:19 am, Jan Hidders <hidd...@gmail. comwrote:
Ok. I'm going to assume the following DTD (in a notation of my own
making to make it a bit more readable) for the syntax tree. (It uses
no attributes to keep things simple):

<stat_bl--<decl_kw<decl_i tem_list<begin_ kw<statements>
<semicol_kw<e nd-kw>
<decl_kw--EMPTY
<begin_kw--EMPTY
<end_kw--EMPTY
<semicol_kw--EMPTY
<statements--( <stat_bl| <assignment)+
<assignment--<var<assign_k w( <var| <number) <semicol_kw>
<var--PCDATA
<assign_kw--EMPTY
<number--PCDATA
<decl_item_li st--( <var<type<semic ol_kw)+
<type--PCDATA
All right, the DTD is a bastardized [context free?] grammar describing
a language that XML document is an element of. Although the adjectives
"cumbersome " and "ugly" still apply, I grudgingly admit that the idea
that a grammar fits into DTD effortlessly is quite powerful (so I'm
removing the "XML sucks" image from my homepage:-)
For starters I'll first do the reverse query,
so I will assume there
is a variable $dvar that contains a <varelement that describes a
variable in a declaration. The XPath expression that walks to all the
<varnodes in an assignment that are in the scope of $dvar is as
follows:

(1) $dvar/(
(2) ../../statements//assignment/var[string() = $dvar/string()]
(3) minus
(4) ../../statements//stat_bl[decl_item_list/var/string() = $dvar/
string()]/statements//var
(5) )

The idea is quite simple: the path expression in line (2) walks to all
variables that are nested within the statement block of the
declaration $dvar, ...
I have trouble comprehending this line

(2) ../../statements//assignment/var[string() = $dvar/string()]

should I read it left to right? Then the ".." selects the parent of
the current node, and what is the current node? Or do we start from
some node that mathes "statements ", then go two levels up? The
statement block <stat_blis only one level up!

Jun 13 '07 #4
Vadim Tropashko wrote:
I have trouble comprehending this line

(2) ../../statements//assignment/var[string() = $dvar/string()]

should I read it left to right?

Find a good tutorial on XPath. Yes, it evaluates left to right. From the
context node, find its parent; find the parent of that; find its
<statementschil dren; find their <assignmentdesc endants (at any
depth), find their <varchildren whose string value is the same as ...
hmmm. actually I'm not sure what $dvar/string() was intended to mean,
and I don't have time right now to check that.
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Jun 13 '07 #5
On Jun 13, 9:11 am, Vadim Tropashko <vadimtro_inva. ..@yahoo.com>
wrote:
1 statement_block
1.1 'declare' (matches token #1)
1.2 declaration_ite m_list
1.2.1 identifier (matches token #2)
1.2.2 'integer' (matches token #3)
1.2.3 ';' (matches token #4)
1.3 'begin' (matches token #5)
1.4 statements
1.4.1 assignment
1.4.1.1 identifier
1.4.1.2 ':='
1.4.1.3 number
1.4.1.4 ';'
1.4.2 statements (matches empty token)
1.5 'end';
Actually, here is quite simple method. Write down the parse tree as a
set of paths, using grammar varibles and constants (terminals and
nonterminals) as path elements. In the example above we get:

statement_block
statement_block .'declare'
statement_block .declaration_it em_list
statement_block .declaration_it em_list.identif ier"i"
statement_block .declaration_it em_list.'intege r'
.... and so on ...

Query this set with regular expressions.

XML, good bye!

Jun 13 '07 #6
On 13 jun, 21:10, Vadim Tropashko <vadimtro_inva. ..@yahoo.comwro te:
On Jun 13, 9:19 am, Jan Hidders <hidd...@gmail. comwrote:
Ok. I'm going to assume the following DTD (in a notation of my own
making to make it a bit more readable) for the syntax tree. (It uses
no attributes to keep things simple):
<stat_bl--<decl_kw<decl_i tem_list<begin_ kw<statements>
<semicol_kw<e nd-kw>
<decl_kw--EMPTY
<begin_kw--EMPTY
<end_kw--EMPTY
<semicol_kw--EMPTY
<statements--( <stat_bl| <assignment)+
<assignment--<var<assign_k w( <var| <number) <semicol_kw>
<var--PCDATA
<assign_kw--EMPTY
<number--PCDATA
<decl_item_li st--( <var<type<semic ol_kw)+
<type--PCDATA

All right, the DTD is a bastardized [context free?] grammar describing
a language that XML document is an element of. Although the adjectives
"cumbersome " and "ugly" still apply, I grudgingly admit that the idea
that a grammar fits into DTD effortlessly is quite powerful (so I'm
removing the "XML sucks" image from my homepage:-)
Of course, it combines ugliness with great power. Welcome to the Dark
Side. :-)
For starters I'll first do the reverse query,
so I will assume there
is a variable $dvar that contains a <varelement that describes a
variable in a declaration. The XPath expression that walks to all the
<varnodes in an assignment that are in the scope of $dvar is as
follows:
(1) $dvar/(
(2) ../../statements//assignment/var[string() = $dvar/string()]
(3) minus
(4) ../../statements//stat_bl[decl_item_list/var/string() = $dvar/
string()]/statements//var
(5) )
The idea is quite simple: the path expression in line (2) walks to all
variables that are nested within the statement block of the
declaration $dvar, ...

I have trouble comprehending this line

(2) ../../statements//assignment/var[string() = $dvar/string()]

should I read it left to right? Then the ".." selects the parent of
the current node, and what is the current node?
First a small correction, the 'minus' I used should actualy be
'except', which has the semantics of the set minus.

What is the current node? Note that the global structure is p_1/(p_2
except p_3) where (p_2 except p_3) means that from the context node
you evaluate p_2 which gives you a set (or rather sequence) of nodes
and from that you subtract the set of nodes that you get when you
evaluate p_3 starting from the context node. So in this case the
"current node" that you ask about is for both subexpressions the same,
namely the node in $dvar.

Btw., if you forget about sequence order, document order and value
comparisons for a moment, the core of XPath is actually a relatively
neat calculus of binary relations:
- p_1/p_2 is the concatenation of binary relations
- p_1[p_2] is the selection of pairs in p_2 whose right-hand side
matches a left-hand side in p_2 (semijoin anyone?)
- p_1 union p_2 is the set union of two binary relations
- p_1 intersect p_2 is the set intersection of two binary relations
- p_1 except p_2 is the set difference between binary relations

Did I mention Tarski already? :-) Of course we had to wait until
XPath2.0 until the three set operations were all allowed everywhere,
but we have them now.

-- Jan Hidders

Jun 13 '07 #7
On 13 jun, 21:37, Joe Kesselman <keshlam-nos...@comcast. netwrote:
Vadim Tropashko wrote:
I have trouble comprehending this line
(2) ../../statements//assignment/var[string() = $dvar/string()]
should I read it left to right?

Find a good tutorial on XPath. Yes, it evaluates left to right. From the
context node, find its parent; find the parent of that; find its
<statementschil dren; find their <assignmentdesc endants (at any
depth), find their <varchildren whose string value is the same as ...
hmmm. actually I'm not sure what $dvar/string() was intended to mean,
and I don't have time right now to check that.
It retrieves the text node children of the nodes in $dvar. Actually it
is unnecessary, in this case I might just have well have written the
predicate as [. = $dvar].

-- Jan Hidders

Jun 13 '07 #8
On 13 jun, 22:58, Vadim Tropashko <vadimtro_inva. ..@yahoo.comwro te:
On Jun 13, 9:11 am, Vadim Tropashko <vadimtro_inva. ..@yahoo.com>
wrote:
1 statement_block
1.1 'declare' (matches token #1)
1.2 declaration_ite m_list
1.2.1 identifier (matches token #2)
1.2.2 'integer' (matches token #3)
1.2.3 ';' (matches token #4)
1.3 'begin' (matches token #5)
1.4 statements
1.4.1 assignment
1.4.1.1 identifier
1.4.1.2 ':='
1.4.1.3 number
1.4.1.4 ';'
1.4.2 statements (matches empty token)
1.5 'end';

Actually, here is quite simple method. Write down the parse tree as a
set of paths, using grammar varibles and constants (terminals and
nonterminals) as path elements. In the example above we get:

statement_block
statement_block .'declare'
statement_block .declaration_it em_list
statement_block .declaration_it em_list.identif ier"i"
statement_block .declaration_it em_list.'intege r'
... and so on ...

Query this set with regular expressions.

XML, good bye!
Not so fast. An enumeration of the paths in a tree does not always
give you enough information to reconstruct the tree, so you would be
losing expressive power. Never mind that you ignore the order, which
may be a factor for the scope of declarations.

-- Jan Hidders

Jun 13 '07 #9
On Jun 13, 2:44 pm, Jan Hidders <hidd...@gmail. comwrote:
On 13 jun, 22:58, Vadim Tropashko <vadimtro_inva. ..@yahoo.comwro te:


On Jun 13, 9:11 am, Vadim Tropashko <vadimtro_inva. ..@yahoo.com>
wrote:
1 statement_block
1.1 'declare' (matches token #1)
1.2 declaration_ite m_list
1.2.1 identifier (matches token #2)
1.2.2 'integer' (matches token #3)
1.2.3 ';' (matches token #4)
1.3 'begin' (matches token #5)
1.4 statements
1.4.1 assignment
1.4.1.1 identifier
1.4.1.2 ':='
1.4.1.3 number
1.4.1.4 ';'
1.4.2 statements (matches empty token)
1.5 'end';
Actually, here is quite simple method. Write down the parse tree as a
set of paths, using grammar varibles and constants (terminals and
nonterminals) as path elements. In the example above we get:
statement_block
statement_block .'declare'
statement_block .declaration_it em_list
statement_block .declaration_it em_list.identif ier"i"
statement_block .declaration_it em_list.'intege r'
... and so on ...
Query this set with regular expressions.
XML, good bye!

Not so fast. An enumeration of the paths in a tree does not always
give you enough information to reconstruct the tree, so you would be
losing expressive power. Never mind that you ignore the order, which
may be a factor for the scope of declarations.
Derivation tree was the confusing part. When we query nodes on a tree
what exactly are we doing? Then, your reply was a critical for me
understanding that the tree structure is unnecessary, the derivation
is essentially a language -- a set of words (which includes both
terminals and nonterminals) and this set of words can be quieried
solely with the language theory means. Formally, a query is a language
intersection.

The order on the tree comes from the order in the Kleene algebra, and
of course the information about the order is still there in the
grammar rules (or should I say inequalities?).

The "reqular expression" exclamation in my previous is
oversimplificat ion, of course.

Jun 13 '07 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
4121
by: Mike N. | last post by:
Hello to all: First let me apologize for the length of this question, I've made an attempt to include as much information as is needed to help with the question. I am having problems putting together a query to pull out an alternative hierarchical view of my data. The database is implemented under SQL Sever 2000 and I am writing the front end using VB.Net and ADO.net. The following is the portion of my database structure that I am...
3
5590
by: Cláudia Morgado | last post by:
Hello! Oracle has the option with the SQL CONECT BY statement to run through a hierarchical database with a single SQl-statement: <!--SQL SELECT ms_id,ms_parent FROM messages CONNECT BY PRIOR ms_id = ms_parent START WITH ms_id = 1 --> Result-set (example): ms_id parent_id 1 1.1 1 1.1.1 1.1 1.1.2 1.1 1.1.3 1.1 1.2 1 1.2.1 1.2
13
5907
by: Anton.Nikiforov | last post by:
Hello everybody! Does someone know how to build hierarchical queries to the postgresql? I have a table with tree in it (id, parent) and need to find a way from any point of the tree to any other point. And i would like to have a list of all steps from point A to point B to make some changes on each step (this is required by the algorythm). Here is an example:
1
2620
by: News Server | last post by:
Any help appreciated. I have two Access tables, customer and orders. I would like to create a truly hierarchical xml file form the joined tables. I need to produce: <Query1> <Customer> <Name>Bob</Name> <Order> <ID>1001</ID>
4
2610
by: Daisy | last post by:
Let's say I've got a forum, where users can be moderators of each forum. Tables look like this: USER -------- user_key name FORUM
5
2257
by: clintonG | last post by:
I'm looking for documentation and would not turn my nose up to any code from anybody who thinks they are good at the design of an algorythm that can be used to generated a hierarchical relational data model. What? A Yahoo-like drill-down menu that is a series of categories and nested categories is a hierarchical relational data model. An example can be seen at but the review of the query string values strongly indicates
3
2044
by: Bennett Haselton | last post by:
I want to display a hierarchical listing of items from a database table, where, say, each row in the table has an "ID" field and a "parent_id" field giving the ID of its parent (NULL if it's at the top level of the hierarchy) -- like message posts and their replies. Is there a built-in way to do this, or a generally accepted simplest way? My first idea was to create a user control like HierarchicalListing that contains a Repeater, and...
1
1868
by: rose74 | last post by:
hi i'd like to know how can i merge two hierarchical trees together. for instnace to query for each deparatment all its employees with considering to the employees hierarchicy (managers-employees).
12
5823
by: Steve | last post by:
I have been studying the Adjacency List Model as a means of achieving a folder structure in a project I am working on. Started with the excellent article by Gijs Van Tulder http://www.sitepoint.com/article/hierarchical-data-database My database has this basic structure: Id FolderName
0
9632
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
10302
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10136
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10071
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
9925
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
8958
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7478
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5372
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
3
2867
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.