XQuerying material between elements

I am working with marking up the text of old books,
and need to be able to present the result page-wise.
The problem is that a page break sometimes occurs in the
middle of a paragraph (or in some other element).
See the following example.

<p>I shall not describe it to you, for in-
<lb/>deed I cannot. To delineate the truly aw-
<lb/>ful locality of Trollhättan, would
<lb/>baffle the powers of poetic fancy, and mock
<pb n="15" urn="urn:nbn:se:kb:digark-7886"/>
<lb/>the painter's daring pencil. I ran only af-
<lb/>ford you a faint idea of its characteristic
<lb/>features, and even that will he found
<lb/>arduous. Come, and see it, and you will
<lb/>applaud my modesty.
</p>

<p>[...]
<lb/>of gold." Subscribing to the old Swedish
<lb/>proverb: When it rains down milk, the poor
<lb/>has no spoon," I silently dropped the theme,
<lb/>and would not have rementioned it now,
<pb n="16" urn="urn:nbn:se:kb:digark-7887"/>
<lb/>if I were not anxious to dis-play to you, what
<lb/>an able minister of state I might possibly
<lb/>be, if His Majesty should be pleased to
<lb/>invest me with that honor, which, you
<lb/>know, is as distant from me as the mitre
<lb/>and the slipper of the Pope of Rome.
</p>

Just separating out the material in between the <pb/>'s
gives non-wellformed XML.

So, is it possible to write an XQuery expression that
can fix this, i.e. 'detect' that the <pb/> occurs in
the middle of another element and take the appropriate
action? The result would have to look something like

<pb n="15" urn="urn:nbn:se:kb:digark-7886"/>
<p rend="noindent">the painter's daring pencil. I ran only af-
<lb/>ford you a faint idea of its characteristic
<lb/>features, and even that will he found
<lb/>arduous. Come, and see it, and you will
<lb/>applaud my modesty.
</p>

<p>[...]
<lb/>of gold." Subscribing to the old Swedish
<lb/>proverb: When it rains down milk, the poor
<lb/>has no spoon," I silently dropped the theme,
<lb/>and would not have rementioned it now,
</p>

Thanks.

Apr 25 '07 #1

I'm sure XQuery can do it, though I'm not sure of the syntax offhand.

In XPath, I would set up a template that matches on p[pb] (a paragraph
that contains a page break) and rewrites it appropriately by first
outputting a p containing the pb's preceding siblings, then the pb,
then a p containing the following siblings. Very straightforward.
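
For what it's worth, the same split can be sketched in XQuery. This is only an illustration and has not been run against eXist: the inline sample paragraph and the variable names $p and $pb are made up, and it assumes a single <pb/> that is a direct child of the paragraph.

```xquery
xquery version "1.0";

(: Hypothetical sample input; in practice $p would be selected from the document. :)
let $p :=
  <p>baffle the powers of poetic fancy, and mock
    <pb n="15"/>
    <lb/>the painter's daring pencil.
  </p>
let $pb := $p/pb[1]
return
  if (empty($pb)) then
    (: no page break inside: keep the paragraph unchanged :)
    $p
  else (
    (: everything in the paragraph before the page break :)
    <p>{ $pb/preceding-sibling::node() }</p>,
    (: the page break itself, now sitting between the two halves :)
    $pb,
    (: everything after it, flagged as a continuation :)
    <p rend="noindent">{ $pb/following-sibling::node() }</p>
  )
```

Attributes on the original p are not carried over here; adding $p/@* inside each constructor would copy them.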

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Apr 25 '07 #2

Joseph Kesselman <ke************@comcast.net> wrote in
<462f6975$1@kcnews01>:
>> So, is it possible to write an XQuery expression that
>> can fix this, i.e. 'detect' that the <pb/> occurs in
>> the middle of another element and take the appropriate
>> action? The result would have to look something like
>
> I'm sure XQuery can do it, though I'm not sure of the
> syntax offhand.
>
> In XPath, I would set up a template that matches on p[pb]
> (a paragraph that contains a page break) and rewrites it
> appropriately by first outputting a p containing the pb's
> preceding siblings, then the pb, then a p containing the
> following siblings. Very straightforward.

XSLT does indeed seem like a better bet than XQuery in this
case, but if you try to generalize the problem a bit
(multiple page breaks and more than one level of ancestor
elements to be spliced) it gets kinda messy with XSLT1. On
the other hand, an XSLT2 solution would be fairly elegant
thanks to sequences--may FSM touch with his noodly
appendage whoever on XSLT WG came up with those.
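
To make the generalized case concrete, here is a rough sketch in XQuery (the language the original question asks about) rather than XSLT2. It is an untested illustration: the function name local:slice, its parameters, and the inline sample are all hypothetical. It rebuilds every element that straddles one of the two page breaks, copies whatever lies wholly between them, and drops the rest, so it does not care how deeply the breaks are nested.

```xquery
xquery version "1.0";

(: Hypothetical helper: return the part of $n lying between $start and $end. :)
declare function local:slice($n as node(), $start as node(), $end as node())
  as node()*
{
  typeswitch ($n)
    case element() return
      if ($n//node()[. is $start or . is $end]) then
        (: the element contains a boundary: rebuild it and recurse :)
        element { node-name($n) } {
          $n/@*,
          for $c in $n/node()
          return local:slice($c, $start, $end)
        }
      else if ($n >> $start and $n << $end) then
        (: wholly inside the range: copy as-is :)
        $n
      else
        (: wholly outside the range, or one of the boundaries: drop :)
        ()
    default return
      (: text, comments, etc.: keep only what lies between the boundaries :)
      if ($n >> $start and $n << $end) then $n else ()
};

(: Made-up sample with two page breaks in different paragraphs. :)
let $doc :=
  <body>
    <p>and mock<pb n="15"/>the painter's daring pencil.</p>
    <p>I silently dropped the theme,<pb n="16"/>if I were not anxious</p>
  </body>
return
  local:slice($doc, $doc//pb[@n = "15"], $doc//pb[@n = "16"])
```

For the sample this should leave a body holding the tail of the first paragraph and the head of the second. A paragraph whose last child is the starting page break would come out empty and might need filtering afterwards.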

--
Pavel Lepin
Apr 25 '07 #3

Joseph Kesselman wrote:
> In XPath,

Meant to write XSLT, obviously. Sigh. Engage mind, THEN put fingers in
gear...

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Apr 25 '07 #4

Thanks for the replies. I forgot to mention that the texts
are stored in the eXist database, hence the need for XQuery.
What I've managed to come up with is this.

1 <hit>
2 (: Check if the initial <pb> is the child of another element,
3    and print the name of that element. :)
4 {
5   let $i1 := //pb[@urn='urn:nbn:se:kb:digark-7886']
6   return
7     if ($i1[parent::p]) then
8       '<p rend="noindent">'
9     else
10      if ($i1[parent::lg]) then
11        '<lg>'
12      else ()
13 }
14 (: Print the material between the pagebreaks. :)
15 {
16   let $i1 := //pb[@urn='urn:nbn:se:kb:digark-7886'],
17       $i2 := //pb[@urn='urn:nbn:se:kb:digark-7887']
18   for $n in //text()
19   where $n >> $i1 and $n << $i2
20   return $n
21 }
22 (: Check if the final <pb> is the child of another element,
23    and print the name of that element. :)
24 {
25   let $i2 := //pb[@urn='urn:nbn:se:kb:digark-7887']
26   return
27     if ($i2[parent::p]) then
28       '</p>'
29     else
30       if ($i2[parent::lg]) then
31         '</lg>'
32       else ()
33 }
34 </hit>

This works fine, except of course for the 'text()' on line 18.
This outputs only the text, not the text and markup, which is what I
want.
Switching 'text()' for 'node()' or 'element()' doesn't give the
desired result either, naturally.

Any suggestions are welcome. Thanks.
--
Patrik Nyman

Apr 26 '07 #5

pa**********@orient.su.se wrote:

For my curiosity, is:
> 1 <hit>
> 2 (: Check if the initial <pb> is the child of another element,
> 3    and print the name of that element. :)
> 4 {
> 5   let $i1 := //pb[@urn='urn:nbn:se:kb:digark-7886']
> 6   return
> 7     if ($i1[parent::p]) then
this:
> 8       '<p rend="noindent">'
> 9     else
> 10      if ($i1[parent::lg]) then
this:
> 11        '<lg>'
> 12      else ()
> 13 }
> 14 (: Print the material between the pagebreaks. :)
> 15 {
> 16   let $i1 := //pb[@urn='urn:nbn:se:kb:digark-7886'],
> 17       $i2 := //pb[@urn='urn:nbn:se:kb:digark-7887']
> 18   for $n in //text()
> 19   where $n >> $i1 and $n << $i2
> 20   return $n
> 21 }
> 22 (: Check if the final <pb> is the child of another element,
> 23    and print the name of that element. :)
> 24 {
> 25   let $i2 := //pb[@urn='urn:nbn:se:kb:digark-7887']
> 26   return
> 27     if ($i2[parent::p]) then
this:
> 28       '</p>'
> 29     else
> 30       if ($i2[parent::lg]) then
and this:
> 31         '</lg>'
> 32       else ()
> 33 }
> 34 </hit>
... supposed to be mark-up in the resulting sequence?

p.b.
Apr 26 '07 #6

Pierrick Brihaye wrote:
> For my curiosity, is:
> this:
>> 8 '<p rend="noindent">'
> ... supposed to be mark-up in the resulting sequence?

I certainly hope not, because if so I'd consider it an abuse of XQuery,
akin to trying to hand-construct tags in XSLT.

If the goal is to construct document structure, construct structure, not
text that looks like structure.
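
As a small illustration of the difference, a computed element constructor can take the element name from the page break's parent, so nothing has to be emitted as a string. This is a sketch with a made-up inline sample and hypothetical variable names ($lg, $pb), not a tested eXist query.

```xquery
xquery version "1.0";

(: Hypothetical sample: a line group interrupted by a page break. :)
let $lg := <lg><l>first line</l><pb n="15"/><l>second line</l></lg>
let $pb := $lg/pb
return
  (: Build a real element named after the page break's parent (p, lg, ...)
     instead of emitting the strings '<lg>' and '</lg>'. :)
  element { node-name($pb/..) } {
    attribute rend { "noindent" },
    $pb/following-sibling::node()
  }
```

Because the result is a node rather than text, it is well-formed by construction and can be queried or serialized like any other element.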
--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Apr 26 '07 #7

Hi,

How about something like this:

let $i1 := //pb[@urn='urn:nbn:se:kb:digark-7886'],
    $i2 := //pb[@urn='urn:nbn:se:kb:digark-7887']
return <hit>{
  if ($i1[parent::p])
  then <p rend="noindent">{$i1/following-sibling::node()}</p>
  else ()
  ,
  for $n in //p
  where $n >> $i1 and $n << $i2
    and not($n/*[. is $i1]) and not($n/*[. is $i2])
  return $n
  ,
  if ($i2[parent::p])
  then <p>{$i2/preceding-sibling::node()}</p>
  else ()
}</hit>

Hope that helps,
Priscilla

---------------------------------------------
Priscilla Walmsley
Author, XQuery (2007, O'Reilly Media)
http://www.datypic.com
http://www.xqueryfunctions.com
---------------------------------------------

Apr 27 '07 #8

On 27 Apr, 19:13, Priscilla Walmsley <nos...@datypic.com> wrote:
> Hi,
>
> How about something like this:

Thanks a lot for this. I cannot test it until wednesday, but then I'll
let you know.

/Patrik Nyman

Apr 29 '07 #9

On 27 Apr, 19:13, Priscilla Walmsley <nos...@datypic.com> wrote:
> Hi,
>
> How about something like this:

Yes, it works, and is much better than my version!
Thanks a lot,
Patrik
May 3 '07 #10
