Split element with another one

Hi,

I'm creating XML files from printed documents. According to the DTD I
have to use, there have to be pagebreaks in the XML file. These
pagebreaks must be located wherever a new page in the printed version
occurs. This is fairly simple to accomplish.
The problem, however, is that the DTD states that a pagebreak cannot occur
inside a paragraph element, but must be between them.
Is it possible, using XSLT, to end the paragraph element before the
pagebreak, and start a new one after it?

To illustrate:

Illegal text block:
<para>Blah blah
<pagebreak/>
more blah blah</para>

Must become:
<para>Blah blah</para>
<pagebreak/>
<para>more blah blah</para>

I'm grateful for any help!

regards,
Eivind Andersen

Jul 20 '05 #1
XSLT can do essentially arbitrary tree transformations, so the answer is
yes, but in this case the transformation may be more or less hard
depending on where pagebreak can be. Do you know that it's at the top level
of para (this makes it fairly easy), or can it be nested anywhere, as in:

<para>Blah blah <italic> xxx <bold> zzz</bold>
<pagebreak/> yyy</italic>
more blah blah</para>

In the latter case things are "interesting", as you have to close an
arbitrary number of elements, and things get more interesting if
the pagebreak appears in table markup and you have to correctly close all
the elements and open up everything needed for a new table...

Assuming the simple case, this is a grouping problem: you just want to
group all children of para depending on their position relative to
pagebreak. Searching for "xslt grouping" on Google will show lots of
possibilities,

e.g.

<!-- A para without a pagebreak child is copied unchanged. -->
<xsl:template match="para">
  <xsl:copy-of select="."/>
</xsl:template>

<!-- A para with pagebreak children is split at each pagebreak. -->
<xsl:template match="para[pagebreak]">
  <!-- everything before the first pagebreak -->
  <para>
    <xsl:copy-of select="@*|pagebreak[1]/preceding-sibling::node()"/>
  </para>
  <xsl:for-each select="pagebreak">
    <xsl:copy-of select="."/>
    <para>
      <xsl:copy-of select="../@*"/><!-- re-copy attributes, you might not want that -->
      <xsl:apply-templates select="following-sibling::node()[1]" mode="p"/>
    </para>
  </xsl:for-each>
</xsl:template>

<!-- Mode "p": copy a node, then walk on to its next sibling. -->
<xsl:template match="node()" mode="p">
  <xsl:copy-of select="."/>
  <xsl:apply-templates select="following-sibling::node()[1]" mode="p"/>
</xsl:template>

<!-- Stop walking at the next pagebreak. -->
<xsl:template match="pagebreak" mode="p"/>
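
As a rough illustration of what these templates do, here is a made-up input para and the result the para[pagebreak] template should give for it. The id attribute and the text are invented for the example, and the line breaks in the output are added for readability (a processor will not emit them unless indenting is switched on).

```xml
<!-- Hypothetical input -->
<para id="p1">one<pagebreak/>two<pagebreak/>three</para>

<!-- Result of the para[pagebreak] template for that input -->
<para id="p1">one</para>
<pagebreak/>
<para id="p1">two</para>
<pagebreak/>
<para id="p1">three</para>
```

Note that a pagebreak that is the first or last child of a para, or two pagebreaks in a row, will leave an empty para in the output, which may need filtering out afterwards if the DTD does not allow it.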

David
Jul 20 '05 #2
Wow! Thank you!

Fortunately the pagebreaks won't occur inside a table, but it's possible
to have one inside an <italic> or <bold> element.

I haven't been able to test this code yet, but I'll get on it first thing
Monday morning, and I'll report back a little bit later :).

Again, thank you for the incredibly quick and helpful reply!

Eivind

Jul 20 '05 #3
> Fortunately the pagebreaks won't occur inside a table, but it's possible
> to have one inside an <italic> or <bold> element.


It's really a lot harder if that can happen.
The general case, where you have to close an arbitrary number of elements,
would need a completely different approach: essentially walking over
the whole tree one node at a time, building up a data structure of
currently open elements as you go along, i.e. implementing a parser in
XSLT. This is certainly possible but probably not a lot of fun (it would
be a bit more fun in XSLT 2 than XSLT 1). But if you can tie down a specific
list of bad things that can happen, in practice most cases can be done
fairly easily in XSLT, usually, on a good day...
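
For the restricted case mentioned above (pagebreaks inside inline elements such as italic or bold, but not inside tables), one possible XSLT 1.0 sketch is below. It is not the stack-based walk described above but a simpler swap-in: the para is re-walked once per segment, and only the nodes lying between two pagebreaks are kept. The mode name "seg" and the parameter "n" are made up for the example, and it assumes that para elements are not nested and are in no namespace.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- A para with a pagebreak at any depth: emit the segment before the
       first pagebreak, then each pagebreak followed by its segment.
       $n is the number of pagebreaks in the document that precede the
       nodes of the wanted segment. -->
  <xsl:template match="para[.//pagebreak]">
    <para>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates select="node()" mode="seg">
        <xsl:with-param name="n" select="count(preceding::pagebreak)"/>
      </xsl:apply-templates>
    </para>
    <xsl:for-each select=".//pagebreak">
      <xsl:copy-of select="."/>
      <para>
        <xsl:copy-of select="ancestor::para[1]/@*"/>
        <xsl:apply-templates select="ancestor::para[1]/node()" mode="seg">
          <xsl:with-param name="n" select="count(preceding::pagebreak) + 1"/>
        </xsl:apply-templates>
      </para>
    </xsl:for-each>
  </xsl:template>

  <!-- Leaf nodes: keep only those that lie in segment $n. -->
  <xsl:template match="text()|comment()|processing-instruction()" mode="seg">
    <xsl:param name="n"/>
    <xsl:if test="count(preceding::pagebreak) = $n">
      <xsl:copy/>
    </xsl:if>
  </xsl:template>

  <!-- Elements: re-open the element in every segment it overlaps. -->
  <xsl:template match="*" mode="seg">
    <xsl:param name="n"/>
    <xsl:variable name="first" select="count(preceding::pagebreak)"/>
    <xsl:if test="$n &gt;= $first and $n &lt;= $first + count(.//pagebreak)">
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates select="node()" mode="seg">
          <xsl:with-param name="n" select="$n"/>
        </xsl:apply-templates>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

  <!-- pagebreaks are emitted between the paras, never inside them. -->
  <xsl:template match="pagebreak" mode="seg"/>

</xsl:stylesheet>
```

Two caveats with this sketch: an inline element that begins or ends exactly at a pagebreak is re-opened as an empty element in the neighbouring para, and counting along the preceding axis for every node makes it slow on large documents.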

David
Jul 20 '05 #4
Hi,

I've tried using the XSL templates you provided, and they seem to work
quite well. However, the templates insert some new attributes into the
para and pagebreak elements:

<pagebreak xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:mml="http://www.w3.org/1998/Math/MathML">116</pagebreak>

How can you remove these? (I must admit I don't entirely understand
what's going on in the templates you gave me, so I don't see where the
new attributes are inserted, or how to remove them.)

Eivind

Jul 20 '05 #5

> I've tried using the XSL templates you provided, and they seem to work
> quite well. However, the templates insert some new attributes into the
> para and pagebreak elements:
>
> <pagebreak xmlns:xlink="http://www.w3.org/1999/xlink"
> xmlns:mml="http://www.w3.org/1998/Math/MathML">116</pagebreak>

These namespace declarations do not come from the templates I provided in
this thread; they must be declared either elsewhere in your stylesheet
or in your source file. How to get rid of them depends on where they
came from.

They may have come from me originally, as I quite often use mml as the
MathML namespace prefix, but MathML hasn't been mentioned so far in this
thread, has it?

David
Jul 20 '05 #6
It seems they come from the root-element of the source file.
(I tried to delete them from the source file, and then run the xslt
again. Result: no namespace declarations throughout the resulting
xml-file)

Thank you for all your help!

Eivind

Jul 20 '05 #7

In general, of course, removing namespace declarations from the input will
break the input. If your document has any MathML in it then you
can't remove the MathML declaration.

To avoid copying them, just don't use copy-of.

So I think I originally said something like:
<xsl:for-each select="pagebreak">
<xsl:copy-of select="."/>

Doing
<xsl:for-each select="pagebreak">
<pagebreak/>

would generate a new pagebreak element rather than copying one from the
source, so wouldn't copy any namespace nodes from the source
(but would use any in-scope namespaces from the stylesheet).

<xsl:for-each select="pagebreak">
<xsl:element name="pagebreak"/>
is similar but wouldn't use any namespaces from the stylesheet either
(other than the default namespace, if that has been declared).
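
Both variants above produce an empty pagebreak, so content such as the page number 116 shown earlier in the thread would be lost. A possible sketch that rebuilds an element with its attributes and content, but without the unused namespace nodes, is below. The mode name "strip-ns" is made up for the example.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Rebuild each element under the same name and namespace instead of
       copying it, so in-scope namespace nodes from the source are not
       carried over. -->
  <xsl:template match="*" mode="strip-ns">
    <xsl:element name="{name()}" namespace="{namespace-uri()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates select="node()" mode="strip-ns"/>
    </xsl:element>
  </xsl:template>

  <!-- Text, comments and processing instructions carry no namespace
       nodes, so a plain copy is enough. -->
  <xsl:template match="text()|comment()|processing-instruction()"
                mode="strip-ns">
    <xsl:copy/>
  </xsl:template>

</xsl:stylesheet>
```

With these two templates added to the stylesheet, <xsl:apply-templates select="." mode="strip-ns"/> can stand in wherever <xsl:copy-of select="."/> was used on an element. A namespace that an element or attribute name really uses (MathML markup, say) is still declared in the output, as it has to be.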

David
Jul 20 '05 #8
