Hi,
I'm creating XML files from printed documents. According to the DTD I
have to use, there have to be pagebreaks in the XML file. These
pagebreaks must be placed wherever a new page begins in the printed
version. This is fairly simple to accomplish.
The problem, however, is that the DTD states that a pagebreak cannot occur
inside a paragraph element, but must be placed between paragraphs.
Is it possible, using XSLT, to end the paragraph element before the
pagebreak, and start a new one after it?
To illustrate:
Illegal text block:
<para>Blah blah
<pagebreak/>
more blah blah</para>
Must become:
<para>Blah blah</para>
<pagebreak/>
<para>more blah blah</para>
I'm grateful for any help!
regards,
Eivind Andersen
XSLT can do essentially arbitrary tree transformations, so the answer is
yes, but in this case the transformation may be more or less hard
depending on where the pagebreak can be. Do you know that it's at the top level
of para (this makes it fairly easy), or can it be nested anywhere? For example:
<para>Blah blah <italic> xxx <bold> zzz</bold>
<pagebreak/> yyy</italic>
more blah blah</para>
In the latter case things are "interesting", as you have to close an
arbitrary number of elements, and things get more interesting if
the pagebreak appears in table markup and you have to correctly close all
the elements and open up everything needed for a new table...
Assuming the simple case, this is a grouping problem: you just want to
group all children of para depending on their position relative to
the pagebreak. Searching for "xslt grouping" on Google will show lots of
possibilities,
e.g.
<!-- paras without a pagebreak child are copied unchanged -->
<xsl:template match="para">
  <xsl:copy-of select="."/>
</xsl:template>
<!-- paras with a pagebreak child are split at each pagebreak -->
<xsl:template match="para[pagebreak]">
  <para>
    <xsl:copy-of select="@*|pagebreak[1]/preceding-sibling::node()"/>
  </para>
  <xsl:for-each select="pagebreak">
    <xsl:copy-of select="."/>
    <para>
      <xsl:copy-of select="../@*"/><!-- re-copy attributes, you might not want that -->
      <xsl:apply-templates select="following-sibling::node()[1]" mode="p"/>
    </para>
  </xsl:for-each>
</xsl:template>
<!-- copy one sibling at a time, then move on to the next one -->
<xsl:template match="node()" mode="p">
  <xsl:copy-of select="."/>
  <xsl:apply-templates select="following-sibling::node()[1]" mode="p"/>
</xsl:template>
<!-- stop walking when the next pagebreak is reached -->
<xsl:template match="pagebreak" mode="p"/>
David
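For readers on an XSLT 2.0 processor, the same simple case (pagebreak only as a direct child of para) can be sketched with xsl:for-each-group. This is not the code posted above but an alternative sketch; it assumes XSLT 2.0 support (for example Saxon) and the element names para and pagebreak from the question. The identity template is added only to make the stylesheet complete.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity transform: copy everything that is not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Split a para at each pagebreak that is a direct child. -->
  <xsl:template match="para[pagebreak]">
    <!-- Remember the attributes: inside the group the context item changes. -->
    <xsl:variable name="atts" select="@*"/>
    <xsl:for-each-group select="node()" group-starting-with="pagebreak">
      <!-- Every group after the content before the first pagebreak starts with a pagebreak. -->
      <xsl:copy-of select="current-group()[self::pagebreak]"/>
      <xsl:variable name="rest" select="current-group()[not(self::pagebreak)]"/>
      <!-- Skip the para when nothing follows the pagebreak. -->
      <xsl:if test="exists($rest)">
        <para>
          <xsl:copy-of select="$atts"/>
          <xsl:copy-of select="$rest"/>
        </para>
      </xsl:if>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the "Illegal text block" from the question, this should yield a para, the pagebreak, and a second para; the line breaks around the pagebreak end up inside the two paras. Whitespace-only text after a final pagebreak would still produce a para containing only whitespace.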
Wow! Thank you!
Fortunately the pagebreaks won't occur inside a table, but it's possible
to have one inside an <italic> or <bold> element.
I haven't been able to test this code yet, but I'll get on it first thing
Monday morning, and I'll report back a little bit later :).
Again, thank you for the incredibly quick and helpful reply!
Eivind
It's really a lot harder if that can happen.
The general case, where you have to close an arbitrary number of elements,
would need a completely different approach: essentially walking over
the whole tree one node at a time, building up a data structure of
currently open elements as you go along, i.e. implementing a parser in
XSLT. This is certainly possible but probably not a lot of fun (it would
be a bit more fun in XSLT 2 than XSLT 1). But if you can tie down a specific
list of bad things that can happen, in practice most cases can be done
fairly easily in XSLT, usually, on a good day...
David
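For the restricted case raised here (pagebreak nested inside inline elements such as italic or bold, but never inside tables), one possible XSLT 1.0 sketch follows. It does not track open elements as described above; instead it swaps in a simpler technique: the para is walked once per segment, and each pass keeps only the nodes that fall between two consecutive pagebreaks, re-opening any inline element that still has content in that segment. The mode name seg and the parameters n and pid are made up for illustration, and the sketch assumes that para elements are not nested inside each other.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity transform: copy everything that is not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- A para with a pagebreak at any depth: emit one para per segment. -->
  <xsl:template match="para[.//pagebreak]">
    <xsl:variable name="para" select="."/>
    <xsl:variable name="pid" select="generate-id()"/>
    <!-- Segment 0: everything before the first pagebreak. -->
    <para>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates select="node()" mode="seg">
        <xsl:with-param name="n" select="0"/>
        <xsl:with-param name="pid" select="$pid"/>
      </xsl:apply-templates>
    </para>
    <!-- Segment k: everything after the k-th pagebreak, up to the next one. -->
    <xsl:for-each select=".//pagebreak">
      <xsl:copy-of select="."/>
      <para>
        <xsl:copy-of select="$para/@*"/>
        <xsl:apply-templates select="$para/node()" mode="seg">
          <xsl:with-param name="n" select="position()"/>
          <xsl:with-param name="pid" select="$pid"/>
        </xsl:apply-templates>
      </para>
    </xsl:for-each>
  </xsl:template>

  <!-- Leaf nodes (text, comments, childless elements): keep only those
       preceded by exactly n pagebreaks of the same para. -->
  <xsl:template match="node()" mode="seg">
    <xsl:param name="n"/>
    <xsl:param name="pid"/>
    <xsl:if test="count(preceding::pagebreak[generate-id(ancestor::para[1]) = $pid]) = $n">
      <xsl:copy-of select="."/>
    </xsl:if>
  </xsl:template>

  <!-- Elements with children: re-open them only if some leaf inside
       belongs to segment n, then recurse. -->
  <xsl:template match="*[node()]" mode="seg">
    <xsl:param name="n"/>
    <xsl:param name="pid"/>
    <xsl:variable name="leaves"
        select="descendant::node()[not(node())][not(ancestor-or-self::pagebreak)]"/>
    <xsl:if test="$leaves[count(preceding::pagebreak[generate-id(ancestor::para[1]) = $pid]) = $n]">
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates select="node()" mode="seg">
          <xsl:with-param name="n" select="$n"/>
          <xsl:with-param name="pid" select="$pid"/>
        </xsl:apply-templates>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

  <!-- Pagebreaks are emitted by the para template, never inside a segment. -->
  <xsl:template match="pagebreak" mode="seg" priority="2"/>

</xsl:stylesheet>
```

On the nested example earlier in the thread, the first para should end with the italic element closed after the bold element, and the second para should start with a re-opened italic containing only "yyy". A pagebreak at the very start or end of a para produces an empty para, and the repeated counting makes this slow on very long paragraphs.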
Hi,
I've tried using the XSL templates you provided, and they seem to work
quite well. However, the templates insert some new attributes into the
para and pagebreak elements:
<pagebreak xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:mml="http://www.w3.org/1998/Math/MathML">116</pagebreak>
How can you remove these? (I must admit I don't entirely understand
what's going on in the templates you gave me, so I don't see where the
new attributes are inserted, or how to remove them.)
Eivind
These namespace declarations do not come from the templates I provided in
this thread; they must be declared either elsewhere in your stylesheet
or in your source file. How to get rid of them depends on where they
came from.
They may have come from me originally, as I quite often use mml as the
MathML namespace prefix, but MathML hasn't been mentioned so far in this
thread, has it?
David
It seems they come from the root element of the source file.
(I tried deleting them from the source file and then running the XSLT
again. Result: no namespace declarations anywhere in the resulting
XML file.)
Thank you for all your help!
Eivind
In general, of course, removing namespace declarations from the input will
break the input. If your document has any MathML in it then you
can't remove the MathML declaration.
To avoid copying them, just don't use copy-of.
I think I originally said something like:
<xsl:for-each select="pagebreak">
<xsl:copy-of select="."/>
Doing
<xsl:for-each select="pagebreak">
<pagebreak/>
would generate a new pagebreak element rather than copying one from the
source, so it wouldn't copy any namespace nodes from the source
(but would use any in-scope namespaces from the stylesheet).
<xsl:for-each select="pagebreak">
<xsl:element name="pagebreak"/>
is similar but wouldn't use any namespaces from the stylesheet either
(other than the default namespace, if that has been declared).
David
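The xsl:element idea above can be extended to whole subtrees. The following is a minimal XSLT 1.0 sketch, not something posted in this thread: it rebuilds every element by name instead of copying it, so namespace nodes inherited from the source root are not carried along. The mode name strip-ns is made up for illustration. Namespaces that an element or attribute name actually uses are still declared where needed, so MathML content would keep its declaration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Entry point: process the whole document in strip-ns mode. -->
  <xsl:template match="/">
    <xsl:apply-templates mode="strip-ns"/>
  </xsl:template>

  <!-- Rebuild each element from its name and namespace URI, so unused
       namespace nodes from the source are not copied to the result. -->
  <xsl:template match="*" mode="strip-ns">
    <xsl:element name="{name()}" namespace="{namespace-uri()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates select="node()" mode="strip-ns"/>
    </xsl:element>
  </xsl:template>

  <!-- Text is copied by the built-in rule; comments and processing
       instructions need an explicit template to survive. -->
  <xsl:template match="comment()|processing-instruction()" mode="strip-ns">
    <xsl:copy/>
  </xsl:template>

</xsl:stylesheet>
```

To combine this with the pagebreak templates, the two strip-ns element and comment templates could be added to that stylesheet, and each `<xsl:copy-of select="."/>` replaced with `<xsl:apply-templates select="." mode="strip-ns"/>`. This keeps the pagebreak content (such as the page number 116) while dropping the unused declarations. In XSLT 2.0 the same effect is available more directly through the copy-namespaces="no" attribute of xsl:copy-of. If prefixes are used inside attribute values or text, dropping the declarations could break them, so this is only safe when prefixes appear in names alone.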