XSL for recursive transformation

Hi,
I have an XHTML input file with a custom tag that specifies HTML
fragments to include.
For example:
<html>
....
<include frag1="frag1.html" frag2="frag2.html">
More html here
</include>
....html...
<include frag1="frag3.html" ....>...

</html>
The include tag can be nested. The contents of an include tag would be
combined with the fragments [frag1.html and frag2.html] to produce the
output xml which would replace the currently processed include tag.
After that the whole output has to be checked for valid XML. And the
process is continued until there are no more include tags.

I was wondering about the best way to go about doing this. Is XSL
suitable? If so how?

Thanks
Indy

Feb 15 '06 #1
Indy wrote:
> I was wondering about the best way to go about doing this. Is XSL
> suitable? If so how?


Given that XHTML is an XML language, the *right* way to do this would be
to use XInclude tags. Assuming your XHTML processor supports XInclude,
of course.

If it doesn't -- yes, you can implement XInclude, or similar
functionality, in XSLT if you want to. One such implementation can be
seen at http://www.dpawson.co.uk/xsl/sect2/include.html
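
The XInclude route can be sketched as follows. This is only a minimal illustration, assuming Python's standard `xml.etree.ElementInclude` module as the XInclude processor; the `FRAGMENTS` dictionary, the `memory_loader` helper and the page content are hypothetical stand-ins for real files, and every fragment has to be well-formed XML on its own.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory stand-ins for fragment files.
# Each fragment must be well-formed XML by itself.
FRAGMENTS = {
    "frag1.html": "<div><p>Included paragraph</p></div>",
}

def memory_loader(href, parse, encoding=None):
    """Resolve an href from FRAGMENTS instead of the file system."""
    if parse == "xml":
        return ET.fromstring(FRAGMENTS[href])
    return FRAGMENTS[href]  # parse="text": hand back the raw string

PAGE = """<html xmlns:xi="http://www.w3.org/2001/XInclude">
  <body>
    <xi:include href="frag1.html"/>
  </body>
</html>"""

def expand_xincludes(source):
    """Parse the page and replace every xi:include element in place."""
    root = ET.fromstring(source)
    ElementInclude.include(root, loader=memory_loader)
    return root

root = expand_xincludes(PAGE)
print(ET.tostring(root, encoding="unicode"))
```

Passing a custom loader keeps the sketch self-contained; without one, `ElementInclude` reads the `href` targets from disk.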

(It's always worth checking Dave Pawson's XSLT FAQ website. He's done a
very good job of collecting many of the best answers from the XSLT
user's mailing list. Which, by the way, is also worth subscribing to if
you're looking for a deeper understanding of stylesheets.)
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Feb 15 '06 #2
Joe Kesselman wrote:
> Indy wrote:
>> I was wondering about the best way to go about doing this. Is XSL
>> suitable? If so how?
>
> Given that XHTML is an XML language, the *right* way to do this would be
> to use XInclude tags. Assuming your XHTML processor supports XInclude,
> of course.


FWIW, mod_transform for Apache is an XSLT filter that supports XInclude
(based on libxml2/libxslt). So it's a solved problem on the Web.

However, XSLT is not a good solution to this, except for small
documents. Inclusion can be streamed, so it'll be hugely faster
and more scalable using a SAX-based parser. mod_publisher would
be a better choice.
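
To illustrate the streaming idea (this is not how mod_publisher itself works, just a rough sketch using Python's standard `xml.sax`), a SAX handler can copy events straight to the output and splice in a fragment whenever it meets an include element. The `include` element with an `href` attribute, the `FRAGMENTS` dictionary and the `expand` helper are all assumptions made for the example, and the fragments still have to be well-formed XML.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator

# Hypothetical in-memory stand-ins for fragment files (well-formed XML).
FRAGMENTS = {
    "frag1.html": b"<p>Included paragraph</p>",
}

class IncludeFilter(xml.sax.ContentHandler):
    """Copy SAX events to `out`, replacing <include href="..."/> on the fly."""

    def __init__(self, out):
        super().__init__()
        self.out = out  # an XMLGenerator writing to the final output

    def startElement(self, name, attrs):
        if name == "include":
            # Stream the fragment into the same output; no tree is built.
            xml.sax.parseString(FRAGMENTS[attrs["href"]], IncludeFilter(self.out))
        else:
            self.out.startElement(name, attrs)

    def endElement(self, name):
        if name != "include":
            self.out.endElement(name)

    def characters(self, content):
        self.out.characters(content)

def expand(source: bytes) -> str:
    """Run the filter over `source` and return the expanded document text."""
    buffer = io.StringIO()
    xml.sax.parseString(source, IncludeFilter(XMLGenerator(buffer, encoding="utf-8")))
    return buffer.getvalue()

result = expand(b"<html><body><include href='frag1.html'/></body></html>")
print(result)
```

Because nothing is held in memory beyond the current event, the cost stays roughly proportional to the input size, which is the scalability point made above.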

--
Nick Kew
Feb 15 '06 #3
Indy wrote:
> Hi,
> I have a XHTML input file with custom tag which specifies html
> fragments to include
> For example:
> <html>
> ...
> <include frag1="frag1.html" frag2="frag2.html">
> More html here
> </include>
> ...html...
> <include frag1="frag3.html" ....>...
>
> </html>
> The include tag can be nested. The contents of an include tag would be
> combined with the fragments [frag1.html and frag2.html] to produce the
> output xml which would replace the currently processed include tag.
> After that the whole output has to be checked for valid XML. And the
> process is continued until there are no more include tags.
>
> I was wondering about the best way to go about doing this.


Why not just use entity declarations?
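
For readers who have not used them, here is a small sketch of the entity approach, parsed with Python's standard `xml.etree.ElementTree`. External entities (a SYSTEM identifier pointing at a file) would be the usual choice for separate fragment files; an internal entity is used here only to keep the example self-contained, and the entity name `header` is made up. Note that the replacement text of a parsed entity must itself be balanced markup.

```python
import xml.etree.ElementTree as ET

# The entity is declared once in the internal DTD subset and
# referenced in the body as &header;.
DOC = """<!DOCTYPE html [
  <!ENTITY header "<table><tr><td>This is a header</td></tr></table>">
]>
<html><body>&header;<p>More html here</p></body></html>"""

# The parser expands the entity reference while building the tree.
root = ET.fromstring(DOC)
print(ET.tostring(root, encoding="unicode"))
```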

///Peter
--
XML FAQ: http://xml.silmaril.ie/
Feb 15 '06 #4
Peter Flynn wrote:
> Why not just use entity declarations?


Parsed entities are pretty much dying as XML Schema replaces DTDs.
Schemas don't have any equivalent. XInclude/XLink were supposed to take
over that role.
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Feb 16 '06 #5
Hi,
Thanks for your comments, I tried using XInclude tags but came across
some problems.
The fragments that I'm trying to include are not valid XML themselves,
for example, one could be:
---sof---
<table><tr><td>This is a header</td></tr>
---eof---

Only when the fragments are assembled do they form valid XML.

Do you think XInclude can still be used to achieve this?

Thanks again,
Indeera

Feb 16 '06 #6
In article <11*********************@o13g2000cwo.googlegroups.com>,
Indy <in*****@gmail.com> wrote:
> The fragments that I'm trying to include are not valid XML themselves,
> ....and only when the fragments are assembled it forms a valid XML.
> Do you think XInclude can still be used to achieve this?


No. XInclude operates at the level of the XML Infoset, not on
characters. You will need to use a non-XML tool to put them together.

-- Richard
Feb 16 '06 #7
Indy wrote:
> The fragments that I'm trying to include are not valid XML themselves,


In which case XML-aware tools aren't going to handle them. Write
something text-based.
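
A text-based expander along those lines might look like the sketch below (Python, standard library only). Several things are assumptions, since the thread does not spell them out: `frag1` is placed before the tag's content and `frag2` after it, the fragments live in a hypothetical `FRAGMENTS` dictionary rather than in files, and include tags are always written as an open/close pair. Innermost tags are expanded first, passes repeat until no include tag is left, and only then is the result checked for well-formedness.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical in-memory stand-ins for the fragment files.
# Neither fragment is well-formed on its own.
FRAGMENTS = {
    "frag1.html": "<table><tr><td>This is a header</td></tr>",
    "frag2.html": "</table>",
}

# An innermost <include ...>body</include>: the body holds no further <include.
INNERMOST = re.compile(
    r"<include\b([^>]*)>((?:(?!<include\b).)*?)</include>", re.DOTALL
)
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')

def expand_once(match):
    """Replace one include tag by frag1 + body + frag2 (assumed order)."""
    attrs = dict(ATTRIBUTE.findall(match.group(1)))
    before = FRAGMENTS[attrs["frag1"]] if "frag1" in attrs else ""
    after = FRAGMENTS[attrs["frag2"]] if "frag2" in attrs else ""
    return before + match.group(2) + after

def expand_includes(text, max_passes=50):
    """Expand include tags as plain text, then check well-formedness."""
    for _ in range(max_passes):
        text, count = INNERMOST.subn(expand_once, text)
        if count == 0:
            ET.fromstring(text)  # raises ET.ParseError if not well-formed
            return text
    raise RuntimeError("include tags still present after max_passes")

PAGE = (
    '<html><body><include frag1="frag1.html" frag2="frag2.html">'
    "<tr><td>More html here</td></tr></include></body></html>"
)
print(expand_includes(PAGE))
```

The pass limit is there only to stop a fragment that includes itself from looping forever.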

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Feb 16 '06 #8
.... or redesign the whole problem so you're working with XML structure
rather than text fragments.

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Feb 16 '06 #9
Indy wrote:
> I have a XHTML input file with custom tag which specifies html
> fragments to include


Other posters have suggested ways to include XML fragments in XML.

However, I'd advise against this, because you're trying to embed HTML as
the fragment and HTML is _not_ XML. HTML needs to be processed with
text- or SGML-aware tools, not XML tools. What happens if you encounter
an HTML fragment that's not well-formed? What happens if you _want_ to
use a fragment that's not well-formed?
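
A quick way to see the difference, sketched with Python's standard library: the fragment below is acceptable HTML (the `<br>` is never closed and the `</p>` is omitted), so an HTML-aware tokenizer handles it, while an XML parser rejects it outright. The fragment text and the `TagCollector` class are made up for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Acceptable HTML, but not well-formed XML.
FRAGMENT = "<p>First line<br>second line"

class TagCollector(HTMLParser):
    """Record the start tags seen by the lenient HTML tokenizer."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

def is_well_formed_xml(text):
    """Return True only if an XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

collector = TagCollector()
collector.feed(FRAGMENT)
collector.close()

print(is_well_formed_xml(FRAGMENT))  # False
print(collector.tags)                # ['p', 'br']
```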

RSS has addressed this same problem before now. Worth reading the
background.

Feb 16 '06 #10
Joe Kesselman wrote:
> Peter Flynn wrote:
>> Why not just use entity declarations?
>
> Parsed entities are pretty much dying as XML Schema replaces DTDs.

I think you'll find them alive and kicking in many places. Reports
of the death of DTDs are greatly exaggerated.

> Schemas don't have any equivalent.

QED

> XInclude/XLink were supposed to take over that role.

Oooh look, flying pigs :-)

///Peter

Feb 16 '06 #11
>> Parsed entities are pretty much dying as XML Schema replaces DTDs.

> I think you'll find them alive and kicking in many places. Reports
> of the death of DTDs are greatly exaggerated.


Uhm. I agree that schemas are taking longer to find their way in than
might have been expected, partly because they're a syntax only a
database expert or computer science geek could love. (Though frankly the
DTD syntax is also pretty hideous.)

However, entities are definitely on the way out. The problem is that
they really aren't all that useful unless there's a fragment that will
appear in a huge number of instances of this kind of document, and even
then they're only a significant advantage when producing the document by
hand; it is a significant pain for software to recognize that the
opportunity exists to take advantage of a parsed entity, and there
usually isn't much to be gained by doing so.

Entities had value when most docs were produced by humans pounding on
raw XML text; they really aren't useful for docs produced by smarter
editors. Most of the things you might still want to use them for can be
handled better by an appropriate tool -- an editor that lets you see and
enter the actual characters rather than their named equivalents, for
example, or a syntax that's actually defined in the document rather than
in a non-tag-language secondary file. Among other things, that permits
different documents to reference different resources rather than having
only a single set, hard-wired into the DTD, that they can name.

>> XInclude/XLink were supposed to take over that role.
>
> Oooh look, flying pigs :-)


I did put it in the imperfect tense... Part of the problem is that we're
finding that the need for a portable syntax for documents referencing
other documents isn't as universal as we expected. Or at least isn't so
right now.

If we'd designed XML completely before releasing it to the public, we
would have started with the infoset (including namespaces and schemas
and includes and links), then designed the syntax and APIs from that.
Instead the W3C started with the syntax and a known-inadequate schema
language (DTDs), and has built everything out from there. The upside is
that folks had a chance to start using XML much earlier, and we've
gotten some benefit from seeing which directions everyone has gone with
it. The downside is that there have been some warts and hiccups and
direction changes along the way, and tools have not always been quick to
catch up -- and even when they have, folks who have working solutions
using the old stopgaps are often reluctant to make the effort to move
over. Which leaves all of us with the job of supporting multiple ways of
doing things and trying to gently push folks toward the ones that will
make their life -- and ours -- easier in the long run.

Oh well. The cutting edge usually has a few nicks in it.

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Feb 17 '06 #12
Joe Kesselman wrote:
>>> Parsed entities are pretty much dying as XML Schema replaces DTDs.
>>
>> I think you'll find them alive and kicking in many places. Reports
>> of the death of DTDs are greatly exaggerated.
>
> Uhm. I agree that schemas are taking longer to find their way in than
> might have been expected, partly because they're a syntax only a
> database expert or computer science geek could love. (Though frankly the
> DTD syntax is also pretty hideous.)

Only a syntax geek would love it, but it has the advantage of being very
terse, and once learned, quite expressive. RelaxNG seems to be the way
forward, but I still feel we did the community a disservice by not
properly investigating the possibility of adding datatyping to DTDs
before running amok with W3C Schemas. Ah well. Another time.

> However, entities are definitely on the way out. The problem is that
> they really aren't all that useful unless there's a fragment that will
> appear in a huge number of instances of this kind of document, and even
> then they're only a significant advantage when producing the document by
> hand;

Actually there is rather a lot of stuff out there that does this.

> it is a significant pain for software to recognize that the
> opportunity exists to take advantage of a parsed entity, and there
> usually isn't much to be gained by doing so.

For parsed entities, yes. Legal boilerplate, tech doc, and chapter
files for long documents are the only real candidates.

Parameter entities are a different matter.

> Entities had value when most docs were produced by humans pounding on
> raw XML text; they really aren't useful for docs produced by smarter
> editors. Most of the things you might still want to use them for can be
> handled better by an appropriate tool -- an editor that lets you see and
> enter the actual characters rather than their named equivalents, for

This refers to character entities. Sadly, editors are still in their
infancy when it comes to the interface (hence my thesis topic), and
there are still a gazillion so-called plaintext editors (non-XML) out
there that XML beginners use, which seriously screws up their chances
when they start editing UTF-8. For this reason, several companies and
projects I have been dealing with have made it policy for the moment
to create ISO-8859-1 files only, and ALL other characters go in as
character entity references or numeric references (fortunately for them
they deal only with western languages in Latin scripts).
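
That policy is easy to automate. As a small sketch in Python (the function name and sample string are made up), the standard `xmlcharrefreplace` error handler writes every character that ISO-8859-1 cannot represent as a numeric character reference. It does not escape markup characters such as `&` or `<`, so it belongs at the final encoding step, after the XML has been serialized.

```python
import html

def to_latin1_with_charrefs(text: str) -> bytes:
    """Encode as ISO-8859-1; anything outside it becomes &#NNNN;."""
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")

# "naive" with i-diaeresis, an en dash, and the Polish city name Lodz
# with its diacritics, written with escapes so this source stays ASCII.
sample = "na\u00efve \u2013 \u0141\u00f3d\u017a"

encoded = to_latin1_with_charrefs(sample)
print(encoded)  # b'na\xefve &#8211; &#321;\xf3d&#378;'

# Decoding and resolving the references gives the original text back.
restored = html.unescape(encoded.decode("iso-8859-1"))
print(restored == sample)  # True
```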

> example, or a syntax that's actually defined in the document rather than
> in a non-tag-language secondary file. Among other things, that permits
> different documents to reference different resources rather than having
> only a single set, hard-wired into the DTD, that they can name.
>
>>> XInclude/XLink were supposed to take over that role.
>>
>> Oooh look, flying pigs :-)
>
> I did put it in the imperfect tense...

Sorry, I was being deliberately provocative.

> Part of the problem is that we're
> finding that the need for a portable syntax for documents referencing
> other documents isn't as universal as we expected. Or at least isn't so
> right now.

Ahead of the curve as usual :-) Although the demand for a syntax to
refer from one document to another is slowly approaching FAQ-level.
It's just embarrassing that we had multi-way bidirectional 3rd-party
linking in the Panorama plugin a decade ago, and still nothing to
replace it.

> If we'd designed XML completely before releasing it to the public,

We'd still be discussing it.

> we would have started with the infoset (including namespaces and schemas
> and includes and links), then designed the syntax and APIs from that.
> Instead the W3C started with the syntax and a known-inadequate schema
> language (DTDs), and has built everything out from there. The upside is
> that folks had a chance to start using XML much earlier, and we've
> gotten some benefit from seeing which directions everyone has gone with

I like the description, although I disagree about the infoset. Coming
from the tech doc background, I would have preferred to see some of the
useful SGML features retained and more attention paid to the usability
of markup. Pretending that a document is a tree when it's not (it's a
document!) was a mistake we are still paying for. Starting with the
syntax was OK, IMHO, and pretty much 99% of what we did was right. But
schemas were a later development, a bolt-on which only came when the
XML-Data folks saw the market for the syntax (and that's something else
we'll end up paying for -- I see way too many slabs of data being done
into XML when CSV would be much more sensible).

> it. The downside is that there have been some warts and hiccups and
> direction changes along the way, and tools have not always been quick to
> catch up -- and even when they have, folks who have working solutions
> using the old stopgaps are often reluctant to make the effort to move
> over.

This is going to be the interesting bit. New tools -- *really good* new
tools -- are few and far between. And there are too many good old tools
which have become unavailable just at the point when they were most
needed, because of corporate buyouts resulting in technically-unaware
people dropping the ball.

> Which leaves all of us with the job of supporting multiple ways of
> doing things and trying to gently push folks toward the ones that will
> make their life -- and ours -- easier in the long run.

It does work eventually. I've only had one breakage so far, and that was
due to sabotage.

> Oh well. The cutting edge usually has a few nicks in it.


Mind that axe, Eugene.

///Peter
Feb 17 '06 #13
Peter Flynn said the following on 2/17/2006 22:52 +0200:
> Mind that axe, Eugene.


Actually, it's "Careful with that Axe, Eugene" ;)

http://www.pink-floyd-lyrics.com/htm...ma-lyrics.html

--
Regards
Harrie
Feb 17 '06 #14
>> If we'd designed XML completely before releasing it to the public,
>
> We'd still be discussing it.

Which is why they went the other way around. Unfortunately that left us
with some warts where the afterthoughts were tacked on (including some
that could have been avoided, but... oh well; too much water over the
dam at this point).

> I like the description, although I disagree about the infoset. Coming
> from the tech doc background, I would have preferred to see some of the
> useful SGML features retained

Trimming away everything that wasn't absolutely required is what made
implementing XML easy. If you've ever written an SGML processor, you
know getting it right is messy at best. XML was deliberately restricted
to the point where the parser is implementable by an average student in
a week or less.

> This is going to be the interesting bit. New tools -- *really good* new
> tools -- are few and far between.


They're starting to appear, though. If you see a market not being
adequately served, think of it as a marketing opportunity. That's what
got us started on Xerces and Xalan...<grin/>

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Feb 22 '06 #15
Joe Kesselman wrote:
> Trimming away everything that wasn't absolutely required is what made
> implementing XML easy. If you've ever written an SGML processor, you
> know getting it right is messy at best. XML was deliberately restricted
> to the point where the parser is implementable by an average student in
> a week or less.

I think Tim Bray's comment was "implementable in 'just a few' 30-hour
Perl hacking sessions" :-)

> They're starting to appear, though. If you see a market not being
> adequately served, think of it as a marketing opportunity.


Oh I am, believe me :-)

///Peter
Feb 23 '06 #16
Peter Flynn wrote:
> I think Tim Bray's comment was "implementable in 'just a few' 30-hour
> Perl hacking sessions" :-)


The concept of the DPH -- Desperate Perl Hacker -- has been invoked a
number of times as an argument for why everything should be kept as
simple as possible. (But not simpler.)

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Feb 24 '06 #17