XSLT to transform a "flat" XML file into a structured text file

Subject: XSLT to transform a flat XML file into a structured text file

I have an XML file that lists the PDF file segment names and titles of a
larger document and looks something like this:

<DOCUMENT>
......
...... some lead elements
......
<SEGMENT_LIST>
<SEGMENT FILE="fwd.pdf">Foreword</SEGMENT>
<SEGMENT FILE="chap1.pdf">Chapter 1</SEGMENT>
<SEGMENT FILE="chap2.pdf">Chapter 2</SEGMENT>
<SEGMENT FILE="chap3.pdf">Chapter 3</SEGMENT>
<SEGMENT FILE="v1fwd.pdf" VOLUME="Volume 1">Foreword</SEGMENT>
<SEGMENT FILE="v1defs.pdf" VOLUME="Volume 1">Definitions</SEGMENT>
<SEGMENT FILE="v1meth.pdf" VOLUME="Volume 1">Methodology</SEGMENT>
<SEGMENT FILE="v1sachap1.pdf" VOLUME="Volume 1" GROUP="Section A">Chapter 1A</SEGMENT>
<SEGMENT FILE="v1sachap2.pdf" VOLUME="Volume 1" GROUP="Section A">Chapter 2A</SEGMENT>
<SEGMENT FILE="v1sachap3.pdf" VOLUME="Volume 1" GROUP="Section A">Chapter 3A</SEGMENT>
<SEGMENT FILE="v1sbchap1.pdf" VOLUME="Volume 1" GROUP="Section B">Chapter 1B</SEGMENT>
<SEGMENT FILE="v1sbchap2.pdf" VOLUME="Volume 1" GROUP="Section B">Chapter 2B</SEGMENT>
<SEGMENT FILE="v1sbchap3.pdf" VOLUME="Volume 1" GROUP="Section B">Chapter 3B</SEGMENT>
<SEGMENT FILE="appa.pdf" GROUP="Appendices">Appendix A</SEGMENT>
<SEGMENT FILE="appb.pdf" GROUP="Appendices">Appendix B</SEGMENT>
<SEGMENT FILE="appc.pdf" GROUP="Appendices">Appendix C</SEGMENT>
</SEGMENT_LIST>
</DOCUMENT>

I need to transform the SEGMENT_LIST elements into a structured text
file for use by another application to construct the Table of Contents
(TOC). The file would be a vertical-bar (|) separated list of PDF file
segment names and their titles, with a single-digit TOC indentation level
indicator in the first position, like so:

1|fwd.pdf|Foreword
1|chap1.pdf|Chapter 1
1|chap2.pdf|Chapter 2
1|chap3.pdf|Chapter 3
1||Volume 1
2|v1fwd.pdf|Foreword
2|v1defs.pdf|Definitions
2|v1meth.pdf|Methodology
2||Section A
3|v1sachap1.pdf|Chapter 1A
3|v1sachap2.pdf|Chapter 2A
3|v1sachap3.pdf|Chapter 3A
2||Section B
3|v1sbchap1.pdf|Chapter 1B
3|v1sbchap2.pdf|Chapter 2B
3|v1sbchap3.pdf|Chapter 3B
1||Appendices
2|appa.pdf|Appendix A
2|appb.pdf|Appendix B
2|appc.pdf|Appendix C

I think you can imagine from the transformed file what the TOC would
look like:

Foreword
Chapter 1
Chapter 2
Chapter 3
Volume 1
    Foreword
    Definitions
    Methodology
    Section A
        Chapter 1A
        Chapter 2A
        Chapter 3A
    Section B
        Chapter 1B
        Chapter 2B
        Chapter 3B
Appendices
    Appendix A
    Appendix B
    Appendix C

My problem is that while I find it easy to write an XSLT stylesheet to
create the first 4 lines of the output file where the source XML does
not have either of the optional VOLUME and GROUP attributes:

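<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<xsl:template match="/">
<xsl:apply-templates select="/DOCUMENT/SEGMENT_LIST/*" />
</xsl:template>

<!-- every segment is written at level 1 as level|file|title -->
<xsl:template match="SEGMENT">
<xsl:text>1|</xsl:text>
<xsl:value-of select="@FILE"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="."/>
<xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>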

I have no idea, however, how to transform the rest of the XML, because I
don't know how to turn those attributes into Volume, Section and
Appendices headers in the output file, one header for all the segments
with the same attribute value, and with the proper indent level numbers.

Any suggestion would be greatly appreciated.

Rudy

Jun 21 '06 #1
On Wed, 21 Jun 2006 04:07:44 +0200, R. P. <r_********@hotmail.com> wrote:
Subject: XSLT to transform a flat XML file into a structured text file

I need to transform the SEGMENT_LIST elements into a structured text
file for use by another application to construct the Table of Contents
(TOC). The file would be a vertical-bar (|) separated list of PDF file
segment names and their titles, with a single-digit TOC indentation level
indicator in the first position, like so:


You probably should look for a solution involving 'multi-level grouping',
possibly with the Muenchian method...

In the meantime, you could try out this quick and dirty solution:
(I wouldn't use it in a production environment)

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="text"/>

<xsl:template match="SEGMENT">
<!-- all attributes except FILE, on this segment and on the previous one -->
<xsl:variable name="this" select="@*[not(name()='FILE')]"/>
<xsl:variable name="that"
select="preceding-sibling::SEGMENT[1]/@*[not(name()='FILE')]"/>

<!-- header line when some attribute value is not found on the previous
segment, or when the number of attributes changes -->
<xsl:if test="$this[not(.=$that)] or count($this)!=count($that)">
<xsl:value-of select="count($this)"/>||<xsl:value-of
select="$this[not(.=$that)]"/>
<xsl:text>&#10;</xsl:text>
</xsl:if>

<!-- the segment itself, one level below its headers -->
<xsl:value-of select="count($this) + 1"/>|<xsl:value-of select="@FILE"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="."/>
<xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
regards,
--
Joris Gillis (http://users.telenet.be/root-jg/me.html)
Gaudiam omnibus traderat W3C, nec vana fides
Jun 21 '06 #2
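
A note on the stylesheet in #2: it has no template for the document root, so XSLT's built-in rules also copy the text of the lead elements and the whitespace between elements into the output. What follows is a minimal sketch of the same previous-sibling comparison, not the poster's code, that selects only the SEGMENT elements and tests the attributes by name. It assumes that VOLUME and GROUP are the only grouping attributes, that a GROUP may appear with or without a VOLUME, and that segments belonging together are adjacent in the source. The variable names prev and new-volume are made up for the illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Process only the SEGMENT elements, so the built-in rules never
       copy lead-element text or inter-element whitespace. -->
  <xsl:template match="/">
    <xsl:apply-templates select="DOCUMENT/SEGMENT_LIST/SEGMENT"/>
  </xsl:template>

  <xsl:template match="SEGMENT">
    <!-- Assumption: related segments are adjacent, so comparing with
         the immediately preceding SEGMENT is enough. -->
    <xsl:variable name="prev" select="preceding-sibling::SEGMENT[1]"/>
    <!-- True when the VOLUME value differs from the previous segment's
         (a missing attribute counts as the empty string). -->
    <xsl:variable name="new-volume"
                  select="string(@VOLUME) != string($prev/@VOLUME)"/>

    <!-- Volume header, always at level 1. -->
    <xsl:if test="@VOLUME and $new-volume">
      <xsl:text>1||</xsl:text>
      <xsl:value-of select="@VOLUME"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:if>

    <!-- Group header: level 1 on its own, level 2 inside a volume. -->
    <xsl:if test="@GROUP and ($new-volume or
                  string(@GROUP) != string($prev/@GROUP))">
      <xsl:value-of select="count(@VOLUME) + 1"/>
      <xsl:text>||</xsl:text>
      <xsl:value-of select="@GROUP"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:if>

    <!-- The segment line sits one level below its deepest header. -->
    <xsl:value-of select="count(@VOLUME) + count(@GROUP) + 1"/>
    <xsl:text>|</xsl:text>
    <xsl:value-of select="@FILE"/>
    <xsl:text>|</xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With an XSLT 1.0 processor such as xsltproc this could be run as "xsltproc toc.xsl document.xml", where both file names are placeholders. Because only neighbouring segments are compared, a group that is split across the list would get its header repeated.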
"Joris Gillis" <jo**********@student.kuleuven.be> wrote:

You probably should look for a solution involving 'multi-level
grouping', possibly with the Muenchian method...

In the meantime, you could try out this quick and dirty solution:
(I wouldn't use it in a production environment)


Thanks Joris, I wouldn't do it either. If for no other reason, it did not
produce the results I was after on my first attempt. :-( However, you gave
me some tips on the direction I should be looking in for a solution,
especially the term that describes my problem: "multi-level grouping."
I didn't know there was a name for it.

Regards,
Rudy

Jun 22 '06 #3
In case you aren't aware of it: Check the XSLT FAQ website's grouping
and indexing pages; some of the techniques there are quite useful but
not at all obvious.

http://www.dpawson.co.uk/xsl/sect2/sect21.html

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Jun 22 '06 #4
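
For readers following the "multi-level grouping" pointer in this thread, here is a hedged XSLT 1.0 sketch of how Muenchian keys might be applied to the SEGMENT list. It is not taken from the FAQ page or from any poster. The key names by-volume and by-pair are invented for the illustration, and the sketch assumes that the grouping attributes are spelled VOLUME and GROUP on every segment and that their values never contain a "|" character, which is used here as the separator in the composite key.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index segments by VOLUME alone and by the VOLUME/GROUP pair.
       A missing attribute contributes an empty string to the key. -->
  <xsl:key name="by-volume" match="SEGMENT" use="string(@VOLUME)"/>
  <xsl:key name="by-pair" match="SEGMENT"
           use="concat(@VOLUME, '|', @GROUP)"/>

  <xsl:template match="/">
    <!-- Muenchian step: keep only the first SEGMENT of each distinct
         VOLUME/GROUP pair, visited in document order. -->
    <xsl:for-each select="DOCUMENT/SEGMENT_LIST/SEGMENT[
        generate-id() =
        generate-id(key('by-pair', concat(@VOLUME, '|', @GROUP))[1])]">
      <!-- Number of headers above the segments of this pair. -->
      <xsl:variable name="depth" select="count(@VOLUME) + count(@GROUP)"/>

      <!-- Volume header, only at the first segment of that volume. -->
      <xsl:if test="@VOLUME and generate-id() =
                    generate-id(key('by-volume', string(@VOLUME))[1])">
        <xsl:text>1||</xsl:text>
        <xsl:value-of select="@VOLUME"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>

      <!-- Group header: level 1 on its own, level 2 inside a volume. -->
      <xsl:if test="@GROUP">
        <xsl:value-of select="$depth"/>
        <xsl:text>||</xsl:text>
        <xsl:value-of select="@GROUP"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>

      <!-- Every segment that shares this VOLUME/GROUP pair. -->
      <xsl:for-each select="key('by-pair', concat(@VOLUME, '|', @GROUP))">
        <xsl:value-of select="$depth + 1"/>
        <xsl:text>|</xsl:text>
        <xsl:value-of select="@FILE"/>
        <xsl:text>|</xsl:text>
        <xsl:value-of select="."/>
        <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The design difference from a previous-sibling comparison is that the keys collect all segments of a VOLUME/GROUP pair under the pair's first occurrence, even when they are not adjacent in the source. Whether that is wanted depends on how the TOC application treats ordering.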
