Hi XML/XSLT masters and gurus,
I am new to XML/XSLT and have been reading up on it in order to convert XML generated by an OCR engine into another XML format. I am stuck at this stage and need expert help with whatever I am missing or not understanding. My desired output is to get the text inside the WORD tags, like this:
<WORD coords="559,518,641,629">12</WORD>
<WORD coords="537,752,623,1325">TESTING</WORD>
<WORD coords="541,1402,628,2361">THE-SCRIPTS</WORD>
But I just could not get the text to go inside, and it is left outside the tags like this:
<WORD coords="559,518,641,629"></WORD>12
<WORD coords="537,752,623,1325"></WORD>TESTING
<WORD coords="541,1402,628,2361"></WORD>THE-SCRIPTS
I really need help and advice on this. My XML is below:
<?xml version="1.0" encoding="UTF-8" ?>
<meaning>
<fmtnfo type="3" valtype="2"><rect t="537" l="518" b="641" r="3045"/></fmtnfo>
<fmtnfo type="5" valtype="2"><rect t="559" l="518" b="641" r="629"/></fmtnfo>
<txt s="3" a="0" t="559" l="518" b="641" r="562">1</txt>
<txt s="0" a="0" t="566" l="572" b="621" r="629">2</txt>
<fmtnfo type="5" valtype="2"><rect t="537" l="752" b="623" r="1325"/></fmtnfo>
<txt s="0" a="0" t="537" l="752" b="621" r="825">T</txt>
<txt s="0" a="0" t="540" l="844" b="621" r="922">E</txt>
<txt s="0" a="0" t="538" l="937" b="620" r="1017">S</txt>
<txt s="0" a="0" t="541" l="1032" b="622" r="1103">T</txt>
<txt s="0" a="0" t="541" l="1118" b="622" r="1147">I</txt>
<txt s="0" a="0" t="539" l="1163" b="622" r="1222">N</txt>
<txt s="0" a="0" t="542" l="1240" b="623" r="1325">G</txt>
<fmtnfo type="5" valtype="2"><rect t="541" l="1402" b="628" r="2361"/></fmtnfo>
<txt s="0" a="0" t="541" l="1402" b="624" r="1475">T</txt>
<txt s="0" a="0" t="543" l="1492" b="623" r="1583">H</txt>
<txt s="0" a="0" t="543" l="1603" b="623" r="1667">E</txt>
<txt s="0" a="0" t="544" l="1684" b="624" r="1761">-</txt>
<txt s="0" a="0" t="543" l="1770" b="625" r="1849">S</txt>
<txt s="0" a="0" t="543" l="1864" b="625" r="1895">C</txt>
<txt s="0" a="0" t="544" l="1916" b="625" r="1996">R</txt>
<txt s="0" a="0" t="543" l="2012" b="624" r="2083">I</txt>
<txt s="0" a="0" t="547" l="2102" b="626" r="2178">P</txt>
<txt s="0" a="0" t="545" l="2191" b="625" r="2262">T</txt>
<txt s="0" a="0" t="545" l="2278" b="628" r="2361">S</txt>
</meaning>
The XSLT I have written with my current knowledge and understanding is below:
<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<xsl:apply-templates />
</xsl:template>
<xsl:template match="meaning">
<xsl:apply-templates />
</xsl:template>
<xsl:template match="fmtnfo[@type='5']/rect">
<WORD coords="{@t},{@l},{@b},{@r}">
<xsl:apply-templates select="txt" />
</WORD>
</xsl:template>
<xsl:template match="txt">
<xsl:value-of select="." />
</xsl:template>
</xsl:stylesheet>
Any expert help and advice would be much appreciated by this newbie. Thanks.
Best Regards,
Gaiason
The problem is that the <txt> elements are not contained within the <fmtnfo> or <rect> tags, but come after them.
I suggest something like the following:
<xsl:template match="meaning">
  <xsl:apply-templates select="fmtnfo" />
</xsl:template>

<xsl:template match="fmtnfo[@type='5']/rect">
  <WORD coords="{@t},{@l},{@b},{@r}">
    <xsl:variable name="id" select="generate-id(..)"/>
    <xsl:apply-templates select="following::txt[generate-id(preceding-sibling::fmtnfo[1]) = $id]" />
  </WORD>
</xsl:template>
Note that the meaning template has changed to only apply templates to the fmtnfo nodes, so we don't duplicate the txt info.
We use generate-id to identify the fmtnfo nodes, and only select the txt nodes that 'belong' to them.
Hi guru jkmyoung,
Thanks for the help and code. It works. Now I just need to understand how you did that magic.
First, you removed my excess apply-templates to prevent duplication of the txt information, choosing to apply templates only to the fmtnfo nodes.
Second is the part that gets a bit advanced; without your help, I don't think I would ever have come up with the following code:
<xsl:variable name="id" select="generate-id(..)"/>
<xsl:apply-templates select="following::txt[generate-id(preceding-sibling::fmtnfo[1]) = $id]" />
I only understand the theory of how you used it, but not the mechanics. From what I could understand from your reply and some reading, you declare a variable to capture the text from the txt tag. You used generate-id() to create the unique links between the variable and the captured text by going to the node info of fmtnfo?
Could you be so kind as to explain in more detail how line 1 (select="generate-id(..)") and line 2 (select="following::txt[generate-id(preceding-sibling::fmtnfo[1]) = $id]") work?
Thanks.
Best Regards,
Gaiason
Hi guru jkmyoung,
One more question that I have been scratching my head over. Since the tags are closed, I always need to wrap the output around the preceding sibling.
So for the fmtnfo type="3", I need to wrap <LINE></LINE> tags around the output of the fmtnfo type="5" elements:
<LINE>
  <WORDS>1</WORDS>
  <WORDS>2</WORDS>
</LINE>
Please help me with this, as I have tried apply-templates, call-template and for-each but just could not figure it out. Which part of XSLT should I look into to get a better understanding of this kind of code?
Thanks.
Best Regards,
Gaiason
Line 1: select="generate-id(..)"
".." selects the parent node (shorthand for parent::node()). The current node is rect, so generate-id(..) returns a unique identifier string for its parent fmtnfo node. The variable does not capture any text; it only holds that identifier.
generate-id function: http://www.w3schools.com/xsl/func_generateid.asp
Axis links: http://www.zvon.org/xxl/XSLTreferenc...axesIndex.html
Line 2: following::txt
Selects all txt nodes that come after the current node in document order, at any level of the tree. See the axis links for more information.
preceding-sibling::fmtnfo
Gets all preceding fmtnfo siblings of a particular txt node.
preceding-sibling::fmtnfo[1]
Gets the nearest fmtnfo sibling before this txt node. On a reverse axis such as preceding-sibling, [1] means the closest node, not the first one in the document.
generate-id(preceding-sibling::fmtnfo[1])
Gets the id of that nearest preceding fmtnfo sibling.
select="following::txt[generate-id(preceding-sibling::fmtnfo[1]) = $id]"
Selects all following txt nodes whose nearest preceding fmtnfo sibling is the parent of the current node.
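To see the "belongs to" relationship for yourself, a small diagnostic stylesheet can help. This is only a sketch for experimenting, not part of the solution, and it assumes the sample input from the first post (a meaning root with fmtnfo and txt children). For each txt it prints the character together with the type and left coordinate of its nearest preceding fmtnfo:

```xml
<?xml version="1.0"?>
<!-- Diagnostic sketch only: shows which fmtnfo each txt "belongs" to. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="/meaning">
    <xsl:for-each select="txt">
      <!-- preceding-sibling is a reverse axis, so [1] is the nearest
           fmtnfo before this txt, i.e. its owner. -->
      <xsl:variable name="owner" select="preceding-sibling::fmtnfo[1]"/>
      <xsl:value-of select="."/>
      <xsl:text> belongs to fmtnfo type </xsl:text>
      <xsl:value-of select="$owner/@type"/>
      <xsl:text> at l=</xsl:text>
      <xsl:value-of select="$owner/rect/@l"/>
      <!-- One txt per output line. -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input, the first lines should read "1 belongs to fmtnfo type 5 at l=518", "2 belongs to fmtnfo type 5 at l=518" and "T belongs to fmtnfo type 5 at l=752". The WORD template performs the same lookup in the other direction: it starts from the fmtnfo and keeps the txt nodes whose owner has the same generated id.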
-----
If speed is a concern, this entire process can be sped up by declaring a key instead. Put the following just under your stylesheet node:
<xsl:key name="txt_by_preceding_fmtnfo" match="txt" use="generate-id(preceding-sibling::fmtnfo[1])"/>
This links all txt nodes to the fmtnfo node directly preceding them.
And instead of the two lines from earlier, use:
<xsl:apply-templates select="key('txt_by_preceding_fmtnfo', generate-id(..))"/>
For the LINE question, you would use the same sort of trick. You are generating a LINE for every fmtnfo[@type='3'], so declare a template for it and put the LINE in there.
Note that this solution assumes that every fmtnfo[@type='5'] is preceded by a type 3 fmtnfo.
<xsl:template match="fmtnfo[@type='3']">
  <LINE>
    ...
  </LINE>
</xsl:template>
Again, to avoid duplicate processing (of the fmtnfo[@type='5'] nodes this time), you'd have to edit your meaning template:
<xsl:template match="meaning">
  <xsl:apply-templates select="fmtnfo[@type='3']"/>
</xsl:template>
Move the processing of those nodes into the type 3 template. The predicate has to look for the nearest preceding type 3 fmtnfo, because the nearest preceding fmtnfo of any type is usually another type 5:
<xsl:template match="fmtnfo[@type='3']">
  <LINE>
    <xsl:variable name="id" select="generate-id(.)"/>
    <xsl:apply-templates select="following-sibling::fmtnfo[@type='5'][generate-id(preceding-sibling::fmtnfo[@type='3'][1]) = $id]"/>
  </LINE>
</xsl:template>
In all:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:key name="type5" match="fmtnfo[@type='5']" use="generate-id(preceding-sibling::fmtnfo[@type=3][1])"/>
  <xsl:key name="txt" match="txt" use="generate-id(preceding-sibling::fmtnfo[@type=5][1])"/>
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>
  <xsl:template match="meaning">
    <xsl:apply-templates select="fmtnfo[@type='3']"/>
  </xsl:template>
  <xsl:template match="fmtnfo[@type='3']">
    <LINE>
      <xsl:apply-templates select="key('type5', generate-id(.))"/>
    </LINE>
  </xsl:template>
  <xsl:template match="fmtnfo[@type='5']/rect">
    <WORD coords="{@t},{@l},{@b},{@r}">
      <xsl:apply-templates select="key('txt', generate-id(..))"/>
    </WORD>
  </xsl:template>
  <xsl:template match="txt">
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>
Your processor should support keys, but if it doesn't, switch back to the variable id way.
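For reference, a key-free version might look roughly like the sketch below. It simply combines the variable-based templates from earlier in this thread into one stylesheet; I am assuming the same input layout as the sample (each LINE starts with a type 3 fmtnfo, and each word's txt nodes directly follow their type 5 fmtnfo).

```xml
<?xml version="1.0"?>
<!-- Sketch of the variable-id approach, for processors without xsl:key. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>

  <!-- Only start from the type 3 nodes, so nothing is processed twice. -->
  <xsl:template match="meaning">
    <xsl:apply-templates select="fmtnfo[@type='3']"/>
  </xsl:template>

  <!-- One LINE per type 3 fmtnfo, holding the type 5 nodes whose
       nearest preceding type 3 sibling is this node. -->
  <xsl:template match="fmtnfo[@type='3']">
    <LINE>
      <xsl:variable name="id" select="generate-id(.)"/>
      <xsl:apply-templates select="following-sibling::fmtnfo[@type='5'][generate-id(preceding-sibling::fmtnfo[@type='3'][1]) = $id]"/>
    </LINE>
  </xsl:template>

  <!-- One WORD per type 5 rect, holding the txt nodes whose nearest
       preceding fmtnfo sibling is this rect's parent. -->
  <xsl:template match="fmtnfo[@type='5']/rect">
    <WORD coords="{@t},{@l},{@b},{@r}">
      <xsl:variable name="id" select="generate-id(..)"/>
      <xsl:apply-templates select="following::txt[generate-id(preceding-sibling::fmtnfo[1]) = $id]"/>
    </WORD>
  </xsl:template>

  <xsl:template match="txt">
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

Run against the sample input, this should give a single LINE element wrapping the three WORD elements (12, TESTING and THE-SCRIPTS). The exact indentation depends on the processor. On large documents the predicate is re-evaluated for every candidate node, which is why the keyed version is usually the faster choice.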
Hi guru jkmyoung,
The generate-id() and key() magic works! You really helped me a lot in solving this puzzle. I've read your explanation and code, and I now have a clearer picture of the logic behind the stylesheet. I just need a bit of time to understand the usage and learn how to apply those axes and functions correctly.
Thank you very much, have a nice weekend.
Best Regards,
Gaiason