Please help on this XML/XSLT

Gaiason
Hi XML/XSLT masters and gurus,

I am new to XML/XSLT and have been reading up on it in order to convert XML generated by an OCR engine into another XML format. I am truly stuck at this stage and need expert help working out what I am missing or misunderstanding.

Basically, my desired output has the text inside the WORD tags, like this:
<WORD coords="559,518,641,629">12</WORD>
<WORD coords="537,752,623,1325">TESTING</WORD>
<WORD coords="541,1402,628,2361">THE-SCRIPTS</WORD>

But I just cannot get the text to go in; it is left outside the tags, like this:
<WORD coords="559,518,641,629"></WORD>12
<WORD coords="537,752,623,1325"></WORD>TESTING
<WORD coords="541,1402,628,2361"></WORD>THE-SCRIPTS

I really need help and advice on this. My XML is below:
<?xml version="1.0" encoding="UTF-8" ?>
<meaning>
<fmtnfo type="3" valtype="2"><rect t="537" l="518" b="641" r="3045"/></fmtnfo>
<fmtnfo type="5" valtype="2"><rect t="559" l="518" b="641" r="629"/></fmtnfo>
<txt s="3" a="0" t="559" l="518" b="641" r="562">1</txt>
<txt s="0" a="0" t="566" l="572" b="621" r="629">2</txt>
<fmtnfo type="5" valtype="2"><rect t="537" l="752" b="623" r="1325"/></fmtnfo>
<txt s="0" a="0" t="537" l="752" b="621" r="825">T</txt>
<txt s="0" a="0" t="540" l="844" b="621" r="922">E</txt>
<txt s="0" a="0" t="538" l="937" b="620" r="1017">S</txt>
<txt s="0" a="0" t="541" l="1032" b="622" r="1103">T</txt>
<txt s="0" a="0" t="541" l="1118" b="622" r="1147">I</txt>
<txt s="0" a="0" t="539" l="1163" b="622" r="1222">N</txt>
<txt s="0" a="0" t="542" l="1240" b="623" r="1325">G</txt>
<fmtnfo type="5" valtype="2"><rect t="541" l="1402" b="628" r="2361"/></fmtnfo>
<txt s="0" a="0" t="541" l="1402" b="624" r="1475">T</txt>
<txt s="0" a="0" t="543" l="1492" b="623" r="1583">H</txt>
<txt s="0" a="0" t="543" l="1603" b="623" r="1667">E</txt>
<txt s="0" a="0" t="544" l="1684" b="624" r="1761">-</txt>
<txt s="0" a="0" t="543" l="1770" b="625" r="1849">S</txt>
<txt s="0" a="0" t="543" l="1864" b="625" r="1895">C</txt>
<txt s="0" a="0" t="544" l="1916" b="625" r="1996">R</txt>
<txt s="0" a="0" t="543" l="2012" b="624" r="2083">I</txt>
<txt s="0" a="0" t="547" l="2102" b="626" r="2178">P</txt>
<txt s="0" a="0" t="545" l="2191" b="625" r="2262">T</txt>
<txt s="0" a="0" t="545" l="2278" b="628" r="2361">S</txt>
</meaning>

The XSLT I have written with my current knowledge and understanding is below:
<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="yes"/>

<xsl:template match="/">
<xsl:apply-templates />
</xsl:template>

<xsl:template match="meaning">
<xsl:apply-templates />
</xsl:template>

<xsl:template match="fmtnfo[@type='5']/rect">
<WORD coords="{@t},{@l},{@b},{@r}">
<xsl:apply-templates select="txt" />
</WORD>
</xsl:template>

<xsl:template match="txt">
<xsl:value-of select="." />
</xsl:template>

</xsl:stylesheet>

All your expert help and advice is really needed for this newbie. Thanks.

Best Regards,
Gaiason
Oct 3 '07 #1
jkmyoung
The problem is that the <txt> elements are not contained within the <fmtnfo> or <rect> tags; they come after them, as siblings of <fmtnfo>. That is why select="txt" inside the rect template finds nothing.

I suggest something like the following:
<xsl:template match="meaning">
  <xsl:apply-templates select="fmtnfo" />
</xsl:template>

<xsl:template match="fmtnfo[@type='5']/rect">
  <WORD coords="{@t},{@l},{@b},{@r}">
    <xsl:variable name="id" select="generate-id(..)"/>
    <xsl:apply-templates select="following::txt[generate-id(preceding-sibling::fmtnfo[1]) = $id]" />
  </WORD>
</xsl:template>
Note that the meaning template has changed to apply templates only to the fmtnfo nodes, so we don't duplicate the txt info.
We use generate-id to identify the fmtnfo nodes, and select only the txt nodes that 'belong' to them.
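
Combined with the txt template from the original stylesheet, the whole thing might look like the sketch below. It assumes the input has exactly the shape shown in the question (fmtnfo and txt as siblings under meaning). It leaves out the explicit "/" template, because the built-in rules already descend from the root to meaning.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>

  <!-- Walk only the fmtnfo children; txt elements are pulled in by the rect template -->
  <xsl:template match="meaning">
    <xsl:apply-templates select="fmtnfo"/>
  </xsl:template>

  <!-- One WORD per type-5 rectangle -->
  <xsl:template match="fmtnfo[@type='5']/rect">
    <WORD coords="{@t},{@l},{@b},{@r}">
      <!-- Identity of the parent fmtnfo of this rect -->
      <xsl:variable name="id" select="generate-id(..)"/>
      <!-- Every later txt whose nearest preceding fmtnfo is that parent -->
      <xsl:apply-templates select="following::txt[generate-id(preceding-sibling::fmtnfo[1]) = $id]"/>
    </WORD>
  </xsl:template>

  <!-- Emit the character held by each txt -->
  <xsl:template match="txt">
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

Run against the sample XML, this should give the three WORD elements from the question with 12, TESTING and THE-SCRIPTS inside them. The result has no single root element; a wrapper is not added here because the question does not show one.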
Oct 3 '07 #2
Hi guru jkmyoung,

Thanks for the help and the code. It works. Now I just need to understand how you did that magic.

First, you removed my excess apply-templates to prevent duplication of the txt information, choosing instead to apply templates only to the fmtnfo nodes, since that is where the rect data is located.

Second is the part that gets a bit advanced; without your help, I don't think I would ever have come up with the code below:

  1. <xsl:variable name="id" select="generate-id(..)"/>
  2. <xsl:apply-templates select="following::txt[generate-id(preceding-sibling::fmtnfo[1]) = $id]" />
I only understand the theory of how you used it, not the mechanics. From what I could gather from your reply and some reading, you declare a variable to capture the text from the txt tag, and you use generate-id() to create a unique link between the variable and the captured text by going to the fmtnfo node's information. Is that right?

Could you be so kind as to explain in more detail how line 1 (select="generate-id(..)") and line 2 (select="following::txt[generate-id(preceding-sibling::fmtnfo[1]) = $id]") work?

Thanks.

Best Regards,
Gaiason

Oct 4 '07 #3
Hi guru jkmyoung,

One more question that I have been scratching my head over. Because the fmtnfo tags are closed before the content they describe, I always have to group that content by its preceding sibling.

So for the fmtnfo type="3", I need to wrap <LINE></LINE> tags around the output for the fmtnfo type="5" elements that follow it:
<LINE>
  <WORDS>1</WORDS>
  <WORDS>2</WORDS>
</LINE>
Please help me with this; I have tried apply-templates, call-template and for-each, but I just cannot figure it out. Which part of XSLT should I look into to better understand this kind of problem?

Thanks.

Best Regards,
Gaiason

Oct 4 '07 #4
jkmyoung
Line 1: select="generate-id(..)"
".." means the parent node (it is shorthand for the parent axis). The current node here is rect, so generate-id(..) returns a string that uniquely identifies its parent fmtnfo node.
generate-id function: http://www.w3schools.com/xsl/func_generateid.asp
Axis links: http://www.zvon.org/xxl/XSLTreferenc...axesIndex.html


Line 2: following::txt
Selects all txt elements that come after the current node in document order, at any level of the tree (not just siblings). See the axis links for more information.

preceding-sibling::fmtnfo
Gets all fmtnfo elements that are preceding siblings of a particular txt node.

preceding-sibling::fmtnfo[1]
Gets the nearest fmtnfo sibling before this txt node. On a reverse axis such as preceding-sibling, position 1 is the closest node, not the first one in the document.

generate-id(preceding-sibling::fmtnfo[1])
Gets the id of that nearest preceding fmtnfo sibling.

select="following::txt[generate-id(preceding-sibling::fmtnfo[1]) = $id]"
Selects all following txt nodes whose nearest preceding fmtnfo sibling is the parent of the current rect node.
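
To see what preceding-sibling::fmtnfo[1] picks for each txt, a throwaway diagnostic stylesheet along these lines can help. It is only a sketch for exploring the sample XML from the question, and the variable name owner is made up for the illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Visit only the txt elements -->
  <xsl:template match="meaning">
    <xsl:apply-templates select="txt"/>
  </xsl:template>

  <!-- For each txt, report the nearest fmtnfo sibling that precedes it -->
  <xsl:template match="txt">
    <xsl:variable name="owner" select="preceding-sibling::fmtnfo[1]"/>
    <xsl:value-of select="."/>
    <xsl:text> belongs to fmtnfo type </xsl:text>
    <xsl:value-of select="$owner/@type"/>
    <xsl:text> at l=</xsl:text>
    <xsl:value-of select="$owner/rect/@l"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With the sample data, the characters 1 and 2 should report the type 5 fmtnfo at l=518, the letters of TESTING the one at l=752, and the rest the one at l=1402. That grouping is what the generate-id comparison relies on.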

-----
If speed is a concern, this entire process can be sped up by declaring a key instead. Put the following directly inside your xsl:stylesheet element, as a top-level declaration:
<xsl:key name="txt_by_preceding_fmtnfo" match="txt" use="generate-id(preceding-sibling::fmtnfo[1])"/>
This links all txt nodes to the fmtnfo node directly preceding them.

And instead of the 2 lines earlier, use
<xsl:apply-templates select="key('txt_by_preceding_fmtnfo', generate-id(..))"/>
Oct 4 '07 #5
jkmyoung
This would use the same sort of trick. You are generating a LINE for every fmtnfo[@type='3'], so declare a template for it and put the LINE in there.
Note, this solution assumes that every fmtnfo[@type=5] is preceded by a type 3 fmtnfo.
<xsl:template match="fmtnfo[@type='3']">
  <LINE>
    ...
  </LINE>
</xsl:template>
Again, to avoid duplicate processing (of the fmtnfo[@type='5'] nodes this time), you'd have to edit your meaning template:

<xsl:template match="meaning">
  <xsl:apply-templates select="fmtnfo[@type='3']"/>
</xsl:template>
Move the processing of those nodes into the type 3 template.
<xsl:template match="fmtnfo[@type='3']">
  <LINE>
    <xsl:variable name="id" select="generate-id(.)"/>
    <xsl:apply-templates select="following-sibling::fmtnfo[@type='5'][generate-id(preceding-sibling::fmtnfo[@type='3'][1]) = $id]"/>
  </LINE>
</xsl:template>
In all:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="xml" indent="yes"/>
    <xsl:key name="type5" match="fmtnfo[@type='5']" use="generate-id(preceding-sibling::fmtnfo[@type=3][1])"/>
    <xsl:key name="txt" match="txt" use="generate-id(preceding-sibling::fmtnfo[@type=5][1])"/>
    <xsl:template match="/">
        <xsl:apply-templates/>
    </xsl:template>
    <xsl:template match="meaning">
        <xsl:apply-templates select="fmtnfo[@type='3']"/>
    </xsl:template>
    <xsl:template match="fmtnfo[@type='3']">
        <LINE>
            <xsl:apply-templates select="key('type5', generate-id(.))"/>
        </LINE>
    </xsl:template>
    <xsl:template match="fmtnfo[@type='5']/rect">
        <WORD coords="{@t},{@l},{@b},{@r}">
            <xsl:apply-templates select="key('txt', generate-id(..))"/>
        </WORD>
    </xsl:template>
    <xsl:template match="txt">
        <xsl:value-of select="."/>
    </xsl:template>
</xsl:stylesheet>
Your processor should support keys, but if it doesn't, switch back to the variable id way.
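
For reference, a key-free version of the whole stylesheet, using the variable id approach throughout, could look roughly like this. It is a sketch under the same assumption as above (every type 5 fmtnfo is preceded by a type 3 fmtnfo), and the predicates filter on @type in the same way the key definitions do.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>

  <!-- Start from the type-3 fmtnfo elements only -->
  <xsl:template match="meaning">
    <xsl:apply-templates select="fmtnfo[@type='3']"/>
  </xsl:template>

  <!-- One LINE per type-3 fmtnfo, holding the type-5 fmtnfo elements that follow it -->
  <xsl:template match="fmtnfo[@type='3']">
    <xsl:variable name="id" select="generate-id(.)"/>
    <LINE>
      <xsl:apply-templates select="following-sibling::fmtnfo[@type='5'][generate-id(preceding-sibling::fmtnfo[@type='3'][1]) = $id]"/>
    </LINE>
  </xsl:template>

  <!-- One WORD per type-5 rectangle, holding the txt elements that follow it -->
  <xsl:template match="fmtnfo[@type='5']/rect">
    <xsl:variable name="id" select="generate-id(..)"/>
    <WORD coords="{@t},{@l},{@b},{@r}">
      <xsl:apply-templates select="following::txt[generate-id(preceding-sibling::fmtnfo[@type='5'][1]) = $id]"/>
    </WORD>
  </xsl:template>

  <!-- Emit the character held by each txt -->
  <xsl:template match="txt">
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

On the sample XML this should produce a single LINE element containing the three WORD elements. On large inputs it will likely be slower than the keyed version, since each predicate rescans the following nodes.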
Oct 4 '07 #6
Hi guru jkmyoung,

The generate-id() and key() magic works! You really helped me a lot in solving this puzzle. I've read your explanation and code and now have a clearer picture of the logic behind the script. I may just need a bit of time to understand the usage and learn how to apply those axes and functions correctly.

Thank you very much, have a nice weekend.

Best Regards,
Gaiason
Oct 5 '07 #7
