
Merging 2 different XML files...

Newbie
 
Join Date: Nov 2008
Posts: 9
#1: Nov 26 '08
Hi,

I have two files to merge using Java based on a similar text identifier:

File 1:
  1. <ListRecords>
  2. <record> 
  3. <header> 
  4. <identifier>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</identifier> 
  5. <datestamp>2007-05-29T15:55:00Z</datestamp> 
  6. <datestampasdatetime>2007-05-29T17:55:00+02:00</datestampasdatetime> 
  7. </header> 
  8. <metadata> 
  9. <lom xsi:schemaLocation="http://dpc.uba.uva.nl/schema/lom/triplel http://dpc.uba.uva.nl/schema/lom/triplel/lom.xsd"> 
  10. <general > 
  11. <identifier>
  12. <catalog>oai</catalog>
  13. <entry>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</entry>
  14. </identifier>
  15. <title> 
  16. <langstring> 
  17. <value>Graduation mw. S. de Caralt</value> 
  18. <language>en</language> 
  19. </langstring> 
  20. </title> 
  21. <catalogentry> 
  22. <catalog>nl.wur.wurtv</catalog> 
  23. <entry> 
  24. <langstring> 
  25. <value>2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</value> 
  26. <language>x-none</language> 
  27. </langstring> 
  28. </entry> 
  29. </catalogentry> 
  30. <grouplanguage>en</grouplanguage> 
  31. <description> 
  32. <langstring> 
  33. <value>Sponge Culture: Learning from Biology and Ecology</value> 
  34. <language>en</language> 
  35. </langstring> 
  36. </description> 
  37. </general> 
  38. <lifecycle xmlns="" /> 
  39. <metametadata > 
  40. <metadatascheme>LORENET</metadatascheme> 
  41. </metametadata> 
  42. </lom> 
  43. </metadata> 
  44. </record>
  45. <….More Records here…..!>
  46. </ListRecords>
  47.  
File 2:
  1. <ListRecords>
  2.  <record>
  3.  <header>
  4.   <identifier>some value here</identifier> 
  5.   <datestamp>2008-07-14T09:23:25Z</datestamp> 
  6.   </header>
  7.  <metadata>
  8.  <group xsi:schemaLocation="http://dpc.uba.uva.nl/schema/lom/triplel http://dpc.uba.uva.nl/schema/lom/triplel/lom.xsd">
  9.   <title>User manipulating this</title> 
  10.  <feed>
  11.   <title>My feed</title> 
  12.   <url>http://no.url.available</url> 
  13.  <item>
  14.   <guid>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</guid> 
  15.  <events>
  16.  <event>
  17.   <dateTime>2008-03-26T13:27:49.00</dateTime> 
  18.  <action>
  19.   <actionType>doSomeAtcion</actionType> 
  20.   </action>
  21.   </event>
  22.   </events>
  23.   </item>
  24.   </feed>
  25.   </group>
  26.   </metadata>
  27.   </record>
  28. <....More Records here....!>
  29. </ListRecords>
I want to merge the <metadata> element and all its sub-elements from file 1 into file 2, inside file 2's <metadata> element, matching on the unique text of "<identifier>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</identifier>" in file 1 and the same ID in <guid>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</guid> in file 2.

Any suggestions and guidelines will be highly appreciated.

Thanks.

Newbie
 
Join Date: Nov 2008
Posts: 9
#2: Nov 26 '08

re: Merging 2 different XML files...


Sorry, I forgot to mention: I have to use Java to merge them.
Moderator
 
Join Date: Mar 2006
Posts: 1,103
#3: Nov 26 '08

re: Merging 2 different XML files...


Need to define:
  • Rows/identifiers
  • Fields to be merged
  • Merging rules


Please correct if any of the following is wrong.
Assumptions from looking at the code:

Fields summarized in xpaths:
File 1
rows: /ListRecords/record
row id: header/identifier

File 2
rows: /ListRecords/record/metadata/group/feed/item
row id: guid


Let's look at the separate XML sections to be merged:
File 2:
<item>
  <guid>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</guid>
  <events>
    <event>
      <dateTime>2008-03-26T13:27:49.00</dateTime>
      <action>
        <actionType>doSomeAtcion</actionType>
      </action>
    </event>
  </events>
</item>

Is this technically a 'join'? E.g., are you just adding fields from one file to another, or are you copying over existing fields?


Since you're merging into a file I would recommend either:
1. DOM. Open both files with DOM. Add nodes to File1 DOM. Save back to file.
2. XSLT. Performance may be less than optimal, but code is much more maintainable.
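
A rough sketch of what option 1 (DOM) could look like in Java follows. It is not a finished solution: the class and method names (`DomMergeSketch`, `merge`, `child`, `parse`) are made up for illustration, and it assumes that a copy of file 1's <metadata> should be appended at the end of the file 2 record whose <guid> equals the file 1 header/identifier. The exact insertion point would need adjusting to the merging rules.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class and method names, used only for this sketch.
class DomMergeSketch {

    /** Parses an XML string; for real files use DocumentBuilder.parse(File) instead. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Returns the first direct child element with the given tag name, or null. */
    static Element child(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    /**
     * Appends a copy of each matching file 1 <metadata> to the file 2 record
     * whose <guid> equals the file 1 header/identifier.
     * Returns the number of copies made.
     */
    static int merge(Document file1, Document file2) {
        // Index file 1 <metadata> elements by their header/identifier text.
        Map<String, Element> metadataById = new HashMap<>();
        NodeList records1 = file1.getElementsByTagName("record");
        for (int i = 0; i < records1.getLength(); i++) {
            Element record = (Element) records1.item(i);
            Element header = child(record, "header");
            Element id = header == null ? null : child(header, "identifier");
            Element metadata = child(record, "metadata");
            if (id != null && metadata != null) {
                metadataById.put(id.getTextContent().trim(), metadata);
            }
        }

        int copies = 0;
        NodeList records2 = file2.getElementsByTagName("record");
        for (int i = 0; i < records2.getLength(); i++) {
            Element record = (Element) records2.item(i);
            // Collect guid values first: the NodeList is live and nodes are added below.
            List<String> guids = new ArrayList<>();
            NodeList guidNodes = record.getElementsByTagName("guid");
            for (int j = 0; j < guidNodes.getLength(); j++) {
                guids.add(guidNodes.item(j).getTextContent().trim());
            }
            for (String guid : guids) {
                Element match = metadataById.get(guid);
                if (match != null) {
                    // importNode makes a deep copy owned by the file 2 document.
                    record.appendChild(file2.importNode(match, true));
                    copies++;
                }
            }
        }
        return copies;
    }
}
```

To save the result back to a file, the merged document can be written with an identity Transformer (a DOMSource in, a StreamResult out). Because the lookup is a map keyed by identifier, the same code handles any number of records.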
Newbie
 
Join Date: Nov 2008
Posts: 9
#4: Nov 26 '08

re: Merging 2 different XML files...


Thanks a lot for your reply. I was worried about this because I have a deadline.
Actually, I want to join the record with the matching ID from file 2 into file 1 after the file 1 record for that ID ends. The output might look like:
<ListRecords>
  <record>
    <header>
      <identifier>some value here</identifier>
      <datestamp>2008-07-14T09:23:25Z</datestamp>
    </header>
    <metadata>
      <group xsi:schemaLocation="http://dpc.uba.uva.nl/schema/lom/triplel http://dpc.uba.uva.nl/schema/lom/triplel/lom.xsd">
        <title>User manipulating this</title>
        <feed>
          <title>My feed</title>
          <url>http://no.url.available</url>
          <item>
            <guid>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</guid>
            <events>
              <event>
                <dateTime>2008-03-26T13:27:49.00</dateTime>
                <action>
                  <actionType>doSomeAtcion</actionType>
                </action>
              </event>
            </events>
          </item>
        </feed>
      </group>
      <header>
        <identifier>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</identifier>
        <datestamp>2007-05-29T15:55:00Z</datestamp>
        <datestampasdatetime>2007-05-29T17:55:00+02:00</datestampasdatetime>
      </header>
      <metadata>
        <lom xsi:schemaLocation="http://dpc.uba.uva.nl/schema/lom/triplel http://dpc.uba.uva.nl/schema/lom/triplel/lom.xsd">
          <general>
            <identifier>
              <catalog>oai</catalog>
              <entry>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</entry>
            </identifier>
            <title>
              <langstring>
                <value>Graduation mw. S. de Caralt</value>
                <language>en</language>
              </langstring>
            </title>
            <catalogentry>
              <catalog>nl.wur.wurtv</catalog>
              <entry>
                <langstring>
                  <value>2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</value>
                  <language>x-none</language>
                </langstring>
              </entry>
            </catalogentry>
            <grouplanguage>en</grouplanguage>
            <description>
              <langstring>
                <value>Sponge Culture: Learning from Biology and Ecology</value>
                <language>en</language>
              </langstring>
            </description>
          </general>
          <lifecycle xmlns="" />
          <metametadata>
            <metadatascheme>LORENET</metadatascheme>
          </metametadata>
        </lom>
      </metadata>
    </metadata>
  </record>

I have to do this for about 10 records that have matching IDs in both files.
Moderator
 
Join Date: Mar 2006
Posts: 1,103
#5: Nov 26 '08

re: Merging 2 different XML files...


Considering this, I would probably use xslt.

Driving xslt in Java (sample):

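import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

// class name is illustrative
public class MergeDriver {
    public static void main(String[] args) throws Exception {
        // set file names
        File file1 = new File("Filename1.xml");
        String filename2 = "Filename2.xml";
        File xslt = new File("FileXSLT.xslt");
        File dest = new File("resultFile.xml");

        // build transformer
        TransformerFactory xformFactory = TransformerFactory.newInstance();
        Transformer transformer = xformFactory.newTransformer(new StreamSource(xslt));

        // set file2 filename parameter (read by document($file2) in the stylesheet)
        transformer.setParameter("file2", filename2);

        // Modularization :( looks stupid, but actually makes it perform better.
        DocumentBuilderFactory docBuildFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder parser = docBuildFactory.newDocumentBuilder();
        Document document = parser.parse(file1);

        // transform the parsed DOM of file 1 and write the result
        transformer.transform(new DOMSource(document), new StreamResult(dest));
    }
}
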
For more info, google "java xslt transformation"

XSLT: Starting with a copy template, add templates for the fields that need to be merged. I'm having trouble seeing which fields need to be merged, so I hope you can figure it out from the example.
<xsl:param name="file2" select="''"/><!-- defaults to empty string -->
<xsl:variable name="doc2" select="document($file2)"/><!-- convert to nodes -->

<xsl:template match="*"><!-- copy template -->
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates/>
  </xsl:copy>
</xsl:template>

<xsl:template match="record">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates/>
    <!-- add in other stuff here -->
    <xsl:copy-of select="$doc2/ListRecords/record/metadata/group/feed/item[guid = current()/header/identifier]"/>
  </xsl:copy>
</xsl:template>

Key line in all of this is:
<xsl:copy-of select="$doc2/ListRecords/record/metadata/group/feed/item[guid = current()/header/identifier]"/>
Copy the item nodes which match the current node's id.

Customize this to merge as you need.
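
To check what that predicate selects without running a whole transform, the same comparison can be evaluated with the standard javax.xml.xpath API. This is only a sketch under a few assumptions: the class and method names (`GuidLookupSketch`, `countItems`) are invented for illustration, the XPath variable `$id` stands in for current()/header/identifier, and the path includes the feed step because in File 2 the item elements sit inside feed.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the stylesheet above.
class GuidLookupSketch {

    /** Counts the <item> elements in the File 2 XML whose <guid> equals the given id. */
    static int countItems(String file2Xml, String id) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // $id plays the role of current()/header/identifier in the stylesheet.
        xpath.setXPathVariableResolver(
                name -> "id".equals(name.getLocalPart()) ? id : null);
        try {
            NodeList items = (NodeList) xpath.evaluate(
                    "/ListRecords/record/metadata/group/feed/item[guid = $id]",
                    new InputSource(new StringReader(file2Xml)),
                    XPathConstants.NODESET);
            return items.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }
}
```

If the count is zero for an id that should match, the usual suspects are a missing path step or stray whitespace inside the guid or identifier text.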
Newbie
 
Join Date: Nov 2008
Posts: 9
#6: Nov 30 '08

re: Merging 2 different XML files...


Thank you very much for the reply. At least I got the idea, but the problem is that I am totally new to XSLT and, because of the deadline, have no time to start with tutorials. I am still trying and hope to solve it; if I have any problems I will post them.
Newbie
 
Join Date: Nov 2008
Posts: 9
#7: Nov 30 '08

re: Merging 2 different XML files...


Hi,
Thanks a lot for your help. I tried (and am still trying) but couldn't manage to write the XSLT file correctly, and because of the deadline it's not possible to start an XSLT tutorial from the beginning. Please help me, so that once this first task is done I will be able to read more about it. Tomorrow is the deadline :-(

The top elements in both files are <ListRecords> and then <record>. This <record> is one unique record, identified by the <identifier> value in file 1 (line 4) and the <guid> value in file 2 (line 14). The record with this ID has different data elements (different fields) in each file. I want to merge the record with that ID from file 2 into file 1.

There is also a <metadata> element in both files: file 1 (lines 8 to 42) and file 2 (lines 7 to 26). I want to simply copy this <metadata> element and the elements in between (sub-elements) till line 43 from file 2 into file 1, after file 1's <metadata> element ends at line 26, and after that the last element would then simply be the closing </record>.
There are 10 unique records in both files and the final file should contain all of them in a similar way, so I hope that if one is correctly merged the others will follow the same template match.
Please help me, as I am really worried; the first task in a new language is always such a headache.

Best Regards
Newbie
 
Join Date: Nov 2008
Posts: 9
#8: Nov 30 '08

re: Merging 2 different XML files...


"There is also a <metadata> element in both files: file 1 (lines 8 to 42) and file 2 (lines 7 to 26). I want to simply copy this <metadata> element and the elements in between (sub-elements) till line 43 from file 2 into file 1, after file 1's <metadata> element ends at line 26, and after that the last element would then simply be the closing </record>."

Sorry, a small mistake in the paragraph above: I want to merge the record from file 1 into file 2, not the other way around.
Moderator
 
Join Date: Mar 2006
Posts: 1,103
#9: Dec 1 '08

re: Merging 2 different XML files...


Could you show us what you have so far? If you can get the first few fields copying correctly, then it'll be easier to figure out mistakes you're making with the rest.
Newbie
 
Join Date: Nov 2008
Posts: 9
#10: Dec 1 '08

re: Merging 2 different XML files...


Thanks for the reply.
Actually, I only changed the XPath you provided. I made a mistake when saying which file to copy from: I have to copy data from file 1 into file 2, under the record element, based on that unique ID. So I only changed the XPath in the sample you provided (I am not sure I did it right, as I am confused), so it is:

<?xml version="1.0"?>
<xsl:stylesheet version = '1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method="xml" indent="yes"/>
  <xsl:param name="file1" select="''"/><!-- defaults to empty string -->
  <xsl:variable name="doc1" select="document($file1)"/><!-- convert to nodes -->
  <xsl:template match="*"><!-- copy template -->
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="record">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
      <!-- copy data from file 1 into file 2 based on guid in file 2 -->
      <xsl:copy-of select="$doc1/ListRecords/record/header[identifier = current()/item//feed/guid]"/> <!-- dont know whether where will it copy that data and under which element of file 2 -->
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

So I don't know how to copy all the metadata from file 1 into file 2 exactly after file 2's metadata element ends. I know I didn't do much...
I hope you can help me solve it.
Moderator
 
Join Date: Mar 2006
Posts: 1,103
#11: Dec 1 '08

re: Merging 2 different XML files...


The easiest way I can think of (not the best programmatically) is to have a template for the last metadata element. Use an XPath like: "metadata[not(following::metadata)]"
<xsl:template match="metadata[not(following::metadata)]">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates/>
  </xsl:copy>
  <!-- add rest from other file -->
  <xsl:copy-of select="$doc1//metadata"/>
</xsl:template>

Newbie
 
Join Date: Nov 2008
Posts: 9
#12: Dec 2 '08

re: Merging 2 different XML files...


Thanks a lot for the help.
Yes, it works, but only when there is one record in each file. Merging more records would require a more concrete approach...
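
One possible way to make the merge work per record is to match on record instead of on the last metadata element, and to copy only the file 1 <metadata> whose header/identifier equals a <guid> inside the current file 2 record. The sketch below drives such a stylesheet from Java. It rests on assumptions: file 2 is the main input and file 1 is passed in through the file1 parameter (as in the stylesheet posted earlier in this thread), the copied <metadata> is placed at the end of the matching record, and the class and method names (`PerRecordMergeSketch`, `merge`, `writeTemp`) are invented for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class; the stylesheet is embedded only to keep the sketch self-contained.
class PerRecordMergeSketch {

    static final String XSLT = String.join("\n",
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>",
        "  <xsl:param name='file1'/>",
        "  <xsl:variable name='doc1' select='document($file1)'/>",
        "  <!-- identity template: copy everything in file 2 unchanged -->",
        "  <xsl:template match='@*|node()'>",
        "    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>",
        "  </xsl:template>",
        "  <!-- per record: append the file 1 metadata whose identifier matches a guid here -->",
        "  <xsl:template match='record'>",
        "    <xsl:copy>",
        "      <xsl:apply-templates select='@*|node()'/>",
        "      <xsl:copy-of select='$doc1/ListRecords/record[header/identifier = current()//guid]/metadata'/>",
        "    </xsl:copy>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    /** Transforms file 2, pulling matching metadata from file 1; returns the merged XML. */
    static String merge(Path file1, Path file2) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            // An absolute file: URI, so document() does not depend on a base URI.
            transformer.setParameter("file1", file1.toUri().toString());
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(file2.toFile()), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    /** Writes XML text to a temporary file (handy for trying the sketch out). */
    static Path writeTemp(String xml) {
        try {
            Path path = Files.createTempFile("merge-sketch", ".xml");
            Files.write(path, xml.getBytes(StandardCharsets.UTF_8));
            return path;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the predicate is evaluated once per record, each of the 10 records picks up only its own metadata. Note that an XSLT processor parses its input namespace-aware, so this assumes the real files declare the xsi prefix somewhere (the excerpts posted above do not show that declaration).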

Tags
documentbuilder, java, join, merge, transformer, xml, xslt