Merging 2 different XML files...

Hi,

I have two files to merge using Java based on a similar text identifier:

File 1:
  1. <ListRecords>
  2. <record> 
  3. <header> 
  4. <identifier>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</identifier> 
  5. <datestamp>2007-05-29T15:55:00Z</datestamp> 
  6. <datestampasdatetime>2007-05-29T17:55:00+02:00</datestampasdatetime> 
  7. </header> 
  8. <metadata> 
  9. <lom xsi:schemaLocation="http://dpc.uba.uva.nl/schema/lom/triplel http://dpc.uba.uva.nl/schema/lom/triplel/lom.xsd"> 
  10. <general > 
  11. <identifier>
  12. <catalog>oai</catalog>
  13. <entry>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</entry>
  14. </identifier>
  15. <title> 
  16. <langstring> 
  17. <value>Graduation mw. S. de Caralt</value> 
  18. <language>en</language> 
  19. </langstring> 
  20. </title> 
  21. <catalogentry> 
  22. <catalog>nl.wur.wurtv</catalog> 
  23. <entry> 
  24. <langstring> 
  25. <value>2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</value> 
  26. <language>x-none</language> 
  27. </langstring> 
  28. </entry> 
  29. </catalogentry> 
  30. <grouplanguage>en</grouplanguage> 
  31. <description> 
  32. <langstring> 
  33. <value>Sponge Culture: Learning from Biology and Ecology</value> 
  34. <language>en</language> 
  35. </langstring> 
  36. </description> 
  37. </general> 
  38. <lifecycle xmlns="" /> 
  39. <metametadata > 
  40. <metadatascheme>LORENET</metadatascheme> 
  41. </metametadata> 
  42. </lom> 
  43. </metadata> 
  44. </record>
  45. <….More Records here…..!>
  46. </ListRecords>
  47.  
File 2:
  1. <ListRecords>
  2.  <record>
  3.  <header>
  4.   <identifier>some value here</identifier> 
  5.   <datestamp>2008-07-14T09:23:25Z</datestamp> 
  6.   </header>
  7.  <metadata>
  8.  <group xsi:schemaLocation="http://dpc.uba.uva.nl/schema/lom/triplel http://dpc.uba.uva.nl/schema/lom/triplel/lom.xsd">
  9.   <title>User manipulating this</title> 
  10.  <feed>
  11.   <title>My feed</title> 
  12.   <url>http://no.url.available</url> 
  13.  <item>
  14.   <guid>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</guid> 
  15.  <events>
  16.  <event>
  17.   <dateTime>2008-03-26T13:27:49.00</dateTime> 
  18.  <action>
  19.   <actionType>doSomeAtcion</actionType> 
  20.   </action>
  21.   </event>
  22.   </events>
  23.   </item>
  24.   </feed>
  25.   </group>
  26.   </metadata>
  27.   </record>
  28. <....More Records here....!>
  29. </ListRecords>
I want to merge the <metadata> element and all its sub-elements from file 1 into file 2, inside its <metadata> element, matching on the unique text of the element "<identifier>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</identifier>" in file 1 and the same ID in <guid>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</guid> in file 2.

Any suggestions and guidelines will be highly appreciated.

Thanks.
Nov 26 '08 #1
Sorry, I forgot to mention: I have to use Java to merge them.
Nov 26 '08 #2
jkmyoung
2,057 Expert 2GB
Need to define:
  • Rows/identifiers
  • Fields to be merged
  • Merging rules


Please correct if any of the following is wrong.
Assumptions from looking at the code:

Fields summarized in xpaths:
File 1
rows: /ListRecords/record
row id: header/identifier

File 2
rows: /ListRecords/record/metadata/group/feed/item


Let's look at the separate XML sections to be merged:
File 2:
  <item>
    <guid>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</guid>
    <events>
      <event>
        <dateTime>2008-03-26T13:27:49.00</dateTime>
        <action>
          <actionType>doSomeAtcion</actionType>
        </action>
      </event>
    </events>
  </item>

Is this technically a 'join'? E.g., are you just adding fields from one file to another, or are you copying over existing fields?


Since you're merging into a file I would recommend either:
1. DOM. Open both files with DOM. Add nodes to File1 DOM. Save back to file.
2. XSLT. Performance may be less than optimal, but code is much more maintainable.
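
For option 1, a minimal DOM sketch might look like the following. It assumes that the <metadata> of a file 1 record is appended to the file 2 record whose <guid> matches file 1's header/identifier; swap the roles of the two documents if you need the other direction. The class and method names (DomMerge, merge, child) are made up for illustration, and the XML is passed as strings only to keep the sketch self-contained; with real files you would parse java.io.File objects instead.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
public class DomMerge {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // First direct child element with the given tag name, or null if absent.
    static Element child(Element parent, String name) {
        if (parent == null) return null;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && n.getNodeName().equals(name)) return (Element) n;
        }
        return null;
    }

    static String merge(String file1Xml, String file2Xml) throws Exception {
        Document doc1 = parse(file1Xml);
        Document doc2 = parse(file2Xml);

        // Index file 1 records by the text of header/identifier.
        Map<String, Element> file1ById = new HashMap<>();
        NodeList records1 = doc1.getElementsByTagName("record");
        for (int i = 0; i < records1.getLength(); i++) {
            Element rec = (Element) records1.item(i);
            Element id = child(child(rec, "header"), "identifier");
            if (id != null) file1ById.put(id.getTextContent().trim(), rec);
        }

        NodeList records2 = doc2.getElementsByTagName("record");
        for (int i = 0; i < records2.getLength(); i++) {
            Element rec = (Element) records2.item(i);
            // Snapshot the guid values first, because the NodeList is live.
            List<String> guids = new ArrayList<>();
            NodeList guidNodes = rec.getElementsByTagName("guid");
            for (int g = 0; g < guidNodes.getLength(); g++) {
                guids.add(guidNodes.item(g).getTextContent().trim());
            }
            for (String guid : guids) {
                Element meta = child(file1ById.get(guid), "metadata");
                // importNode copies the subtree into doc2 so it can be attached there.
                if (meta != null) rec.appendChild(doc2.importNode(meta, true));
            }
        }

        // Serialize the modified file 2 document back to text.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc2), new StreamResult(out));
        return out.toString();
    }
}
```

Indexing file 1 in a map first means each file 2 record costs one lookup, so the same code handles one record or many.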
Nov 26 '08 #3
Thanks a lot for your reply; I was so worried about it as I have a deadline.
Actually, I want to join the record with the matching ID from file 2 into file 1, after the file 1 record for that ID ends. The output might look like:
  1. <ListRecords>
  2. <record>
  3. <header>
  4. <identifier>some value here</identifier> 
  5. <datestamp>2008-07-14T09:23:25Z</datestamp> 
  6. </header>
  7. <metadata>
  8. <group xsi:schemaLocation="http://dpc.uba.uva.nl/schema/lom/triplel http://dpc.uba.uva.nl/schema/lom/triplel/lom.xsd">
  9. <title>User manipulating this</title> 
  10. <feed>
  11. <title>My feed</title> 
  12. <url>http://no.url.available</url> 
  13. <item>
  14. <guid>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</guid> 
  15. <events>
  16. <event>
  17. <dateTime>2008-03-26T13:27:49.00</dateTime> 
  18. <action>
  19. <actionType>doSomeAtcion</actionType>  
  20. </lom> 
  21. </action>
  22. </event>
  23. </events>
  24. </item>
  25. </feed>
  26. </group>
  27. <header> 
  28. <identifier>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</identifier> 
  29. <datestamp>2007-05-29T15:55:00Z</datestamp> 
  30. <datestampasdatetime>2007-05-29T17:55:00+02:00</datestampasdatetime> 
  31. </header> 
  32. <metadata> 
  33. <lom xsi:schemaLocation="http://dpc.uba.uva.nl/schema/lom/triplel http://dpc.uba.uva.nl/schema/lom/triplel/lom.xsd"> 
  34. <general > 
  35. <identifier>
  36. <catalog>oai</catalog>
  37. <entry>oai:triple-l:2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</entry>
  38. </identifier>
  39. <title> 
  40. <langstring> 
  41. <value>Graduation mw. S. de Caralt</value> 
  42. <language>en</language> 
  43. </langstring> 
  44. </title> 
  45. <catalogentry> 
  46. <catalog>nl.wur.wurtv</catalog> 
  47. <entry> 
  48. <langstring> 
  49. <value>2c7ba037-52a6-4323-97dd-b6ea1cdbfd18</value> 
  50. <language>x-none</language> 
  51. </langstring> 
  52. </entry> 
  53. </catalogentry> 
  54. <grouplanguage>en</grouplanguage> 
  55. <description> 
  56. <langstring> 
  57. <value>Sponge Culture: Learning from Biology and Ecology</value> 
  58. <language>en</language> 
  59. </langstring> 
  60. </description> 
  61. </general> 
  62. <lifecycle xmlns="" /> 
  63. <metametadata > 
  64. <metadatascheme>LORENET</metadatascheme> 
  65. </metametadata> 
  66. </lom> 
  67. </metadata> 
  68. </metadata>
  69. </record>
  70.  
I have to do this for about 10 records with matching IDs in both files.
Nov 26 '08 #4
jkmyoung
Considering this, I would probably use xslt.

Driving xslt in Java (sample):

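  import java.io.File;
  import javax.xml.parsers.DocumentBuilder;
  import javax.xml.parsers.DocumentBuilderFactory;
  import javax.xml.transform.Transformer;
  import javax.xml.transform.TransformerFactory;
  import javax.xml.transform.dom.DOMSource;
  import javax.xml.transform.stream.StreamResult;
  import javax.xml.transform.stream.StreamSource;
  import org.w3c.dom.Document;

  // set file names
  File file1 = new File("Filename1.xml");
  String filename2 = "Filename2.xml";
  File xslt = new File("FileXSLT.xslt");
  File dest = new File("resultFile.xml");

  // build transformer
  TransformerFactory xformFactory = TransformerFactory.newInstance();
  Transformer transformer = xformFactory.newTransformer(new StreamSource(xslt));

  // set file2 filename parameter (read by document($file2) in the stylesheet)
  transformer.setParameter("file2", filename2);

  // Modularization :( looks clumsy, but actually makes it perform better.
  DocumentBuilderFactory docBuildFactory = DocumentBuilderFactory.newInstance();
  docBuildFactory.setNamespaceAware(true); // XSLT processors expect a namespace-aware DOM
  DocumentBuilder parser = docBuildFactory.newDocumentBuilder();
  Document document = parser.parse(file1);

  // transform the parsed file 1 document and write the result
  transformer.transform(new DOMSource(document), new StreamResult(dest));
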
For more info, google "java xslt transformation"

XSLT: Starting with a copy template, add templates for the proper fields to merge them. I'm having trouble seeing which fields need to be merged, so I hope you can figure it out from the example.
  <xsl:param name="file2" select="''"/><!-- defaults to empty string -->
  <xsl:variable name="doc2" select="document($file2)"/><!-- convert to nodes -->

  <xsl:template match="*"><!-- copy template -->
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="record">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
      <!-- add in other stuff here -->
      <xsl:copy-of select="$doc2/ListRecords/record/metadata/group/feed/item[guid = current()/header/identifier]"/>
    </xsl:copy>
  </xsl:template>

Key line in all of this is:
<xsl:copy-of select="$doc2/ListRecords/record/metadata/group/feed/item[guid = current()/header/identifier]"/>
Copy the item nodes which match the current node's id.

Customize this to merge as you need.
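
If it helps to test the matching predicate on its own before wiring it into a stylesheet, the same lookup can be tried from Java with javax.xml.xpath. This is only a sketch: the class and method names (GuidLookup, itemsFor, parse) are made up, the identifier is bound through an XPath variable $id instead of current(), and the path includes the <feed> level that File 2 shows between <group> and <item>.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
public class GuidLookup {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the <item> elements of file 2 whose <guid> equals the given identifier.
    static NodeList itemsFor(Document doc2, String identifier) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind $id to the identifier so no string concatenation is needed in the expression.
        xpath.setXPathVariableResolver(
                name -> "id".equals(name.getLocalPart()) ? identifier : null);
        return (NodeList) xpath.evaluate(
                "/ListRecords/record/metadata/group/feed/item[guid = $id]",
                doc2, XPathConstants.NODESET);
    }
}
```

An empty result for an ID that should match usually means one step of the path is wrong, which is easier to spot here than inside a full transform.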
Nov 26 '08 #5
Thank you very much for the reply; at least I got the idea. The problem is that I am totally new to XSLT and have no time to start with tutorials because of the deadline. I am still trying and hope to solve it, but if I run into problems I will post them.
Nov 30 '08 #6
Hi,
Thanks a lot for your help. I tried (and am still trying) but couldn't manage to write the XSLT file correctly, and there is no time to start an XSLT tutorial from the beginning because the deadline is tomorrow :-( Please help me, so that once this first task is done I will be able to read more about it.

The top-level elements are <ListRecords> and then <record> in both files. Each <record> is one unique record, identified by the <identifier> value in file 1 (line 4) and the <guid> value in file 2 (line 14). The records with matching IDs hold different data elements (different fields) in the two files. I want to merge the record with the mentioned ID from file 2 into file 1.

There is also this <metadata> element in both files, file1 (line 8 to 42) and in file2 (line 7 to 26) so i want to simply copy this <metadata> element and elements in between (sub-elements) till line 43 from file 2 into file 1 after file1's <metadata> element ends at line 26 and after that last element would be then simply <record>
There are 10 unique records in both files, and the final file should contain all of them merged in the same way, so I hope that once one is merged correctly the others will follow the same template match.
Please help me, as I am really worried; the first task in a new language is always such a headache.

Best Regards
Nov 30 '08 #7
"There is also this <metadata> element in both files, file1 (line 8 to 42) and in file2 (line 7 to 26) so i want to simply copy this <metadata> element and elements in between (sub-elements) till line 43 from file 2 into file 1 after file1's <metadata> element ends at line 26 and after that last element would be then simply <record>"

Sorry, a small mistake in the paragraph above: I want to merge the record from file 1 into file 2, not the other way around.
Nov 30 '08 #8
jkmyoung
Could you show us what you have so far? If you can get the first few fields copying correctly, then it'll be easier to figure out mistakes you're making with the rest.
Dec 1 '08 #9
Thanks for the reply.
Actually, I only changed the XPath you provided. I had made a mistake when saying which file to copy from: I have to copy data from file 1 into file 2, under the record element, based on that unique ID. So I only changed the XPath in the sample you provided (I am not sure I did it right, as I am confused), and it is now:

  <?xml version="1.0"?>
  <xsl:stylesheet version = '1.0'
      xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
    <xsl:output method="xml" indent="yes"/>
    <xsl:param name="file1" select="''"/><!-- defaults to empty string -->
    <xsl:variable name="doc1" select="document($file1)"/><!-- convert to nodes -->
    <xsl:template match="*"><!-- copy template -->
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates/>
      </xsl:copy>
    </xsl:template>
    <xsl:template match="record">
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates/>
        <!-- copy data from file 1 into file 2 based on guid in file 2 -->
        <xsl:copy-of select="$doc1/ListRecords/record/header[identifier = current()/item//feed/guid]"/> <!-- dont know whether where will it copy that data and under which element of file 2 -->
      </xsl:copy>
    </xsl:template>
  </xsl:stylesheet>

So I don't know how to copy all the metadata from file 1 into file 2 exactly after file 2's metadata element ends. I know I didn't do much...
I hope you can help me solve it.
Dec 1 '08 #10
jkmyoung
The easiest way I can think of (not the best programmatically) is to have a template for the last metadata element. Use XPath like: "metadata[not(following::metadata)]"
  <xsl:template match="metadata[not(following::metadata)]">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
    <!-- add rest from other file -->
    <xsl:copy-of select="$doc1//metadata"/>
  </xsl:template>

Dec 1 '08 #11
Thanks a lot for the help.
Yes, it works, but only when I have one record in each file; merging more records would require a more concrete approach...
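
One possible way to handle several records, sketched here under assumptions rather than as a tested answer from the thread: match on each <record> of file 2 and copy in only the <metadata> of the file 1 record whose header/identifier equals a <guid> inside the current record. The class name PerRecordMerge, the embedded stylesheet, and passing file 1 as a file: URI in the "file1" parameter are all illustrative choices.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.file.Path;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, not part of any library.
public class PerRecordMerge {

    // File 2 is the main input; file 1 is loaded through document($file1).
    static final String XSLT = String.join("\n",
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>",
        "  <xsl:output method='xml' omit-xml-declaration='yes'/>",
        "  <xsl:param name='file1'/>",
        "  <xsl:variable name='doc1' select='document($file1)'/>",
        "  <!-- identity template: copy everything unchanged by default -->",
        "  <xsl:template match='@*|node()'>",
        "    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>",
        "  </xsl:template>",
        "  <!-- per record: copy it, then append the matching file 1 metadata -->",
        "  <xsl:template match='record'>",
        "    <xsl:copy>",
        "      <xsl:apply-templates select='@*|node()'/>",
        "      <xsl:copy-of select='$doc1/ListRecords/record"
            + "[header/identifier = current()//guid]/metadata'/>",
        "    </xsl:copy>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    static String merge(Path file1, Path file2) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        // An absolute file: URI, so document() does not depend on a base URI.
        transformer.setParameter("file1", file1.toUri().toString());
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(file2.toFile()), new StreamResult(out));
        return out.toString();
    }
}
```

Because the predicate is evaluated once per <record>, each file 2 record receives only its own counterpart from file 1, however many records the files contain.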
Dec 2 '08 #12
