Joining XML files?

I'm very new to XML and maybe just a touch impatient because I'm going to
ask a moderately advanced question even though I'm just learning the basics.

I've spent many years working with databases, both hierarchical and
relational. So far, XML is obviously hierarchical in nature. I'm wondering
if there is anything analogous to a relational "join" in XML?

For example, let's say I have an XML file that has a root element of
departments. Each record of the file has information about a single
department in a company and consists of a department number, department
name, manager name, and location. Let's say I have another XML file that
lists employees. Each record is an employee and gives information about the
employee's name, date of birth, department number, home address, etc.

Given that these are two separate XML files but that there is some common
information, specifically the department number, could I use XSLT to
generate a report that shows me each department name followed by the names
of the people who work in the department? Something like this:

Marketing
Department Number: 001
Location: New York
Manager: T. Jones

Other Staff
E. Humperdinck
E. Presley
J. Hendrix

Information Systems
Department Number: 666
Location: Toronto
Manager: M. Slate

Other Staff
F. Flintstone
B. Rubble
J. Rockhead

In other words, we're getting the department name and manager name from the
departments file and the "other staff" names from the employees file. We
know which employees go in which departments because the department number
is in both the departments file and in the employees file.

Is it conceptually possible to do this kind of joining in XSLT? If so, what
is this called? In other words, what are the main terms I need to know here?
I'd call this a join in relational database terminology but I imagine XSLT
has different terminology.

If this IS possible, can someone point me to a tutorial or reference that
explains how to write XSLT to do this?

--
Rhino
Jul 1 '08 #1
rhino wrote:
Given that these are two separate XML files but that there is some common
information, specifically the department number, could I use XSLT to
generate a report that shows me each department name followed by the names
of the people who work in the department?
XSLT can certainly reference more than one input source, using the
document() function; then it's just a matter of writing expressions that
use data from one document to look up information in the other document.

There are probably examples on the XSLT FAQ website... but seriously,
once you know about the document() function it really isn't any harder
than if you were recombining data read from a single document.
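As a rough illustration of the document() approach, here is a minimal XSLT 1.0 sketch, not a definitive implementation. It assumes the element names used in the departments.xml and employees.xml samples posted later in this thread, takes departments.xml as the main input, and reads the employee file through a parameter whose name (employees-uri) is invented for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed parameter name. A string URI given to document() is resolved relative to the stylesheet. -->
  <xsl:param name="employees-uri" select="'employees.xml'"/>

  <!-- All employee records from the second document. -->
  <xsl:variable name="employees" select="document($employees-uri)/employees/employee"/>

  <!-- The main input document is assumed to be departments.xml. -->
  <xsl:template match="/departments">
    <xsl:for-each select="department">
      <xsl:variable name="dept" select="string(department_number)"/>
      <xsl:value-of select="concat(name, '&#10;')"/>
      <xsl:value-of select="concat('Department Number: ', $dept, '&#10;')"/>
      <xsl:value-of select="concat('Location: ', location, '&#10;')"/>
      <xsl:value-of select="concat('Manager: ', manager, '&#10;&#10;')"/>
      <xsl:text>Other Staff&#10;</xsl:text>
      <!-- The "join": employees whose department_number equals this department's. -->
      <xsl:for-each select="$employees[department_number = $dept]">
        <xsl:value-of select="concat(name, '&#10;')"/>
      </xsl:for-each>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

If the stylesheet were saved as join.xsl (a hypothetical name), a processor such as xsltproc could run it with `xsltproc join.xsl departments.xml`, and `--stringparam employees-uri <uri>` would override the default employee file.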

The only tricky part, really, is deciding how you're going to tell the
stylesheet which two sources to look at. Common solutions are passing
one of the URIs in as a parameter, or having one hardcoded into the
stylesheet, or having a front-end document from which the stylesheet obtains
both of the actual URIs. Which of those solutions is best depends
on the environment you're performing this operation in. Note that all of
'em are extensible to more than 2 documents.
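The front-end document option might look something like the sketch below. The sources.xml layout (a sources element with departments and employees children carrying an href attribute) is invented purely for illustration; the department and employee element names are assumed to match the sample files posted later in this thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical front-end document (say sources.xml) that this stylesheet is run against:
       <sources>
         <departments href="departments.xml"/>
         <employees href="employees.xml"/>
       </sources>
     The element and attribute names above are invented for this sketch. -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/sources">
    <!-- Passing a node to document() resolves the URI relative to that node's own document. -->
    <xsl:variable name="departments" select="document(departments/@href)/departments/department"/>
    <xsl:variable name="employees" select="document(employees/@href)/employees/employee"/>
    <xsl:for-each select="$departments">
      <xsl:variable name="dept" select="string(department_number)"/>
      <xsl:value-of select="concat(name, ':&#10;')"/>
      <!-- List the employees whose department_number matches this department. -->
      <xsl:for-each select="$employees[department_number = $dept]">
        <xsl:value-of select="concat('  ', name, '&#10;')"/>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Adding a third source under this scheme would only mean one more child element in the front-end document and one more variable in the stylesheet.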

As to what to call it: Conceptually it's certainly a join or merge. The
former term is more likely to be recognized by DB and data-structure
folks, the latter is more familiar to folks coming to XML and XSLT from
the document-markup side of the world. I wouldn't get hung up on the
terminology; the clearest solution is probably to do exactly what you
did, provide a brief example of what you're trying to accomplish.
Jul 1 '08 #2

"Joseph J. Kesselman" <ke************@comcast.net> wrote in message
news:486a9058$1@kcnews01...
rhino wrote:
Given that these are two separate XML files but that there is some common
information, specifically the department number, could I use XSLT to
generate a report that shows me each department name followed by the
names of the people who work in the department?

XSLT can certainly reference more than one input source, using the
document() function; then it's just a matter of writing expressions that
use data from one document to look up information in the other document.

There are probably examples on the XSLT FAQ website... but seriously, once
you know about the document() function it really isn't any harder than if
you were recombining data read from a single document.

The only tricky part, really, is deciding how you're going to tell the
stylesheet which two sources to look at. Common solutions are passing one
of the URIs in as a parameter, or having one hardcoded into the
stylesheet, or having a front-end document which the stylesheet obtains
both the actual URIs from.... Which of those solutions is best depends on
the environment you're performing this operation in. Note that all of 'em
are extensible to more than 2 documents.

As to what to call it: Conceptually it's certainly a join or merge. The
former term is more likely to be recognized by DB and data-structure
folks, the latter is more familiar to folks coming to XML and XSLT from
the document-markup side of the world. I wouldn't get hung up on the
terminology; the clearest solution is probably to do exactly what you did,
provide a brief example of what you're trying to accomplish.
Thank you once again, Joseph! This definitely gets me going in the right
direction.

I was counting on something like this being possible for the project I am
designing. The fact that it is possible, and apparently pretty routine, is
VERY helpful in planning what I need to do next. (After I work out a couple
of examples of joins/merges, that is!)

I really appreciate your assistance with my questions today!

--
Rhino
Jul 1 '08 #3
rhino wrote:
Given that these are two separate XML files but that there is some common
information, specifically the department number, could I use XSLT to
generate a report that shows me each department name followed by the names
of the people who work in the department? Something like this:

Marketing
Department Number: 001
Location: New York
Manager: T. Jones

Other Staff
E. Humperdinck
E. Presley
J. Hendrix

Information Systems
Department Number: 666
Location: Toronto
Manager: M. Slate

Other Staff
F. Flintstone
B. Rubble
J. Rockhead
Just in case you (or someone else) might be interested in a non-XSLT solution, here is a small xgawk script that does the job.

$ cat departments.xml
<?xml version="1.0" encoding="UTF-8"?>
<departments>
<department>
<department_number>001</department_number>
<name>Marketing</name>
<location>New York</location>
<manager>T. Jones</manager>
</department>
<department>
<department_number>666</department_number>
<name>Information Systems</name>
<location>Toronto</location>
<manager>M. Slate</manager>
</department>
</departments>

$ cat employees.xml
<?xml version="1.0" encoding="UTF-8"?>
<employees>
<employee>
<name>E. Humperdinck</name>
<department_number>001</department_number>
</employee>
<employee>
<name>E. Presley</name>
<department_number>001</department_number>
</employee>
<employee>
<name>J. Hendrix</name>
<department_number>001</department_number>
</employee>
<employee>
<name>F. Flintstone</name>
<department_number>666</department_number>
</employee>
<employee>
<name>B. Rubble</name>
<department_number>666</department_number>
</employee>
<employee>
<name>J. Rockhead</name>
<department_number>666</department_number>
</employee>
</employees>

$ cat join.awk
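@load xml

# Remember the character data of the element currently being read.
XMLSTARTELEM {data = "" ; next}
XMLCHARDATA {data = $0 ; next}

# A field element (depth 3) has just closed: store its text under its name.
XMLDEPTH == 3 && XMLENDELEM {
a[XMLENDELEM] = data
dept = a["department_number"]
}

# First file (employees.xml): collect the names per department number.
NR == FNR && XMLENDELEM == "employee" {
o[dept] = o[dept] sep[dept] " " a["name"]
sep[dept] = "\n"
next
}

# Second file (departments.xml): print each department with its collected staff.
XMLENDELEM == "department" {
print a["name"]
print " Department Number: " dept
print " Location: " a["location"]
print " Manager: " a["manager"]

print ORS " Other Staff" ORS o[dept] ORS
}

# Report any XML parsing error.
END {if (XMLERROR) print XMLERROR}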

$ xgawk -f join.awk employees.xml departments.xml
Marketing
Department Number: 001
Location: New York
Manager: T. Jones

Other Staff
E. Humperdinck
E. Presley
J. Hendrix

Information Systems
Department Number: 666
Location: Toronto
Manager: M. Slate

Other Staff
F. Flintstone
B. Rubble
J. Rockhead
Jul 6 '08 #4
rhino wrote:
I'm very new to XML and maybe just a touch impatient because I'm going to
ask a moderately advanced question even though I'm just learning the basics.

I've spent many years working with databases, both hierarchical and
relational. So far, XML is obviously hierarchical in nature. I'm wondering
if there is anything analogous to a relational "join" in XML?
Be aware of the warning in http://xml.silmaril.ie/authors/databases/
XML is a language specification, not a database application. While there
are some similarities, there are a *lot* of differences.

///Peter
Jul 13 '08 #5
