Joining XML files?

I'm very new to XML and maybe just a touch impatient because I'm going to
ask a moderately advanced question even though I'm just learning the basics.

I've spent many years working with databases, both hierarchical and
relational. So far, XML is obviously hierarchical in nature. I'm wondering
if there is anything analogous to a relational "join" in XML?

For example, let's say I have an XML file that has a root element of
departments. Each record of the file has information about a single
department in a company and consists of a department number, department
name, manager name, and location. Let's say I have another XML file that
lists employees. Each record is an employee and gives information about the
employee's name, date of birth, department number, home address, etc.

Given that these are two separate XML files but that there is some common
information, specifically the department number, could I use XSLT to
generate a report that shows me each department name followed by the names
of the people who work in the department? Something like this:

Marketing
Department Number: 001
Location: New York
Manager: T. Jones

Other Staff
E. Humperdinck
E. Presley
J. Hendrix

Information Systems
Department Number: 666
Location: Toronto
Manager: M. Slate

Other Staff
F. Flintstone
B. Rubble
J. Rockhead

In other words, we're getting the department name and manager name from the
departments file and the "other staff" names from the employees file. We
know which employees go in which departments because the department number
is in both the departments file and in the employees file.

Is it conceptually possible to do this kind of joining in XSLT? If so, what
is this called? In other words, what are the main terms I need to know here?
I'd call this a join in relational database terminology but I imagine XSLT
has different terminology.

If this IS possible, can someone point me to a tutorial or reference that
explains how to write XSLT to do this?

--
Rhino
Jul 1 '08 #1
rhino wrote:
> Given that these are two separate XML files but that there is some common
> information, specifically the department number, could I use XSLT to
> generate a report that shows me each department name followed by the names
> of the people who work in the department?
XSLT can certainly reference more than one input source, using the
document() function; then it's just a matter of writing expressions that
use data from one document to look up information in the other document.
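
For readers who want to see the lookup idea before writing any XSLT, here is a minimal sketch in Python using only the standard library. It is not the XSLT approach itself: it indexes one document by department number and consults that index while walking the other, which is the same shape of logic a document()-based stylesheet would follow. The element names (department_number, name, location, manager), the join_report helper and the sample data are assumptions modelled on the example in the question.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data, shaped like the two files described in the question.
DEPARTMENTS_XML = """<departments>
  <department>
    <department_number>001</department_number>
    <name>Marketing</name>
    <location>New York</location>
    <manager>T. Jones</manager>
  </department>
  <department>
    <department_number>666</department_number>
    <name>Information Systems</name>
    <location>Toronto</location>
    <manager>M. Slate</manager>
  </department>
</departments>"""

EMPLOYEES_XML = """<employees>
  <employee><name>E. Presley</name><department_number>001</department_number></employee>
  <employee><name>F. Flintstone</name><department_number>666</department_number></employee>
  <employee><name>B. Rubble</name><department_number>666</department_number></employee>
</employees>"""


def join_report(departments_xml: str, employees_xml: str) -> str:
    """Return a text report listing each department followed by its staff."""
    departments = ET.fromstring(departments_xml)
    employees = ET.fromstring(employees_xml)

    # Index the employees document by the shared key (department number).
    staff_by_dept = defaultdict(list)
    for emp in employees.findall("employee"):
        staff_by_dept[emp.findtext("department_number")].append(emp.findtext("name"))

    # Walk the departments document and look each key up in the index.
    lines = []
    for dept in departments.findall("department"):
        number = dept.findtext("department_number")
        lines.append(dept.findtext("name"))
        lines.append(f"  Department Number: {number}")
        lines.append(f"  Location: {dept.findtext('location')}")
        lines.append(f"  Manager: {dept.findtext('manager')}")
        lines.append("  Other Staff")
        lines.extend(f"    {name}" for name in staff_by_dept.get(number, []))
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    print(join_report(DEPARTMENTS_XML, EMPLOYEES_XML))
```

Building the index once keeps the work roughly proportional to the size of the two inputs, instead of rescanning the employee list for every department.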

There are probably examples on the XSLT FAQ website... but seriously,
once you know about the document() function it really isn't any harder
than if you were recombining data read from a single document.

The only tricky part, really, is deciding how you're going to tell the
stylesheet which two sources to look at. Common solutions are passing
one of the URIs in as a parameter, or having one hardcoded into the
stylesheet, or having a front-end document which the stylesheet obtains
both the actual URIs from.... Which of those solutions is best depends
on the environment you're performing this operation in. Note that all of
'em are extensible to more than 2 documents.
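
The "front-end document" option can be sketched as follows, again in Python rather than XSLT. The manifest layout (a sources root whose source elements carry role and href attributes), the file names and the load_sources helper are all hypothetical, chosen only for illustration.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def load_sources(manifest_path: Path) -> dict:
    """Parse every document listed in the manifest, keyed by its role."""
    manifest = ET.parse(str(manifest_path)).getroot()
    roots = {}
    for source in manifest.findall("source"):
        # Resolve each href relative to the manifest's own directory.
        target = manifest_path.parent / source.get("href")
        roots[source.get("role")] = ET.parse(str(target)).getroot()
    return roots


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        # Stand-in input files; real ones would hold the actual records.
        (base / "departments.xml").write_text("<departments/>", encoding="utf-8")
        (base / "employees.xml").write_text("<employees/>", encoding="utf-8")
        (base / "sources.xml").write_text(
            "<sources>"
            '<source role="departments" href="departments.xml"/>'
            '<source role="employees" href="employees.xml"/>'
            "</sources>",
            encoding="utf-8",
        )
        roots = load_sources(base / "sources.xml")
        print(sorted(roots))
```

Adding a third input is then just one more source entry in the manifest, which is the sense in which these approaches extend to more than two documents.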

As to what to call it: Conceptually it's certainly a join or merge. The
former term is more likely to be recognized by DB and data-structure
folks, the latter is more familiar to folks coming to XML and XSLT from
the document-markup side of the world. I wouldn't get hung up on the
terminology; the clearest solution is probably to do exactly what you
did, provide a brief example of what you're trying to accomplish.
Jul 1 '08 #2

"Joseph J. Kesselman" <ke************@comcast.net> wrote in message
news:486a9058$1@kcnews01...
Thank you once again, Joseph! This definitely gets me going in the right
direction.

I was counting on something like this being possible for the project I am
designing. The fact that it is possible, and apparently pretty routine, is
VERY helpful in planning what I need to do next. (After I work out a couple
of examples of joins/merges, that is!)

I really appreciate your assistance with my questions today!

--
Rhino
Jul 1 '08 #3
rhino wrote:
>
> Given that these are two separate XML files but that there is some common
> information, specifically the department number, could I use XSLT to
> generate a report that shows me each department name followed by the names
> of the people who work in the department? Something like this:
>
> Marketing
> Department Number: 001
> Location: New York
> Manager: T. Jones
>
> Other Staff
> E. Humperdinck
> E. Presley
> J. Hendrix
>
> Information Systems
> Department Number: 666
> Location: Toronto
> Manager: M. Slate
>
> Other Staff
> F. Flintstone
> B. Rubble
> J. Rockhead

Just in case you (or someone else) might be interested in a non-XSLT solution, here is a small xgawk script that does the job.

$ cat departments.xml
<?xml version="1.0" encoding="UTF-8"?>
<departments>
  <department>
    <department_number>001</department_number>
    <name>Marketing</name>
    <location>New York</location>
    <manager>T. Jones</manager>
  </department>
  <department>
    <department_number>666</department_number>
    <name>Information Systems</name>
    <location>Toronto</location>
    <manager>M. Slate</manager>
  </department>
</departments>

$ cat employees.xml
<?xml version="1.0" encoding="UTF-8"?>
<employees>
  <employee>
    <name>E. Humperdinck</name>
    <department_number>001</department_number>
  </employee>
  <employee>
    <name>E. Presley</name>
    <department_number>001</department_number>
  </employee>
  <employee>
    <name>J. Hendrix</name>
    <department_number>001</department_number>
  </employee>
  <employee>
    <name>F. Flintstone</name>
    <department_number>666</department_number>
  </employee>
  <employee>
    <name>B. Rubble</name>
    <department_number>666</department_number>
  </employee>
  <employee>
    <name>J. Rockhead</name>
    <department_number>666</department_number>
  </employee>
</employees>

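$ cat join.awk
@load xml

# Reset the text buffer at each start tag, then capture character data.
XMLSTARTELEM {data = "" ; next}
XMLCHARDATA {data = $0 ; next}

# A depth-3 end tag closes one field of a record: store its text by tag name.
XMLDEPTH == 3 && XMLENDELEM {
  a[XMLENDELEM] = data
  dept = a["department_number"]
}

# First file (employees.xml): collect staff names per department number.
NR == FNR && XMLENDELEM == "employee" {
  o[dept] = o[dept] sep[dept] " " a["name"]
  sep[dept] = "\n"
  next
}

# Second file (departments.xml): print each department with its staff list.
XMLENDELEM == "department" {
  print a["name"]
  print " Department Number: " dept
  print " Location: " a["location"]
  print " Manager: " a["manager"]

  print ORS " Other Staff" ORS o[dept] ORS
}

END {if (XMLERROR) print XMLERROR}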

$ xgawk -f join.awk employees.xml departments.xml
Marketing
Department Number: 001
Location: New York
Manager: T. Jones

Other Staff
E. Humperdinck
E. Presley
J. Hendrix

Information Systems
Department Number: 666
Location: Toronto
Manager: M. Slate

Other Staff
F. Flintstone
B. Rubble
J. Rockhead
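
The script reads employees.xml first, remembers the names per department, and prints each department when its end tag arrives in the second file. A rough Python analogue of that two-pass, event-driven idea, using the standard library's iterparse, might look like this. The helper names stream_records and two_pass_join are made up for this sketch, and the element names are assumed to match the sample files above.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict


def stream_records(source, record_tag):
    """Yield one dict of child tag -> text for each completed record element."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            yield {child.tag: (child.text or "").strip() for child in elem}
            elem.clear()  # drop the finished record's children to save memory


def two_pass_join(employees_src, departments_src):
    """Yield (department name, staff names) pairs, employees file first."""
    # Pass 1: remember every employee name under its department number.
    staff = defaultdict(list)
    for emp in stream_records(employees_src, "employee"):
        staff[emp["department_number"]].append(emp["name"])

    # Pass 2: emit each department together with the names collected for it.
    for dept in stream_records(departments_src, "department"):
        yield dept["name"], staff.get(dept["department_number"], [])


if __name__ == "__main__":
    # Tiny in-memory stand-ins for employees.xml and departments.xml.
    employees = io.BytesIO(
        b"<employees>"
        b"<employee><name>B. Rubble</name>"
        b"<department_number>666</department_number></employee>"
        b"</employees>"
    )
    departments = io.BytesIO(
        b"<departments>"
        b"<department><department_number>666</department_number>"
        b"<name>Information Systems</name></department>"
        b"</departments>"
    )
    for name, people in two_pass_join(employees, departments):
        print(name, people)
```

As in the awk version, the order of the inputs matters: the employees have to be read before the departments so that the staff lists are complete when each department is printed.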
Jul 6 '08 #4
rhino wrote:
> I'm very new to XML and maybe just a touch impatient because I'm going to
> ask a moderately advanced question even though I'm just learning the basics.
>
> I've spent many years working with databases, both hierarchical and
> relational. So far, XML is obviously hierarchical in nature. I'm wondering
> if there is anything analogous to a relational "join" in XML?

Be aware of the warning in http://xml.silmaril.ie/authors/databases/
XML is a language specification, not a database application. While there
are some similarities, there are a *lot* of differences.

///Peter
Jul 13 '08 #5
