Federation of DB2 Instances Across LPARs
Question posted by: spencer@tabbert.net (Guest) on November 12th, 2005 10:13 AM
My client is going through a large project to replace the existing data
warehouse physical infrastructure, which is running DB2 8.1.6. In the
past we had a much more distributed environment where certain
portions of the data warehouse sat on a physically different server. We
used federation to join tables on separate
instances. We found that this worked OK where the tables
were not that large, say under 100,000 rows. We now need to
join tables on separate instances with more than 30 million
rows. While it works in the current environment, it takes a very long
time to ship the rows, and thus it has become our bottleneck.
In our new physical infrastructure our database instances will all be
on the same physical hardware but will exist in different LPARs. Will
we see improved performance when using federation to join across the
database instances? My thinking is that we would see a vast
improvement, as the data doesn't need to be shipped across the network.
Can anyone confirm whether my thinking is correct?
Spencer
14 Answers Posted
Depends on how fast your old network interface was, and how much other
traffic was on the network. For example, if you had a private gigabit
Ethernet connection between the two machines, I would think it would be
pretty fast.
With LPAR, each logical machine has its own IP address, so I am not sure
what the actual path between the machines would be. For example, would all
communications between the LPARs just go out to the network and back again?
I really don't know the answer myself. You might need to ask someone in an
AIX forum (I am assuming that you use AIX).
Actually it is going to be on HP-UX hardware, and I believe we only had
a 100MB link between the two machines.
Spencer
I would try an HP-UX forum. The communication between LPARs is via TCP/IP.
I should have said that "I assume the communication between two different
databases on different servers that are LPAR'ed on the same physical box is
via TCP/IP." I don't know for a fact.
Hi,
Our experience with federation is that at around 3 million rows, on a 70 MSU
machine or equivalent (about 350 MIPS), it starts to degrade in spite of
Materialized Query Tables, separate disk I/O paths, cache, and every other
bit of fine tuning possible. I think, and I want to be corrected if I am
wrong, that this is a natural limitation of federation as of now: as data
grows bigger, federation gets slower. Even if you save time on shipping the
data, the queries with joins on big tables will still be your bottleneck.
What we do is use replication tools (in DB2 you have Data Propagator or
Q Replication) to copy the current data we need out of the big tables into
a smaller subset (same data structure, etc.) and do the federation on that
subset. There is a lot of architecting and schedule setup involved in
replicating the data, but that was the most efficient way we found to deal
with federation of big tables.
Hope this helps.
RdR
100MB wouldn't be optimal; Gigabit Ethernet would be. I don't know network
hardware well enough to know how much faster LPAR-to-LPAR communications
would be.
Larry Edelstein
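To put rough numbers on the 100MB versus Gigabit comparison, here is a
back-of-the-envelope sketch in Python. It assumes that "100MB" means a
100 Mbit/s Ethernet link, that each row is about 200 bytes on the wire (a
made-up figure; nobody in the thread gives a row size), and that the link is
otherwise idle with no protocol overhead. The function name
`transfer_seconds` is hypothetical.

```python
def transfer_seconds(rows: int, bytes_per_row: int, link_mbit_per_s: float) -> float:
    """Ideal time to ship `rows` rows over a link, ignoring protocol overhead."""
    total_bits = rows * bytes_per_row * 8
    return total_bits / (link_mbit_per_s * 1_000_000)


if __name__ == "__main__":
    ROWS = 30_000_000        # table size mentioned in the original question
    ASSUMED_ROW_BYTES = 200  # assumption: not stated anywhere in the thread
    for label, mbit in (("100 Mbit/s", 100), ("1 Gbit/s", 1000)):
        secs = transfer_seconds(ROWS, ASSUMED_ROW_BYTES, mbit)
        print(f"{label}: about {secs / 60:.1f} minutes")
```

Under these assumptions the ideal shipping time is about 8 minutes at
100 Mbit/s and under 1 minute at 1 Gbit/s. Real transfers will be slower,
and this says nothing about the time the join itself takes once the rows
arrive.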
So you're telling me that the two instances would still communicate via
TCP/IP? That does not seem right to me. Is there no gain from having
these on the same physical box, then?
Spencer
Here is a post from the comp.sys.hp.hpux newsgroup today and the answer. If the
answer below is correct, you might be better off with the old configuration
and then add a Gigabit Ethernet card on each machine and run a private
Ethernet connection between them (assuming they are reasonably near each
other).
> If one physical machine is LPAR'ed to run two HP-UX systems, and
> there is a TCP/IP connection between the two servers, does the IP
> path go out to the hub/router on the network and back, or is there a
> faster IP path within the machine between the two LPARs?
IIRC, down the stack, past the driver, out the NIC, to the switch,
nothing but net. TCP/IP communications between LPARs is just like
communication between two completely separate hosts.
rick jones
It has nothing to do with TCP/IP. Databases always use TCP/IP to
communicate. It is related to the hardware and network that the traffic
must pass through in order to get from one to the other.
Larry Edelstein
I can't speak for HP's hardware and how it does partitioning, but with
IBM pSeries hardware (prior to the new p5 servers), each LPAR has
dedicated physical hardware such as CPU, RAM, NIC, and HBA. Each LPAR acts
just like a standalone machine, so communication between LPARs
goes through the NICs, etc.
The p5 servers have I/O virtualization, meaning that multiple LPARs
can share a single CPU/NIC/etc.
The advantage of having multiple servers within one single piece of
hardware comes from the fact that you can dynamically re-allocate
resources (CPU/memory/etc) from one LPAR to another.
Spencer wrote:
> In our new physical infrastructure our database instances will all be
> on the same physical hardware but will exist in different LPAR's. Will
> we see improved performance when using federation to join across the
> database instances? My thoughts are that we would see a vast
> improvement as the data doesn't need to be shipped across the network.
> Can anyone offer any confirmation on if my thoughts are correct or not?
It might depend on the hardware. I was just reading the other day how
IBM p5-series hardware allows you to create Ethernet connections across
the backplane. I'd assume that p4-series 690s, Sun E15Ks, etc. would
allow the same.
On the other hand, I wonder how the optimizer handles those queries.
Are you using MQTs to keep a local copy of the data in the remote tables,
or actually running ad hoc queries across a federated nickname?
buck
It appears that the hardware does support some sort of VLAN between the
LPARs, which I believe is going to be gigabit, so this should help.
Currently we are not using MQTs to keep a local
copy of the data in the remote tables; we are just running queries
across the nicknames. Using replicated MQTs is probably something
we are going to have to consider more closely with the hardware
changes.
Spencer
Hi Spencer,
When I first encountered degradation when federating data, my first reaction
was to upgrade to the fastest and latest hardware, fine tune DB2, and use
Materialized Query Tables. My conclusion was that, for federation with
3 million records, none of the above helped enough. It was a good thing that
we were already scheduled for an upgrade; otherwise, I would have asked our
company to spend a lot on hardware for small improvements. The hardware
upgrades and the fine tuning helped, but not enough to justify the time we
have to wait for data to be available. I might be wrong, so I think the right
step will be to use monitoring tools to see where the bottleneck is. If the
bottleneck is getting data from federated sources, then based on our
experience a CPU that is 20 times more powerful will not bring down the wait
time even by half. If the bottleneck is writing the data, perhaps more DASD
with higher RPM and larger cache would help, but our experience is that if
you are writing about 20,000 inserts, updates, and deletes per minute, the
fastest DASD and the biggest cache will not help. As for sending data over
the network, if you will be sending transactions of about the same size, the
bigger the pipe the better, but if the I/O runs at its maximum, your pipe
will always be full. You mentioned tables of more than 30 million
records; calculate how much data you will be sending per minute (or will you
be batching them every hour or so?). It is just hard to send purchased
hardware back to the store.
Like I mentioned, we found that replicating data to a smaller table will do
the job. Data Propagator is free anyway with DB2 LUW (I think for
mainframes and AS/400s you need to buy it).
Just trying to play devil's advocate, not in any way trying to criticize. I
hope I am wrong and that having fast channels in your network does the job.
Thanks,
RdR
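The suggestion above to calculate how much data is sent per minute can be
sketched roughly as follows. This is only an illustration: the 200-byte row
size is an assumption, the function names are hypothetical, and the 20,000
changes per minute figure is borrowed from the write-rate example in the
post, not from Spencer's actual workload.

```python
def sustained_mbit_per_s(rows: int, bytes_per_row: int, window_seconds: float) -> float:
    """Average link rate needed to ship `rows` rows within `window_seconds`."""
    return rows * bytes_per_row * 8 / window_seconds / 1_000_000


def link_utilization(required_mbit_per_s: float, link_mbit_per_s: float) -> float:
    """Fraction of the link consumed; values above 1.0 mean the pipe is full."""
    return required_mbit_per_s / link_mbit_per_s


if __name__ == "__main__":
    ASSUMED_ROW_BYTES = 200  # assumption: row size is not given in the thread
    # Trickle of changes: 20,000 rows per minute.
    trickle = sustained_mbit_per_s(20_000, ASSUMED_ROW_BYTES, 60)
    # Bulk case: all 30 million rows shipped within a one-hour batch window.
    batch = sustained_mbit_per_s(30_000_000, ASSUMED_ROW_BYTES, 3600)
    for name, rate in (("trickle", trickle), ("hourly batch", batch)):
        print(f"{name}: {rate:.2f} Mbit/s, "
              f"{link_utilization(rate, 100):.1%} of a 100 Mbit/s link")
```

Plugging in measured row sizes and change rates instead of the assumed ones
would show whether the link or something else (join processing, disk writes)
is the likelier bottleneck before any hardware is purchased.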
Thanks for the info! If it were up to me I would not be partitioning
the environments but that decision is not mine to make. I will offer
the information up and see what is said.
Spencer