By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
438,225 Members | 1,051 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 438,225 IT Pros & Developers. It's quick & easy.

IBM 595 dual core CPU DB partitioning question

P: n/a
I am starting to config a 64 bits DB2 in IBM 595 AIX box with 2 dual
core CPU and I would like to assigned one 'processor' for one db
partition. Should I config it as a 4 nodes or 2 nodes instances? How
about other setting such as IO cleaner, Default degree etc?

Mar 30 '07 #1
Share this Question
Share on Google+
15 Replies


P: n/a
On 30 Mar, 15:32, "Woody Ling" <woodyl...@gmail.comwrote:
I am starting to config a 64 bits DB2 in IBM 595 AIX box with 2 dual
core CPU and I would like to assigned one 'processor' for one db
partition. Should I config it as a 4 nodes or 2 nodes instances? How
about other setting such as IO cleaner, Default degree etc?
There's no rules of thumb. It depends on expected volumetrics and the
nature of the database that you're working with.

There's a "configuration advisor" in Control Center, if you get
desperate


Mar 30 '07 #2

P: n/a
On Mar 30, 11:12 am, james_...@hotmail.com wrote:
On 30 Mar, 15:32, "Woody Ling" <woodyl...@gmail.comwrote:
I am starting to config a 64 bits DB2 in IBM 595 AIX box with 2 dual
core CPU and I would like to assigned one 'processor' for one db
partition. Should I config it as a 4 nodes or 2 nodes instances? How
about other setting such as IO cleaner, Default degree etc?

There's no rules of thumb. It depends on expected volumetrics and the
nature of the database that you're working with.

There's a "configuration advisor" in Control Center, if you get
desperate
Hi Woody,

Any particular reason why you want a partitioned instance here? Each
database partition will use up some fixed resources (CPU and memory) -
the more partitions you have on a machine, the less resources are
available for your actual workload. As far as machine sizes go, I
probably wouldn't recommend using multiple partitions for a 4-CPU
machine, unless you're planning to scale out and add more machines in
the future. A single partition can fully utilize all the CPUs on your
machine.

Cheers,
Liam.

Mar 30 '07 #3

P: n/a
"Liam Finnie" <lf*****@ca.ibm.comwrote in message
news:11**********************@p77g2000hsh.googlegr oups.com...
Hi Woody,

Any particular reason why you want a partitioned instance here? Each
database partition will use up some fixed resources (CPU and memory) -
the more partitions you have on a machine, the less resources are
available for your actual workload. As far as machine sizes go, I
probably wouldn't recommend using multiple partitions for a 4-CPU
machine, unless you're planning to scale out and add more machines in
the future. A single partition can fully utilize all the CPUs on your
machine.

Cheers,
Liam.
Partitioning a table enables query parallelism. If this is a data warehouse
application with a lot of table scans, then it will help performance. If it
is an OLTP application or access is usually returning a small answer set via
index access, then it is probably not a good idea.

The number of CPU's is not a determinant as to whether to partition, but you
generally should have at least one CPU per partition (dual core can count as
2 CPU's). If you have a data warehouse application using a lot of table
scans and the server has one dual core CPU, partitioning the table is
probably a good idea.
Mar 30 '07 #4

P: n/a
On Mar 31, 1:20 am, "Mark A" <nob...@nowhere.comwrote:
"Liam Finnie" <lfin...@ca.ibm.comwrote in message

news:11**********************@p77g2000hsh.googlegr oups.com...
Hi Woody,
Any particular reason why you want a partitioned instance here? Each
database partition will use up some fixed resources (CPU and memory) -
the more partitions you have on a machine, the less resources are
available for your actual workload. As far as machine sizes go, I
probably wouldn't recommend using multiple partitions for a 4-CPU
machine, unless you're planning to scale out and add more machines in
the future. A single partition can fully utilize all the CPUs on your
machine.
Cheers,
Liam.

Partitioning a table enables query parallelism. If this is a data warehouse
application with a lot of table scans, then it will help performance. If it
is an OLTP application or access is usually returning a small answer set via
index access, then it is probably not a good idea.

The number of CPU's is not a determinant as to whether to partition, but you
generally should have at least one CPU per partition (dual core can count as
2 CPU's). If you have a data warehouse application using a lot of table
scans and the server has one dual core CPU, partitioning the table is
probably a good idea.

The idea of 1 CPU for 1 partition can be shown by the following
example:

Suppose we have a complex query to select 10000 rows that use total
580s to return result (with 315s CPU time and 332s sort time). When we
turn on the intra-parallel option and use 2 CPU to process the query,
the total time becomes 533s (with 508s CPU times and 493s sort time
[sum of 2 cpu] ). Although the 2 CPU are fully utilize in intra-
parallel partition and the elapsed time is shorter, the efficiency is
not as good as 1 CPU for 1 partition.

Assume there is 30% overhead for using multiple partitions, the
elapsed time become ( 580/2 ) * 1.3 = 377sec because each partition
handles 5000 rows only.

Do you agree with this example?

Mar 30 '07 #5

P: n/a
"Woody Ling" <wo*******@gmail.comwrote in message
news:11**********************@e65g2000hsc.googlegr oups.com...
The idea of 1 CPU for 1 partition can be shown by the following
example:

Suppose we have a complex query to select 10000 rows that use total
580s to return result (with 315s CPU time and 332s sort time). When we
turn on the intra-parallel option and use 2 CPU to process the query,
the total time becomes 533s (with 508s CPU times and 493s sort time
[sum of 2 cpu] ). Although the 2 CPU are fully utilize in intra-
parallel partition and the elapsed time is shorter, the efficiency is
not as good as 1 CPU for 1 partition.

Assume there is 30% overhead for using multiple partitions, the
elapsed time become ( 580/2 ) * 1.3 = 377sec because each partition
handles 5000 rows only.

Do you agree with this example?
I don't understanding what you are saying.

Just because you turn on intra-partition parallelism (not needing DPF) does
not mean that the query is actually running in parallel mode. And
intra-partition parallelism is significantly different than inter-partition
parallelism (DPF) where a single table is physically partitioned (within a
single server or across servers). Using DPF to partition and with a query
that does a table scan, the total elapsed time will be reasonably close to
50% of a non-partitioned table (unless it is a very fast query).

With DPF, parallel overhead is not anywhere near 30%, although the overhead
can be noticed if all the partitions are on the same physical node and you
don't have enough bandwidth on your disk sub-system to run both partitions
at full speed on a table scan. Preferably, each partition should have its
own disk controller, and each partition must have its own disks to scale in
a linear (or near linear) fashion on a large table scan.
Mar 30 '07 #6

P: n/a
On Mar 31, 5:14 am, "Mark A" <nob...@nowhere.comwrote:
"Woody Ling" <woodyl...@gmail.comwrote in message

news:11**********************@e65g2000hsc.googlegr oups.com...


The idea of 1 CPU for 1 partition can be shown by the following
example:
Suppose we have a complex query to select 10000 rows that use total
580s to return result (with 315s CPU time and 332s sort time). When we
turn on the intra-parallel option and use 2 CPU to process the query,
the total time becomes 533s (with 508s CPU times and 493s sort time
[sum of 2 cpu] ). Although the 2 CPU are fully utilize in intra-
parallel partition and the elapsed time is shorter, the efficiency is
not as good as 1 CPU for 1 partition.
Assume there is 30% overhead for using multiple partitions, the
elapsed time become ( 580/2 ) * 1.3 = 377sec because each partition
handles 5000 rows only.
Do you agree with this example?

I don't understanding what you are saying.

Just because you turn on intra-partition parallelism (not needing DPF) does
not mean that the query is actually running in parallel mode. And
intra-partition parallelism is significantly different than inter-partition
parallelism (DPF) where a single table is physically partitioned (within a
single server or across servers). Using DPF to partition and with a query
that does a table scan, the total elapsed time will be reasonably close to
50% of a non-partitioned table (unless it is a very fast query).

With DPF, parallel overhead is not anywhere near 30%, although the overhead
can be noticed if all the partitions are on the same physical node and you
don't have enough bandwidth on your disk sub-system to run both partitions
at full speed on a table scan. Preferably, each partition should have its
own disk controller, and each partition must have its own disks to scale in
a linear (or near linear) fashion on a large table scan.- Hide quoted text -

- Show quoted text -

Yes, I agree with you so that the elasped time for handling the
complex query is:

a. 1 CPU without DPF and intra-parallel OFF is 580s (with 315s CPU
time and 332s sort time). {figure from IBM}
b. 2 CPU without DPF and intra-parallel ON is 533s (with 508s CPU
times and 493s sort time [sum of 2 cpu] ). {figure from IBM}
c. 2 CPU with DPF and intra-parallel OFF is 377s (each partition
handles 50% workload and assume 30% overhead for cross join, combining
result etc).

30% overhead for DPF is an assumption. I just want to show you that
even there is overhead for using DPF, it is still much better than
using intra-parallel with same number of CPU.

Think about if we have 4 CPU, which configuration is better?
a. 2 CPU for 1 partition and create total 2 db partitions with intra-
parallel ON
b. 1 CPU for 1 partition and create total 4 db partitions with intra-
paralle OFF


Mar 31 '07 #7

P: n/a
"Woody Ling" <wo*******@gmail.comwrote in message
news:11**********************@n59g2000hsh.googlegr oups.com...
Yes, I agree with you so that the elasped time for handling the
complex query is:

a. 1 CPU without DPF and intra-parallel OFF is 580s (with 315s CPU
time and 332s sort time). {figure from IBM}
b. 2 CPU without DPF and intra-parallel ON is 533s (with 508s CPU
times and 493s sort time [sum of 2 cpu] ). {figure from IBM}
c. 2 CPU with DPF and intra-parallel OFF is 377s (each partition
handles 50% workload and assume 30% overhead for cross join, combining
result etc).

30% overhead for DPF is an assumption. I just want to show you that
even there is overhead for using DPF, it is still much better than
using intra-parallel with same number of CPU.

Think about if we have 4 CPU, which configuration is better?
a. 2 CPU for 1 partition and create total 2 db partitions with intra-
parallel ON
b. 1 CPU for 1 partition and create total 4 db partitions with intra-
paralle OFF
You are still confused. Let me repeat:

Just because you turn on intra-partition parallelism (without DPF) does not
mean that the query is actually running in parallel mode.
Mar 31 '07 #8

P: n/a
On Mar 31, 9:28 pm, "Mark A" <nob...@nowhere.comwrote:
"Woody Ling" <woodyl...@gmail.comwrote in message

news:11**********************@n59g2000hsh.googlegr oups.com...


Yes, I agree with you so that the elasped time for handling the
complex query is:
a. 1 CPU without DPF and intra-parallel OFF is 580s (with 315s CPU
time and 332s sort time). {figure from IBM}
b. 2 CPU without DPF and intra-parallel ON is 533s (with 508s CPU
times and 493s sort time [sum of 2 cpu] ). {figure from IBM}
c. 2 CPU with DPF and intra-parallel OFF is 377s (each partition
handles 50% workload and assume 30% overhead for cross join, combining
result etc).
30% overhead for DPF is an assumption. I just want to show you that
even there is overhead for using DPF, it is still much better than
using intra-parallel with same number of CPU.
Think about if we have 4 CPU, which configuration is better?
a. 2 CPU for 1 partition and create total 2 db partitions with intra-
parallel ON
b. 1 CPU for 1 partition and create total 4 db partitions with intra-
paralle OFF

You are still confused. Let me repeat:

Just because you turn on intra-partition parallelism (without DPF) does not
mean that the query is actually running in parallel mode.- Hide quoted text -

- Show quoted text -

I totally agree with you again.
>From the example, it already showed that the overhead of using intra-
partition parallelism is very large. If intra-parallel makes the query
running in parallel mode, the elasped time should be much shorter (may
be 1/2 of original). In fact, it is not (533s vs 580s). So if I have 2
CPU, I will create 2 db partitions and assign 1 CPU for each partition
and turn off intra-parallel to fully utilize all CPU resource.

Please correct me if I have a wrong concept.


Mar 31 '07 #9

P: n/a
Ian
Woody Ling wrote:
>
Yes, I agree with you so that the elasped time for handling the
complex query is:

a. 1 CPU without DPF and intra-parallel OFF is 580s (with 315s CPU
time and 332s sort time). {figure from IBM}
b. 2 CPU without DPF and intra-parallel ON is 533s (with 508s CPU
times and 493s sort time [sum of 2 cpu] ). {figure from IBM}
c. 2 CPU with DPF and intra-parallel OFF is 377s (each partition
handles 50% workload and assume 30% overhead for cross join, combining
result etc).

30% overhead for DPF is an assumption. I just want to show you that
even there is overhead for using DPF, it is still much better than
using intra-parallel with same number of CPU.
Making an assumption of 30% overhead based on a single result is
more than a little misleading.
Think about if we have 4 CPU, which configuration is better?
a. 2 CPU for 1 partition and create total 2 db partitions with intra-
parallel ON
b. 1 CPU for 1 partition and create total 4 db partitions with intra-
paralle OFF
The decision of whether to use 1 CPU / partition or 2 CPUs/partition is
totally dependent on your workload. Certain workloads will benefit from
having more CPU resources available per partition. Many don't, and the
benefit of having more (smaller) partitions is better. Best practices
today start with 1 CPU / partition.
Apr 2 '07 #10

P: n/a
Ian wrote:
>Think about if we have 4 CPU, which configuration is better?
a. 2 CPU for 1 partition and create total 2 db partitions with intra-
parallel ON
b. 1 CPU for 1 partition and create total 4 db partitions with intra-
paralle OFF

The decision of whether to use 1 CPU / partition or 2 CPUs/partition is
totally dependent on your workload. Certain workloads will benefit from
having more CPU resources available per partition. Many don't, and the
benefit of having more (smaller) partitions is better. Best practices
today start with 1 CPU / partition.
Actually, there is a misconception in that one could assign a CPU to a
specific logical partition in DPF. Instead, all logical partitions on the
same machine share all CPUs and compete for CPU resources.

--
Knut Stolze
DB2 z/OS Utilities Development
IBM Germany
Apr 2 '07 #11

P: n/a
On Apr 2, 5:57 am, Knut Stolze <sto...@de.ibm.comwrote:
Ian wrote:
Think about if we have 4 CPU, which configuration is better?
a. 2 CPU for 1 partition and create total 2 db partitions with intra-
parallel ON
b. 1 CPU for 1 partition and create total 4 db partitions with intra-
paralle OFF
The decision of whether to use 1 CPU / partition or 2 CPUs/partition is
totally dependent on your workload. Certain workloads will benefit from
having more CPU resources available per partition. Many don't, and the
benefit of having more (smaller) partitions is better. Best practices
today start with 1 CPU / partition.

Actually, there is a misconception in that one could assign a CPU to a
specific logical partition in DPF. Instead, all logical partitions on the
same machine share all CPUs and compete for CPU resources.

--
Knut Stolze
DB2 z/OS Utilities Development
IBM Germany
If you're on DB2 9, you can actually "bind" a partition to a resource
set, by specifying the resource set name in your db2nodes.cfg file.
On AIX, a resource set is composed of one or more processors, along
with some amount of RAM. You can configure your own resource sets as
needed, or you can use any predefined resource sets. This is
particularly useful for NUMA machines (not sure if 595's are NUMA-like
or not offhand), but you should also be able to use this on non-NUMA
machines to enable CPU binding. This will probably require a bit of
fiddling around to get right though - I think there are some kernel
tuneables that need to be configured, and some privileges you need to
grant the instance owning ID - search for "DB2 node configuration
file" in the online books for more details.

Cheers,
Liam.

Apr 2 '07 #12

P: n/a
On Apr 2, 8:19 pm, "Liam Finnie" <lfin...@ca.ibm.comwrote:
On Apr 2, 5:57 am, Knut Stolze <sto...@de.ibm.comwrote:


Ian wrote:
>Think about if we have 4 CPU, which configuration is better?
>a. 2 CPU for 1 partition and create total 2 db partitions with intra-
>parallel ON
>b. 1 CPU for 1 partition and create total 4 db partitions with intra-
>paralle OFF
The decision of whether to use 1 CPU / partition or 2 CPUs/partition is
totally dependent on your workload. Certain workloads will benefit from
having more CPU resources available per partition. Many don't, and the
benefit of having more (smaller) partitions is better. Best practices
today start with 1 CPU / partition.
Actually, there is a misconception in that one could assign a CPU to a
specific logical partition in DPF. Instead, all logical partitions on the
same machine share all CPUs and compete for CPU resources.
--
Knut Stolze
DB2 z/OS Utilities Development
IBM Germany

If you're on DB2 9, you can actually "bind" a partition to a resource
set, by specifying the resource set name in your db2nodes.cfg file.
On AIX, a resource set is composed of one or more processors, along
with some amount of RAM. You can configure your own resource sets as
needed, or you can use any predefined resource sets. This is
particularly useful for NUMA machines (not sure if 595's are NUMA-like
or not offhand), but you should also be able to use this on non-NUMA
machines to enable CPU binding. This will probably require a bit of
fiddling around to get right though - I think there are some kernel
tuneables that need to be configured, and some privileges you need to
grant the instance owning ID - search for "DB2 node configuration
file" in the online books for more details.

Cheers,
Liam.- Hide quoted text -

- Show quoted text -
Thanks. I found the manual of db2 8.2 about how to define resource set
on AIX. URL here:

http://publib.boulder.ibm.com/infoce...t/r0006351.htm

But I have no idea if the agents and subagents processes are also
"bind" by this setting too. May be they are shared by all CPU. On the
other hand, I am afraid I cannot assign one logical CPU of dual core
processor to one node by using resource set which is designed for
physical resource only.

I also agree that we cannot make conclusion base on one simple
example. Although I can make use of DPF, I should also increase memory
and I/O to keep linear scale.

However, I have another question. Should I turn on intra-parallel if I
assign 2 CPU for 1 node?


Apr 2 '07 #13

P: n/a
On Apr 2, 1:47 pm, "Woody Ling" <woodyl...@gmail.comwrote:
On Apr 2, 8:19 pm, "Liam Finnie" <lfin...@ca.ibm.comwrote:
On Apr 2, 5:57 am, Knut Stolze <sto...@de.ibm.comwrote:
Ian wrote:
Think about if we have 4 CPU, which configuration is better?
a. 2 CPU for 1 partition and create total 2 db partitions with intra-
parallel ON
b. 1 CPU for 1 partition and create total 4 db partitions with intra-
paralle OFF
The decision of whether to use 1 CPU / partition or 2 CPUs/partition is
totally dependent on your workload. Certain workloads will benefit from
having more CPU resources available per partition. Many don't, and the
benefit of having more (smaller) partitions is better. Best practices
today start with 1 CPU / partition.
Actually, there is a misconception in that one could assign a CPU to a
specific logical partition in DPF. Instead, all logical partitions on the
same machine share all CPUs and compete for CPU resources.
--
Knut Stolze
DB2 z/OS Utilities Development
IBM Germany
If you're on DB2 9, you can actually "bind" a partition to a resource
set, by specifying the resource set name in your db2nodes.cfg file.
On AIX, a resource set is composed of one or more processors, along
with some amount of RAM. You can configure your own resource sets as
needed, or you can use any predefined resource sets. This is
particularly useful for NUMA machines (not sure if 595's are NUMA-like
or not offhand), but you should also be able to use this on non-NUMA
machines to enable CPU binding. This will probably require a bit of
fiddling around to get right though - I think there are some kernel
tuneables that need to be configured, and some privileges you need to
grant the instance owning ID - search for "DB2 node configuration
file" in the online books for more details.
Cheers,
Liam.- Hide quoted text -
- Show quoted text -

Thanks. I found the manual of db2 8.2 about how to define resource set
on AIX. URL here:

http://publib.boulder.ibm.com/infoce...x.jsp?topic=/c...

But I have no idea if the agents and subagents processes are also
"bind" by this setting too. May be they are shared by all CPU. On the
other hand, I am afraid I cannot assign one logical CPU of dual core
processor to one node by using resource set which is designed for
physical resource only.

I also agree that we cannot make conclusion base on one simple
example. Although I can make use of DPF, I should also increase memory
and I/O to keep linear scale.

However, I have another question. Should I turn on intra-parallel if I
assign 2 CPU for 1 node?
Hi Woody,

If you have 2 dual-core chips, then you have 4 physical CPUs, and you
should be able to assign single CPUs to resource sets. If these 4
CPUs have SMT enabled (8 CPU "threads"), then you can't assign each
CPU thread independently.

The way the binding works is that the initial system controller
process is bound to the resource set, and each process that is forked
from that system controller will inherit the same binding, so will use
the same CPU(s).

I'll leave that intra-parallel question for someone else :-) My naive
approach would be that if you have multiple concurrent applications,
it's not as important to enable intra-parallel, than if you have only
a single application running at a time.

Cheers,
Liam.

Apr 3 '07 #14

P: n/a
Ian
Knut Stolze wrote:
>
Actually, there is a misconception in that one could assign a CPU to a
specific logical partition in DPF. Instead, all logical partitions on the
same machine share all CPUs and compete for CPU resources.
Sorry if I wasn't clear. I was just talking about the ratio of CPU /
database partitions, not trying to imply that a CPU is dedicated.

Apr 6 '07 #15

P: n/a
Ian
Woody Ling wrote:
However, I have another question. Should I turn on intra-parallel if I
assign 2 CPU for 1 node?
There is no rule for this decision. It is similar to the decision on
whether you go with a ratio of 1 or 2 CPUs per database partition. And
as I said earlier, this is very dependent on your workload.

Fortunately, turning INTRA_PARALLEL on or off is a lot easier than
changing the number of partitions in your database, so it's easier to
evaluate. ;-)

Apr 6 '07 #16

This discussion thread is closed

Replies have been disabled for this discussion.