
Peoplesoft on Federated UDB?

Anyone using PeopleSoft on a Federated UDB
(shared nothing) environment on open-system platforms?
Preferably AIX, but any war stories would be good.

TEA
EB-C
Nov 12 '05
Daniel,

Let's be careful not to confuse architectures with a single vendor's
marketing. (I know you top the statistics in the Oracle group. Try to
separate that for a moment...)
If you lose disks attached to nodes 6-10 and there are no other disks
catching the ball, the same thing happens in both architectures. Data
gone. Data ownership by nodes is logical.
Any node can get to any data physically. See Mark A's earlier
discussions on advances in disk subsystems.
Just for real-life experience: I am currently working on a competitive
migration.
The customer is going from a non-DB2 product on SMP to DB2 + DPF. We achieve
90% near-linear scalability. Did the logic have to be rewritten? Yes, to a
small extent. But you don't get better than 90%, and if you look at other
vendors' benchmarks you will find app changes plus less than 60% scaling.

I grant other vendors that they may be able to add a second or third
node with positive (whatever that means) scalability and no app changes.
I have yet to see proof that this approach scales beyond a small
sweet spot. In the benchmarks I see, said vendor uses data partitioning
just the same, which brings us back to the blurring of shared disk and
shared nothing we see in the industry today.

Cheers
Serge
Nov 12 '05 #51
Daniel,

Maybe the poster had what you had earlier....
Nov 12 '05 #52
Not sure the TPC committee got that idea when Oracle redefined January 30th.
http://www.tpc.org/tpcc/results/tpcc...p?id=103073004

Cheers
Serge

PS: I can't help it with the function. Oracle customers just can't wait
to jump ship. It's empathy, you know.
Nov 12 '05 #53
Mark A wrote, liberally but not maliciously edited by Blair:
...

Share nothing usually does not yield the best price/performance


Except here, where shared nothing databases seem to do okay in
price/performance:

http://www.tpc.org/tpch/results/tpch...ype=&version=2

Nov 12 '05 #54
With User-Defined Functions, all sorts of things are possible - one
could move the leap day from February to January so we'd have a January
32. If necessary, truncate all other months to 28 days, lengthen January
to 58 days, and everything will ship in January, meeting the TPC six-month
requirement.

Serge Rielau wrote:
Not sure the TPC committee got that idea when Oracle redefined January
30th.
http://www.tpc.org/tpcc/results/tpcc...p?id=103073004

Cheers
Serge

PS: I can't help it with the function. Oracle customers just can't wait
to jump ship. It's empathy, you know.


Nov 12 '05 #55
hrishy wrote:
Hi Daniel

I think I was misread... what you are referring to is shared
nothing... the RAC server which you set up is a shared nothing
architecture...


Not the case. RAC has never been shared nothing. OPS was, but not RAC.
RAC is a shared everything architecture, just like mainframe DB2.

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #56
hrishy wrote:
Hi Daniel
Oops, sorry... I made a blunder while posting... :-)

My line 'shared everything' should have read 'shared nothing', and
'shared nothing' should have read 'shared everything'... :-)

Thanks for correcting me.

so my lines should read:

select * from dept where deptno=10 will go to node 1

with shared nothing;

select * from dept where deptno=10 will go to any of the nodes

with shared everything.

so application partitioning becomes vital in a shared nothing
architecture.

once again thanks for correcting me.

regards
Hrishy


Oops on my part too. I responded to your first post before seeing this
one. Your correction ... is correct.

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #57
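(A minimal sketch of the routing difference described above, for
illustration only: the four-node count and modulo hash below are invented,
not any product's actual routing logic.)

import random

N_NODES = 4

def route_shared_nothing(deptno):
    # Shared nothing: the partitioning key alone decides which node
    # owns the row, so deptno=10 always lands on the same node.
    return deptno % N_NODES

def route_shared_everything(deptno):
    # Shared everything: any node can reach any data, so the router
    # may pick any node (by load, round-robin, etc.).
    return random.randrange(N_NODES)

print(route_shared_nothing(10))     # always node 2
print(route_shared_everything(10))  # any of nodes 0-3
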
Serge Rielau wrote:
Daniel,

Maybe the poster had what you had earlier....


Most likely. Great latitude should always be given early in the morning,
late at night, and on days whose name ends with the letter 'Y'.

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #58
Serge Rielau wrote:
Daniel,

Let's be careful not to confuse architectures with a single vendor's
marketing. (I know you top the statistics in the Oracle group. Try to
separate that for a moment...)
If you lose disks attached to nodes 6-10 and there are no other disks
catching the ball, the same thing happens in both architectures. Data gone.

Not the case with shared everything, which is not single-vendor: it is both
Oracle and mainframe DB2. If node 6 dies, the load is redistributed.
There are no 'attached disks' and no data can possibly be lost.

Data ownership by nodes is logical.

Not with shared everything; that is the point of the architecture and
exactly why it is used with DB2 on mainframes.

Any node can get to any data physically. See Mark A's earlier
discussions on advances in disk subsystems.

Even with Microsoft's Federated, though Federated is far more of an issue
than is shared nothing. But still the data and the node are inseparable.
The node dies ... the data is gone, because the node owns the data and is
no longer in a position to share it with the other nodes.

Just for real-life experience: I am currently working on a competitive
migration. The customer is going from a non-DB2 product on SMP to DB2 +
DPF. We achieve 90% near-linear scalability. Did the logic have to be
rewritten? Yes, to a small extent. But you don't get better than 90%, and
if you look at other vendors' benchmarks you will find app changes plus
less than 60% scaling.

That is not the case. First, because shared everything for both vendors
using it routinely runs at 88-90%, and both vendors claim so; but more
importantly because no code has to be rewritten when nodes are added or
lost. The number you are quoting, <60%, is either very old Oracle
(pre-RAC) or Microsoft's.

I grant other vendors that they may be able to add a second or third
node with positive (whatever that means) scalability and no app changes.
I have yet to see proof that this approach scales beyond a small
sweet spot.

I've seen the numbers from a test that began with 1 node and was scaled
up to 128, each time adding an identical 2x4 Linux box. Graphed
performance showed a linear 88-89% relationship. We didn't do it this
weekend, but we routinely take our six RAC lab machines, configure them
as three two-node clusters, test fail-over, and then reconfigure in just
a few minutes to a single six-node cluster, changing nothing except a few
parameter files.
Cheers
Serge


--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #59
Serge Rielau wrote:
Not sure the TPC committee got that idea when Oracle redefined January
30th.
http://www.tpc.org/tpcc/results/tpcc...p?id=103073004

Cheers
Serge

PS: I can't help it with the function. Oracle customers just can't wait
to jump ship. It's empathy, you know.


This ship? http://www.psoug.org/cruise2004.html

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #60
Blair Adamache wrote:
With User-Defined Functions, all sorts of things are possible - one
could move the leap day from February to January so we'd have a January
32. If necessary, truncate all other months to 28 days, lengthen January
to 58 days, and everything will ship in January, meeting the TPC six-month
requirement.

Serge Rielau wrote:
Not sure the TPC committee got that idea when Oracle redefined January
30th.
http://www.tpc.org/tpcc/results/tpcc...p?id=103073004

Cheers
Serge

PS: I can't help it with the function. Oracle customers just can't
wait to jump ship. It's empathy, you know.


I'm still trying to figure out why a benchmark on unreleased Beta
software has any meaning. From what I've seen it has gotten better
with each subsequent Beta release. I can't imagine why they would want
to use the older, slower numbers.

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #61
Mark A wrote, liberally but not maliciously edited by Blair:
... Share nothing usually does not yield the best price/performance

"Blair Adamache" <ba*******@2muchspam.yahoo.com> wrote in message
news:bv**********@hanover.torolab.ibm.com...
Except here, where shared nothing databases seem to do okay in
price/performance:

http://www.tpc.org/tpch/results/tpch...ype=&version=2


Blair, I specifically mentioned that the IBM eServer 325 "may" challenge the
price/performance barrier of shared nothing vs. SMP nodes. However (and I am
getting very tired of repeating myself) the IBM TPC benchmarks you
referenced did not use share nothing. There were two partitions per physical
node. That means on each physical node (or cluster, as TPC calls it) the two
partitions share the following components:

- System Memory
- Cisco 1000BASE-T Gigabit GBIC Module (network interface card to
communicate with other nodes)
- 18.2GB 15K Ultra160 SCSI Hot-Swap Drive (not for DB2, but for OS, swap
space, etc.)
- IBM FC2-133 Host Bus Adapter
- Qlogic QLA 2342 Host Bus Adapter

Note that one of the Host Bus Adapters (not sure which one) is shared for
access to the external Storage Subsystem where the DB2 database resides. The
other is for the internal hard drive mentioned above.

The partitions did not really share processors because there were 2
processors on each node (with 2 partitions).

I don't really understand why "share nothing" is so difficult to understand
and why so many people claim to have used it, when in fact they did not. I
would, however, be interested to see the same benchmark tested with a true
share nothing environment. That would mean either reducing the partitions to
8 (from 16) for the 8 physical nodes (blades) used in the benchmark, or
increasing the number of physical nodes (blades) to 16 to match the 16
partitions used in the benchmark.
Nov 12 '05 #62
Sorry Mark, I don't agree.
Of course that is using 'shared nothing' architecture.
Between the physical nodes nothing is shared.

Mark A wrote:
Blair, I specifically mentioned that the IBM eServer 325 "may" challenge the
price/performance barrier of shared nothing vs. SMP nodes. However (and I am
getting very tired of repeating myself) the IBM TPC benchmarks you
referenced did not use share nothing. There were two partitions per
physical node.
<snip>


--
Anton Versteeg
IBM Certified DB2 Specialist
IBM Netherlands

Nov 12 '05 #63
>"Anton Versteeg" <an************@nnll.iibbmm.com> wrote in message
news:40**************@nnll.iibbmm.com...
Sorry Mark, I don't agree.
Of course that is using 'shared nothing' architecture.
Between the physical nodes nothing is shared.


Yes, between the nodes there is no sharing (by definition), but within the
nodes there is sharing. So the architecture used in the benchmark Blair
referred to is neither share everything (all partitions on the same node)
nor share nothing (each partition on its own node). It is in between
the two.

Of course, with only 2 partitions per node, it is "close" to share nothing
(especially when compared to the usual 4-6 partitions per node frequently
used on an 8-processor node).

But share "nothing" means NOTHING; NADA; ZERO SHARING; NIL; I don't know
how else to say it.

2 partitions per node is not share nothing. Close, but no cigar.

You can disagree all you want, but that does not make it so.
Nov 12 '05 #64
OK, come up with a better word then.
It seems the majority (when sober, at least) understands shared nothing
in the context of a DBMS to relate to the view of the DBMS and not to the
hardware view.
Could we just agree that the meaning of shared nothing in
this context might have changed a bit over the last 20 years, for all the
valid reasons you brought up?

Cheers
Serge

Nov 12 '05 #65
"Serge Rielau" <sr*****@ca.eye-be-em.com> wrote in message
news:bv**********@hanover.torolab.ibm.com...
OK, come up with a better word then.
It seems the majority (when sober, at least) understands shared nothing
in the context of a DBMS to relate to the view of the DBMS and not to the
hardware view.
Could we just agree that the meaning of shared nothing in
this context might have changed a bit over the last 20 years, for all the
valid reasons you brought up?

Cheers
Serge

IMO no, because it is important to distinguish having 2 partitions per node
from having 4 partitions or 6 partitions per node, even when there are
multiple nodes. Saying that both are "shared nothing" is incorrect IMO, and
blurs the distinction between them, given that they should scale a little
differently.

It is still possible to have shared nothing (1 partition per node) and the
eServer 325 makes it more attractive than ever before.

I think that some people are too convinced that shared nothing is always the
best solution, and because of that, the marketing people are trying to latch
on to a label that doesn't really fit.

Even though hardware and software advances have made sharing more acceptable
in terms of performance and scalability, there still can be some bottlenecks
when partitions share a node.
Nov 12 '05 #66
In article <1075131919.777753@yasure>, Daniel Morgan
(da******@x.washington.edu) says...
Serge Rielau wrote:
Daniel,

Maybe the poster had what you had earlier....


Most likely. Great latitude should always be given early in the morning,
late at night, and on days whose name ends with the letter 'Y'.


Yesterday ?
Nov 12 '05 #67
But not tomorrow.

Gert van der Kooij wrote:
In article <1075131919.777753@yasure>, Daniel Morgan
(da******@x.washington.edu) says...
Serge Rielau wrote:

Daniel,

Maybe the poster had what you had earlier....


Most likely. Great latitude should always be given early in the morning,
late at night, and on days whose name ends with the letter 'Y'.

Yesterday ?


Nov 12 '05 #68
Mark A wrote:
I think that some people are too convinced that shared nothing is always the
best solution, and because of that, the marketing people are trying to latch
on to a label that doesn't really fit.


Well you won't have to worry about the Oracle people thinking that. They
dropped that architecture many years ago. Doubt the mainframe people
using DB2 will jump behind that slogan either so you're pretty safe. ;-)

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #69
Ah... yes. That is the beauty of marketing.
When a vendor publishes a benchmark it gives bragging rights,
from full-page ads to analysts' attention.
The whole idea is to push a product.
Now if that product does not ship within the timeframe indicated, then
the result must be withdrawn.
But all the marketing cannot be undone.
One can let the odd case slip, but if it happens regularly then first the
vendor's reputation goes, and then the benchmark's, because it's
benchmarking prototypes rather than products.

Cheers
Serge
Nov 12 '05 #70
Daniel Morgan <da******@x.washington.edu> wrote in message news:<1075100915.338329@yasure>...
Far be it from me to defend Oracle Consulting as I have come in after
them a few times myself: Enough said. But without a thorough knowledge
of what that person found when they walked in the door it is hard to say.


Tsk, tsk Daniel. I think you *should* defend them. Otherwise you may
find you get chopped off the 11e beta test programme.
DG
Nov 12 '05 #71
Daniel Morgan <da******@x.washington.edu> writes:
The only two commercial RDBMS products with shared nothing architecture
are DB2 on mainframes (not other platforms) and Oracle on all platforms.


Actually, neither of those is a shared-nothing architecture, but DB2 EEE
is. The original shared-nothing parallel database architecture is the
GAMMA research prototype developed by David DeWitt at U of
Wisconsin-Madison. Informix XPS and IBM DB2 UDB EEE (on linux, windoze, and
unix) are implementations of the same architecture. This architecture is
not only the underlying platform, but also a particular mapping of relations
to the hardware. The GAMMA paper and other papers on parallel database
machines can be found in the March 1990 issue of IEEE Transactions on
Knowledge and Data Engineering.

A shared-nothing platform is a collection of nodes, each of which has its
own CPU(s), disk(s), memory, and buses. The nodes are connected by a
high-speed network. Each node has the ability to store and query relations.
A logical relation is horizontally declustered across the nodes. This means
that a fragment of the relation is stored on each node. Each fragment has
the same schema-- the schema of the relation. Each fragment stores a subset
of the tuples of the logical relation, and the subsets are disjoint. The
original logical relation is the union of all the fragments (1 per node).
Typically the partition of the relation into subsets is done either by
hashing on the key value (each hash bucket is assigned to a node), by
a range partition (each range is assigned to a node), or round-robin (if
there are n nodes, the kth tuple inserted is inserted into node k mod n).
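
(As a rough illustration of these three declustering schemes, here is a
minimal Python sketch. The node count, hash function, and range bounds are
invented for the example; GAMMA, DB2 EEE, and the like use their own.)

N = 4  # number of nodes

def hash_partition(tuples, key):
    # each hash bucket is assigned to a node
    fragments = [[] for _ in range(N)]
    for t in tuples:
        fragments[hash(t[key]) % N].append(t)
    return fragments

def range_partition(tuples, key, bounds):
    # bounds like [100, 200, 300]: node 0 gets key < 100,
    # node 1 gets 100 <= key < 200, and so on
    fragments = [[] for _ in range(N)]
    for t in tuples:
        fragments[sum(t[key] >= b for b in bounds)].append(t)
    return fragments

def round_robin_partition(tuples):
    # the kth tuple inserted goes to node k mod N
    fragments = [[] for _ in range(N)]
    for k, t in enumerate(tuples):
        fragments[k % N].append(t)
    return fragments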

When a query is submitted, queries typically are generated to run at all of
the nodes in parallel. For instance, consider a logical relation R that
is declustered horizontally into 8 subsets, R0, R1, ..., R7, with Ri stored
at node i. The query "select * from R" is implemented by running
"select * from Ri" at node i, and all 8 queries can run in parallel,
with the results pipelined into a union or merge operator that produces
the result (in parallel with the continued running of the queries because
of pipelining).
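
(A sketch of that parallel scan, with threads standing in for nodes. A
real system pipelines results over the interconnect rather than collecting
lists, but the shape is the same.)

from concurrent.futures import ThreadPoolExecutor
from itertools import chain

def scan_fragment(fragment, predicate):
    # stands in for running "select * from Ri" at node i
    return [t for t in fragment if predicate(t)]

def parallel_select(fragments, predicate=lambda t: True):
    # one worker per fragment; the chain() call is the union step
    with ThreadPoolExecutor(max_workers=len(fragments)) as pool:
        parts = pool.map(lambda f: scan_fragment(f, predicate), fragments)
        return list(chain.from_iterable(parts))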

Joins typically are done via hash joins, with the fragments being pairwise
joined in parallel, but each join of a pair of fragments can also be
further parallelized by hashing and parallel joining of hash buckets.

This shared-nothing scheme has been shown to achieve linear scalability,
i.e., you can double the workload and maintain the same response time by
doubling the number of nodes. It also exhibits linear speedup-- you can cut
the response time in half by doubling the number of nodes with the workload
held fixed. This linear scalability can certainly be achieved up to 64
nodes, and should be pushing 128 nodes with optimizations of system
overhead.

Note that if you have shared memory, you can use it to implement fast
communication channels (A sends a message to B by writing it to a buffer
that B can read) as long as logically you stick to the shared-nothing
architecture.
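
(A toy version of such a channel, using a thread-safe queue as the shared
buffer. The point is only that the transport uses shared memory while each
"node" still owns its own fragment; the names here are invented.)

import queue
import threading

channel = queue.Queue()  # shared-memory buffer acting as the A -> B channel

def node_a():
    # A "sends" by writing its tuples into the shared buffer
    channel.put(("fragment-R3", [1, 2, 3]))

def node_b():
    # B "receives" by reading the same buffer
    label, payload = channel.get()
    print(label, payload)

ta = threading.Thread(target=node_a)
tb = threading.Thread(target=node_b)
ta.start(); tb.start()
ta.join(); tb.join()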

The Oracle cluster architecture is advantageous for fault-tolerance, but
has never been shown to achieve linear scalability. It is a fundamentally
different architecture and is not a traditional shared-nothing parallel
database machine.

Hope that helps to clarify a little bit.

Joseph
Nov 12 '05 #72
Database Guy wrote:
Daniel Morgan <da******@x.washington.edu> wrote in message news:<1075100915.338329@yasure>...

Far be it from me to defend Oracle Consulting as I have come in after
them a few times myself: Enough said. But without a thorough knowledge
of what that person found when they walked in the door it is hard to say.

Tsk, tsk Daniel. I think you *should* defend them. Otherwise you may
find you get chopped off the 11e beta test programme.
DG


Some like to bungee jump. Some like to parachute jump. Me? I prefer
to talk openly about billion dollar software companies. Guess we all
have our own way to "live dangerously." ;-)

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #73
Joseph wrote:
Daniel Morgan <da******@x.washington.edu> writes:

The only two commercial RDBMS products with shared nothing architecture
are DB2 on mainframes (not other platforms) and Oracle on all platforms.

Actually, neither of those is a shared-nothing architecture, but DB2 EEE
is.


I caught that mistake, apologized, and corrected it one or two days
ago earlier in the thread ... but thank you.

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #74
"Joseph" <jo****@aracnet.com> wrote in message
news:bv*********@enews2.newsguy.com...
Actually, neither of those is a shared-nothing architecture, but DB2 EEE
is. The original shared-nothing parallel database architecture is the
GAMMA research prototype developed by David DeWitt at U of
Wisconsin-Madison. Informix XPS and IBM DB2 UDB EEE (on linux, windoze, and
unix) are implementations of the same architecture. This architecture is
not only the underlying platform, but also a particular mapping of relations
to the hardware. The GAMMA paper and other papers on parallel database
machines can be found in the March 1990 issue of IEEE Transactions on
Knowledge and Data Engineering.

<snip>
Joseph


I don't really know about the research prototype developed by David DeWitt
at U of Wisconsin-Madison, but you seem to have left out Teradata, which
was the first company to implement a real-world relational database product
with share nothing architecture. This happened long before DB2 PE (Parallel
Edition) or Informix XPS. And it happened before 1990.

DB2's parallel database implementation has always been (and still is)
"capable" of share nothing, ever since the first release of DB2 Parallel
Edition. In fact it was required at the time, because AIX did not yet
support SMP.
But with the advent of good SMP processing, not many people implement a
"pure" share nothing architecture anymore, unless they have really huge
scalability requirements. It is usually implemented as a "hybrid" of
multiple nodes, with multiple partitions per node. But with DB2 it's up to
the DBA/customer as to how pure of a share nothing architecture that is
implemented.
Nov 12 '05 #75
"Mark A" <ma@switchboard.net> writes:
"Serge Rielau" <sr*****@ca.eye-be-em.com> wrote in message
news:bv**********@hanover.torolab.ibm.com...
OK, come up with a better word then.
It seems the majority (when sober, at least) understands shared nothing
in the context of a DBMS to relate to the view of the DBMS and not to the
hardware view.
Could we just agree that the meaning of shared nothing in
this context might have changed a bit over the last 20 years, for all the
valid reasons you brought up?

Cheers
Serge

IMO no, because it is important to distinguish having 2 partitions per node,
from having 4 partitions or 6 partitions per node, even when there are
multiple nodes. Saying that both are "shared nothing" is incorrect IMO, and
blurs the distinction between them, even though they should scale a little
differently.


I'm not sure how, given your definition, you'd characterize the
implementation of DB2 that runs on the Parallel Sysplex hardware,
using the coupling facility, on the zSeries processors.

--
#include <disclaimer.std> /* I don't speak for IBM ... */
/* Heck, I don't even speak for myself */
/* Don't believe me ? Ask my wife :-) */
Richard D. Latham la*****@us.ibm.com
Nov 12 '05 #76
Database Guy wrote:
Daniel Morgan <da******@x.washington.edu> wrote in message news:<1075100915.338329@yasure>...

Far be it from me to defend Oracle Consulting as I have come in after
them a few times myself: Enough said. But without a thorough knowledge
of what that person found when they walked in the door it is hard to say.

Tsk, tsk Daniel. I think you *should* defend them. Otherwise you may
find you get chopped off the 11e beta test programme.
DG

Too late. Daniel who ? :-)

Nov 12 '05 #77
Serge Rielau wrote:
Ah... yes. That is the beauty of marketing.
When a vendor publishes a benchmark it gives bragging rights,
from full-page ads to analysts' attention.
The whole idea is to push a product.
Now if that product does not ship within the timeframe indicated, then
the result must be withdrawn.
But all the marketing cannot be undone.
One can let the odd case slip, but if it happens regularly then first the
vendor's reputation goes, and then the benchmark's, because it's
benchmarking prototypes rather than products.

Cheers
Serge


Well, it's not yet the 30th :-)

And there are plenty of other benchmarks for bragging rights - see
http://www.tpc.org/tpcc/results/tpcc...resulttype=all
Nov 12 '05 #78
Mark Townsend wrote:
Database Guy wrote:
Daniel Morgan <da******@x.washington.edu> wrote in message
news:<1075100915.338329@yasure>...

Far be it from me to defend Oracle Consulting as I have come in after
them a few times myself: Enough said. But without a thorough
knowledge of what that person found when they walked in the door it
is hard to say.


Tsk, tsk Daniel. I think you *should* defend them. Otherwise you may
find you get chopped off the 11e beta test programme.
DG


Too late. Daniel who ? :-)


He seems to think I should defend Oracle Consulting. Can't imagine why.
Apparently part of some ... 'if you work with a product you must be
baptized into its faith' ... kind of thinking: foreign to me.

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #79
Mark Townsend <ma***********@comcast.net> wrote in message news:<40**************@comcast.net>...
For example, Oracle requires more DBA
support than DB2, an important consideration when choosing.


Prove it. And don't quote me a lot of fly by night supposed analysts
whose main income seems to come from talking at DB2 user conferences.


Go get 'em Mark! The oracle install is a piece of cake, easiest
install of any database I've ever worked with. Some people talk about
how easy sql server, mysql, or postgresql are to install - but those
people simply don't have enough training in oracle.

It's almost always very easy on Windows, but to be honest it's
sometimes a little tougher on other operating systems. While
installing 9i on Solaris I discovered that the 350-page installation
manual was so wrong that it actually set me way back - I had to spend
two days googling various groups for answers to the java problems.
Probably sabotaged by db2ers. And about six months ago I had to do a
reinstall on Red Hat Linux 7.3 - unfortunately there was a set of odd
bugs. Luckily these were documented on Metalink; unfortunately I only
had an OEM license - and so no access to Metalink. Man, that was a
painful couple of days. Then I've also had some odd performance
strangeness on AIX as well, that required quite a few calls and
research.

Still, I stand by my point - Oracle is a breeze to install; all those
problems of mine could have been easily resolved by additional
training. Right now I'm lobbying for a couple of weeks of training
each year to keep up on these issues so that I won't make the product
look bad. Wish me luck!

Buck
Nov 12 '05 #80
"Mark A" <ma@switchboard.net> writes:
I don't really know about the research prototype developed by David DeWitt
at U of Wisconsin-Madison, but you seem to have left out Teradata, which
was the first company to implement a real-world relational database product
with share nothing architecture. This happened long before DB2 PE (Parallel
Edition) or Informix XPS. And it happened before 1990.


Gamma happened way before 1990 too-- the article is the final archival
paper describing the system, probably submitted in 1989 after a 5 or 6 year
project. Teradata may have had a shared-nothing parallel DB before that,
but it is hard to tell exactly when Teradata began to be shared-nothing--
I think their first incarnation was not shared-nothing. But you are correct
that Teradata may have been first. I also think in the 1980's the Teradata
machine declustered relations to multiple nodes for parallel query
evaluation, but I don't know if they supported parallelization of joins of
individual buckets at that time.

Joseph
Nov 12 '05 #81
Now you're talking, man. It's not yet the 30th indeed.
Not sure what's to brag about with <60% scalability on that RAC result, though.

Cheers
Serge
Nov 12 '05 #82
"Joseph" <jo****@aracnet.com> wrote in message
news:bv*********@enews1.newsguy.com...
"Mark A" <ma@switchboard.net> writes:
I don't really know about the research prototype developed by David DeWittat U of Wisconsin-Madison, but you seemed to have left out Teradata which
was the first company to implement a real world relational database productwith share nothing architecture. This happened long before DB2 PE (paralleledition) or Informix XPS. And it happened before 1990.
Gamma happened way before 1990 too-- the article is the final archival
paper describing the system, probably submitted in 1989 after a 5 or 6

year project. Teradata may have had a shared-nothing parallel DB before that,
but it is hard to tell exactly when Teradata began to be shared-nothing--
I think their first incarnation was not shared-nothing. But you are correct that Teradata may have been first. I also think in the 1980's the Teradata machine declustered relations to multiple nodes for parallel query
evaluation, but I don't know if they supported parallelization of joins of
individual buckets at that time.

Joseph


Teradata was definitely the first to develop a parallel database, and by
definition it needed to be share nothing, because SMP was not available
except on a mainframe. Teradata used Intel nodes with a proprietary
operating system. Teradata used a share nothing architecture (one partition
per node) up until about 1996, when they began to support SMP machines on
one or more nodes. Of course, like DB2, one can still implement Teradata in
a share nothing architecture, but it is usually most cost effective to have
multiple partitions per SMP node, with multiple nodes.

I don't know what you mean by parallelization of joins of individual
buckets. But Teradata in the mid 1980's worked pretty much the same way DB2
does today (at a conceptual level). The table was spread across multiple
partitions based on a hash key, and each partition processed the data in
parallel. Cross-partition joins were supported.
Nov 12 '05 #83
Serge Rielau wrote:
Now you're talking, man. It's not yet the 30th indeed.
Not sure what's to brag about with <60% scalability on that RAC result, though.

Cheers
Serge


Anyone getting only 60% scalability with RAC needs to take the class I
teach. They are missing something essential.

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #84
Daniel Morgan wrote:
Anyone getting only 60% scalability with RAC needs to take the class I
teach. They are missing something essential.


Humility?

Nov 12 '05 #85

What is an OEM licence ?

Nov 12 '05 #86


Daniel Morgan wrote:

Anyone getting only 60% scalability with RAC needs to take the class I
teach. They are missing something essential.


Add every person that's ever tried to scale RAC over 16 nodes. Daniel,
you're going to be a very rich man!

DBA

Nov 12 '05 #87
dba wrote:


Daniel Morgan wrote:

Anyone getting only 60% scalability with RAC needs to take the class I
teach. They are missing something essential.


Add every person that's ever tried to scale RAC over 16 nodes. Daniel,
you're going to be a very rich man!

DBA


I'll take you up on it. The class is free if, after taking it, you can't
scale more than 20 nodes at greater than 80%, provided you follow the
steps we teach in the class.

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #88


Buck Nuggets wrote:

Still, I stand by my point - oracle is a breeze to install, all those
problems of mine could have been easily resolved by additional
training. Right now I'm lobbying for a couple of weeks of training
each year to keep up on these issues so that I won't make the product
look bad. Wish me luck!

Buck

You need training to install it?
And it is a breeze to install?
Looks more like a hurricane to me :-)

--
Anton Versteeg
IBM Certified DB2 Specialist
IBM Netherlands
Nov 12 '05 #89
What would the scalability be at 128 nodes? 60%?

Daniel Morgan wrote:
dba wrote:


Daniel Morgan wrote:

Anyone getting only 60% scalability with RAC needs to take the class
I teach. They are missing something essential.


Add every person that's ever tried to scale RAC over 16 nodes.
Daniel, you're going to be a very rich man!

DBA

I'll take you up on it. The class is free if after taking it you can't
scale more than 20 nodes at greater than 80% provided you follow the
steps we teach in the class.


--
Anton Versteeg
IBM Certified DB2 Specialist
IBM Netherlands
Nov 12 '05 #90
The systems that started this thread cost over six million dollars and
have 64 CPUs. Presumably, the performance experts of Oracle and HP
worked together to achieve the nn% scalability that took the thread on
this tangent. Don't know if these experts ever took courses at the
University of Washington, or asked for a refund.

Daniel Morgan wrote:
dba wrote:


Daniel Morgan wrote:

Anyone getting only 60% scalability with RAC needs to take the class
I teach. They are missing something essential.


Add every person that's ever tried to scale RAC over 16 nodes. Daniel,
you're going to be a very rich man!

DBA

I'll take you up on it. The class is free...


Now I understand your reference to academic freedom earlier in this thread.

Nov 12 '05 #91
Mark Townsend <ma***********@comcast.net> wrote in message news:<40**************@comcast.net>...
What is an OEM licence ?


Management purchased the Oracle license through the vendor, since it
was much cheaper than either the Oracle or DB2 licenses.
Unfortunately it turns out that it does not include any kind of Oracle
support.

Theoretically, we can get support from the vendor, and in their
opinion the database is an embedded technology within their product.
In reality, Oracle, DB2, etc. aren't very good embedded products, and
the vendor doesn't have a good support structure.

buck
Nov 12 '05 #92
OEM = original equipment manufacturer

Mark Townsend wrote:

What is an OEM licence ?


Nov 12 '05 #93
Buck Nuggets wrote:
Mark Townsend <ma***********@comcast.net> wrote in message news:<40**************@comcast.net>...
What is an OEM licence ?

management purchased the oracle license through the vendor, since it
was much cheaper than either the oracle or db2 licenses.
Unfortunately it turns out that it does not include any kind of oracle
support.

Theoretically, we can get support from the vendor, and in their
opinion the database is an embedded technology within their product.
In reality, oracle, db2, etc aren't very good embedded products and
the vendor doesn't have a good support structure.

buck


OK - so what we would call an ESL (Embedded Software Licence). Note that
you're not really supposed to be installing or upgrading or even patching
the database software in these environments (the vendor is supposed to
send you a new version of their app with the latest and greatest version
in it). I'd be interested in said vendor's name if you want to send it to
me offline. C'est la vie.

Nov 12 '05 #94
Blair Adamache wrote:
OEM = original equipment manufacturer

Mark Townsend wrote:

What is an OEM licence ?

Thank you - and remember, keep banging the rocks together, guys.

Nov 12 '05 #95
"Mark A" <ma@switchboard.net> writes:
I don't know what you mean by parallelization of joins of individual
buckets. But Teradata in the mid 1980's worked pretty much the same way DB2
does today (at a conceptual level). The table was spread across multiple
partitions based on a hash key, and each partition processed the data in
parallel. Cross-partition joins were supported.


When you decluster relations across nodes by, say, hash partitioning,
then to join two relations you need to join the corresponding fragments,
if they are both hashed on the join attributes in the declustering.
Here I'm using the term fragment for the set of tuples of a relation
stored at a single node after declustering.

You can further parallelize the join of a pair of fragments by using
a hash-join algorithm, but using a different hash function than
was used to decluster the relations. This further partitions each
fragment into buckets that can be joined independently of each other.
Each of these bucket joins can be started on a separate processor,
enabling the steps of a hash join of fragments to be computed in parallel.
The initial declustering of relations enables I/O parallelism, but
the hashing with a separate function to compute hash joins of each
pair of fragments enables CPU parallelism. Teradata did not support this
in the mid 1980s. It did use a sort-merge join algorithm in which each
fragment can be partitioned into smaller pieces to be sorted in parallel,
but the merge operation is single-threaded, so this join algorithm is not
as parallelizable as a hash-join.
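
(A sketch of this two-level scheme, assuming both inputs are already
declustered on the join key so only matching fragments need joining. The
bucket count and the salted hash that differs from the declustering hash
are invented for the example.)

from concurrent.futures import ThreadPoolExecutor
from itertools import chain

N_BUCKETS = 8  # second-level fan-out, arbitrary here

def bucketize(fragment, key):
    # a *different* hash than the one used to decluster: salt it
    buckets = [[] for _ in range(N_BUCKETS)]
    for t in fragment:
        buckets[hash(("salt", t[key])) % N_BUCKETS].append(t)
    return buckets

def join_bucket(bucket_r, bucket_s, key):
    # classic in-memory hash join of one pair of buckets
    index = {}
    for r in bucket_r:
        index.setdefault(r[key], []).append(r)
    return [(r, s) for s in bucket_s for r in index.get(s[key], [])]

def fragment_hash_join(frag_r, frag_s, key):
    br, bs = bucketize(frag_r, key), bucketize(frag_s, key)
    # each bucket pair joins independently, so run them in parallel
    with ThreadPoolExecutor(max_workers=N_BUCKETS) as pool:
        pairs = pool.map(lambda i: join_bucket(br[i], bs[i], key),
                         range(N_BUCKETS))
        return list(chain.from_iterable(pairs))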

The first GAMMA paper describing the system was published and presented at
the VLDB 1986 conference, meaning the paper was submitted in fall 1985. It
would have been a working system by then, so I'd say the shared-nothing
parallel DB architecture was independently developed by Teradata and
DeWitt's research group, with the latter more highly developed
technologically. Of course, Teradata had to spend more time getting
the code to production standards, whereas a research group can just do a
prototype as proof of concept.

Cheers,

Joseph
Nov 12 '05 #96
FYI - Oracle also added this capability (we call it a partition-wise join)
when we added hash partitioning in Oracle8i Release 1 - around
1998/1999. Note that only one side of the join actually needs a hash
partition key; the other can be hashed on the fly to match.

And before you ask, yes, we will ship the join to the nodes that have
data locality.

Joseph wrote:
"Mark A" <ma@switchboard.net> writes:

I don't know what you mean by parallelization of joins of individual
buckets. But Teradata in the mid 1980's worked pretty much the same way DB2
does today (at a conceptual level).
<snip>

Nov 12 '05 #97
