I tend to summarize procedures this way: use the strengths of the language.
LANGUAGE SQL should always be the first choice; if you can use the
mechanism inside that language to do what you want to do, then it will
not only give you the best performance, but it is also the easiest to
maintain and port from system to system (no external files to worry
about, just the DDL). It's a fairly straightforward language to learn,
as it's not that much more complex than the SQL statements it will be
executing.
However, there are operations that that language can't do, such as
accessing files, opening TCPIP ports, etc. In that case, it's really a
matter of choice what EXTERNAL (ie, accessing an external library)
language you use. JAVA would probably be fine in this case, as it's
object structure tends to encapsulate complex functionality quite well,
and it's probably the second most portable language behind SQL, as you
can use the "SQLJ"-schema'ed functions (I won't describe them here --
they're documented quite well online) to manage your JAR files and
ensure changes are picked up on the server. However, most SQL inside of
JAVA procedures is dynamically executed, which can hinder performance,
and can require some configuration (picking a JDK, ensuring the files
required can be access, setting the JAVA_HEAP_SZ) in order to get the
best results.
Really, though, if you can't use LANGUAGE SQL I'd use whatever language
you're comfortable coding in.
Rafael Faria wrote:
Serge Rielau wrote:
Rafael Faria wrote:
Hi All,
We are starting a large data warehousing project using DB2 8.2 on
AIX.
There is a direction to move any new internal development to Java
and a
question was raised: Would it be a good idea to have all stored
procedures that we might need (ETL, additional transformers, etc)
for
this project and any other project written in Java as opposed to
SQL/P?
Does anyone have any number or experience in terms of performance
differences?
It depends.
I would _not_ use Java in any environment with high throughput.
That is Java UDF in queries with many rows or Java Procs for high
through in e.g. batch processing.
Using Java to drive your complex queries (Read: it's not the Java
that
takes time it's the SQL) is no problem.
SQL Procedures are faster than Java today and they will only get
faster
from release to release.
So if your SQL in the procs takes minutes to run, the choice of
language
for logic is irrelevant.
If you drive a lot of logic in your ETL and the SQL is trivial the
choice of language is everything.
Just my 2 cents canadian
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Hi Serge / All,
Thanks very much for your comments. They seem to support what I
expected.
The old DW is running on Oracle on an old Unix server. The new DW is
going to be running on brand new IBM servers and on DB2 8.2 and we will
have additional feeds from new operational systems that the old DW
didn't have to handle. Therefore it's a bit hard to draw conclusions
based on the existing numbers.
At this stage, we don't anticipate any extremely sophisticated logic or
transformations on the new ETL processes. There will be a lot of data
to be moved from operational systems to the warehouse on a daily basis
though, so performance will be a very important factor.
For these reasons I'm inclined to vote against java stored procedures
at this point.
We might put together a prototype to compare both solutions just to
substantiate with numbers our decision.
Given the scenario above would you implement the Java SP using JDBC or
SQLJ?
Thanks again for any feedback.
Regards,
Rafael Faria