Quoth ph***@pacbell.net (Ralph):
> My company is researching the idea of moving from a very large, old
> and complex UniData DB to something else that is newer and quicker. I
> was thinking about PostgreSQL, but am not sure PostgreSQL could import
> the data, and/or what kinds of issues could arise, not to mention
> if it's even possible in the first place.
> Your thoughts would be greatly appreciated!
> thanks!
> Ralph
You might take a look at the "conversions" documents...
<http://techdocs.postgresql.org/#convertfrom>
Take a look at your database schemas to see if you are using any data
types not supported by PostgreSQL; it would seem very unlikely that
this would be the case.
You'll need to start some form of project to see about getting the
data out of the existing database and into PostgreSQL; I'm not sure
what UniData has as export tools, but you have quite a lot of
flexibility in pushing data into PostgreSQL.
The "methodology" that I prefer to use for such conversions is to take
the input data, "massage" it, _as text_, and thereby generate a script
of SQL statements that may be directly loaded in. That is preferable
to writing a program that does direct inserts because you can read the
SQL scripts to help debug the process beforehand, and refer back to it
afterwards if there is a need to document anomalies found later.
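As a sketch of that "massage it as text" approach — assuming a pipe-delimited export file, and with the table and column names purely hypothetical — a small Python filter can turn each exported line into an INSERT statement:

```python
import sys

def sql_quote(value):
    """Escape a raw text field as a SQL string literal (doubling quotes)."""
    return "'" + value.replace("'", "''") + "'"

def export_to_inserts(lines, table, columns):
    """Turn pipe-delimited export lines into INSERT statements, as text."""
    for line in lines:
        fields = line.rstrip("\n").split("|")
        values = ", ".join(sql_quote(f) for f in fields)
        yield "INSERT INTO %s (%s) VALUES (%s);" % (
            table, ", ".join(columns), values)

if __name__ == "__main__":
    # e.g.: python massage.py < unidata_export.txt > load.sql
    for stmt in export_to_inserts(sys.stdin, "customers", ["name", "city"]):
        print(stmt)
```

The payoff of emitting a script rather than doing live inserts is exactly as described: the generated load.sql can be read, diffed, and kept around as documentation of what went in.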
If speed is _not_ a priority, then setting up a series of INSERT
statements (perhaps with periodic BEGIN/COMMIT, to cut down on
transaction costs) is one method. That is the 'simple' place to
start.
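A minimal sketch of that periodic BEGIN/COMMIT batching, assuming the INSERT statements already exist as a stream of text lines:

```python
def batch_transactions(statements, batch_size=1000):
    """Wrap a stream of SQL statements in BEGIN/COMMIT blocks of
    batch_size statements each, to cut per-statement transaction cost."""
    count = 0
    for stmt in statements:
        if count % batch_size == 0:
            yield "BEGIN;"
        yield stmt
        count += 1
        if count % batch_size == 0:
            yield "COMMIT;"
    if count % batch_size != 0:
        yield "COMMIT;"  # close the final, partial batch
```

The batch size of 1000 is an arbitrary starting point; the right number is whatever testing on your own data shows.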
If it makes sense to do some "interpretation" on the way in, you might
create some stored procedures that are more 'intelligent.' For
instance, in the domain management area, I have sometimes set up a
"create_domain" stored procedure that processes an Internet domain,
filling in values on several of the tables that are involved,
populating appropriate fields with create-time defaults, and setting
up linkages to status information stored in secondary tables.
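Driving such a procedure from the conversion script is still just text generation. In this sketch the create_domain() signature shown (name, owner, status) is an assumption for illustration — the real procedure would take whatever arguments your schema needs:

```python
def domain_calls(rows):
    """Emit one create_domain() call per legacy row. The three-argument
    signature (name, owner, status) is hypothetical; the procedure itself
    would handle the multi-table inserts, defaults, and status linkages."""
    for name, owner, status in rows:
        args = ", ".join(
            "'" + v.replace("'", "''") + "'" for v in (name, owner, status))
        yield "SELECT create_domain(%s);" % args
```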
If you create a suitable set of stored procedures, it may be possible
to use them to "clean up" aspects of the legacy data that are not
entirely consistent.
If raw speed is vital, as when time is money, and the time between
shutting down the old system and starting up the new one is a costly
period of downtime, then there are ways of speeding things up. You
generate SQL scripts that contain instances of:
COPY this_table from stdin;
[thousands of rows of tab-delimited data...]
\.
That will be short on data validation, which puts a HIGH value on
testing the conversion HEAVILY before using it...
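Generating those COPY blocks safely means applying PostgreSQL's text-format escaping to each field — backslash, tab, and newline all have to be escaped, or a stray tab in the legacy data will shift columns. A sketch, with the table name and rows hypothetical:

```python
def copy_script(table, rows):
    """Build a COPY ... FROM stdin block as text, escaping backslash,
    tab, and newline per PostgreSQL's text COPY format, and ending
    with the \\. terminator."""
    def esc(field):
        return (field.replace("\\", "\\\\")
                     .replace("\t", "\\t")
                     .replace("\n", "\\n"))
    lines = ["COPY %s FROM stdin;" % table]
    for row in rows:
        lines.append("\t".join(esc(f) for f in row))
    lines.append("\\.")
    return "\n".join(lines)
```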
I have pointed to three methodologies:
1. A bunch of INSERTs;
2. Stored procedures tailored to your application;
3. Raw speed via COPY.
It is likely to all be text processing, so whether your favorite text
processing language is Perl, Python, Ruby, Tcl, Awk, or more esoteric
options like Icon, Common Lisp, or M4, that's your call. C/C++/Java
aren't so strong in having nice "syntactic sugar" for string
processing, so they probably aren't nearly as suitable.
--
"cbbrowne","@","ntlug.org"
http://cbbrowne.com/info/linuxdistributions.html
"The reality of the software business today is that if you find
something that can make you ridiculously rich, then that's something
that Microsoft is going to take away from you." -- Max Metral