471,090 Members | 1,334 Online
Bytes | Software Development & Data Engineering Community
Post +

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 471,090 software developers and data experts.

FYI, foxpro to postgresql conversion scripts

These seem to work. I'm posting them now before I'm done with
them before I start messing them up with specifics to my database.

Any comment is welcome. However, I've no intention of publishing
production quality code or documentation, these are my notes
and I've used published them here in the event that
somebody finds them useful.

Postgresql developers might note the hackery required to accomodate
COPY's insertion of escape sequences (and the escape
sequences in the syntax of literal strings, be nice to
turn off \'s per-database.)

This uses the XBase perl module to convert from .dbf to
delimited text files.


Postgresql can import delmited text fields (COPY). This means that the
separator and record separator delimters chosen cannot appear in the
data. I choose tab as my field separator and newline as my record
separator. This may not work with all data.

Use the perl program dbfdump to convert individual files.

If your distribution does not contain dbfdump you can hack an
installation of the program in a temporary location by following this

Obtain the code for the XBase module (DBD-XBase-0.241.tar.gz) from
http://www.cpan.org. Place the tarfile in some temporary directory
and cd into that directory.

Install the XBase module (see
http://www.cpan.org/misc/cpan-faq.ht...nstall_private) with: (do
not do this as root)

tar -xzf DBD-XBase-0.241.tar.gz
(# Install XBase module
cd DBD-XBase-0.241
perl Makefile.PL LIB=$(dirname $(pwd)) PREFIX=$(dirname $(pwd))
make test
make install
(# Customize dbfdump to account for our strange install
cd bin
foo=foo.$$ head -n 1 dbfdump | sed "s#\$# -I $(dirname $(pwd))#" >
$foo tail -n +2 dbfdump >> $foo chmod u+w dbfdump cat $foo >
dbfdump chmod u-w dbfdump rm -f $foo )

The XBase version number will vary with the version you have installed.

XBase requires on the perl DBD module. If you don't have that I
imagine you can so something similar to install it.

You should now be able to run the dbfdump program by explictly
supplying a full pathname. Test this by typing:

yourtempdir/bin/dbfdump --help

When you're done with dbfdump you can uninstall by deleting your
temporary directory.

The remainder of this document does not fully qualify the path to the
dbfdump program. If you would like to simply cut and paste the code,
you can (temporarly) alter your $PATH as follows:

export PATH=$PATH:yourtempdir/bin


# Convert foxpro files to tab delimited text files
# Put all your dbf related files into a directory, in this case dbfdir.
# Make a directory for converted output, in this case txtdir.

export PATH=$PATH:./xbase_bin

for suffix in DBF dbf ; do
for f in $(ls dbfdir/*.$suffix) ; do
# Choose a character that doesn't appear in the data as the
# field separator. I choose tab.
dbfdump -fs $'\t' $f > txtdir/$(basename $f $suffix)txt done

file: loader

# load the data from text files

# Needs directories (in the pwd):
# dbfdir all the foxpro .dbf files
# txtdir creates tab delimited text files here
# rejects tab delimited text files of rows that won't convert

# We deal with bad data as follows:
# The raw data row is inserted into a temporary table.
# Then we start a transaction, insert the data into the
# real table and delete the raw data from the temporary table.
# If the real data does not insert, then the whole tranaction
# is rolled back and the raw data is left in the temporary
# table. After completing the load, the bad data is left
# in the temporary table and we dump that back out to the
# filesystem.

function regular_load () {
export table=$1
#total hack, sed command to change old col names to new
export colfilter="$2"

if [ -e dbfdir/${table}.dbf ] ; then
export t=$table.dbf
export t=$table.DBF

# Get the columns and the order in which they appear.
export cols="$(
./xbase_bin/dbfdump --info dbfdir/$t \
| awk '/[1-9][0-9]*\./ {printf("%s%s", sep, $2); sep=", "}'

if [ -n "$colfilter" ] ; then
# Convert the old column names to new column names.
cols="$(echo $cols | $colfilter)";

# Get the data types that go with the columns.
export types="$(
./xbase_bin/dbfdump --info dbfdir/$t \
| awk '/[1-9][0-9]*\./ {printf("%s%s", sep, $3); sep=" "}'

awk -v cols="$cols" -v types="$types" -v table="$table" -v
pwd="$(pwd)" '
function fix_escapes(s) {
# Double all backslashes and single quotes.
# See the GNU awk users guide
return gensub("\\\\", "\\\\\\\\", "g",
gensub("\x27", "\x27\x27", "g", s));
BEGIN {split(types, t);
FS = "\t";
# Create the temporary table.
print "CREATE TEMP TABLE rejects(id INT, rawdata TEXT);";
id = 1;
{# First insert the raw data into the temp table.
printf("INSERT INTO rejects (id, rawdata)
values(%s,\x27%s\x27);\n" \
, id \
, fix_escapes($0));

# Then update the table in a transaction that also removes
# the raw row. Either it all succeeds or it does not.
printf("insert into " table "(" cols ") values (");
sep = "";
for (i = 1; i <= NF; i++) {
printf("%s", sep);
if ($i == "") {
# No data, try inserting a NULL value
} else {
# There is data.
if (t[i] != "N") {
# Data is not numeric, needs quotes.
printf("\x27"); # single quote char
# Ta Da! The data.
# Double all backslashes and single quotes.
printf("%s", fix_escapes($i));
if (t[i] != "N") {
# Data is not numeric, needs closing quotes.
printf("\x27"); # single quote char
sep = ", ";
print ");";
# Delete the raw data from the temp table.
printf("DELETE FROM rejects WHERE id = %s;\n", id);
END {# Dump the rejects back to a file.
print "\\copy rejects (rawdata) TO \x27" pwd "/rejects/" \
table ".txt\x27";
}' \
txtdir/$table.txt \
| psql -d example_db -h server.example.com -f - -U example_user

# Convert rejected data from sql COPY TO format back to tab separated
# data so we can fix it and feed it right back to this program.
# (Welcome to escape hell.)
# Note that this assumes that there are no vertical tabs, newlines,
# or other such characters that COPY would mung.
mv rejects/$table.txt rejects/$tfile
sed 's/\\\\/\\/g' rejects/$tfile | sed 's/\\t/'$'\t''/g' >
rm rejects/$tfile

regular_load acts 'sed s/OLD/RETIRED/'

for t in STATUSES BSTATS MSTATS dcauses rnktypes ; do
regular_load $t
Karl <ko*@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #1
0 2819

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

1 post views Thread by Otis Green | last post: by
3 posts views Thread by Yevgeny Ioffe | last post: by
2 posts views Thread by Gerard M. Operana | last post: by

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.