LOAD & IMPORT results in different disk space, occupied by a table: why?

Recently I became interested in this question: does data bulk-loaded into a
table with the LOAD utility consume the same disk space as data loaded with
the IMPORT utility? The answer turned out to be NO!

Here is a nutshell description of the test. The testing was done on
"DB2/LINUX 8.2.3".

Tables for tests:
F4106 has 5203 rows, 32 columns.
F42199 has 1399252 rows, 245 columns.

Load command:
load client from '/home/share/tabXXXX.ixf' of ixf insert into proddta.fXXXX NONRECOVERABLE
Import command:
import from '/home/share/tabXXXX.ixf' of ixf insert into proddta.fXXXX

Between loads I used the following commands to truncate a table under
investigation and clear statistics:

ALTER TABLE PRODDTA.fXXXX ACTIVATE NOT LOGGED INITIALLY WITH EMPTY TABLE;
RUNSTATS on table PRODDTA.fXXXX

After load I used the same RUNSTATS as above to get the "used pages"
counter (npages) in syscat.tables.

Here are the results:

syscat.tables, npages:
----------------------
TABLE      IMPORT     LOAD
-------    ------   ------
F4106         372      401
F42199     694862   700326

One can see that the disk space occupied by data loaded with the LOAD
utility is slightly greater than that of the same data loaded with IMPORT.
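
To put "slightly greater" into numbers, here is a small sketch that computes the extra pages and the relative overhead from the npages values above. The function name load_overhead is only illustrative; the page counts are the ones reported in this post.

```python
# Page counts (npages) reported in the post, per table.
results = {
    "F4106": {"import": 372, "load": 401},
    "F42199": {"import": 694862, "load": 700326},
}


def load_overhead(import_pages, load_pages):
    """Return (extra pages, extra pages as a percentage of the IMPORT count)."""
    extra = load_pages - import_pages
    return extra, 100.0 * extra / import_pages


for table, pages in results.items():
    extra, pct = load_overhead(pages["import"], pages["load"])
    print(f"{table}: LOAD used {extra} more pages ({pct:.2f}% over IMPORT)")
```

The difference is 29 pages (about 7.8%) for the small table and 5464 pages (about 0.79%) for the large one, so the relative overhead is much bigger on the small table.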

If anybody understands this, please explain.

Cheers,
--
Konstantin Andreev.
Jul 5 '06 #1
5 Replies


The load utility loads the data in blocks (or pages) in the same format they
were exported, even if there are pages which are not completely full when
the data is exported. This is done for reasons of speed and efficiency.

The import utility processes the data by row and performs an insert for each
row, so that it can use all the sequential space in target table without
leaving any unused space on a page.
Jul 5 '06 #2

Mark A wrote:
> Here are the results:
>
> syscat.tables, npages:
> ----------------------
> TABLE      IMPORT     LOAD
> -------    ------   ------
> F4106         372      401
> F42199     694862   700326
>
> The load utility loads the data in blocks (or pages) in the same format they were exported, even if there are pages which are not completely full when the data is exported.

That cannot be true. One reason and one confirmation:

- The intermediate data formats (DEL, IXF) are intended to be interoperable. This allows moving data between different platforms and on-disk structures. By definition, such a format does not contain page or block information. Thus the LOAD operation must reconstruct the data blocks specifically for the target platform.

- I just checked: the source table F42199, when exported, occupied npages=1399252, fpages=1399430. If you were right, the LOAD'ed table would occupy the same number of pages, but it occupies just half of them. This is because the target table uses the VALUE COMPRESSION option. Thus, the data pages *were* reconstructed by LOAD.
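
The "just half" claim can be checked with quick arithmetic. This is only a sketch: it assumes the LOAD'ed page count is the 700326 figure reported for F42199 in the first post, and the helper name rows_per_page is illustrative.

```python
rows = 1_399_252            # row count of F42199, from the first post
source_npages = 1_399_252   # npages of the exported source table
loaded_npages = 700_326     # npages after LOAD (assumed: figure from the first post)


def rows_per_page(row_count, npages):
    """Average number of rows stored on each used page."""
    return row_count / npages


ratio = loaded_npages / source_npages
print(f"source: {rows_per_page(rows, source_npages):.2f} rows per page")
print(f"target: {rows_per_page(rows, loaded_npages):.2f} rows per page")
print(f"target page count is {ratio:.1%} of the source page count")
```

The source held one row per page on average, while the target holds about two, which a utility that copied exported pages as-is could not produce.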

Cheers,
--
Konstantin Andreev.
Jul 6 '06 #3

Let me amend my response to be more accurate.

The load utility loads data a page at a time. It takes the data from the
input file and formats the pages to be loaded. New rows are not placed on
existing pages.

The import utility does regular SQL inserts, and therefore may use existing
space on pages that already have some rows, but where the page is not full.
Jul 6 '06 #4

Mark A wrote:
> > Thus LOAD operation must reconstruct any data blocks specifically for target platform.
>
> Let me amend my response to be more accurate.
>
> The load utility loads data a page at a time. It takes the data from the input file and formats the pages to be loaded. New rows are not placed on existing pages.
>
> The import utility does regular SQL inserts, and therefore may use existing space on pages that already have some rows, but where the page is not full.

Sounds reasonable. Let me expand the proposed scenario a bit, to check my understanding.

- Sometimes the next data row in the sequence can't fit on the page currently being constructed, so the LOAD utility flushes that page to disk and forgets about it. Later, among the rows still to come, there may be one short enough to fit on the flushed page, but LOAD has to place it on a new page. I also expect that if all rows have equal lengths, then the page counts used by LOAD and IMPORT will be equal too.
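
The scenario above can be sketched as a toy model. This is not DB2's actual page-allocation logic: it ignores page headers, slot directories and free-space settings, assumes every row fits on an empty page, and the function names and sizes are made up for illustration.

```python
def pages_append_only(row_sizes, page_size):
    """LOAD-like toy model: fill one page at a time, never revisit a flushed page."""
    pages = 0
    free = 0  # free bytes on the page currently being built
    for size in row_sizes:
        if size > free:
            pages += 1        # flush the current page and start a new one
            free = page_size
        free -= size
    return pages


def pages_first_fit(row_sizes, page_size):
    """IMPORT-like toy model: put each row on the first page with enough free space."""
    free_per_page = []
    for size in row_sizes:
        for i, free in enumerate(free_per_page):
            if size <= free:
                free_per_page[i] = free - size
                break
        else:
            # No existing page has room, so allocate a new one.
            free_per_page.append(page_size - size)
    return len(free_per_page)


mixed = [60, 50, 40, 50]      # hypothetical row sizes in bytes
print(pages_append_only(mixed, 100), pages_first_fit(mixed, 100))

equal = [100] * 1000          # all rows the same length
print(pages_append_only(equal, 4096), pages_first_fit(equal, 4096))
```

With the mixed sizes, the 40-byte row arrives after the first page has been flushed, so the append-only model needs 3 pages while first-fit needs 2. With equal-length rows both models need 25 pages, which matches the expectation stated above.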

Please correct me if I am floundering.

Thank you,
--
Konstantin Andreev.
Jul 7 '06 #5

Data loaded via the load utility is formatted into pages by the load utility
and then stored in the table a page at a time. This is done outside of the
normal SQL engine. Existing pages are not used for adding the data, and only
new pages are created. It has nothing to do with whether rows will fit in
existing pages; it is done that way for speed. Because the SQL engine is not
used by the load utility, insert triggers will not fire for new rows added
to the table.

Imports are done by submitting regular inserts through the SQL engine and
therefore the rows may end up being inserted on existing pages where there
is space available. Insert triggers are fired, and all data is logged just
like normal SQL.

Therefore it is possible that the import uses less total space than load.
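
A rough sketch of that difference for a table that already has partly filled pages. It is a simplified model under stated assumptions, not the real DB2 algorithm: existing_free is a hypothetical list of free bytes per existing page, the function names are invented, and every row is assumed to fit on an empty page.

```python
def load_into(existing_free, row_sizes, page_size):
    """Load-like model: existing pages are left alone, rows go only to new pages."""
    new_pages = 0
    free = 0  # free bytes on the new page currently being built
    for size in row_sizes:
        if size > free:
            new_pages += 1
            free = page_size
        free -= size
    return len(existing_free) + new_pages


def import_into(existing_free, row_sizes, page_size):
    """Import-like model: each insert may reuse free space on any existing page."""
    free_per_page = list(existing_free)  # copy so the caller's list is not modified
    for size in row_sizes:
        for i, free in enumerate(free_per_page):
            if size <= free:
                free_per_page[i] = free - size
                break
        else:
            # No page has room for this row, so a new page is added.
            free_per_page.append(page_size - size)
    return len(free_per_page)


existing = [30, 70]      # hypothetical free bytes on two existing pages
new_rows = [30, 70, 50]  # hypothetical sizes of the incoming rows
print(load_into(existing, new_rows, 100), import_into(existing, new_rows, 100))
```

In this example the load-like model ends with 4 pages and the import-like model with 3, because two of the incoming rows fit into space that was already allocated.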
Jul 7 '06 #6