Connecting Tech Pros Worldwide Forums | Help | Site Map

SQL server 7 physical storage

Pablo Wolfus
Guest
 
Posts: n/a
#1: Jul 23 '05
I need to bulk insert very large amount of data into several MSSQL
tables.
The first Data model definition used identities to mantain relationship
between those tables but we found that natural keys (compound) are
better for
bulk insert (there is no need to obtain the identity first)

My question is, changing the identities to natural keys (in some tables
in
order of 4, 5 attributes) will enlarge my database storage?

I think MSSQL implements relationships with pointers (or hashcodes), so
the
storage size will be similar, right?

Regards,


Simon Hayes
Guest
 
Posts: n/a
#2: Jul 23 '05

re: SQL server 7 physical storage



"Pablo Wolfus" <pwolfus@gmail.com> wrote in message
news:1116526548.511905.314760@g43g2000cwa.googlegr oups.com...[color=blue]
>I need to bulk insert very large amount of data into several MSSQL
> tables.
> The first Data model definition used identities to mantain relationship
> between those tables but we found that natural keys (compound) are
> better for
> bulk insert (there is no need to obtain the identity first)
>
> My question is, changing the identities to natural keys (in some tables
> in
> order of 4, 5 attributes) will enlarge my database storage?
>
> I think MSSQL implements relationships with pointers (or hashcodes), so
> the
> storage size will be similar, right?
>
> Regards,
>[/color]

It probably depends on where you have your indexes, and particularly your
clustered index. If it used to be on the artificial key but is now on the
natural key, then it will be bigger - the table data is still in the leaf
level, but the size of the non-leaf levels will have increased. Of course,
if you remove the existing IDENTITY columns, then you'll save some space
there (4 bytes per row, assuming it's integer type, plus the index). But you
seem to be saying that you've made this change already, so presumably you
can just check the table sizes directly?

Even if the table size does increase noticeably, since disks are so cheap
compared to other system components, it's worth using more space if it gives
you a cleaner data model and better performance.

Simon


Pablo Wolfus
Guest
 
Posts: n/a
#3: Jul 23 '05

re: SQL server 7 physical storage


First of all, thanks for your quick response.

I am at design phase now, so i can't test the impact of this changes.
The modification i'm trying to apply is to the design data model
specification.

I didnīt understand what you meant with "the table data is still in
the leaf level". I know there will be a slight growth if i use natural
keys (due to references) but i need to measure this growth, that is, i
need to know if -fisically- the FK consisting of several data columns
(resident in another master table) is stored in both tables.

If the answer is true, what is the sense that those columns data are
duplicated?

Regards,

Erland Sommarskog
Guest
 
Posts: n/a
#4: Jul 23 '05

re: SQL server 7 physical storage


Pablo Wolfus (pwolfus@gmail.com) writes:[color=blue]
> I didnīt understand what you meant with "the table data is still in
> the leaf level".[/color]

Simon was talking about a clustered index. In a clustered index, the
data is in the leaf level of the index.
[color=blue]
> I know there will be a slight growth if i use natural
> keys (due to references) but i need to measure this growth, that is, i
> need to know if -fisically- the FK consisting of several data columns
> (resident in another master table) is stored in both tables.[/color]

Yes, they are.
[color=blue]
> If the answer is true, what is the sense that those columns data are
> duplicated?[/color]

That is a question that can lead into an answer of quite a philosophical
nature. But I let it suffice to point out two things:

1) Simplicity. Having each table in a space of its own, makes the design
of the storeage engine considerably simpler. And that pays back to
users with improved performance and stability.
2) Conserving space is not all. Sometimes, using more space can improve
performance. Consider:

SELECT * FROM tbl_with_many_foriegn_keys

Assumed that SQL Server did not duplicate the FK columns in this
table. Then this query would have to access the data pages of
the referred tables, and this could be very costly.

As for the design question, using artificial keys for relations can
sometimes be useful when the number of columns gets difficult to manage.
But adding an artificial key can also make queries much more complex.
The space consumption is only one aspect of it all. Ease of development
and performance are others.



--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Closed Thread