By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
435,105 Members | 2,569 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 435,105 IT Pros & Developers. It's quick & easy.

Database Design Q: Many Rows or Many Tables?

P: n/a
Hi there

I'm in the planning stages of creating a database, and I have two options
here. Which makes more sense, and/or provides better performance for
queries - a single table with hundreds of thousands of rows, or many
tables with less rows?

GOAL:
Recording information about buildings, storey by storey. Basically I
will be recording six bits of information about each storey in each
building.

SETUP:
There will be one table for Buildings for sure. The big question is about
how I should record those six details per storey.

OPTIONS:
A) One table that records building ID, storey number, and the six
measurements per row.
B) Six tables that record building ID, storey number, and the appropriate
measurement.
Eventually, there could be a few thousand buildings in the database, and
each building may have anywhere from 5 to 50 storeys. Not to mention that
"old information" is to be kept and archived as new information about
storeys is added.

This means with option A, I can expect to have a table of a few hundred
thousand rows.... (obviously i would index the BuildingID and Storey
Number). Sounds like a lot(!), but it means that I could retreive/select
information I need with one single query and no table joins.

If I go with option B, I can expect to have to do six table joins/queries
when retreiving information about one storey in one building, however
there will be less records to sift through.
Which is the better option to take, and why? By 'better', I am referring
to database performance/speed, updates and other future maintenance on
the database. All opinions welcome.

Thanks,
GM
,
Jul 20 '05 #1
Share this Question
Share on Google+
7 Replies


P: n/a
Good Man wrote:
Hi there

I'm in the planning stages of creating a database, and I have two options
here. Which makes more sense, and/or provides better performance for
queries - a single table with hundreds of thousands of rows, or many
tables with less rows?

GOAL:
Recording information about buildings, storey by storey. Basically I
will be recording six bits of information about each storey in each
building.

SETUP:
There will be one table for Buildings for sure. The big question is about
how I should record those six details per storey.

OPTIONS:
A) One table that records building ID, storey number, and the six
measurements per row.
B) Six tables that record building ID, storey number, and the appropriate
measurement.
Eventually, there could be a few thousand buildings in the database, and
each building may have anywhere from 5 to 50 storeys. Not to mention that
"old information" is to be kept and archived as new information about
storeys is added.

This means with option A, I can expect to have a table of a few hundred
thousand rows.... (obviously i would index the BuildingID and Storey
Number). Sounds like a lot(!), but it means that I could retreive/select
information I need with one single query and no table joins.

If I go with option B, I can expect to have to do six table joins/queries
when retreiving information about one storey in one building, however
there will be less records to sift through.


Why would there be less rows with option B? With each scenario, you'll still
have one row per storey, surely? So in fact, option B will be a far larger
database, because you'll be storing each building ID and storey number six times.

Option A is far more sensible, because it'll be smaller, you won't have to do
any table joins, and the data is all in one place, which is easier to maintain
in the long run.

Oli
Jul 20 '05 #2

P: n/a
Oli Filth <ol*******@eatspam.coldmail.com> wrote in
news:r0**************@newsfe1-gui.ntli.net:
Why would there be less rows with option B? With each scenario, you'll
still have one row per storey, surely? So in fact, option B will be a
far larger database, because you'll be storing each building ID and
storey number six times.

Option A is far more sensible, because it'll be smaller, you won't
have to do any table joins, and the data is all in one place, which is
easier to maintain in the long run.


Hi... I left something out regarding my database design structure (option
A may actually record more than one row per story) but still you bring up
a good point - there will still be a similar number of total rows
regardless of which option I choose. Funny how I missed that :P

Thanks for your help... I suppose my real concern was making sure that
MySQL can go through several hundred thousand rows (what about
millions?), but I suppose it can if I have the required disk space and I
index my tables properly...

PS: Regarding indexing, do I just set up the table with an Index, or do I
need to index the table every month or so to make sure new records are
being indexed?

Thanks for the advice...

GM
Jul 20 '05 #3

P: n/a
Good Man wrote:
Thanks for your help... I suppose my real concern was making sure that
MySQL can go through several hundred thousand rows (what about
millions?), but I suppose it can if I have the required disk space and I
index my tables properly...

PS: Regarding indexing, do I just set up the table with an Index, or do I
need to index the table every month or so to make sure new records are
being indexed?

Thanks for the advice...

GM


Hi

I've never used a MySQL table with more than a few hundred records, but I guess
it still works fine with thousands or millions (just that queries will take much
longer).

If you set up a particular field in a table as an index, MySQL will
automatically handle indexing of new insertions. However, if you're frequently
deleting or altering existing records then the table becomes fragmented, so you
should run a "OPTIMIZE TABLE table_name" query every so often, which defrags the
table and re-sorts indexes.

If you're going to be running tables with hundreds of thousands of records, it'd
be well worth thinking about optimisations that you can use for your table
design and the way that you form queries. For instance, use fixed-length CHAR
strings rather than VARCHAR or TEXT. Try
http://dev.mysql.com/doc/mysql/en/My...imization.html for loads of stuff on
optimisation.

Hope this helps,
Oli
Jul 20 '05 #4

P: n/a
Good Man wrote:
Hi there

I'm in the planning stages of creating a database, and I have two options
here. Which makes more sense, and/or provides better performance for
queries - a single table with hundreds of thousands of rows, or many
tables with less rows?

GOAL:
Recording information about buildings, storey by storey. Basically I
will be recording six bits of information about each storey in each
building.

SETUP:
There will be one table for Buildings for sure. The big question is about
how I should record those six details per storey.

OPTIONS:
A) One table that records building ID, storey number, and the six
measurements per row.
B) Six tables that record building ID, storey number, and the appropriate
measurement.
Eventually, there could be a few thousand buildings in the database, and
each building may have anywhere from 5 to 50 storeys. Not to mention that
"old information" is to be kept and archived as new information about
storeys is added.

This means with option A, I can expect to have a table of a few hundred
thousand rows.... (obviously i would index the BuildingID and Storey
Number). Sounds like a lot(!), but it means that I could retreive/select
information I need with one single query and no table joins.

If I go with option B, I can expect to have to do six table joins/queries
when retreiving information about one storey in one building, however
there will be less records to sift through.
Which is the better option to take, and why? By 'better', I am referring
to database performance/speed, updates and other future maintenance on
the database. All opinions welcome.

Thanks,
GM
,


I would say that A is your best solution, but only if you are storing
one peice of information about the building, meaning Building ID. If
you want to store other peices of information about the building, like
street address, then you need a different approach... Option C.

Option C, requires 2 tables... Buildings, and Storeys. Tables are
listed below, and the fields are indented.

Buildings
BuildingID
Address1
Address2
City
... (other fields)
Storeys
BuildingID (FK to Building/BuildingID)
StoreyNumber
BitOfInfo1
BitOfInfo2
...
BitOfInfo6
John
Jul 20 '05 #5

P: n/a
B is better as long as you have a normalized data model.

Fewer tables mean less relational and mor records meaning more system
resources are used for each query

For large volumes do not use MySQL as it is not that scalable

Postgres is much faster

Good Man wrote:
Hi there

I'm in the planning stages of creating a database, and I have two options
here. Which makes more sense, and/or provides better performance for
queries - a single table with hundreds of thousands of rows, or many
tables with less rows?

GOAL:
Recording information about buildings, storey by storey. Basically I
will be recording six bits of information about each storey in each
building.

SETUP:
There will be one table for Buildings for sure. The big question is about
how I should record those six details per storey.

OPTIONS:
A) One table that records building ID, storey number, and the six
measurements per row.
B) Six tables that record building ID, storey number, and the appropriate
measurement.
Eventually, there could be a few thousand buildings in the database, and
each building may have anywhere from 5 to 50 storeys. Not to mention that
"old information" is to be kept and archived as new information about
storeys is added.

This means with option A, I can expect to have a table of a few hundred
thousand rows.... (obviously i would index the BuildingID and Storey
Number). Sounds like a lot(!), but it means that I could retreive/select
information I need with one single query and no table joins.

If I go with option B, I can expect to have to do six table joins/queries
when retreiving information about one storey in one building, however
there will be less records to sift through.
Which is the better option to take, and why? By 'better', I am referring
to database performance/speed, updates and other future maintenance on
the database. All opinions welcome.

Thanks,
GM
,

Jul 20 '05 #6

P: n/a
peter wrote:
B is better as long as you have a normalized data model.

Fewer tables mean less relational and mor records meaning more system
resources are used for each query


Doesn't apply in the OP's case, as his "option B" would literally be his "option
A" split into six.

i.e. For each storey:

"option A":

Table1: ID, StoreyNum, Data1, Data2, Data3, Data4, Data5, Data6
======

"option B":

Table1: ID, StoreyNum, Data1
======
Table2: ID, StoreyNum, Data2
======
Table3: ID, StoreyNum, Data3
======
Table4: ID, StoreyNum, Data4
======
Table5: ID, StoreyNum, Data5
======
Table6: ID, StoreyNum, Data6
======

As the OP has said that each storey will always be associated with six fields of
data, "option B" will take up more space, as ID and StoreyNum will be stored six
times each, *and* he'd have to do a six-table join for every query. There's no
way that's more efficient.

Oli
Jul 20 '05 #7

P: n/a
Good Man <he***@letsgo.com> wrote:
Which is the better option to take, and why? By 'better', I am referring
to database performance/speed, updates and other future maintenance on
the database. All opinions welcome.


I'd go with option A.

You didn't mention what you were searching on? If it's building ID or
storey ID, I wouldn't worry too much about table size. A modern database
should be able to search fixed keys from a table of millions in under
1 second. (where you get into trouble is like('%query%') )

The trick is in understanding how the database interprets the query. Fixed
index lookups are fast. Scanning rows of data is slow, so, try and minimize
rows you'll likely need to scan.

Don't be tempted to create indices w/out first populating the data, an index at the
wrong place & time can slow down the database (particularly inserts).

A good rule to follow is never, ever, EVER duplicate data. Ie, don't store
the same bit of information twice. For instance, don't copy stuff around, use
links: buildings AND stories.

buildings ======
bid | name | info that applies globally to same building

storey ======
sid | bid | info1 | info2 | info3 | info that applies only to storey.
Where building->bid = storey->bid.

That way, if you need to scan for a building name, you won't have to wade
through as much data. Once you find the building name, you can use it's
building ID (bid) to greatly minimize the rows from storeys that you'll be
scanning. (since 'bid' would be indiced and a FIXED query. (bid=code, NOT bid
like('%code%))

You've got: select * from storeys where bid=100 AND info2 LIKE('%str%')

The database should be smart enough to realize it can use it's bid index
to minimize the rows passed on to the like() clause. So, even if your stories
table has 50 buildings with 50 stories each, you'll only end up scanning
100 rows. (50 from the building table to get the BID from the name, and 50
from the stories table that were reduced from the index)

(Experience has shown me to hold off on indices until they are actually
required)

Jamie
--
http://www.geniegate.com Custom web programming
gu******@lnubb.pbz (rot13) User Management Solutions
Jul 20 '05 #8

This discussion thread is closed

Replies have been disabled for this discussion.