469,929 Members | 2,070 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,929 developers. It's quick & easy.

Which is better? Int or smalldatetime(MSSQL)

I have a table that contains transactional data. Such as site view by
whom, when, which template, etc, etc...
Everytime when I pulled the report, hh:mm:ss never matters. Only
breakdown by dates, not time.
Having read some chapters of Ralph Kimball's book, I am inspired to
build "date" table with integer as primary key.

Here's what I have for schema of transactional table.

- viewed_customer_id int (4bytes)
- template_id uniqidentifier (16 bytes)
- viewed_on datetime (8 bytes)

And here's the version I am thinking of building

- viewed_customer_id int (4bytes)
- template_id uniqidentifier (16 bytes)
- viewed_date_key int (4 bytes)
- seconds int (4 bytes)

* I put seconds just in case I need to retrieve hour based or minute
based report.

Here's my question. I've also noticed that smalldatetime is also 4bytes
of memory but it consists of 2 sets of 2 bytes.

When I index, would there be significant performance difference between
indexing 4 byte of column and indexing 2bytes x 2 of column?

Jul 23 '05 #1
7 4504
Bostonasian (ax****@gmail.com) writes:
I have a table that contains transactional data. Such as site view by
whom, when, which template, etc, etc...
Everytime when I pulled the report, hh:mm:ss never matters. Only
breakdown by dates, not time.
Having read some chapters of Ralph Kimball's book, I am inspired to
build "date" table with integer as primary key.

Here's what I have for schema of transactional table.

- viewed_customer_id int (4bytes)
- template_id uniqidentifier (16 bytes)
- viewed_on datetime (8 bytes)

And here's the version I am thinking of building

- viewed_customer_id int (4bytes)
- template_id uniqidentifier (16 bytes)
- viewed_date_key int (4 bytes)
- seconds int (4 bytes)
The problem with using int for dates is that you might run into confusion
what your zero date is. In SQL Server the base date is 1900-01-01. In
Visual Basic etc it's 1899-12-30. In Unix it's 1970-01-01.

While datetime is not ideal for dates-only data, it's fairly easy
to use. In our system we actually have a type for it:

EXEC sp_addtype 'aba_date', 'datetime'
go
CREATE RULE aba_date_rule AS convert(char(8), @x, 112) = @x
go
EXEC sp_bindrule 'aba_date_rule', 'aba_date'
go

So it's not even possible to sneak in any seconds there. To strip
Hours and seconds from a value, this is the deal:

convert(char(8), date_value, 112)

(It's important to use format 112, as the resulting string will
always be converted back to datetime correctly.)
Here's my question. I've also noticed that smalldatetime is also 4bytes
of memory but it consists of 2 sets of 2 bytes.

When I index, would there be significant performance difference between
indexing 4 byte of column and indexing 2bytes x 2 of column?


4 bytes is 4 bytes. While smalldatetime may be described as 2+2, it's
nevertheless just another 32 bits when it comes to indexing.

As for whether smalldatetime is an option to datetime, it depends on
your business requirements. If all you use it for is tracking events,
then it might do. We have abandoned smalldatetime in our system, save
for some auditing columns, because we have encountered real-world
data beyond 2076.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #2
Thanks for the advice.
I got that "date" table from Ralph Kimball's "Data Warehouse tookit",
from "Retail" chapter.
I guess in the retail. It's crucial to know sales date was weekday or
weekend? Was is holiday? Which fiscal month/year it was, etc, etc...
Beyond reporting by only dates, there are occasions where I need to
come up with analysis(though it was not often).

Thanks for the tips.

Jul 23 '05 #3
On 30 Mar 2005 07:54:35 -0800, Bostonasian wrote:
Thanks for the advice.
I got that "date" table from Ralph Kimball's "Data Warehouse tookit",
from "Retail" chapter.
I guess in the retail. It's crucial to know sales date was weekday or
weekend? Was is holiday? Which fiscal month/year it was, etc, etc...
Beyond reporting by only dates, there are occasions where I need to
come up with analysis(though it was not often).

Thanks for the tips.


Hi Bostonasian,

Having a table with all dates and various properties of each date (such
as weekend, public holiday, reporting period, etc.) is good. Using an
integer to represent the data is not.

To save myself the typing, I'll just refer you to www.aspfaq.com/2519,
where it is all explained in full detail.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 23 '05 #4

"Hugo Kornelis" <hugo@pe_NO_rFact.in_SPAM_fo> wrote in message
news:jh********************************@4ax.com...
On 30 Mar 2005 07:54:35 -0800, Bostonasian wrote:
Thanks for the advice.
I got that "date" table from Ralph Kimball's "Data Warehouse tookit",
from "Retail" chapter.
I guess in the retail. It's crucial to know sales date was weekday or
weekend? Was is holiday? Which fiscal month/year it was, etc, etc...
Beyond reporting by only dates, there are occasions where I need to
come up with analysis(though it was not often).

Thanks for the tips.


Hi Bostonasian,

Having a table with all dates and various properties of each date (such
as weekend, public holiday, reporting period, etc.) is good. Using an
integer to represent the data is not.

To save myself the typing, I'll just refer you to www.aspfaq.com/2519,
where it is all explained in full detail.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)


While I agree that datetypes should be as close to what they represent,
in the book he refers to the fact table with the detailed data, has an
integer as a key ID to a seperate date table.

The actual datetime value, and all the necesary flags and groupings ( fiscal
quarter, date of week, holiday, etc..) is stored in that side car or
outrigger table.

Since current production versions of SQL don't have a seperate date
datatype, I tend to use smalldatetime instead.

With the side car table, a text representation can also be done to make
reporting easier
i.e. FullDateName column could be "Wednesday, February 31st 2007" , but
sorting is done by actual datetime datatype column.

Jul 23 '05 #5
On Thu, 31 Mar 2005 04:34:02 GMT, David Rawheiser wrote:

(snip quoteback)
While I agree that datetypes should be as close to what they represent,
in the book he refers to the fact table with the detailed data, has an
integer as a key ID to a seperate date table.

The actual datetime value, and all the necesary flags and groupings ( fiscal
quarter, date of week, holiday, etc..) is stored in that side car or
outrigger table. (snip)

Hi David,

I don't have this book, but what I'd like to know is what advantage the
integer ID key of this side car table has over a similar side table with
a datetime (or smalldatetime) key, with a CHECK constraint to ensure
that only the default time portion (midnight) is allowed.

Since current production versions of SQL don't have a seperate date
datatype, I tend to use smalldatetime instead.
As long as you're aware that smalldatetime has a time portion as well
(though with less precision) and that dates after June 6, 2079 can't be
represented, this is fine.

With the side car table, a text representation can also be done to make
reporting easier
i.e. FullDateName column could be "Wednesday, February 31st 2007" , but
sorting is done by actual datetime datatype column.


I agree - but that can equally well be done if the side car table uses a
[small]datetime column os primary key.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 23 '05 #6

"Hugo Kornelis" <hugo@pe_NO_rFact.in_SPAM_fo> wrote in message
news:je********************************@4ax.com...
On Thu, 31 Mar 2005 04:34:02 GMT, David Rawheiser wrote:

(snip quoteback)
While I agree that datetypes should be as close to what they represent,
in the book he refers to the fact table with the detailed data, has an
integer as a key ID to a seperate date table.

The actual datetime value, and all the necesary flags and groupings (
fiscal
quarter, date of week, holiday, etc..) is stored in that side car or
outrigger table.

(snip)

Hi David,

I don't have this book, but what I'd like to know is what advantage the
integer ID key of this side car table has over a similar side table with
a datetime (or smalldatetime) key, with a CHECK constraint to ensure
that only the default time portion (midnight) is allowed.

Since current production versions of SQL don't have a seperate date
datatype, I tend to use smalldatetime instead.


As long as you're aware that smalldatetime has a time portion as well
(though with less precision) and that dates after June 6, 2079 can't be
represented, this is fine.

With the side car table, a text representation can also be done to make
reporting easier
i.e. FullDateName column could be "Wednesday, February 31st 2007" , but
sorting is done by actual datetime datatype column.


I agree - but that can equally well be done if the side car table uses a
[small]datetime column os primary key.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)


I depends on how you are using the date.

If you don't need to present the items by grouping by month, sales period,
fiscal quarters, day of week, etc.
or if you don't want to include weekends or holidays or full moons, etc...
in your reports,
you either need to:
a) put a boat load of complex code in your procedures or functions to
determine these things.
b) create a date calendar table, add columns and set the flags on each of
these atributes.

Thus the resulting query is a simple join from the fact table to this
calendar table and include/exclude in the where clause or groupby those you
need.

You can use the date datatype itself as the keys, and get the same results
but in the book it tends to use abstract datatypes for the fact table
attributes.

http://www.amazon.com/exec/obidos/tg...&s=ebooks&st=*
Jul 23 '05 #7
On Fri, 01 Apr 2005 04:30:36 GMT, David Rawheiser wrote:

(snip)
I depends on how you are using the date.

If you don't need to present the items by grouping by month, sales period,
fiscal quarters, day of week, etc.
or if you don't want to include weekends or holidays or full moons, etc...
in your reports,
you either need to:
a) put a boat load of complex code in your procedures or functions to
determine these things.
b) create a date calendar table, add columns and set the flags on each of
these atributes.
Hi David,

I fully agree - and in most cases, the calendar table is the better
option.

Thus the resulting query is a simple join from the fact table to this
calendar table and include/exclude in the where clause or groupby those you
need.
Yep.

You can use the date datatype itself as the keys, and get the same results
but in the book it tends to use abstract datatypes for the fact table
attributes.

http://www.amazon.com/exec/obidos/tg...&s=ebooks&st=*


And this is where I disagrgee with the advise given in the book. Sure,
there are situations where a surrogate key is better than a natural key,
but in this specific case, I fail to see any advantages.

Like I said - I don't have this book (and I don't intend to buy it
either :-P) Do the authors give any reasons for their choice to use an
artificial key here? If so, I'd be interested to hear it!

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 23 '05 #8

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

5 posts views Thread by Bradley.. .. . . .. .. | last post: by
14 posts views Thread by Kukurydz | last post: by
2 posts views Thread by jblankenburg | last post: by
2 posts views Thread by wildfyre53207 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.