By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
455,465 Members | 1,474 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 455,465 IT Pros & Developers. It's quick & easy.

nulls versus ?

P: n/a
Hi,

does anyone know of a resource which discusses the merits of using "nulls"
versus a specific "unknown" value? For example, should I use "null" for an
unknown or unset date, or should I use min-value? For a string (eg a name)
should I use null or an empty string? And so on.

(These sorts of issues crop up a lot in the applications I work on which
mostly involve presenting data on a web interface, and reading and writing
to and from a database).

Thanks,
Peter
May 18 '06 #1
Share this Question
Share on Google+
15 Replies


P: n/a
I think that these depends on the semantics. But a general rule I like
to follow is this:

Null in the database for unknown.
Default values set by the editor when the fields are not completed.

e.g. if name was not completed I put an empty string to the db.
or in the lookuptables I add a value "Not set" and from the editor I
set this as a default value

Tasos

May 18 '06 #2

P: n/a

Hi Peter,

I always create my tables in the database so that all fields are
non-nullable and must have a valid value.
For strings I use an empty string, numeric values vary depending on what
it is for, and for dates I will either use a specific date/time or the
same value as another date in the record.

Nullable fields in a MS SQL table degrades performance, but another
advantage of not allowing null values, is you don't have to check for them.
Paul


Peter Kirk wrote:
Hi,

does anyone know of a resource which discusses the merits of using "nulls"
versus a specific "unknown" value? For example, should I use "null" for an
unknown or unset date, or should I use min-value? For a string (eg a name)
should I use null or an empty string? And so on.

(These sorts of issues crop up a lot in the applications I work on which
mostly involve presenting data on a web interface, and reading and writing
to and from a database).

Thanks,
Peter

May 18 '06 #3

P: n/a
HI,

"Paul Cheetham" <PA******@dsl.pipex.com> wrote in message
news:uy**************@TK2MSFTNGP02.phx.gbl...

Hi Peter,

I always create my tables in the database so that all fields are
non-nullable and must have a valid value.
What you do with a date column? in a escenario where any date accepted by
SQL is an accepted value?
The only way is using null.

and for dates I will either use a specific date/time or the same value as
another date in the record.
So you use incorrect data, only to avoid dealing with null?
Nullable fields in a MS SQL table degrades performance


How ? This is the first time I hear this

--
--
Ignacio Machin,
ignacio.machin AT dot.state.fl.us
Florida Department Of Transportation
May 18 '06 #4

P: n/a
Hi,

"Peter Kirk" <pk@alpha-solutions.dk> wrote in message
news:%2****************@TK2MSFTNGP05.phx.gbl...
Hi,

does anyone know of a resource which discusses the merits of using "nulls"
versus a specific "unknown" value? For example, should I use "null" for an
unknown or unset date, or should I use min-value? For a string (eg a name)
should I use null or an empty string? And so on.


There is not a "one rule fits all" it all depends of the type of data you
are dealing with, the acceptable values and the version of the framework you
use.

I do use null when possible to signal "unknown/unset" values in the DB. In
the code if I can use a default value (like DateTime.MinValue) I use it as
it does work in 1.1 and 2.0.
In case I cannot use a default value , like an integer value . then I create
a wrapper type to signal a null status. This is not longer needed with 2.0
as now you have nullable types.

--
--
Ignacio Machin,
ignacio.machin AT dot.state.fl.us
Florida Department Of Transportation
May 18 '06 #5

P: n/a
>> Nullable fields in a MS SQL table degrades performance,

I never heard that either. In fact, I've heard it the other way: The
non-nullable fields degrade preformance, because the db must now check
each value inserted to make sure it's not null.
but another advantage of not allowing null values, is you don't have to check for them.


Nonsense. You've merely changed the value you have to check for, and
the location of the check. The total work done remains the same,
because you've done nothing with the data entering the system.

In any non-trivial system, some values will be unknowns. Whether you
refer to those unknowns as Null, or "" or -1 or 4-Jul-1776 is relevant.
In each case you have a value you must check for.

May 18 '06 #6

P: n/a
> >> Nullable fields in a MS SQL table degrades performance,

I never heard that either. In fact, I've heard it the other way: The
non-nullable fields degrade preformance, because the db must now check
each value inserted to make sure it's not null.
but another advantage of not allowing null values, is you don't have to check for them.


Nonsense. You've merely changed the value you have to check for, and
the location of the check. The total work done remains the same,
because you've done nothing with the data entering the system.

In any non-trivial system, some values will be unknowns. Whether you
refer to those unknowns as Null, or "" or -1 or 4-Jul-1776 is relevant.
In each case you have a value you must check for.


Indeed, and if the check for null is implemented as an integer compare
against zero, then it's _much_ quicker than doing, perhaps, a
character-by-character compare between strings.

Null is your friend.
May 18 '06 #7

P: n/a
<ja**********@gmail.com> wrote in message
news:11**********************@i40g2000cwc.googlegr oups.com...

I never heard that either. In fact, I've heard it the other way: The
non-nullable fields degrade preformance, because the db must now check
each value inserted to make sure it's not null.


The reason nulls can degrade performance is that the database engine has to
check a bit in the record to determine whether the associated column
contains a null. Also, those bits add to storage requirements.

///ark
May 18 '06 #8

P: n/a
And storing and checking a bit is _more_costly_ than comparing a larger
structure?
May 18 '06 #9

P: n/a
"PIEBALD" <PI*****@discussions.microsoft.com> wrote in message
news:2E**********************************@microsof t.com...
And storing and checking a bit is _more_costly_ than comparing a larger
structure?


Depends on usage. Typically, data is more often read than written. Non-null
columns need checking on write. Null columns need checking on read. Plus the
storage, which can double in the case of a small row size.
May 18 '06 #10

P: n/a
Huh? Which "database engine" are you talking about? Please qualify this
ridiculous statement with some sort of example/evidence.

May 18 '06 #11

P: n/a
>> Non-null columns need checking on write.
Null columns need checking on read.


Nope. I don't see it.
First of all, the Null/Not Null flag must always exist, as even a
not-null column must give some response to a Is-null query.

So, basically, we have two null-related boolean conditions: Can-be-null
(Y/N) and Is-Null (Y/N), with the limitation that "NY" is invalid.
Your contension is that given that limitation, we can, on occasion, not
store the second bit, and derive it's value algorithmically.

But that requires doing lots of work (columns which appear to be a
fixed size can be either of two different sizes), just to save the
storage of one bit. Since there are other boolean flags which need to
be saved, and bits are doled out in packages of 8 (min) or 32 (more
likely), storing the is-null flag for a not-null column is almost
certainly zero cost. (ie, it inflicts a cost only if the number other
flags total an exact multiple of 8)

so, in pseudo-code:
On insert:
if (column nullable)
if (data is null)
field.is_null = t
else
field = data
field.is_null = f
else
if (data is null)
throw exception
else
field = data
field.is_null = f

On Select
response.data = field
response.nullable = column nullable
response.is_null = field.is_null

Given that, I cannot imagine any scenario, where one path would be
significantly faster that another.

May 18 '06 #12

P: n/a
<ja**********@gmail.com> wrote in message
news:11**********************@j55g2000cwa.googlegr oups.com...
So, basically, we have two null-related boolean conditions: Can-be-null
(Y/N) and Is-Null (Y/N), with the limitation that "NY" is invalid.
Sort of. The first condition is stored with the table's metadata, whereas
the second has to be stored with each row.
Your contention is that given that limitation, we can, on occasion, not
store the second bit, and derive its value algorithmically.
Right. If the metadata says the column can't be null, we don't have to store
nullness for that column in each row.
But that requires doing lots of work (columns which appear to be a
fixed size can be either of two different sizes),
In SQL Server, each row with nullable columns includes a bitmap indicating
whether each nullable column actually contains null. See "Inside SQL Server
2000", p. 235.
just to save the
storage of one bit. Since there are other boolean flags which need to
be saved, and bits are doled out in packages of 8 (min) or 32 (more
likely), storing the is-null flag for a not-null column is almost
certainly zero cost. (ie, it inflicts a cost only if the number other
flags total an exact multiple of 8)
Including 0. :) Which is why I said that making a column nullable can double
the size of the row. The null bitmask (probably because it's variable
length) isn't squeezed in with other metadata.
so, in pseudo-code:
On insert:
if (column nullable)
if (data is null)
field.is_null = t
else
field = data
field.is_null = f
else
if (data is null)
throw exception
else
field = data
field.is_null = f


No, field.is_null does not need to be set in the last case. Your example
actually points out something I missed -- that even writing a nullable field
can impose a performance hit.

///ark
May 18 '06 #13

P: n/a
<sd********@gmail.com> wrote in message
news:11*********************@j55g2000cwa.googlegro ups.com...
Huh? Which "database engine" are you talking about? Please qualify this
ridiculous statement with some sort of example/evidence.


You're going to have to be more specific about which ridiculous statement
you're talking about - I make so many. :)

///ark
May 18 '06 #14

P: n/a
Anything in regard to degrading performance by using nullable columns.
Please supply a supporting example with explain plan and statistics.

May 18 '06 #15

P: n/a
<sd********@gmail.com> wrote in message
news:11*********************@j33g2000cwa.googlegro ups.com...
Anything in regard to degrading performance by using nullable columns.
Please supply a supporting example with explain plan and statistics.


First let me know if you read and disagree with my response to James (and
why).
May 18 '06 #16

This discussion thread is closed

Replies have been disabled for this discussion.