473,406 Members | 2,698 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,406 software developers and data experts.

nulls versus ?

Hi,

does anyone know of a resource which discusses the merits of using "nulls"
versus a specific "unknown" value? For example, should I use "null" for an
unknown or unset date, or should I use min-value? For a string (eg a name)
should I use null or an empty string? And so on.

(These sorts of issues crop up a lot in the applications I work on which
mostly involve presenting data on a web interface, and reading and writing
to and from a database).

Thanks,
Peter
May 18 '06 #1
15 1280
I think that these depends on the semantics. But a general rule I like
to follow is this:

Null in the database for unknown.
Default values set by the editor when the fields are not completed.

e.g. if name was not completed I put an empty string to the db.
or in the lookuptables I add a value "Not set" and from the editor I
set this as a default value

Tasos

May 18 '06 #2

Hi Peter,

I always create my tables in the database so that all fields are
non-nullable and must have a valid value.
For strings I use an empty string, numeric values vary depending on what
it is for, and for dates I will either use a specific date/time or the
same value as another date in the record.

Nullable fields in a MS SQL table degrades performance, but another
advantage of not allowing null values, is you don't have to check for them.
Paul


Peter Kirk wrote:
Hi,

does anyone know of a resource which discusses the merits of using "nulls"
versus a specific "unknown" value? For example, should I use "null" for an
unknown or unset date, or should I use min-value? For a string (eg a name)
should I use null or an empty string? And so on.

(These sorts of issues crop up a lot in the applications I work on which
mostly involve presenting data on a web interface, and reading and writing
to and from a database).

Thanks,
Peter

May 18 '06 #3
HI,

"Paul Cheetham" <PA******@dsl.pipex.com> wrote in message
news:uy**************@TK2MSFTNGP02.phx.gbl...

Hi Peter,

I always create my tables in the database so that all fields are
non-nullable and must have a valid value.
What you do with a date column? in a escenario where any date accepted by
SQL is an accepted value?
The only way is using null.

and for dates I will either use a specific date/time or the same value as
another date in the record.
So you use incorrect data, only to avoid dealing with null?
Nullable fields in a MS SQL table degrades performance


How ? This is the first time I hear this

--
--
Ignacio Machin,
ignacio.machin AT dot.state.fl.us
Florida Department Of Transportation
May 18 '06 #4
Hi,

"Peter Kirk" <pk@alpha-solutions.dk> wrote in message
news:%2****************@TK2MSFTNGP05.phx.gbl...
Hi,

does anyone know of a resource which discusses the merits of using "nulls"
versus a specific "unknown" value? For example, should I use "null" for an
unknown or unset date, or should I use min-value? For a string (eg a name)
should I use null or an empty string? And so on.


There is not a "one rule fits all" it all depends of the type of data you
are dealing with, the acceptable values and the version of the framework you
use.

I do use null when possible to signal "unknown/unset" values in the DB. In
the code if I can use a default value (like DateTime.MinValue) I use it as
it does work in 1.1 and 2.0.
In case I cannot use a default value , like an integer value . then I create
a wrapper type to signal a null status. This is not longer needed with 2.0
as now you have nullable types.

--
--
Ignacio Machin,
ignacio.machin AT dot.state.fl.us
Florida Department Of Transportation
May 18 '06 #5
>> Nullable fields in a MS SQL table degrades performance,

I never heard that either. In fact, I've heard it the other way: The
non-nullable fields degrade preformance, because the db must now check
each value inserted to make sure it's not null.
but another advantage of not allowing null values, is you don't have to check for them.


Nonsense. You've merely changed the value you have to check for, and
the location of the check. The total work done remains the same,
because you've done nothing with the data entering the system.

In any non-trivial system, some values will be unknowns. Whether you
refer to those unknowns as Null, or "" or -1 or 4-Jul-1776 is relevant.
In each case you have a value you must check for.

May 18 '06 #6
> >> Nullable fields in a MS SQL table degrades performance,

I never heard that either. In fact, I've heard it the other way: The
non-nullable fields degrade preformance, because the db must now check
each value inserted to make sure it's not null.
but another advantage of not allowing null values, is you don't have to check for them.


Nonsense. You've merely changed the value you have to check for, and
the location of the check. The total work done remains the same,
because you've done nothing with the data entering the system.

In any non-trivial system, some values will be unknowns. Whether you
refer to those unknowns as Null, or "" or -1 or 4-Jul-1776 is relevant.
In each case you have a value you must check for.


Indeed, and if the check for null is implemented as an integer compare
against zero, then it's _much_ quicker than doing, perhaps, a
character-by-character compare between strings.

Null is your friend.
May 18 '06 #7
<ja**********@gmail.com> wrote in message
news:11**********************@i40g2000cwc.googlegr oups.com...

I never heard that either. In fact, I've heard it the other way: The
non-nullable fields degrade preformance, because the db must now check
each value inserted to make sure it's not null.


The reason nulls can degrade performance is that the database engine has to
check a bit in the record to determine whether the associated column
contains a null. Also, those bits add to storage requirements.

///ark
May 18 '06 #8
And storing and checking a bit is _more_costly_ than comparing a larger
structure?
May 18 '06 #9
"PIEBALD" <PI*****@discussions.microsoft.com> wrote in message
news:2E**********************************@microsof t.com...
And storing and checking a bit is _more_costly_ than comparing a larger
structure?


Depends on usage. Typically, data is more often read than written. Non-null
columns need checking on write. Null columns need checking on read. Plus the
storage, which can double in the case of a small row size.
May 18 '06 #10
Huh? Which "database engine" are you talking about? Please qualify this
ridiculous statement with some sort of example/evidence.

May 18 '06 #11
>> Non-null columns need checking on write.
Null columns need checking on read.


Nope. I don't see it.
First of all, the Null/Not Null flag must always exist, as even a
not-null column must give some response to a Is-null query.

So, basically, we have two null-related boolean conditions: Can-be-null
(Y/N) and Is-Null (Y/N), with the limitation that "NY" is invalid.
Your contension is that given that limitation, we can, on occasion, not
store the second bit, and derive it's value algorithmically.

But that requires doing lots of work (columns which appear to be a
fixed size can be either of two different sizes), just to save the
storage of one bit. Since there are other boolean flags which need to
be saved, and bits are doled out in packages of 8 (min) or 32 (more
likely), storing the is-null flag for a not-null column is almost
certainly zero cost. (ie, it inflicts a cost only if the number other
flags total an exact multiple of 8)

so, in pseudo-code:
On insert:
if (column nullable)
if (data is null)
field.is_null = t
else
field = data
field.is_null = f
else
if (data is null)
throw exception
else
field = data
field.is_null = f

On Select
response.data = field
response.nullable = column nullable
response.is_null = field.is_null

Given that, I cannot imagine any scenario, where one path would be
significantly faster that another.

May 18 '06 #12
<ja**********@gmail.com> wrote in message
news:11**********************@j55g2000cwa.googlegr oups.com...
So, basically, we have two null-related boolean conditions: Can-be-null
(Y/N) and Is-Null (Y/N), with the limitation that "NY" is invalid.
Sort of. The first condition is stored with the table's metadata, whereas
the second has to be stored with each row.
Your contention is that given that limitation, we can, on occasion, not
store the second bit, and derive its value algorithmically.
Right. If the metadata says the column can't be null, we don't have to store
nullness for that column in each row.
But that requires doing lots of work (columns which appear to be a
fixed size can be either of two different sizes),
In SQL Server, each row with nullable columns includes a bitmap indicating
whether each nullable column actually contains null. See "Inside SQL Server
2000", p. 235.
just to save the
storage of one bit. Since there are other boolean flags which need to
be saved, and bits are doled out in packages of 8 (min) or 32 (more
likely), storing the is-null flag for a not-null column is almost
certainly zero cost. (ie, it inflicts a cost only if the number other
flags total an exact multiple of 8)
Including 0. :) Which is why I said that making a column nullable can double
the size of the row. The null bitmask (probably because it's variable
length) isn't squeezed in with other metadata.
so, in pseudo-code:
On insert:
if (column nullable)
if (data is null)
field.is_null = t
else
field = data
field.is_null = f
else
if (data is null)
throw exception
else
field = data
field.is_null = f


No, field.is_null does not need to be set in the last case. Your example
actually points out something I missed -- that even writing a nullable field
can impose a performance hit.

///ark
May 18 '06 #13
<sd********@gmail.com> wrote in message
news:11*********************@j55g2000cwa.googlegro ups.com...
Huh? Which "database engine" are you talking about? Please qualify this
ridiculous statement with some sort of example/evidence.


You're going to have to be more specific about which ridiculous statement
you're talking about - I make so many. :)

///ark
May 18 '06 #14
Anything in regard to degrading performance by using nullable columns.
Please supply a supporting example with explain plan and statistics.

May 18 '06 #15
<sd********@gmail.com> wrote in message
news:11*********************@j33g2000cwa.googlegro ups.com...
Anything in regard to degrading performance by using nullable columns.
Please supply a supporting example with explain plan and statistics.


First let me know if you read and disagree with my response to James (and
why).
May 18 '06 #16

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
by: Dan Perlman | last post by:
From: "Dan Perlman" <dan@dpci.NOSPAM.us> Subject: ODBC creating nulls? Date: Friday, July 09, 2004 10:43 AM Hi, Below is my VB6 code that writes data from an Access 2000 table to a PG table....
3
by: aaj | last post by:
Hi I am probably going to regret asking this because I'm sure you are going to tell me my design is bad 8-) ah well we all have to learn.... anyway I often use Nulls as a marker to see if...
6
by: mike | last post by:
I'm doing what I thought was a simple GROUP BY summary of fairly simple data and the my numbers aren't working out Some results are showing up <NULL> when I know the data is in the database ...
17
by: Raziq Shekha | last post by:
Hello everyone, SQL Server 2000. I have a database in which there are several objects which have ansi nuls and quoted identifier turned ON. Is there a way I can generate a script which: (1)...
0
by: Rhino | last post by:
I am working with SQL Functions in DB2 for Windows/Linux/UNIX (V8.2.1) and am having a problem setting input parameters for SQL Functions to null in the Development Center. My simple function,...
1
by: PST | last post by:
Here's a problem I'm trying to deal with: I'm working on a Frontpage 2000 website for a boat handicapping system, built in Access 97. What I'm trying to accomplish is: The user enters a...
24
by: Karen Hill | last post by:
Suppose you have a textBox control. It is empty and you send the this.textBox1.Text.ToString() value into a database. The control does not send a NULL but instead a "". What is the _CORRECT_...
4
by: BerkshireGuy | last post by:
I have the following dataset: PolNum DateInterviewComplete --------------------------------------------- 1234 11/23/2006 1323 11/03/2006 4342 Null 2322 Null I want to...
6
by: Cliff72 | last post by:
I need to fill in the nulls in the batch field the value from the record immediately preceding the null one ie replace the nulls with the preceding value until I hit a record with a value in...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.