pi***********@gmail.com (pi***********@gmail.com) writes:
I have a SQL 2005 database that's created by a survey data collection
system. Users of this system are fairly non-technical and have little
to no conscious control over the datatypes. As a result, the data is
mostly stored quite inefficiently as varchars. For example, there is
data that could be stored in a column of bits and it's stored as a
varchar value of 0 or 1. (Yuck, I know.)
Well, obvious varchar is not right, but it is not that much of a disaster,
although there is a two-byte overhead for each value. What is more important
is that there is a constraint that disallows any other values. But I
can guess from the background, that there isn't. :-)
More generally, there is not really an agreement in the SQL community how
to store boolean values. Bit is probably the most commonly used data type,
and I am that camp myself. But others advocate things like:
iscompleted char(1) NOT NULL
CONSTRAINT ckc_iscompleted CHECK (iscompleted IN ('Y', 'N'))
Others use T or F for true or false. Or J or N as we did in an older
incarnation of our system. (J from "ja", the Swedish word for "yes".) So
why not 0 or 1?
What I am getting at here is that while the varchar thing is not optimal,
it may not be broken enough to warrant a change. It may be sufficient to
add a constraint to ascertain that there are no other values.
I am building a reporting system using this raw data and have a new
table structure designed that is much more efficient (and better for
reporting). Does anyone have any suggestions for getting this data
into my new structure? Specifically, how would you recommend checking
that varchar field and determining it could be stored as a bit?
SELECT varcharcol, COUNT(*) FROM tbl GROUP BY varcharcol
should give you an indication of what really is in that column. Say that you
find something like:
0 1234
1 9802
Y 23
N 12
X 1
Then you could convert to bit in this way:
SELECT CASE WHEN varcharcol IN ('0', 'N') THEN convert(bit, 0)
WHEN varcharcol IN ('1', 'Y') THEN convert(bit, 1)
END
Question remains what you should do with that X column. Make the bit
column nullable and store it as NULL? Just drop that row on the floor?
--
Erland Sommarskog, SQL Server MVP,
es****@sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx