473,385 Members | 1,766 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

SQL Datatypes

I have a SQL 2005 database that's created by a survey data collection
system. Users of this system are fairly non-technical and have little
to no conscious control over the datatypes. As a result, the data is
mostly stored quite inefficiently as varchars. For example, there is
data that could be stored in a column of bits and it's stored as a
varchar value of 0 or 1. (Yuck, I know.)

I am building a reporting system using this raw data and have a new
table structure designed that is much more efficient (and better for
reporting). Does anyone have any suggestions for getting this data
into my new structure? Specifically, how would you recommend checking
that varchar field and determining it could be stored as a bit?

Apr 1 '07 #1
3 3284
pi***********@gmail.com (pi***********@gmail.com) writes:
I have a SQL 2005 database that's created by a survey data collection
system. Users of this system are fairly non-technical and have little
to no conscious control over the datatypes. As a result, the data is
mostly stored quite inefficiently as varchars. For example, there is
data that could be stored in a column of bits and it's stored as a
varchar value of 0 or 1. (Yuck, I know.)
Well, obvious varchar is not right, but it is not that much of a disaster,
although there is a two-byte overhead for each value. What is more important
is that there is a constraint that disallows any other values. But I
can guess from the background, that there isn't. :-)

More generally, there is not really an agreement in the SQL community how
to store boolean values. Bit is probably the most commonly used data type,
and I am that camp myself. But others advocate things like:

iscompleted char(1) NOT NULL
CONSTRAINT ckc_iscompleted CHECK (iscompleted IN ('Y', 'N'))

Others use T or F for true or false. Or J or N as we did in an older
incarnation of our system. (J from "ja", the Swedish word for "yes".) So
why not 0 or 1?

What I am getting at here is that while the varchar thing is not optimal,
it may not be broken enough to warrant a change. It may be sufficient to
add a constraint to ascertain that there are no other values.
I am building a reporting system using this raw data and have a new
table structure designed that is much more efficient (and better for
reporting). Does anyone have any suggestions for getting this data
into my new structure? Specifically, how would you recommend checking
that varchar field and determining it could be stored as a bit?
SELECT varcharcol, COUNT(*) FROM tbl GROUP BY varcharcol

should give you an indication of what really is in that column. Say that you
find something like:

0 1234
1 9802
Y 23
N 12
X 1

Then you could convert to bit in this way:

SELECT CASE WHEN varcharcol IN ('0', 'N') THEN convert(bit, 0)
WHEN varcharcol IN ('1', 'Y') THEN convert(bit, 1)
END

Question remains what you should do with that X column. Make the bit
column nullable and store it as NULL? Just drop that row on the floor?
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Apr 1 '07 #2
On Apr 1, 5:00 am, Erland Sommarskog <esq...@sommarskog.sewrote:
SELECT varcharcol, COUNT(*) FROM tbl GROUP BY varcharcol

should give you an indication of what really is in that column. Say that you
find something like:

0 1234
1 9802
Y 23
N 12
X 1
I see the approach you're taking. I should have been more clear in my
original post - it's not just bool's that I have issues with - it's
every data type. Data that should be stored as an int (or tinyint) is
stored as a varchar, data that should be stored as a datetime is
stored as a varchar, data that should be stored as a float is stored
as a varchar. I'm looking to convert all of that data. Bool data was
just an example.

I'll work with this suggestion and see how far I can get.

Apr 1 '07 #3
pi***********@gmail.com (pi***********@gmail.com) writes:
I see the approach you're taking. I should have been more clear in my
original post - it's not just bool's that I have issues with - it's
every data type. Data that should be stored as an int (or tinyint) is
stored as a varchar, data that should be stored as a datetime is
stored as a varchar, data that should be stored as a float is stored
as a varchar. I'm looking to convert all of that data. Bool data was
just an example.
The exact validation is different depending on data type. Important is
the use of the CASE expression:

SELECT CASE isdate(varcharcol) THEN convert(datetime, varcharcol) END
FROM tbl
WHERE isdate(varcharcol) = 1

If you don't have the CASE, you can still get a conversion error despite
the filter in the WHERE clause.

Dates are particularly nasty to deal with. If a column says 01/02/03,
what this mean? Feb 1st, 2003, 2nd Jan 2003, March 2nd 2001 or something
else?

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Apr 1 '07 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

17
by: John Bentley | last post by:
John Bentley: INTRO The phrase "decimal number" within a programming context is ambiguous. It could refer to the decimal datatype or the related but separate concept of a generic decimal number....
1
by: Jan Agermose | last post by:
Im writing information into an existing excel document using a connection string like: strConn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + Filename + ";Extended Properties=\"Excel...
3
by: Mark | last post by:
What are the best .NET datatypes to handle SQL Server's Float and Real datatypes? I'd like to avoid using the SQL Server specific datatypes like SqlInt32 or similar. Thanks in advance. -Mark
9
by: Mark Rae | last post by:
Hi, I posted a couple of days ago about the possibility of "simulating" in ..NET1.1 the nullable datatypes available in .NET2.0 - I'm nearly there, but require a bit more guidance. Basically,...
4
by: Curious | last post by:
Hi, I have a class with properties and each property has specific datatypes. Is it possible to extract the datatypes of each property found in a class? I would like to extract these...
2
by: Mark Gibson | last post by:
Hello, I've been experimenting with dblink recently, and have encountered some limitations I'd like to discuss. I've been trying to create views of remote tables, like so: CREATE VIEW stuff...
0
by: Randy Yates | last post by:
It might be a good idea to place a reference to table 8-1 in the various subsections of the Datatypes section in the online postgresql documentation. Otherwise, when hyperjumping from the table of...
2
by: circuit_breaker | last post by:
Hi, Is there's a built-in object inside of mySQL that hold its various datatypes? As an example, I'd like to populate an array with the following values: "BOOL",...
5
by: michal | last post by:
hi guys, i thought you might be interested in a nice JSON class which converts ASP datatypes (basic datatypes, dictionaries, recordsets, ...) into JSON so that javascript can easily understand it...
6
by: Jeff | last post by:
Dear experts! ..NET 2.0 I'm trying to make an array containg multiple datatypes. This array will consist of 3 items (string, string, integer): my first try was this, (of course it fails)...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.