By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
438,004 Members | 1,222 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 438,004 IT Pros & Developers. It's quick & easy.

q: BIT OPERATIONS and their usefullness..

P: n/a
I noticed that DB2 does not support bitwise operations but I found
some UDFs that seem to give this functionality..

There seems to be a consensus that bitoperations are not usefull (or
bad practice) in sql.. but I am currious though,

say I have a table with a few columns..

attribute_color, attributegroup_2, attributegroup_3....

each one storing a bitstring defining the which of a bunch of values
this attribute contains..

for example...

a choice of (red, yellow, blue, green) can be stored as 0011 (in
attribute_color) if blue and green are the selected colors.. this
makes counting attribute combinations very simple and efficiant over a
lot of rows..

if I wanted to know which items are blue AND green I could simply do a
select count(*) from attribute_table where attribute_color BITAND 11
== 11...
likewise all green colors can be counted with attribute_color BITAND
01=01..

for a large number of attributes and options and adhoc counts of
attribute combinations this works very well.. confining all the data
to one row.

I realise that the software has to keep track of bit positions and
meanings.. but thats little to worry about considering the trade off
between this method and lots of joins or pivoting attribute tables,
is it not?

is this a poor method of storing attributes (that belong to a
multivalue group) ? seems to work nice and fast...

Can someone tell me why this is bad practice? and why is it that there
are no built in BIT operations to do this?

thanks,
Vasili
Dec 13 '07 #1
Share this Question
Share on Google+
3 Replies


P: n/a
vasilip wrote:
I noticed that DB2 does not support bitwise operations but I found
some UDFs that seem to give this functionality..
I beg to differ:
http://publib.boulder.ibm.com/infoce.../r0052628.html
There seems to be a consensus that bitoperations are not useful (or
bad practice) in sql.. but I am currious though,

say I have a table with a few columns..

attribute_color, attributegroup_2, attributegroup_3....

each one storing a bitstring defining the which of a bunch of values
this attribute contains..

for example...

a choice of (red, yellow, blue, green) can be stored as 0011 (in
attribute_color) if blue and green are the selected colors.. this
makes counting attribute combinations very simple and efficiant over a
lot of rows..

if I wanted to know which items are blue AND green I could simply do a
select count(*) from attribute_table where attribute_color BITAND 11
== 11...
likewise all green colors can be counted with attribute_color BITAND
01=01..

for a large number of attributes and options and adhoc counts of
attribute combinations this works very well.. confining all the data
to one row.

I realise that the software has to keep track of bit positions and
meanings.. but thats little to worry about considering the trade off
between this method and lots of joins or pivoting attribute tables,
is it not?

is this a poor method of storing attributes (that belong to a
multivalue group) ? seems to work nice and fast...

Can someone tell me why this is bad practice? and why is it that there
are no built in BIT operations to do this?
Your example is actually very well suited to explain it.
CREATE TABLE stuff(id INT, red INT, yellow INT, blue INT, green INT,
color INT GENERATED ALWAYS
AS (red * 8 + yellow * 4 + blue * 2 + green));
CREATE INDEX redidx ON stuff(red);
CREATE INDEX yellowidx ON stuff(yellow);
CREATE INDEX blueidx ON stuff(blue);
CREATE INDEX greenidx ON stuff(green);
CREATE INDEX color ON stuff(color);

SELECT id FROM stuff WHERE red = 1 AND green = 1

Allows the the DBMS to use technology that is called INDEX-ANDING.
That is the DBMS will find all the rows that are red , all those that
are green and intersect them efficiently.
The cost will increase below linear as the table size increases.

By contrast
SELECT id FROM stuff WHERE BITAND(color, 3) = 3

The DBMS will have to look at each row in the table and apply the BITAND
function (DB2 won't even look at that index).
Cost will lineary increase with the size of the table.

BIT functions aren't always bad. E.g. it is impractical to
index 100 columns each being either "on" or "off".
The cost to INSERT and DELETE operations would be to high and there is
too much overhead.

So as always. Never say never - or always :-)

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Dec 13 '07 #2

P: n/a
FYI, BIT* functions were added for IBM Content Manager.
Ironically the guy who drove them in is a relational database legend.

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Dec 14 '07 #3

P: n/a
>There seems to be a consensus that bit operations are not useful (or bad practice) in SQL.. <<

Bad practice. The SQL Standard was written to be hardware
independent, and not to depend on binary implementations like high-
end, low-end, 8, 16, 32 or 64 bit words, etc.

1NF requires that all values be scalars, so bit operations are a
violation of the foundations of RDBMS.

From a programming viewpoint, they are proprietary and do not port.
The constraints needed to keep them valid become nightmares of
complexity ("Hey, this column says we have a pregnant man with yellow
eyes!"). Then indexing and searching are messy and hard to maintain.
>say I have a table with a few columns..
attribute_color, attributegroup_2, attributegroup_3....
each one storing a bitstring defining the which of a bunch of values
this attribute contains.. <<

Why not normalize the schema instead? What happens when you add
purple? Or drop green? You have to do a serious ALTER TABE and UPDATE
>if I wanted to know which items are blue AND green I could simply do a select count(*) from attribute_table where attribute_color BITAND 11 == 11... <<
SQL does not use a == sign; you are going to low-level assembly
languages and C programming. We would write something like this:

SELECT sku
FROM Inventory
WHERE item_color IN ('green', 'blue')
GROUP BY sku
HAVING COUNT(DISTINCT item_color) = 2;

It would be better to use the Land or Pantone color numbers in a real
database, but you get the idea.

With an index this is fast on a row-oriented DB; it is almost
instantaneous with a hashed or column-oriented DB.
Dec 15 '07 #4

This discussion thread is closed

Replies have been disabled for this discussion.