Optimizing a simple UNION query

As usual, it is 2:00am, and I'm pulling my hair out, finally resorting to
posting in the newsgroups for help. :)

Simple problem, in theory.

Given table "map":

CREATE TABLE map (
entry_id int(10) unsigned NOT NULL auto_increment,
piece_id int(10) unsigned NOT NULL default '0',
hval mediumint(3) unsigned NOT NULL default '0',
pct decimal(4,2) NOT NULL default '0.00',
PRIMARY KEY (entry_id),
KEY piece_id (piece_id),
KEY hue (hval),
KEY pct (pct),
KEY idxcombo (hval,pct,piece_id)
) TYPE=MyISAM;

Pardon all of the indexes, but we've all been there. Trying anything at
this point. Basically, there are multiple rows for each "piece_id" (foreign
key into another table, but not unique in this table). Each of the rows
also contains a value for "hval" and "pct". No big deal. So, something like
this:

piece_id hval pct
1 20 1.55
1 124 2.67
1 25 14.02
1 20 1.12
2 192 7.89
2 22 2.00
2 21 4.31
2 25 62.90

What I am trying to do sounds simple, but ... I need to run a query to find
how many unique items (corresponding to piece_id) contain "hval" values
across multiple ranges, and whose SUM(pct) for those hval's is over a
certain threshold. For example:

PSEUDO QUERY: Give me all unique piece_id's that have an hval in the range
1-20 and also an hval in the range 200-275. Along the way, sum up the pct
(percentage) values for each hval/piece, because at the end, we want the
total percentages of each case, and that total percentage must be over our
threshold value.

A simple query for one range is, of course, elementary (note the threshold
value of 25):

---
SELECT SUM( pct ) AS total, piece_id
FROM map
WHERE ( hval >=217 AND hval <=267)
GROUP BY piece_id
HAVING total >=25
ORDER BY total DESC
LIMIT 0,30
---

This says "give me all piece_id's and the total percentage (SUMmed) for all
hvals between a range of 217 and 267, if that total percentage is over 25".
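
To make the arithmetic concrete, here is a small worked sketch against the sample rows listed earlier. None of those rows has an hval between 217 and 267, so the range is swapped to 20-75 purely for illustration (that swap is an assumption, as is loading only those eight rows into an otherwise empty "map" table).

```sql
-- Sample rows from the listing above; entry_id is filled in by auto_increment.
INSERT INTO map (piece_id, hval, pct) VALUES
  (1, 20, 1.55), (1, 124, 2.67), (1, 25, 14.02), (1, 20, 1.12),
  (2, 192, 7.89), (2, 22, 2.00), (2, 21, 4.31), (2, 25, 62.90);

-- Same shape as the query above, with the range swapped to 20-75 (assumption).
SELECT SUM(pct) AS total, piece_id
FROM map
WHERE (hval >= 20 AND hval <= 75)
GROUP BY piece_id
HAVING total >= 25
ORDER BY total DESC
LIMIT 0, 30;
-- piece 1: 1.55 + 14.02 + 1.12 = 16.69 (below 25, filtered out)
-- piece 2: 2.00 + 4.31 + 62.90 = 69.21 (kept)
```

Only piece_id 2 clears the threshold here, because the rows with hval 124 and 192 fall outside the range and never reach the SUM.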

Easy enough. But adding a new range in the form of another set of (hval
>=xxx AND hval <=yyy) won't work for me, as I need to find all piece_id's
that have records in BOTH ranges, not either/or. For instance:

---
SELECT SUM( pct ) AS total, piece_id
FROM map
WHERE ( hval >=217 AND hval <=267) AND ( hval >=20 AND hval <=75)
GROUP BY piece_id
HAVING total >=25
ORDER BY total DESC
LIMIT 0,30
---

This won't work, for obvious reasons. Any given row will contain only one
hval, and in most cases, it will never fall into both ranges, although it
may (rarely). I need to handle both situations, I guess. So, I'm trying a
UNION:

---
(SELECT SUM( pct ) AS total, piece_id
FROM map
WHERE ( hval >=217 AND hval <=267)
GROUP BY piece_id
HAVING total >=25)

UNION

(SELECT SUM( pct ) AS total, piece_id
FROM map
WHERE ( hval >=20 AND hval <=75)
GROUP BY piece_id
HAVING total >=25)

ORDER BY total DESC
LIMIT 0, 30
---

I'm trying to find which "piece_id"s have rows that fall into both ranges. I
suppose this would work, but ... the EXPLAIN against this reveals:

table  type   possible_keys  key       key_len  ref   rows    Extra
map    range  hval,idxcombo  idxcombo  3        NULL  26704   Using where; Using index; Using temporary; Using f...
map    range  hval,idxcombo  idxcombo  3        NULL  25111   Using where; Using index; Using temporary; Using f...
map    index  NULL           hval      3        NULL  281716  Using index

Nice. I was OK with the first two. But damn ... that third one will kill
me, especially when this table ends up with millions of rows. :(

Any ideas? I am running MySQL 4.0, but not 4.1, so I don't have subselect
capability handy.
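
For reference, one subselect-free way to state the "both ranges" requirement is conditional aggregation in a single pass. The following is only a sketch, not tested against 4.0 or measured with EXPLAIN. The aliases in_high and in_low are made up for illustration, the ranges and the threshold of 25 are the ones from the attempts above, and it assumes the total should be summed across both ranges.

```sql
-- Sketch: keep rows from either range, then require a hit in each range per piece.
SELECT piece_id,
       SUM(pct) AS total,                                     -- pct summed over both ranges
       SUM(IF(hval >= 217 AND hval <= 267, 1, 0)) AS in_high, -- rows in the first range
       SUM(IF(hval >= 20 AND hval <= 75, 1, 0)) AS in_low     -- rows in the second range
FROM map
WHERE (hval >= 217 AND hval <= 267) OR (hval >= 20 AND hval <= 75)
GROUP BY piece_id
HAVING in_high > 0 AND in_low > 0 AND total >= 25
ORDER BY total DESC
LIMIT 0, 30;
```

Because each row is read once, a row whose hval happened to sit in two overlapping ranges would add its pct to the total only once. Whether the optimizer handles the OR of two ranges any better than the UNION is an open question here.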

Seems so simple. I probably just can't see the forest for the trees right
now. Any help would be greatly appreciated!

Peace.
Jul 20 '05 #1