By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
445,645 Members | 1,048 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 445,645 IT Pros & Developers. It's quick & easy.

Grouping Query Range

P: 39
Hi

I have a query which groups Number of People together by number of days and gets the following result :

Days-------- Number of People

1----------------100

2----------------150

3 ---------------- 175

4----------------200

5----------------225

6----------------75

7----------------100


However I would like to group the data like this:


Days ----------- Number of People

0 - 1----------------100

2 - 3----------------325

4 - 5---------------- 425

6+----------------175


Any ideas how to do this?

Thanks
May 23 '08 #1
Share this Question
Share on Google+
26 Replies


ADezii
Expert 5K+
P: 8,669
Hi

I have a query which groups Number of People together by number of days and gets the following result :

Days-------- Number of People

1----------------100

2----------------150

3 ---------------- 175

4----------------200

5----------------225

6----------------75

7----------------100


However I would like to group the data like this:


Days ----------- Number of People

0 - 1----------------100

2 - 3----------------325

4 - 5---------------- 425

6+----------------175


Any ideas how to do this?

Thanks
Expand|Select|Wrap|Line Numbers
  1. SELECT DISTINCTROW Partition([Days],0,500,2) AS Day_Range, Sum([Number of People]) AS [People in Range]
  2. FROM <Your Table Name Here>
  3. GROUP BY Partition([Days],0,500,2);
  4.  
SAMPLE OUTPUT:
Expand|Select|Wrap|Line Numbers
  1. Day_Range    People in Range
  2.    0:1              100
  3.    2:3               325
  4.    4:5               21
  5.  
May 23 '08 #2

P: 39
Expand|Select|Wrap|Line Numbers
  1. SELECT DISTINCTROW Partition([Days],0,500,2) AS Day_Range, Sum([Number of People]) AS [People in Range]
  2. FROM <Your Table Name Here>
  3. GROUP BY Partition([Days],0,500,2);
  4.  
SAMPLE OUTPUT:
Expand|Select|Wrap|Line Numbers
  1. Day_Range    People in Range
  2.    0:1              100
  3.    2:3               325
  4.    4:5               21
  5.  
Thanks for your help but just by looking at this i can tell its not going to work!!

I assume this is not a simple process? Any one any other ideas?
May 26 '08 #3

ADezii
Expert 5K+
P: 8,669
Thanks for your help but just by looking at this i can tell its not going to work!!

I assume this is not a simple process? Any one any other ideas?
Actually, it is a very simple process. Copy and Paste the SQL Statement into the SQL View of a Query, substituting your Table name for the demo. Just for my own curiosity, what makes you say it won't work even though you haven't tried it?
May 26 '08 #4

P: 39
Actually, it is a very simple process. Copy and Paste the SQL Statement into the SQL View of a Query, substituting your Table name for the demo. Just for my own curiosity, what makes you say it won't work even though you haven't tried it?
Sorry, I was in a bit of a mood with SQL at the time I replied!! But what makes me think it wont work is the 6 and over field. I assume yr solution only allows for fixed ranges. 0-1, 1-2, 3-4.

I really want 0, 1, 2, 3 - 5 and 6 and over. Will your solutioin give me this?
May 27 '08 #5

ADezii
Expert 5K+
P: 8,669
Sorry, I was in a bit of a mood with SQL at the time I replied!! But what makes me think it wont work is the 6 and over field. I assume yr solution only allows for fixed ranges. 0-1, 1-2, 3-4.

I really want 0, 1, 2, 3 - 5 and 6 and over. Will your solutioin give me this?
Sorry, I misinterpreted your request. You are correct in that my solution will only allow for a fixed range with an interval of 1. I do not think that there is an SQL based solution to your problem, but then again, a VBA based solution should not be that difficult. SQL is not my strong point, but please be patient and I'll call in some of the more proficient SQL experts in on this problem.
May 27 '08 #6

FishVal
Expert 2.5K+
P: 2,653
Hi, JC.

Partition function will return value w/o right bound whenever its Number argument higher then its stop argument

example
? Partition(1,0,5,2)
0: 1
? Partition(2,0,5,2)
2: 3
? Partition(5,0,5,2)
4: 5
? Partition(6,0,5,2)
6:
? Partition(16,0,5,2)
6:

You may reformat Partition() function output to what you like or write your own VBA function doing the same.
Additionally there is a pure SQL solution but it requires additional table.

Regards,
Fish
May 27 '08 #7

P: 39
Hi, JC.

Partition function will return value w/o right bound whenever its Number argument higher then its stop argument

example
? Partition(1,0,5,2)
0: 1
? Partition(2,0,5,2)
2: 3
? Partition(5,0,5,2)
4: 5
? Partition(6,0,5,2)
6:
? Partition(16,0,5,2)
6:

You may reformat Partition() function output to what you like or write your own VBA function doing the same.
Additionally there is a pure SQL solution but it requires additional table.

Regards,
Fish
OK! Thanks for all your help! Ive managed to solve the problem by having a query with 6 subqueries in it! Seems to work fine!

Thanks for all your help
May 27 '08 #8

Expert Mod 2.5K+
P: 2,545
Hi. The most flexible way to do what you ask is as Fish indicated - use a separate band table. That way you can use bands with unequal boundaries (as illustrated in the attached zipped database).

The banding table is added to the base data without using INNER, LEFT or RIGHT JOINs; instead, the rows of the banding table are selected where the value being compared is between the band minimum and maximum. The query with the joined banding can then be used as the basis for crosstabs, simple counts, histograms and so on.

I attach in zip form a sample database (A2003 format) which provides a simple example of age banding data. The age band table is used in the queries to provide the base data for histograms.

The extract is age, category and gender of a set of students, and illustrates in cut-down form how such banding works in practice. I use this technique (on a larger dataset) to perform age-related analysis of student data by gender, course type, location, campus, course, qualification and so on - all based on a single overarching query providing a view of the student data banded by age band.

-Stewart
Attached Files
File Type: zip Histogram.zip (19.4 KB, 293 views)
May 27 '08 #9

ADezii
Expert 5K+
P: 8,669
Hi. The most flexible way to do what you ask is as Fish indicated - use a separate band table. That way you can use bands with unequal boundaries (as illustrated in the attached zipped database).

The banding table is added to the base data without using INNER, LEFT or RIGHT JOINs; instead, the rows of the banding table are selected where the value being compared is between the band minimum and maximum. The query with the joined banding can then be used as the basis for crosstabs, simple counts, histograms and so on.

I attach in zip form a sample database (A2003 format) which provides a simple example of age banding data. The age band table is used in the queries to provide the base data for histograms.

The extract is age, category and gender of a set of students, and illustrates in cut-down form how such banding works in practice. I use this technique (on a larger dataset) to perform age-related analysis of student data by gender, course type, location, campus, course, qualification and so on - all based on a single overarching query providing a view of the student data banded by age band.

-Stewart
Hello and thanks for the reply, Stewart. I'd like to see that Attached DB whenever you get a chance.
May 27 '08 #10

Expert Mod 2.5K+
P: 2,545
Should be visible now, ADezii - if not please let me know.

-Stewart
Hello and thanks for the reply, Stewart. I'd like to see that Attached DB whenever you get a chance.
May 27 '08 #11

ADezii
Expert 5K+
P: 8,669
Should be visible now, ADezii - if not please let me know.

-Stewart
I know I'm getting older, Stewart, but I still can't see anything. No comment necessary, NeoPa! (LOL).
May 27 '08 #12

NeoPa
Expert Mod 15k+
P: 31,656
Subscribing .
May 27 '08 #13

Expert Mod 2.5K+
P: 2,545
Hi ADezii. It's attached on post #9 - at least I can see it and it is listed as one view, so someone else can too...

By the way, although the example uses a Count query any form of totalling can be done on the banded data - Sum (which the examples in post 1 would require), Average, Min, Max, whatever.

-Stewart
I know I'm getting older, Stewart, but I still can't see anything. No comment necessary, NeoPa! (LOL).
May 27 '08 #14

ADezii
Expert 5K+
P: 8,669
Hi ADezii. It's attached on post #9 - at least I can see it and it is listed as one view, so someone else can too...

By the way, although the example uses a Count query any form of totalling can be done on the banded data - Sum (which the examples in post 1 would require), Average, Min, Max, whatever.

-Stewart
Thanks Stewart for the explanation and Attachment, interesting concept.
May 28 '08 #15

NeoPa
Expert Mod 15k+
P: 31,656
Hi. The most flexible way to do what you ask is as Fish indicated - use a separate band table. That way you can use bands with unequal boundaries (as illustrated in the attached zipped database).

The banding table is added to the base data without using INNER, LEFT or RIGHT JOINs; instead, the rows of the banding table are selected where the value being compared is between the band minimum and maximum.
...
Nice idea Stewart.

I'm curious as to why you wouldn't use an INNER JOIN to link the tables though?

I would expect that to work better than a simple WHERE clause.
May 28 '08 #16

Expert Mod 2.5K+
P: 2,545
Hi NeoPa. Reason for no joins is that there are no common fields involved. The comparison is of a single discrete value in one table against a minimum to maximum range of values in the banding table, not key against key. It is the principle of the histogram, where discrete values of a population are grouped over specified intervals into separate but contiguous bands, with counts for each band subsequently displayed in the typical column chart form.

This kind of banding application is one of those rare occasions when such a Cartesian join between tables is essential, with the inherent multiplication of rows stopped by the banding range always matching just one of the rows in the banding table.

The reason a band number is included is to be able to order the bands correctly if required - and to be able to left-join the banding table back to the results later if necessary (because, depending on the range of the data, the result bandings are likely to be non-contiguous, and left-joining the bandings back will allow inclusion of all bands, even if there are no results for particular bands).

Hope this helps to explain the approach.

-Stewart
May 28 '08 #17

FishVal
Expert 2.5K+
P: 2,653
I agree with NeoPa. I would prefer to JOIN tables though with only two tables involved it seems not to have a great advantage.

Regards,
Fish
May 28 '08 #18

Expert Mod 2.5K+
P: 2,545
I agree entirely, Fish. When working with relational tables there are always joins involved, and in all of my applications I define the joins and set relational constraints as a matter of routine. However, the banding application is one of these rare occasions when this is simply not possible - in set algebra terms the sets involved are disjoint and cannot be equijoined or outer joined.

When I analyse several years worth of student data, involving around 30,000 rows, the time penalties and delays are in the subsequent multiple-field grouping and summations not in the formation of a banded base table involving joining of the 30,000 rows to a small table of around 8 rows using a where clause to restrict the rows joined.

-Stewart
May 28 '08 #19

FishVal
Expert 2.5K+
P: 2,653
Well. As I could recall, JOIN
Expand|Select|Wrap|Line Numbers
  1. ... FROM t1 ... JOIN t2 ON blnExpr
  2.  
is a later SQL syntax improvement for cartesian product filtering
Expand|Select|Wrap|Line Numbers
  1. ... FROM t1, t2 WHERE blnExpr
  2.  
The only difference known to me is that Expr from the first snippet must be evaluated on fields from both tables being joined.
All the rest seems to be allowed - any kind and combination of arithmetic and logic operations, VBA calls etc. So long it turns into boolean.

Advantage of JOIN syntax is that it allows outer joins and explicitely define the order of joining.

In context of OP's question both syntaxes performs more or less equally.

Regards,
Fish

P.S. Did you mean something else?
May 28 '08 #20

Expert Mod 2.5K+
P: 2,545
Hi Fish, and thanks for a very thoughtful reply. What I was referring to is that for the most part Cartesian products are avoided, and seen as 'a bad thing', because of the row multiplication involved, but there are occasions when no other approach will do. The lack of an explicit join could be detrimental in some circumstances, but as your reply clarifies the JOIN and WHERE approaches have similar roots and will perform similarly given the small scale of the banding table involved.

Where I am coming from is that the use of a Cartesian product in this case is quite deliberate and intentional, as it is dictated by the disjoint nature of the datasets - a very different case to the accidental creation of a cross-multipled table for related data by neglecting to join two tables correctly.

One other difference in the JOIN and WHERE approaches is that with explicit JOINs the database engine query optimisers can avoid projecting the Cartesian product to begin with, thereby operating more efficiently than applying a WHERE clause after the event might do - depending on how efficient the underlying operations and the optimising algorithms really are (which we can only infer from overall performance, as the optimisations are not visible or alterable by us).

Thank you again for your thoughtful reply which provides further explanation of the JOIN vs WHERE approaches. Really interesting, even if we are drifting a bit from the original thread!

-Stewart
May 28 '08 #21

FishVal
Expert 2.5K+
P: 2,653
.....
Where I am coming from is that the use of a Cartesian product in this case is quite deliberate and intentional, as it is dictated by the disjoint nature of the datasets - a very different case to the accidental creation of a cross-multipled table for related data by neglecting to join two tables correctly.
.....
Sorry, Stewart.

What "disjoint nature" is all about?
Just couldn't understand why it does "dictate". ;)

Regards,
Fish
May 28 '08 #22

Expert Mod 2.5K+
P: 2,545
Hi Fish. Disjoint is a term from set theory and just means that there are no elements in common. Set theory underlies the relational model developed by EF Codd in the 60s, updated by him since and (imperfectly) implemented in the various standards for SQL ever since.

When there is a relation between tables T1 and T2 the underlying sets have at least one element of overlap. There is no such overlap in the case of a range table (where each row represents a paired minimum to maximum range), and the data which contains discrete values (not min and max tuples). The sets are disjoint, therefore no common fields, so no way to join them on a relationship between fields.

To illustrate: age part of student data set:

{16, 19, 18, 18, 15, 64, 20,...,32}

Part of range set:

{{14, 15}, {16, 17}, {18, 21},...{61, 100}}

The sets are disjoint, so no value-based join is possible. If the product of the set of N values with a range set of M values was created there would be N x M rows in the result - the Cartesian product. But there is a functional relationship between ages and age ranges, which is that the discrete value will match one of the tuples where its value is between the values of the tuple elements - so although no 1-1 join is possible the two datasets of N x M rows can still be combined to give just N rows for the result by using the WHERE clause to find the correct range.

There is a more formal treatment of disjoint sets at Wikipedia - linked here.

I hope this helps.

-Stewart
Sorry, Stewart.

What "disjoint nature" is all about?
Just couldn't understand why it does "dictate". ;)

Regards,
Fish
May 29 '08 #23

FishVal
Expert 2.5K+
P: 2,653
Ok, Stewart.

You have not convinced me though. What you are talking about is table PK/FK relations based on key equalness. Table join, on the other hand, is a tool to combine records on some criteria regardless of PK/FK relationships. Sure it is commonly used to get denormalized view of normalized tables, but the area of table join application is definitely not limited by this.

Expressions like
Expand|Select|Wrap|Line Numbers
  1. SELECT * FROM t1, t2 WHERE t1.f1>t2.UpperBound AND t1.f1<t2.LowerBound;
  2.  
and
Expand|Select|Wrap|Line Numbers
  1. SELECT * FROM t1 INNER JOIN t2 ON t1.f1>t2.UpperBound AND t1.f1<t2.LowerBound;
  2.  
are equivalents and both working.
The advantage of JOIN syntax is, as I've stated already, explicit join order and possibility to perform outer (LEFT, RIGHT) joins.
BTW, earlier, outer joins had a special syntax:
Expand|Select|Wrap|Line Numbers
  1. ... FROM t1, t2 WHERE t1.f1 *= t2.f2
  2.  
was used to make left join
and
Expand|Select|Wrap|Line Numbers
  1. ... FROM t1, t2 WHERE t1.f1 =* t2.f2
  2.  
for right join.
In some RDBMS this syntax is accepted to provide backward compatibility but Jet SQL doesn't support it.

Regards,
Fish
May 29 '08 #24

Expert Mod 2.5K+
P: 2,545
Thank you, Fish, for another very interesting reply. I live and learn - I have not myself used table joins based on inequality criteria as you have shown. You are quite right that I am looking at the relationships from a PK/FK perspective. In those terms, there is a fundamental difference in the joining of the sets involved. In a PK/FK relationship all FKs are a subset of the PK set; in a table relationship where a table join as you have shown is involved one set is not a subset of the other. Different views of the same situation - and both ways valid. Really interesting!

-Stewart
May 29 '08 #25

Expert Mod 2.5K+
P: 2,545
Update on post 25: In the query editor SQL view Access accepts the table join version of the SQL just as Fish has said, but the query editor cannot then show the query in design view. The SQL table join statement does indeed work equivalently otherwise.

Expand|Select|Wrap|Line Numbers
  1. SELECT [Age Histogram Base].Type, [Age Histogram Base].Age, [Age Histogram Base].Gender, [Age Band].[Band], [Age Band].[Band Label]
  2. FROM [Age Histogram Base] INNER JOIN [Age Band]
  3. ON [Age Histogram Base].Age >= [Age Band].[Min] And [Age Histogram Base].Age <= [Age Band].[Max];
Timing update
Running a quick test of timing differences on an enlarged set of data (221k rows) the summary crosstab based on the WHERE version of the banded data took 6 seconds to run. The same crosstab based on the table-join version took just under 30 seconds to run.

On the simple Count of the banded rows both versions ran in about 6 seconds.

-Stewart
May 29 '08 #26

NeoPa
Expert Mod 15k+
P: 31,656
Nice idea Stewart.

I'm curious as to why you wouldn't use an INNER JOIN to link the tables though?

I would expect that to work better than a simple WHERE clause.
You boys seem to have sorted all this without me (You're just too quick :D).

FYI Fish correctly interpreted exactly the point I was raising.

I would expect, conceptually, that the INNER JOIN should work more efficiently, although I accept that this may well be negligible for small amounts of locally held data. It may prove more important if linking to remote (Non-Access) tables though.
May 29 '08 #27

Post your reply

Sign in to post your reply or Sign up for a free account.