Bytes | Software Development & Data Engineering Community

Query to group sequential items

Let's say I have the following table:

entry product quality
1 A 80
2 A 70
3 A 80
4 B 60
5 B 90
6 C 80
7 D 80
8 A 50
9 C 70

I'm looking for a way to find the average "quality" value for a
SEQUENTIAL GROUPING of the same product. For example, I need an
average of entries 1+2+3 (because this is the first grouping of the same
product type), but I do NOT want that average to include row 8 (which is
also product A, but in a different "group").

I'm sure it can be done (because I can describe it!), but I'll be a
monkey's uncle if I can figure out how. I would imagine it would
involve some sort of running tally that references the next record as
it goes... to see if the product type has changed. Perhaps use of a
temporary table?

Muchas gracias!!
Cy.

Jul 23 '05 #1
Easy way ... cursor or loop through as you stated.

WARNING - THE FOLLOWING IS AN UNTESTED HALF-BAKED IDEA -
CONSUME AT YOUR OWN RISK

The set-oriented way would require the addition of a grouping column,
initially null and populated via update statements from a temp table.

Use something like this to generate a set of the grouping transition rows.

-- GENERATE A GROUP ID FOR EACH ROW THAT STARTS A NEW GROUP
-- (assumes a nullable groupid column has been added to mytable)
select
    identity(int, 1, 1) as groupid,
    a.product,
    a.entry
into #groupings
from mytable a
left join mytable b on a.entry = b.entry + 1
where b.entry is null or a.product != b.product

-- TAG THE GROUP-STARTING ROWS IN THE BASE TABLE
update a
set groupid = g.groupid
from mytable a
join #groupings g on a.entry = g.entry

-- TAG THE REMAINING ROWS WITH THE NEAREST PRECEDING GROUP ID
update a
set a.groupid = (select top 1 g.groupid
                 from #groupings g
                 where g.entry <= a.entry
                 order by g.entry desc)
from mytable a
where a.groupid is null

-- QUERY TO RETURN THE RESULTS YOU ARE LOOKING FOR
select groupid, min(product), max(entry), min(entry),
       sum(quality), count(*), avg(quality)
from mytable
group by groupid

-------------

I am not so sure about the 2nd update here, as I am tired and going to bed
soon. You may also need to join to the grouping temp table on the product,
and possibly put a not exists() in the where clause, but you may be covered
by the simple is null check to prevent multiple updates.

Let me know how you make out, and whether this points you in a good direction
or throws you off track.
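For contrast with the temp-table approach above, here is a hedged sketch of the "gaps and islands" formulation using ROW_NUMBER(). Window functions are not available in SQL Server 2000, so treat this as a modern alternative, not the method described above; it is demonstrated here with Python's sqlite3 so it can actually be run.

```python
# Sketch (not from this thread): entry - ROW_NUMBER() OVER
# (PARTITION BY product ORDER BY entry) is constant within each
# sequential run of the same product, so it serves as a group key.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE mytable (entry INT, product TEXT, quality INT)")
con.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "A", 80), (2, "A", 70), (3, "A", 80),
     (4, "B", 60), (5, "B", 90), (6, "C", 80),
     (7, "D", 80), (8, "A", 50), (9, "C", 70)],
)

sql = """
SELECT product, MIN(entry) AS start, MAX(entry) AS finish,
       AVG(quality) AS avg_quality
FROM (SELECT entry, product, quality,
             entry - ROW_NUMBER() OVER
                 (PARTITION BY product ORDER BY entry) AS grp
      FROM mytable) AS T
GROUP BY product, grp
ORDER BY start
"""
for row in con.execute(sql):
    print(row)
```

Note this keys each island by (product, grp), so the second run of A at entry 8 lands in its own group, as the original poster wants.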
Jul 23 '05 #2
<cy***********@ us.pm.com> wrote in message
news:11******** **************@ c13g2000cwb.goo glegroups.com.. .

CREATE TABLE ProductEntries
(
product_entry INT NOT NULL PRIMARY KEY,
product_code CHAR(1) NOT NULL,
product_quality INT NOT NULL
)

INSERT INTO ProductEntries (product_entry, product_code, product_quality)
VALUES (1, 'A', 80)
INSERT INTO ProductEntries (product_entry, product_code, product_quality)
VALUES (2, 'A', 70)
INSERT INTO ProductEntries (product_entry, product_code, product_quality)
VALUES (3, 'A', 80)
INSERT INTO ProductEntries (product_entry, product_code, product_quality)
VALUES (4, 'B', 60)
INSERT INTO ProductEntries (product_entry, product_code, product_quality)
VALUES (5, 'B', 90)
INSERT INTO ProductEntries (product_entry, product_code, product_quality)
VALUES (6, 'C', 80)
INSERT INTO ProductEntries (product_entry, product_code, product_quality)
VALUES (7, 'D', 80)
INSERT INTO ProductEntries (product_entry, product_code, product_quality)
VALUES (8, 'A', 50)
INSERT INTO ProductEntries (product_entry, product_code, product_quality)
VALUES (9, 'C', 70)

SELECT PR.product_code AS product_code,
       PR.start_product_entry AS start_product_entry,
       MAX(P.product_entry) AS end_product_entry,
       AVG(CAST(P.product_quality AS DECIMAL)) AS avg_product_quality
FROM (SELECT MIN(PE.product_entry) AS start_product_entry,
             PE.next_product_entry AS end_product_entry,
             PE.product_code
      FROM (SELECT P1.product_entry, P1.product_code,
                   MIN(P2.product_entry) AS next_product_entry
            FROM ProductEntries AS P1
            LEFT OUTER JOIN
            ProductEntries AS P2
            ON P2.product_entry > P1.product_entry AND
               P2.product_code <> P1.product_code
            GROUP BY P1.product_entry, P1.product_code) AS PE
      GROUP BY PE.product_code, PE.next_product_entry) AS PR
INNER JOIN
ProductEntries AS P
ON P.product_code = PR.product_code AND
   P.product_entry >= PR.start_product_entry AND
   (PR.end_product_entry IS NULL OR
    P.product_entry < PR.end_product_entry)
GROUP BY PR.product_code, PR.start_product_entry
ORDER BY start_product_entry

product_code start_product_entry end_product_entry avg_product_quality
A            1                   3                 76.666666
B            4                   5                 75.000000
C            6                   6                 80.000000
D            7                   7                 80.000000
A            8                   8                 50.000000
C            9                   9                 70.000000
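As a sanity check, the query above can be replayed with a small sqlite3 harness (an assumption-laden addition, not part of the original post; the DECIMAL cast is dropped because sqlite's AVG already returns a float). It reproduces the result set above.

```python
# Harness to replay the self-join query: each row's group is bounded
# by the next entry whose product differs (NULL for the last group).
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE ProductEntries (
    product_entry INT NOT NULL PRIMARY KEY,
    product_code CHAR(1) NOT NULL,
    product_quality INT NOT NULL)""")
con.executemany(
    "INSERT INTO ProductEntries VALUES (?, ?, ?)",
    [(1, "A", 80), (2, "A", 70), (3, "A", 80),
     (4, "B", 60), (5, "B", 90), (6, "C", 80),
     (7, "D", 80), (8, "A", 50), (9, "C", 70)],
)

sql = """
SELECT PR.product_code, PR.start_product_entry,
       MAX(P.product_entry) AS end_product_entry,
       AVG(P.product_quality) AS avg_product_quality
FROM (SELECT MIN(PE.product_entry) AS start_product_entry,
             PE.next_product_entry AS end_product_entry,
             PE.product_code
      FROM (SELECT P1.product_entry, P1.product_code,
                   MIN(P2.product_entry) AS next_product_entry
            FROM ProductEntries AS P1
            LEFT OUTER JOIN ProductEntries AS P2
              ON P2.product_entry > P1.product_entry
             AND P2.product_code <> P1.product_code
            GROUP BY P1.product_entry, P1.product_code) AS PE
      GROUP BY PE.product_code, PE.next_product_entry) AS PR
INNER JOIN ProductEntries AS P
   ON P.product_code = PR.product_code
  AND P.product_entry >= PR.start_product_entry
  AND (PR.end_product_entry IS NULL
       OR P.product_entry < PR.end_product_entry)
GROUP BY PR.product_code, PR.start_product_entry
ORDER BY PR.start_product_entry
"""
for row in con.execute(sql):
    print(row)
```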

--
JAG
Jul 23 '05 #3
Sure, that may work as well.

"John Gilson" <ja*@acm.org> wrote in message
news:5z******** ***********@twi ster.nyc.rr.com ...
Jul 23 '05 #4
That is too much work! Let's move the average calculation into a
scalar subquery that will be done last, after all the clusters are
found. The little-used = ALL predicate can replace a lot of your
logic. And we pull up the usual Sequence auxiliary table.

SELECT prod_code, MIN(start) AS start, finish,
       (SELECT AVG(CAST(prod_quality AS DECIMAL(8,4)))
        FROM ProductEntries AS P3
        WHERE P3.prod_entry
              BETWEEN MIN(start)
              AND X.finish) AS avg_quality
FROM (SELECT P1.prod_code, S1.seq, MAX(S2.seq) AS finish
      FROM ProductEntries AS P1, Sequence AS S1, Sequence AS S2
      WHERE S1.seq <= S2.seq
        AND S2.seq <= (SELECT MAX(prod_entry) FROM ProductEntries)
        AND P1.prod_code
            = ALL (SELECT P2.prod_code
                   FROM ProductEntries AS P2
                   WHERE P2.prod_entry BETWEEN S1.seq AND S2.seq)
      GROUP BY P1.prod_code, S1.seq)
     AS X (prod_code, start, finish)
GROUP BY prod_code, finish;

Another version requires two sentinel values
--
INSERT INTO ProductEntries VALUES (0, '?', 0);
INSERT INTO ProductEntries VALUES (10, '?', 0);

SELECT DISTINCT P1.prod_code, S1.seq AS start, S2.seq AS finish,
       (SELECT AVG(CAST(prod_quality AS DECIMAL(8,4)))
        FROM ProductEntries AS P3
        WHERE P3.prod_entry
              BETWEEN S1.seq AND S2.seq) AS avg_quality
FROM ProductEntries AS P1,
     (SELECT seq FROM Sequence
      UNION ALL SELECT 0) AS S1, Sequence AS S2
WHERE S1.seq <= S2.seq
  AND S2.seq <= (SELECT MAX(prod_entry) + 1 FROM ProductEntries)
  AND P1.prod_code
      <> (SELECT P3.prod_code
          FROM ProductEntries AS P3
          WHERE P3.prod_entry = S1.seq - 1)
  AND P1.prod_code
      <> (SELECT P4.prod_code
          FROM ProductEntries AS P4
          WHERE P4.prod_entry = S2.seq + 1)
  AND P1.prod_code
      = ALL (SELECT P2.prod_code
             FROM ProductEntries AS P2
             WHERE P2.prod_entry BETWEEN S1.seq AND S2.seq);

Jul 23 '05 #5
"--CELKO--" <jc*******@eart hlink.net> wrote in message
news:11******** **************@ c13g2000cwb.goo glegroups.com.. .

Less work? Debatable. Also, this won't work if the product_entry values
aren't consecutive.

--
JAG
Jul 23 '05 #6
>> Less work? Debatable. <<

Fewer nesting levels should be a bit faster. But trying to find the
start and finish points is going to get really bad as the number of rows
increases.

>> Also, this won't work if the product_entry values aren't consecutive. <<

It depends on the sequence of tests having no gaps.

This is one that might be better done with a cursor and a WHILE loop
that accumulates a count and total of each quality test to a working
table.
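The cursor-and-WHILE-loop idea can be outlined outside SQL. This plain-Python version (an illustration, not from the thread) walks the entries in order, accumulating a count and total per cluster the way a T-SQL loop would into a working table.

```python
# Single pass over the entries in order: start a new cluster whenever
# the product changes, otherwise accumulate into the current one.
rows = [
    (1, "A", 80), (2, "A", 70), (3, "A", 80),
    (4, "B", 60), (5, "B", 90), (6, "C", 80),
    (7, "D", 80), (8, "A", 50), (9, "C", 70),
]

working = []  # one [product, count, total] row per cluster
prev_product = None
for entry, product, quality in rows:
    if product != prev_product:
        working.append([product, 0, 0])  # open a new cluster
        prev_product = product
    working[-1][1] += 1
    working[-1][2] += quality

for product, n, total in working:
    print(product, total / n)
```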

--CELKO--
Please post DDL in a human-readable format and not a machine-generated
one. This way people do not have to guess what the keys, constraints,
Declarative Referential Integrity, datatypes, etc. in your schema are.
Sample data is also a good idea, along with clear specifications.
*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!
Jul 23 '05 #7
"--CELKO--" <re************ **@earthlink.ne t> wrote in message
news:1102630139 .1f7348d9b8d0e1 527f37d16587e3c ecc@teranews...


You could be right but bite your tongue!

--
JAG

Jul 23 '05 #8
Actually, if this were an ongoing query, performance-wise it would be best
to have a trigger or "phase shift" grouping id set as part of the insert
operation.

"John Gilson" <ja*@acm.org> wrote in message
news:bo******** ***********@twi ster.nyc.rr.com ...
"--CELKO--" <re************ **@earthlink.ne t> wrote in message
news:1102630139 .1f7348d9b8d0e1 527f37d16587e3c ecc@teranews...
>> Less work? Debatable. <<


Fewer nesting levels should be a bit faster. But trying to find the
start and finish points is going to get really bad as the number of row
increases.
>> Also, this won't work if the product_entry values

aren't consecutive. <<

It depends on the sequence of tests having no gaps.

This is one that might be better done with a cursor and a WHILE loop
that accumulates a count and total of each quality test to a working
table.


You could be right but bite your tongue!

--
JAG
--CELKO--
Please post DDL in a human-readable format and not a machne-generated
one. This way people do not have to guess what the keys, constraints,
Declarative Referential Integrity, datatypes, etc. in your schema are.
Sample data is also a good idea, along with clear specifications.
*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!


Jul 23 '05 #9
>> Actually if this was an ongoing query: performance-wise it would be
best to have a trigger or "phase shift" grouping id set as part of the
insert operation. <<

My impulse is for a "cluster group number" column as each test is done.
Look to see if the current quality test is on the same product as the
most recent one, etc.
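That insert-time idea might look like this in outline (a hypothetical sketch with invented names, not code from the thread): compare the incoming row's product to the most recent row and either reuse or increment a cluster group number.

```python
# Assign a cluster group number as each quality test is recorded:
# same product as the most recent row -> same group, else a new group.
table = []  # rows of (entry, product, quality, group_id)

def insert_entry(entry, product, quality):
    if table and table[-1][1] == product:
        group_id = table[-1][3]                    # continue the cluster
    else:
        group_id = (table[-1][3] + 1) if table else 1  # open a new one
    table.append((entry, product, quality, group_id))

for e, p, q in [(1, "A", 80), (2, "A", 70), (3, "A", 80),
                (4, "B", 60), (5, "B", 90), (6, "C", 80),
                (7, "D", 80), (8, "A", 50), (9, "C", 70)]:
    insert_entry(e, p, q)

print([r[3] for r in table])  # group ids: 1,1,1,2,2,3,4,5,6
```

With the group id materialized at insert time, the averages reduce to a plain GROUP BY on that column, which is the performance argument being made here.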

--CELKO--
Jul 23 '05 #10

This thread has been closed and replies have been disabled.

