473,326 Members | 2,168 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,326 software developers and data experts.

OLAP windowing legerdemain

Friends:

Say I have a simple table like so:

REGION (CHAR(3)
SSN CHAR(9)
ID SMALLINT

With the following data:

REGION SSN ID
------ --------- --
002 555112222 2
008 555112222 8
038 555112222 27
098 111223333 89
016 111223333 14

I'd like to find pair-wise duplicates based on SSN, yielding a result
set like so:

REGION_1 ID_1 SSN REGION_2 ID_2
-------- ---- ------- -------- ----
002 2 555112222 008 8
008 8 555112222 038 27
038 27 555112222 002 2
098 89 111223333 016 14

Now, I could certainly do a self-join (and may yet have to) WHERE ID !=
ID AND SSN = SSN, but this has at least two drawbacks: 1) two passes
through the access path, and 2) I'd get matches and their (redundant)
reciprocals, though on this latter point I'm OK with a little
post-filtering, algorithms for this being available (e.g., those by
Kyte and Molinaro).

What I'm hoping is that among you is someone who knows of a usage of
the OLAP windowing functions for just such an occasion.

My first attempt (below) seems OK for IDs, but generates reciprocals
and will get messy when I try to incoporate the correct corresponding
REGIONs.

WITH
SSNS
(
REGION,
SSN,
ID
)
AS
(
VALUES
('002','555112222', 2),
('008','555112222', 8),
('038','555112222', 27),
('098','111223333', 89),
('016','111223333', 14)
)
SELECT
ID ID_1,
CASE WHEN ID = ID1 THEN ID2 ELSE ID1 END ID_2
FROM
(
SELECT
ID,
MAX(ID) OVER (PARTITION BY SSN ORDER BY ID ROWS BETWEEN CURRENT ROW
AND 1 FOLLOWING) AS ID1,
MIN(ID) OVER (PARTITION BY SSN ORDER BY ID) AS ID2
FROM
SSNS A
) X

ID1 ID2
----------- -----------
14 89
89 14
2 8
8 27
27 2

Regards,

--Jeff

Oct 9 '06 #1
3 1601

jefftyzzer wrote:
Friends:

Say I have a simple table like so:

REGION (CHAR(3)
SSN CHAR(9)
ID SMALLINT

With the following data:

REGION SSN ID
------ --------- --
002 555112222 2
008 555112222 8
038 555112222 27
098 111223333 89
016 111223333 14

I'd like to find pair-wise duplicates based on SSN, yielding a result
set like so:

REGION_1 ID_1 SSN REGION_2 ID_2
-------- ---- ------- -------- ----
002 2 555112222 008 8
008 8 555112222 038 27
038 27 555112222 002 2
098 89 111223333 016 14
This should do the trick:

with region (REGION, SSN, ID) as (values
('002', '555112222', 2),
('008', '555112222', 8),
('038', '555112222', 27),
('098', '111223333', 89),
('016', '111223333', 14)),
intermediate_result (region, id1, ssn, id2, row, id_count) as
(select region, id as id1, ssn, coalesce(max(id) over(partition by ssn
order by id rows between 1 following and 1 following), min(id)
over(partition by ssn)) as id2, row_number() over(partition by ssn) as
row, sum(1) over(partition by ssn) as id_count
from region
order by ssn, id)
select region, id1, ssn, id2 from intermediate_result
where id_count 1 and (id_count 2 or row = 1)
REGION ID1 SSN ID2
====== === === ===
016 14 111223333 89
002 2 555112222 8
008 8 555112222 27
038 27 555112222 2

-Chris

Oct 10 '06 #2
VERY nice! Thanks, Chris--this is a huge help. I appreciate your
spending time on this.

--Jeff

ChrisC wrote:
jefftyzzer wrote:
Friends:

Say I have a simple table like so:

REGION (CHAR(3)
SSN CHAR(9)
ID SMALLINT

With the following data:

REGION SSN ID
------ --------- --
002 555112222 2
008 555112222 8
038 555112222 27
098 111223333 89
016 111223333 14

I'd like to find pair-wise duplicates based on SSN, yielding a result
set like so:

REGION_1 ID_1 SSN REGION_2 ID_2
-------- ---- ------- -------- ----
002 2 555112222 008 8
008 8 555112222 038 27
038 27 555112222 002 2
098 89 111223333 016 14

This should do the trick:

with region (REGION, SSN, ID) as (values
('002', '555112222', 2),
('008', '555112222', 8),
('038', '555112222', 27),
('098', '111223333', 89),
('016', '111223333', 14)),
intermediate_result (region, id1, ssn, id2, row, id_count) as
(select region, id as id1, ssn, coalesce(max(id) over(partition by ssn
order by id rows between 1 following and 1 following), min(id)
over(partition by ssn)) as id2, row_number() over(partition by ssn) as
row, sum(1) over(partition by ssn) as id_count
from region
order by ssn, id)
select region, id1, ssn, id2 from intermediate_result
where id_count 1 and (id_count 2 or row = 1)
REGION ID1 SSN ID2
====== === === ===
016 14 111223333 89
002 2 555112222 8
008 8 555112222 27
038 27 555112222 2

-Chris
Oct 10 '06 #3
To display Region2, it will be better to JOIN Regin table. This also
remove unneccesary rows.

-------------------- Commands Entered ------------------------------
WITH region (REGION, SSN, ID) as (values
('002', '555112222', 2),
('008', '555112222', 8),
('038', '555112222', 27),
('098', '111223333', 89),
('016', '111223333', 14)
)
SELECT region_1, id_1, S2.ssn, R.region AS region_2, id_2
FROM (SELECT region_1, id_1, ssn
, NULLIF(COALESCE(idf,idmin),idp) AS id_2
FROM (SELECT region AS region_1, id AS id_1, ssn
, MAX(id) OVER(PARTITION BY ssn ORDER BY id
ROWS BETWEEN 1 FOLLOWING
AND 1 FOLLOWING) AS idf
, MAX(id) OVER(PARTITION BY ssn ORDER BY id
ROWS BETWEEN 1 PRECEDING
AND 1 PRECEDING) AS idp
, MIN(id) OVER(PARTITION BY ssn) AS idmin
FROM region
) S1
) S2
INNER JOIN
region R
ON R.id = S2.id_2
AND R.ssn = S2.ssn
;
---------------------------------------------------------------------
REGION_1 ID_1 SSN REGION_2 ID_2
-------- ----------- --------- -------- -----------
016 14 111223333 098 89
002 2 555112222 008 8
008 8 555112222 038 27
038 27 555112222 002 2

4 record(s) selected.

Oct 18 '06 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

7
by: Will | last post by:
On the subject of Data Warehouses, Data Cubes & OLAP…. I would like to speak frankly about Data Warehouses, Data Cubes and OLAP (on-line analytical processing). Has it dawned on anyone else...
0
by: DD | last post by:
Hi Guys! Just would like to share with you my experiense in this matter. I was trying to evaluate how suitable Oracle OLAP for our applications. As probably you did, I have downloaded from OTN...
5
by: Framework fan | last post by:
Hello, If I wrote the next ebay (yes I know, yawn-snore) and I had a database with 5 million auction items in it, what would be a really good strategy to get a search done very quickly? Would...
2
by: Gadi Refaeli | last post by:
Hello All, We are currently considering DB purchases for a new system, we are looking at Cognos, Oracle, DB2 and Essbase. We came across some questions regarding Essbase and DB2 OLAP. 1. Are...
3
by: Eduardo Quiroz Salinas | last post by:
do someone knows where can i get a good tutorial or how to make OLAP cubes.???? thanx a lot -- Linux user number 344659 "...Los que no requieren de un dios para ser virtuosos, son la...
3
by: George Sakkis | last post by:
After a brief search, I didn't find any python package related to OLAP and pivot tables. Did I miss anything ? To be more precise, I'm not so interested in a full-blown OLAP server with an RDBMS...
0
by: YellowFin Announcements | last post by:
Yellowfin Reporting Announces Release 3 OLAP Connectivity New Features Including OLAP-to-Relational Drill Through Provide Customers with One Complete Web BI Tool for OLAP Analysis Yellowfin,...
0
by: YellowFin | last post by:
Yellowfin Announces Release 3 OLAP Connectivity New Features Including OLAP-to-Relational Drill Through Provide Customers with One Complete Web BI Tool for OLAP Analysis Yellowfin, today...
0
by: jefftyzzer | last post by:
Friends: In v8 or v9 LUW, if I were to write (in Java or C) my own user-defined aggregate function, would I be able to use it in an OLAP windowing clause as I could use, e.g., COUNT or SUM? ...
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: jfyes | last post by:
As a hardware engineer, after seeing that CEIWEI recently released a new tool for Modbus RTU Over TCP/UDP filtering and monitoring, I actively went to its official website to take a look. It turned...
0
by: ArrayDB | last post by:
The error message I've encountered is; ERROR:root:Error generating model response: exception: access violation writing 0x0000000000005140, which seems to be indicative of an access violation...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
1
by: Defcon1945 | last post by:
I'm trying to learn Python using Pycharm but import shutil doesn't work
1
by: Shællîpôpï 09 | last post by:
If u are using a keypad phone, how do u turn on JavaScript, to access features like WhatsApp, Facebook, Instagram....
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.