By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
424,694 Members | 2,015 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 424,694 IT Pros & Developers. It's quick & easy.

SQL query - duplicate records - different dates - how to get only latest information?

P: n/a
I have a SQL query I need to design to select name and email addresses
for policies that are due and not renewed in a given time period. The
problem is, the database keeps the information for every renewal in
the history of the policyholder.
The information is in 2 tables, policy and customer, which share the
custid data. The polno changes with every renewal Renewals in 2004
would be D, 2005 S, and 2006 L. polexpdates for a given customer could
be 2007-03-21, 2006-03-21, 2005-03-21, and 2004-09-21, with polno of
1234 (original policy), 1234D (renewal in 2004), 1234S (renewal in
2005), and 1235L (renewed in 2006).
The policy is identified in trantype as either 'rwl' for renewal, or
'nbs' for new business.

The policies would have poleffdates of 2004-03-21 (original 6 month
policy) 2004-09-21 (first 6 month renewal) , 2005-03-21 (2nd renewal,
1 year), 2006-03-21(3rd renewal, 1 yr).

I want ONLY THE LATEST information, and keep getting early
information.
My current query structure is:
select c.lastname, c.email, p.polno, p.polexpdate
from policy p, customer c
where p.polid = c.polid
and p.polexpdate between '2006-03-01 and 2006-03-31
and p.polno like '1234%s'
and p.trantype like 'rwl'
and c.email is not null

union

select c.lastname, c.email, p.polno, p.polexpdate
from policy p, customer c
where p.polid = c.polid
and p.polexpdate between '2006-03-01 and 2006-03-31
and p.polno like '1234%'
and p.trantype like 'nbs'
and c.email is not null

How do I make this query give me ONLY the polno 123%, or 123%S
information, and not give me the information on policies that ALSO
have 123%L policies, and/ or renewal dates after 2006-03-31?

Adding a 'and not polexpdate > 2006-03-31' does not work.

I am working with SQL SERVER 2003. Was using SQL Server 7, but found
it was too restrictive, and I had a valid 2003 licence, so I upgraded,
and still could not do it (after updating the syntax - things like
using single quotes instead of double, etc)

I keep getting those policies that were due in the stated range and
HAVE been renewed as well as those which have not. I need to get only
those which have NOT been renewed, and I cannot modify the database in
any way.
*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 17 '06 #1
Share this Question
Share on Google+
24 Replies


P: n/a
There is no such thing as SQL Server 2003....do you mean 2000 or 2005?

You have a lot of questions here, I'll take a stab at a couple of them
and then we'll go from there.

Structure and data would have helped me give you a MUCH better answer.
I'm going to use a simpler model.

CREATE TABLE Policy (
PolID int,
PolNo nvarchar(30),
PolExpDate datetime,
TranType nvarchar(30)
)

CREATE TABLE Customer (
Lastname nvarchar(30),
Email nvarchar(30),
PolID int
)

INSERT INTO Customer VALUES ('Abby', 'a@a.com', 1)
INSERT INTO Customer VALUES ('Brad', 'b@b.com.com', 2)
INSERT INTO Customer VALUES ('Chris', NULL, 3)

INSERT INTO Policy VALUES (1, '1234', '20030101', 'nbs')
INSERT INTO Policy VALUES (1, '1234D', '20040101', 'rwl')
INSERT INTO Policy VALUES (1, '1234S', '20050101', 'rwl')
INSERT INTO Policy VALUES (1, '1234L', '20060101', 'rwl')

INSERT INTO Policy VALUES (2, '1234', '20030101', 'nbs')
INSERT INTO Policy VALUES (2, '1234D', '20040101', 'rwl')
INSERT INTO Policy VALUES (2, '1234S', '20050101', 'rwl')
INSERT INTO Policy VALUES (2, '1234L', '20060101', 'rwl')

INSERT INTO Policy VALUES (3, '1234', '20030101', 'nbs')
INSERT INTO Policy VALUES (3, '1234D', '20040101', 'rwl')
INSERT INTO Policy VALUES (3, '1234S', '20050101', 'rwl')
INSERT INTO Policy VALUES (3, '1234L', '20060101', 'rwl')

SELECT * FROM Policy P
JOIN Customer C ON C.PolID = P.PolID
WHERE P.PolExpDate =
(SELECT TOP 1 P2.PolExpDate
FROM Policy P2
WHERE P2.PolID = C.PolID
ORDER BY P2.PolExpDate DESC
)
AND NOT C.Email IS NULL
AND (P.TranType IN ('nbs', 'rwl'))

Note that you probably should have a TransTypes table so that you
aren't storing strings ('rwl', 'nbs') in your policy table over and
over.

You also have no need for unions here.

If you want specific types, you can use ORs or the like.

I'm not sure what you want with the dates in the where clause. Do you
want only records that have their most recent event between two dates?

Also note, there are probably a hundred ways to do this...this is
unlikely to be the fastest. Without really knowing what you want
though...

Mar 17 '06 #2

P: n/a
On 17 Mar 2006 12:49:52 -0800, "figital" <mh****@gmail.com> wrote:
There is no such thing as SQL Server 2003....do you mean 2000 or 2005?

You have a lot of questions here, I'll take a stab at a couple of them
and then we'll go from there.

Structure and data would have helped me give you a MUCH better answer.
I'm going to use a simpler model.

CREATE TABLE Policy (
PolID int,
PolNo nvarchar(30),
PolExpDate datetime,
TranType nvarchar(30)
)

CREATE TABLE Customer (
Lastname nvarchar(30),
Email nvarchar(30),
PolID int
)

INSERT INTO Customer VALUES ('Abby', 'a@a.com', 1)
INSERT INTO Customer VALUES ('Brad', 'b@b.com.com', 2)
INSERT INTO Customer VALUES ('Chris', NULL, 3)

INSERT INTO Policy VALUES (1, '1234', '20030101', 'nbs')
INSERT INTO Policy VALUES (1, '1234D', '20040101', 'rwl')
INSERT INTO Policy VALUES (1, '1234S', '20050101', 'rwl')
INSERT INTO Policy VALUES (1, '1234L', '20060101', 'rwl')

INSERT INTO Policy VALUES (2, '1234', '20030101', 'nbs')
INSERT INTO Policy VALUES (2, '1234D', '20040101', 'rwl')
INSERT INTO Policy VALUES (2, '1234S', '20050101', 'rwl')
INSERT INTO Policy VALUES (2, '1234L', '20060101', 'rwl')

INSERT INTO Policy VALUES (3, '1234', '20030101', 'nbs')
INSERT INTO Policy VALUES (3, '1234D', '20040101', 'rwl')
INSERT INTO Policy VALUES (3, '1234S', '20050101', 'rwl')
INSERT INTO Policy VALUES (3, '1234L', '20060101', 'rwl')

SELECT * FROM Policy P
JOIN Customer C ON C.PolID = P.PolID
WHERE P.PolExpDate =
(SELECT TOP 1 P2.PolExpDate
FROM Policy P2
WHERE P2.PolID = C.PolID
ORDER BY P2.PolExpDate DESC
)
AND NOT C.Email IS NULL
AND (P.TranType IN ('nbs', 'rwl'))

Note that you probably should have a TransTypes table so that you
aren't storing strings ('rwl', 'nbs') in your policy table over and
over.

You also have no need for unions here.

If you want specific types, you can use ORs or the like.

I'm not sure what you want with the dates in the where clause. Do you
want only records that have their most recent event between two dates?

Also note, there are probably a hundred ways to do this...this is
unlikely to be the fastest. Without really knowing what you want
though...

OK, here is the scoop.
The program that creates and uses the database does not allow us to
select only the policies that are due and nor yet renewed in a given
time period and return the e-mail address.

We need this functionality to email policy reminders with a link to
the e-commerce page that allows them to renew online. This will save
mailing notices to those policyholders that have e-mail and deal with
the company in that manner.

There are two tables involved - and they can NOT be modified in any
way - we are stuck with them.

The customer table has all kinds of customer information fields - over
half of which are not used at all. Among them are the lastname (which
has first and last name in it), email address,policy-id, and a lot of
stuff we don't need to know.

The policy information table has the policy-id, the policy number, the
renewal date, and the effective date, as well as the transaction type.
The policy number starts with a particular numeric pattern, which
identifies the particular policies we are interested in. The rest of
the policy number identifies the actual policy, and starts out as a
numeric string, like 3215.
When it is renewed, it is saved with a suffix letter. Might be d for
2005, s for 2005, and l for 2006 (I think that is correct - but you
get the idea). Nothing is EVER deleted from this database.

SO - John Doe has policy 12345-123 issued Mar 1 2003, due Mar 1 2004.
John Doe's policy is renewed in Feb 2004, and becomes 12345-123d, due
Mar 1, 2005.
It is renewed and becomes 12345-123s, due Mar 2006
It is renewed Mar 2006 and becomes 12345-123L, due Mar 1 2007.

Jane Smith has a policy #12345-923 issued Mar 9, 2005, due Mar 9,
2006.
IT is renewed and becomes 12345-923s, due Mar 9, 2006

I need to do a search by date and policy number to find Jane Smith,
and others like her, while not finding John Doe, and others like him,
in Feb 2006 to remind them to renew their policies.
Don't want to bother those who have already looked after theirs. Don't
want to miss the one that's going to forget without a reminder.

I WAS using the union because I was breaking it down looking for only
renewals with a S at the end of the policy number, eliminating the
1993, 1994, and 1995 renewals - and specifying poltype "rwl" so we
were only looking at renewals.
Then I was adding the policies that were new business "nbs", and had
not yet had a renewal, so had no 'S' on the end.

That was the theory behind the construct.

I know we can likely get rid of that complication IF we can figure out
how to get only the LATEST renewal date - using the capabilities of
SQL Server 2003. Have not seen any documentation on "top", so am
unaware of if and how it works.
*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 18 '06 #3

P: n/a
(clare at snyder.on.ca) writes:
I want ONLY THE LATEST information, and keep getting early
information.
My current query structure is:
select c.lastname, c.email, p.polno, p.polexpdate
from policy p, customer c
where p.polid = c.polid
and p.polexpdate between '2006-03-01 and 2006-03-31
and p.polno like '1234%s'
and p.trantype like 'rwl'
and c.email is not null

union

select c.lastname, c.email, p.polno, p.polexpdate
from policy p, customer c
where p.polid = c.polid
and p.polexpdate between '2006-03-01 and 2006-03-31
and p.polno like '1234%'
and p.trantype like 'nbs'
and c.email is not null

How do I make this query give me ONLY the polno 123%, or 123%S
information, and not give me the information on policies that ALSO
have 123%L policies, and/ or renewal dates after 2006-03-31?

Adding a 'and not polexpdate > 2006-03-31' does not work.
It appears that the data you have is very messy, with various pieces
of information built into the policy number. Since I don't know the
tables in detail, and only have your narrative to work from, this
is a bit of guesswork:

SELECT c.lastname, c.email, p.polno, p.polexpdate
FROM customer c
JOIN (SELECT polid, polexpdate = MAX(polexpdate)
FROM policies
WHERE polno like '1234%'
GROUP BY polid) AS p1 ON p1.polid = c.polid
JOIN policies p ON p.polid = p1.polid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060301' AND '20060331'
AND c.email IS NOT NULL

First, I have assumed that the polid does not change over renewals, but
that is plain guess on my part.

The inner SELECT gives the most recent expiration date for each polid,
and then I filter to see only those with expiration in March 2006.

If this does not address your problem, I would suggest that you post:

o CREATE TABLE statements for your tables. Don't forget to include
defintion of primary keys and foreign keys.
o INSERT statemetns with sample data.
o The desired result given the sample.
I am working with SQL SERVER 2003. Was using SQL Server 7, but found
it was too restrictive, and I had a valid 2003 licence, so I upgraded,
and still could not do it (after updating the syntax - things like
using single quotes instead of double, etc)
Please clarify if you are working with SQL 2000 or SQL 2005. There is
no such thing as SQL 2003.
I know we can likely get rid of that complication IF we can figure out
how to get only the LATEST renewal date - using the capabilities of
SQL Server 2003. Have not seen any documentation on "top", so am
unaware of if and how it works.


You can read about TOP in Books Online. By the way, it was available in
SQL 7 as well.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Mar 18 '06 #4

P: n/a
On Sat, 18 Mar 2006 16:05:54 +0000 (UTC), Erland Sommarskog
<es****@sommarskog.se> wrote:
(clare at snyder.on.ca) writes:
I want ONLY THE LATEST information, and keep getting early
information.
My current query structure is:
select c.lastname, c.email, p.polno, p.polexpdate
from policy p, customer c
where p.polid = c.polid
and p.polexpdate between '2006-03-01 and 2006-03-31
and p.polno like '1234%s'
and p.trantype like 'rwl'
and c.email is not null

union

select c.lastname, c.email, p.polno, p.polexpdate
from policy p, customer c
where p.polid = c.polid
and p.polexpdate between '2006-03-01 and 2006-03-31
and p.polno like '1234%'
and p.trantype like 'nbs'
and c.email is not null

How do I make this query give me ONLY the polno 123%, or 123%S
information, and not give me the information on policies that ALSO
have 123%L policies, and/ or renewal dates after 2006-03-31?

Adding a 'and not polexpdate > 2006-03-31' does not work.


It appears that the data you have is very messy, with various pieces
of information built into the policy number. Since I don't know the
tables in detail, and only have your narrative to work from, this
is a bit of guesswork:

SELECT c.lastname, c.email, p.polno, p.polexpdate
FROM customer c
JOIN (SELECT polid, polexpdate = MAX(polexpdate)
FROM policies
WHERE polno like '1234%'
GROUP BY polid) AS p1 ON p1.polid = c.polid
JOIN policies p ON p.polid = p1.polid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060301' AND '20060331'
AND c.email IS NOT NULL

First, I have assumed that the polid does not change over renewals, but
that is plain guess on my part.

The inner SELECT gives the most recent expiration date for each polid,
and then I filter to see only those with expiration in March 2006.

If this does not address your problem, I would suggest that you post:

o CREATE TABLE statements for your tables. Don't forget to include
defintion of primary keys and foreign keys.
o INSERT statemetns with sample data.
o The desired result given the sample.
I am working with SQL SERVER 2003. Was using SQL Server 7, but found
it was too restrictive, and I had a valid 2003 licence, so I upgraded,
and still could not do it (after updating the syntax - things like
using single quotes instead of double, etc)


Please clarify if you are working with SQL 2000 or SQL 2005. There is
no such thing as SQL 2003.
I know we can likely get rid of that complication IF we can figure out
how to get only the LATEST renewal date - using the capabilities of
SQL Server 2003. Have not seen any documentation on "top", so am
unaware of if and how it works.


You can read about TOP in Books Online. By the way, it was available in
SQL 7 as well.

It is SQL 2000 - just checked.

I will try your code and see what happens.
Thanks.
*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 18 '06 #5

P: n/a

On Sat, 18 Mar 2006 16:05:54 +0000 (UTC), Erland Sommarskog
<es****@sommarskog.se> wrote:
I have a few syntax questions. I will insert them inline.
It appears that the data you have is very messy, with various pieces
of information built into the policy number. Since I don't know the
tables in detail, and only have your narrative to work from, this
is a bit of guesswork:

SELECT c.lastname, c.email, p.polno, p.polexpdate
FROM customer c
Don't I need 'policies p' in here too? JOIN (SELECT polid, polexpdate = MAX(polexpdate)
FROM policies
WHERE polno like '1234%'
and p.polno insted of polno? GROUP BY polid) AS p1 ON p1.polid = c.polid
And what is the 'BY polid)' and the 'AS p1 ON p1.polid' JOIN policies p ON p.polid = p1.polid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060301' AND '20060331'
Actual date info is " between '2006-03-01 00:00:00.000 and 2006-03-31
00:00:00.000' " AND c.email IS NOT NULL

First, I have assumed that the polid does not change over renewals, but
that is plain guess on my part.
I think your guess is correct, from what I've seen.
The inner SELECT gives the most recent expiration date for each polid,
and then I filter to see only those with expiration in March 2006.

If this does not address your problem, I would suggest that you post:
I think your methodology may work, if we get the syntax straightened
out.
o CREATE TABLE statements for your tables. Don't forget to include
defintion of primary keys and foreign keys.
o INSERT statemetns with sample data.
o The desired result given the sample.
I am working with SQL SERVER 2003. Was using SQL Server 7, but found
it was too restrictive, and I had a valid 2003 licence, so I upgraded,
and still could not do it (after updating the syntax - things like
using single quotes instead of double, etc)


Please clarify if you are working with SQL 2000 or SQL 2005. There is
no such thing as SQL 2003.
I know we can likely get rid of that complication IF we can figure out
how to get only the LATEST renewal date - using the capabilities of
SQL Server 2003. Have not seen any documentation on "top", so am
unaware of if and how it works.


You can read about TOP in Books Online. By the way, it was available in
SQL 7 as well.


*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 18 '06 #6

P: n/a
On Sat, 18 Mar 2006 16:05:54 +0000 (UTC), Erland Sommarskog
<es****@sommarskog.se> wrote:
(clare at snyder.on.ca) writes:
I want ONLY THE LATEST information, and keep getting early
information.
My current query structure is:
select c.lastname, c.email, p.polno, p.polexpdate
from policy p, customer c
where p.polid = c.polid
and p.polexpdate between '2006-03-01 and 2006-03-31
and p.polno like '1234%s'
and p.trantype like 'rwl'
and c.email is not null

union

select c.lastname, c.email, p.polno, p.polexpdate
from policy p, customer c
where p.polid = c.polid
and p.polexpdate between '2006-03-01 and 2006-03-31
and p.polno like '1234%'
and p.trantype like 'nbs'
and c.email is not null

How do I make this query give me ONLY the polno 123%, or 123%S
information, and not give me the information on policies that ALSO
have 123%L policies, and/ or renewal dates after 2006-03-31?

Adding a 'and not polexpdate > 2006-03-31' does not work.


It appears that the data you have is very messy, with various pieces
of information built into the policy number. Since I don't know the
tables in detail, and only have your narrative to work from, this
is a bit of guesswork:

SELECT c.lastname, c.email, p.polno, p.polexpdate
FROM customer c
JOIN (SELECT polid, polexpdate = MAX(polexpdate)
FROM policies
WHERE polno like '1234%'
GROUP BY polid) AS p1 ON p1.polid = c.polid
JOIN policies p ON p.polid = p1.polid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060301' AND '20060331'
AND c.email IS NOT NULL

First, I have assumed that the polid does not change over renewals, but
that is plain guess on my part.

The inner SELECT gives the most recent expiration date for each polid,
and then I filter to see only those with expiration in March 2006.

If this does not address your problem, I would suggest that you post:

o CREATE TABLE statements for your tables. Don't forget to include
defintion of primary keys and foreign keys.
o INSERT statemetns with sample data.
o The desired result given the sample.
I am working with SQL SERVER 2003. Was using SQL Server 7, but found
it was too restrictive, and I had a valid 2003 licence, so I upgraded,
and still could not do it (after updating the syntax - things like
using single quotes instead of double, etc)


Please clarify if you are working with SQL 2000 or SQL 2005. There is
no such thing as SQL 2003.
I know we can likely get rid of that complication IF we can figure out
how to get only the LATEST renewal date - using the capabilities of
SQL Server 2003. Have not seen any documentation on "top", so am
unaware of if and how it works.


You can read about TOP in Books Online. By the way, it was available in
SQL 7 as well.

Using just the inner SELECT, written as :
SELECT polid. polexpdate =MAX (polexpdate)
FROM afw_basicpolinfo p WHERE p.polno like '1932700%'
GROUP BY polid AS p1 ON p1.polid-p1 p1 polid
Join afw_basicpolinfo p on p.polid = p1.polid
AND p.polexpdate=p1.polexpdate
WHERE p.polexpdate between '2006-04-15 00:00:00.000 and 2006-04-31
00:00:00.000'
and c.email IS NOT NULL
I get an error:

SERVER: Msg 156, Level 15, Stare 1, Line 3
Incorrect syntax near the keyword "AS".
*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 18 '06 #7

P: n/a
On Sat, 18 Mar 2006 16:05:54 +0000 (UTC), Erland Sommarskog
<es****@sommarskog.se> wrote:

It appears that the data you have is very messy, with various pieces
of information built into the policy number. Since I don't know the
tables in detail, and only have your narrative to work from, this
is a bit of guesswork:

SELECT c.lastname, c.email, p.polno, p.polexpdate
FROM customer c
JOIN (SELECT polid, polexpdate = MAX(polexpdate)
FROM policies
WHERE polno like '1234%'
GROUP BY polid) AS p1 ON p1.polid = c.polid
JOIN policies p ON p.polid = p1.polid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060301' AND '20060331'
AND c.email IS NOT NULL

First, I have assumed that the polid does not change over renewals, but
that is plain guess on my part.

The inner SELECT gives the most recent expiration date for each polid,
and then I filter to see only those with expiration in March 2006.

The "outer select" works if written:

Select c.lastname, c.email, p.polno, p.polexpdate
FROM afw_customer c, afw_basicpolinfo p

If we get the "inner select" working, the JOIN will likely work as
well.

*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 18 '06 #8

P: n/a
On Sat, 18 Mar 2006 16:05:54 +0000 (UTC), Erland Sommarskog
<es****@sommarskog.se> wrote:
(clare at snyder.on.ca) writes:
I want ONLY THE LATEST information, and keep getting early
information.
My current query structure is:
select c.lastname, c.email, p.polno, p.polexpdate
from policy p, customer c
where p.polid = c.polid
and p.polexpdate between '2006-03-01 and 2006-03-31
and p.polno like '1234%s'
and p.trantype like 'rwl'
and c.email is not null

union

select c.lastname, c.email, p.polno, p.polexpdate
from policy p, customer c
where p.polid = c.polid
and p.polexpdate between '2006-03-01 and 2006-03-31
and p.polno like '1234%'
and p.trantype like 'nbs'
and c.email is not null

How do I make this query give me ONLY the polno 123%, or 123%S
information, and not give me the information on policies that ALSO
have 123%L policies, and/ or renewal dates after 2006-03-31?

Adding a 'and not polexpdate > 2006-03-31' does not work.


It appears that the data you have is very messy, with various pieces
of information built into the policy number. Since I don't know the
tables in detail, and only have your narrative to work from, this
is a bit of guesswork:

SELECT c.lastname, c.email, p.polno, p.polexpdate
FROM customer c
JOIN (SELECT polid, polexpdate = MAX(polexpdate)
FROM policies
WHERE polno like '1234%'
GROUP BY polid) AS p1 ON p1.polid = c.polid
JOIN policies p ON p.polid = p1.polid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060301' AND '20060331'
AND c.email IS NOT NULL

First, I have assumed that the polid does not change over renewals, but
that is plain guess on my part.

The inner SELECT gives the most recent expiration date for each polid,
and then I filter to see only those with expiration in March 2006.

If this does not address your problem, I would suggest that you post:

o CREATE TABLE statements for your tables. Don't forget to include
defintion of primary keys and foreign keys.
o INSERT statemetns with sample data.
o The desired result given the sample.
I am working with SQL SERVER 2003. Was using SQL Server 7, but found
it was too restrictive, and I had a valid 2003 licence, so I upgraded,
and still could not do it (after updating the syntax - things like
using single quotes instead of double, etc)


Please clarify if you are working with SQL 2000 or SQL 2005. There is
no such thing as SQL 2003.
I know we can likely get rid of that complication IF we can figure out
how to get only the LATEST renewal date - using the capabilities of
SQL Server 2003. Have not seen any documentation on "top", so am
unaware of if and how it works.


You can read about TOP in Books Online. By the way, it was available in
SQL 7 as well.

And polid needs to be custid. I've remoted into the server today to
get the proper field names and to attempt to run the queries.
The policies table is afw_basicpolinfo, and the customers is
afw_customer.

*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 18 '06 #9

P: n/a
(clare at snyder.on.ca) writes:
SELECT c.lastname, c.email, p.polno, p.polexpdate
FROM customer c
Don't I need 'policies p' in here too?


It's further below. Note that I am using the newer ANSI syntax for
joins,
JOIN (SELECT polid, polexpdate = MAX(polexpdate)
FROM policies
WHERE polno like '1234%'


and p.polno insted of polno?


We're inside a derived table. A derived table is a temp table within the
query, but only conceptually. SQL Server may recast the comuptation
order, and often do. The purpose of the derived table is find the
latest expiration day for each policy holder.
GROUP BY polid) AS p1 ON p1.polid = c.polid


And what is the 'BY polid)'


The end of a GROUP BY clause.
and the 'AS p1 ON p1.polid'


AS p1 establishes an alias for the derived table. The ON clause specifies
on which columns the derived table is to be join with customers.
JOIN policies p ON p.polid = p1.polid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060301' AND '20060331'


Actual date info is " between '2006-03-01 00:00:00.000 and 2006-03-31
00:00:00.000' "


Always use the format YYYYMMDD for date literals, as this date format
is always interpreted the same. The format YYYY-MM-DD fails when the
current language is set to for instance German or British.
First, I have assumed that the polid does not change over renewals, but
that is plain guess on my part.


I think your guess is correct, from what I've seen.


Oh well, if you don't know what is the key in your data, no one here
will be able to help you with part.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Mar 18 '06 #10

P: n/a
On Sat, 18 Mar 2006 20:33:17 +0000 (UTC), Erland Sommarskog
<es****@sommarskog.se> wrote:
(clare at snyder.on.ca) writes:
SELECT c.lastname, c.email, p.polno, p.polexpdate
FROM customer c


Don't I need 'policies p' in here too?


It's further below. Note that I am using the newer ANSI syntax for
joins,
JOIN (SELECT polid, polexpdate = MAX(polexpdate)
FROM policies
WHERE polno like '1234%'


and p.polno insted of polno?


We're inside a derived table. A derived table is a temp table within the
query, but only conceptually. SQL Server may recast the comuptation
order, and often do. The purpose of the derived table is find the
latest expiration day for each policy holder.
GROUP BY polid) AS p1 ON p1.polid = c.polid


And what is the 'BY polid)'


The end of a GROUP BY clause.
and the 'AS p1 ON p1.polid'


AS p1 establishes an alias for the derived table. The ON clause specifies
on which columns the derived table is to be join with customers.
JOIN policies p ON p.polid = p1.polid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060301' AND '20060331'


Actual date info is " between '2006-03-01 00:00:00.000 and 2006-03-31
00:00:00.000' "


Always use the format YYYYMMDD for date literals, as this date format
is always interpreted the same. The format YYYY-MM-DD fails when the
current language is set to for instance German or British.
First, I have assumed that the polid does not change over renewals, but
that is plain guess on my part.


I think your guess is correct, from what I've seen.


Oh well, if you don't know what is the key in your data, no one here
will be able to help you with part.

I am assuming the POLID is the common key. I was told it is. It does
not appear to ever get changed.It seems to work just fine as I had it
written with the exception I cannot get the LATEST date info.

*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 18 '06 #11

P: n/a
On Sat, 18 Mar 2006 20:33:17 +0000 (UTC), Erland Sommarskog
<es****@sommarskog.se> wrote:


Oh well, if you don't know what is the key in your data, no one here
will be able to help you with part.

Eric and others,
Please bear with me.
This SQL thing is one place where I am a real, raw neophyte - but I'm
learning.
Hard to teach an old dog new tricks.

I've figured out all the syntax, and it looks like I have it working,
with one glitch that is NOBODY'S FAULT but mine, because I forgot to
mention it.

There is ONE condition that I forgot to mention in the polno field.
Not only do they add a different suffix for every year's renewal(which
now appears to be a non-issue) but if a policy is cancelled, it can
have CANC at the end, or NOTRENEWED.

I need to filter THAT out, so the cancelled policies are not notified.

What I have working now is:

SELECT distinct c.lastname, c.email, p.polno, p.polexpdate
FROM afw_customer c
JOIN (SELECT custid, polexpdate=MAX(polexpdate)
FROM afw_basicpolinfo
WHERE polno like '1932700%'

GROUP BY custid)AS p1 ON p1.custid=c.custid
JOIN afw_basicpolinfo p ON p.custid = p1.custid
AND p.polexpdate=p1.polexpdate
WHERE p.polexpdate between '20060201'and '20060228'
AND c.email IS NOT NULL

Order by 4,1
If for instance Joe Blow is due within the specified time period, but
has cancelled his policy, Joe Blow will turn up twice in the output,
once as "1932700-1234s", and once as "1932700-123sCANC"

Putting the line in after the "Where polno like" line has no effect.
This has to be simpler than the max date problem.

Thanks in advance, guys.

I have tried putting "and polno not like '1932700%CANC'" in the line
before "AND c.email IS NOT NULL"and I still get the cancelled policy,
but only once - without the CANC on the polno.
*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 19 '06 #12

P: n/a
(clare at snyder.on.ca) writes:
There is ONE condition that I forgot to mention in the polno field.
Not only do they add a different suffix for every year's renewal(which
now appears to be a non-issue) but if a policy is cancelled, it can
have CANC at the end, or NOTRENEWED.


AND polno NOT LIKE '%CANC'
AND polno NOT LIKE '%NOTRENEWED'

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Mar 19 '06 #13

P: n/a
On Sun, 19 Mar 2006 11:06:27 +0000 (UTC), Erland Sommarskog
<es****@sommarskog.se> wrote:
(clare at snyder.on.ca) writes:
There is ONE condition that I forgot to mention in the polno field.
Not only do they add a different suffix for every year's renewal(which
now appears to be a non-issue) but if a policy is cancelled, it can
have CANC at the end, or NOTRENEWED.


AND polno NOT LIKE '%CANC'
AND polno NOT LIKE '%NOTRENEWED'

In my last post I said:
I have tried putting "and polno not like '1932700%CANC'" in the line
before "AND c.email IS NOT NULL"and I still get the cancelled policy,
but only once - without the CANC on the polno.
The cancelled policy still shows up, butas due for renewal, and not
cancelled.

Also said "Putting the line in after the "Where polno like" line has
no effect.", in other words, putting it there does not remove either
the duplicate OR both incidents of the record.

I think I need to somehow do a second query on all cancelled policies
(polno like '%canc") and somehow tell the query to NOT select the
lastname and email addresses for anyone on THAT list. Some sort of an
EXCLUDE function - which I don't understand.

*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 19 '06 #14

P: n/a
(clare at snyder.on.ca) writes:
On Sun, 19 Mar 2006 11:06:27 +0000 (UTC), Erland Sommarskog
<es****@sommarskog.se> wrote:
(clare at snyder.on.ca) writes:
There is ONE condition that I forgot to mention in the polno field.
Not only do they add a different suffix for every year's renewal(which
now appears to be a non-issue) but if a policy is cancelled, it can
have CANC at the end, or NOTRENEWED.


AND polno NOT LIKE '%CANC'
AND polno NOT LIKE '%NOTRENEWED'

In my last post I said:
I have tried putting "and polno not like '1932700%CANC'" in the line
before "AND c.email IS NOT NULL"and I still get the cancelled policy,
but only once - without the CANC on the polno.
The cancelled policy still shows up, butas due for renewal, and not
cancelled.

Also said "Putting the line in after the "Where polno like" line has
no effect.", in other words, putting it there does not remove either
the duplicate OR both incidents of the record.


There is a common recommendation for posts to the SQL Server newsgroups
and that is that you include:

o CREATE TABLE statements for your tables.
o INSERT statemetns with sample data.
o The desired output given the sample.

If you put that effort into your post, it is likely that someone will
make the simple effort of copying and pasting the script into a
query tool to develop a tested solution.

If you make less effort, the person who answer will also make less effort
in his posting, and the answers you get be less accurate.

I would have guessed the NOT LIKE clauses would make it, because if
their latest policy does not have CANC, then I don't understand what
is going on. If there is an earlier cancellation, would that not mean
that they have come back.

Nevertheless, you can add a WHERE NOT EXISTS:

SELECT c.lastname, c.email, p.polno, p.polexpdate
FROM customer c
JOIN (SELECT polid, polexpdate = MAX(polexpdate)
FROM policies
WHERE polno like '1234%'
GROUP BY polid) AS p1 ON p1.polid = c.polid
JOIN policies p ON p.polid = p1.polid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060301' AND '20060331'
AND c.email IS NOT NULL
AND NOT EXISTS (SELECT *
FROM policies p2
WHERE p.polid = p2.polid
AND (p.polno LIKE '123%CANC' OR
p.polno LIKE '123%NOTRENEWD))
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Mar 19 '06 #15

P: n/a
On Sun, 19 Mar 2006 01:27:19 -0500, clare at snyder.on.ca wrote:

(snip)
I have tried putting "and polno not like '1932700%CANC'" in the line
before "AND c.email IS NOT NULL"and I still get the cancelled policy,
but only once - without the CANC on the polno.


Hi Clare,

So if I understand correctly - if there is ONE occurence of a policy
that ends in CANC or NOTRENEWED, then NO occurence of that policy should
be returned by the query?

Did someone already mention elsewhere in this thread how terribly bad
practice it is to combine policy numbers with status codes into one
single column?

Based on the query you posted, this modification will probably do what
you need:

SELECT DISTINCT c.lastname, c.email, p.polno, p.polexpdate
FROM afw_customer AS c
INNER JOIN (SELECT custid, MAX(polexpdate) AS polexpdate
FROM afw_basicpolinfo
WHERE polno LIKE '1932700%'
GROUP BY custid) AS p1
ON p1.custid = c.custid
INNER JOIN afw_basicpolinfo AS p
ON p.custid = p1.custid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060201' AND '20060228'
AND c.email IS NOT NULL
AND NOT EXISTS (SELECT *
FROM afw_basicpolinfo
WHERE (polno LIKE '1932700%CANC'
OR polno LIKE '1932700%NOTRENEWED')
AND custid = c.custid)
ORDER BY p.polexpdate, c.lastname

The one below might also work - it might be faster, though I personally
find it less intuitive to understand.

SELECT DISTINCT c.lastname, c.email, p.polno, p.polexpdate
FROM afw_customer AS c
INNER JOIN (SELECT custid, MAX(polexpdate) AS polexpdate
FROM afw_basicpolinfo
WHERE polno LIKE '1932700%'
GROUP BY custid
HAVING MAX(CASE WHEN polno LIKE '%CANC'
OR polno LIKE '%NOTRENEWED'
THEN 1
ELSE 0
END) = 0) AS p1
ON p1.custid = c.custid
INNER JOIN afw_basicpolinfo AS p
ON p.custid = p1.custid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060201' AND '20060228'
AND c.email IS NOT NULL
ORDER BY p.polexpdate, c.lastname

(Both queries are untested - see www.aspfaq.com/5006 if you prefer a
tested reply)

--
Hugo Kornelis, SQL Server MVP
Mar 19 '06 #16

P: n/a
On Sun, 19 Mar 2006 22:26:37 +0000 (UTC), Erland Sommarskog
<es****@sommarskog.se> wrote:
(clare at snyder.on.ca) writes:
On Sun, 19 Mar 2006 11:06:27 +0000 (UTC), Erland Sommarskog
<es****@sommarskog.se> wrote:
(clare at snyder.on.ca) writes:
There is ONE condition that I forgot to mention in the polno field.
Not only do they add a different suffix for every year's renewal(which
now appears to be a non-issue) but if a policy is cancelled, it can
have CANC at the end, or NOTRENEWED.

AND polno NOT LIKE '%CANC'
AND polno NOT LIKE '%NOTRENEWED'

In my last post I said:
I have tried putting "and polno not like '1932700%CANC'" in the line
before "AND c.email IS NOT NULL"and I still get the cancelled policy,
but only once - without the CANC on the polno.
The cancelled policy still shows up, butas due for renewal, and not
cancelled.

Also said "Putting the line in after the "Where polno like" line has
no effect.", in other words, putting it there does not remove either
the duplicate OR both incidents of the record.


There is a common recommendation for posts to the SQL Server newsgroups
and that is that you include:

o CREATE TABLE statements for your tables.
o INSERT statemetns with sample data.
o The desired output given the sample.

If you put that effort into your post, it is likely that someone will
make the simple effort of copying and pasting the script into a
query tool to develop a tested solution.

If you make less effort, the person who answer will also make less effort
in his posting, and the answers you get be less accurate.

I would have guessed the NOT LIKE clauses would make it, because if
their latest policy does not have CANC, then I don't understand what
is going on. If there is an earlier cancellation, would that not mean
that they have come back.


The problem is the cancellation is processed on the renewal date. If
the cancellation was entered the day after the renewal/expiry date it
would not be a problem.
As for the create and insert statements, I have never had to do them
before -so am not sure I would get them right.
Nevertheless, you can add a WHERE NOT EXISTS:

SELECT c.lastname, c.email, p.polno, p.polexpdate
FROM customer c
JOIN (SELECT polid, polexpdate = MAX(polexpdate)
FROM policies
WHERE polno like '1234%'
GROUP BY polid) AS p1 ON p1.polid = c.polid
JOIN policies p ON p.polid = p1.polid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060301' AND '20060331'
AND c.email IS NOT NULL
AND NOT EXISTS (SELECT *
FROM policies p2
WHERE p.polid = p2.polid
AND (p.polno LIKE '123%CANC' OR
p.polno LIKE '123%NOTRENEWD))


Thanks Erland. That is what I suspected would be required but I did
not know how to accomplish it. I'm learning!!!!
The p1 and p2 are temporary tables, right?

*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 20 '06 #17

P: n/a
On Sun, 19 Mar 2006 23:33:01 +0100, Hugo Kornelis
<hu**@perFact.REMOVETHIS.info.INVALID> wrote:

Thanks Hugo, I'll try these as well!!!
On Sun, 19 Mar 2006 01:27:19 -0500, clare at snyder.on.ca wrote:

(snip)
I have tried putting "and polno not like '1932700%CANC'" in the line
before "AND c.email IS NOT NULL"and I still get the cancelled policy,
but only once - without the CANC on the polno.


Hi Clare,

So if I understand correctly - if there is ONE occurence of a policy
that ends in CANC or NOTRENEWED, then NO occurence of that policy should
be returned by the query?

Did someone already mention elsewhere in this thread how terribly bad
practice it is to combine policy numbers with status codes into one
single column?

Based on the query you posted, this modification will probably do what
you need:

SELECT DISTINCT c.lastname, c.email, p.polno, p.polexpdate
FROM afw_customer AS c
INNER JOIN (SELECT custid, MAX(polexpdate) AS polexpdate
FROM afw_basicpolinfo
WHERE polno LIKE '1932700%'
GROUP BY custid) AS p1
ON p1.custid = c.custid
INNER JOIN afw_basicpolinfo AS p
ON p.custid = p1.custid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060201' AND '20060228'
AND c.email IS NOT NULL
AND NOT EXISTS (SELECT *
FROM afw_basicpolinfo
WHERE (polno LIKE '1932700%CANC'
OR polno LIKE '1932700%NOTRENEWED')
AND custid = c.custid)
ORDER BY p.polexpdate, c.lastname

The one below might also work - it might be faster, though I personally
find it less intuitive to understand.

SELECT DISTINCT c.lastname, c.email, p.polno, p.polexpdate
FROM afw_customer AS c
INNER JOIN (SELECT custid, MAX(polexpdate) AS polexpdate
FROM afw_basicpolinfo
WHERE polno LIKE '1932700%'
GROUP BY custid
HAVING MAX(CASE WHEN polno LIKE '%CANC'
OR polno LIKE '%NOTRENEWED'
THEN 1
ELSE 0
END) = 0) AS p1
ON p1.custid = c.custid
INNER JOIN afw_basicpolinfo AS p
ON p.custid = p1.custid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060201' AND '20060228'
AND c.email IS NOT NULL
ORDER BY p.polexpdate, c.lastname

(Both queries are untested - see www.aspfaq.com/5006 if you prefer a
tested reply)


*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 20 '06 #18

P: n/a
On Sun, 19 Mar 2006 23:33:01 +0100, Hugo Kornelis
<hu**@perFact.REMOVETHIS.info.INVALID> wrote:

Hi Clare,

So if I understand correctly - if there is ONE occurence of a policy
that ends in CANC or NOTRENEWED, then NO occurence of that policy should
be returned by the query?

You have it rightDid someone already mention elsewhere in this thread how terribly bad
practice it is to combine policy numbers with status codes into one
single column?
Yes, but that is totally beyond my control. This program was written
ages ago by who-knows-who, and became a commercial product that I and
many others now have to live with.
There is a new version out now that we will be updating to this
summer, but it still uses the old data structure, to the best of my
knowledge. It's another one of these "unplanned" products that just
grew like flopsy. It took me over 2 months just to figure out what the
darn thing was doing, before I could even start figuring out what to
ask for, much less how to ask for it. I hate this part of the job, but
somebody's got to do it!

Thanks for all the help!!!!!!!!

Based on the query you posted, this modification will probably do what
you need:

SELECT DISTINCT c.lastname, c.email, p.polno, p.polexpdate
FROM afw_customer AS c
INNER JOIN (SELECT custid, MAX(polexpdate) AS polexpdate
FROM afw_basicpolinfo
WHERE polno LIKE '1932700%'
GROUP BY custid) AS p1
ON p1.custid = c.custid
INNER JOIN afw_basicpolinfo AS p
ON p.custid = p1.custid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060201' AND '20060228'
AND c.email IS NOT NULL
AND NOT EXISTS (SELECT *
FROM afw_basicpolinfo
WHERE (polno LIKE '1932700%CANC'
OR polno LIKE '1932700%NOTRENEWED')
AND custid = c.custid)
ORDER BY p.polexpdate, c.lastname

The one below might also work - it might be faster, though I personally
find it less intuitive to understand.

SELECT DISTINCT c.lastname, c.email, p.polno, p.polexpdate
FROM afw_customer AS c
INNER JOIN (SELECT custid, MAX(polexpdate) AS polexpdate
FROM afw_basicpolinfo
WHERE polno LIKE '1932700%'
GROUP BY custid
HAVING MAX(CASE WHEN polno LIKE '%CANC'
OR polno LIKE '%NOTRENEWED'
THEN 1
ELSE 0
END) = 0) AS p1
ON p1.custid = c.custid
INNER JOIN afw_basicpolinfo AS p
ON p.custid = p1.custid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060201' AND '20060228'
AND c.email IS NOT NULL
ORDER BY p.polexpdate, c.lastname

(Both queries are untested - see www.aspfaq.com/5006 if you prefer a
tested reply)


*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 20 '06 #19

P: n/a
On Sun, 19 Mar 2006 23:33:01 +0100, Hugo Kornelis
<hu**@perFact.REMOVETHIS.info.INVALID> wrote:

Based on the query you posted, this modification will probably do what
you need:

SELECT DISTINCT c.lastname, c.email, p.polno, p.polexpdate
FROM afw_customer AS c
INNER JOIN (SELECT custid, MAX(polexpdate) AS polexpdate
FROM afw_basicpolinfo
WHERE polno LIKE '1932700%'
GROUP BY custid) AS p1
ON p1.custid = c.custid
INNER JOIN afw_basicpolinfo AS p
ON p.custid = p1.custid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060201' AND '20060228'
AND c.email IS NOT NULL
AND NOT EXISTS (SELECT *
FROM afw_basicpolinfo
WHERE (polno LIKE '1932700%CANC'
OR polno LIKE '1932700%NOTRENEWED')
AND custid = c.custid)
ORDER BY p.polexpdate, c.lastname

This APPEARS to be working - I will have to "prove" the results at
the office tomorrow, as it still shows some outstanding policies from
Feb. Not sure it should.
The one below might also work - it might be faster, though I personally
find it less intuitive to understand.
It gives me an empty result for Feb, and gives me a list for April -
which is what I would expect, so this one may just be it!! Will keep
you updated.

Thanks Hugo.

Erlund's last attempt gives me the same result as Hugo's first
example.

Thanks for the help, Erlund.
Between you two fine fellows I THINK I've learned a lot.
SELECT DISTINCT c.lastname, c.email, p.polno, p.polexpdate
FROM afw_customer AS c
INNER JOIN (SELECT custid, MAX(polexpdate) AS polexpdate
FROM afw_basicpolinfo
WHERE polno LIKE '1932700%'
GROUP BY custid
HAVING MAX(CASE WHEN polno LIKE '%CANC'
OR polno LIKE '%NOTRENEWED'
THEN 1
ELSE 0
END) = 0) AS p1
ON p1.custid = c.custid
INNER JOIN afw_basicpolinfo AS p
ON p.custid = p1.custid
AND p.polexpdate = p1.polexpdate
WHERE p.polexpdate BETWEEN '20060201' AND '20060228'
AND c.email IS NOT NULL
ORDER BY p.polexpdate, c.lastname

(Both queries are untested - see www.aspfaq.com/5006 if you prefer a
tested reply)


*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 20 '06 #20

P: n/a
Clare,

You usually can't just take pieces of a large query and expect them to
work all by themselves.

What is this in the 3rd line, from your post at Sat, Mar 18 2006 2:20
pm?:

SELECT polid. polexpdate =MAX (polexpdate)
FROM afw_basicpolinfo p WHERE p.polno like '1932700%'
GROUP BY polid AS p1 ON p1.polid-p1 p1 polid -- <-- what is this?!
Join afw_basicpolinfo p on p.polid = p1.polid
AND p.polexpdate=p1.polexpdate
WHERE p.polexpdate between '2006-04-15 00:00:00.000 and 2006-04-31
00:00:00.000'
and c.email IS NOT NULL

I have never seen syntax like that before.

Mar 20 '06 #21

P: n/a
On 20 Mar 2006 06:18:15 -0800, "figital" <mh****@gmail.com> wrote:
Clare,

You usually can't just take pieces of a large query and expect them to
work all by themselves.
I realize that, but SUBQUERIES can work on their own.
What is this in the 3rd line, from your post at Sat, Mar 18 2006 2:20
pm?:
Well, I'll have to guess as I have no record of a 2:20pm post on Sat.,
but the code will be either Erlund's or Hugos, and I think it has been
mis-copied. The closest I can see is :

GROUP BY custid)AS p1 ON p1.custid=c.custid from Hugo in the early
stages, or:
GROUP BY custid)
AS p1
ON p1.custid=c.custid
From Hugo's final version, which, by the way, appears to be the only
one that actually works the way I had intended it to. I wish I could
completely understand the query, and it might be helpfull if Hugo
could put a documented version of the code up for the edification of
all of us.

How 'bout it Hugo??
You have already gone above and beyond, but it is VERY obvious you
know what you are doing, and a quick explanation of what you actually
did - what each expression does, would definitely be appreciated by
myself, and I'm sure by "figital" and others.

Again, my thanks to all who have responded so far - and to Hugo and
Erlund in particular.

SELECT polid. polexpdate =MAX (polexpdate)
FROM afw_basicpolinfo p WHERE p.polno like '1932700%'
GROUP BY polid AS p1 ON p1.polid-p1 p1 polid -- <-- what is this?!
Join afw_basicpolinfo p on p.polid = p1.polid
AND p.polexpdate=p1.polexpdate
WHERE p.polexpdate between '2006-04-15 00:00:00.000 and 2006-04-31
00:00:00.000'
and c.email IS NOT NULL

I have never seen syntax like that before.


*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 20 '06 #22

P: n/a
On Mon, 20 Mar 2006 10:53:29 -0500, clare at snyder.on.ca wrote:
I can't read much Dutch,(though my ancestors came to the Americas from
Holland and the lowcountry many, many years ago) but looking at Hugo's
bio on the PerFact.info page, It's obvious he is a very knowlegable
professional.

Isn't usenet/ internet wonderful? And it's the good fellows like Hugo
that make it work so well!!!!!


*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 20 '06 #23

P: n/a
On Mon, 20 Mar 2006 10:53:29 -0500, clare at snyder.on.ca wrote:

(snip)
From Hugo's final version, which, by the way, appears to be the only
one that actually works the way I had intended it to. I wish I could
completely understand the query, and it might be helpfull if Hugo
could put a documented version of the code up for the edification of
all of us.

How 'bout it Hugo??


Hi Clare,

Flattery always works for me. And you have included enough flattery to
last me a year or two! <g>

I'm not sure which one of my two queries you'd like to see commented, so
I'll just do both.

Here's #1. Basically a copy of Erland's original suggestion, with an
extra subquery in the WHERE clause to handle the additional problem of
cancelled policies.

SELECT DISTINCT c.lastname, c.email, p.polno, p.polexpdate
FROM afw_customer AS c
INNER JOIN (SELECT custid, MAX(polexpdate) AS polexpdate
FROM afw_basicpolinfo
WHERE polno LIKE '1932700%'
GROUP BY custid) AS p1

The four lines above are a so-called "derived table". A derived table is
a subquery in the FROM clause. It can only be a non-correlated query,
i.e. you must be able to run it on it's own. If you select just the part
from "SELECT custind" up to "GROUP BY custid" and execute it, you'll get
a normal result. Using a derived table is basically just a shortcut for
making, using, and then dropping a temp table or a view. The "AS p1"
gives the result of this derived table a name (alias).

In this derived table, rows are selected that match the LIKE; they are
then arranged into groups for each distinct custid and the most recent
expiration date ("MAX(polexpdate)") in each group is selected. The
result of this derived table is a table named p1, with two columns:
custid (holding the customerid) and polexpdate (holding the most recent
expiration date for policy 1932700 for this customer).

ON p1.custid = c.custid
INNER JOIN afw_basicpolinfo AS p
ON p.custid = p1.custid
AND p.polexpdate = p1.polexpdate

The results of the derived table are now joined back to the complete
afw_basicpolinfo table. Since both custid and polexpdate have to match,
we'll only join to rows that match a "most recent" policy expiration
date. Through this technique, we can now address other columns in the
same row, that we couldn't include in the derived table without breaking
it's logic. In this query, this extra join is required because you want
to display p.polno in the results.

WHERE p.polexpdate BETWEEN '20060201' AND '20060228'
AND c.email IS NOT NULL
AND NOT EXISTS (SELECT *
FROM afw_basicpolinfo
WHERE (polno LIKE '1932700%CANC'
OR polno LIKE '1932700%NOTRENEWED')
AND custid = c.custid)

Here's the subquery I added. Erland's first suggestion when you added
the requirement to filter out cancelled policies was based on the
assumption that the encoding ("ending in CANC or NOTRENEWED") was always
in the row with the most recent expiration date. You said that the CANC
or NOTRENEWED code might also be in a different row for the same policy
and the same customer. So I set up this subquery. This is a correlated
subquery: it can't be executed on it's own, since it references columns
from tables in the outer query. It has to be re-evaluated for each row
in the outer table. (In theory - the query optimizer in the DB engine
might transform this to different code that produces the same results
faster).

Given a specific customer (c.custid), this subquery attempts to find any
row for that customer with a policy number that starts with 1932700 and
ends in either CANC or NOTRENEWED. If it finds one (or more), then the
EXISTS evaluates to true and hence the NOT EXISTS evaluates to false and
the row in the outer query is rejected from the result set.

ORDER BY p.polexpdate, c.lastname
That was my first query. A quite straightforward addition of your extra
requirement to the existing query suggested by Erland.

My second suggestion was more adventurous and needs lots more testing
but has great potential of being faster, since it eliminates the need to
evaluate a subquery for each row that is produced by the joins in the
outer query.

SELECT DISTINCT c.lastname, c.email, p.polno, p.polexpdate
FROM afw_customer AS c
INNER JOIN (SELECT custid, MAX(polexpdate) AS polexpdate
FROM afw_basicpolinfo
WHERE polno LIKE '1932700%'
GROUP BY custid
HAVING MAX(CASE WHEN polno LIKE '%CANC'
OR polno LIKE '%NOTRENEWED'
THEN 1
ELSE 0
END) = 0) AS p1

What we have here is the same derived table technique, but the subquery
in the derived table is extended with a HAVING clause. A HAVING clause
is much like a WHERE clause - except a WHERE clause is used to reject
individual rows before grouping them (if a GROUP BY is present); a
HAVING clause eliminates complete groups of rows after grouping them
according to the GROUP BY clause.

This particular HAVING clause is complex. Let's start at the innermost
level:

CASE WHEN polno LIKE '%CANC'
OR polno LIKE '%NOTRENEWED'
THEN 1
ELSE 0
END

This CASE expression will be evaluated for each row in the group. If the
polno of that row ends in CANC of NOTRENEWED, the result is 1; if not,
it is 0. So 1 means that the row is flagged as cancelled; 0 means it's
not.

MAX(CASE WHEN polno LIKE '%CANC'
OR polno LIKE '%NOTRENEWED'
THEN 1
ELSE 0
END) = 0) AS p1

After determining 1 (flagged as cancelled) or 0 (not thusly flagged) for
each row in the group, the maximum value of the group is identified. If
at least one row in the group is flagged as cancelled, the MAX value of
the CASE expression has to be 1. The MAX can only be 0 if not a single
row in the group is marked as cancelled.

HAVING MAX(CASE WHEN polno LIKE '%CANC'
OR polno LIKE '%NOTRENEWED'
THEN 1
ELSE 0
END) = 0

And therefor, this HAVING clause will completely reject all rows from a
customer that has at least one row with a cancellation marker.

In the end, the derived table will hold custid and their most recent
expiration date (as in the first query), but only for customers for whom
there is no cancellation marker set on any row.

ON p1.custid = c.custid
INNER JOIN afw_basicpolinfo AS p
ON p.custid = p1.custid
AND p.polexpdate = p1.polexpdate

And as a result, this inner join will only provide the full information
from rows for customers without cancellation and their most recent
expiration date.

WHERE p.polexpdate BETWEEN '20060201' AND '20060228'
AND c.email IS NOT NULL
ORDER BY p.polexpdate, c.lastname

Since the customers with cancellation were already filtered out in the
derived table, the rest of the query is in this case unchanged from the
original query by Erland.

I already mentioned that I expect this query to be faster than the first
suggestion. In case you're wondering why: the subquery in the first
suggestion asks SQL Server to go back to the table, and do another
search over the data. This will probably result in more logical I/O
requests. And depending on the size of your table, that might even
result in more physical I/O as well.

The second suggestion "moves" the extra logic to the derived table. This
means that SQL Server has to do more processing there, but it's done in
a place where we already ask SQL Server to read all rows. It might take
a bit more CPU power, but not more I/O. And since I/O is almost always
the bottleneck for performance in databases, I expect this query to run
faster.

I hope this helps!!

--
Hugo Kornelis, SQL Server MVP
Mar 24 '06 #24

P: n/a
On Fri, 24 Mar 2006 22:41:58 +0100, Hugo Kornelis
<hu**@perFact.REMOVETHIS.info.INVALID> wrote:
On Mon, 20 Mar 2006 10:53:29 -0500, clare at snyder.on.ca wrote:

(snip)
From Hugo's final version, which, by the way, appears to be the only
one that actually works the way I had intended it to. I wish I could
completely understand the query, and it might be helpfull if Hugo
could put a documented version of the code up for the edification of
all of us.

How 'bout it Hugo??


Hi Clare,

Flattery always works for me. And you have included enough flattery to
last me a year or two! <g>

I'm not sure which one of my two queries you'd like to see commented, so
I'll just do both.

Here's #1. Basically a copy of Erland's original suggestion, with an
extra subquery in the WHERE clause to handle the additional problem of
cancelled policies.

SELECT DISTINCT c.lastname, c.email, p.polno, p.polexpdate
FROM afw_customer AS c
INNER JOIN (SELECT custid, MAX(polexpdate) AS polexpdate
FROM afw_basicpolinfo
WHERE polno LIKE '1932700%'
GROUP BY custid) AS p1

The four lines above are a so-called "derived table". A derived table is
a subquery in the FROM clause. It can only be a non-correlated query,
i.e. you must be able to run it on it's own. If you select just the part
from "SELECT custind" up to "GROUP BY custid" and execute it, you'll get
a normal result. Using a derived table is basically just a shortcut for
making, using, and then dropping a temp table or a view. The "AS p1"
gives the result of this derived table a name (alias).

In this derived table, rows are selected that match the LIKE; they are
then arranged into groups for each distinct custid and the most recent
expiration date ("MAX(polexpdate)") in each group is selected. The
result of this derived table is a table named p1, with two columns:
custid (holding the customerid) and polexpdate (holding the most recent
expiration date for policy 1932700 for this customer).

ON p1.custid = c.custid
INNER JOIN afw_basicpolinfo AS p
ON p.custid = p1.custid
AND p.polexpdate = p1.polexpdate

The results of the derived table are now joined back to the complete
afw_basicpolinfo table. Since both custid and polexpdate have to match,
we'll only join to rows that match a "most recent" policy expiration
date. Through this technique, we can now address other columns in the
same row, that we couldn't include in the derived table without breaking
it's logic. In this query, this extra join is required because you want
to display p.polno in the results.

WHERE p.polexpdate BETWEEN '20060201' AND '20060228'
AND c.email IS NOT NULL
AND NOT EXISTS (SELECT *
FROM afw_basicpolinfo
WHERE (polno LIKE '1932700%CANC'
OR polno LIKE '1932700%NOTRENEWED')
AND custid = c.custid)

Here's the subquery I added. Erland's first suggestion when you added
the requirement to filter out cancelled policies was based on the
assumption that the encoding ("ending in CANC or NOTRENEWED") was always
in the row with the most recent expiration date. You said that the CANC
or NOTRENEWED code might also be in a different row for the same policy
and the same customer. So I set up this subquery. This is a correlated
subquery: it can't be executed on it's own, since it references columns
from tables in the outer query. It has to be re-evaluated for each row
in the outer table. (In theory - the query optimizer in the DB engine
might transform this to different code that produces the same results
faster).

Given a specific customer (c.custid), this subquery attempts to find any
row for that customer with a policy number that starts with 1932700 and
ends in either CANC or NOTRENEWED. If it finds one (or more), then the
EXISTS evaluates to true and hence the NOT EXISTS evaluates to false and
the row in the outer query is rejected from the result set.

ORDER BY p.polexpdate, c.lastname
That was my first query. A quite straightforward addition of your extra
requirement to the existing query suggested by Erland.

My second suggestion was more adventurous and needs lots more testing
but has great potential of being faster, since it eliminates the need to
evaluate a subquery for each row that is produced by the joins in the
outer query.

SELECT DISTINCT c.lastname, c.email, p.polno, p.polexpdate
FROM afw_customer AS c
INNER JOIN (SELECT custid, MAX(polexpdate) AS polexpdate
FROM afw_basicpolinfo
WHERE polno LIKE '1932700%'
GROUP BY custid
HAVING MAX(CASE WHEN polno LIKE '%CANC'
OR polno LIKE '%NOTRENEWED'
THEN 1
ELSE 0
END) = 0) AS p1

What we have here is the same derived table technique, but the subquery
in the derived table is extended with a HAVING clause. A HAVING clause
is much like a WHERE clause - except a WHERE clause is used to reject
individual rows before grouping them (if a GROUP BY is present); a
HAVING clause eliminates complete groups of rows after grouping them
according to the GROUP BY clause.

This particular HAVING clause is complex. Let's start at the innermost
level:

CASE WHEN polno LIKE '%CANC'
OR polno LIKE '%NOTRENEWED'
THEN 1
ELSE 0
END

This CASE expression will be evaluated for each row in the group. If the
polno of that row ends in CANC of NOTRENEWED, the result is 1; if not,
it is 0. So 1 means that the row is flagged as cancelled; 0 means it's
not.

MAX(CASE WHEN polno LIKE '%CANC'
OR polno LIKE '%NOTRENEWED'
THEN 1
ELSE 0
END) = 0) AS p1

After determining 1 (flagged as cancelled) or 0 (not thusly flagged) for
each row in the group, the maximum value of the group is identified. If
at least one row in the group is flagged as cancelled, the MAX value of
the CASE expression has to be 1. The MAX can only be 0 if not a single
row in the group is marked as cancelled.

HAVING MAX(CASE WHEN polno LIKE '%CANC'
OR polno LIKE '%NOTRENEWED'
THEN 1
ELSE 0
END) = 0

And therefor, this HAVING clause will completely reject all rows from a
customer that has at least one row with a cancellation marker.

In the end, the derived table will hold custid and their most recent
expiration date (as in the first query), but only for customers for whom
there is no cancellation marker set on any row.

ON p1.custid = c.custid
INNER JOIN afw_basicpolinfo AS p
ON p.custid = p1.custid
AND p.polexpdate = p1.polexpdate

And as a result, this inner join will only provide the full information
from rows for customers without cancellation and their most recent
expiration date.

WHERE p.polexpdate BETWEEN '20060201' AND '20060228'
AND c.email IS NOT NULL
ORDER BY p.polexpdate, c.lastname

Since the customers with cancellation were already filtered out in the
derived table, the rest of the query is in this case unchanged from the
original query by Erland.

I already mentioned that I expect this query to be faster than the first
suggestion. In case you're wondering why: the subquery in the first
suggestion asks SQL Server to go back to the table, and do another
search over the data. This will probably result in more logical I/O
requests. And depending on the size of your table, that might even
result in more physical I/O as well.

The second suggestion "moves" the extra logic to the derived table. This
means that SQL Server has to do more processing there, but it's done in
a place where we already ask SQL Server to read all rows. It might take
a bit more CPU power, but not more I/O. And since I/O is almost always
the bottleneck for performance in databases, I expect this query to run
faster.

I hope this helps!!

Thanks Hugo!! Clarifies a lot for me, and answers the question the
other guy asked.
*** Free account sponsored by SecureIX.com ***
*** Encrypt your Internet usage with a free VPN account from http://www.SecureIX.com ***
Mar 25 '06 #25

This discussion thread is closed

Replies have been disabled for this discussion.