472,784 Members | 796 Online

# Question involving OLE DB, recursion, and filtered views

I have a programming problem in OLEDB and C++ that seems to be pointing me
toward using layered views and hierarchical rowsets. However, I am uncertain
of the precise implementation and need guidance from someone experienced in
those areas.
The problem, in the abstract, is how to select a set of records in one
table, each one of which has a relation to every one of a set of records in
another table, the tables being joined in a many-to-many relationship.

To illustrate, consider two tables, Boys and Girls, each comprising a Name
and ID column, and a junction table BG, comprising two columns BoyID and
GirlID, thus joining the two tables Boys and Girls. A given Boy (record) has
a relationship with a given Girl record if the BG table contains a record
containing the respective keys.

The relationship is described as "friendship" but you may use your
imagination; it will help to understand the problem.

In any case, friendship may only take place between Boys and Girls. Each Boy
may have been friendly with zero or any arbitrary number of Girls; the
reverse is true for the Girls to Boys. The following sample data will
illustrate:

BOYS
----
ID Name
----+---------------
2 |Bill
3 |Charles
4 |Dick
5 |Randy

GIRLS
-----
ID Name
----+---------------
1 |Ann
2 |Betsy
3 |Cathy
4 |Ellen
5 |Floozy
6 |Virginia

BG (Boys-Girls)
-------------------------------------------
1 1 (Adam + Ann were friendly at least once)
1 2 (Adam + Betsy "")
1 5 (Adam + Floozy "")
2 1 (Bill + Ann)
2 3 (Bill + Cathy)
2 5 (Bill + Floozy)
3 4 (Charles + Ellen)
3 5 (Charles + Floozy)
4 1 (Dick + Ann)
4 2 (Dick + Betsy)
4 3 (Dick + Cathy)
4 4 (Dick + Ellen)
4 5 (Dick + Floozy)

Randy and Virginia have no relations with anyone, hence their names.

Now, the final objective is to be able to ask a question like this:
"Given a list of names of boys, what are the names of the girls who have
been friendly with ALL of the boys in that list?" For example, if we were to
supply a list with the single name of "Adam", we would get the list (Ann,
Betsy, Floozy). Querying with the names (Bill, Dick), we would get the list
(Ann, Cathy, Floozy), because they are joined to each one of (Bill, Dick).
If we were to supply the list (Adam, Bill, Charles) we would get the list
containing only the name Floozy, because she is the only girl who has (had)
relations with each of the boys in the supplied list. Supplying Randy with
any or no other boys' names would get an empty list, because he has no
relations.

Simple enough. It is trivial to concoct a query to find matches on a list
with just one name:

SELECT Girls.Name from Boys, Girls, BG WHERE Girls.ID = BG.GirlID and \
BG.BoyID = Boys.ID and Boys.Name = (?)

(If you like JOIN syntax it is even easier)

Extending this to a query for two names is not much harder:

SELECT Girls.Name from Boys, Girls, BG WHERE Girls.ID = BG.GirlID and \
BG.BoyID = Boys.ID and Boys.Name = (?) and Girls.Name IN
(SELECT Girls.Name from Boys, Girls, BG WHERE Girls.ID = BG.GirlID
and \
BG.BoyID = Boys.ID and Boys.Name = (?))

One could in principle extend this indefinitely ... but this is ugly enough
as it is.
As a solution to the original problem - that is, how to do this with an
essentially arbitrarily long list of boys - it is basically infeasible,
especially if you are constructing accessors and queries in C++ on the fly.
Yuck.

So, it dawned on me that the thing to do is use views: Get essentially the
entire set of relations, then filter the results repetitively (or
recursively) through views to get the subset of Girls with relations to
each Boy in the list. This is where I am stuck (in the first place, this
might not be the easiest way to deal with this problem. If there is a better
overall approach please point it out).

In the first blush, I could join all the records on their relations in the
BG table, then filter the resultant rowset:

CCommand<MyAccessor> pCmd;
char * sQryString = \
"SELECT * from Boys, Girls, BG, where Boys.ID = BG.BoyId and
BG.GirlID = Girls.ID;"
.... blah ... QI(&pCmd), etc. ....
hr = pCmd->Open(session, .., sQryString, etc.);

....and then...

CComPtr<IUnknown> pView;
CComPtr<IRowsetView> pRowsetView;
CComPtr<IViewFilter> pViewFilter;
CComPtr<IViewChapter> pViewChapter;

hr = pCmd->m_spRowset->QueryInterface(&pView);
hr = pCmd->m_spRowset->QueryInterface(&pRowsetView);
hr = pRowsetView->CreateView(0, IID_IViewFilter, (IUnknown **)&pViewFilter);
hr = pViewFilter->QueryInterface(&pViewChapter);

Now I have a chapter and a filter. Now I want to filter for all the records
that contain the first of my list of names. Supposing my list is (Adam,
Dick), I will construct a filter on that name (I omit the details). This
yields a view rowset containing Adam's name and ID joined to the names and
IDs of each of his partners. I would now like to know essentially the
following:

What is the subset of Girl's names within the above list, which also have
relations with Dick?

How can I construct a view/chaptered rowset which will give me this
information? If I filter again on the Boys.Name on the view I just got, I
will get an empty result set, because every boys' name in the resultant set
will be Adam, and filtering again on equality with Dick will come up empty.

Basically, I need to know how to take the resultant rowset and use it as
input in another query.

And I need to do it against a Jet backend, using OLE DB, from C++.

RDeW
Nov 13 '05 #1
9 1946
"Riley DeWiley" <ri***********@gmail.com> wrote in
news:10************@corp.supernews.com:
Now, the final objective is to be able to ask a question like this:
"Given a list of names of boys, what are the names of the girls who
have been friendly with ALL of the boys in that list?" For example, if
we were to supply a list with the single name of "Adam", we would get
the list (Ann, Betsy, Floozy). Querying with the names (Bill, Dick),
we would get the list (Ann, Cathy, Floozy), because they are joined to
each one of (Bill, Dick). If we were to supply the list (Adam, Bill,
Charles) we would get the list containing only the name Floozy,
because she is the only girl who has (had) relations with each of the
boys in the supplied list. Supplying Randy with any or no other boys'
names would get an empty list, because he has no relations.

Simple enough. It is trivial to concoct a query to find matches on a
list with just one name:

SELECT Girls.Name from Boys, Girls, BG WHERE Girls.ID = BG.GirlID and
\
BG.BoyID = Boys.ID and Boys.Name = (?)

(If you like JOIN syntax it is even easier)

This can be extended to an arbitrary-sized list:

SELECT Girls.Name
FROM Girls
INNER JOIN BG ON Girls.ID = BG.GirlID
INNER JOIN Boys ON BG.BoyID = Boys.ID
GROUP BY Girls.Name
HAVING COUNT(*) = 3

on the assumption that (BoyID,GirlID) is unique within BG (as it had
better be). All that needs to be parameterized is the IN list and the
number the count is compared to in the HAVING clause.
Nov 13 '05 #2

"Eric Bohlman" <eb******@omsdev.com> wrote in message
news:Xn****************************@130.133.1.4...
"Riley DeWiley" <ri***********@gmail.com> wrote in
news:10************@corp.supernews.com:
Now, the final objective is to be able to ask a question like this:
"Given a list of names of boys, what are the names of the girls who
have been friendly with ALL of the boys in that list?" For example, if
we were to supply a list with the single name of "Adam", we would get
the list (Ann, Betsy, Floozy). Querying with the names (Bill, Dick),
we would get the list (Ann, Cathy, Floozy), because they are joined to
each one of (Bill, Dick). If we were to supply the list (Adam, Bill,
Charles) we would get the list containing only the name Floozy,
because she is the only girl who has (had) relations with each of the
boys in the supplied list. Supplying Randy with any or no other boys'
names would get an empty list, because he has no relations.

Simple enough. It is trivial to concoct a query to find matches on a
list with just one name:

SELECT Girls.Name from Boys, Girls, BG WHERE Girls.ID = BG.GirlID and
\
BG.BoyID = Boys.ID and Boys.Name = (?)

(If you like JOIN syntax it is even easier)

This can be extended to an arbitrary-sized list:

SELECT Girls.Name
FROM Girls
INNER JOIN BG ON Girls.ID = BG.GirlID
INNER JOIN Boys ON BG.BoyID = Boys.ID
GROUP BY Girls.Name
HAVING COUNT(*) = 3

on the assumption that (BoyID,GirlID) is unique within BG (as it had
better be). All that needs to be parameterized is the IN list and the
number the count is compared to in the HAVING clause.

Hmm, had not thought of that. Are you aware of any upper bound on the number
of entries
in the IN (x, y, ....) clause? The list is potentially quite large, far
larger than in this contrived example.

There is an additional complication, beyond the scope of the problem I
originally posted, in that I would like to get partial matches of names. For
example, I would like to match girls' names as LIKE 'Ann*' to get Ann, Anna,
Annemarie, etc. But that is another question.
Nov 13 '05 #3
"Riley DeWiley" <ri***********@gmail.com> wrote in message news:<10************@corp.supernews.com>...

[...]
Now, the final objective is to be able to ask a question like this:
"Given a list of names of boys, what are the names of the girls who have
been friendly with ALL of the boys in that list?" For example, if we were to
supply a list with the single name of "Adam", we would get the list (Ann,
Betsy, Floozy). Querying with the names (Bill, Dick), we would get the list
(Ann, Cathy, Floozy), because they are joined to each one of (Bill, Dick).
If we were to supply the list (Adam, Bill, Charles) we would get the list
containing only the name Floozy, because she is the only girl who has (had)
relations with each of the boys in the supplied list. Supplying Randy with
any or no other boys' names would get an empty list, because he has no
relations.

What you are looking for is called relational division. SQL does not
directly support queries like "gimmie the names of boys that has
relationships with all girls". Instead you have to rephrase your
question as "gimme the names of boys where it does not exist girls
that he doesnt have a relationship with". If you google for Celko and
hangar in comp.database.* you will find a post that illustrates this
in a nice way.

In your case the query would look something like

select boy_id, name from boys b
where not exists (
select 1 from girls g
where not exists (
select 1 from BG
where (boy_id, girl_id) = (b.boy_id, g.girl_id)
)
)

Other things to concider is whether a has a relationship with b
implies that b also have a relationship with a. If you dont want to
record a relationship twice you could create a view like

create view relates_to as (
select boy_id, girl_id from bg
union
select girl_id, boy_id from bg
)

and use that in your query. Finally, it is easier for people to help
you if you provide ddl and insert statements instead of ascii tables.
I.e.

create table boys (
boy_id int not null primary key
name char(10) not null
);

insert into boys values (1,'Adam'), (2,'Bill'), ...
HTH
/Lennart

[...]
Nov 13 '05 #4
I would prefer to use 2 tables, Persons with PersonID, names & sex, and a
second table of JoinPersonPerson which has a joint key of Person1ID and
Person2ID

That way you can a have homosexual relationships ( when I say you, I am
speaking in the general sense) .

You have a form listing person1 and a subform listing person2.
Put whaterver search facilities you like on the main form. Colour it pink
for a girl and blue for a boy. You can even do conditional formatting on the
subform to show same sex relations.

HTH

Phil
"Riley DeWiley" <ri***********@gmail.com> wrote in message
news:10************@corp.supernews.com...
I have a programming problem in OLEDB and C++ that seems to be pointing me
toward using layered views and hierarchical rowsets. However, I am
uncertain
of the precise implementation and need guidance from someone experienced
in
those areas.
The problem, in the abstract, is how to select a set of records in one
table, each one of which has a relation to every one of a set of records
in
another table, the tables being joined in a many-to-many relationship.

To illustrate, consider two tables, Boys and Girls, each comprising a Name
and ID column, and a junction table BG, comprising two columns BoyID and
GirlID, thus joining the two tables Boys and Girls. A given Boy (record)
has
a relationship with a given Girl record if the BG table contains a record
containing the respective keys.

The relationship is described as "friendship" but you may use your
imagination; it will help to understand the problem.

In any case, friendship may only take place between Boys and Girls. Each
Boy
may have been friendly with zero or any arbitrary number of Girls; the
reverse is true for the Girls to Boys. The following sample data will
illustrate:

BOYS
----
ID Name
----+---------------
2 |Bill
3 |Charles
4 |Dick
5 |Randy

GIRLS
-----
ID Name
----+---------------
1 |Ann
2 |Betsy
3 |Cathy
4 |Ellen
5 |Floozy
6 |Virginia

BG (Boys-Girls)
-------------------------------------------
1 1 (Adam + Ann were friendly at least once)
1 2 (Adam + Betsy "")
1 5 (Adam + Floozy "")
2 1 (Bill + Ann)
2 3 (Bill + Cathy)
2 5 (Bill + Floozy)
3 4 (Charles + Ellen)
3 5 (Charles + Floozy)
4 1 (Dick + Ann)
4 2 (Dick + Betsy)
4 3 (Dick + Cathy)
4 4 (Dick + Ellen)
4 5 (Dick + Floozy)

Randy and Virginia have no relations with anyone, hence their names.

Now, the final objective is to be able to ask a question like this:
"Given a list of names of boys, what are the names of the girls who have
been friendly with ALL of the boys in that list?" For example, if we were
to
supply a list with the single name of "Adam", we would get the list (Ann,
Betsy, Floozy). Querying with the names (Bill, Dick), we would get the
list
(Ann, Cathy, Floozy), because they are joined to each one of (Bill,
Dick).
If we were to supply the list (Adam, Bill, Charles) we would get the list
containing only the name Floozy, because she is the only girl who has
relations with each of the boys in the supplied list. Supplying Randy
with
any or no other boys' names would get an empty list, because he has no
relations.

Simple enough. It is trivial to concoct a query to find matches on a list
with just one name:

SELECT Girls.Name from Boys, Girls, BG WHERE Girls.ID = BG.GirlID and \
BG.BoyID = Boys.ID and Boys.Name = (?)

(If you like JOIN syntax it is even easier)

Extending this to a query for two names is not much harder:

SELECT Girls.Name from Boys, Girls, BG WHERE Girls.ID = BG.GirlID and \
BG.BoyID = Boys.ID and Boys.Name = (?) and Girls.Name IN
(SELECT Girls.Name from Boys, Girls, BG WHERE Girls.ID = BG.GirlID
and \
BG.BoyID = Boys.ID and Boys.Name = (?))

One could in principle extend this indefinitely ... but this is ugly
enough
as it is.
As a solution to the original problem - that is, how to do this with an
essentially arbitrarily long list of boys - it is basically infeasible,
especially if you are constructing accessors and queries in C++ on the
fly.
Yuck.

So, it dawned on me that the thing to do is use views: Get essentially the
entire set of relations, then filter the results repetitively (or
recursively) through views to get the subset of Girls with relations to
each Boy in the list. This is where I am stuck (in the first place, this
might not be the easiest way to deal with this problem. If there is a
better
overall approach please point it out).

In the first blush, I could join all the records on their relations in the
BG table, then filter the resultant rowset:

CCommand<MyAccessor> pCmd;
char * sQryString = \
"SELECT * from Boys, Girls, BG, where Boys.ID = BG.BoyId and
BG.GirlID = Girls.ID;"
... blah ... QI(&pCmd), etc. ....
hr = pCmd->Open(session, .., sQryString, etc.);

...and then...

CComPtr<IUnknown> pView;
CComPtr<IRowsetView> pRowsetView;
CComPtr<IViewFilter> pViewFilter;
CComPtr<IViewChapter> pViewChapter;

hr = pCmd->m_spRowset->QueryInterface(&pView);
hr = pCmd->m_spRowset->QueryInterface(&pRowsetView);
hr = pRowsetView->CreateView(0, IID_IViewFilter, (IUnknown
**)&pViewFilter);
hr = pViewFilter->QueryInterface(&pViewChapter);

Now I have a chapter and a filter. Now I want to filter for all the
records
that contain the first of my list of names. Supposing my list is (Adam,
Dick), I will construct a filter on that name (I omit the details). This
yields a view rowset containing Adam's name and ID joined to the names and
IDs of each of his partners. I would now like to know essentially the
following:

What is the subset of Girl's names within the above list, which also have
relations with Dick?

How can I construct a view/chaptered rowset which will give me this
information? If I filter again on the Boys.Name on the view I just got, I
will get an empty result set, because every boys' name in the resultant
set
will be Adam, and filtering again on equality with Dick will come up
empty.

Basically, I need to know how to take the resultant rowset and use it as
input in another query.

And I need to do it against a Jet backend, using OLE DB, from C++.

RDeW

Nov 13 '05 #5
> > This can be extended to an arbitrary-sized list:

SELECT Girls.Name
FROM Girls
INNER JOIN BG ON Girls.ID = BG.GirlID
INNER JOIN Boys ON BG.BoyID = Boys.ID
GROUP BY Girls.Name
HAVING COUNT(*) = 3

on the assumption that (BoyID,GirlID) is unique within BG (as it had
better be). All that needs to be parameterized is the IN list and the
number the count is compared to in the HAVING clause.
Hmm, had not thought of that. Are you aware of any upper bound on the number
of entries
in the IN (x, y, ....) clause? The list is potentially quite large, far
larger than in this contrived example.

My idea is to create Boys_Name_List table. And insert names in each
row in the table.

There is an additional complication, beyond the scope of the problem I
originally posted, in that I would like to get partial matches of names. For
example, I would like to match girls' names as LIKE 'Ann*' to get Ann, Anna,
Annemarie, etc. But that is another question.

I think Eric's query is basically "Relational Division".
Here are some another examples. You do not need count the number of
Boys beforehand.
(I tested them on DB2 UDB V8. So, some may not work on other DBMS.)

(Example 1)
SELECT Girls.Name
FROM Boys
INNER JOIN
Girls
LEFT OUTER JOIN
BG
ON BG.BoyID = Boys.ID
AND BG.GirlID = Girls.ID
GROUP BY
Girls.Name
HAVING COUNT(Boys.ID) = COUNT(BG.GirlID);
(Example 2)
SELECT Girls.Name
FROM Girls
WHERE Girls.ID
IN (SELECT BG1.GirlID
FROM BG BG1
WHERE NOT EXISTS
(SELECT *
FROM Boys
AND NOT EXISTS
(SELECT *
FROM BG BG2
WHERE BG2.GirlID = BG1.GirlID
AND BG2.BoyID = Boys.ID
)
)
);
(Example 3)
SELECT Girls.Name
FROM Girls
WHERE Girls.ID
NOT IN
(SELECT GirlID
FROM (SELECT Boys.ID, Girls.ID
FROM Girls
INNER JOIN
Boys
EXCEPT ALL
SELECT *
FROM BG
) Q (BoyID, GirlID)
);
(Example 4)
SELECT GirlsName
FROM (SELECT Girls.Name, BG.BoyID
, ROWNUMBER() OVER(PARTITION BY Girls.ID ORDER BY
BG.BoyID DESC)
FROM Boys
INNER JOIN
Girls
LEFT OUTER JOIN
BG
ON BG.BoyID = Boys.ID
AND BG.GirlID = Girls.ID
) Q (GirlsName, BoyID, rn)
WHERE rn = 1
AND BoyID IS NOT NULL;
Nov 13 '05 #6
Hi,

As Eric wrote, you would need to use IN clause to do this. But since IN
could be very long, you might face situation, when you SQL statement could
fail. If you are using SQL Server 200 in your case, then you could avoid
this problem creating stored procedure and pass list of the IN values as XML
string. Inside of this SP, you could query this XML inside of IN clause

--
Val Mazur
Microsoft MVP
"Riley DeWiley" <ri***********@gmail.com> wrote in message
news:10*************@corp.supernews.com...

"Eric Bohlman" <eb******@omsdev.com> wrote in message
news:Xn****************************@130.133.1.4...
"Riley DeWiley" <ri***********@gmail.com> wrote in
news:10************@corp.supernews.com:
> Now, the final objective is to be able to ask a question like this:
> "Given a list of names of boys, what are the names of the girls who
> have been friendly with ALL of the boys in that list?" For example, if
> we were to supply a list with the single name of "Adam", we would get
> the list (Ann, Betsy, Floozy). Querying with the names (Bill, Dick),
> we would get the list (Ann, Cathy, Floozy), because they are joined to
> each one of (Bill, Dick). If we were to supply the list (Adam, Bill,
> Charles) we would get the list containing only the name Floozy,
> because she is the only girl who has (had) relations with each of the
> boys in the supplied list. Supplying Randy with any or no other boys'
> names would get an empty list, because he has no relations.
>
> Simple enough. It is trivial to concoct a query to find matches on a
> list with just one name:
>
> SELECT Girls.Name from Boys, Girls, BG WHERE Girls.ID = BG.GirlID and
> \
> BG.BoyID = Boys.ID and Boys.Name = (?)
>
> (If you like JOIN syntax it is even easier)

This can be extended to an arbitrary-sized list:

SELECT Girls.Name
FROM Girls
INNER JOIN BG ON Girls.ID = BG.GirlID
INNER JOIN Boys ON BG.BoyID = Boys.ID
GROUP BY Girls.Name
HAVING COUNT(*) = 3

on the assumption that (BoyID,GirlID) is unique within BG (as it had
better be). All that needs to be parameterized is the IN list and the
number the count is compared to in the HAVING clause.

Hmm, had not thought of that. Are you aware of any upper bound on the
number
of entries
in the IN (x, y, ....) clause? The list is potentially quite large, far
larger than in this contrived example.

There is an additional complication, beyond the scope of the problem I
originally posted, in that I would like to get partial matches of names.
For
example, I would like to match girls' names as LIKE 'Ann*' to get Ann,
Anna,
Annemarie, etc. But that is another question.

Nov 13 '05 #7
Riley DeWiley wrote:
Hmm, had not thought of that. Are you aware of any upper bound on the number
of entries
in the IN (x, y, ....) clause? The list is potentially quite large, far
larger than in this contrived example.
In Access I passed about 15,000...maybe it was 12,000...elements in the
In clause. Very fast return of recs. I too was wondering what the
upper limit was. Perhaps the length of a string.
There is an additional complication, beyond the scope of the problem I
originally posted, in that I would like to get partial matches of names. For
example, I would like to match girls' names as LIKE 'Ann*' to get Ann, Anna,
Annemarie, etc. But that is another question.

Nov 13 '05 #8
Wow, I don't need anywhere near 12,000, maybe twenty is fine. Thank you.

Riley DeWiley wrote:
Hmm, had not thought of that. Are you aware of any upper bound on the number of entries
in the IN (x, y, ....) clause? The list is potentially quite large, far
larger than in this contrived example.

In Access I passed about 15,000...maybe it was 12,000...elements in the
In clause. Very fast return of recs. I too was wondering what the
upper limit was. Perhaps the length of a string.

There is an additional complication, beyond the scope of the problem I
originally posted, in that I would like to get partial matches of names. For example, I would like to match girls' names as LIKE 'Ann*' to get Ann, Anna, Annemarie, etc. But that is another question.

Nov 13 '05 #9

"Tokunaga T." <to*****@jp.ibm.com> wrote in message
This can be extended to an arbitrary-sized list:

SELECT Girls.Name
FROM Girls
INNER JOIN BG ON Girls.ID = BG.GirlID
INNER JOIN Boys ON BG.BoyID = Boys.ID
GROUP BY Girls.Name
HAVING COUNT(*) = 3

on the assumption that (BoyID,GirlID) is unique within BG (as it had
better be). All that needs to be parameterized is the IN list and the
number the count is compared to in the HAVING clause.

Hmm, had not thought of that. Are you aware of any upper bound on the number of entries
in the IN (x, y, ....) clause? The list is potentially quite large, far
larger than in this contrived example.

My idea is to create Boys_Name_List table. And insert names in each
row in the table.

There is an additional complication, beyond the scope of the problem I
originally posted, in that I would like to get partial matches of names. For example, I would like to match girls' names as LIKE 'Ann*' to get Ann, Anna, Annemarie, etc. But that is another question.

I think Eric's query is basically "Relational Division".
Here are some another examples. .....

OK, that answers the original question, thanks to all who replied. That
question however was just a special case of the real problem, which is as
follows: Given a list of quarry names of the form ( Ann*, Betsy, Su*, *ny),
to mean "girls name LIKE Ann* or EQUALS Betsy or LIKE Su* or LIKE *ny". This
is a generalization of the problem I had originally posted, which implied
simple equality as being the operator in every case.

So, question #1 (How) can the solutions to the original question be
generalized to this problem?

Q2: Given that a goal is to avoid building complex queries and avoiding
complicated parameter binding issues (although I might yet do that), (how)
can I implement this as a filtered view into a rowset?

Thanks again to all who have replied.

RDeW
Nov 13 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.