472,976 Members | 1,432 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,976 software developers and data experts.

Best way for a Set-based query

I have a database where each row has a set of keywords separated by
commas eg .

ROW 1 ID,<other columns>,"green,hat,man,boat"
ROW 2 ID,<other columns>,"green,boat,seaside"
ROW 3 ID,<other columns>,"red,hat"

I want users to be able to find similar records to the one they are
looking it (this is art BTW). ie if they are looking at row 1 and
click "Find Similar Works of Art" I want to do a query where I find
records that have a commonality of keywords and obviously the ones
that match the most keywords should be shown first. The keywords are
currently just stored comma-separated in a single column but I am not
attached to this. Nor are they guaranteed to come in any order (ie
they are not alpha sorted) but they could be.

Number of records is around 15000 on a hosted server.

Any tips for best query/algorithm/data storage to achieve this? or
references to useful sources?

Thanks,
RB
Jul 23 '05 #1
3 3152
Roland Burr wrote:
I have a database where each row has a set of keywords separated by
commas eg .

ROW 1 ID,<other columns>,"green,hat,man,boat"
ROW 2 ID,<other columns>,"green,boat,seaside"
ROW 3 ID,<other columns>,"red,hat"

I want users to be able to find similar records to the one they are
looking it (this is art BTW). ie if they are looking at row 1 and
click "Find Similar Works of Art" I want to do a query where I find
records that have a commonality of keywords and obviously the ones
that match the most keywords should be shown first. The keywords are
currently just stored comma-separated in a single column but I am not
attached to this. Nor are they guaranteed to come in any order (ie
they are not alpha sorted) but they could be.

Number of records is around 15000 on a hosted server.

Any tips for best query/algorithm/data storage to achieve this? or
references to useful sources?

Thanks,
RB


normally you break the keywords out into a child table, so that you have two
tables (there may be typos in my SQL, this is off the top of my head):

CREATE TABLE artwork (
ID char(10) PRIMARY KEY,
....other columns...
)

CREATE TABLE keywords (
ID char(10) FOREIGN KEY REFERENCES artwork(ID),
keyword char(10)
)

Index on keyword and the SQL queries should write themselves.

--
Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth@(Sec)ure(Dat)a(.com)
Jul 23 '05 #2
Roland Burr (rb*****@hotmail.com) writes:
I have a database where each row has a set of keywords separated by
commas eg .

ROW 1 ID,<other columns>,"green,hat,man,boat"
ROW 2 ID,<other columns>,"green,boat,seaside"
ROW 3 ID,<other columns>,"red,hat"

I want users to be able to find similar records to the one they are
looking it (this is art BTW). ie if they are looking at row 1 and
click "Find Similar Works of Art" I want to do a query where I find
records that have a commonality of keywords and obviously the ones
that match the most keywords should be shown first. The keywords are
currently just stored comma-separated in a single column but I am not
attached to this. Nor are they guaranteed to come in any order (ie
they are not alpha sorted) but they could be.


I'm glad to hear that you are not attached to the format, because
this sort of format, is really ugly to work with in SQL, as it
violates a basic principle for relational database design. As Kenneth
said, store the keywords as rows.

Find the number of matches given one item would be as easy as this:

SELECT k1.id, COUNT(*)
JOIN keywords k1
WHERE EXISTS (SELECT *
FROM keywordss k2
WHERE k2.id = @currentid
AND k1.keyword = k2.keyword)
ORDER BY 2 DESC

But I will have to add the disclaimed that I have not tested the
above.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #3
>> I have a database where each row has a set of keywords separated by
commas <<

You are in violation of First Normal Form, so we are not even in a
relational model at the start of this.
a query where I find records [sic] that have a commonality of keywords <<

CREATE TABLE Keywords
(title_id INTEGER NOT NULL,
keyword VARCHAR(15) NOT NULL,
PRIMARY KEY (title_id, keyword));

load this table with your search words:

CREATE TABLE Searchwords
(keyword VARCHAR(15) NOT NULL PRIMARY KEY);

now the query is easy.

SELECT K1.title_id, COUNT(*) AS matches
Keywords AS K1,
Searchwords AS S1
WHERE K1.keyword = S1.keyword
GROUP BY K1.title_id
ORDER BY matches DESC;
... and obviously the ones that match the most keywords should be

shown first.<<

Actually, ordering is a display issue for the front end and not the
database, but you can add an ORDER BY clause. Having said all of this,
have you considered using a text search tool instead of SQL? Cheap,
fast and much more powerful for this knd of thing.

Jul 23 '05 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

33
by: Frank | last post by:
What is the best IDE for developing PHP applications? What do you use and why? Thanks.
1
by: Chris Uwins | last post by:
Hi there, i know theres a number of ways I can achieve this but want to know the best, (but still quite simple). Up until a year ago I never used Access but have designed a few databases for...
5
by: Schoo | last post by:
I have some program constants that I need to set and I thought it would be best to set them all up in a central location so that I can change them quickly when we go from development to production....
10
by: jojobar | last post by:
Hello, I am trying to use vs.net 2005 to migrate a project originally in vs.net 2003. I started with creation of a "web site", and then created folders for each component of the site. I read...
16
by: JD | last post by:
Hi guys What's the best way to specify font size using CSS? I try to avoid absolute units like pt and px because then users can't resize the fonts in IE, but % and em are a complete pain to use...
8
by: Tim Chase | last post by:
When you have a set, known to be of length one, is there a "best" ("most pythonic") way to retrieve that one item? # given that I've got Python2.3. on hand, # hack the following two lines to get...
45
by: Summercoolness | last post by:
it seems that range() can be really slow: the following program will run, and the last line shows how long it ran for: import time startTime = time.time() a = 1.0
41
by: Jim | last post by:
Hi guys, I have an object which represents an "item" in a CMS "component" where an "item" in the most basic form just a field, and a "component" is effectively a table. "item" objects can be...
7
by: PeteSpeed | last post by:
Hi, I need help with creating an algorithm to solve the problem below in the most efficient manner (efficiency in this case means low processing, not memory) (Disclaimer: I studied discrete...
6
by: tig2810 | last post by:
Hi I was wondering if someone could let me know the best/correct way of doing this. I have a shortened example of code below: In the beginning of the class I have a List of string containing...
0
by: lllomh | last post by:
Define the method first this.state = { buttonBackgroundColor: 'green', isBlinking: false, // A new status is added to identify whether the button is blinking or not } autoStart=()=>{
2
by: DJRhino | last post by:
Was curious if anyone else was having this same issue or not.... I was just Up/Down graded to windows 11 and now my access combo boxes are not acting right. With win 10 I could start typing...
2
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 4 Oct 2023 starting at 18:00 UK time (6PM UTC+1) and finishing at about 19:15 (7.15PM) The start time is equivalent to 19:00 (7PM) in Central...
4
NeoPa
by: NeoPa | last post by:
Hello everyone. I find myself stuck trying to find the VBA way to get Access to create a PDF of the currently-selected (and open) object (Form or Report). I know it can be done by selecting :...
3
NeoPa
by: NeoPa | last post by:
Introduction For this article I'll be using a very simple database which has Form (clsForm) & Report (clsReport) classes that simply handle making the calling Form invisible until the Form, or all...
1
by: Teri B | last post by:
Hi, I have created a sub-form Roles. In my course form the user selects the roles assigned to the course. 0ne-to-many. One course many roles. Then I created a report based on the Course form and...
0
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 1 Nov 2023 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM) Please note that the UK and Europe revert to winter time on...
3
by: nia12 | last post by:
Hi there, I am very new to Access so apologies if any of this is obvious/not clear. I am creating a data collection tool for health care employees to complete. It consists of a number of...
0
NeoPa
by: NeoPa | last post by:
Introduction For this article I'll be focusing on the Report (clsReport) class. This simply handles making the calling Form invisible until all of the Reports opened by it have been closed, when it...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.