I have a database where each row has a set of keywords separated by
commas eg .
ROW 1 ID,<other columns>,"green,hat,man,boat"
ROW 2 ID,<other columns>,"green,boat,seaside"
ROW 3 ID,<other columns>,"red,hat"
I want users to be able to find similar records to the one they are
looking it (this is art BTW). ie if they are looking at row 1 and
click "Find Similar Works of Art" I want to do a query where I find
records that have a commonality of keywords and obviously the ones
that match the most keywords should be shown first. The keywords are
currently just stored comma-separated in a single column but I am not
attached to this. Nor are they guaranteed to come in any order (ie
they are not alpha sorted) but they could be.
Number of records is around 15000 on a hosted server.
Any tips for best query/algorithm/data storage to achieve this? or
references to useful sources?
Thanks,
RB 3 3152
Roland Burr wrote: I have a database where each row has a set of keywords separated by commas eg .
ROW 1 ID,<other columns>,"green,hat,man,boat" ROW 2 ID,<other columns>,"green,boat,seaside" ROW 3 ID,<other columns>,"red,hat"
I want users to be able to find similar records to the one they are looking it (this is art BTW). ie if they are looking at row 1 and click "Find Similar Works of Art" I want to do a query where I find records that have a commonality of keywords and obviously the ones that match the most keywords should be shown first. The keywords are currently just stored comma-separated in a single column but I am not attached to this. Nor are they guaranteed to come in any order (ie they are not alpha sorted) but they could be.
Number of records is around 15000 on a hosted server.
Any tips for best query/algorithm/data storage to achieve this? or references to useful sources?
Thanks, RB
normally you break the keywords out into a child table, so that you have two
tables (there may be typos in my SQL, this is off the top of my head):
CREATE TABLE artwork (
ID char(10) PRIMARY KEY,
....other columns...
)
CREATE TABLE keywords (
ID char(10) FOREIGN KEY REFERENCES artwork(ID),
keyword char(10)
)
Index on keyword and the SQL queries should write themselves.
--
Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth@(Sec)ure(Dat)a(.com)
Roland Burr (rb*****@hotmail.com) writes: I have a database where each row has a set of keywords separated by commas eg .
ROW 1 ID,<other columns>,"green,hat,man,boat" ROW 2 ID,<other columns>,"green,boat,seaside" ROW 3 ID,<other columns>,"red,hat"
I want users to be able to find similar records to the one they are looking it (this is art BTW). ie if they are looking at row 1 and click "Find Similar Works of Art" I want to do a query where I find records that have a commonality of keywords and obviously the ones that match the most keywords should be shown first. The keywords are currently just stored comma-separated in a single column but I am not attached to this. Nor are they guaranteed to come in any order (ie they are not alpha sorted) but they could be.
I'm glad to hear that you are not attached to the format, because
this sort of format, is really ugly to work with in SQL, as it
violates a basic principle for relational database design. As Kenneth
said, store the keywords as rows.
Find the number of matches given one item would be as easy as this:
SELECT k1.id, COUNT(*)
JOIN keywords k1
WHERE EXISTS (SELECT *
FROM keywordss k2
WHERE k2.id = @currentid
AND k1.keyword = k2.keyword)
ORDER BY 2 DESC
But I will have to add the disclaimed that I have not tested the
above.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se
Books Online for SQL Server SP3 at http://www.microsoft.com/sql/techinf...2000/books.asp
>> I have a database where each row has a set of keywords separated by
commas <<
You are in violation of First Normal Form, so we are not even in a
relational model at the start of this. a query where I find records [sic] that have a commonality of
keywords <<
CREATE TABLE Keywords
(title_id INTEGER NOT NULL,
keyword VARCHAR(15) NOT NULL,
PRIMARY KEY (title_id, keyword));
load this table with your search words:
CREATE TABLE Searchwords
(keyword VARCHAR(15) NOT NULL PRIMARY KEY);
now the query is easy.
SELECT K1.title_id, COUNT(*) AS matches
Keywords AS K1,
Searchwords AS S1
WHERE K1.keyword = S1.keyword
GROUP BY K1.title_id
ORDER BY matches DESC;
... and obviously the ones that match the most keywords should be
shown first.<<
Actually, ordering is a display issue for the front end and not the
database, but you can add an ORDER BY clause. Having said all of this,
have you considered using a text search tool instead of SQL? Cheap,
fast and much more powerful for this knd of thing. This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics
by: Frank |
last post by:
What is the best IDE for developing PHP applications?
What do you use and why?
Thanks.
|
by: Chris Uwins |
last post by:
Hi there, i know theres a number of ways I can achieve this but want
to know the best, (but still quite simple).
Up until a year ago I never used Access but have designed a few
databases for...
|
by: Schoo |
last post by:
I have some program constants that I need to set and I thought it would be
best to set them all up in a central location so that I can change them
quickly when we go from development to production....
|
by: jojobar |
last post by:
Hello,
I am trying to use vs.net 2005 to migrate a project originally in vs.net
2003. I started with creation of a "web site", and then created folders for
each component of the site.
I read...
|
by: JD |
last post by:
Hi guys
What's the best way to specify font size using CSS? I try to avoid
absolute units like pt and px because then users can't resize the fonts
in IE, but % and em are a complete pain to use...
|
by: Tim Chase |
last post by:
When you have a set, known to be of length one, is there a "best"
("most pythonic") way to retrieve that one item?
# given that I've got Python2.3. on hand,
# hack the following two lines to get...
|
by: Summercoolness |
last post by:
it seems that range() can be really slow:
the following program will run, and the last line shows how long it ran
for:
import time
startTime = time.time()
a = 1.0
|
by: Jim |
last post by:
Hi guys,
I have an object which represents an "item" in a CMS "component" where
an "item" in the most basic form just a field, and a "component" is
effectively a table.
"item" objects can be...
|
by: PeteSpeed |
last post by:
Hi,
I need help with creating an algorithm to solve the problem below in the most efficient manner (efficiency in this case means low processing, not memory)
(Disclaimer: I studied discrete...
|
by: tig2810 |
last post by:
Hi
I was wondering if someone could let me know the best/correct way of doing this. I have a shortened example of code below:
In the beginning of the class I have a List of string containing...
|
by: lllomh |
last post by:
Define the method first
this.state = {
buttonBackgroundColor: 'green',
isBlinking: false, // A new status is added to identify whether the button is blinking or not
}
autoStart=()=>{
|
by: DJRhino |
last post by:
Was curious if anyone else was having this same issue or not....
I was just Up/Down graded to windows 11 and now my access combo boxes are not acting right. With win 10 I could start typing...
|
by: isladogs |
last post by:
The next Access Europe meeting will be on Wednesday 4 Oct 2023 starting at 18:00 UK time (6PM UTC+1) and finishing at about 19:15 (7.15PM)
The start time is equivalent to 19:00 (7PM) in Central...
|
by: NeoPa |
last post by:
Hello everyone.
I find myself stuck trying to find the VBA way to get Access to create a PDF of the currently-selected (and open) object (Form or Report).
I know it can be done by selecting :...
|
by: NeoPa |
last post by:
Introduction
For this article I'll be using a very simple database which has Form (clsForm) & Report (clsReport) classes that simply handle making the calling Form invisible until the Form, or all...
|
by: Teri B |
last post by:
Hi, I have created a sub-form Roles. In my course form the user selects the roles assigned to the course.
0ne-to-many. One course many roles.
Then I created a report based on the Course form and...
|
by: isladogs |
last post by:
The next Access Europe meeting will be on Wednesday 1 Nov 2023 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM)
Please note that the UK and Europe revert to winter time on...
|
by: nia12 |
last post by:
Hi there,
I am very new to Access so apologies if any of this is obvious/not clear.
I am creating a data collection tool for health care employees to complete. It consists of a number of...
|
by: NeoPa |
last post by:
Introduction
For this article I'll be focusing on the Report (clsReport) class. This simply handles making the calling Form invisible until all of the Reports opened by it have been closed, when it...
| |