473,394 Members | 1,739 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,394 software developers and data experts.

Document Search for CMS

I have a database which stores documents (e.g. policies and guidelines)
in our content management system. The follow relationships exist:

Documents (1-to-Many) LinkTableKeywordLink (Many-to-1) Keywords

Documents (1-to-Many) LinkTableAttachments (Many-to-1) Attachments

Documents (1-to-Many) LookUpSubjects

Documents (1-to-Many) LookUpDocType

When the user conducts a search I want the SQL to check if the string
they enter is present in:

* the Title, Author, Summary fields of Documents OR
* the Title or Path of the Attachments
* the Keywords that are links OR
* the Subject that is linked OR
* the DocType that is linked.

Is this possible in one SQL query. I have tried the following:

SELECT DISTINCT Documents.Document, Documents.Title,Priority

FROM Documents,
Attachments,Keywords,LinkTableAttachments,LinkTabl eKeywordLink

WHERE
((Documents.Document=LinkTableAttachments.Document ID and
LinkTableAttachments.AttachmentID=Attachments.Id) OR
(Documents.Document=LinkTableKeywordLink.DocumentI D and
LinkTableKeywordLink.KeywordID=Keywords.Id))

AND (Documents.Title Like '%SEARCHSTRING%' OR Documents.Author Like
'%SEARCHSTRING%' OR Documents.Summary Like '%SEARCHSTRING%' OR
Keywords.Keyword Like '%SEARCHSTRING%' OR DocType.DocType Like
'%SEARCHSTRING%' OR Attachments.AttachmentTitle Like '%SEARCHSTRING%'
OR Attachments.Path like '%SEARCHSTRING%' or Subjects.Subject Like
'%SEARCHSTRING%')

AND StartDate<=getDate() and ReviewDate>getDate() order by Title
but this causes, perhaps understandably, a timeout error.

Any thoughts?

Thanks!
Steve

Dec 28 '06 #1
4 1303
(St***********@gmail.com) writes:
Is this possible in one SQL query. I have tried the following:

SELECT DISTINCT Documents.Document, Documents.Title,Priority

FROM Documents,
Attachments,Keywords,LinkTableAttachments,LinkTabl eKeywordLink

WHERE
((Documents.Document=LinkTableAttachments.Document ID and
LinkTableAttachments.AttachmentID=Attachments.Id) OR
(Documents.Document=LinkTableKeywordLink.DocumentI D and
LinkTableKeywordLink.KeywordID=Keywords.Id))

AND (Documents.Title Like '%SEARCHSTRING%' OR Documents.Author Like
'%SEARCHSTRING%' OR Documents.Summary Like '%SEARCHSTRING%' OR
Keywords.Keyword Like '%SEARCHSTRING%' OR DocType.DocType Like
'%SEARCHSTRING%' OR Attachments.AttachmentTitle Like '%SEARCHSTRING%'
OR Attachments.Path like '%SEARCHSTRING%' or Subjects.Subject Like
'%SEARCHSTRING%')

AND StartDate<=getDate() and ReviewDate>getDate() order by Title
but this causes, perhaps understandably, a timeout error.
The most crucial is probably the date conditions. This is a difficult
one there are two dates. But a start at least is make sure that the
query is parameterised, so the last line:

AND StartDate <= @now and ReviewDate @now order by Title

And with parameter I don't mean that you do "SELECT @now = getdate()"
just before the query. You need it to pass @now as a parameter to a
stored procedure or a parameterised query. Then the optimzer will
look at the value to determine how selective an index might be and
use it. If you use an unknown value, the optimizer will assume that 30%
of the rows will match, which is far too many,
If the search columns should are indexed, you should distinguish between
"starts with" and "contains" for the indexes to be useful. That is, only
include % first the search string, if the user actually wants "conatins",
because indexes are only good for starting values.

Adjust the application to have no timeout (= 0) or a very long timeout,
because in some cases the user may need to do searches that take a very long
time.

Finally, drop the DISTINCT, and use the EXISTS predicate to get from
the subtables. That is more efficient,since there no need to sort
the result to weed out the duplicates.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Dec 28 '06 #2
Hi - and thanks for the feedback.

If I run the query with against just one of the 1-many-many-1 table
relationships it works perfectly. The issue, I think, is having two
OR-ed together. When running against one set (e.g. just keywords and
not attachments) the search is very quick.

Best wishes,
Steve
Erland Sommarskog wrote:

>
The most crucial is probably the date conditions. This is a difficult
one there are two dates. But a start at least is make sure that the
query is parameterised, so the last line:

AND StartDate <= @now and ReviewDate @now order by Title

And with parameter I don't mean that you do "SELECT @now = getdate()"
just before the query. You need it to pass @now as a parameter to a
stored procedure or a parameterised query. Then the optimzer will
look at the value to determine how selective an index might be and
use it. If you use an unknown value, the optimizer will assume that 30%
of the rows will match, which is far too many,
If the search columns should are indexed, you should distinguish between
"starts with" and "contains" for the indexes to be useful. That is, only
include % first the search string, if the user actually wants "conatins",
because indexes are only good for starting values.

Adjust the application to have no timeout (= 0) or a very long timeout,
because in some cases the user may need to do searches that take a very long
time.

Finally, drop the DISTINCT, and use the EXISTS predicate to get from
the subtables. That is more efficient,since there no need to sort
the result to weed out the duplicates.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Dec 29 '06 #3
(St***********@gmail.com) writes:
Hi - and thanks for the feedback.

If I run the query with against just one of the 1-many-many-1 table
relationships it works perfectly. The issue, I think, is having two
OR-ed together. When running against one set (e.g. just keywords and
not attachments) the search is very quick.
The more tables you throw into the mix, the more tables there are to search.

If you want me or anyone else to say somehting more useful, we would need to
see:

o CREATE TABLE and INDEX statements for the tables.
o The fast query (and repeat the slow while you are it).
o The output from SET STATISTICS PROFILE ON when you run the two from
a query tool. This output is very wide, so put in an attachment if
possible.
o Which version of SQL Server you are running.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Dec 29 '06 #4
If you want me or anyone else to say somehting more useful, we would need to
see:

o CREATE TABLE and INDEX statements for the tables.
o The fast query (and repeat the slow while you are it).
o The output from SET STATISTICS PROFILE ON when you run the two from
a query tool. This output is very wide, so put in an attachment if
possible.
o Which version of SQL Server you are running.
Hi. Thanks. The question was really about whether the SQL is valid
(given the relationships that are defined) and/or whether it could be
written more efficiently.

Jan 16 '07 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

6
by: Alexandre Plennevaux | last post by:
hello mates, my question is such: should i turn a 80 pages doc file into a database so that: - i can make it searchable through an online form - it can be modified and every modification...
0
by: David Furey | last post by:
Hi, I am using the XSLT document to filter records from an XML document The XSL is shown below: I have a parameter named "search-for" that is used to bring back a list of Vendor Names that...
4
by: Dante | last post by:
Hello. I have a Javascript that gets data from an XML document and displays it through javascript. The problem is that when I do dcfile.getElementsByTagName("subhead").firstChild.nodeName all I...
8
by: Phil Powell | last post by:
if (document.location.href.indexOf('?') >= 0) document.location.href = document.location.href.substring(0, document.location.href.indexOf('?')); if (document.location.href.indexOf('#') >= 0) {...
20
by: weston | last post by:
I've got a piece of code where, for all the world, it looks like this fails in IE 6: hometab = document.getElementById('hometab'); but this succeeds: hometabemt =...
5
by: garfy | last post by:
Hi i get this error in validation Line 22 column 6: document type does not allow element "title" here. <title>Seo Web Design Los Angeles - Web Design And Search Engine Optimization L ...
3
by: =?iso-8859-1?q?Marcos_Jos=E9_Setim?= | last post by:
Hi, I would like to know if is better to use document.forms to detect forms or getElementById. Thanks
8
by: geoffdude | last post by:
Hi everyone, newbie here. I'm really close to finishing my zip code search filter thingy to return a specific message (ultimately a unique phone number for sales contact) after a form query (of...
6
by: Rob | last post by:
Hello, I'm sure this has come up before. I have need for a collection of all elements/objects in an HTML document that have any kind of an attribute (HTML or CSS) that is making use of a URL to...
5
by: ankit1999 | last post by:
I have a problem, everytime i'm run this page http://click2travel.in/index.php i get the this error,,,
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.