473,809 Members | 2,708 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

How should I set up this database?

Imagine 25.000.000 records in a database with a table like this:

TABLE SUBJECTS

table_id (INT)
subject (VARCHAR(200))
date_added (TIMESTAMP)

The SUBJECT field is a long string. Something like these:

Example 1:
"Johnny posts 'PHP Examples' version 2'

Example 2:
"Christmas Melodies from Japan (2001)"

QUESTION:

The user should be able to search the database on the SUBJECT field and
should be able to use partial keywords like:

Keyword examples:

- lodies (finds Melodies)
- Examp (finds Examples)
- John (finds Johnny)

Right now, using WHERE LIKE takes 40 seconds since it checks every single
record. I can't use indexes either. "Of course" you might say.

But what I want to know is... what is the workaround for this? How can I
create a database which allows people to do wilcard searches while keeping
my queries lightning fast?

Other websites and applications have this... so there must be a brilliant
"trick" or workaround for this.

Any ideas?
Jul 20 '05 #1
4 1662
Ondernemer wrote:
Other websites and applications have this... so there must be a brilliant
"trick" or workaround for this.


Basic idea is to create a table where is a list of all words used in
your text. Then create another table which links these words to the
table where the actual texts are located. After searching correct words
from the word table, rest of the queries can use indexes so it should be
fast. Even if you have to check all rows in the word table, you will
still propably have less than 25 000 000 rows there.

I just wrote some time ago a pretty good example about this subject:

http://groups.google.com/groups?selm...0read3.inet.fi

I should warn you that I have not studied this method much, so there
might be better ways to implement the same thing. But the idea behind
the method should be clear enough.
Jul 20 '05 #2

"Aggro" wrote:
http://groups.google.com/groups?selm...0read3.inet.fi


Brilliant. I actually was thinking the same thing, but was afraid I was
thinking in an amateur way and there was a more efficient command/way of
doing this. :-)

Thanks m8.
Jul 20 '05 #3

"Aggro" <sp**********@y ahoo.com> wrote in message
news:2E******** *******@read3.i net.fi...
Ondernemer wrote:
Other websites and applications have this... so there must be a brilliant "trick" or workaround for this.


Basic idea is to create a table where is a list of all words used in
your text. Then create another table which links these words to the
table where the actual texts are located. After searching correct words
from the word table, rest of the queries can use indexes so it should be
fast. Even if you have to check all rows in the word table, you will
still propably have less than 25 000 000 rows there.

I just wrote some time ago a pretty good example about this subject:

http://groups.google.com/groups?selm...0read3.inet.fi

I should warn you that I have not studied this method much, so there
might be better ways to implement the same thing. But the idea behind
the method should be clear enough.


"And" will not work using a table of keywords. "Or" will work but you will
get everything. The best way to do this is to create a column with all
keywords (this must be of type MyIsam) then do a full text search.

Rich
Jul 20 '05 #4
"Ondernemer " <no**@email.com > wrote in message
news:10******** *****@corp.supe rnews.com...
Imagine 25.000.000 records in a database with a table like this:

TABLE SUBJECTS

table_id (INT)
subject (VARCHAR(200))
date_added (TIMESTAMP)

The SUBJECT field is a long string. Something like these:

Example 1:
"Johnny posts 'PHP Examples' version 2'

Example 2:
"Christmas Melodies from Japan (2001)"

QUESTION:

The user should be able to search the database on the SUBJECT field and
should be able to use partial keywords like:

Keyword examples:

- lodies (finds Melodies)
- Examp (finds Examples)
- John (finds Johnny)

Right now, using WHERE LIKE takes 40 seconds since it checks every single
record. I can't use indexes either. "Of course" you might say.

But what I want to know is... what is the workaround for this? How can I
create a database which allows people to do wilcard searches while keeping
my queries lightning fast?

Other websites and applications have this... so there must be a brilliant
"trick" or workaround for this.

Any ideas?


One quick idea would be to place a LIMIT on the returned rows (like
search engines). Look up the use of LIMIT in the mysql docs...
http://dev.mysql.com/doc/mysql/en/SELECT.html

From the page:

The LIMIT clause can be used to constrain the number of rows returned by the
SELECT statement. LIMIT takes one or two numeric arguments, which must be
integer constants. With two arguments, the first argument specifies the
offset of the first row to return, and the second specifies the maximum
number of rows to return. The offset of the initial row is 0 (not 1):

mysql> SELECT * FROM table LIMIT 5,10; # Retrieve rows 6-15
For compatibility with PostgreSQL, MySQL also supports the LIMIT row_count
OFFSET offset syntax. To retrieve all rows from a certain offset up to the
end of the result set, you can use some large number for the second
parameter. This statement retrieves all rows from the 96th row to the last:

mysql> SELECT * FROM table LIMIT 95,184467440737 09551615;
With one argument, the value specifies the number of rows to return from the
beginning of the result set:

mysql> SELECT * FROM table LIMIT 5; # Retrieve first 5 rows
--- used from the MySQL web site---
Jul 20 '05 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

18
7391
by: cjl | last post by:
Hey all: I know that it is silly in the age of Google to 'lose' something on the internet, but I recently checked out a project that had implemented a database with a subset of SQL in pure client-side javascript. I forgot to bookmark it, and now I can't find it. Anyone?
0
1578
by: Manzoorul Hassan | last post by:
I just Installed MySQL v4.1.14 and had created a Database (from a script). But I would like to redo the Database portion of it but am not able to drop the Database. If I do a "show database;" I still see the database. But when I look in the data directory, it is not there. Below is my attempt the create / drop the DB: mysql> create database testlink; ERROR 1007 (HY000): Can't create database 'testlink'; database exists mysql> drop...
6
7348
by: Marvin Libson | last post by:
Hi All: I am running DB2 UDB V7.2 with FP11. Platform is Windows 2000. I have created a java UDF and trigger. When I update my database I get the following error: SQL1224N A database agent could not be started to service a request, or was terminated as a result of a database system shutdown or a force command.
0
2282
by: Alex | last post by:
Hi all, I've been running a db2 V8.1 databasle to store my radius server accounting info for a *long* time and have never had any problems with it. Last week we had a power outage in our computer room and since then I've been having problems with one of the 3 db2 instances running on my server ( BTW 8.1 FP4). The databases on the other 2 instances work fine but as far as the radius server one is concerned whenever i try
1
5971
by: pintur | last post by:
The message is: SQL1036C Errore di I/O durante l' accesso al database. SQLSTATE=58030 what is the proble? what for restore tables? thanks
5
34050
by: Steve | last post by:
Hi; I thought I would rephrase a question I asked in another post to more quickly get to the heart of the matter. My apologies for anyone who is offended by what appears to be a repetition. My company has two identical web sites. One copy is for our customer, and one copy is for us to test our code changes on. We developed a hard to isolate bug in the copy of the web stie for our
0
16383
debasisdas
by: debasisdas | last post by:
DATABASE LINK =============== A database link is a path through which a remote user in another database can connect to any other database. Once created database link exists as an object in the user schema. DATABASE LINK -3 TYPES ======================= PRIVATE--DEFAULT--A PRIVATE DBLINK IS ONLY AVAILABLE TO THE USER WHO HAS CREATED IT.IT IS NOT POSSIBLE FOR A USER TO GRANT ACCESS ON A PRIVATE DBLINK TO OTHER USERS. ------- PUBLIC--IT...
0
3326
by: Jack | last post by:
Training Classes for Oracle10g, 9i, 8i Certification training in Oracle10g and 9i: DBA, Developer, Discoverer. training conducted at your location worldwide. Courseware licensing also available. http://a.uuload.com/Oracle-Database.htm Unlock the Value of Your IT System Confio Software's Ignite for Oracle uses wait-event analysis to improve database performance by 65% or more. Gain visibility of every Oracle wait event. Download free...
0
3300
by: Winder | last post by:
Training Classes for Oracle10g, 9i, 8i Certification training in Oracle10g and 9i: DBA, Developer, Discoverer. training conducted at your location worldwide. Courseware licensing also available. http://a.uuload.com/Oracle-Database.htm Unlock the Value of Your IT System Confio Software's Ignite for Oracle uses wait-event analysis to improve database performance by 65% or more. Gain visibility of every Oracle wait event. Download free...
0
2564
by: Laurynn | last post by:
# (ebook - pdf) - programming - mysql - php database applicati # (Ebook - Pdf)Learnkey How To Design A Database - Sql And Crystal Report # (ebook-pdf) E F Codd - Extending the Database Relational Model to Capture More Meaning (1979) # Addison Wesley - Database Design for Mere Mortals chm # Addison Wesley - Refactoring Databases Evolutionary Database Design (2006) # Apress Beginning Databases with PostgreSQL From Novice to Professional...
0
9721
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9601
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10637
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
9199
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
0
6881
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5550
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
5687
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
2
3861
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
3014
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.