By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
435,148 Members | 759 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 435,148 IT Pros & Developers. It's quick & easy.

select statement needed

P: n/a
I need a MySQL select statement as part of a PHP script. I want to find
rows where a certain column either starts with or equals a user-supplied
string. The string will be 1 or more characters in length.
The sending page allows the user to choose which search criteria he
wants, so this (and many other) pieces of the select statement may or
may not be present. The statement always starts with

SELECT * from logdata where driver = '$_COOKIE[username]'

If the user has selected them and supplied the proper data for a search,
other elements will be added to the query string. For example, it may
look like

SELECT * from logdata where driver = '$_COOKIE[username]' AND amount >=
$_POST[txtLowAmount] AND amount <= $_POST[txtHighAmount] AND restaurant
regexp???

I'm not even sure if that is where a regexp belongs or if that is how it
should be used.
Ultimately, I would prefer it be non case sensitive.
Here are some examples of what I want:

If the user string is "CHIL"
Chili's true
Charlie's false
CHIL true
ch false

Thank you in advance
Jeff Sandler
Jul 20 '05 #1
Share this Question
Share on Google+
11 Replies


P: n/a
Jeff Sandler wrote:
If the user string is "CHIL"
Chili's true
Charlie's false
CHIL true
ch false


select * from logdata where driver like 'CHIL%';

If you also want that:
a CHIL true

You will need:
select * from logdata where driver like '%CHIL%';
Jul 20 '05 #2

P: n/a
Aggro wrote:
You will need:
select * from logdata where driver like '%CHIL%';


Additional note: If you use the % in the front of word, index in that
column won't be used. So be carefull if you have a lot of rows in that
table.
Jul 20 '05 #3

P: n/a
wow..
that's hard..
now I know why my database string query is very slow.

how can I look for a word in a tekst field the most efficient?
can I index on some words (e.g. frequent used words).

"Aggro" <sp**********@yahoo.com> wrote in message
news:qv***************@read3.inet.fi...
: Aggro wrote:
:
: > You will need:
: > select * from logdata where driver like '%CHIL%';
:
: Additional note: If you use the % in the front of word, index in that
: column won't be used. So be carefull if you have a lot of rows in that
: table.
Jul 20 '05 #4

P: n/a
Wouter wrote:
how can I look for a word in a tekst field the most efficient?
can I index on some words (e.g. frequent used words).


If you have only about 20 or 100 unique words in your table that repeat
there multiple times, you should put these words in a second table with
numerical id value.

Then in your first table, instead of the word, use the id value from the
second table. Then you can run "like '%something%'" to second table and
use the result id values to first table "select * from first where
word_id in (1,2,3,4,7);"

If your first table contains sentences, then you can again create a
second table and put all the unique words found, in the sentences, in
the second table. Your second table is normally smaller than the first
table (with human languages), but this is not always the case (with
random strings).

But if the index-table grows big, your queries will propably be slow.
Note that Google for example doesn't allow to use * or % in the words
for this reason. The index table is large, because there are many
different languages used in the Internet.
Jul 20 '05 #5

P: n/a
If I do understand you correctly, I need to make a second table like this:

word_id, word

and add in the first table an extra ENUM field that contains the words_ids.

The search in the new table for the maching words getting the word_id s.
Then look-up in the first table in the new field the maching word_id
combination.

My table is about 1GByte containg newsarticles in a text field.

Wouter

"Aggro" <sp**********@yahoo.com> wrote in message
news:Fm**************@read3.inet.fi...
: Wouter wrote:
:
: > how can I look for a word in a tekst field the most efficient?
: > can I index on some words (e.g. frequent used words).
:
: If you have only about 20 or 100 unique words in your table that repeat
: there multiple times, you should put these words in a second table with
: numerical id value.
:
: Then in your first table, instead of the word, use the id value from the
: second table. Then you can run "like '%something%'" to second table and
: use the result id values to first table "select * from first where
: word_id in (1,2,3,4,7);"
:
: If your first table contains sentences, then you can again create a
: second table and put all the unique words found, in the sentences, in
: the second table. Your second table is normally smaller than the first
: table (with human languages), but this is not always the case (with
: random strings).
:
: But if the index-table grows big, your queries will propably be slow.
: Note that Google for example doesn't allow to use * or % in the words
: for this reason. The index table is large, because there are many
: different languages used in the Internet.
Jul 20 '05 #6

P: n/a
Wouter wrote:
If I do understand you correctly, I need to make a second table like this:

word_id, word
I would prefer column names: id and word in the second table. And in the
first table word_id would be good, assuming the second table name is
word. Because in the second table the id would be the id pointing to the
row itself, and normally "id" is used as a name in those cases.
and add in the first table an extra ENUM field that contains the words_ids.
Not enum, simple int unsigned will do, but make sure that each id you
put in word_id would have a matching row in "word" table.
The search in the new table for the maching words getting the word_id s.
Then look-up in the first table in the new field the maching word_id
combination.
Yep. I guess you could also use joins to make it happen inside one
query, but joins can be tricky to use so I suggest you to try first with
two queries and measure some times and then try with the join to see if
it affects the speed.
My table is about 1GByte containg newsarticles in a text field.


Creating the word table might take some time on slow computers ;) But if
there is only one language used, I would make a guess that this would
speed up queries quite a bit. But with this method you can't search for
full sentences. Unless you first use the words to find corresponding
rows in the first table and then only to those rows run a text search
for the given sentence. This should speed things up to the normal, since
you don't have to check all the rows in that table.
Jul 20 '05 #7

P: n/a
Wouter wrote:
If I do understand you correctly, I need to make a second table like this:
word_id, word
and add in the first table an extra ENUM field that contains the words_ids.
Another method is to create a many-to-many table, mapping words to
articles. That is, a table with fields article_id, word_id that
reference the primary keys of the article table and the word table
respectively.

The advantage of this is that you can add more words to the list of
indexed words easily. You can also search for occurrances of articles
matching a specific word through simple INNER JOIN conditions, instead
of SET operations.
My table is about 1GByte containg newsarticles in a text field.


That's a pretty sizeable table, but how much does storage cost per 1GB
these days, about US $0.50?

Yet another strategy for text searches in MySQL is to use a full-text
index. See http://dev.mysql.com/doc/mysql/en/Fulltext_Search.html

Full text index is not the best solution in all situations, but it's
quick to set up, and will give some benefit while you develop a more
custom solution.

Regards,
Bill K.
Jul 20 '05 #8

P: n/a
ok let's make an example:
Table1:

id, article_titel, article_text, word_id

Table2:

id, word
Then a user searches for:

"George W. Bush"
And as resut I'll give the user all articles that contain
"George", "W." and "Bush"
I lookup these 3 words in Table2.
Now I don't understand how I should find the correct article in table 1.
If word_id was a ENUM in the first table, I would search for all that have
all 3 id's from the search in table2.

But now word_id is an integer.
I do not follow what to so.
Yes, I understand that indexing in this way takes a lot of time, but if it
speeds up the queries, it would be wonderfull :)
(And yes, I have a slow computer; XP 2000+ :).

Wouter


"Aggro" <sp**********@yahoo.com> wrote in message
news:Qs**************@read3.inet.fi...
: Wouter wrote:
:
: > If I do understand you correctly, I need to make a second table like
this:
: >
: > word_id, word
:
: I would prefer column names: id and word in the second table. And in the
: first table word_id would be good, assuming the second table name is
: word. Because in the second table the id would be the id pointing to the
: row itself, and normally "id" is used as a name in those cases.
:
: > and add in the first table an extra ENUM field that contains the
words_ids.
:
: Not enum, simple int unsigned will do, but make sure that each id you
: put in word_id would have a matching row in "word" table.
:
: > The search in the new table for the maching words getting the word_id s.
: > Then look-up in the first table in the new field the maching word_id
: > combination.
:
: Yep. I guess you could also use joins to make it happen inside one
: query, but joins can be tricky to use so I suggest you to try first with
: two queries and measure some times and then try with the join to see if
: it affects the speed.
:
: > My table is about 1GByte containg newsarticles in a text field.
:
: Creating the word table might take some time on slow computers ;) But if
: there is only one language used, I would make a guess that this would
: speed up queries quite a bit. But with this method you can't search for
: full sentences. Unless you first use the words to find corresponding
: rows in the first table and then only to those rows run a text search
: for the given sentence. This should speed things up to the normal, since
: you don't have to check all the rows in that table.
Jul 20 '05 #9

P: n/a
Wouter wrote:
Then a user searches for:

"George W. Bush"
And as resut I'll give the user all articles that contain
"George", "W." and "Bush"
If I may suggest, convert all non alphabet characters into white space
and then trip all white spaces into one 1 white space between words.
Then from the list of words, drop off at least all words that have less
than 3 characters. Also make all words lowercase. Also be sure not to
insert the same word twice in the word table (propably would be best to
set in unique) and not to attach the same word twice in the article_word
table (I tell more about this table below).

I lookup these 3 words in Table2.
Now I don't understand how I should find the correct article in table 1.
If word_id was a ENUM in the first table, I would search for all that have
all 3 id's from the search in table2.

But now word_id is an integer.
I do not follow what to so.


Uh, my mistake, let's make a full example so I won't give any more wrong
ideas ;)

create table article(
id int unsigned primary key auto_increment,
titel text,
article_text text );

create table word(
id int unsigned primary key auto_increment,
word varchar(255) unique );

create table article_word(
article_id int unsigned,
word_id int unsigned
);

# indexes would be nice too

create index article_word_index on article_word(word_id);

# You could index the word table also, but that is usefull only if you
# don't use the % in front of words. If you can live with the idea that
# only full words can be used, you should get pretty fast queries by
# indexing the word table also.

insert into article values( 1, 'title1', 'I just found out that George
W. Bush is the president of USA, again.' );
insert into article values( 2, 'title2', 'Just some text' );
insert into article values( 3, 'title3', 'I repeat the word George.' );

insert into word values( 1, 'just' );
insert into word values( 2, 'found' );
insert into word values( 3, 'out' );
insert into word values( 4, 'that' );
insert into word values( 5, 'george' );
insert into word values( 6, 'bush' );
insert into word values( 7, 'the' );
insert into word values( 8, 'president' );
insert into word values( 9, 'usa' );
insert into word values( 10, 'again' );
insert into word values( 11, 'this' );
insert into word values( 12, 'some' );
insert into word values( 13, 'text' );
insert into word values( 14, 'repeat' );
insert into word values( 15, 'word' );

insert into article_word values( 1, 1);
insert into article_word values( 1, 2);
insert into article_word values( 1, 3);
insert into article_word values( 1, 4);
insert into article_word values( 1, 5);
insert into article_word values( 1, 6);
insert into article_word values( 1, 7);
insert into article_word values( 1, 8);
insert into article_word values( 1, 9);
insert into article_word values( 2, 1);
insert into article_word values( 2, 12);
insert into article_word values( 2, 13);
insert into article_word values( 3, 14);
insert into article_word values( 3, 15);
insert into article_word values( 3, 5);

And now to the queries:

mysql> select * from word where word like '%George%' or word like '%Bush%';
+----+--------+
| id | word |
+----+--------+
| 5 | george |
| 6 | bush |
+----+--------+
2 rows in set (0.00 sec)

mysql> select article_id,article_text from article_word,article where
article_word.word_id in(5,6) and article.id = article_word.article_id
group by article_id;

1 | I just found out that George W. Bush is the president of USA, again.
3 | I repeat the word George.

2 rows in set (0.01 sec)

---------------------
To search only articles where all the worlds all, is a bit harder with
one query. Easy solution that comes to my mind, but returls also perhaps
non wanter results:

select count(*) as matches, article_id,article_text from
article_word,article where article_word.word_id in(5,6) and article.id =
article_word.article_id group by article_id order by matches desc;

This would return also articles that don't match all the words used, but
sorts them so that articles where most words match, are returned first.
Jul 20 '05 #10

P: n/a
AH :)
I see :)

I think GROUP would slow down the process, right?
I see this is easy to maintain.

Is this faster than the SET thing?
Hmm.. set can only have 64 words..
ok, so that's no option :)

My words are never larger then 42 characters.
I'n not shure why ;-)
Thanks a lot, when I have bought a new hard disk (I have more then 1 table
of 1 GB) I'll make this index construction.
And of cause I'll let you all know the result.

Does the query speed up when I change from MyISAM to InnoDB?
(because InnoDB takes a lot more space).

Wouter


"Aggro" <sp**********@yahoo.com> wrote in message
news:Hp**************@read3.inet.fi...
: Wouter wrote:
:
: > Then a user searches for:
: >
: > "George W. Bush"
: >
: >
: > And as resut I'll give the user all articles that contain
: > "George", "W." and "Bush"
:
: If I may suggest, convert all non alphabet characters into white space
: and then trip all white spaces into one 1 white space between words.
: Then from the list of words, drop off at least all words that have less
: than 3 characters. Also make all words lowercase. Also be sure not to
: insert the same word twice in the word table (propably would be best to
: set in unique) and not to attach the same word twice in the article_word
: table (I tell more about this table below).
:
: >
: > I lookup these 3 words in Table2.
: >
: >
: > Now I don't understand how I should find the correct article in table 1.
: > If word_id was a ENUM in the first table, I would search for all that
have
: > all 3 id's from the search in table2.
: >
: > But now word_id is an integer.
: > I do not follow what to so.
:
: Uh, my mistake, let's make a full example so I won't give any more wrong
: ideas ;)
:
: create table article(
: id int unsigned primary key auto_increment,
: titel text,
: article_text text );
:
: create table word(
: id int unsigned primary key auto_increment,
: word varchar(255) unique );
:
: create table article_word(
: article_id int unsigned,
: word_id int unsigned
: );
:
: # indexes would be nice too
:
: create index article_word_index on article_word(word_id);
:
: # You could index the word table also, but that is usefull only if you
: # don't use the % in front of words. If you can live with the idea that
: # only full words can be used, you should get pretty fast queries by
: # indexing the word table also.
:
: insert into article values( 1, 'title1', 'I just found out that George
: W. Bush is the president of USA, again.' );
: insert into article values( 2, 'title2', 'Just some text' );
: insert into article values( 3, 'title3', 'I repeat the word George.' );
:
: insert into word values( 1, 'just' );
: insert into word values( 2, 'found' );
: insert into word values( 3, 'out' );
: insert into word values( 4, 'that' );
: insert into word values( 5, 'george' );
: insert into word values( 6, 'bush' );
: insert into word values( 7, 'the' );
: insert into word values( 8, 'president' );
: insert into word values( 9, 'usa' );
: insert into word values( 10, 'again' );
: insert into word values( 11, 'this' );
: insert into word values( 12, 'some' );
: insert into word values( 13, 'text' );
: insert into word values( 14, 'repeat' );
: insert into word values( 15, 'word' );
:
: insert into article_word values( 1, 1);
: insert into article_word values( 1, 2);
: insert into article_word values( 1, 3);
: insert into article_word values( 1, 4);
: insert into article_word values( 1, 5);
: insert into article_word values( 1, 6);
: insert into article_word values( 1, 7);
: insert into article_word values( 1, 8);
: insert into article_word values( 1, 9);
: insert into article_word values( 2, 1);
: insert into article_word values( 2, 12);
: insert into article_word values( 2, 13);
: insert into article_word values( 3, 14);
: insert into article_word values( 3, 15);
: insert into article_word values( 3, 5);
:
: And now to the queries:
:
: mysql> select * from word where word like '%George%' or word like
'%Bush%';
: +----+--------+
: | id | word |
: +----+--------+
: | 5 | george |
: | 6 | bush |
: +----+--------+
: 2 rows in set (0.00 sec)
:
: mysql> select article_id,article_text from article_word,article where
: article_word.word_id in(5,6) and article.id = article_word.article_id
: group by article_id;
:
: 1 | I just found out that George W. Bush is the president of USA, again.
: 3 | I repeat the word George.
:
: 2 rows in set (0.01 sec)
:
: ---------------------
: To search only articles where all the worlds all, is a bit harder with
: one query. Easy solution that comes to my mind, but returls also perhaps
: non wanter results:
:
: select count(*) as matches, article_id,article_text from
: article_word,article where article_word.word_id in(5,6) and article.id =
: article_word.article_id group by article_id order by matches desc;
:
: This would return also articles that don't match all the words used, but
: sorts them so that articles where most words match, are returned first.
Jul 20 '05 #11

P: n/a
Wouter wrote:
Does the query speed up when I change from MyISAM to InnoDB?
(because InnoDB takes a lot more space).


I have not deployed any significant project using InnoDB yet, but my
understanding is that it is usually slower than MyISAM. But in
exchange, you gain features like referential integrity, cascading
updates/deletes, and support for transactions.

Regards,
Bill K.
Jul 20 '05 #12

This discussion thread is closed

Replies have been disabled for this discussion.