
Questions about tsearch2 (for the Czech language)

Hello,

I am trying out tsearch2 in a Czech environment. It works fine, but I have
two questions.

1. I have the words "se" and "ve" in my Czech stop-word file, but I still get
these words in the result. Why? Is there a problem with my configuration?

tsearch2=# select * from ts_debug('jmenuji se Pavel Stěhule a bydlím ve Skalici.');
    ts_name    | tok_type | description |  token  |  dict_name  | tsvector
---------------+----------+-------------+---------+-------------+-----------
 default_czech | lword    | Latin word  | jmenuji | {cz_ispell} | 'jmenuji'
 default_czech | lword    | Latin word  | se      | {cz_ispell} | 'se'
 default_czech | lword    | Latin word  | Pavel   | {cz_ispell} | 'pavel'
 default_czech | word     | Word        | Stěhule | {cz_ispell} |
 default_czech | lword    | Latin word  | a       | {cz_ispell} |
 default_czech | word     | Word        | bydlím  | {cz_ispell} | 'bydlet'
 default_czech | lword    | Latin word  | ve      | {cz_ispell} | 've'
 default_czech | lword    | Latin word  | Skalici | {cz_ispell} | 'skalici'
(8 rows)

tsearch2=# select * from pg_ts_cfgmap where ts_name='default_czech';
ts_name | tok_alias | dict_name
---------------+--------------+-------------
default_czech | email | {simple}
default_czech | file | {simple}
default_czech | float | {simple}
default_czech | host | {simple}
default_czech | hword | {cz_ispell}
default_czech | int | {simple}
default_czech | lhword | {cz_ispell}
default_czech | lpart_hword | {cz_ispell}
default_czech | lword | {cz_ispell}
default_czech | nlhword | {cz_ispell}
default_czech | nlpart_hword | {cz_ispell}
default_czech | nlword | {cz_ispell}
default_czech | part_hword | {simple}
default_czech | sfloat | {simple}
default_czech | uint | {simple}
default_czech | uri | {simple}
default_czech | url | {simple}
default_czech | version | {simple}
default_czech | word | {cz_ispell}
(19 rows)

2. I use a small Czech dictionary. I need words that aren't in the
dictionary (Stěhule in my sample) not to be erased. Can I set this somewhere?
I tried adding the simple dictionary to the cfg map, but without success.

tsearch2=# select * from ts_debug('jmenuji se Pavel Stěhule a bydlím ve Skalici.');
    ts_name    | tok_type | description |  token  |     dict_name      | tsvector
---------------+----------+-------------+---------+--------------------+-----------
 default_czech | word     | Word        | Stěhule | {cz_ispell,simple} |
 default_czech | lword    | Latin word  | a       | {cz_ispell,simple} |
 default_czech | word     | Word        | bydlím  | {cz_ispell,simple} | 'bydlet'

Thank you,
Pavel Stehule
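
For readers following along: the post does not show the statement used to add simple to the mapping. A minimal sketch of how it might be done with the tsearch2 contrib module, assuming the mapping is edited directly in pg_ts_cfgmap and that the token types to change are the ones mapped to {cz_ispell} in the listing above (both are assumptions, not taken from the post):

```sql
-- Sketch only: let the "simple" dictionary handle word-like tokens that
-- cz_ispell does not recognize. The tok_alias list is an assumption based
-- on the rows that map to {cz_ispell} in the pg_ts_cfgmap listing.
UPDATE pg_ts_cfgmap
   SET dict_name = '{cz_ispell,simple}'
 WHERE ts_name = 'default_czech'
   AND tok_alias IN ('word', 'lword', 'nlword',
                     'hword', 'lhword', 'nlhword',
                     'lpart_hword', 'nlpart_hword');

-- Check the resulting mapping.
SELECT tok_alias, dict_name
  FROM pg_ts_cfgmap
 WHERE ts_name = 'default_czech'
 ORDER BY tok_alias;
```

Dictionaries in the array are consulted from left to right, so cz_ispell still gets the first chance at every token and simple only sees what cz_ispell does not recognize. As the replies in this thread show, the change becomes visible only in a new database session.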
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 12 '05 #1


On Mon, 22 Dec 2003, Pavel Stehule wrote:
> 1. I have the words "se" and "ve" in my Czech stop-word file, but I still
> get these words in the result. Why? Is there a problem with my
> configuration?

Did you specify the stop words in the dictionary configuration?

select * from pg_ts_dict;


> 2. I use a small Czech dictionary. I need words that aren't in the
> dictionary (Stěhule in my sample) not to be erased. Can I set this
> somewhere? I tried adding the simple dictionary to the cfg map, but
> without success.

An example, please! What do you mean by "erase words"?


Regards,
Oleg
__________________________________________________ ___________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: ol**@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Nov 12 '05 #2

> > Is there a problem with my configuration?
>
> Did you specify the stop words in the dictionary configuration?
>
> select * from pg_ts_dict;

tsearch2=# select * from pg_ts_dict where dict_name ='cz_ispell';
-[ RECORD 1 ]---+----------------------------------------------------------
dict_name       | cz_ispell
dict_init       | 173405
dict_initoption | DictFile="/usr/lib/ispell/czech",AffFile="/usr/lib/ispell/czech.aff",StopFile="/usr/local/pgsql/share/contrib/czech.stop"
dict_lexize     | 173406
dict_comment    |

[postgres@usop root]$ cat /usr/local/pgsql/share/contrib/czech.stop|grep -e "^[sv]."
se
sem
si
svůj
ve
vám
váš
viz
vy

> > 2. I use a small Czech dictionary. I need words that aren't in the
> > dictionary (Stěhule in my sample) not to be erased. Can I set this
> > somewhere? I tried adding the simple dictionary to the cfg map, but
> > without success.
>
> An example, please! What do you mean by "erase words"?

> > tsearch2=# select * from ts_debug('jmenuji se Pavel Stěhule a bydlím ve Skalici.');
> >     ts_name    | tok_type | description |  token  |     dict_name      | tsvector
> > ---------------+----------+-------------+---------+--------------------+-----------
> >  default_czech | word     | Word        | Stěhule | {cz_ispell,simple} |
> >  default_czech | lword    | Latin word  | a       | {cz_ispell,simple} |
> >  default_czech | word     | Word        | bydlím  | {cz_ispell,simple} | 'bydlet'


If tsearch2 doesn't find a word in the dictionary, it erases the word from
the result. Is that true? My surname, for example, isn't in the dictionary,
but I want to keep this word in the result (tsvector).

I use:

tsearch2=# select version();
                                                version
-------------------------------------------------------------------------------------------------------
 PostgreSQL 7.4RC2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3 20030715 (Red Hat Linux 3.3-14)
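
One way to narrow a problem like this down is to query a dictionary directly with tsearch2's lexize function, which bypasses the parser and the configuration map. The sketch below assumes the dictionary names used in this thread. As far as the tsearch2 documentation describes it, a NULL result means the dictionary does not know the word (so the next dictionary in the map is tried, or the word is dropped if there is none), while an empty array means the word was recognized as a stop word.

```sql
-- Sketch only: ask individual dictionaries what they make of a word.
SELECT lexize('cz_ispell', 'se');       -- a word listed in czech.stop
SELECT lexize('cz_ispell', 'bydlím');   -- a word the ispell dictionary knows
SELECT lexize('cz_ispell', 'stěhule');  -- a word missing from the small dictionary
SELECT lexize('simple', 'stěhule');     -- the proposed fallback dictionary
```

Comparing these results with the ts_debug output shows whether the problem lies in the dictionary itself or in how the configuration is being picked up by the session.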


Nov 12 '05 #3

Pavel,

did you restart your psql session after modifying the tsearch2 configuration?
By the way, there is a Czech dictionary available from http://lingucomponent.openoffice.org...ictionary.html
We have a utility to convert MySpell dictionaries to ispell format. It is
included in the 7.5 development tree; a patch for 7.4 can be downloaded from
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/

Also, historically, we have used the openfts mailing list for discussing
tsearch2.

Oleg

Nov 12 '05 #4

Oleg,

you are right. After restarting the postmaster, everything works fine.

tsearch2=# select to_tsvector('default_czech','Jmenuji se Pavel Stěhule');
to_tsvector
------------------------------------
'pavel':3 'stěhule':4 'jmenovat':1

Thank You very much

Pavel Stehule

Nov 12 '05 #5

> You are right. After restarting the postmaster, everything works fine.

One comment: you don't need to restart the postmaster. You only need to
reconnect to PostgreSQL by exiting and restarting psql. Every new connection
creates a new child process of the postmaster.
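
A minimal sketch of such a reconnect without leaving psql, assuming the database is named tsearch2 as the prompt in this thread suggests. The \c meta-command opens a new connection, which gives a fresh backend process that should pick up the modified tsearch2 configuration:

```sql
-- psql meta-command: reconnect to the same database (starts a new backend)
\c tsearch2

-- Re-run the check in the fresh session
SELECT to_tsvector('default_czech', 'Jmenuji se Pavel Stěhule');
```

This has the same effect on the session as quitting and restarting psql, and it leaves other users' connections untouched, which a postmaster restart does not.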


--
Teodor Sigaev E-mail: te****@sigaev.ru

Nov 12 '05 #6

On Tue, 23 Dec 2003, Teodor Sigaev wrote:

> > You are right. After restarting the postmaster, everything works fine.
>
> One comment: you don't need to restart the postmaster. You only need to
> reconnect to PostgreSQL by exiting and restarting psql. Every new
> connection creates a new child process of the postmaster.

True, but I like hard solutions :->
"/etc/init.d/postgresql restart" is my favorite command.

I am the only one working on this database, so I can use force.

Pavel

Nov 12 '05 #7
