questions about tsearch2 (for czech language)

Hello,

I am trying tsearch2 in a Czech environment. It works fine, but I have two
questions.

1. I have the words "se" and "ve" in my Czech stop word list, but I still get
these words in the result. Why? Is there a problem with my configuration?

tsearch2=# select * from ts_debug('jmenuji se Pavel Stěhule a bydlím ve Skalici.');
    ts_name    | tok_type | description |  token  |  dict_name  | tsvector
---------------+----------+-------------+---------+-------------+-----------
 default_czech | lword    | Latin word  | jmenuji | {cz_ispell} | 'jmenuji'
 default_czech | lword    | Latin word  | se      | {cz_ispell} | 'se'
 default_czech | lword    | Latin word  | Pavel   | {cz_ispell} | 'pavel'
 default_czech | word     | Word        | Stěhule | {cz_ispell} |
 default_czech | lword    | Latin word  | a       | {cz_ispell} |
 default_czech | word     | Word        | bydlím  | {cz_ispell} | 'bydlet'
 default_czech | lword    | Latin word  | ve      | {cz_ispell} | 've'
 default_czech | lword    | Latin word  | Skalici | {cz_ispell} | 'skalici'
(8 rows)

tsearch2=# select * from pg_ts_cfgmap where ts_name='default_czech';
ts_name | tok_alias | dict_name
---------------+--------------+-------------
default_czech | email | {simple}
default_czech | file | {simple}
default_czech | float | {simple}
default_czech | host | {simple}
default_czech | hword | {cz_ispell}
default_czech | int | {simple}
default_czech | lhword | {cz_ispell}
default_czech | lpart_hword | {cz_ispell}
default_czech | lword | {cz_ispell}
default_czech | nlhword | {cz_ispell}
default_czech | nlpart_hword | {cz_ispell}
default_czech | nlword | {cz_ispell}
default_czech | part_hword | {simple}
default_czech | sfloat | {simple}
default_czech | uint | {simple}
default_czech | uri | {simple}
default_czech | url | {simple}
default_czech | version | {simple}
default_czech | word | {cz_ispell}
(19 rows)

2. I use a small Czech dictionary. I need words that aren't in the
dictionary (in my sample, Stěhule) not to be dropped. Can I set this
somewhere? I tried adding the simple dictionary to the cfg map, but without
success.

tsearch2=# select * from ts_debug('jmenuji se Pavel Stěhule a bydlím ve Skalici.');
    ts_name    | tok_type | description |  token  |     dict_name      | tsvector
---------------+----------+-------------+---------+--------------------+-----------
 default_czech | word     | Word        | Stěhule | {cz_ispell,simple} |
 default_czech | lword    | Latin word  | a       | {cz_ispell,simple} |
 default_czech | word     | Word        | bydlím  | {cz_ispell,simple} | 'bydlet'
Thank You
Pavel Stehule
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 12 '05 #1
On Mon, 22 Dec 2003, Pavel Stehule wrote:
> 1. I have the words "se" and "ve" in my Czech stop word list, but I still get
> these words in the result. Why? Is there a problem with my configuration?

Did you specify stop words in the dictionary configuration?

select * from pg_ts_dict;



> 2. I use a small Czech dictionary. I need words that aren't in the
> dictionary (in my sample, Stěhule) not to be dropped. Can I set this
> somewhere? I tried adding the simple dictionary to the cfg map, but without
> success.

Example, please! What do you mean by 'erase words'?



Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: ol**@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Nov 12 '05 #2
> > these words in the result. Why? Is there a problem with my configuration?
>
> Did you specify stop words in the dictionary configuration?
>
> select * from pg_ts_dict;

tsearch2=# select * from pg_ts_dict where dict_name ='cz_ispell';
-[ RECORD 1 ]---+---------------------------------------------------------
dict_name       | cz_ispell
dict_init       | 173405
dict_initoption | DictFile="/usr/lib/ispell/czech",AffFile="/usr/lib/ispell/czech.aff",StopFile="/usr/local/pgsql/share/contrib/czech.stop"
dict_lexize     | 173406
dict_comment    |

[postgres@usop root]$ cat /usr/local/pgsql/share/contrib/czech.stop|grep -e "^[sv]."
se
sem
si
svůj
ve
vám
váš
viz
vy
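
One way to check whether the stop list is actually in effect is to ask the dictionary directly rather than going through the whole configuration. The following is a minimal sketch, assuming the lexize function from the tsearch2 contrib module and the cz_ispell dictionary shown above; the sample words are taken from this thread. As far as the tsearch2 documentation describes it, a stop word should come back as an empty array and a word the dictionary does not recognize as NULL, but treat that as something to verify on your own installation.

```sql
-- Sketch only: assumes tsearch2's lexize(dict_name, text) and the
-- cz_ispell dictionary from the pg_ts_dict row above.

-- "se" is listed in czech.stop, so it should be reported as a stop word.
select lexize('cz_ispell', 'se');

-- "bydlím" is a regular word the dictionary normalized in the ts_debug output.
select lexize('cz_ispell', 'bydlím');

-- "Stěhule" is the surname that the small dictionary does not contain.
select lexize('cz_ispell', 'Stěhule');
```

If the first query still returns the word itself, the session is probably not using the StopFile setting shown in dict_initoption.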

> > 2. I use a small Czech dictionary. I need words that aren't in the
> > dictionary (in my sample, Stěhule) not to be dropped. Can I set this
> > somewhere? I tried adding the simple dictionary to the cfg map, but without
> > success.
>
> Example, please! What do you mean by 'erase words'?



If tsearch doesn't find a word in the dictionary, it drops the word from the
result. Is that true? My surname, for example, isn't in the dictionary, but I
want to keep this word in the result (tsvector).
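
The dictionary chain described here, cz_ispell first and simple as a fallback for words it does not recognize, can be sketched as an update of the configuration map. This is only an illustration: the table and column names come from the pg_ts_cfgmap output earlier in the thread, while the choice of which token types to change is an assumption.

```sql
-- Sketch only: send word-like tokens through cz_ispell first and fall back
-- to the simple dictionary for anything cz_ispell does not recognize.
-- ts_name, tok_alias and dict_name are the columns shown in pg_ts_cfgmap;
-- the list of token types below is an assumption, adjust it as needed.
update pg_ts_cfgmap
   set dict_name = '{cz_ispell,simple}'
 where ts_name = 'default_czech'
   and tok_alias in ('word', 'lword', 'nlword');

-- Confirm what the configuration now contains.
select tok_alias, dict_name
  from pg_ts_cfgmap
 where ts_name = 'default_czech'
 order by tok_alias;
```

As the rest of this thread shows, a change like this may only become visible in a new database session.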

I use

tsearch2=# select version();
                                                version
-------------------------------------------------------------------------------------------------------
 PostgreSQL 7.4RC2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3 20030715 (Red Hat Linux 3.3-14)


Nov 12 '05 #3
Pavel,

Did you restart your psql session after modifying the tsearch2 configuration?
By the way, there is a Czech dictionary available from http://lingucomponent.openoffice.org...ictionary.html
We have a utility to convert MySpell dictionaries to ispell format. It is
included in 7.5 development. A patch for 7.4 can be downloaded from
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/

Also, historically, we use the openfts mailing list for discussion of
tsearch2.

Oleg

Nov 12 '05 #4
Oleg

You are right. After restarting the postmaster everything works fine.

tsearch2=# select to_tsvector('default_czech','Jmenuji se Pavel Stěhule');
to_tsvector
------------------------------------
'pavel':3 'stěhule':4 'jmenovat':1

Thank You very much

Pavel Stehule

Nov 12 '05 #5
> You are right. After restarting the postmaster everything works fine.

One comment: you don't need to restart the postmaster. It is enough to
reconnect to PostgreSQL by exiting psql and starting it again. Every new
connection creates a new child process of the postmaster.
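
A minimal sketch of the lighter-weight alternative, assuming an interactive psql session: the \c (connect) meta-command opens a new connection, and with it a new backend process, without leaving psql. The database name tsearch2 is taken from the prompt shown in this thread.

```sql
-- Sketch only: run inside psql after changing the tsearch2 configuration.
-- \c opens a fresh connection to the named database, so the new backend
-- reads the dictionary settings again.
\c tsearch2

-- Repeat the check in the new session.
select to_tsvector('default_czech', 'Jmenuji se Pavel Stěhule');
```

This avoids interrupting other users, which a full server restart would do.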




--
Teodor Sigaev E-mail: te****@sigaev.ru

Nov 12 '05 #6


On Tue, 23 Dec 2003, Teodor Sigaev wrote:
> > You are right. After restarting the postmaster everything works fine.
>
> One comment: you don't need to restart the postmaster. It is enough to
> reconnect to PostgreSQL by exiting psql and starting it again. Every new
> connection creates a new child process of the postmaster.

True, but I like hard solutions :->
"/etc/init.d/postgresql restart" is my favorite command.

I am the only one working on this database, so I can use brute force.

Pavel


Nov 12 '05 #7
