TSearch2: Problems with compound words and stop words

Hi there,

I have some trouble with my TSearch2 installation. I installed it as
described in

http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_compound_words

I used the German myspell dictionary from
http://lingucomponent.openoffice.org/spell_dic.html and converted it with
my2ispell.

Nearly everything is working fine so far, except for two problems:

1.) The stop-word file seems to be ignored. If I run

SELECT to_tsvector('default_german', 'ein Haus')

I get

'ein':1 'haus':2

"ein" should be a stop word for German (and it is defined in the german.stop
file as well).

2.) The compound-words feature doesn't work either. I have tried a lot of
words, e.g. "Fehlermeldung". With

SELECT to_tsvector('default_german', 'Fehlermeldung')

I only get

'fehlermeldung':1

but I would expect "fehler" and "meldung" as separate entries. Is there
anything wrong with the dictionary or my configuration?

My current configuration:

pg_ts_cfg:

default default C
default_russian default ru_RU.KOI8-R
simple default NULL
default_german default de_DE.ISO8859-1

pg_ts_cfgmap:

default_german host {simple}
default_german hword {simple}
default_german int {simple}
default_german nlhword {simple}
default_german nlpart_hword {simple}
default_german nlword {simple}
default_german part_hword {simple}
default_german sfloat {simple}
default_german uint {simple}
default_german uri {simple}
default_german url {simple}
default_german version {simple}
default_german word {simple}
default_german lpart_hword {de_ispell,german_snowball}
default_german lword {de_ispell,german_snowball}
default_german lhword {de_ispell,german_snowball}
pg_ts_dict:

de_ispell | 17166 |
DictFile="/usr/local/pgsql/share/contrib/dictonary/german.dict",
AffFile="/usr/local/pgsql/share/contrib/dictonary/german.aff",
StopFile="/usr/local/pgsql/share/contrib/dictonary/german.stop" | 17167 | NULL
german_snowball | 17357 | NULL | 17162 | Snowball stemmer for german
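
A note for readers following the configuration above: each token type is mapped to a list of dictionaries that are tried in order. A dictionary answers NULL for a word it does not know (the next dictionary is then asked) or an empty array for a stop word (the token is dropped). The following is a minimal Python sketch of that lookup order, not tsearch2 code; the word lists and function names are hypothetical. It shows one way "ein" could survive, under the assumption that de_ispell does not recognise the word and that german_snowball has no stop file (its init option is NULL in the listing above).

```python
from typing import Callable, List, Optional

# A dictionary maps a token to: None (unknown word), [] (stop word), or lexemes.
Dictionary = Callable[[str], Optional[List[str]]]


def make_ispell(known: dict, stop_words: set) -> Dictionary:
    """Hypothetical stand-in for an ispell dictionary with a stop file."""
    def lexize(token: str) -> Optional[List[str]]:
        lexemes = known.get(token.lower())
        if lexemes is None:
            return None  # unknown word: let the next dictionary try
        # Stop words are filtered out of the normalized forms.
        return [lx for lx in lexemes if lx not in stop_words]
    return lexize


def make_stemmer(stop_words: set) -> Dictionary:
    """Hypothetical stand-in for a stemmer that accepts every word."""
    def lexize(token: str) -> Optional[List[str]]:
        word = token.lower()
        return [] if word in stop_words else [word]
    return lexize


def lexize_chain(token: str, chain: List[Dictionary]) -> List[str]:
    """Ask each dictionary in order; the first non-None answer wins."""
    for dictionary in chain:
        result = dictionary(token)
        if result is not None:
            return result
    return []  # no dictionary recognised the token


# Assumption: the ispell word list does not contain "ein".
ispell = make_ispell(known={"haus": ["haus"]}, stop_words={"ein"})
print(lexize_chain("ein", [ispell, make_stemmer(stop_words=set())]))    # ['ein']
print(lexize_chain("ein", [ispell, make_stemmer(stop_words={"ein"})]))  # []
```

If this assumption holds for the real dictionary, giving the second dictionary in the list its own stop file would be the thing to try; the sketch does not prove that this is the cause here.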


Can anyone help me?

regards

Timo
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 23 '05 #1
Timo,

I have forwarded your message to the openfts mailing list.
Also, could you specify whether the locale settings are correct for your
database, and which dictionary you downloaded?

Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: ol**@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Nov 23 '05 #2
Timo,

please check that you applied the patch for compound word support.
Which version of PostgreSQL are you using?
Does the ispell dictionary work for non-compound words?

Oleg


Nov 23 '05 #3
Oleg,

I use TSearch2 with PostgreSQL 7.4.6 and I applied the compound-word
patch yesterday. The configuration changed a little, but the result
is the same: I get no compound words. I'm using the locale de_DE with
encoding ISO8859-1 for the database.

I think ispell is working correctly except for the compound words. If I try

SELECT lexize('de_ispell', 'springt')

I get

lexize
{springen,springen}

which seems correct.
But

SELECT lexize('de_ispell', 'Autobahn')

results in

lexize
{autobahn}

I would expect {auto,bahn,autobahn}.
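
To make the expected result concrete, here is a small Python sketch of compound splitting against a word list. It is only an illustration of the idea under stated assumptions: the stem set is hypothetical, the function names are made up for this example, and the exhaustive split shown here is not claimed to be the algorithm tsearch2 uses.

```python
from typing import List, Set


def split_compound(word: str, stems: Set[str]) -> List[List[str]]:
    """Return every way to write `word` as a sequence of known stems."""
    word = word.lower()
    if not word:
        return [[]]
    splits = []
    for end in range(1, len(word) + 1):
        head = word[:end]
        if head in stems:
            # Recurse on the remainder and prepend the matched stem.
            for tail in split_compound(word[end:], stems):
                splits.append([head] + tail)
    return splits


def lexize_compound(word: str, stems: Set[str]) -> List[str]:
    """Collect the distinct parts of all splits, keeping first-seen order."""
    parts: List[str] = []
    for split in split_compound(word, stems):
        for part in split:
            if part not in parts:
                parts.append(part)
    return parts


# Hypothetical stem list, for illustration only.
stems = {"auto", "bahn", "autobahn", "fehler", "meldung"}
print(lexize_compound("Autobahn", stems))       # ['auto', 'bahn', 'autobahn']
print(lexize_compound("Fehlermeldung", stems))  # ['fehler', 'meldung']
```

In this toy version every stem may take part in a compound; a real ispell dictionary restricts that to words carrying a compound flag.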

The new configuration after the compound word patch:

dict_name | dict_init | dict_initoption | dict_lexize | dict_comment

simple | dex_init(text) | NULL | dex_lexize(internal,internal,integer) | Simple example of dictionary.
en_stem | snb_en_init(text) | /usr/local/pgsql/share/contrib/english.stop | snb_lexize(internal,internal,integer) | English Stemmer. Snowball.
ru_stem | snb_ru_init(text) | /usr/local/pgsql/share/contrib/russian.stop | snb_lexize(internal,internal,integer) | Russian Stemmer. Snowball.
ispell_template | spell_init(text) | NULL | spell_lexize(internal,internal,integer) | ISpell interface. Must have .dict and .aff files
synonym | syn_init(text) | NULL | syn_lexize(internal,internal,integer) | Example of synonym dictionary
de_ispell | spell_init(text) |
DictFile="/usr/local/pgsql/share/contrib/dictonary/german_comb.dict",
AffFile="/usr/local/pgsql/share/contrib/dictonary/german_comb.aff",
StopFile="/usr/local/pgsql/share/contrib/dictonary/german.stop" | spell_lexize(internal,internal,integer) | NULL

Timo

Nov 23 '05 #4
On Fri, 5 Nov 2004, Timo Haberkern wrote:

> But a SELECT lexize('de_ispell', 'Autobahn')
> results in
> {autobahn}
> i would expect {auto,bahn, autobahn}

Hmm, have you checked whether 'Autobahn' is in the ispell dictionary? Does the
dictionary you use support the 'Z' flag for compound words?

> The new configuration after the compound word patch:

It seems you overestimate my capabilities :)

Regards,
Oleg

Nov 23 '05 #5
Sorry for the late answer, I was on holiday.

See my remarks below.

Oleg Bartunov wrote:

> Hmm, have you checked whether 'Autobahn' is in the ispell dictionary? Does
> the dictionary you use support the 'Z' flag for compound words?


Autobahn is in the ispell dictionary. What does an ispell dictionary
need in order to support the Z flag?
Timo
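
One way to check this on the dictionary files themselves is sketched below in Python. It rests on assumptions that should be verified against the compound-words page linked in the first post: that the affix file declares the compound flag with a line of the form "compoundwords controlled <flag>" (the form described in PostgreSQL's ispell dictionary documentation), and that entries in the .dict file look like word/FLAGS. The sample file contents and function names are hypothetical.

```python
import re
from typing import List, Optional


def compound_flag(aff_text: str) -> Optional[str]:
    """Return the flag declared by 'compoundwords controlled <flag>', if any."""
    match = re.search(r"^\s*compoundwords\s+controlled\s+(\S)", aff_text,
                      re.IGNORECASE | re.MULTILINE)
    return match.group(1) if match else None


def words_with_flag(dict_text: str, flag: str) -> List[str]:
    """List dictionary words whose flag string (after '/') contains `flag`."""
    words = []
    for line in dict_text.splitlines():
        word, _, flags = line.strip().partition("/")
        if word and flag in flags:
            words.append(word)
    return words


# Hypothetical file contents, for illustration only.
aff_sample = "compoundwords controlled z\nflag *A:\n"
dict_sample = "Auto/Sz\nBahn/Nz\nAutobahn/N\nspringen/I\n"

flag = compound_flag(aff_sample)
print(flag)                                # z
print(words_with_flag(dict_sample, flag))  # ['Auto', 'Bahn']
```

For the real files, the text would be read from german_comb.aff and german_comb.dict with the ISO8859-1 encoding mentioned above. If no flag is declared, or no words carry it, the dictionary has nothing it is allowed to build compounds from.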




Delete
<http://www.rotex-service.com/phppgadmin/display.php?action=confdelrow&strings=expanded&pag e=1&key%5Bdict_name%5D=synonym&database=selina_rot ex&schema=public&table=pg_ts_dict&return_url=tblpr operties.php%3Fdatabase%3Dselina_rotex%26amp%3Bsch ema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back &sortkey=&sortdir=>
synonym syn_init(text) /NULL/
syn_lexize(internal,internal,integer) Example of synonym dictionary
Edit
<http://www.rotex-service.com/phppgadmin/display.php?action=confeditrow&strings=expanded&pa ge=1&key%5Bdict_name%5D=de_ispell&database=selina_ rotex&schema=public&table=pg_ts_dict&return_url=tb lproperties.php%3Fdatabase%3Dselina_rotex%26amp%3B schema%3Dpublic%26table%3Dpg_ts_dict&return_desc=B ack&sortkey=&sortdir=>
Delete
<http://www.rotex-service.com/phppgadmin/display.php?action=confdelrow&strings=expanded&pag e=1&key%5Bdict_name%5D=de_ispell&database=selina_r otex&schema=public&table=pg_ts_dict&return_url=tbl properties.php%3Fdatabase%3Dselina_rotex%26amp%3Bs chema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Ba ck&sortkey=&sortdir=>
de_ispell spell_init(text)
DictFile="/usr/local/pgsql/share/contrib/dictonary/german_comb.dict",
AffFile="/usr/local/pgsql/share/contrib/dictonary/german_comb.aff",
StopFile="/usr/local/pgsql/share/contrib/dictonary/german.stop"
spell_lexize(internal,internal,integer) /NULL/

Timo
Oleg Bartunov wrote:
Timo,

please check that you applied the patch for compound word support.
Which version of PostgreSQL are you using?
Does the ispell dictionary work for non-compound words?

Oleg

Regards,
Oleg
__________________________________________________ ___________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: ol**@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Nov 23 '05 #6
On Wed, 17 Nov 2004, Timo Haberkern wrote:
sorry for the late answer, I was on holiday,

see my remarks below
Oleg Bartunov wrote:
On Fri, 5 Nov 2004, Timo Haberkern wrote:
Oleg,

I use TSearch2 with PostgreSQL 7.4.6 and I applied the compound word patch
yesterday. The configuration changed a little bit but the result is the
same: I get no compound words. I'm using the locale de_DE with encoding
ISO8859-1 for the database.

I think ispell is working correctly except for the compound words. If I try

SELECT lexize('de_ispell', 'springt')

I get

lexize
{springen,springen}

which seems correct.
But SELECT lexize('de_ispell', 'Autobahn')

results in

lexize
{autobahn}

I would expect {auto,bahn,autobahn}

Hmm, have you checked 'Autobahn' in the ispell dictionary? Does the dictionary you
used support the 'Z' flag for compound words?


Autobahn is in the ispell dictionary. What does an ispell dictionary need in order to
support the Z flag?


Try ispell -C Autobahn
and search for 'compound' in 'man ispell' for details.
There is a tsearch2 problem only if ispell *does* split the word correctly while tsearch2
doesn't. Otherwise you should find a correct ispell dictionary for German or create it
yourself. You may consult monzilla.net
http://staff.science.uva.nl/~christo...roject-dr.html
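
To make the 'Z' flag question concrete, here is a minimal Python sketch of flag-controlled compound splitting. It is not the algorithm used by tsearch2 or ispell; `split_compound`, the `min_part` parameter, and both toy lexicons are hypothetical names made up for illustration. The only point is that a word gets decomposed when its parts are present in the dictionary and are also flagged as allowed in compounds.

```python
def split_compound(word, lexicon, min_part=3):
    """Return every way to write `word` as two or more flagged lexicon entries.

    `lexicon` maps a lowercase word to True when it may take part in a
    compound (a stand-in for a compound flag in an ispell dictionary).
    """
    word = word.lower()
    results = []

    def walk(start, parts):
        # Reached the end of the word: keep it only if it is a real split.
        if start == len(word):
            if len(parts) > 1:
                results.append(list(parts))
            return
        for end in range(start + min_part, len(word) + 1):
            piece = word[start:end]
            # Only entries flagged as compoundable may act as parts.
            if lexicon.get(piece, False):
                parts.append(piece)
                walk(end, parts)
                parts.pop()

    walk(0, [])
    return results


# Hypothetical toy lexicons: same words, with and without compound flags.
FLAGGED = {"auto": True, "bahn": True, "autobahn": False}
UNFLAGGED = {"auto": False, "bahn": False, "autobahn": False}

print(split_compound("Autobahn", FLAGGED))    # [['auto', 'bahn']]
print(split_compound("Autobahn", UNFLAGGED))  # []
```

With the flags cleared, the same word comes back unsplit. In this toy model that is the behaviour you would see from a dictionary that lists 'Autobahn' but carries no compound information.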


Timo




The new configuration after the compound word patch:


Seems you overestimate my capabilities :)


Regards,
Oleg
Nov 23 '05 #7
Timo,

take a look at the .aff file and search for 'compoundwords'.
The German ispell file I got from http://j3e.de/ispell/igerman98/ has no
support for compound words: 'compoundwords off'

Norwegian, for example, has:

compoundwords controlled z
compoundmin 4

Oleg
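
One way to run this check is a small Python sketch that scans an .aff file for the two directives quoted above. The function names, the latin-1 default encoding, and the temporary sample file are assumptions for illustration. Treating a missing `compoundwords` line as "off" is also an assumption here, so verify it against your ispell documentation.

```python
import os
import tempfile


def compound_settings(aff_path, encoding="latin-1"):
    """Collect the compound-related directives found in an ispell .aff file."""
    settings = {}
    with open(aff_path, encoding=encoding) as handle:
        for raw in handle:
            line = raw.split("#", 1)[0].strip()  # drop trailing comments
            fields = line.split()
            if fields and fields[0].lower() in ("compoundwords", "compoundmin"):
                settings[fields[0].lower()] = fields[1:]
    return settings


def supports_compounds(settings):
    """True unless 'compoundwords' is missing, empty, or set to 'off'."""
    mode = settings.get("compoundwords", ["off"])  # assumption: absent means off
    return bool(mode) and mode[0].lower() != "off"


# Sample content taken from the Norwegian lines quoted in the post.
SAMPLE = "compoundwords controlled z\ncompoundmin 4\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.aff")
    with open(path, "w", encoding="latin-1") as out:
        out.write(SAMPLE)
    found = compound_settings(path)

print(found)
print(supports_compounds(found))
```

Pointing `compound_settings` at the .aff file named in AffFile should show whether the converted German dictionary declares compound support at all.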
Nov 23 '05 #8
