473,396 Members | 1,990 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

tsearch: how to get a list of stopwords?

Hi there,

me again. How do I find the stopwords that tsearch uses in its standard
configuration? I've looked at contrib/tsearch/dict/porter_english.dct and get
a feeling it's somewhere in there but I can't decipher it. Any suggestions?

Joerg
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 11 '05 #1
3 4118
On Thu, 28 Aug 2003, Joerg Erdmenger wrote:
Hi there,

me again. How do I find the stopwords that tsearch uses in its standard
configuration? I've looked at contrib/tsearch/dict/porter_english.dct and get
a feeling it's somewhere in there but I can't decipher it. Any suggestions?

You're right. They're encoded in engstoptree :)
I suggest you not bother with old tsearch and look to tsearch2 version
which is much improved both in performance and flexibility.
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/

Oleg

Joerg
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings


Regards,
Oleg
__________________________________________________ ___________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: ol**@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 11 '05 #2
hi
me again. How do I find the stopwords that tsearch uses in its standard
configuration? I've looked at contrib/tsearch/dict/porter_english.dct and
get a feeling it's somewhere in there but I can't decipher it. Any
suggestions?


You're right. They're encoded in engstoptree :)
I suggest you not bother with old tsearch and look to tsearch2 version
which is much improved both in performance and flexibility.
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/

well, I would like but I've got to get it to work on a production server; I
will try to get the admins to install it but I guess it will take some time -
meanwhile - is there anyway to get to the list of stopwords so that I can
build a filter for those as a temporary workaround?

thanks

Joerg
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 11 '05 #3
On Thu, 28 Aug 2003, Joerg Erdmenger wrote:
hi
me again. How do I find the stopwords that tsearch uses in its standard
configuration? I've looked at contrib/tsearch/dict/porter_english.dct and
get a feeling it's somewhere in there but I can't decipher it. Any
suggestions?
You're right. They're encoded in engstoptree :)
I suggest you not bother with old tsearch and look to tsearch2 version
which is much improved both in performance and flexibility.
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/

well, I would like but I've got to get it to work on a production server; I
will try to get the admins to install it but I guess it will take some time -
meanwhile - is there anyway to get to the list of stopwords so that I can
build a filter for those as a temporary workaround?


tsearch2 could live with tsearch, so you may play with it.
I attached english.stop file from OpenFTS distribution. But I'm not 100% sure
it's the same as in portereng.c :)

thanks

Joerg


Regards,
Oleg
__________________________________________________ ___________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: ol**@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 11 '05 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
by: Joerg Erdmenger | last post by:
Hi there, I have an issue with tsearch. I'm using tsearch as a search mechanism on a website making various queries created from the words that a user has put in. So I query for all the words...
13
by: Nigel J. Andrews | last post by:
This will be a little vague, it was last night and I can't now do the test in that db (see below) so can't give the exact wording. I seem to remember a report a little while ago about tsearch v2...
1
by: sector119 | last post by:
Hi Is there some one who was able to create ukrainian or russian-urainian stemmer dict for tsearch v2? -- WBR, sector119 ---------------------------(end of...
4
by: Bas Scheffers | last post by:
Hi, Is there a way to use tsearch so that it returns documents that have less than all the required keywords? The idea is that if a document only has 3 out of 4 terms, it is still returned, but...
2
by: Rajesh Kumar Mallah | last post by:
Hi, I think when search terms have "."s in them they become case sensitive in tsearch searches. How can we make them insensitive? Regds Mallah.
3
by: Marcel Boscher | last post by:
For now i am almost statisfied with my tsearch2 installation war over night somehow it seems to work, finally... 3 probably easy questions remain... 1.) Is it possible to index already filled...
0
by: Justin Kennedy | last post by:
The short question is why does this: select to_tsvector('default', coalesce(name, '') ||' '|| coalesce(description, '') ||' '|| coalesce(keywords,'')) from link_items; give different results...
1
by: linbl352 | last post by:
Does anybody can tell me the following answers? In the description of tsearch function, what is purpose of the callback function( compar) as follows: void *tsearch(const void *key, void **rootp, ...
0
by: almurph | last post by:
Hi, I'm new to mySQL so I hope that you can help me. How do you remove all the default stopwords in mySQL 5.0 so that the full-text indexing does not block any words? Thanks for any...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.