RSS Search

Hi! I'm looking for ideas on the best approach to designing a search
system for RSS feeds. I will have some 50 RSS feeds (all RSS 2.0
compliant) stored locally on the web server, and I'm wondering what
would be the best way to make these RSS files searchable. Since the
search will serve multiple users, it has to be robust and efficient.
Some ideas that I have for the RSS search system are:

1. Store all RSS files locally on the web server file system and
perform file system queries. But I guess this might get slow when a
number of users try to search. Moreover, the queries may not be
extensible (for example to allow boolean operations etc).

2. Move the RSS data into a database and then search it using LIKE
(or the more advanced indexing service features).

3. Use a 3rd party full-text search engine like Lucene.

4. Use something like XQuery or XPath to query the RSS files directly
but this again *might* (not sure since I haven't worked with either)
get slow when a number of users try to search.

Also, the RSS files I have on the web server will be updated every
hour or so.

So, I have the ideas, but I'm not sure which one would be the most
suitable and efficient. If anyone has ideas on implementing such a
search system for RSS feeds, please share your insight. Thank you
guys!
Nov 18 '05 #1
You might want to shred the XML docs/RSS feeds, store them in a
relational database, full-text index (FTI) the columns of interest, and
query them there.
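
As a rough illustration of the shredding step, here is a minimal Python sketch. It makes several assumptions that are not from this thread: SQLite stands in for SQL Server, a plain LIKE query stands in for a full-text index, and the table name `items`, the sample feed, and the helper names `shred_feed` and `search` are all hypothetical. Rows are keyed on the item guid so that an hourly re-import overwrites existing items rather than duplicating them.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real import would read each stored RSS file.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <guid>item-1</guid>
      <title>Full-text search basics</title>
      <description>Indexing columns for fast lookups.</description>
    </item>
    <item>
      <guid>item-2</guid>
      <title>Replication notes</title>
      <description>Notes on log shipping.</description>
    </item>
  </channel>
</rss>"""


def shred_feed(conn, feed_xml):
    """Split one RSS 2.0 document into rows, one row per <item>."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        row = (
            item.findtext("guid", default=""),
            item.findtext("title", default=""),
            item.findtext("description", default=""),
        )
        # Keyed on guid: re-importing the same item replaces the old row.
        conn.execute(
            "INSERT OR REPLACE INTO items (guid, title, description) "
            "VALUES (?, ?, ?)",
            row,
        )
    conn.commit()


def search(conn, term):
    """Return guids of items whose title or description contains term."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT guid FROM items "
        "WHERE title LIKE ? OR description LIKE ? ORDER BY guid",
        (pattern, pattern),
    )
    return [guid for (guid,) in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (guid TEXT PRIMARY KEY, title TEXT, description TEXT)"
)
shred_feed(conn, SAMPLE_FEED)
shred_feed(conn, SAMPLE_FEED)  # simulate a second hourly import
print(search(conn, "search"))
```

On SQL Server the LIKE query would be replaced by a full-text query over the indexed columns; the shredding logic stays the same.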

I would advise against storing them in the file system or storing them in
XML format in image-type columns. Although Indexing Services and SQL FTS
do support querying XML/RSS feeds using the XML iFilter, you can't index
properties using SQL Server FTS, and Indexing Services support isn't much
better.

You could index the XML/RSS as text, but there are some problems indexing the
XML tags.
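
One way to picture the tag problem: if the raw XML is indexed as text, words such as "title" or "description" match every item, because they appear as tag names. The Python sketch below is an illustration under assumptions, not a description of how SQL Server FTS or Indexing Services work. It indexes only element text, builds a small in-memory inverted index, and answers boolean AND queries. The sample feed and the names `item_text`, `build_index`, and `search_all` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample feed used only for this sketch.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <item>
      <guid>a</guid>
      <title>Title of the first item</title>
      <description>SQL Server full-text search</description>
    </item>
    <item>
      <guid>b</guid>
      <title>Lucene search engine</title>
      <description>Java library</description>
    </item>
  </channel>
</rss>"""


def item_text(item):
    """Return the text of title and description only, never tag names."""
    parts = [
        item.findtext("title", default=""),
        item.findtext("description", default=""),
    ]
    return " ".join(parts)


def build_index(feed_xml):
    """Map each lowercase word to the set of item guids containing it."""
    index = defaultdict(set)
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        guid = item.findtext("guid", default="")
        for word in re.findall(r"[a-z0-9]+", item_text(item).lower()):
            index[word].add(guid)
    return index


def search_all(index, *words):
    """Boolean AND: return sorted guids of items containing every word."""
    sets = [index.get(word.lower(), set()) for word in words]
    return sorted(set.intersection(*sets)) if sets else []


index = build_index(SAMPLE_FEED)
print(search_all(index, "search"))
print(search_all(index, "search", "lucene"))
print(search_all(index, "description"))
```

Because tag names never reach the index, a query for "description" matches nothing here, while "title" matches only the item whose text actually contains that word.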

XQuery FTS support will be supplied with SQL 2005, which will RTM next year.

Please refer to:

http://msdn.microsoft.com/xml/defaul.../sql2k5xml.asp

for more info.

--
Hilary Cotter

Nov 18 '05 #2
Han
You may want an XQuery-friendly database like SQL 2005, which has an
xml-typed column. E.g.:

create table table1 (i int, x xml)

Now you can run XQuery directly against the xml field, like:

select i, x from table1 where x.exist('xquery') = 1
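
For comparison, a similar per-item filter can be sketched outside the database with Python's standard library. This is only an analogy to the query above, under stated assumptions: ElementTree implements a limited XPath subset, not XQuery, and the sample feed and the helper names `items_in_category` and `items_with_title_term` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed used only for this sketch.
FEED = """<rss version="2.0">
  <channel>
    <item>
      <title>Indexing XML columns</title>
      <category>sql</category>
    </item>
    <item>
      <title>Parsing RSS feeds</title>
      <category>xml</category>
    </item>
  </channel>
</rss>"""


def items_in_category(feed_xml, category):
    """Exact match via an XPath predicate on the <category> child.

    The category is interpolated into the path, so this sketch assumes
    it contains no single quote.
    """
    root = ET.fromstring(feed_xml)
    path = f"./channel/item[category='{category}']"
    return [item.findtext("title", default="") for item in root.findall(path)]


def items_with_title_term(feed_xml, term):
    """Substring match; ElementTree's XPath has no contains(), so filter in Python."""
    root = ET.fromstring(feed_xml)
    term = term.lower()
    return [
        title
        for title in (
            item.findtext("title", default="")
            for item in root.findall("./channel/item")
        )
        if term in title.lower()
    ]


print(items_in_category(FEED, "sql"))
print(items_with_title_term(FEED, "rss"))
```

Every call re-parses the feed, which is the cost the original question worries about when many users search at once.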

A free download is at:

http://lab.msdn.microsoft.com/express/sql/

Nov 18 '05 #3
RiceGuy,
In addition to what Hilary has recommended, the web article
"Creating SQL Based RSS Feed ..." at http://www.sswug.org/see/18299 defines
a sample table and data along with a stored proc "GenerateRssFeed", and then
runs the following SQL script to generate the RSS feed:

EXEC sp_makewebtask @outputfile = 'C:\Rss.xml',   -- Point 1
    @query = 'Exec GenerateRssFeed',              -- Put the SP name here
    @templatefile = 'C:\RssFeedTemplate.xml'      -- Point 2

The article is good and explains how an RSS feed can be generated
directly from SQL Server 2000. It can also be found at
http://www.dotnetforce.com/(0eqeob45...aspx?t=a&n=204
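
For readers without SQL Server 2000, the same idea (turning query rows into an RSS 2.0 document) can be sketched in Python. This is not the article's stored procedure or template; the function name `build_rss`, the row shape (title, link), and the sample values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def build_rss(channel_title, channel_link, rows):
    """Build an RSS 2.0 document from (title, link) rows and return it as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    # RSS 2.0 requires a channel description; reuse the title in this sketch.
    ET.SubElement(channel, "description").text = channel_title
    for title, link in rows:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
    # ElementTree escapes characters such as & in element text.
    return ET.tostring(rss, encoding="unicode")


# Hypothetical rows, standing in for the result of a database query.
rows = [
    ("A & B release notes", "http://example.com/1"),
    ("Second post", "http://example.com/2"),
]
print(build_rss("Example feed", "http://example.com/", rows))
```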

If you're interested in the XQuery support (but not the FTS component), you
might want to review the newly released beta version of SQL Server 2005
Express, the MSDE 2000 replacement, at:
http://lab.msdn.microsoft.com/express/sql/default.aspx

Regards,
John


Nov 18 '05 #4
