Searching Content

sql
I am developing a Content Management System in ASP.NET with a SQL
Server database. The main content is stored as HTML (I am still not
sure whether this is the way to go!) so that it is displayed on the
page with the formatting the user specified when adding the content
to a page. I am working on a search facility, which searches the main
content for the search word/s entered. The problem is that the
content contains HTML. It is possible that the HTML markup itself
could contain the word/s searched for by the user. I have 2
possibilities here, neither of which I like:

1. Store the HTML and a plain text version of the content, and search
the plain text version
2. Strip the HTML in the VB.NET code, then search the resulting string
again to see if the word/s are still present
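
Both possibilities rest on the same step: reducing the stored HTML to its visible text. The following is a minimal sketch in Python rather than VB.NET, assuming an in-memory SQLite table as a stand-in for the SQL Server table. The names `pages`, `plain`, `strip_html`, `make_db`, `save_page` and `search` are hypothetical and not taken from the actual CMS.

```python
import sqlite3
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes only, so tag names and attributes are dropped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(html):
    """Return the text of an HTML fragment with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps "one</p><p>two" as two words; the trade-off
    # is that a word split by inline tags (e.g. "<b>bo</b>ld") is also split.
    return " ".join(" ".join(parser.parts).split())


def make_db():
    """Hypothetical stand-in for the real content table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages (id INTEGER PRIMARY KEY, html TEXT, plain TEXT)"
    )
    return conn


def save_page(conn, page_id, html):
    """Possibility 1: store the HTML and a plain text copy side by side."""
    conn.execute(
        "INSERT INTO pages (id, html, plain) VALUES (?, ?, ?)",
        (page_id, html, strip_html(html)),
    )


def search(conn, word):
    """Search only the plain text column, so markup can never match."""
    # Note: '%' and '_' in the search word are not escaped in this sketch.
    rows = conn.execute(
        "SELECT id FROM pages WHERE plain LIKE ? ORDER BY id",
        (f"%{word}%",),
    )
    return [row[0] for row in rows]
```

Stripping once at save time (possibility 1) costs extra storage but keeps the search query simple; calling `strip_html` on each candidate row at search time (possibility 2) avoids the second column but repeats the work on every search. The sketch also treats the contents of script and style elements as text, which may or may not matter for user-entered content.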

Anyone got any ideas on how this should be done?

Oct 10 '06 #1
Another option:

1. Turn on the Indexing Service
2. Don't keep the content in the SQL database; keep it in the file system as raw HTML files
3. Create a catalog for the folder
4. Create a linked server in SQL Server mapped to the Indexing Service to get a view
5. Apply full-text search on the view!

Want the code? ... eh! :)
--
Happy Hacking,
Gaurav Vaish | www.mastergaurav.com
www.edujinionline.com
http://eduzine.edujinionline.com
-----------------------------------------
Oct 10 '06 #2
sql
Gaurav,

Thanks for the reply. I just want to clarify what I have done here. The
website has just 1 page, default.aspx, which is totally data driven.
The menus and the content are built in the page load event by passing a
page id to the database. It is the main content which is stored as HTML
so that when it is displayed it has the required formatting
(e.g. bold text, italics, bullet points etc.)

So basically none of the pages in the website actually exist as HTML
pages on the file server on the IIS box. Hope this makes sense.

Thanks.

Oct 11 '06 #3
I am assuming that none of the dynamic HTML content contains form elements
that may submit the main form (the [runat='server'] form) of the page, if
any, because that may break ASP.NET processing.

Also, I am assuming that what you want to do is grab the HTML from the
appropriate row for which the search matched and display the contents.
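
The first assumption (no form elements in the stored content) could be checked when content is saved. This is only a sketch in Python, not ASP.NET code; the tag list in `FORM_TAGS` and the function name `find_form_elements` are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Assumed list of elements worth flagging; adjust to what the page allows.
FORM_TAGS = {"form", "input", "button", "select", "textarea"}


class FormElementFinder(HTMLParser):
    """Records every form-related start tag found in a fragment."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        # Self-closing tags such as <input/> also reach it by default.
        if tag in FORM_TAGS:
            self.found.append(tag)


def find_form_elements(html):
    """Return the form-related tag names in document order."""
    finder = FormElementFinder()
    finder.feed(html)
    finder.close()
    return finder.found
```

An empty list means the fragment should be safe to place inside the page's main form under this assumption; a non-empty list could be used to reject or flag the content.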

Since you are not looking for a 'WHERE' or 'LIKE' match but a free-text
match, I would suggest one of these options:

1. Microsoft Indexing Service (MIS) to index your files. Instead of putting the
HTML content in the db, put it in files and then let MIS do the job. The
results are fairly good. We have been using it for our internal purposes...
basically to test our KM product.

2. Buy Google MiniSearch. Let it index the documents. You query it using
APIs (Web Service enabled). You trust Google? I do... at least for search.
Do look at the cost figures... MiniSearch can index up to around 1 million
documents of any type (you'd specifically be interested in HTML content
only) but costs around $x,000 (don't recall if x = 2 or x = 5 :D).

If you are looking to scale up your operations and have control over the hosting
environment, my personal recommendation would be Google MiniSearch, since
MIS tends to get much slower as the size of the repository grows (size >=
100-200k documents; well, we have a mix of text, HTML, Office [doc, ppt,
xls] etc. documents).
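
To show how a free-text match differs from a 'LIKE' scan, here is a toy inverted index in Python. It is only an illustration of the general idea behind an indexing engine, not how Microsoft Indexing Service or the Google product works internally; `tokenize`, `build_index` and `search_all` are made-up names, and the input is assumed to be plain text with the HTML already removed.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to the set of document ids that contain it.

    docs is a mapping of document id -> plain text.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search_all(index, query):
    """Return the ids of documents containing every word in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result
```

The index is built once and each query becomes a few set lookups instead of a scan over every row, which is the main reason a dedicated indexing engine handles a large repository better than 'LIKE'.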
Hope that helps!
--
Happy Hacking,
Gaurav Vaish | www.mastergaurav.com
www.edujinionline.com
http://eduzine.edujinionline.com
-----------------------------------------
Oct 13 '06 #4
