Web page content in database + formatting?

Hi all,

Ok - this leads on from speaking to a couple here and in the SQL server
group...

I have an application which lets the user type their text into a form. They
add 'happy' tags around their words; the app then replaces these with the
HTML equivalent and saves the result to the database...

Thus far this has been working very well.

I've been asked to add search functionality to the site now, and whilst I've
already made a good start on this, one slight fly in my coding oink-ment
(:@\) is the fact that when I search through it I have things like :

<b>hello world</b>

My initial search syntax might end something like this:

where PageContent Like '% hello %'

This would run off and try to find all the instances of 'hello' where it's a
word in its own right, but as I've seen now, in the example above it
wouldn't find the word, because the opening <b> tag sits directly in front
of it and there is no leading space to match.
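
To make the miss concrete, here is a minimal sketch in Python, using an in-memory SQLite database as a stand-in for the real SQL Server table. The table name `Pages` and the sample rows are assumptions for illustration; only `PageContent` and the LIKE pattern come from the post.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pages (PageId INTEGER PRIMARY KEY, PageContent TEXT)")
conn.executemany(
    "INSERT INTO Pages (PageId, PageContent) VALUES (?, ?)",
    [
        (1, "say hello to everyone"),  # plain text, word surrounded by spaces
        (2, "<b>hello world</b>"),     # same word, but glued to the <b> tag
    ],
)


def search(pattern):
    """Return the ids of pages whose content matches the LIKE pattern."""
    rows = conn.execute(
        "SELECT PageId FROM Pages WHERE PageContent LIKE ? ORDER BY PageId",
        (pattern,),
    )
    return [row[0] for row in rows]


print(search("% hello %"))  # → [1]     page 2 is missed: '>' precedes 'hello'
print(search("%hello%"))    # → [1, 2]  finds both, but would also match 'othello'
```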

Aaron (and others) mentioned a few ways to get around this but suggested the
problem was because I have the formatting and data in the same table....

There are currently 100+ pages, so fixing/changing this could be a bit of a
sod. Luckily I work closely with the company using this, so I am happy to
spend the extra time and, if need be, change each one page by page to
correct it.

What I am unable to come up with yet is a 'good' way to separate the
formatting from the text.

Thoughts so far :

2 tables - one with formatted text used only for display - and the second
then only used for searching, the content would be written to both tables at
the time of the page being created and then both again when updated etc.

This would be the 'easiest' way (apart from the 100+ already created), but I
don't 'personally' think it's the best approach because of the data
replication.
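
The two-table idea could be sketched roughly as follows, again in Python with SQLite standing in for the real database. The table names `PageDisplay` and `PageSearch`, the column `PlainText`, and the helpers `strip_tags`, `save_page` and `search_pages` are all hypothetical names invented for this sketch. The point is only that both copies are written in one transaction, so they cannot drift apart.

```python
import sqlite3
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html):
    """Return the text of an HTML fragment with all tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Collapse whitespace and pad with spaces so '% word %' also matches
    # a word at the very start or end of the page.
    return " " + " ".join(text.split()) + " "


def save_page(conn, page_id, html):
    """Write the display copy and the search copy together."""
    with conn:  # both writes commit together, or both roll back
        conn.execute(
            "INSERT OR REPLACE INTO PageDisplay (PageId, PageContent) VALUES (?, ?)",
            (page_id, html),
        )
        conn.execute(
            "INSERT OR REPLACE INTO PageSearch (PageId, PlainText) VALUES (?, ?)",
            (page_id, strip_tags(html)),
        )


def search_pages(conn, word):
    """Whole-word search against the tag-free copy only."""
    rows = conn.execute(
        "SELECT PageId FROM PageSearch WHERE PlainText LIKE ? ORDER BY PageId",
        (f"% {word} %",),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PageDisplay (PageId INTEGER PRIMARY KEY, PageContent TEXT)")
conn.execute("CREATE TABLE PageSearch (PageId INTEGER PRIMARY KEY, PlainText TEXT)")
save_page(conn, 1, "<b>hello world</b>")
print(search_pages(conn, "hello"))  # → [1]
```

The same save routine could be run once over the 100+ existing pages to fill the search table, so the existing content would not need editing by hand.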

Another thought was to have a lookup table, this would contain the page id,
and then many rows for each page with the character position of an opening
formatting tag, and the closing character position of a formatting tag, the
type of tag, and the 'detail' for the tag, ie

pageid 1
charpos 10
tagtype 1 (<a href>)
tagdetail <a href="http://www.mydomain.com" title="my domain">

This would enable me to strip the tags from the data table (only one now) -
but then there would be an overhead when putting it all together, and
obviously when saving the page initially or updating it later as it would
have to run through this procedure and try and find them all....
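
A rough sketch of what that procedure might look like, in Python. The function names `split_markup` and `join_markup` are made up for illustration, and the regular expression assumes simple tags with no `>` inside attribute values. `charpos` and `tagdetail` follow the example rows above; here `charpos` is measured in the stripped text, and the list order stands in for a sequence column, which the lookup table would need so that two tags at the same position go back in the right order (an assumption, not part of the original idea).

```python
import re

# Assumes well-formed, simple tags: no '>' inside attribute values.
TAG_RE = re.compile(r"<[^>]+>")


def split_markup(html):
    """Split HTML into plain text plus a list of (charpos, tagdetail) rows.

    charpos is the offset in the *plain* text where the tag belongs.
    """
    plain_parts = []
    tags = []
    plain_len = 0
    last_end = 0
    for match in TAG_RE.finditer(html):
        text = html[last_end:match.start()]
        plain_parts.append(text)
        plain_len += len(text)
        tags.append((plain_len, match.group(0)))
        last_end = match.end()
    plain_parts.append(html[last_end:])
    return "".join(plain_parts), tags


def join_markup(plain, tags):
    """Rebuild the HTML by re-inserting each tag at its stored position.

    tags must be in their original order (charpos ascending, ties kept).
    """
    out = []
    last = 0
    for charpos, tagdetail in tags:
        out.append(plain[last:charpos])
        out.append(tagdetail)
        last = charpos
    out.append(plain[last:])
    return "".join(out)


html = 'Visit <a href="http://www.mydomain.com" title="my domain">my site</a> <b>now</b>'
plain, tags = split_markup(html)
print(plain)                             # → Visit my site now
print(join_markup(plain, tags) == html)  # → True
```

This shows where the overhead comes from: every save has to split the page, and every display has to join it again.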

The user has a lot of freedom, i.e. it's not just 'header' and 'body' where
I always make the header bold or something. There are quite a few tags that
can be used, and some of them carry variable data inside, i.e. the
hyperlinks for web pages or email addresses. I also have image and document
tags for a repository of images and documents, and so on...

Other than these 2 ideas I cannot at this time think of anything else. I
can't just use CSS because again I have the hyperlinks etc, and even then
there would need to be 'something' in the data that says "this has to be
bold"...

Anyone got any thoughts/ideas...?

As I said, I'm more than happy to change the 100+ pages and remove the tags
from the text, thus correcting the problem with the search, but I need a way
that will definitely work before I even think about climbing that mountain!

Thanks for your time,

Regards

Rob
Jul 19 '05 #1
4 Replies


Rob
Why not put a start and stop character around all non-text values?
In the database, store something like this: [<b>]Hello[</b>]World, or if you
already use those characters, try | & ~
Then write one ASP function that strips them out of the incoming string, and
one function that puts them in.
That should be easy.
As far as converting the existing tags goes, you should work on copies of
the database and write a few throwaway temp functions to prepare the
existing data.
You will need to filter incoming user data so that it does not accept your
control characters, or all will go down the toilet 8-)
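
One possible reading of this suggestion, sketched in Python rather than ASP. The function names and the choice of `[` and `]` as the control characters are assumptions for illustration; any pair the users never type would do.

```python
import re

# A tag wrapped in the control characters, e.g. "[<b>]".
MARKED_TAG = re.compile(r"\[(<[^>]+>)\]")


def reject_control_chars(user_text):
    """Refuse user input that already contains the control characters."""
    if "[" in user_text or "]" in user_text:
        raise ValueError("text must not contain '[' or ']'")
    return user_text


def mark_tags(html):
    """Wrap every tag in the start/stop characters before storing."""
    return re.sub(r"(<[^>]+>)", r"[\1]", html)


def to_plain(stored):
    """Strip the marked tags out, leaving only the text."""
    return MARKED_TAG.sub("", stored)


def to_html(stored):
    """Drop the markers but keep the tags, for display."""
    return MARKED_TAG.sub(r"\1", stored)


stored = mark_tags("<b>Hello</b>World")
print(stored)            # → [<b>]Hello[</b>]World
print(to_plain(stored))  # → HelloWorld
print(to_html(stored))   # → <b>Hello</b>World
```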

Regards
Don

Jul 19 '05 #2

"Don Grover" wrote ...
Rob
Why not put a start and stop character around all non-text values?
In the database, store something like this: [<b>]Hello[</b>]World, or if you
already use those characters, try | & ~
Then write one ASP function that strips them out of the incoming string, and
one function that puts them in.


Hi Don, thanks for the reply.

This is kinda what I have at the moment, from the user entering the data on
the form like this :

bold etc

which then gets changed using my function to

<b>bold</b>

which is then saved to the database.

I'm sure I could come up with something in ASP to remove the tags, i.e. a
regular expression or something (with help from this group of course :D),
but it's on the SQL side that I need to do this, when I execute the
searches... unless I have a clean, non-formatted version of the data in the
first place. I don't think I would want to retrieve every page and its
contents/data from the database into ASP and then search through it in ASP.

Regards

Rob
Jul 19 '05 #3

Hi Rob

Sounds like a SQL user function; I would ask in a SQL programming or
similar newsgroup.
But creating a filter for searching would not be that hard in SQL, I would
assume. You could create a stored procedure that is called from the ASP
page and in turn calls a function, passing it what is to be searched for,
and let the function retrieve the data and pass it back as the return of
the SP?
Anyway, it does sound like a SQL newsgroup question.
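
A minimal sketch of the user-function idea, using Python's sqlite3 module, where `create_function` registers a Python function that SQL can call. On SQL Server the equivalent would be a scalar user-defined function written in T-SQL, which this sketch does not attempt. The table name `Pages` and the function name `strip_tags` are assumptions.

```python
import re
import sqlite3


def strip_tags(html):
    """Remove anything that looks like a tag; NULL content becomes ''."""
    return re.sub(r"<[^>]+>", "", html or "")


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the name strip_tags (1 argument).
conn.create_function("strip_tags", 1, strip_tags)

conn.execute("CREATE TABLE Pages (PageId INTEGER PRIMARY KEY, PageContent TEXT)")
conn.executemany(
    "INSERT INTO Pages (PageId, PageContent) VALUES (?, ?)",
    [(1, "<b>hello world</b>"), (2, "<i>nothing to see</i>")],
)


def search_pages(word):
    """Whole-word search on the stripped content, done inside the query."""
    rows = conn.execute(
        "SELECT PageId FROM Pages "
        "WHERE ' ' || strip_tags(PageContent) || ' ' LIKE ? "
        "ORDER BY PageId",
        (f"% {word} %",),
    )
    return [row[0] for row in rows]


print(search_pages("hello"))  # → [1]
print(search_pages("b"))      # → []  tag names are no longer searchable text
```

The stored HTML stays as it is, but the function runs against every row on every search and cannot use an index, so this trades data replication for query cost.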
Don

Jul 19 '05 #4

"Don Grover" wrote ...
Anyway it does sound like an sql newsgroup question.


Hi Don,

Yes, if I do that in SQL then I should ask elsewhere :)

However... it was the theory that I was interested in as well, i.e. has
anyone else produced anything similar and done it a different way, with the
formatting not stored in the same tables as the data? If they have, then
maybe they'll have another stance on the way forward that I could adopt.

Regards

Rob
Jul 19 '05 #5
