Bytes | Software Development & Data Engineering Community

How to remove HTML tags from a string

3
Hi,

I store comments as text in a database, but special characters and formatting are stored as HTML tags.
When fetching the comments into a text file, I need only the comment text and no HTML tags.
Is there any way to remove the tags using a C program?

Please help.

-Ratish
Feb 18 '09 #1
8 3683
donbock
2,426 Expert 2GB
Are you confident the html file is well-formed? If so, you could delete all text enclosed in angle brackets. Notice that horrible things will happen if the input file isn't well-formed.

What are you supposed to do with character entity references (such as "&lt;") and numeric character references (such as "&#931;") -- pass them through or expand them?

Is this an assignment that you have to do in a particular way; or are you happy with any approach that works? Try opening the file with a browser and then saving it as a text file.
Feb 18 '09 #2
Ra71sh
3
Actually, the text is stored in the database with the formatting applied on the application screen.
I need to provide a report containing the comments that were entered, so I need to remove the HTML tags. I am using a C program to fetch the data from the database,
but the data still carries the HTML tags. I need to remove all the HTML tags in these comments.
Feb 18 '09 #3
JosAH
11,448 Expert 8TB
@Ra71sh
Yes, you already wrote that: copy all characters on a line until you scan a '<'; stop copying but keep on scanning until you see a '>'. Repeat until you've reached the end of the line.

As already mentioned, character combinations such as '&lt;' (and uglier ones) pass through unchanged.
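
The copy-and-skip loop described above can be sketched in C roughly as follows. This is only a minimal sketch: the function name strip_tags and the caller-supplied output buffer are assumptions made for illustration, and it relies on the input being well-formed, so that any literal '<' or '>' in the comment text was stored as an entity.

```c
/* Copy src to dst, dropping everything from a '<' up to and including
 * the next '>'. dst must have room for at least strlen(src) + 1 bytes.
 * Assumption: the input is well-formed, so a '<' always opens a tag.
 * Entities such as &lt; or &amp; are copied through unchanged. */
void strip_tags(const char *src, char *dst)
{
    int in_tag = 0;                 /* nonzero while between '<' and '>' */

    for (; *src != '\0'; ++src) {
        if (in_tag) {
            if (*src == '>')
                in_tag = 0;         /* tag closed: resume copying */
        } else if (*src == '<') {
            in_tag = 1;             /* tag opened: stop copying */
        } else {
            *dst++ = *src;          /* ordinary text */
        }
    }
    *dst = '\0';
}
```

For the input "<p>Price is <b>low</b> &amp; 5 &lt; 6</p>" this leaves "Price is low &amp; 5 &lt; 6" in dst. An unterminated tag swallows the rest of the string, which is one of the horrible things that happen with malformed input.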

kind regards,

Jos
Feb 18 '09 #4
Ra71sh
3
That is fine, but my concern is the cases where someone has entered text containing angle brackets, e.g. points labelled <a>, <b>, or a>, b>, etc.
How would I handle these?
Also, what should I do with special sequences like &amp; or #3688?
Feb 18 '09 #5
JosAH
11,448 Expert 8TB
@Ra71sh
You have to write a complete, fault-tolerant HTML parser then. As a corollary, I understand that your database contains incorrect HTML data? If so, the GIGO principle (Garbage In, Garbage Out) rears its ugly head.

kind regards,

Jos
Feb 18 '09 #6
donbock
2,426 Expert 2GB
@Ra71sh
In a well-formed html file, the input characters "<" and ">" would be replaced by "&lt;" and "&gt;". Is that happening for you? If not, then your file is malformed and you will have great difficulty parsing it.

Regarding "&#3688" and its kin, don't ask us ... what do you think should happen? Are you emitting a text file? If so, then you are limited to printable characters. What is a meaningful and useful way to handle nonprintable characters in your context?
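
If expanding the references turns out to be the right answer, one option is sketched below. It is a sketch under stated assumptions, not a complete decoder: it assumes the report may be written as UTF-8, it knows only four named entities, and the function names utf8_encode and decode_entities are made up for illustration. Anything it does not recognise is copied through unchanged.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Write code point cp to out as UTF-8; return the byte count, or 0 if
 * cp is not something we want to emit (NUL, surrogate, out of range). */
size_t utf8_encode(unsigned long cp, char *out)
{
    if (cp == 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (char)(0xF0 | (cp >> 18));
    out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Expand &lt; &gt; &amp; &quot; and numeric references (&#931; or
 * &#x3A3;) from src into dst. The decoded text is never longer than
 * the input, so dst needs strlen(src) + 1 bytes. */
void decode_entities(const char *src, char *dst)
{
    static const struct { const char *name; char ch; } named[] = {
        { "&lt;", '<' }, { "&gt;", '>' }, { "&amp;", '&' }, { "&quot;", '"' }
    };

    while (*src != '\0') {
        size_t i, n = 0;

        if (src[0] == '&' && src[1] == '#') {
            int hex = (src[2] == 'x' || src[2] == 'X');
            const char *digits = src + (hex ? 3 : 2);
            char *end = NULL;

            /* Require a leading digit so strtoul cannot skip spaces or signs. */
            if (hex ? isxdigit((unsigned char)*digits)
                    : isdigit((unsigned char)*digits)) {
                unsigned long cp = strtoul(digits, &end, hex ? 16 : 10);
                if (*end == ';')
                    n = utf8_encode(cp, dst);
            }
            if (n > 0) {
                dst += n;
                src = end + 1;
                continue;
            }
        } else if (src[0] == '&') {
            for (i = 0; i < sizeof named / sizeof named[0]; ++i) {
                size_t len = strlen(named[i].name);
                if (strncmp(src, named[i].name, len) == 0) {
                    *dst++ = named[i].ch;
                    src += len;
                    n = 1;
                    break;
                }
            }
            if (n > 0)
                continue;
        }
        *dst++ = *src++;            /* not a recognised reference */
    }
    *dst = '\0';
}
```

Decoding has to run after the tags have been removed, never before; otherwise an expanded "&lt;" would be mistaken for the start of a tag. If the report must stay plain ASCII, the UTF-8 step could instead substitute a placeholder character for anything above 127.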

Do you want to see the html page exactly as it would appear on a browser page -- with all the mark-ups? If so, then open the file with a browser and save or print to a pdf file.
Feb 18 '09 #7
donbock
2,426 Expert 2GB
A malformed html file can confuse the html parser:
html parse error
Feb 18 '09 #8
JosAH
11,448 Expert 8TB
@donbock
That is so funny. ;-)

kind regards,

Jos
Feb 18 '09 #9
