C#/.NET: get text between body tags of an HTML file

rhitam30111985
112 New Member
Hi all,

I am trying to read an HTML file and retrieve only the text between
the body tags of that file. For reading a string between two
strings, I already have a function:


http://www.mycsharpcorner.com/Post.aspx?postID=15


But the problem is that the body tag might have some attributes. In
that case I don't know how to exclude them and get only the text
between the tags, i.e. something like this:


<body style="margin:0 ;padding:0">
..
.
.
.
</body>


Any ideas?


Regards,
Rhitam
May 5 '09 #1
cloud255
427 Recognized Expert Contributor
This needs only a slight modification to the search algorithm: the start marker is no longer the entire opening tag "<body>" but the first occurrence of ">" after the start of the body tag, "<body".

So find the end index of the "<body" string and use it as the start index for your next search, in which you look for the ">" character. The position just after that ">" is the starting index of your actual content.

From this point it's fairly straightforward to get the starting index of the closing "</body>" tag; the text you want lies between the two indexes.

This is string manipulation, which can be done using Regex or the .NET string class.
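
The steps above can be sketched with the string class alone. This is only a minimal sketch, assuming the whole file has already been read into a string; the names `BodyExtractor` and `TryGetBodyText` are made up for illustration and are not taken from the linked function.

```csharp
using System;

public static class BodyExtractor
{
    // Hypothetical helper: copies the text between <body ...> and </body>
    // into 'body'. Returns false if either tag cannot be found.
    public static bool TryGetBodyText(string html, out string body)
    {
        body = string.Empty;

        // Step 1: find where the opening tag starts, ignoring case.
        int tagStart = html.IndexOf("<body", StringComparison.OrdinalIgnoreCase);
        if (tagStart < 0) return false;

        // Step 2: the opening tag (attributes included) ends at the next '>'.
        int tagEnd = html.IndexOf('>', tagStart);
        if (tagEnd < 0) return false;

        // The content begins right after that '>'.
        int contentStart = tagEnd + 1;

        // Step 3: the content ends where the closing tag starts.
        int closeStart = html.IndexOf("</body", contentStart, StringComparison.OrdinalIgnoreCase);
        if (closeStart < 0) return false;

        body = html.Substring(contentStart, closeStart - contentStart);
        return true;
    }
}
```

Calling `BodyExtractor.TryGetBodyText(File.ReadAllText(path), out string body)` would then leave the body content in `body`. Note that this simple search would be thrown off by a ">" inside an attribute value, so it is not a substitute for a real HTML parser.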
May 5 '09 #2
r035198x
13,262 MVP
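Just use regex.
If you have
    string text = "<body style=\"margin:0;padding:0\">r035198x</body> ";
then
    string result = Regex.Replace(text, @"</?body\b[^>]*>", "", RegexOptions.IgnoreCase);
should do it, provided text holds only the body element: this removes the two tags rather than extracting what lies between them, and the result has to be assigned because Replace returns a new string.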

P.S. Not tested on a C# compiler.
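
If the string holds a whole HTML file (head, scripts and so on), stripping the tags is not enough, and a capture group may be a better fit. The following is a rough sketch under that assumption; the class name `BodyRegex` and the method `Extract` are invented for this example.

```csharp
using System.Text.RegularExpressions;

public static class BodyRegex
{
    // <body, then any attributes up to the first '>', then the content
    // (captured lazily), then the closing tag. Singleline lets '.' match
    // newlines so a body spanning several lines is captured.
    private static readonly Regex BodyPattern = new Regex(
        @"<body\b[^>]*>(.*?)</body\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Hypothetical helper: returns the captured content,
    // or an empty string when no body element is found.
    public static string Extract(string html)
    {
        Match m = BodyPattern.Match(html);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

As with any regex over HTML, this assumes reasonably well-formed markup; a ">" inside an attribute value or a "</body>" inside a script would confuse it.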
May 5 '09 #3
