Regular Expressions to match HTML

Hi,

I need to read an HTML file into an array and match certain sections of it. I want to extract the headings and the text inside certain DIV tags, such as those with a class of padding8.

I want to be able to match this:
  <div class="bottomPostionDiv" ><h1 >PROFESSIONAL </h1></div>
But not this, because there's no content:
  <div class="bottomPostionDiv" ><h1 ></h1></div>
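A minimal sketch of the heading case in PHP, assuming the PCRE functions (preg_*) are available. The helper name nonEmptyH1 is made up for illustration. It captures whatever sits between each <h1 ...> and </h1>, then drops captures that are empty once tags and whitespace are stripped:

```php
<?php
// Hypothetical helper: returns the trimmed text of every <h1> that has content.
function nonEmptyH1($html)
{
    // [^>]* tolerates attributes or stray spaces inside the opening tag.
    // (.*?) lazily captures up to the first closing tag; /s lets . span newlines.
    preg_match_all('/<h1\b[^>]*>(.*?)<\/h1\s*>/is', $html, $matches);

    $found = array();
    foreach ($matches[1] as $inner) {
        $text = trim(strip_tags($inner));
        if ($text !== '') {
            $found[] = $text; // keep only headings that actually contain text
        }
    }
    return $found;
}

$sample = '<div class="bottomPostionDiv" ><h1 >PROFESSIONAL </h1></div>'
        . '<div class="bottomPostionDiv" ><h1 ></h1></div>';
print_r(nonEmptyH1($sample));
```

For the two sample lines above, only PROFESSIONAL should come back. Filtering after the match keeps the pattern simple; the emptiness check could be folded into the regex, but it gets harder to read.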

I want to extract the content of this DIV:
  <div class="padding8">
  <h2>Who does it apply to?</h2>
  Professional Development is relevant to all members, although<br>
  Corporate Members are required by the Code of Professional<br>
  Development to:<br>
  <span class="colorBlueLight">&bull;</span> Be committed to continuous learning and improvement<br>
  <span class="colorBlueLight">&bull;</span> Take ownership of their development and maintain it in a systematic manner<br>
  <span class="colorBlueLight">&bull;</span> Support the learning and development of others
  </div>
and this:
  <div class="colOne width210">
  <h4 >Professional Development
  is a process that enables
  you to maintain and
  develop relevant skills and
  knowledge throughout your career</h4>
  <!-- end of h4 -->
  </div><!-- end of colOne 2-->
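Nested tags such as the <span> and <br> inside these DIVs are awkward to handle with a single regular expression. As an alternative technique (not regex), here is a hedged sketch using PHP's DOM extension (DOMDocument and DOMXPath), assuming that extension is installed. divTextByClass is a hypothetical helper name, and $class is assumed to be a plain class name with no quotes in it:

```php
<?php
// Hypothetical helper: returns the text of each <div> whose class list contains $class.
function divTextByClass($html, $class)
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Pad the class attribute with spaces so "colOne" also matches "colOne width210".
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $results = array();
    foreach ($xpath->query($query) as $div) {
        // Collapse runs of whitespace left behind by line breaks and <br> tags.
        $results[] = trim(preg_replace('/\s+/', ' ', $div->textContent));
    }
    return $results;
}
```

Calling divTextByClass($html, 'padding8') or divTextByClass($html, 'colOne') would then give one string per matching DIV, with the markup removed. If the raw inner HTML is needed instead of plain text, each child node could be passed to $doc->saveHTML().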
I have tried using preg_replace and eregi, but I'm not sure which is best.

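  // Map each start pattern to the pattern that ends the section.
  $delimiters = array();
  $delimiters['/<h[1-6][^>]*>/'] = '/<\/h[1-6]>/';          // any heading, h1 to h6
  $delimiters['/class="colOne \w+">\s*<h/'] = '/<\/div>/';  // colOne DIV that opens with a heading
  $delimiters['/padding8/'] = '/<\/div>/';
  $delimiters['/block22/'] = '/<\/div>/';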
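One way a start/end pattern pair in the style of the $delimiters array might be applied is sketched below. extractBetween is a hypothetical helper, not an existing PHP function, and the patterns passed to it are assumed to be valid PCRE with an escaped slash in the closing tag. It scans forward with preg_match and PREG_OFFSET_CAPTURE, keeping the text between each start match and the next end match:

```php
<?php
// Hypothetical helper: returns the non-empty text between each start match
// and the first end match that follows it.
function extractBetween($html, $startPattern, $endPattern)
{
    $sections = array();
    $offset = 0;
    while (preg_match($startPattern, $html, $start, PREG_OFFSET_CAPTURE, $offset) === 1) {
        // Position just past the start match.
        $from = $start[0][1] + strlen($start[0][0]);
        if (preg_match($endPattern, $html, $end, PREG_OFFSET_CAPTURE, $from) !== 1) {
            break; // no closing delimiter left
        }
        $text = trim(substr($html, $from, $end[0][1] - $from));
        if ($text !== '') {
            $sections[] = $text; // skip sections with no content
        }
        // Resume after the end match; always move forward to avoid looping forever.
        $next = $end[0][1] + strlen($end[0][0]);
        $offset = ($next > $offset) ? $next : $offset + 1;
    }
    return $sections;
}

$sample = '<div class="bottomPostionDiv" ><h1 >PROFESSIONAL </h1></div>'
        . '<div class="bottomPostionDiv" ><h1 ></h1></div>';
print_r(extractBetween($sample, '/<h[1-6][^>]*>/', '/<\/h[1-6]>/'));
```

Two caveats with this approach: a start pattern such as /padding8/ matches in the middle of the opening tag, so the captured section would begin with the rest of that tag, and an end pattern of /<\/div>/ stops at the first closing DIV, which cuts the section short if DIVs are nested.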
Please help.

Sean
Sep 16 '07 #1