Regular expression for <p>, <ul> and <ol> tags

Hi,
I am parsing an HTML file that contains the following example code:
<div>
<p class="html_preformatted" awml:style="HTML Preformatted" dir="ltr" style="text-align:left"><span style="font-size:12pt;font-family:'Arial'" xml:lang="en-US" lang="en-US">Normal Text Arial 12 Black before bullets.</span></p>
<ul>
<li class="html_preformatted" dir="ltr" style="text-align:left">&nbsp;<span style="font-size:12pt;font-family:'Arial'" xml:lang="en-US" lang="en-US">Bullet1: If you want to convert bitmap images Single Line.</span></li>

<li class="html_preformatted" dir="ltr" style="text-align:left">&nbsp;<span style="font-size:12pt;font-family:'Arial'" xml:lang="en-US" lang="en-US">Bullet2: D you want to convert </span><span style="font-weight:bold;font-size:13pt;font-family:'Times New Roman';color:#ff0000" xml:lang="en-US" lang="en-US">Times New Roman Bold Red 13</span><span style="font-size:12pt;font-family:'Arial'" xml:lang="en-US" lang="en-US"> like BMP, JPG?</span></li>
<li class="html_preformatted" dir="ltr" style="text-align:left">&nbsp;<span style="font-weight:bold;font-size:12pt;font-family:'Arial'" xml:lang="en-US" lang="en-US">Bullet3 bold:</span><span style="font-size:12pt;font-family:'Arial'" xml:lang="en-US" lang="en-US"> If you want to convert bitmap images like BMP, JPG</span></li>
<li class="html_preformatted" dir="ltr" style="text-align:left">&nbsp;<span style="font-weight:bold;font-size:14pt;font-family:'Arial'" xml:lang="en-US" lang="en-US">Bullet4 bold 14: </span><span style="font-size:14pt;font-family:'Arial'" xml:lang="en-US" lang="en-US">If you want to convert bitmap images like BMP, JPG 2 lines.</span></li>
<li class="html_preformatted" dir="ltr" style="text-align:left">&nbsp;<span style="font-weight:bold;font-size:16pt;font-family:'Arial';color:#ff0000" xml:lang="en-US" lang="en-US">Bullet4 bold 14 all Red: </span><span style="font-size:16pt;font-family:'Arial';color:#ff0000" xml:lang="en-US" lang="en-US">If you want to convert bitmap images like BMP, JPG.</span></li>

<li class="html_preformatted" dir="ltr" style="text-align:left">&nbsp;<span style="font-weight:bold;font-size:14pt;font-family:'Arial'" xml:lang="en-US" lang="en-US">Bullet4 bold 14 Black: </span><span style="font-size:14pt;font-family:'Arial';color:#0000ff" xml:lang="en-US" lang="en-US">Blue If you want to convert bitmap. </span><span style="font-size:16pt;font-family:'Arial';color:#008000" xml:lang="en-US" lang="en-US">Green 16 images like BMP, JPG.</span>
</li>
</ul>
<p class="html_preformatted" awml:style="HTML Preformatted" dir="ltr" style="text-align:left"><span style="font-size:14pt;font-family:'Arial';color:#ff0000" xml:lang="en-US" lang="en-US">Normal Text Red Arial 14 after bullets.</span></p>
<p class="html_preformatted" awml:style="HTML Preformatted" dir="ltr" style="text-align:left;margin-left:0.2500in"><span style="font-weight:bold;font-size:14pt;font-family:'Arial'" xml:lang="en-US" lang="en-US">&nbsp;</span></p>
<p dir="ltr" style="text-align:left"></p>
<p></p>
</div>

I am trying to parse all the <p>, <ol> and <ul> tags, but I haven't succeeded yet.
I am trying the following regular expression (RE):
"(<[pP][^>]*>(.*)</[pP]>)|(<[oO][lL][^>]+>(.*)</[oO][lL]>)|(<[uU][lL][^>]+>(.*)</[uU][lL]>)"

I am using preg_match_all(). Note that I am working in PHP.
If anyone can help me, I will be very grateful. I need a solution urgently.
Aug 26 '08 #1
.oO(Shahid)

> I am parsing an HTML file that contains the following example code:
> <div>
> <p class="html_preformatted" awml:style="HTML Preformatted" dir="ltr" style="text-align:left"><span style="font-size:12pt;font-family:'Arial'" xml:lang="en-US" lang="en-US">Normal Text Arial 12 Black before bullets.</span></p>
> <ul>
> [...]
>
> I am trying to parse all the <p>, <ol> and <ul> tags, but I haven't succeeded yet.
> I am trying the following regular expression (RE):
> "(<[pP][^>]*>(.*)</[pP]>)|(<[oO][lL][^>]+>(.*)</[oO][lL]>)|(<[uU][lL][^>]+>(.*)</[uU][lL]>)"
>
> I am using preg_match_all(). Note that I am working in PHP.
> If anyone can help me, I will be very grateful. I need a solution urgently.

Why don't you use the DOM with an XPath expression?
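
A minimal sketch of the DOM/XPath approach, assuming PHP's bundled DOM extension (DOMDocument, DOMXPath) is available and PHP 5.4 or later. The sample markup is a hypothetical, shortened version of the HTML in the question, and the variable names ($html, $found) are only for illustration.

```php
<?php
// Hypothetical sample input, shortened from the markup in the question.
$html = <<<'HTML'
<div>
<p class="html_preformatted" dir="ltr">Normal text before bullets.</p>
<ul>
<li>Bullet1</li>
<li>Bullet2</li>
</ul>
<p>Normal text after bullets.</p>
</div>
HTML;

$doc = new DOMDocument();
// Real-world HTML is rarely perfectly valid, so collect parser
// warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Select every <p>, <ol> and <ul> element anywhere in the document.
$nodes = $xpath->query('//p | //ol | //ul');

$found = [];
foreach ($nodes as $node) {
    // saveHTML($node) serializes the element including its own tags.
    $found[] = ['tag' => $node->nodeName, 'html' => $doc->saveHTML($node)];
}

foreach ($found as $item) {
    echo $item['tag'], ': ', $item['html'], "\n";
}
```

Because the parser builds a tree, nested lists and attributes that span several lines need no special handling here, which is the main reason to prefer this over a regular expression.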

Micha
Aug 26 '08 #2
Shahid wrote:
> Hi,
> I am parsing an HTML file that contains the following example code:
> [snip]
>
> I am trying to parse all the <p>, <ol> and <ul> tags, but I haven't succeeded yet.
> I am trying the following regular expression (RE):
> "(<[pP][^>]*>(.*)</[pP]>)|(<[oO][lL][^>]+>(.*)</[oO][lL]>)|(<[uU][lL][^>]+>(.*)</[uU][lL]>)"
>
> I am using preg_match_all(). Note that I am working in PHP.
> If anyone can help me, I will be very grateful. I need a solution urgently.

Have you bothered checking php.net's docs? Their page for
preg_match_all has an example regex doing what you want.
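
For reference, a rough sketch of how the pattern from the question might be adjusted. This is not the php.net example itself, only an illustration under the assumption that the lists are not nested. The posted pattern has no delimiters, uses a greedy `.*` that does not cross newlines, and requires at least one attribute character on <ol> and <ul> (`[^>]+`), so a bare <ul> never matches. The sample markup and variable names are hypothetical.

```php
<?php
// Hypothetical sample input, shortened from the markup in the question.
$html = <<<'HTML'
<div>
<p class="html_preformatted">Normal text
before bullets.</p>
<ul>
<li>Bullet1</li>
</ul>
<P>Normal text after bullets.</P>
</div>
HTML;

// ~ ... ~      pattern delimiters, chosen so "/" needs no escaping
// (p|ol|ul)\b  tag name, captured as group 1 (\b keeps <pre> out)
// [^>]*        optional attributes ("*" so a bare <ul> also matches)
// (.*?)        element content, non-greedy, captured as group 2
// </\1\s*>     closing tag with the same name as group 1
// i = case-insensitive, s = "." also matches newlines
$pattern = '~<(p|ol|ul)\b[^>]*>(.*?)</\1\s*>~is';

$count = preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[0] is the whole element, $m[1] the tag name, $m[2] the content.
    echo strtolower($m[1]), ': ', trim($m[2]), "\n";
}
```

The non-greedy `(.*?)` stops at the first matching closing tag, so a <ul> nested inside another <ul> would be cut short; a DOM parser avoids that limitation.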

--
Curtis
Aug 26 '08 #3
