preg_match and html question


I need a regular expression for preg_match to find all of the strings between
'>' and '<' from html. Eg.

1. <TD><FONT SIZE='2'>XXX</FONT></TD>
2. <TD><FONT SIZE='2'><A HREF=http://www.whatever.com/...7>ZZZ</A></FONT></TD>
3. <TD ALIGN=CENTER><FONT SIZE='2'>Y/Y</FONT></TD>

#1 matches should be "", "XXX", and ""
#2 should be "", "", "ZZZ", "", ""
#3 should be "", "Y/Y", ""

Actually I need only the non-empty strings, but I can just ignore the empty ones.

I've tried preg_match("/>(.*)</i", $str, $matches) but it doesn't work like I want.
Does anyone know how to get this to work?

thanks,
Sam

Jun 10 '07 #1
Sam Waller wrote:
I've tried preg_match("/>(.*)</i", $str, $matches) but it doesn't work like I want.
Does anyone know how to get this to work?

strip_tags() :-)
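
A minimal sketch of what this suggestion might look like, assuming only the visible text is wanted. The variable names are hypothetical; the first input is example 3 from the question. Note that strip_tags() returns one string with the markup removed, so text from neighbouring cells is joined rather than returned as separate strings:

```php
<?php
// Example 3 from the question.
$row = "<TD ALIGN=CENTER><FONT SIZE='2'>Y/Y</FONT></TD>";

// strip_tags() removes the tags and returns whatever text is left.
$text = strip_tags($row);
echo $text, "\n"; // Y/Y

// Two cells in one string: the pieces come back glued together.
$joined = strip_tags("<TD>A</TD><TD>B</TD>");
echo $joined, "\n"; // AB
```

If each piece of text is needed separately, the regular expression approach suggested in the other replies is probably a better fit.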

--
Arjen
http://www.arjenkarel.nl
Jun 10 '07 #2
On 10.06.2007 17:23 Sam Waller wrote:
I've tried preg_match("/>(.*)</i", $str, $matches) but it doesn't work like I want.
Does anyone know how to get this to work?

Try using a character class instead of dot: />([^<>]*)</
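
A minimal sketch of why this helps, assuming preg_match_all() is acceptable in place of preg_match() (preg_match() stops after the first match). `$str` holds example 1 from the question; the other variable names are only for illustration:

```php
<?php
// Example 1 from the question.
$str = "<TD><FONT SIZE='2'>XXX</FONT></TD>";

// Original attempt: the greedy .* runs from the first '>' to the last '<'.
preg_match("/>(.*)</i", $str, $greedy);
echo $greedy[1], "\n"; // <FONT SIZE='2'>XXX</FONT>

// Character class: the capture cannot cross a '<' or a '>'.
// preg_match_all() collects every match instead of only the first one.
preg_match_all("/>([^<>]*)</", $str, $matches);
print_r($matches[1]); // "", "XXX", ""
```

The captured strings end up in `$matches[1]`; `$matches[0]` holds the full matches including the surrounding '>' and '<'.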

--
gosha bine

extended php parser ~ http://code.google.com/p/pihipi
blok ~ http://www.tagarga.com/blok
Jun 11 '07 #3
gosha bine wrote:
Try using a character class instead of dot: />([^<>]*)</
Also, using + instead of * might work better for your purposes.
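
A short sketch of the difference, assuming preg_match_all() is used so that every match is collected. `$str` is example 3 from the question:

```php
<?php
// Example 3 from the question.
$str = "<TD ALIGN=CENTER><FONT SIZE='2'>Y/Y</FONT></TD>";

// + requires at least one character between '>' and '<',
// so the empty strings between adjacent tags are never matched.
preg_match_all("/>([^<>]+)</", $str, $matches);
print_r($matches[1]); // only "Y/Y"
```

Whitespace between tags still counts as a character, so strings made up only of spaces or newlines would need to be filtered out separately.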

--
Toby A Inkster BSc (Hons) ARCS
[Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux]
[OS: Linux 2.6.12-12mdksmp, up 107 days, 22:28.]

URLs in demiblog
http://tobyinkster.co.uk/blog/2007/05/31/demiblog-urls/
Jun 11 '07 #4
