regex question

Hi folks,
I have to do the following:

match everything between "start match after this text:" and "</td>".
My problem is that there are other html-tags between, so [^<] doesn't work.
How can I do something like [^<\/td>] (yes, I know this means not < or /
or ...), but do it right?

Many thanks in advance,
yours Henri

--
| Henri Schomäcker - BYTECONCEPTS, VIRTUAL HOMES
|   Data design for Internet and intranet
|   http://www.byteconcepts.de
|   http://www.virtual-homes.de
Jul 17 '05 #1
Henri Schomaecker wrote:
> I have to do the following:
>
> match everything between "start match after this text:" and "</td>".
> My problem is that there are other html-tags between, so [^<] doesn't work.
> How can I do something like [^<\/td>] (yes, I know this means not < or /
> or ...), but do it right?


What's wrong with preg_match('/STARTTEXT(.*)<\/td>/', $text, $array),
where STARTTEXT is the literal start text?

Maybe I'm misunderstanding your requirement, in which case you would
need to post some explicit examples of what you want.

--
Oli

Jul 17 '05 #2
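
For readers following along, here is a minimal sketch of the suggestion above. The sample input is hypothetical and contains only one </td> after the start text; preg_quote() is used on the assumption that the start text is a plain literal that should not be interpreted as regex syntax.

```php
<?php
// Hypothetical sample input: a single cell, so only one </td> follows the start text.
$text = '<td>start match after this text: <b>some</b> value</td>';

// preg_quote() escapes regex metacharacters (and the "/" delimiter) in the literal.
$start = preg_quote('start match after this text:', '/');

if (preg_match('/' . $start . '(.*)<\/td>/', $text, $array)) {
    // $array[0] is the whole match; $array[1] is what the (...) group captured.
    echo $array[1], "\n";
}
```

With a single closing tag the inner <b> tags are captured without trouble, because . matches any character except a newline.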
'/STARTTEXT(.*)<\/td>/' will continue matching until the last
occurrence of </td>.

Use this regex:
'|STARTTEXT([^<>]*)</td>|'

You shouldn't have any < or > characters (other than those that make up tags),
right? :)

Jul 17 '05 #3
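
A small sketch of both patterns from this post, run against a hypothetical two-cell row in which the wanted text contains an inner tag. The input string and variable names are made up for illustration.

```php
<?php
// Hypothetical two-cell row; STARTTEXT stands in for the real start text.
$text = '<td>STARTTEXT one <b>two</b></td><td>other cell</td>';

// Greedy: .* first runs to the end of the string, then backtracks
// only as far as the LAST </td>, so the capture spills into the next cell.
preg_match('/STARTTEXT(.*)<\/td>/', $text, $greedy);
var_dump($greedy[1]);

// Negated class: [^<>]* stops at the "<" of <b>, where "</td>" cannot match,
// so the pattern as a whole fails on this input.
$found = preg_match('|STARTTEXT([^<>]*)</td>|', $text, $negated);
var_dump($found);
```

On this input the greedy capture includes "</td><td>other cell", and the negated-class pattern finds nothing at all.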

BKDotCom wrote:
> '/STARTTEXT(.*)<\/td>/' will continue matching until the last
> occurrence of </td>.

In that case, use /STARTTEXT(.*?)<\/td>/ instead.

> Use this regex:
> '|STARTTEXT([^<>]*)</td>|'
>
> You shouldn't have any < or > characters (other than those that make up tags),
> right? :)

The OP stated that there *will* be other HTML tags in between, so this
regex won't work.

--
Oli

Jul 17 '05 #4
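
The lazy quantifier can be sketched as follows, again on made-up input. The second half goes slightly beyond the thread: it assumes the cell may span several lines, in which case the s modifier is needed because . does not match a newline by default.

```php
<?php
// Hypothetical two-cell row with an inner tag in the wanted text.
$text = '<td>STARTTEXT one <b>two</b></td><td>other cell</td>';

// Lazy: .*? grows one character at a time and stops at the FIRST </td>.
preg_match('/STARTTEXT(.*?)<\/td>/', $text, $lazy);
var_dump($lazy[1]);

// Assumption beyond the thread: the cell content contains a line break.
$multiline = "<td>STARTTEXT one\n<b>two</b></td>";

// Without the s modifier, "." cannot cross the newline, so nothing matches.
$withoutS = preg_match('/STARTTEXT(.*?)<\/td>/', $multiline, $m1);

// With s (dot-all), "." also matches newlines.
$withS = preg_match('/STARTTEXT(.*?)<\/td>/s', $multiline, $m2);
var_dump($withoutS, $withS);
```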
Thanks to all of you!

I solved it. It was a greediness problem.
I just don't understand why, in PHP, .* matches far beyond the (...) group when I
don't set the N (non-greedy) option. In my opinion it should at least
stop matching when the closing ) is reached. But it doesn't!
In Perl, this is no problem; I just tried a few one-liners with the g option
(Perl's greedy option) on my example.
PHP seems to match and match, and does not stop until the
end of the subject string is reached.

I recently wrote a C++ API for libpcre (unfortunately closed source at the
moment). Because the PHP API seems to be more or less copied from pcre, I think
I'll have to run some tests: if this behaviour is also present in the pcre
API, it will really be a problem for me.

Question: Is it correct PHP pcre behaviour to match beyond the
group delimiter ) ?

Many thanks for every answer,
yours Henri

--
| Henri Schomäcker - BYTECONCEPTS, VIRTUAL HOMES
|   Data design for Internet and intranet
|   http://www.byteconcepts.de
|   http://www.virtual-homes.de
Jul 17 '05 #5
BKDotCom wrote (11 May 2005 14:57:11 -0700):
> '/STARTTEXT(.*)<\/td>/' will continue matching until the last
> occurrence of </td>.

Unless you turn greediness off:

'/STARTTEXT(.*)<\/td>/U'

or just

'#STARTTEXT(.*)</td>#U'
--
-- Álvaro G. Vicario - Burgos, Spain
-- http://bits.demogracia.com - My site about web programming
-- Don't e-mail me your questions, post them to the group
--
Jul 17 '05 #6
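
A short sketch of the U modifier on hypothetical input. One detail worth knowing: U inverts greediness for the whole pattern, so a quantifier followed by ? becomes greedy again when U is set.

```php
<?php
// Hypothetical two-cell row; STARTTEXT stands in for the real start text.
$text = '<td>STARTTEXT one <b>two</b></td><td>other cell</td>';

// U makes every quantifier lazy by default: stops at the first </td>.
preg_match('#STARTTEXT(.*)</td>#U', $text, $ungreedy);

// Equivalent without U: mark just this quantifier as lazy.
preg_match('#STARTTEXT(.*?)</td>#', $text, $lazy);

// With U, the trailing ? flips the quantifier back to greedy.
preg_match('#STARTTEXT(.*?)</td>#U', $text, $flipped);

var_dump($ungreedy[1], $lazy[1], $flipped[1]);
```

Using .*? on the one quantifier that needs it keeps the rest of the pattern behaving as usual, which may be easier to read than a pattern-wide U.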
Henri Schomaecker <hs@byteconcepts.de> wrote:

> I solved it. It was a greedy problem.
> I just don't understand why in PHP .* catches far over the (...) when I
> don't set the N (non-greedy) Option. - In my Opinion it should at least
> stop matching, when the match-making ) is reached. - But it doesn't!

That's your opinion, because it conveniently suits your current
requirement. Regular expressions have been greedy right from the start.

> In perl, this is no problem, I tried a few one-liners with the g option
> (perl's greedy option) with my example now.

Perl is greedy by default (as are all regular expression matchers).
Perhaps you should post your test so we can figure out what you really did.

> PHP seems to match, and match ..., and does not stop with matching until the
> end of the subject string is found.

Please post your exact tests. I want to make sure we can explain this to
everyone.
--
- Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.
Jul 17 '05 #7
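
As a possible test case for the question about parentheses, the sketch below (hypothetical input) compares the same pattern with and without a capture group. The parentheses only record which part of the match to return; they do not limit how far .* extends.

```php
<?php
// Hypothetical two-cell row; STARTTEXT stands in for the real start text.
$text = '<td>STARTTEXT one</td><td>other cell</td>';

// Same pattern, once without and once with a capture group.
preg_match('/STARTTEXT.*<\/td>/', $text, $plain);
preg_match('/STARTTEXT(.*)<\/td>/', $text, $grouped);

// The overall match is identical in both cases.
var_dump($plain[0] === $grouped[0]);

// The group simply reports the text consumed by the greedy .* inside it.
var_dump($grouped[1]);
```

Note that Perl's /g flag means "global" (find all matches), not "greedy"; the closest PHP counterpart is preg_match_all().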
