Regex Help Required

hello,

I've got the following regular expression:

"<span .*bbc_underline.*>(.*)</span>"

and the following input string:

"this <span class="bbc_underline">is underlined <span
class="bbc_strikethrough">and striked through</span>text.</span>"

when I do a replace on the expression, I get the following result:

"this text."

I'm quite new to regular expressions. The goal of the expression above is to
yield the following result:

"this [u]is underlined [s]and striked through[/s]text.[/u]"

but somehow the expression eats up more than it should. btw, I don't
know whether or how deeply the spans are nested. I just have a couple of spans with given
classes that should "turn into" square brackets. Can somebody please help? A
ton of thanks in advance.

Matthias
Mar 5 '07 #1
Regex quantifiers are "greedy" by default: each one matches the largest chunk it can.
To make one "lazy" (match the smallest chunk it can), use "*?" instead of "*".
So, I think your Regex would be
"<span .*?bbc_underline.*?>(.*?)</span>"
Each ".*?" will match the minimum number of characters rather than the
maximum.
Give it a try.
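
The greedy and lazy behaviour can be reproduced with a minimal sketch, assuming Python's re module, whose "*" and "*?" behave like .NET's for these two patterns. The variable names are illustrative only, and the input is the string from the first post.

```python
import re

# Input string from the first post, joined onto one logical line.
text = ('this <span class="bbc_underline">is underlined '
        '<span class="bbc_strikethrough">and striked through</span>text.</span>')

# Greedy: every ".*" grabs as much as it can and gives back only what it must.
greedy = r'<span .*bbc_underline.*>(.*)</span>'

# Lazy: every ".*?" grabs as little as it can.
lazy = r'<span .*?bbc_underline.*?>(.*?)</span>'

# Replace each match with its first capture group ($1 in .NET, \1 here).
greedy_result = re.sub(greedy, r'\1', text)
lazy_result = re.sub(lazy, r'\1', text)

print(greedy_result)
print(lazy_result)
```

In the greedy case the second ".*" runs up to the ">" of the inner closing tag, so the group captures only "text.". In the lazy case the group stops at the first closing tag, which pairs the outer opening tag with the inner closing tag.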
Ethan
Mar 5 '07 #2
hey ethan,

thanks for your help, but it does not work correctly either, since it
matches the first occurrence of </span> instead of the last.

In case I did not explain it correctly: what I want to achieve is to
match the first occurrence of <span SOMETHING "bbc_underline" with the last
possible occurrence of </span> and replace everything within with $1.
I don't know how many nestings there are. The span classes I use come from a
predefined list (bbc_underline, bbc_italic, bbc_strikethrough and the like).

I would greatly appreciate further help.

Matthias
Mar 5 '07 #3
The only way of replacing nested tags is to start from the innermost tag.

Make a pattern that matches your tag, but only if there is not another
such tag inside it. Then you can successfully match the innermost tag
and replace it, then repeat the process until there are no more matches.
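
One way to sketch this innermost-first loop, assuming Python's re module rather than .NET. The TAGS mapping and the name spans_to_bbcode are hypothetical: only [u] and [s] appear in the thread, and the italic entry is a guess.

```python
import re

# Hypothetical class-to-BBCode mapping; "i" for bbc_italic is an assumption.
TAGS = {"bbc_underline": "u", "bbc_strikethrough": "s", "bbc_italic": "i"}

# Matches a known span only if its body contains no further "<span",
# which makes it an innermost span. The (?!<span\b) lookahead is checked
# before every character of the body.
INNERMOST = re.compile(
    r'<span\s[^>]*?class="(' + "|".join(map(re.escape, TAGS)) + r')"[^>]*>'
    r'((?:(?!<span\b).)*?)</span>',
    re.DOTALL,
)

def spans_to_bbcode(html: str) -> str:
    """Replace innermost spans repeatedly until no span matches any more."""
    def repl(m: re.Match) -> str:
        tag = TAGS[m.group(1)]
        return f"[{tag}]{m.group(2)}[/{tag}]"

    while True:
        html, count = INNERMOST.subn(repl, html)
        if count == 0:
            return html

sample = ('this <span class="bbc_underline">is underlined '
          '<span class="bbc_strikethrough">and striked through</span>text.</span>')
print(spans_to_bbcode(sample))
```

Each pass removes one nesting level, so the loop runs once per level plus a final pass that finds nothing. A span with a class outside the mapping is left alone, and it also blocks conversion of any span that encloses it.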

--
Göran Andersson
_____
http://www.guffa.com
Mar 5 '07 #4
Hello Matthias,
> thanks for your help, but it does not work correctly either, since it
> matches the first occurrence of </span> instead of the last.
>
> In case I did not explain it correctly: what I want to achieve is to
> match the first occurrence of <span SOMETHING "bbc_underline" with the last
> possible occurrence of </span> and replace everything within with $1.

If I understand you correctly, you want balanced matching. That is
actually possible in .NET - here's a post that explains how:
http://blogs.msdn.com/bclteam/archiv...15/396452.aspx

Let me know if that's what you need but you can't make it work - I don't
want to work on it unless it really helps you :-)
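
Python's re module has no counterpart to .NET balancing groups, so the sketch below swaps in a different technique that gives the same pairing of opening and closing tags: a single pass over the tags with an explicit stack. It is not the approach from the linked post, and the TAGS mapping and the name convert_balanced are hypothetical.

```python
import re

# Hypothetical class-to-BBCode mapping; "i" for bbc_italic is an assumption.
TAGS = {"bbc_underline": "u", "bbc_strikethrough": "s", "bbc_italic": "i"}

TOKEN = re.compile(r'<span\b[^>]*>|</span>')   # any opening or closing span tag
CLASS = re.compile(r'class="([^"]*)"')         # class attribute inside a tag

def convert_balanced(html: str) -> str:
    out = []     # pieces of the converted text
    stack = []   # closing text owed for each span that is still open
    pos = 0
    for m in TOKEN.finditer(html):
        out.append(html[pos:m.start()])        # text between tags, unchanged
        pos = m.end()
        token = m.group(0)
        if token != "</span>":
            cls = CLASS.search(token)
            tag = TAGS.get(cls.group(1)) if cls else None
            if tag:
                out.append(f"[{tag}]")
                stack.append(f"[/{tag}]")
            else:
                out.append(token)              # unknown span: keep the markup
                stack.append("</span>")
        elif stack:
            out.append(stack.pop())            # close the most recent open span
        else:
            out.append(token)                  # stray closing tag: keep as-is
    out.append(html[pos:])
    return "".join(out)

sample = ('this <span class="bbc_underline">is underlined '
          '<span class="bbc_strikethrough">and striked through</span>text.</span>')
print(convert_balanced(sample))
```

Because every closing tag is paired with the most recently opened span, the nesting depth does not matter, and spans with unknown classes pass through unchanged.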
Oliver Sturm
--
http://www.sturmnet.org/blog
Mar 5 '07 #5
