trying to find repeated substrings with regular expression

Hello all,

I'm trying to find substrings that look like 'FOO blah blah blah'
in a string. For example, given 'blah FOO blah1a blah1b FOO blah2
FOO blah3a blah3b blah3b' I want to get three substrings,
'FOO blah1a blah1b', 'FOO blah2', and 'FOO blah3a blah3b blah3b'.

I've tried numerous variations on '.*(FOO((?!FOO).)*)+.*'
and everything I've tried either matches too much or too little.

I've decided it's easier for me just to search for FOO, and then
break up the string based on the locations of FOO.

But I'd like to better understand regular expressions.
Can someone suggest a regular expression which will return
groups corresponding to the FOO substrings above?

Thanks for any insights, I appreciate it a lot.

Robert Dodier

Mar 13 '06 #1
Robert Dodier wrote:
Hello all,

I'm trying to find substrings that look like 'FOO blah blah blah'
in a string. For example, given 'blah FOO blah1a blah1b FOO blah2
FOO blah3a blah3b blah3b' I want to get three substrings,
'FOO blah1a blah1b', 'FOO blah2', and 'FOO blah3a blah3b blah3b'. [...] Can someone suggest a regular expression which will return
groups corresponding to the FOO substrings above?


FOO.*?(?=(?:FOO|$))
--
Giovanni Bajo
Mar 13 '06 #2
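A minimal sketch of how the lookahead pattern from this reply might be used from Python, assuming the sample string from the question and writing the pattern without embedded spaces. The names `text`, `pattern` and `matches` are illustrative and not part of the original reply.

```python
import re

# Sample string taken from the original question.
text = "blah FOO blah1a blah1b FOO blah2 FOO blah3a blah3b blah3b"

# Lazily match from each FOO up to, but not including, the next FOO
# or the end of the string. The lookahead consumes no characters.
pattern = re.compile(r"FOO.*?(?=(?:FOO|$))")

# The pattern has no capturing groups, so findall returns whole matches.
matches = pattern.findall(text)
print(matches)
```

Because the lookahead stops right before the next FOO, the separating space stays attached to the end of each match; calling `rstrip()` on each item removes it if that matters.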
Robert Dodier wrote:
Hello all,

I'm trying to find substrings that look like 'FOO blah blah blah'
in a string. For example, given 'blah FOO blah1a blah1b FOO blah2
FOO blah3a blah3b blah3b' I want to get three substrings,
'FOO blah1a blah1b', 'FOO blah2', and 'FOO blah3a blah3b blah3b'.

I've tried numerous variations on '.*(FOO((?!FOO).)*)+.*'
and everything I've tried either matches too much or too little.

FOO(.*?)(?=FOO|$)

I've decided it's easier for me just to search for FOO, and then
break up the string based on the locations of FOO.


Use re.split() for this.

Kent
Mar 13 '06 #3
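The re.split() suggestion in the reply above is not spelled out, so the following is only one possible reading of it: split on the literal FOO, drop whatever precedes the first occurrence, and put the marker back on each piece. The variable names are hypothetical.

```python
import re

# Sample string taken from the original question.
text = "blah FOO blah1a blah1b FOO blah2 FOO blah3a blah3b blah3b"

# Splitting on FOO removes the marker itself; parts[0] is the text
# before the first FOO and is not wanted here.
parts = re.split(r"FOO", text)

# Re-attach the marker and trim the whitespace left at the end of each piece.
substrings = ["FOO" + tail.rstrip() for tail in parts[1:]]
print(substrings)
```

On Python 3.7 and later, re.split() can also split on a zero-width pattern such as `(?=FOO)`, which keeps the marker in place; older versions do not split on empty matches, so the sketch above avoids relying on that.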
Robert Dodier wrote:
I've decided it's easier for me just to search for FOO, and then
break up the string based on the locations of FOO.

But I'd like to better understand regular expressions.


Those who cannot learn regular expressions are doomed to repeat string
searches. Which is not such a bad thing.

txt = "blah FOO blah1a blah1b FOO blah2 FOO blah3a blah3b blah3b"

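def fa(s, pat):
    # Collect every substring that starts at an occurrence of pat,
    # scanning from the right so each slice ends where the next begins.
    retlist = []
    try:
        while True:
            i = s.rindex(pat)
            retlist.insert(0, s[i:])
            s = s[:i]
    except ValueError:
        # rindex raises ValueError once no occurrence is left.
        return retlist

print(fa(txt, "FOO"))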

Mar 13 '06 #4
[Robert Dodier]
I'm trying to find substrings that look like 'FOO blah blah blah'
in a string. For example, given 'blah FOO blah1a blah1b FOO blah2
FOO blah3a blah3b blah3b' I want to get three substrings,
'FOO blah1a blah1b', 'FOO blah2', and 'FOO blah3a blah3b blah3b'.

No need for regular expressions on this one:

>>> s = 'blah FOO blah1a blah1b FOO blah2 FOO blah3a blah3b blah3b'
>>> ['FOO' + tail for tail in s.split('FOO')[1:]]
['FOO blah1a blah1b ', 'FOO blah2 ', 'FOO blah3a blah3b blah3b']


I've tried numerous variations on '.*(FOO((?!FOO).)*)+.*'
and everything I've tried either matches too much or too little.


The regular expression way is to find the target phrase followed by any
text followed by either the target phrase or the end of the string. The
first two are in a group; the last is only checked by a lookahead, so it
is not included in the result group. The any-text section is non-greedy:

>>> import re
>>> re.findall('(FOO.*?)(?=FOO|$)', s)
['FOO blah1a blah1b ', 'FOO blah2 ', 'FOO blah3a blah3b blah3b']

Raymond

Mar 14 '06 #5
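For readers wondering why the pattern from the question matches too little, here is a small sketch, assuming the attempt was meant to be written without embedded spaces. It compares that attempt with a trimmed-down variant that keeps the same `(?!FOO).` idea; the variable names are illustrative only.

```python
import re

# Sample string taken from the original question.
text = "blah FOO blah1a blah1b FOO blah2 FOO blah3a blah3b blah3b"

# The attempt from the question: the greedy leading .* swallows the earlier
# FOO sections, and a repeated group only remembers its last repetition.
attempt = re.match(r".*(FOO((?!FOO).)*)+.*", text)
print(attempt.group(1))

# Without the surrounding .*, findall returns every FOO section.
# (?:(?!FOO).)* consumes one character at a time while FOO does not start there.
sections = re.findall(r"FOO(?:(?!FOO).)*", text)

# Trim the separating space left at the end of each section.
trimmed = [section.rstrip() for section in sections]
print(trimmed)
```

The non-capturing `(?:...)` matters with findall: with capturing groups it would return the group contents instead of the whole match.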
