How to write Regular Expression for recursive matching?

Hi All,

I have a problem splitting a string like this:

'abc.defg.hij.klmnop'

and I want to get all substrings with exactly one '.' in the middle, so the
output I expect is:

'abc.defg', 'defg.hij', 'hij.klmnop'

A simple regular expression, '\w+\.\w+', will only return:
'abc.defg', 'hij.klmnop'

Is there a way to get 'defg.hij' using a regular expression?

Thanks,

Nov 26 '07 #1
Why are you using regular expressions? Use the split method defined
for strings:
>>> 'abc.defg.hij.klmnop'.split('.')
['abc', 'defg', 'hij', 'klmnop']

-- Paul
Nov 26 '07 #2
lisong wrote:
> Is there a way to get 'defg.hij' using a regular expression?

Nope. A regular expression match consumes its input, so successive matches
can't overlap, at least not with a plain pattern like this one.

The problem at hand is easily solved using

s = 'abc.defg.hij.klmnop'

pairs = [".".join(v) for v in zip(s.split(".")[:-1], s.split(".")[1:])]
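
As a side note for readers who do want a regular expression: overlapping matches can be collected by wrapping the pattern in a zero-width lookahead with a capturing group, because the lookahead itself consumes no input. The following is only a sketch using Python's standard re module. The function name overlapping_pairs is illustrative, and the pattern \w+\.\w+ assumes the dot in the original pattern was meant literally.

```python
import re

def overlapping_pairs(s):
    # A lookahead (?=...) matches without consuming input, so the scan can
    # restart inside text that an earlier match already covered.
    # The inner group captures the pair; \b keeps matches aligned to whole words.
    return re.findall(r'(?=\b(\w+\.\w+))', s)

s = 'abc.defg.hij.klmnop'
print(re.findall(r'\w+\.\w+', s))   # non-overlapping: ['abc.defg', 'hij.klmnop']
print(overlapping_pairs(s))         # ['abc.defg', 'defg.hij', 'hij.klmnop']
```

The split-and-zip approach above is still the simpler choice; the lookahead mainly shows why the plain pattern skips 'defg.hij'.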

Diez
Nov 26 '07 #3
Sorry, misread your post - Diez Roggisch has the right answer.

-- Paul
Nov 26 '07 #4
Do you need it to be a regular expression?
>>> def f(s):
...     ws = s.split('.')
...     # list() is needed on Python 3, where map() returns a lazy iterator
...     return list(map('.'.join, zip(ws, ws[1:])))
...
>>> f('abc.defg.hij.klmnop')
['abc.defg', 'defg.hij', 'hij.klmnop']

Nov 26 '07 #5
On Mon, Nov 26, 2007 at 06:04:54PM +0100, Diez B. Roggisch wrote:
> The problem at hand is easily solved using
>
> s = 'abc.defg.hij.klmnop'
>
> pairs = [".".join(v) for v in zip(s.split(".")[:-1], s.split(".")[1:])]

which is veritably perlesque in its elegance and simplicity!

A slightly more verbose version:

parts = s.split('.')
pairs = []
for i in range(len(parts) - 1):  # range() replaces Python 2's xrange()
    pairs.append('.'.join(parts[i:i+2]))
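
The slice in the loop above generalizes to windows of any width. A rough sketch, assuming a hypothetical helper named dotted_windows with a width parameter n and a separator parameter sep (none of these names appear in the thread):

```python
def dotted_windows(s, n=2, sep='.'):
    # Split once, then join every run of n consecutive parts.
    parts = s.split(sep)
    # When n exceeds the number of parts, the range is empty and so is the result.
    return [sep.join(parts[i:i + n]) for i in range(len(parts) - n + 1)]

print(dotted_windows('abc.defg.hij.klmnop'))      # ['abc.defg', 'defg.hij', 'hij.klmnop']
print(dotted_windows('abc.defg.hij.klmnop', 3))   # ['abc.defg.hij', 'defg.hij.klmnop']
```

With n=2 this produces the same pairs as the loop; larger n values are an extension beyond what the original question asked for.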

Cheers,
Cliff
Nov 26 '07 #6
Thank you all for your kind replies. I agree, a regular expression is not necessary here.

Song
Nov 26 '07 #7
