Why does re.sub('.*?', '-', 'abc') return '-a-b-c-' instead of '-------'?

This is the result from Python 2.7.

>>> re.sub('.*?', '-', 'abc')
'-a-b-c-'
The result I expected is as follows.

>>> re.sub('.*?', '-', 'abc')
'-------'
But it's not. Why?
Aug 1 '18 #1
Reference: Regular expression operations

Breaking down the expression
. - dot: matches any character except a newline
* - match zero or more repetitions of the preceding RE
such that "ab*" will match "a", "ab", "abb", "abbbb", etc.
? - match zero or one repetition of the preceding RE
such that "ab?" will match "a" or "ab"
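
As a quick check of the three pieces above, here is a small sketch using Python's re module. The sample strings and variable names are made up for illustration. re.match anchors at the start of the string, so .group() shows exactly what each pattern consumed.

```python
import re

# '.' matches any character except a newline, so '.*' stops at '\n'.
dot_star = re.match(r'.*', 'ab\ncd').group()

# '*' is zero or more of the preceding item:
# 'ab*' takes 'a' plus every 'b' that follows it.
star_many = re.match(r'ab*', 'abbbb').group()
star_zero = re.match(r'ab*', 'ac').group()

# '?' is zero or one of the preceding item:
# 'ab?' takes 'a' plus at most one 'b'.
opt_one = re.match(r'ab?', 'abbbb').group()
opt_zero = re.match(r'ab?', 'ac').group()

print(repr(dot_star))   # 'ab'
print(repr(star_many))  # 'abbbb'
print(repr(star_zero))  # 'a'
print(repr(opt_one))    # 'ab'
print(repr(opt_zero))   # 'a'
```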

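So we now have the construct *?
Adding the ? after the * makes the quantifier non-greedy (lazy).
That is to say, * by itself will match as many characters as possible; it's greedy.
*? is not greedy and matches as few as possible, which can be zero characters.
So for string="abc"; rex=".*?"; each match is as short as it can be: the first match is the empty string in front of the "a", never the whole "abc". In Python 2.7, re.sub replaces each of those empty matches with "-" and copies the unmatched characters through, which gives '-a-b-c-'.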
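
To see the greedy/lazy difference directly, a minimal sketch with Python's re module follows. The '<a><b>' sample string and the variable names are only for illustration. It also lists where '.*?' matches inside 'abc'. No expected output is given for the last two lines, because the re documentation notes that the handling of empty matches changed in Python 3.7, so a newer interpreter may print something different from the 2.7 result in the question.

```python
import re

text = 'abc'

# Greedy: '.*' grabs as much as it can from the start.
greedy = re.match(r'.*', text).group()       # 'abc'

# Lazy: '.*?' grabs as little as it can, and zero characters is allowed.
lazy = re.match(r'.*?', text).group()        # ''

# A lazy quantifier only grows when what follows it forces it to.
greedy_tag = re.match(r'<.*>', '<a><b>').group()   # '<a><b>'
lazy_tag = re.match(r'<.*?>', '<a><b>').group()    # '<a>'

# Every place where '.*?' matches in 'abc', as (start, end) spans.
# re.sub replaces these spans and copies everything else through.
spans = [m.span() for m in re.finditer(r'.*?', text)]
result = re.sub(r'.*?', '-', text)

print(spans)
print(result)
```

Comparing the printed spans with the printed result shows one "-" per match, which is a handy way to debug any pattern that can match the empty string.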

Change the expression to - with global and multi set true:
this should render to ' - - - '

I know that this will render "---" also with global and multi set true:

You need a good primer on Regular Expressions

I've found this one: https://www.rexegg.com/
It appears to cover the basics and a bit extra too.
Aug 6 '18 #2
