Split a string based on change of character

Python beginner here.

For a string 'ABBBCC', I want to produce a list ['A', 'BBB', 'CC'].
That is, break the string into pieces based on change of character.
What's the best way to do this in Python?

Using Python 2.5.1, I tried:

import re
s = re.split(r'(?<=(.))(?!\1)', 'ABBBCC')
for e in s: print e

but was surprised when it printed:

ABBBCC

I expected something like:

A
A
BBB
B
CC
C

(the extra fields because of the capturing parens).

Thanks,
/-\

Jul 29 '07 #1


On Jul 28, 9:46 pm, Andrew Savige <ajsav...@yahoo.com.au> wrote:
> Python beginner here.
>
> For a string 'ABBBCC', I want to produce a list ['A', 'BBB', 'CC'].
> That is, break the string into pieces based on change of character.
> What's the best way to do this in Python?
>
> Using Python 2.5.1, I tried:
>
> import re
> s = re.split(r'(?<=(.))(?!\1)', 'ABBBCC')
> for e in s: print e
>
> but was surprised when it printed:
>
> ABBBCC
>
> I expected something like:
>
> A
> A
> BBB
> B
> CC
> C
>
> (the extra fields because of the capturing parens).

Using itertools:

import itertools

s = 'ABBBCC'
print [''.join(grp) for key, grp in itertools.groupby(s)]

Using re:

import re

pat = re.compile(r'((\w)\2*)')
print [t[0] for t in re.findall(pat, s)]
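
For what it's worth, both approaches can be wrapped in small functions. This is only a sketch: the names split_runs_groupby and split_runs_regex are made up for illustration, and the regex variant swaps \w for . with re.DOTALL so that punctuation, spaces and newlines also form runs (the \w version silently skips non-word characters). The print calls are written with parentheses so the snippet runs on Python 3 as well as Python 2.

```python
import itertools
import re


def split_runs_groupby(s):
    # With no key function, groupby groups consecutive equal characters.
    return [''.join(grp) for _, grp in itertools.groupby(s)]


# (.) captures any single character and \2* extends over repeats of it.
# DOTALL lets '.' match a newline too. This differs from the \w pattern
# above, which only matches word characters.
_RUN = re.compile(r'((.)\2*)', re.DOTALL)


def split_runs_regex(s):
    # findall returns (whole_run, first_char) tuples; keep the whole run.
    return [run for run, _ in _RUN.findall(s)]


print(split_runs_groupby('ABBBCC'))
print(split_runs_regex('ABBBCC'))
print(split_runs_regex('A--B  C'))
```

The groupby version is probably the one to reach for first, since it also works on any iterable and needs no pattern at all.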

By the way, your pattern seems to work in Perl:

$ perl -le '$, = " "; print split(/(?<=(.))(?!\1)/, "ABBBCC");'
A A BBB B CC C

Was that the type of regular expressions you were expecting?
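
As for why Python printed the string unsplit: the pattern is made up entirely of lookarounds, so it only ever matches the empty string, and re.split in Python 2.5 does not split on empty matches (that changed in Python 3.7, where it does). A rough sketch that avoids depending on re.split's behaviour is to locate the zero-width boundaries with finditer and slice by hand. The helper name boundary_split is hypothetical, and re.DOTALL is an addition so that newlines count as characters too.

```python
import re

# Zero-width match wherever the preceding character differs from the
# following one, or where the string ends. Same pattern as in the
# question, compiled with DOTALL (an assumption, not in the original).
BOUNDARY = re.compile(r'(?<=(.))(?!\1)', re.DOTALL)


def boundary_split(s):
    pieces = []
    start = 0
    # finditer does report empty matches, unlike re.split in Python 2.5.
    for m in BOUNDARY.finditer(s):
        pieces.append(s[start:m.start()])
        start = m.start()
    # Defensive: keep any tail left after the last boundary.
    if start < len(s):
        pieces.append(s[start:])
    return pieces


print(boundary_split('ABBBCC'))
```

Unlike the Perl split, this drops the captured single characters and returns only the runs.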

--
Hope this helps,
Steven
Jul 29 '07 #2
