473,397 Members | 2,028 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,397 software developers and data experts.

finding indices in a sequence of parentheses

I have a list of strings that looks something like:

lst = ['0', '0', '(*)', 'O', '(*', '*', '(*', '*))', '((*', '*)', '*)']

The parentheses in the labels indicate where an "annotation" starts and
ends. So for example, the label '(*)' at index 2 of the list means that
I have an annotation at (2, 2), and the labels '(*', '*', '(*', '*))' at
indices 4 through 7 mean that I have an annotation at (4, 7) and an
annotation at (6, 7).

I'd like to determine all indices at which I have an annotation. So for
the data above, I want the indices:

(2, 2), (4, 7), (6, 7), (8, 9) and (8, 10)

Here's what I'm doing now:

py> def indices(lst):
.... stack = []
.... for i, s in enumerate(lst):
.... if s == 'O':
.... continue
.... stack.extend([i]*s.count('('))
.... if '*' in s and not stack:
.... raise Exception('No start for %r at %i' % (s, i))
.... for _ in range(s.count(')')):
.... try:
.... yield stack.pop(), i
.... except IndexError:
.... raise Exception('No start for %r at %i' % (s, i))
.... if stack:
.... raise Exception('No ends for starts at %r' % stack)
....
py> list(indices(['0', '0', '(*)', 'O', '(*', '*', '(*', '*))', '((*',
'*)', '*)', '0']))
[(2, 2), (6, 7), (4, 7), (8, 9), (8, 10)]

I think that works right, but I'm not certain. So two questions:

(1) Can anyone see anything wrong with the code above? and
(2) Does anyone see an easier/clearer/simpler[1] way of doing this?

Thanks,

STeVe

[1] Yes, I know easier/clearer/simpler are subjective terms. It's okay,
I'm only looking for opinions here anyway. =)
Jul 19 '05 #1
8 1582
Steven Bethard wrote:
(2) Does anyone see an easier/clearer/simpler[1] way of doing this?


I'd personnally extract the parenthesis then zip the lists of indices.
In short:
def indices(mylist): ... lopen=reduce(list.__add__, [[i]*s.count('(') for i,s in
enumerate(mylist)],[])
... lclose=reduce(list.__add__, [[i]*s.count(')') for i,s in
enumerate(mylist)],[])
... return zip(lopen,lclose)
... indices(lst) [(2, 2), (4, 7), (6, 7), (8, 9), (8, 10)]


Before returning, you can check if the lists have same size and if the
'(' has lower or equal index than ')' in each of these couples. If not
you can raise the appropriate exception.

Disclaimer: not tested further than example above (but confident).
Jul 19 '05 #2
tiissa wrote:
I'd personnally extract the parenthesis then zip the lists of indices.
In short:
>>> def indices(mylist): ... lopen=reduce(list.__add__, [[i]*s.count('(') for i,s in
enumerate(mylist)],[])
... lclose=reduce(list.__add__, [[i]*s.count(')') for i,s in
enumerate(mylist)],[])
... return zip(lopen,lclose)
... >>> indices(lst) [(2, 2), (4, 7), (6, 7), (8, 9), (8, 10)] >>>


Thanks, that's a good idea. In case anyone else is reading this thread,
and had to mentally unwrap the reduce expressions, I believe they could
be written as:

lopen = [x for i, s in enumerate(lst) for x in [i]*s.count('(')]
lclose = [x for i, s in enumerate(lst) for x in [i]*s.count(')')]

or maybe:

lopen = [i for i, s in enumerate(lst) for _ in xrange(s.count('('))]
lclose = [i for i, s in enumerate(lst) for _ in xrange(s.count(')'))]

Sorry, I have an irrational fear of reduce. ;)

STeVe
Jul 19 '05 #3
[Steven Bethard]
I have a list of strings that looks something like:
lst = ['0', '0', '(*)', 'O', '(*', '*', '(*', '*))', '((*', '*)', '*)'] . . . I want the indices:
(2, 2), (4, 7), (6, 7), (8, 9) and (8, 10)


opener_stack = []
for i, elem in enumerate(lst):
for c in elem:
if c == '(':
opener_stack.append(i)
elif c == ')':
print opener_stack.pop(), i
To see something like this in production code, look at
Tools/scripts/texcheck.py

Raymond Hettinger

Jul 19 '05 #4
tiissa wrote:
Steven Bethard wrote:
(2) Does anyone see an easier/clearer/simpler[1] way of doing this?


I'd personnally extract the parenthesis then zip the lists of indices.
In short:
>>> def indices(mylist): ... lopen=reduce(list.__add__, [[i]*s.count('(') for i,s in
enumerate(mylist)],[])
... lclose=reduce(list.__add__, [[i]*s.count(')') for i,s in
enumerate(mylist)],[])
... return zip(lopen,lclose)
... >>> indices(lst) [(2, 2), (4, 7), (6, 7), (8, 9), (8, 10)] >>>


Before returning, you can check if the lists have same size and if the
'(' has lower or equal index than ')' in each of these couples. If not
you can raise the appropriate exception.

Disclaimer: not tested further than example above (but confident).


Not tested but confident should be an oxymoron for a programmer. Some
examples:

lst: ['(', '(', ')', ')']
hettinger [(1, 2), (0, 3)]
bethard [(1, 2), (0, 3)]
tiissa [(0, 2), (1, 3)] oops (or am I just spoilt by the XML spec?)

lst: ['(', ')(', ')']
hettinger [(0, 1), (1, 2)]
bethard [(1, 1), (0, 2)] oops
tiissa [(0, 1), (1, 2)]

So far Raymond's solution is my favourite...

Peter
Jul 19 '05 #5
Peter Otten wrote:
tiissa wrote:
Disclaimer: not tested further than example above (but confident).


Not tested but confident should be an oxymoron for a programmer.


Not tested but confident is an oxymoron for mathemtaticians.
Programmers know better than that, they leave bugs in their code to have
more work to do. ;)

OTOH, you're right. Matching parentheses cannot be done without a stack,
that should have rung a bell.
Jul 19 '05 #6
tiissa wrote:
Not tested but confident is an oxymoron for mathemtaticians.


I think no amount of testing will give these strange people confidence.
"Proof" is the magic word here.

Peter

Jul 19 '05 #7
Peter Otten wrote:
I think no amount of testing will give these strange people confidence.
"Proof" is the magic word here.


Some would maybe be satisfied if your tests cover the whole set of input.

When that's possible, that may be useless. But that's not a matter to
bother them with. ;)

(And of course, you just don't tell them how you generated their output
with the same program.)
Jul 19 '05 #8
Raymond Hettinger wrote:
[Steven Bethard]
I have a list of strings that looks something like:
lst = ['0', '0', '(*)', 'O', '(*', '*', '(*', '*))', '((*', '*)', '*)']
. . .
I want the indices:
(2, 2), (4, 7), (6, 7), (8, 9) and (8, 10)


opener_stack = []
for i, elem in enumerate(lst):
for c in elem:
if c == '(':
opener_stack.append(i)
elif c == ')':
print opener_stack.pop(), i


Thanks Raymond, this is definitely an elegant solution. It was also easy
to add all the error checking I needed into this one. For the curious,
my final solution looks something like:

def indices(lst):
stack = []
for i, elem in enumerate(lst):
for c in elem:
if c == '(':
stack.append(i)
elif c == ')' and not stack:
raise Exception('")" at %i without "("' % i)
elif c == ')':
yield stack.pop(), i
elif c == 'O' and stack:
raise Exception('"O" at %i after "(" at %i' %
(i, stack[-1]))
elif c == '*' and not stack:
raise Exception('"*" at %i without "("' % i)
if stack:
raise Exception('"(" at %r without ")"' % stack)

Thanks again!

STeVe
Jul 19 '05 #9

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

8
by: dan | last post by:
I was recently surprised, and quite shocked in fact, to find that Python treats negative indices into sequence types as if they were mod(length-of-sequence), at least up to -len(seq). This fact...
5
by: andrea.gavana | last post by:
Hello NG, I was wondering if there is a faster/nicer method (than a for loop) that will allow me to find the elements (AND their indices) in a list that verify a certain condition. For example,...
5
by: Ross MacGregor | last post by:
I have a very simple yet complicated problem. I want to generate a random list of indices (int's) for a container. Let's say I have a container with 10 items and I want a list of 3 random...
5
by: brettr | last post by:
When I reference document.cookie, there is a long string of key=value; pairs listed. I may have 100 hundred cookies on my hard drive. However, most only have one key=value pair. Does the...
3
by: GavinCrooks | last post by:
The indices method of slice doesn't seem to work quite how I would expect when reversing a sequence. For example : '43210' '43210' So a slice with a negative step (and nothing else) reverses...
4
by: deLenn | last post by:
Hi, Does scipy have an equivalent to Matlab's 'find' function, to list the indices of all nonzero elements in a sparse matrix? Cheers.
3
by: mmu2643 | last post by:
Hi, I had a noobie question: i = 0; while (len i) && (SOME_VAL != arr ) i++; In the above snippet if len is the number of elements in arr, when i == len, since the first condition fails,...
1
by: watershed | last post by:
In PHP4, how would I go about finding *every* instance of text (max of 14 characters) in a string that is placed within parentheses? And then loop through each instance and update the string...
34
by: samjnaa | last post by:
This is like the previous one. Please check for sanity and approve for posting at python-dev. I would like to have something like "option base" in Visual Basic. IIRC it used to allow me to...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.