iterblocks cookbook example

Steve Howell

George Sakkis produced the following cookbook recipe,
which addresses a common problem that comes up on this
mailing list:

http://aspn.activestate.com/ASPN/Coo.../Recipe/521877

I would propose adding something like this to the
cookbook example above.

def iterblocks2(lst, start_delim):
    # This variation on iterblocks shows a more typical
    # implementation that behaves like iterblocks for
    # the Hello World example. The problem with this naive
    # implementation is that you cannot pass arbitrary
    # iterators.
    blocks = []
    new_block = []
    for item in lst:
        if start_delim(item):
            if new_block: blocks.append(new_block)
            new_block = []
        else:
            new_block.append(item)
    if new_block: blocks.append(new_block)
    return blocks

Comments welcome. This has been tested on George's
slow-version-of-string-split example. It treats the
delimiter as not being part of the block, and it punts
on the issue of what to do when you have empty blocks
(i.e. consecutive delimiters).
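
A generator is one possible way around the limitation mentioned in the code comment above. The sketch below is not part of George's recipe; the name iterblocks_lazy is made up for illustration. It keeps the same choices as iterblocks2 (delimiters are dropped, empty blocks are skipped) but yields each block as soon as it is complete, so it should also work on one-shot or very long iterators.

```python
def iterblocks_lazy(iterable, start_delim):
    """Yield lists of the items found between delimiter items."""
    block = []
    for item in iterable:
        if start_delim(item):
            # A delimiter closes the current block; empty blocks are skipped.
            if block:
                yield block
            block = []
        else:
            block.append(item)
    # Emit whatever is left after the last delimiter.
    if block:
        yield block


# The input here is a one-shot iterator rather than a list.
words = iter("a b | c | | d".split())
print(list(iterblocks_lazy(words, lambda w: w == "|")))
# prints [['a', 'b'], ['c'], ['d']]
```

Callers that want the old behaviour can wrap the call in list().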

Jun 2 '07 #1

Raymond Hettinger

On Jun 2, 10:19 am, Steve Howell <showel...@yahoo.com> wrote:

George Sakkis produced the following cookbook recipe,
which addresses a common problem that comes up on this
mailing list:

ISTM, this is a common mailing list problem because it is fun
to solve, not because people actually need it on a day-to-day basis.

In that spirit, it would be fun to compare several different
approaches to the same problem using re.finditer, itertools.groupby,
or the tokenize module. To get the ball rolling, here is one variant:

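from itertools import groupby

def blocks(s, start, end):
    # classify() keeps its state in the mutable default ingroup:
    # key 2 marks a start delimiter, 3 an end delimiter, 1 (True) an
    # item inside a block and 0 (False) an item outside one.
    def classify(c, ingroup=[0], delim={start:2, end:3}):
        result = delim.get(c, ingroup[0])
        ingroup[0] = result in (1, 2)
        return result
    return [tuple(g) for k, g in groupby(s, classify) if k == 1]

print(blocks('the <quick> brown <fox> jumped', start='<', end='>'))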

One observation is that groupby() is an enormously flexible tool.
Given a well-crafted key= function, it makes short work of almost
any data partitioning problem.
Raymond
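
As a small illustration of the key= idea, here is a sketch that splits lines into paragraphs. It is not code from this thread; the function name paragraphs is made up, and it assumes a paragraph is simply a run of consecutive non-blank lines.

```python
from itertools import groupby


def paragraphs(lines):
    """Group consecutive non-blank lines into paragraphs."""
    # The key is True for lines with visible text and False for blank
    # ones; groupby() starts a new group every time the key changes.
    for has_text, group in groupby(lines, key=lambda line: bool(line.strip())):
        if has_text:
            yield list(group)


text = "one\ntwo\n\nthree\n\n\nfour\n"
print(list(paragraphs(text.splitlines())))
# prints [['one', 'two'], ['three'], ['four']]
```

The key function here is stateless, which is the easy case; the classify() helper above shows that a key can also carry state from one item to the next.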

Jun 2 '07 #2

Gerard Flanagan

Can anyone suggest a function that will split text by paragraphs, but
NOT if the paragraphs are contained within a

...

construct. In other words, the following text should yield 3 blocks
not 6:

TEXT = '''
Lorem ipsum dolor sit amet, consectetuer adipiscing elit.
Pellentesque dolor quam, dignissim ornare, porta et,
auctor eu, leo. Phasellus malesuada metus id magna.

Only when flight shall soar
not for its own sake only
up into heaven's lonely
silence, and be no more

merely the lightly profiling,
proudly successful tool,
playmate of winds, beguiling
time there, careless and cool:

only when some pure Whither
outweighs boyish insistence
on the achieved machine

will who has journeyed thither
be, in that fading distance,
all that his flight has been.

Integer urna nulla, tempus sit amet, ultrices interdum,
rhoncus eget, ipsum. Cum sociis natoque penatibus et
magnis dis parturient montes, nascetur ridiculus mus.
'''

Other info:

* don't worry about nesting
* the

and

mustn't be stripped.

Gerard

Jun 4 '07 #3

Gerard Flanagan

(Sorry if I ruined the parent thread.) FWIW, I didn't get a groupby
solution but with some help from the Python Cookbook (O'Reilly), I
came up with the following:

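import re

# Raw strings avoid invalid-escape warnings. Inside a character class
# "|" is a literal, so [\w\s] is enough to match the tag name.
RE_START_BLOCK = re.compile(r'^\[[\w\s]*\]$')
RE_END_BLOCK = re.compile(r'^\[/[\w\s]*\]$')

def iter_blocks(lines):
    block = []
    inblock = False
    for line in lines:
        # not line.strip() also treats '' as blank, so lines that come
        # from str.splitlines() work as well as lines read from a file.
        if not line.strip():
            if inblock:
                block.append(line)
            elif block:
                yield block
                block = []
        else:
            if RE_START_BLOCK.match(line):
                inblock = True
            elif RE_END_BLOCK.match(line):
                inblock = False
            block.append(line.lstrip())
    if block:
        yield block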
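
For comparison, a groupby version along the lines Raymond suggested might look like the sketch below. It is only one possible approach: the tag name [verse] is a placeholder (the actual tag is not shown in the question above), iter_blocks_groupby and the state dict are made-up names, and unlike iter_blocks it leaves leading whitespace on each line untouched.

```python
import re
from itertools import groupby

# Hypothetical tag syntax: [name] opens a protected block, [/name] closes it.
START = re.compile(r'^\[\w[\w\s]*\]$')
END = re.compile(r'^\[/\w[\w\s]*\]$')


def iter_blocks_groupby(lines):
    """Split lines into paragraphs, keeping tagged blocks in one piece."""
    state = {'inblock': False, 'block_id': 0}

    def key(line):
        text = line.strip()
        if START.match(text):
            state['inblock'] = True
        elif END.match(text):
            state['inblock'] = False
        if not text and not state['inblock']:
            # A blank line outside a tagged block separates paragraphs.
            state['block_id'] += 1
            return None
        return state['block_id']

    for k, group in groupby(lines, key):
        if k is not None:
            yield list(group)


SAMPLE = """
First paragraph, line one.
First paragraph, line two.

[verse]
Only when flight shall soar
not for its own sake only

up into heaven's lonely
silence, and be no more
[/verse]

Last paragraph.
"""

result = list(iter_blocks_groupby(SAMPLE.splitlines()))
print(len(result))
# prints 3
```

The key function numbers the paragraphs as it goes, so every line of a paragraph (or of a whole tagged block, blank lines included) gets the same key and groupby keeps them together.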

Jun 4 '07 #4
