473,800 Members | 2,711 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

iterblocks cookbook example

George Sakkis produced the following cookbook recipe,
which addresses a common problem that comes up on this
mailing list:

http://aspn.activestate.com/ASPN/Coo.../Recipe/521877
I would propose adding something like this to the
cookbook example above.

def iterblocks2(lst , start_delim):
# This variation on iterblocks shows a more
typical
# implementation that behaves like iterblocks for
# the Hello World example. The problem with this
naive
# implementation is that you cannot pass arbitrary
# iterators.
blocks = []
new_block = []
for item in lst:
if start_delim(ite m):
if new_block: blocks.append(n ew_block)
new_block = []
else:
new_block.appen d(item)
if new_block: blocks.append(n ew_block)
return blocks

Comments welcome. This has been tested on George's
slow-version-of-string-split example. It treates the
delimiter as not being part of the block, and it punts
on the issue of what to do when you have empty blocks
(i.e. consecutive delimiters).



_______________ _______________ _______________ _______________ _______________ _________
Be a better Heartthrob. Get better relationship answers from someone who knows. Yahoo! Answers - Check it out.
http://answers.yahoo.com/dir/?link=list&sid=396545433
Jun 2 '07 #1
3 1483
On Jun 2, 10:19 am, Steve Howell <showel...@yaho o.comwrote:
George Sakkis produced the following cookbook recipe,
which addresses a common problem that comes up on this
mailing list:
ISTM, this is a common mailing list problem because it is fun
to solve, not because people actually need it on a day-to-day basis.

In that spirit, it would be fun to compare several different
approaches to the same problem using re.finditer, itertools.group by,
or the tokenize module. To get the ball rolling, here is one variant:

from itertools import groupby

def blocks(s, start, end):
def classify(c, ingroup=[0], delim={start:2, end:3}):
result = delim.get(c, ingroup[0])
ingroup[0] = result in (1, 2)
return result
return [tuple(g) for k, g in groupby(s, classify) if k == 1]

print blocks('the <quickbrown <foxjumped', start='<', end='>')

One observation is that groupby() is an enormously flexible tool.
Given a well crafted key= function, it makes short work of almost
any data partitioning problem.
Raymond

Jun 2 '07 #2
On Jun 2, 10:47 pm, Raymond Hettinger <pyt...@rcn.com wrote:
On Jun 2, 10:19 am, Steve Howell <showel...@yaho o.comwrote:
George Sakkis produced the following cookbook recipe,
which addresses a common problem that comes up on this
mailing list:

ISTM, this is a common mailing list problem because it is fun
to solve, not because people actually need it on a day-to-day basis.

In that spirit, it would be fun to compare several different
approaches to the same problem using re.finditer, itertools.group by,
or the tokenize module. To get the ball rolling, here is one variant:

from itertools import groupby

def blocks(s, start, end):
def classify(c, ingroup=[0], delim={start:2, end:3}):
result = delim.get(c, ingroup[0])
ingroup[0] = result in (1, 2)
return result
return [tuple(g) for k, g in groupby(s, classify) if k == 1]

print blocks('the <quickbrown <foxjumped', start='<', end='>')

One observation is that groupby() is an enormously flexible tool.
Given a well crafted key= function, it makes short work of almost
any data partitioning problem.
Can anyone suggest a function that will split text by paragraphs, but
NOT if the paragraphs are contained within a
...
construct. In other words, the following text should yield 3 blocks
not 6:

TEXT = '''
Lorem ipsum dolor sit amet, consectetuer adipiscing elit.
Pellentesque dolor quam, dignissim ornare, porta et,
auctor eu, leo. Phasellus malesuada metus id magna.

Only when flight shall soar
not for its own sake only
up into heaven's lonely
silence, and be no more

merely the lightly profiling,
proudly successful tool,
playmate of winds, beguiling
time there, careless and cool:

only when some pure Whither
outweighs boyish insistence
on the achieved machine

will who has journeyed thither
be, in that fading distance,
all that his flight has been.
Integer urna nulla, tempus sit amet, ultrices interdum,
rhoncus eget, ipsum. Cum sociis natoque penatibus et
magnis dis parturient montes, nascetur ridiculus mus.
'''

Other info:

* don't worry about nesting
* the
and
musn't be stripped.

Gerard

Jun 4 '07 #3
On Jun 4, 1:52 pm, Gerard Flanagan <grflana...@yah oo.co.ukwrote:
On Jun 2, 10:47 pm, Raymond Hettinger <pyt...@rcn.com wrote:
On Jun 2, 10:19 am, Steve Howell <showel...@yaho o.comwrote:
George Sakkis produced the following cookbook recipe,
which addresses a common problem that comes up on this
mailing list:
ISTM, this is a common mailing list problem because it is fun
to solve, not because people actually need it on a day-to-day basis.
In that spirit, it would be fun to compare several different
approaches to the same problem using re.finditer, itertools.group by,
or the tokenize module. To get the ball rolling, here is one variant:
from itertools import groupby
def blocks(s, start, end):
def classify(c, ingroup=[0], delim={start:2, end:3}):
result = delim.get(c, ingroup[0])
ingroup[0] = result in (1, 2)
return result
return [tuple(g) for k, g in groupby(s, classify) if k == 1]
print blocks('the <quickbrown <foxjumped', start='<', end='>')
One observation is that groupby() is an enormously flexible tool.
Given a well crafted key= function, it makes short work of almost
any data partitioning problem.

Can anyone suggest a function that will split text by paragraphs, but
NOT if the paragraphs are contained within a
...
construct. In other words, the following text should yield 3 blocks
not 6:

TEXT = '''
Lorem ipsum dolor sit amet, consectetuer adipiscing elit.
Pellentesque dolor quam, dignissim ornare, porta et,
auctor eu, leo. Phasellus malesuada metus id magna.

Only when flight shall soar
not for its own sake only
up into heaven's lonely
silence, and be no more

merely the lightly profiling,
proudly successful tool,
playmate of winds, beguiling
time there, careless and cool:

only when some pure Whither
outweighs boyish insistence
on the achieved machine

will who has journeyed thither
be, in that fading distance,
all that his flight has been.

Integer urna nulla, tempus sit amet, ultrices interdum,
rhoncus eget, ipsum. Cum sociis natoque penatibus et
magnis dis parturient montes, nascetur ridiculus mus.
'''

Other info:

* don't worry about nesting
* the
and
musn't be stripped.

Gerard
(Sorry if I ruined the parent thread.) FWIW, I didn't get a groupby
solution but with some help from the Python Cookbook (O'Reilly), I
came up with the following:

import re

RE_START_BLOCK = re.compile('^\[[\w|\s]*\]$')
RE_END_BLOCK = re.compile('^\[/[\w|\s]*\]$')

def iter_blocks(lin es):
block = []
inblock = False
for line in lines:
if line.isspace():
if inblock:
block.append(li ne)
elif block:
yield block
block = []
else:
if RE_START_BLOCK. match(line):
inblock = True
elif RE_END_BLOCK.ma tch(line):
inblock = False
block.append(li ne.lstrip())
if block:
yield block

Jun 4 '07 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
1322
by: Ringwraith | last post by:
Hello! I want to ask You the question about the licence of ASPN online Python Cookbook recipes. Under what licence are those recipes. If I want to use in my application some parts of the code from some recipe, should I place a statement saying who the author is, etc.? Best Regards, Niki
0
1176
by: Raaijmakers, Vincent \(GE Infrastructure\) | last post by:
My web server (apache+mod_python) needs to reply on some session with a multipart connection. My idea is to use the content-type of multipart/mixed; boundary=--myboundary The data that I would like to send is a continuous stream of data, divided by that boundary. Well, there are tons of examples on the web how to encode such a stream on the client side, but I can't find an example how to generate this data using mod_python. So for now...
0
1616
by: Alex Martelli | last post by:
Greetings, fellow Pythonistas! We (Alex Martelli, David Ascher and Anna Martelli Ravenscroft) are in the process of selecting recipes for the Second Edition of the Python Cookbook. Please contribute your recipes (code and discussion), along with comments on and ratings of existing recipes, to the cookbook site, http://aspn.activestate.com/ASPN/Cookbook/Python , and do it *now*! The Python Cookbook is a collaborative collection of your...
2
1605
by: magoldfish | last post by:
Hi, I've installed Python 2.4 on Windows XP and walked through the Alex Martelli ASPN cookbook example at: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/66509 This is a recipe for a simple extension type for Python. When I try to build and install it, however, I get an error:
0
1224
by: TechBookReport | last post by:
TechBookReport (http://www.techbookreport.com) has just published a review of the Python Cookbook. This is an extract from the full review: We're big fans of cookbooks here at TechBookReport, whether its Java, XSLT or Linux, they're a great way of pulling together lots of useful snippets of code and technique in one place. For the beginner they provide instant advice, usable code and a way into new areas. They're also a great way to find...
0
2984
by: Frederick Noronha \(FN\) | last post by:
---------- Forwarded message ---------- Solutions to Everyday User Interface and Programming Problems O'Reilly Releases "Access Cookbook, Second Edition" Sebastopol, CA--Neither reference book nor tutorial, "Access Cookbook, Second Edition" (O'Reilly, US $49.95), by Ken Getz, Paul Litwin, and Andy Baron, delivers hundreds of practical examples, up-to-date suggestions, and handy solutions to real-world problems that Access users and...
4
1743
by: jonas | last post by:
Hi, After a search on http://pleac.sourceforge.net/pleac_c++/index.html why C++/STL/Boost have a low %? I have also wondered about this, being a newbie. Is it mostly because: 1) All the easy tasks have been accomplished?
232
13363
by: robert maas, see http://tinyurl.com/uh3t | last post by:
I'm working on examples of programming in several languages, all (except PHP) running under CGI so that I can show both the source files and the actually running of the examples online. The first set of examples, after decoding the HTML FORM contents, merely verifies the text within a field to make sure it is a valid representation of an integer, without any junk thrown in, i.e. it must satisfy the regular expression: ^ *?+ *$ If the...
4
1484
by: kj | last post by:
I'm looking for "example implementations" of small projects in Python, similar to the ones given at the end of most chapters of The Perl Cookbook (2nd edition, isbn: 0596003137). (Unfortunately, the otherwise excellent Python Cookbook (2nd edition, isbn: 0596007973), by the same publisher (O'Reilly), does not have this great feature.) The subchapters devoted to these small projects (which are called "Program"s in the book), each...
0
9691
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
10507
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10036
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
9092
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
0
6815
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5473
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
5607
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
2
3765
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2948
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.