advice : how do you iterate with an acc ?

vd12005

hello,

i'm wondering how people from here handle this, as i often encounter
something like:

acc = []  # accumulator ;)
for line in fileinput.input():
    if condition(line):
        if acc:  #1
            doSomething(acc)  #1
        acc = []
    else:
        acc.append(line)
if acc:  #2
    doSomething(acc)  #2

BTW i am particularly annoyed by #1 and #2 as it is a repetition, and i
think it is quite error prone. how would you do it in a pythonic way ?

regards

Dec 3 '05 #1


bonono

I think it is quite ok as the last "if acc:" is just an "end-of-stream"
implicit marker, whereas during the loop, you have explicit markers to
signal end/start of blocks. There is no unwanted variable introduced
and I don't see how it can be error prone.

This is one of the cases where I wouldn't try to make it a one-liner,
because it is already very natural :-)

Dec 3 '05 #2

Bengt Richter

It looks to me like itertools.groupby could get you close to what you want,
e.g., (untested)

import fileinput
import itertools
for condresult, acciter in itertools.groupby(fileinput.input(), condition):
    if not condresult:
        doSomething(list(acciter))  # or doSomething(acciter) if an iterator is usable

IOW, groupby collects contiguous lines for which condition evaluates to the same
value. Assuming condition is a function that returns only two distinct values (for true
and false, like True and False), then if I understand your program's logic, you
do nothing with the line(s) that actually satisfy the condition, you just trigger
on them as delimiters and want to process the nonempty groups of the other lines,
so the "if not condresult:" should select those. Groupby won't return an empty group AFAIK,
so you don't need to test for that. Also, you won't need the list call in list(acciter)
if your doSomething can accept an iterator instead of a list.
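
For anyone reading this later, here is a minimal Python 3 sketch of the groupby idea above, wrapped as a generator. The names blocks and is_delimiter and the sample lines are made up for illustration. The bool() call is an extra precaution: groupby compares key values with ==, so a condition that returns assorted truthy values would otherwise split one run of delimiters into several groups.

```python
from itertools import groupby

def blocks(lines, is_delimiter):
    """Yield each run of consecutive non-delimiter lines as a list."""
    # bool() collapses any truthy/falsy result to exactly two key values
    for matched, group in groupby(lines, key=lambda line: bool(is_delimiter(line))):
        if not matched:
            yield list(group)

sample = ["a", "b", "---", "---", "c", "---"]
result = list(blocks(sample, lambda line: line == "---"))
print(result)  # [['a', 'b'], ['c']]
```

Repeated or trailing delimiters need no special handling here, because groupby never produces an empty group.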

Regards,
Bengt Richter

Dec 3 '05 #3

Dan Sommers

If doSomething handled an empty list gracefully, then you would have
less repetition:

acc = []
for line in fileinput.input():
    if condition(line):
        doSomething(acc)  #1
        acc = []
    else:
        acc.append(line)
doSomething(acc)  #2

If condition were simple enough and the file(s) small enough, perhaps
you could read the whole file at once and use split to separate the
pieces:

contents = file.read()
for acc in contents.split( "this is the delimiter line\n" ):
    doSomething(acc.split("\n"))

(There are probably some strange cases of repeated delimiter lines or
delimiter lines occurring at the beginning or end of the file for which
the above code will not work. Caveat emptor.)
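
To make those strange cases concrete, here is a small Python 3 sketch. The delimiter text and the sample contents are invented for the example. It shows the empty pieces that a leading or repeated delimiter line produces, and one possible way to drop them.

```python
DELIM = "---\n"  # hypothetical delimiter line

# Delimiter at the start, then twice in a row in the middle
contents = "---\na\nb\n---\n---\nc\n"

pieces = contents.split(DELIM)
print(pieces)  # ['', 'a\nb\n', '', 'c\n']

# Skip the empty pieces; splitlines() also avoids the trailing ''
# that piece.split("\n") would leave behind
groups = [piece.splitlines() for piece in pieces if piece]
print(groups)  # [['a', 'b'], ['c']]
```

Note that a plain substring split would also trigger on a line that merely ends with the delimiter text, so this only suits delimiters that cannot appear inside ordinary lines.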

If condition were a little more complicated, perhaps re.split would
work.
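
A rough Python 3 sketch of the re.split variant, assuming (purely for illustration) that any line starting with "---" counts as a delimiter:

```python
import re

# Hypothetical rule: a whole line that starts with "---" is a delimiter
delimiter_line = re.compile(r"^---.*\n", re.MULTILINE)

contents = "a\nb\n--- first\nc\n--- second\n"

# Empty chunks (e.g. after a trailing delimiter) are filtered out
groups = [chunk.splitlines() for chunk in delimiter_line.split(contents) if chunk]
print(groups)  # [['a', 'b'], ['c']]
```

The pattern requires the newline, so a final delimiter line without a trailing "\n" would not be recognized by this sketch.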

Or maybe you could look at split and see what it does (since your code
is conceptually very similar to it).

OTOH, FWIW, your version is very clean and very readable and fits my
brain perfectly.

HTH,
Dan

--
Dan Sommers
<http://www.tombstonezero.net/dan/>

Dec 3 '05 #4

bonono

Bengt Richter wrote:

It looks to me like itertools.groupby could get you close to what you want,
e.g., (untested)

Ah, groupby. The generic string.split() equivalent. But the doc said
the input needs to be sorted.

Dec 3 '05 #5

Jeffrey Schwab

Could you add a sentry to the end of your input? E.g.:

for line in fileinput.input() + line_that_matches_condition:

This way, you wouldn't need a separate check at the end.

Dec 3 '05 #6

Ben Finney

Looks like you'd be better off making an Accumulator that knows what
to do.

>>> class Accumulator(list):
...     def flush(self):
...         if len(self):
...             print "Flushing items: %s" % self
...             del self[:]
...
>>> lines = [
...     "spam", "eggs", "FLUSH",
...     "beans", "rat", "FLUSH",
...     "strawberry",
...     ]
>>> acc = Accumulator()
>>> for line in lines:
...     if line == 'FLUSH':
...         acc.flush()
...     else:
...         acc.append(line)
...
Flushing items: ['spam', 'eggs']
Flushing items: ['beans', 'rat']
>>> acc.flush()
Flushing items: ['strawberry']
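
The same idea as a Python 3 sketch, with the print replaced by a callback so the class can stand in for the doSomething call from the original question. The on_flush parameter is an addition for illustration and is not part of the session above.

```python
class Accumulator(list):
    """A list that hands its contents to a callback when flushed."""

    def __init__(self, on_flush):
        super().__init__()
        self.on_flush = on_flush

    def flush(self):
        if self:                        # skip empty groups
            self.on_flush(list(self))   # pass a copy so the callback can keep it
            del self[:]

flushed = []
acc = Accumulator(flushed.append)
for line in ["spam", "eggs", "FLUSH", "beans", "rat", "FLUSH", "strawberry"]:
    if line == "FLUSH":
        acc.flush()
    else:
        acc.append(line)
acc.flush()  # the end-of-input flush is still needed
print(flushed)  # [['spam', 'eggs'], ['beans', 'rat'], ['strawberry']]
```

This moves the "if acc:" test into one place, but the final flush after the loop remains.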

--
\ "[W]e are still the first generation of users, and for all that |
`\ we may have invented the net, we still don't really get it." |
_o__) -- Douglas Adams |
Ben Finney

Dec 3 '05 #7

Bengt Richter

On 2 Dec 2005 18:34:12 -0800, bo****@gmail.com wrote:

Bengt Richter wrote:
It looks to me like itertools.groupby could get you close to what you want,
e.g., (untested)

Ah, groupby. The generic string.split() equivalent. But the doc said
the input needs to be sorted.

>>> seq = [3,1,4,'t',0,3,4,2,'t',3,1,4]
>>> import itertools
>>> def condition(item): return item=='t'
...
>>> def dosomething(it): return 'doing something with %r'%list(it)
...
>>> for condresult, acciter in itertools.groupby(seq, condition):
...     if not condresult:
...         dosomething(acciter)
...
'doing something with [3, 1, 4]'
'doing something with [0, 3, 4, 2]'
'doing something with [3, 1, 4]'

I think the input only needs to be sorted if you are trying to group sorted subsequences of the input.
I.e., you can't get them extracted together unless the condition is satisfied for a contiguous group, which
only happens if the input is sorted. But AFAIK the grouping logic just scans, applies the key function,
and returns iterators for the subsequences that yield the same key function result, along with that result.
So it's a general subsequence extractor. You just have to supply the key function to make the condition value
change when a group ends and a new one begins. And the value can be arbitrary, or just toggle between two values, e.g.
>>> for condresult, acciter in itertools.groupby(range(20), lambda x:x%3==0 or x==5):
...     print '%6s: %r'%(condresult, list(acciter))
...
True: [0]
False: [1, 2]
True: [3]
False: [4]
True: [5, 6]
False: [7, 8]
True: [9]
False: [10, 11]
True: [12]
False: [13, 14]
True: [15]
False: [16, 17]
True: [18]
False: [19]

or a condresult that stays the same in groups, but every group result is different:
>>> for condresult, acciter in itertools.groupby(range(20), lambda x:x//3):
...     print '%6s: %r'%(condresult, list(acciter))
...
0: [0, 1, 2]
1: [3, 4, 5]
2: [6, 7, 8]
3: [9, 10, 11]
4: [12, 13, 14]
5: [15, 16, 17]
6: [18, 19]

Regards,
Bengt Richter

Dec 3 '05 #8

bonono

Thanks. So it basically has an internal state storing the last
"condition" result and if it flips(different), a new group starts.
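
That mental model can be written out as a simplified Python 3 sketch. This is not the real itertools implementation (which hands back lazy iterators rather than lists), and simple_groupby is a made-up name. It only illustrates the "remember the last key, start a new group when it changes" logic.

```python
def simple_groupby(iterable, key=lambda x: x):
    """Yield (key, list_of_items) for each run of items with equal keys."""
    missing = object()          # marker meaning "no group started yet"
    current_key = missing
    run = []
    for item in iterable:
        k = key(item)
        if current_key is missing or k != current_key:
            if run:
                yield current_key, run
            current_key, run = k, []
        run.append(item)
    if run:                     # flush the last group at end of input
        yield current_key, run

print(list(simple_groupby(range(7), lambda x: x // 3)))
# [(0, [0, 1, 2]), (1, [3, 4, 5]), (2, [6])]
```

The final "if run:" is the same end-of-stream flush as #2 in the original question. A helper like this does not remove the repetition, it just keeps it in one place.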

Dec 3 '05 #9

Scott David Daniels

Jeffrey Schwab wrote:

Could you add a sentry to the end of your input? E.g.:
for line in fileinput.input() + line_that_matches_condition:
This way, you wouldn't need a separate check at the end.

Check itertools for a good way to do this:

import itertools
SENTRY = 'something for which condition(SENTRY) is True'

acc = []
f = open(filename)
try:
    for line in itertools.chain(f, [SENTRY]):
        if condition(line):
            if acc:
                doSomething(acc)
            acc = []
        else:
            acc.append(line)
    assert acc == []
finally:
    f.close()
--Scott David Daniels
sc***********@acm.org
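
A possible Python 3 variant of the sentinel approach, for readers who cannot easily construct a line that satisfies the condition. It appends a unique marker object and tests for it with "is". The function name process, its parameters and the blank-line condition are assumptions made for this sketch.

```python
import io
from itertools import chain

def process(lines, condition, do_something):
    """Call do_something on each non-empty run of lines between delimiters."""
    end = object()  # unique end marker; never identical to a real line
    acc = []
    for line in chain(lines, [end]):
        if line is end or condition(line):
            if acc:
                do_something(acc)
            acc = []
        else:
            acc.append(line)

seen = []
with io.StringIO("a\nb\n\nc\n") as f:   # stand-in for an opened file
    process(f, lambda line: line == "\n", seen.append)
print(seen)  # [['a\n', 'b\n'], ['c\n']]
```

Because the marker is checked before condition is called, condition never has to cope with a non-string argument.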

Dec 3 '05 #10

Bengt Richter

On 3 Dec 2005 03:28:19 -0800, bo****@gmail.com wrote:

Thanks. So it basically has an internal state storing the last
"condition" result and if it flips(different), a new group starts.

So it appears. But note that "flips(different)" seems to be based on ==,
and default key function is just passthrough like lambda x:x, so e.g. integers
and floats will group together if their values are equal.
E.g., to elucidate further,

Default key function:

>>> from itertools import groupby
>>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j]):
...     print k, list(g)
...
0 [0, 0.0, 0j]
[] [[]]
() [()]
None [None]
1 [1, 1.0]
1j [1j]

Group by bool value:

>>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j], key=bool):
...     print k, list(g)
...
False [0, 0.0, 0j, [], (), None]
True [1, 1.0, 1j]

It's not trying to sort, so it doesn't trip on complex:

>>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j]):
...     print k, list(g)
...
0 [0, 0.0, 0j]
[] [[]]
() [()]
None [None]
1 [1, 1.0]
1j [1j]
2j [2j]

But you have to watch out if you try to pre-sort stuff that includes complex numbers:

>>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j])):
...     print k, list(g)
...
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: cannot compare complex numbers using <, <=, >, >=

And if you do sort using a key function, it doesn't mean groupby inherits that key function for grouping
unless you specify it:

>>> def keyfun(x):
...     if isinstance(x, (int, long, float)): return x
...     else: return type(x).__name__
...
>>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j], key=keyfun)):
...     print k, list(g)
...
0 [0, 0.0]
1 [1, 1.0]
None [None]
0j [0j]
1j [1j]
2j [2j]
[] [[]]
() [()]

Vs giving groupby the same keyfun:

>>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j], key=keyfun), keyfun):
...     print k, list(g)
...
0 [0, 0.0]
1 [1, 1.0]
NoneType [None]
complex [0j, 1j, 2j]
list [[]]
tuple [()]

Example of unsorted vs sorted subgroup extraction:

>>> for k,g in groupby('this that other thing note order'.split(), key=lambda s:s[0]):
...     print k, list(g)
...
t ['this', 'that']
o ['other']
t ['thing']
n ['note']
o ['order']

vs.

>>> for k,g in groupby(sorted('this that other thing note order'.split()), key=lambda s:s[0]):
...     print k, list(g)
...
n ['note']
o ['order', 'other']
t ['that', 'thing', 'this']

Oops, that key would be less brittle as (untested) key=lambda s:s[:1], e.g., in case a split with args was used.
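
A short Python 3 illustration of that brittleness, using an invented input with two spaces in a row so that split(" ") produces an empty string:

```python
from itertools import groupby

words = "this  that".split(" ")   # two spaces, so an empty string appears
print(words)  # ['this', '', 'that']

# s[:1] is safe on '' and simply gives '' as the key
groups = [(k, list(g)) for k, g in groupby(words, key=lambda s: s[:1])]
print(groups)  # [('t', ['this']), ('', ['']), ('t', ['that'])]

# s[0] raises IndexError as soon as the key function sees ''
try:
    list(groupby(words, key=lambda s: s[0]))
    raised = False
except IndexError:
    raised = True
print(raised)  # True
```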

Regards,
Bengt Richter

Dec 3 '05 #11
