
advice : how do you iterate with an acc ?

hello,

i'm wondering how people from here handle this, as i often encounter
something like:

    acc = []    # accumulator ;)
    for line in fileinput.input():
        if condition(line):
            if acc:    #1
                doSomething(acc)    #1
            acc = []
        else:
            acc.append(line)
    if acc:    #2
        doSomething(acc)    #2

BTW i am particularly annoyed by #1 and #2 as it is a repetition, and i
think it is quite error prone. how would you do it in a pythonic way ?

regards

Dec 3 '05 #1

vd*****@yahoo.fr wrote:
> [...]
> BTW i am particularly annoyed by #1 and #2 as it is a repetition, and i
> think it is quite error prone, how will you do it in a pythonic way ?

I think it is quite ok as the last "if acc:" is just an "end-of-stream"
implicit marker, whereas during the loop, you have explicit markers to
signal end/start of blocks. There is no unwanted variable introduced
and I don't see how it can be error prone.

This is one of the cases where I wouldn't try to turn it into a one-liner,
because it is already very natural :-)

Dec 3 '05 #2
On 2 Dec 2005 17:08:02 -0800, bo****@gmail.com wrote:

> vd*****@yahoo.fr wrote:
>> [...]
>> BTW i am particularly annoyed by #1 and #2 as it is a repetition, and i
>> think it is quite error prone, how will you do it in a pythonic way ?

It looks to me like itertools.groupby could get you close to what you want,
e.g., (untested)

    import fileinput
    import itertools
    for condresult, acciter in itertools.groupby(fileinput.input(), condition):
        if not condresult:
            doSomething(list(acciter)) # or doSomething(acciter) if an iterator is usable

IOW, groupby collects contiguous lines for which condition evaluates to the same
value. Assuming condition is a function that returns only two distinct values (for true
and false, like True and False), then if I understand your program's logic, you
do nothing with the line(s) that actually satisfy the condition, you just trigger
on them as delimiters and want to process the nonempty groups of the other lines,
so the "if not condresult:" should select those. Groupby won't return an empty group AFAIK,
so you don't need to test for that. Also, you won't need the list call in list(acciter)
if your doSomething can accept an iterator instead of a list.
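
The idea above can be sketched as a small runnable example (Python 3). The names `is_delimiter` and `blocks` and the sample lines are assumptions made for illustration; the thread never shows the real `condition` or `doSomething`.

```python
import itertools

def is_delimiter(line):
    # Hypothetical stand-in for the thread's condition(): a blank line ends a block.
    return line.strip() == ""

def blocks(lines, condition=is_delimiter):
    """Yield each run of consecutive non-delimiter lines as a list."""
    for is_delim, group in itertools.groupby(lines, condition):
        if not is_delim:
            yield list(group)

# Two delimiter lines in a row do not produce an empty block.
sample = ["a\n", "b\n", "\n", "\n", "c\n"]
result = list(blocks(sample))
print(result)  # [['a\n', 'b\n'], ['c\n']]
```

Because groupby never yields an empty group, neither the "if acc:" test inside the loop nor the one after it is needed in this form.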

Regards,
Bengt Richter
Dec 3 '05 #3
On 2 Dec 2005 16:45:38 -0800,
vd*****@yahoo.fr wrote:

> [...]
> BTW i am particularly annoyed by #1 and #2 as it is a repetition, and i
> think it is quite error prone, how will you do it in a pythonic way ?


If doSomething handled an empty list gracefully, then you would have
less repetition:

    acc = []
    for line in fileinput.input():
        if condition(line):
            doSomething(acc) #1
            acc = []
        else:
            acc.append(line)
    doSomething(acc) #2

If condition were simple enough and the file(s) small enough, perhaps
you could read the whole file at once and use split to separate the
pieces:

    contents = file.read()
    for acc in contents.split( "this is the delimiter line\n" ):
        doSomething(acc.split("\n"))

(There are probably some strange cases of repeated delimiter lines or
delimiter lines occurring at the beginning or end of the file for which
the above code will not work. Caveat emptor.)
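
To make those strange cases concrete, here is a small sketch (Python 3). The delimiter text and the sample contents are assumptions chosen for illustration, not taken from the original post.

```python
DELIM = "---\n"  # hypothetical delimiter line, assumed for illustration

# Delimiter at the start, and two delimiters in a row in the middle.
contents = "---\na\nb\n---\n---\nc\n"

# Naive version as in the post: empty chunks appear for leading and repeated
# delimiters, and each chunk's trailing newline leaves a '' at the end.
naive = [chunk.split("\n") for chunk in contents.split(DELIM)]

# splitlines() avoids the trailing '', and the filter drops empty blocks.
cleaned = [chunk.splitlines() for chunk in contents.split(DELIM)]
cleaned = [block for block in cleaned if block]

print(naive)    # [[''], ['a', 'b', ''], [''], ['c', '']]
print(cleaned)  # [['a', 'b'], ['c']]
```

Note that str.split matches the delimiter text anywhere in the string, so a line that merely ends with the delimiter text would also split the input.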

If condition were a little more complicated, perhaps re.split would
work.

Or maybe you could look at split and see what it does (since your code
is conceptually very similar to it).

OTOH, FWIW, your version is very clean and very readable and fits my
brain perfectly.

HTH,
Dan

--
Dan Sommers
<http://www.tombstonezero.net/dan/>
Dec 3 '05 #4

Bengt Richter wrote:
> It looks to me like itertools.groupby could get you close to what you want,
> e.g., (untested)

Ah, groupby. The generic string.split() equivalent. But the doc said
the input needs to be sorted.

Dec 3 '05 #5
vd*****@yahoo.fr wrote:
> [...]
> BTW i am particularly annoyed by #1 and #2 as it is a repetition, and i
> think it is quite error prone, how will you do it in a pythonic way ?


Could you add a sentry to the end of your input? E.g.:

    for line in fileinput.input() + line_that_matches_condition:

This way, you wouldn't need a separate check at the end.
Dec 3 '05 #6
vd*****@yahoo.fr wrote:
> acc = [] # accumulator ;)
> for line in fileinput.input():
>     if condition(line):
>         if acc: #1
>             doSomething(acc) #1
>         acc = []
>     else:
>         acc.append(line)
> if acc: #2
>     doSomething(acc) #2


Looks like you'd be better off making an Accumulator that knows what
to do.

>>> class Accumulator(list):
...     def flush(self):
...         if len(self):
...             print "Flushing items: %s" % self
...             del self[:]
...
>>> lines = [
...     "spam", "eggs", "FLUSH",
...     "beans", "rat", "FLUSH",
...     "strawberry",
... ]
>>> acc = Accumulator()
>>> for line in lines:
...     if line == 'FLUSH':
...         acc.flush()
...     else:
...         acc.append(line)
...
Flushing items: ['spam', 'eggs']
Flushing items: ['beans', 'rat']
>>> acc.flush()
Flushing items: ['strawberry']


--
\ "[W]e are still the first generation of users, and for all that |
`\ we may have invented the net, we still don't really get it." |
_o__) -- Douglas Adams |
Ben Finney
Dec 3 '05 #7
On 2 Dec 2005 18:34:12 -0800, bo****@gmail.com wrote:

> Bengt Richter wrote:
>> It looks to me like itertools.groupby could get you close to what you want,
>> e.g., (untested)
>
> Ah, groupby. The generic string.split() equivalent. But the doc said
> the input needs to be sorted.

>>> seq = [3,1,4,'t',0,3,4,2,'t',3,1,4]
>>> import itertools
>>> def condition(item): return item=='t'
...
>>> def dosomething(it): return 'doing something with %r'%list(it)
...
>>> for condresult, acciter in itertools.groupby(seq, condition):
...     if not condresult:
...         dosomething(acciter)
...
'doing something with [3, 1, 4]'
'doing something with [0, 3, 4, 2]'
'doing something with [3, 1, 4]'

I think the input only needs to be sorted if you are trying to group sorted subsequences of the input.
I.e., you can't get items with equal keys extracted together unless they form a contiguous group, which
in general only happens if the input is sorted. But AFAIK the grouping logic just scans, applies the key function,
and returns iterators for the subsequences that yield the same key function result, along with that result.
So it's a general subsequence extractor. You just have to supply the key function to make the condition value
change when a group ends and a new one begins. And the value can be arbitrary, or just toggle between two values, e.g.

>>> for condresult, acciter in itertools.groupby(range(20), lambda x:x%3==0 or x==5):
...     print '%6s: %r'%(condresult, list(acciter))
...
  True: [0]
 False: [1, 2]
  True: [3]
 False: [4]
  True: [5, 6]
 False: [7, 8]
  True: [9]
 False: [10, 11]
  True: [12]
 False: [13, 14]
  True: [15]
 False: [16, 17]
  True: [18]
 False: [19]

or a condresult that stays the same in groups, but every group result is different:

>>> for condresult, acciter in itertools.groupby(range(20), lambda x:x//3):
...     print '%6s: %r'%(condresult, list(acciter))
...
     0: [0, 1, 2]
     1: [3, 4, 5]
     2: [6, 7, 8]
     3: [9, 10, 11]
     4: [12, 13, 14]
     5: [15, 16, 17]
     6: [18, 19]

Regards,
Bengt Richter
Dec 3 '05 #8

Bengt Richter wrote:
> [...]
> So it's a general subsequence extractor. You just have to supply the key function to make the condition value
> change when a group ends and a new one begins.
> [...]

Thanks. So it basically has an internal state storing the last
"condition" result and if it flips(different), a new group starts.

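That description can be sketched in plain Python (Python 3). This is only an illustration of the behaviour discussed in the thread, not the actual implementation of itertools.groupby: the name `simple_groupby` is made up here, and it yields lists where the real function yields lazy iterators.

```python
import itertools

def simple_groupby(iterable, key=lambda x: x):
    """Yield (key, items) for each run of consecutive items with equal keys."""
    unset = object()          # marker meaning "no item seen yet"
    last_key = unset
    run = []
    for item in iterable:
        k = key(item)
        # A new group starts whenever the key differs from the previous one.
        if last_key is not unset and not (k == last_key):
            yield last_key, run
            run = []
        last_key = k
        run.append(item)
    if last_key is not unset:  # emit the final run, if there was any input
        yield last_key, run

def is_t(item):
    return item == 't'

data = [3, 1, 4, 't', 0, 3, 4, 2, 't', 3, 1, 4]
mine = list(simple_groupby(data, is_t))
real = [(k, list(g)) for k, g in itertools.groupby(data, is_t)]
print(mine == real)  # True
```

The sketch itself needs a final "emit the last run" step after the loop, which is the same end-of-stream flush the original question was about; groupby simply hides it from the caller.
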
Dec 3 '05 #9
Jeffrey Schwab wrote:
> vd*****@yahoo.fr wrote:
>> [...]
>
> Could you add a sentry to the end of your input? E.g.:
>     for line in fileinput.input() + line_that_matches_condition:
> This way, you wouldn't need a separate check at the end.



Check itertools for a good way to do this:

    import itertools
    SENTRY = 'something for which condition(SENTRY) is True'

    acc = []
    f = open(filename)
    try:
        for line in itertools.chain(f, [SENTRY]):
            if condition(line):
                if acc:
                    doSomething(acc)
                acc = []
            else:
                acc.append(line)
        # the sentry always flushes the last block, so nothing is left over
        assert acc == []
    finally:
        f.close()

--Scott David Daniels
sc***********@acm.org
Dec 3 '05 #10
On 3 Dec 2005 03:28:19 -0800, bo****@gmail.com wrote:

> Bengt Richter wrote:
>> [...]
>
> Thanks. So it basically has an internal state storing the last
> "condition" result and if it flips(different), a new group starts.

So it appears. But note that "flips(different)" seems to be based on ==,
and default key function is just passthrough like lambda x:x, so e.g. integers
and floats will group together if their values are equal.
E.g., to elucidate further,

Default key function:

>>> from itertools import groupby
>>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j]):
...     print k, list(g)
...
0 [0, 0.0, 0j]
[] [[]]
() [()]
None [None]
1 [1, 1.0]
1j [1j]

Group by bool value:

>>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j], key=bool):
...     print k, list(g)
...
False [0, 0.0, 0j, [], (), None]
True [1, 1.0, 1j]

It's not trying to sort, so it doesn't trip on complex:

>>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j]):
...     print k, list(g)
...
0 [0, 0.0, 0j]
[] [[]]
() [()]
None [None]
1 [1, 1.0]
1j [1j]
2j [2j]

But you have to watch out if you try to pre-sort stuff that includes complex numbers:

>>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j])):
...     print k, list(g)
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: cannot compare complex numbers using <, <=, >, >=

And if you do sort using a key function, it doesn't mean groupby inherits that key function for grouping
unless you specify it:

>>> def keyfun(x):
...     if isinstance(x, (int, long, float)): return x
...     else: return type(x).__name__
...
>>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j], key=keyfun)):
...     print k, list(g)
...
0 [0, 0.0]
1 [1, 1.0]
None [None]
0j [0j]
1j [1j]
2j [2j]
[] [[]]
() [()]

Vs giving groupby the same keyfun:

>>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j], key=keyfun), keyfun):
...     print k, list(g)
...
0 [0, 0.0]
1 [1, 1.0]
NoneType [None]
complex [0j, 1j, 2j]
list [[]]
tuple [()]

Example of unsorted vs sorted subgroup extraction:

>>> for k,g in groupby('this that other thing note order'.split(), key=lambda s:s[0]):
...     print k, list(g)
...
t ['this', 'that']
o ['other']
t ['thing']
n ['note']
o ['order']

vs.

>>> for k,g in groupby(sorted('this that other thing note order'.split()), key=lambda s:s[0]):
...     print k, list(g)
...
n ['note']
o ['order', 'other']
t ['that', 'thing', 'this']

Oops, that key would be less brittle as (untested) key=lambda s:s[:1], e.g., in case a split with args was used.
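
A small sketch of why the slice is less brittle (Python 3). The comma-separated sample string and the name `first_char` are assumptions for illustration: splitting on an explicit separator can produce empty strings, and s[0] fails on those while s[:1] does not.

```python
from itertools import groupby

def first_char(s):
    # s[:1] is '' for an empty string, whereas s[0] raises IndexError.
    return s[:1]

# Splitting on an explicit separator can produce empty strings.
words = "this,that,,other".split(",")

grouped = [(k, list(g)) for k, g in groupby(words, key=first_char)]
print(grouped)  # [('t', ['this', 'that']), ('', ['']), ('o', ['other'])]

# The original key, s[0], fails as soon as groupby reaches the empty string.
try:
    [(k, list(g)) for k, g in groupby(words, key=lambda s: s[0])]
    indexing_failed = False
except IndexError:
    indexing_failed = True
print(indexing_failed)  # True
```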

Regards,
Bengt Richter
Dec 3 '05 #11
