
# need help on generator...

Hello, I'm trying to understand how I could build the following consecutive sets from a root one using a generator. Given l = [1,2,3,4], I would like to produce: [1,2], [2,3], [3,4], [1,2,3], [2,3,4]. Unfortunately I cannot. I guess I could do it by using a sub-generator and maybe enumerate; if you can help, could you explain the trick a bit? This sub-generator thing messes me up. (I think it should perhaps also include [1,], [2,], [3,], [1,2,3,4], and then be filtered.) Best regards.

Jul 18 '05 #1
45 Replies
On 21 Jan 2005 05:58:03 -0800 jo******@yahoo.fr (Joh) wrote:

> i'm trying to understand how i could build following consecutive sets from a root one using generator : l = [1,2,3,4] would like to produce : [1,2], [2,3], [3,4], [1,2,3], [2,3,4]

```python
>>> def consecutive_sets(l):
...     for i in xrange(2, len(l)):
...         for j in xrange(0, len(l)-i+1):
...             yield l[j:j+i]
...
>>> list(consecutive_sets([1,2,3,4]))
[[1, 2], [2, 3], [3, 4], [1, 2, 3], [2, 3, 4]]
```

-- Denis S. Otkidach http://www.python.ru/ [ru]

Jul 18 '05 #2
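For readers on Python 3, the same function works with `range` in place of `xrange`; a minimal sketch (not from the original thread):

```python
def consecutive_sets(l):
    # Yield every contiguous slice of length 2 up to len(l) - 1,
    # shortest slices first, sliding left to right.
    for size in range(2, len(l)):
        for start in range(len(l) - size + 1):
            yield l[start:start + size]

print(list(consecutive_sets([1, 2, 3, 4])))
# → [[1, 2], [2, 3], [3, 4], [1, 2, 3], [2, 3, 4]]
```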
Joh wrote:

> hello, i'm trying to understand how i could build following consecutive sets from a root one using generator : l = [1,2,3,4] would like to produce : [1,2], [2,3], [3,4], [1,2,3], [2,3,4]

Do you mean: [1,2], [2,3], [3,4], [1,2,3], [2,3,4], [1,3,4] (e.g. all elements in the power set except the empty set, the sets with one element, and the set with all elements)? Otherwise, please describe your desired sets in words; I cannot see the pattern.

Jul 18 '05 #3
On Fri, 2005-01-21 at 17:14 +0300, Denis S. Otkidach wrote:

> def consecutive_sets(l):
>     for i in xrange(2, len(l)):
>         for j in xrange(0, len(l)-i+1):
>             yield l[j:j+i]

Since you have a much faster brain than I (though I ended up with exactly the same thing barring variable names) and beat me to posting the answer, I'll post the inevitable awful generator expression version instead:

```python
consecutive_sets = ( x[offset:offset+subset_size]
                     for subset_size in xrange(2, len(x))
                     for offset in xrange(0, len(x) + 1 - subset_size) )
```

-- Craig Ringer

Jul 18 '05 #4
Hi,

I recently read David Mertz (IBM DeveloperWorks) about generators and got excited about using lazy constructs in my Python programming. But besides the fact that generators are produced either with the new "yield" reserved word or by defining the __new__ method in a class definition, I don't know much about them. In particular, I don't know which Python constructs actually produce a generator. I know this is now the case for reading lines in a file or with the new "iterator" package. But what else? Does Craig Ringer's answer mean that list comprehensions are lazy? Where can I find a comprehensive list of all the lazy constructions built into Python? (I think that to easily distinguish lazy from strict constructs is an absolute programmer need; otherwise you always end up wondering when it is that code is actually executed, like in Haskell.)

Thank you

Francis Girard
FRANCE

Le vendredi 21 Janvier 2005 15:38, Craig Ringer a écrit :
[quotes Denis S. Otkidach's consecutive_sets generator and Craig's generator expression version, reproduced above]

Jul 18 '05 #5
On Fri, 2005-01-21 at 22:38 +0800, Craig Ringer wrote:

> consecutive_sets = ( x[offset:offset+subset_size]
>                      for subset_size in xrange(2, len(x))
>                      for offset in xrange(0, len(x) + 1 - subset_size) )

Where 'x' is the list to operate on, as I should've initially noted. Sorry for the reply-to-self. I did say "awful" for a reason ;-)

-- Craig Ringer

Jul 18 '05 #6
Joh wrote:

> i'm trying to understand how i could build following consecutive sets from a root one using generator : l = [1,2,3,4] would like to produce : [1,2], [2,3], [3,4], [1,2,3], [2,3,4] [...] looks like this sub generator thing mess me up.

Here is an (untested) variant that accepts any iterable while trying to remain memory-efficient. This makes it necessary to shuffle the order of the output a bit.

```python
from itertools import tee, islice

def gen(iterable, start, end):
    it = iter(iterable)
    while True:
        it, a = tee(it)
        a = tuple(islice(a, end-1))
        for sz in xrange(start, len(a)+1):
            yield a[:sz]
        it.next()

if __name__ == "__main__":
    print list(gen(range(1, 5), 2, 4))
    # prints:
    # [(1, 2), (1, 2, 3), (2, 3), (2, 3, 4), (3, 4)]
```

Peter

Jul 18 '05 #7
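Peter flags his variant as untested; the following Python 3 port of it does run (assumptions: `next(it)` replaces `it.next()`, and since PEP 479 the generator must catch the `StopIteration` from the exhausted input explicitly):

```python
from itertools import tee, islice

def gen(iterable, start, end):
    it = iter(iterable)
    while True:
        it, a = tee(it)                # fork the stream; 'a' is a lookahead copy
        a = tuple(islice(a, end - 1))  # buffer at most end-1 upcoming items
        for sz in range(start, len(a) + 1):
            yield a[:sz]               # emit every prefix of allowed size
        try:
            next(it)                   # slide the window forward by one element
        except StopIteration:
            return                     # input exhausted: stop cleanly (PEP 479)

print(list(gen(range(1, 5), 2, 4)))
# → [(1, 2), (1, 2, 3), (2, 3), (2, 3, 4), (3, 4)]
```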
Le vendredi 21 Janvier 2005 16:06, Craig Ringer a écrit :

> Where 'x' is the list to operate on, as I should've initially noted. Sorry for the reply-to-self. I did say "awful" for a reason ;-)

First, I think that you mean:

```python
consecutive_sets = [ x[offset:offset+subset_size]
                     for subset_size in xrange(2, len(x))
                     for offset in xrange(0, len(x) + 1 - subset_size) ]
```

(with square brackets). Second, this is not lazy anymore (like Denis S. Otkidach's previous answer was) because the __whole__ list gets constructed __before__ any other piece of code has a chance to execute. The type of consecutive_sets is simply a list, not a generator. I'm just trying to understand and obviously I'm missing the point.

Thank you

Francis Girard
FRANCE

Jul 18 '05 #9
On Fri, 2005-01-21 at 16:54 +0100, Francis Girard wrote:

> First, I think that you mean [...] (with square brackets). I'm just trying to understand and obviously I'm missing the point.

Yep. This:

```python
( x for x in xrange(10) )
```

will return a generator that calculates things as it goes, while this:

```python
[ x for x in xrange(10) ]
```

will return a list. Check out:

http://www.python.org/peps/pep-0289.html
http://docs.python.org/whatsnew/node4.html
http://www.python.org/dev/doc/newstyle/ref/genexpr.html

for details.

-- Craig Ringer

Jul 18 '05 #11
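A small demonstration of the difference Craig describes, in Python 3 syntax (where `range` is already lazy):

```python
squares_gen = (x * x for x in range(5))    # generator expression: nothing computed yet
squares_list = [x * x for x in range(5)]   # list comprehension: computed immediately

print(squares_list)        # [0, 1, 4, 9, 16]
print(next(squares_gen))   # 0 -- values are pulled one at a time, on demand
print(list(squares_gen))   # [1, 4, 9, 16] -- draining the rest
print(list(squares_gen))   # [] -- a generator can be consumed only once
```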
Thank you, I'll immediately download version 2.4, switching from version 2.3.

Francis Girard
FRANCE

Le vendredi 21 Janvier 2005 17:34, Craig Ringer a écrit :
[quote of the previous message trimmed]

Jul 18 '05 #12
Francis Girard wrote:

> In particular, I don't know which Python constructs actually produce a generator. I know this is now the case for reading lines in a file or with the new "iterator" package. But what else? Does Craig Ringer's answer mean that list comprehensions are lazy? Where can I find a comprehensive list of all the lazy constructions built into Python?

I don't think there is a *list* as such, but there are some rules of thumb for when lazy evaluation will take place (hopefully others will note any cases that I missed):

1. Iterators (classes with a next() method, and an __iter__ method that returns 'self') are lazily evaluated, as itr.next() is called to retrieve each value. I think you will find it is this method, rather than __new__, which is relevant to creating class-based generators. Note that "for x in itr" automatically calls itr.next() in order to obtain each new value of the loop variable. This iterator protocol is the basis of lazy evaluation in Python, and is described here: http://www.python.org/dev/doc/devel/lib/typeiter.html

2. Iterables (classes with an __iter__ method) will return a lazy iterator via iter(obj). Actual iterators return themselves from __iter__, so iter(obj) is a good way to make sure you have an iterator.

3. Generators (functions that use 'yield' instead of 'return') and generator expressions (like list comprehensions, but without the square brackets) are simply concise ways of creating iterators.

4. The utility functions in the itertools module generally return iterators rather than lists (this shouldn't surprise anyone!).

5. Several builtin functions return iterators rather than lists, specifically xrange(), enumerate() and reversed(). Other builtins that yield sequences (range(), sorted(), zip()) return lists.
However, be aware that some things which accept any iterable don't take advantage of the lazy evaluation, and still cause the whole thing to be created in memory at once - "".join(itr) is currently one such operation. The sequence vs iterator distinction is somewhat unfortunate (since it breaks with TOOWTDI), but completing the transition to an iterator based approach isn't going to be possible until Python 3.0, when things that currently return lists can be converted to return iterators (i.e. it has been suggested that the fundamental construct in Python 3.x should be an iterator just as a list is the fundamental construct in Python 2.x) Regards, Nick. -- Nick Coghlan | nc******@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net Jul 18 '05 #13
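Rule 1 above can be made concrete with a tiny class-based iterator. A sketch in Python 3 terms, where the method is spelled `__next__` rather than the `next()` of the Python 2.x this thread uses:

```python
class CountDown:
    """Rule 1 in class form: __iter__ returns self, and each call to
    __next__ lazily produces one more value until StopIteration
    terminates the for-loop."""
    def __init__(self, n):
        self.n = n
    def __iter__(self):
        return self
    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1

print(list(CountDown(3)))  # → [3, 2, 1]
print(list(CountDown(0)))  # → []
```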
Nick Coghlan wrote:

> 5. Several builtin functions return iterators rather than lists, specifically xrange(), enumerate() and reversed(). Other builtins that yield sequences (range(), sorted(), zip()) return lists.

Yes for enumerate and reversed, no for xrange:

```python
>>> xx = xrange(7)
>>> xx.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'xrange' object has no attribute 'next'
```

It SHOULD return an iterator, no doubt, but it doesn't (can't, for backwards compatibility reasons). Neither does it return a list: it returns "an `xrange' object", a specialized type that's not an iterator, though it's iterable. It's a type, btw (`xrange`), so it's not surprising that calling it returns instances of it (enumerate and reversed are also types, but *WITH* 'next'...).

Alex

Jul 18 '05 #15
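Python 3's `range` has exactly the split Alex describes: it is an iterable lazy sequence, not an iterator, and `iter()` is what hands back the object carrying `__next__`. A quick sketch:

```python
r = range(3)                  # like 2.x xrange: a lazy sequence object
print(hasattr(r, '__next__'))  # False -- not an iterator itself

it = iter(r)                  # iter() produces the real iterator
print(next(it), next(it))     # 0 1
print(iter(it) is it)         # True -- iterators return themselves from __iter__
```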
Francis Girard wrote:

> [...]

A 'def' of a function whose body uses 'yield', and in 2.4 the new genexp construct.

> Ok. I guess I'll have to update to version 2.4 (from 2.3) to follow the discussion.

It's worth upgrading even just for the extra speed ;-). Since you appear to conflate generators and iterators, I guess the iter built-in function is the main one you missed. iter(x), for any x, either raises an exception (if x's type is not iterable) or else returns an iterator.

> You're absolutely right, I take the one for the other and vice-versa. If I understand correctly, a "generator" produces something over which you can iterate with the help of an "iterator". Can you iterate (in the strict sense of an "iterator") over something not generated by a "generator"?

A generator function (commonly known as a generator), each time you call it, produces a generator object AKA a generator-iterator. To wit:

```python
>>> def f(): yield 23
...
>>> f
<function f at 0x...>
>>> x = f()
>>> x
<generator object at 0x...>
>>> type(x)
<type 'generator'>
```

A generator expression (genexp) also has a result which is a generator object:

```python
>>> x = (23 for __ in [0])
>>> type(x)
<type 'generator'>
```

Iterators need not be generator-iterators, by any means. Generally, the way to make sure you have an iterator is to call iter(...) on something; if the something was already an iterator, NP, then iter's idempotent:

```python
>>> iter(x) is x
True
```

That's what "an iterator" means: some object x such that x.next is callable without arguments and iter(x) is x. Since iter(x) tries calling type(x).__iter__(x) [[slight simplification here by ignoring custom metaclasses, see recent discussion on python-dev as to why this is only 99% accurate, not 100% accurate]], one way to code an iterator is as a class.
For example:

```python
class Repeater(object):
    def __iter__(self):
        return self
    def next(self):
        return 23
```

Any instance of Repeater is an iterator which, as it happens, has just the same behavior as itertools.repeat(23), which is also the same behavior you get from iterators obtained by calling:

```python
def generepeat():
    while True:
        yield 23
```

In other words, after:

```python
a = Repeater()
b = itertools.repeat(23)
c = generepeat()
```

the behavior of a, b and c is indistinguishable, though you can easily tell them apart by introspection -- type(a) != type(b) != type(c). Python's penchant for duck typing -- behavior matters more, WAY more than implementation details such as type() -- means we tend to consider a, b and c fully equivalent. Focusing on ``generator'' is at this level an implementation detail.

Most often, iterators (including generator-iterators) are used in a for statement (or equivalently a for clause of a listcomp or genexp), which is why one normally doesn't think about built-in ``iter'': it's called automatically by these ``for'' syntax-forms. In other words,

```python
for x in <<>>:
    ...body...
```

is just like:

```python
__tempiter = iter(<<>>)
while True:
    try:
        x = __tempiter.next()
    except StopIteration:
        break
    ...body...
```

((Simplification alert: the ``for'' statement has an optional ``else'' which this allegedly "just like" form doesn't mimic exactly...))

> You're right. I was much more talking (mistakenly) about lazy evaluation of the arguments to a function (i.e. the function begins execution before its arguments get evaluated) -- in such a case I think it should be specified which arguments are "strict" and which are "lazy" -- but I don't think there's such a thing in Python (... well not yet as Python gets more and more akin to FP).

Python's strict that way. To explicitly make some one argument "lazy", sorta, you can put a "lambda:" in front of it at call time, but then you have to "call the argument" to get it evaluated; a bit of a kludge.
There's a PEP out to allow a ``prefix :'' to mean just the same as this "lambda:", but even though I co-authored it I don't think it lowers the kludge quotient by all that much. Guido, our beloved BDFL, is currently musing about optional typing of arguments, which might perhaps open a tiny little crack towards letting some arguments be lazy. I don't think Guido wants to go there, though. My prediction is that even Python 3000 will be strict. At least this makes some things obvious at each call-site without having to study the way a function is defined, e.g., upon seeing f(a+b, c*d) you don't have to wonder, or study the ``def f'', to find out when the addition and the multiplication happen -- they happen before f's body gets a chance to run, and thus, in particular, if either operation raises an exception, there's nothing f can do about it. And that's a misunderstanding I _have_ seen repeatedly even in people with a pretty good overall grasp of Python, evidenced in code such as: self.assertRaises(SomeError, f(23)) with astonishment that -- if f(23) does indeed raise SomeError -- this exception propagates, NOT caught by assertRaises; and if mistakenly f(23) does NOT raise, you typically get a TypeError about None not being callable. The way to do the above call, _since Python is strict_, is: self.assertRaises(SomeError, f, 23) i.e. give assertRaises the function (or other callable) and arguments to pass to it -- THIS way, assertRaises performs the call under its own control (inside the try clause of a try/except statement) and can and does catch and report things appropriately. The frequency of this misunderstanding is high enough to prove to me that strictness is not FULLY understood by some "intermediate level" Pythonistas. However, I doubt the right solution is to complicate Python with the ability to have some arguments be strict, and other lazy, much as sometimes one might yearn for it. 
_Maybe_ one could have some form that makes ALL arguments lazy, presenting them as an iterator to the function itself. But even then the form _should_, I suspect, be obvious at the call site, rather than visible only in the "def"... Alex Jul 18 '05 #19
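Alex's assertRaises point can be reproduced with a minimal stand-in (a hypothetical helper sketched here, not unittest itself), showing why the callable and its arguments must be passed separately under strict evaluation:

```python
def assert_raises(exc_type, func, *args):
    # Minimal sketch of unittest's assertRaises: the call happens *inside*
    # the try block, under this function's control.
    try:
        func(*args)
    except exc_type:
        return True
    raise AssertionError("%s not raised" % exc_type.__name__)

def f(x):
    raise ValueError("boom")

# Right: hand over the callable and the argument separately.
print(assert_raises(ValueError, f, 23))  # → True

# Wrong: f(23) is evaluated *before* assert_raises runs (Python is strict),
# so the exception propagates here and assert_raises never even starts.
try:
    assert_raises(ValueError, f(23))
except ValueError:
    print("exception escaped, exactly as described")
```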
Craig Ringer wrote:

> >>> data = ''.join(x for x in infile)

Maybe ''.join(infile) is a better way to express this functionality? Avoids 2.4 dependency and should be faster as well as more concise.

> Might it be worth providing a way to have file objects seek back to the current position of the iterator when read() etc are called?

It's certainly worth doing a patch and seeing what the python-dev crowd thinks of it, I think; it might make it into 2.5.

> If not, I favour the suggestion in the referenced post - file should probably fail noisily, or at least emit a warning.

Under what conditions, exactly, would you want such an exception?

Alex

Jul 18 '05 #20
On Sat, 2005-01-22 at 12:20 +0100, Alex Martelli wrote:

> Maybe ''.join(infile) is a better way to express this functionality? Avoids 2.4 dependency and should be faster as well as more concise.

Thanks - for some reason I hadn't clicked to that. Obvious in hindsight, but I just completely missed it.

> It's certainly worth doing a patch and seeing what the python-dev crowd thinks of it, I think; it might make it into 2.5.

I'll certainly look into doing so. I'm not dumb enough to say "Sure, I'll do that" before looking into the code involved and thinking more about what issues could pop up. Still, it's been added to my increasingly frightening TODO list.

> Under what conditions, exactly, would you want such an exception?

When read() or other methods suffering from the same issue are called after next() without an intervening seek(). It'd mean another state flag for the file to keep track of - but I doubt it'd make any detectable difference in performance given that there's disk I/O involved. I'd be happier to change the behaviour so that a warning isn't necessary, though, and I suspect it can be done without introducing backward compatibility issues. Well, so long as nobody is relying on the undefined file position after using an iterator - and I'm not dumb enough to say nobody would ever do that. I've really got myself into hot water now though - I'm going to have to read over the `file' source code before impulsively saying anything REALLY stupid.

-- Craig Ringer

Jul 18 '05 #21
Hi,

First, my deepest thanks to Craig Ringer, Alex Martelli, Nick Coghlan and Terry Reedy for having generously answered on the "need help on generator" thread. I'm compiling the answers to sketch myself a global picture about iterators, generators, iterator-generators and laziness in Python. In the meantime, I couldn't resist testing the new Python features for laziness on a classical FP problem, i.e. the "Hamming" problem. The name of the game is to produce the sequence of integers satisfying the following rules:

(i) The list is in ascending order, without duplicates
(ii) The list begins with the number 1
(iii) If the list contains the number x, then it also contains the numbers 2*x, 3*x, and 5*x
(iv) The list contains no other numbers.

The algorithm in FP is quite elegant. Simply suppose that the infinite sequence is already produced, then merge the three sequences (2*x, 3*x, 5*x) for each x in the sequence we supposed as already produced; this is O(n) complexity for n numbers. I simply love those algorithms that run after their tails.
In Haskell, the algorithm is translated as such:

```haskell
-- hamming.hs

-- Merges two infinite lists
merge :: (Ord a) => [a] -> [a] -> [a]
merge (x:xs) (y:ys)
  | x == y    = x : merge xs ys
  | x < y     = x : merge xs (y:ys)
  | otherwise = y : merge (x:xs) ys

-- Lazily produce the hamming sequence
hamming :: [Integer]
hamming = 1 : merge (map (2*) hamming)
                    (merge (map (3*) hamming) (map (5*) hamming))
```

In Python, I figured out this implementation:

```python
import sys
from itertools import imap

## Merges two infinite lists
def imerge(xs, ys):
    x = xs.next()
    y = ys.next()
    while True:
        if x == y:
            yield x
            x = xs.next()
            y = ys.next()
        elif x < y:
            yield x
            x = xs.next()
        else: # if y < x:
            yield y
            y = ys.next()

## Lazily produce the hamming sequence
def hamming():
    yield 1 ## Initialize the machine
    for n in imerge(imap(lambda h: 2*h, hamming()),
                    imerge(imap(lambda h: 3*h, hamming()),
                           imap(lambda h: 5*h, hamming()))):
        yield n
    print "Falling out -- We should never get here !!"

for n in hamming():
    sys.stderr.write("%s " % str(n)) ## stderr for unbuffered output
```

My goal is not to compare Haskell with Python on a classical FP problem, which would be genuine stupidity.
Nevertheless, while the Haskell version prints the Hamming sequence for as long as I can stand it, and with very little memory consumption, the Python version only prints:

```
>> hamming.py
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 40 45 48 50 54 60 64 72 75 80 81 90 96 100 108 120 125 128 135 144 150 160 162 180 192 200 216 225 240 243 250 256 270 288 300 320 324 360 375 384 400 405 432 450 480 486 500 512 540 576 600 625 640 648 675 720 729 750 768 800 810 864 900 960 972 1000 1024 1080 1125 1152 1200 1215 1250 1280 1296 1350 1440 1458 1500 1536 1600 1620 1728 1800 1875 1920 1944 2000 2025 2048 2160 2187 2250 2304 2400 2430 2500 2560 2592 2700 2880 2916 3000 3072 3125 3200 3240 3375 3456 3600 3645 3750 3840 3888 4000 4050 4096 4320 4374 4500 4608 4800 4860 5000 5120 5184 5400 5625 5760 5832 6000 6075 6144 6250 6400 6480 6561 6750 6912 7200 7290 7500 7680 7776 8000 8100 8192 8640 8748 9000 9216 9375 9600 9720 10000 10125 10240 10368 10800 10935 11250 11520 11664 12000 12150 12288 12500 12800 12960 13122 13500 13824 14400 14580 15000 15360 15552 15625 16000 16200 16384 16875 17280 17496 18000 18225 18432 18750 19200 19440 19683 20000 20250 20480 20736 21600 21870 22500 23040 23328 24000 24300 24576 25000 25600 25920 26244 27000 27648 28125 28800 29160 30000 30375 30720 31104 31250 32000 32400 32768 32805 33750 34560 34992 36000 36450 36864 37500 38400 38880 39366 40000 40500 40960 41472 43200 43740 45000 46080 46656 46875 48000 48600 49152 50000 50625 51200 51840 52488 54000 54675 55296 56250 57600
Processus arrêté
```

("Processus arrêté" is French for "process stopped".) After 57600, my machine begins swapping like crazy and I have to kill the Python process. I think I should not have this kind of behaviour, even using recursion, since I'm only using lazy constructs all the time. At least, I would expect the program to produce many more results before surrendering. What's going on?

Thank you

Francis Girard
FRANCE

Jul 18 '05 #22
Your formulation in Python is recursive (hamming calls hamming()) and I think that's why your program gives up fairly early. Instead, use itertools.tee() [new in Python 2.4, or search the internet for an implementation that works in 2.3] to give a copy of the output sequence to each "multiply by N" function as well as one to be the return value. Here's my implementation, which matched your list early on but effortlessly reached larger values. One large value it printed was 6412351813189632 (a Python long), which indeed has only the distinct prime factors 2 and 3. (2**43 * 3**6)

Jeff

```python
from itertools import tee
import sys

def imerge(xs, ys):
    x = xs.next()
    y = ys.next()
    while True:
        if x == y:
            yield x
            x = xs.next()
            y = ys.next()
        elif x < y:
            yield x
            x = xs.next()
        else:
            yield y
            y = ys.next()

def hamming():
    def _hamming(j, k):
        yield 1
        hamming = generators[j]
        for i in hamming:
            yield i * k
    generators = []
    generator = imerge(imerge(_hamming(0, 2), _hamming(1, 3)), _hamming(2, 5))
    generators[:] = tee(generator, 4)
    return generators[3]

for i in hamming():
    print i,
    sys.stdout.flush()
```

Jul 18 '05 #23
[Francis Girard]

> ... In the meantime, I couldn't resist to test the new Python features about laziness on a classical FP problem, i.e. the "Hamming" problem. ... Nevertheless, while the Haskell version prints the Hamming sequence for as long as I can stand it, and with very little memory consumption, the Python version only prints: ... I think I should not have this kind of behaviour, even using recursion, since I'm only using lazy constructs all the time. At least, I would expect the program to produce much more results before surrendering. What's going on?

You can find an explanation in Lib/test/test_generators.py, which uses this problem as an example; you can also find one efficient way to write it there.

Jul 18 '05 #24
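For comparison with the tee-based answers that follow, the classic three-cursor formulation (usually attributed to Dijkstra; a sketch, not the version from the thread or from test_generators.py) avoids both recursion and tee by keeping indices into the list built so far:

```python
from itertools import islice

def hamming():
    # h holds every value produced so far; i2/i3/i5 index the next value
    # whose multiple by 2/3/5 has not yet been emitted.
    h = [1]
    i2 = i3 = i5 = 0
    while True:
        yield h[-1]
        n = min(2 * h[i2], 3 * h[i3], 5 * h[i5])
        if n == 2 * h[i2]: i2 += 1   # advance every cursor that produced n,
        if n == 3 * h[i3]: i3 += 1   # so duplicates are skipped
        if n == 5 * h[i5]: i5 += 1
        h.append(n)

print(list(islice(hamming(), 15)))
# → [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24]
```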
Francis Girard wrote:

> The following implementation is even more speaking as it makes self-evident and almost mechanical how to translate algorithms that run after their tail from recursion to "tee" usage:

Thanks, Francis and Jeff, for raising a fascinating topic. I've enjoyed trying to get my head around both the algorithm and your non-recursive implementation. Here's a version of your implementation that uses a helper class to make the algorithm itself prettier.

```python
from itertools import tee, imap

def hamming():
    def _hamming():
        yield 1
        for n in imerge(2 * hamming, imerge(3 * hamming, 5 * hamming)):
            yield n
    hamming = Tee(_hamming())
    return iter(hamming)

class Tee(object):
    """Provides an independent iterator (using tee) on every iteration request
    Also implements lazy iterator arithmetic"""
    def __init__(self, iterator):
        self.iter = tee(iterator, 1)[0]
    def __iter__(self):
        return self.iter.__copy__()
    def __mul__(self, number):
        return imap(lambda x: x * number, self.__iter__())

def imerge(xs, ys):
    x = xs.next()
    y = ys.next()
    while True:
        if x == y:
            yield x
            x = xs.next()
            y = ys.next()
        elif x < y:
            yield x
            x = xs.next()
        else: # if y < x:
            yield y
            y = ys.next()
```

```python
>>> hg = hamming()
>>> for i in range(10000):
...     n = hg.next()
...     if i % 1000 == 0: print i, n
...
0 1
1000 51840000
2000 8100000000
3000 279936000000
4000 4707158941350
5000 50960793600000
6000 409600000000000
7000 2638827906662400
8000 14332723200000000
9000 68024448000000000
```

Regards

Michael

Jul 18 '05 #29
Francis Girard wrote:

> def hamming():
>     def _hamming():
>         yield 1
>         hamming2 = hammingGenerators[0]
>         hamming3 = hammingGenerators[1]
>         hamming5 = hammingGenerators[2]
>         for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
>                         imerge(imap(lambda h: 3*h, iter(hamming3)),
>                                imap(lambda h: 5*h, iter(hamming5)))):
>             yield n
>     hammingGenerators = tee(_hamming(), 4)
>     return hammingGenerators[3]

If you are after readability, you might prefer this...

```python
def hamming():
    def _hamming():
        yield 1
        for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
                        imerge(imap(lambda h: 3*h, iter(hamming3)),
                               imap(lambda h: 5*h, iter(hamming5)))):
            yield n
    hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
    return result
```

PS interesting thread - never heard of Hamming sequences before!

-- Nick Craig-Wood -- http://www.craig-wood.com/nick

Jul 18 '05 #30
On 25 Jan 2005 08:30:03 GMT, Nick Craig-Wood wrote:

> If you are after readability, you might prefer this...
> [the version with hamming2, hamming3, hamming5 above]
> PS interesting thread - never heard of Hamming sequences before!

Are the long words really that helpful?

```python
def hamming():
    def _hamming():
        yield 1
        for n in imerge(imap(lambda h: 2*h, iter(hg2)),
                        imerge(imap(lambda h: 3*h, iter(hg3)),
                               imap(lambda h: 5*h, iter(hg5)))):
            yield n
    hg2, hg3, hg5, result = tee(_hamming(), 4)  # four hamming generators
    return result
```

Regards,
Bengt Richter

Jul 18 '05 #31
Francis Girard wrote:

> The following implementation is even more speaking as it makes self-evident and almost mechanical how to translate algorithms that run after their tail from recursion to "tee" usage:

> from itertools import tee, imap
> import sys
>
> def imerge(xs, ys):
>     x = xs.next()
>     y = ys.next()
>     while True:
>         if x == y:
>             yield x
>             x = xs.next()
>             y = ys.next()
>         elif x < y:
>             yield x
>             x = xs.next()
>         else:
>             yield y
>             y = ys.next()

Thinking about this some more leads me to believe a general purpose imerge taking any number of arguments will look neater, eg

```python
def imerge(*generators):
    values = [ g.next() for g in generators ]
    while True:
        x = min(values)
        yield x
        for i in range(len(values)):
            if values[i] == x:
                values[i] = generators[i].next()
```

> def hamming():
>     [the hammingGenerators version quoted earlier]

This means that this can be further simplified thus,

```python
def hamming():
    def _hamming():
        yield 1
        for n in imerge( imap(lambda h: 2*h, hamming2),
                         imap(lambda h: 3*h, hamming3),
                         imap(lambda h: 5*h, hamming5) ):
            yield n
    hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
    return result
```

(Note the iter(...) seemed not to be doing anything useful so I removed them)

-- Nick Craig-Wood -- http://www.craig-wood.com/nick

Jul 18 '05 #32
Nick Craig-Wood wrote:

> Thinking about this some more leads me to believe a general purpose imerge taking any number of arguments will look neater, eg
> [the min(values)-based imerge above]

This kinda looks like it dies after the first generator is exhausted, but I'm not certain. An alternate version that doesn't search for 'i':

```python
py> def imerge(*iterables):
...     iters = [iter(i) for i in iterables]
...     values = [i.next() for i in iters]
...     while iters:
...         x, i = min((val, i) for i, val in enumerate(values))
...         yield x
...         try:
...             values[i] = iters[i].next()
...         except StopIteration:
...             del iters[i]
...             del values[i]
...
py> list(imerge([1, 4, 7], [2, 5, 8], [3, 6, 9]))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
py> list(imerge([3, 6, 9], [1, 4, 7], [2, 5, 8]))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
py> list(imerge([1, 4, 7], [3, 6, 9], [2, 5, 8]))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
```

> This means that this can be further simplified thus,
> [the hamming2, hamming3, hamming5 version]

Nice.

Steve

Jul 18 '05 #33
 Steven Bethard wrote: Nick Craig-Wood wrote: Thinking about this some more leads me to believe a general purpose imerge taking any number of arguments will look neater, eg def imerge(*generators): values = [ g.next() for g in generators ] while True: x = min(values) yield x for i in range(len(values)): if values[i] == x: values[i] = generators[i].next() This kinda looks like it dies after the first generator is exhausted, but I'm not certain. Yes it will stop iterating then (rather like zip() on lists of unequal size). Not sure what the specification should be! It works for the hamming problem though. list(imerge(iter([1, 2]), iter([1, 2, 3]), iter([1, 2, 3, 4]))) [1, 2] An alternate version that doesn't search for 'i': py> def imerge(*iterables): ... iters = [iter(i) for i in iterables] ... values = [i.next() for i in iters] ... while iters: ... x, i = min((val, i) for i, val in enumerate(values)) ... yield x ... try: ... values[i] = iters[i].next() ... except StopIteration: ... del iters[i] ... del values[i] ... py> list(imerge([1, 4, 7], [2, 5, 8], [3, 6, 9])) [1, 2, 3, 4, 5, 6, 7, 8, 9] py> list(imerge([3, 6, 9], [1, 4, 7], [2, 5, 8])) [1, 2, 3, 4, 5, 6, 7, 8, 9] py> list(imerge([1, 4, 7], [3, 6, 9], [2, 5, 8])) [1, 2, 3, 4, 5, 6, 7, 8, 9] This isn't quite right... list(imerge([1, 2, 3], [1, 2, 3], [1, 2, 3])) [1, 1, 1, 2, 2, 2, 3, 3, 3] This should produce [1, 2, 3] So I'm afraid the searching *is* necessary - you've got to find all the generators with the min value and move them on. -- Nick Craig-Wood -- http://www.craig-wood.com/nick Jul 18 '05 #34
Nick Craig-Wood wrote:

> Steven Bethard wrote:
>> This kinda looks like it dies after the first generator is
>> exhausted, but I'm not certain.
>
> Yes it will stop iterating then (rather like zip() on lists of
> unequal size).  Not sure what the specification should be!  It works
> for the hamming problem though.

Actually, it stops iterating on lists of equal size too:

    py> def imerge(*iterators):
    ...     iterators = [iter(i) for i in iterators]
    ...     values = [i.next() for i in iterators]
    ...     while True:
    ...         x = min(values)
    ...         yield x
    ...         for i, val in enumerate(values):
    ...             if val == x:
    ...                 values[i] = iterators[i].next()
    ...
    py> list(imerge([1, 4, 7], [2, 5, 8], [3, 6, 9]))
    [1, 2, 3, 4, 5, 6, 7]

Note that 8 and 9 are not in the list.

Steve
Jul 18 '05 #35
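Both defects under discussion, stopping at the first StopIteration and failing to advance every iterator holding the minimum, can be fixed in one function. A possible Python 3 sketch (not posted in the thread) that drops exhausted iterators and keeps going until every input is consumed:

```python
def imerge(*iterables):
    # Merge sorted iterables into one sorted, duplicate-free stream.
    # Advances *all* iterators currently holding the minimum value and
    # drops each iterator only when it is individually exhausted.
    iterators, values = [], []
    for it in map(iter, iterables):
        try:
            values.append(next(it))
            iterators.append(it)
        except StopIteration:
            pass  # empty input, ignore it
    while iterators:
        x = min(values)
        yield x
        keep_iters, keep_vals = [], []
        for it, val in zip(iterators, values):
            if val == x:
                try:
                    val = next(it)
                except StopIteration:
                    continue  # this iterator is done; drop it
            keep_iters.append(it)
            keep_vals.append(val)
        iterators, values = keep_iters, keep_vals

print(list(imerge([1, 2, 3], [1, 2, 3], [1, 2, 3])))  # [1, 2, 3]
print(list(imerge([1, 2], [1, 2, 3, 4])))             # [1, 2, 3, 4]
```

The linear rebuild of the two lists per output element keeps the code simple; for a large number of inputs a heap would be asymptotically better, as comes up later in the thread.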
Nick Craig-Wood wrote:

> Steven Bethard wrote:
>> An alternate version that doesn't search for 'i':
>> [snip]
>
> This isn't quite right...
>
>     >>> list(imerge([1, 2, 3], [1, 2, 3], [1, 2, 3]))
>     [1, 1, 1, 2, 2, 2, 3, 3, 3]
>
> This should produce [1, 2, 3]
>
> So I'm afraid the searching *is* necessary - you've got to find all
> the generators with the min value and move them on.

Here's a dict-based implementation: cute, but slow, at least for a
small number of iterators

    >>> def imerge(*iterables):
    ...     cache = {}
    ...     iterators = map(iter, iterables)
    ...     number = len(iterables)
    ...     exhausted = 0
    ...     while 1:
    ...         for it in iterators:
    ...             try:
    ...                 cache.setdefault(it.next(), []).append(it)
    ...             except StopIteration:
    ...                 exhausted += 1
    ...                 if exhausted == number:
    ...                     raise StopIteration
    ...         val = min(cache)
    ...         iterators = cache.pop(val)
    ...         yield val

    >>> list(imerge([1, 2, 3, 6], [1, 2, 3, 7], [1, 2, 3, 4, 5]))
    [1, 2, 3, 4, 5, 6, 7]

Michael
Jul 18 '05 #36
Bengt Richter wrote:

> On 25 Jan 2005 08:30:03 GMT, Nick Craig-Wood wrote:
>> If you are after readability, you might prefer this...
>>
>>     def hamming():
>>         def _hamming():
>>             yield 1
>>             for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
>>                             imerge(imap(lambda h: 3*h, iter(hamming3)),
>>                                    imap(lambda h: 5*h, iter(hamming5)))):
>>                 yield n
>>         hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
>>         return result
>
> Are the long words really that helpful?
>
>     def hamming():
>         def _hamming():
>             yield 1
>             for n in imerge(imap(lambda h: 2*h, iter(hg2)),
>                             imerge(imap(lambda h: 3*h, iter(hg3)),
>                                    imap(lambda h: 5*h, iter(hg5)))):
>                 yield n
>         hg2, hg3, hg5, result = tee(_hamming(), 4)  # four hamming generators
>         return result

Well, judging by the fact that shortening the identifiers made it so
that you felt the need to add a comment indicating what they were
identifying, I'd say that yes, the long words *are* helpful. ;)
Comments are good, but self-documenting code is even better.

Jeff Shannon
Technician/Programmer
Credit International
Jul 18 '05 #37
On Tue, 25 Jan 2005 11:46:04 -0800, Jeff Shannon wrote:

> Well, judging by the fact that shortening the identifiers made it so
> that you felt the need to add a comment indicating what they were
> identifying, I'd say that yes, the long words *are* helpful. ;)
> Comments are good, but self-documenting code is even better.
[snip]

The comment was a conscious factoring decision, in terms of
documentation ;-)

Regards,
Bengt Richter
Jul 18 '05 #38
Le mardi 25 Janvier 2005 09:01, Michael Spencer a écrit :

> Francis Girard wrote:
>> The following implementation is even more speaking as it makes
>> self-evident and almost mechanical how to translate algorithms that
>> run after their tail from recursion to "tee" usage :
>
> Thanks, Francis and Jeff for raising a fascinating topic.  I've
> enjoyed trying to get my head around both the algorithm and your
> non-recursive implementation.

Yes, it's been fun.

> Here's a version of your implementation that uses a helper class to
> make the algorithm itself prettier.
>
>     from itertools import tee, imap
>
>     def hamming():
>         def _hamming():
>             yield 1
>             for n in imerge(2 * hamming, imerge(3 * hamming, 5 * hamming)):
>                 yield n
>         hamming = Tee(_hamming())
>         return iter(hamming)
>
>     class Tee(object):
>         """Provides an independent iterator (using tee) on every
>         iteration request.  Also implements lazy iterator arithmetic"""
>         def __init__(self, iterator):
>             self.iter = tee(iterator, 1)[0]
>         def __iter__(self):
>             return self.iter.__copy__()
>         def __mul__(self, number):
>             return imap(lambda x: x * number, self.__iter__())
>
>     def imerge(xs, ys):
>         x = xs.next()
>         y = ys.next()
>         while True:
>             if x == y:
>                 yield x
>                 x = xs.next()
>                 y = ys.next()
>             elif x < y:
>                 yield x
>                 x = xs.next()
>             else:  # if y < x:
>                 yield y
>                 y = ys.next()
>
>     >>> hg = hamming()
>     >>> for i in range(10000):
>     ...     n = hg.next()
>     ...     if i % 1000 == 0: print i, n
>     ...
>     0 1
>     1000 51840000
>     2000 8100000000
>     3000 279936000000
>     4000 4707158941350
>     5000 50960793600000
>     6000 409600000000000
>     7000 2638827906662400
>     8000 14332723200000000
>     9000 68024448000000000
>
> Michael

Interesting idea.

Regards
Jul 18 '05 #39
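Michael's helper translates almost line-for-line to Python 3: tee iterators are copyable, and `imap(lambda ...)` becomes a generator expression. The sketch below is that translation (the `__rmul__` alias is an addition so that `2 * t` works as well as `t * 2`):

```python
from itertools import tee

class Tee:
    """Provide an independent iterator on every iteration request.

    Also implements lazy iterator arithmetic: `t * n` (and `n * t`)
    lazily yields each element multiplied by n.
    """
    def __init__(self, iterator):
        # A single tee branch holds the shared buffer.
        self._it = tee(iterator, 1)[0]

    def __iter__(self):
        # tee iterators support __copy__; each copy is an independent
        # cursor over the same buffered stream.
        return self._it.__copy__()

    def __mul__(self, number):
        return (x * number for x in self)

    __rmul__ = __mul__

t = Tee(iter([1, 2, 3]))
print(list(t), list(t))   # iterating does not consume the source
print(list(2 * t))        # [2, 4, 6]
```

Because `__iter__` hands out a fresh copy every time, the recursive `imerge(2 * hamming, imerge(3 * hamming, 5 * hamming))` expression can pull from the same underlying stream three times without the branches interfering.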
Le mardi 25 Janvier 2005 19:52, Steven Bethard a écrit :
[snip]

Thank you Nick and Steven for the idea of a more generic imerge.

To work with the Hamming problem, the imerge function _must_not_ allow
duplicates, and we can assume all of the specified iterables are of
the same size, i.e. infinite!  Therefore, Nick's solution fulfills the
need.  But it is admittedly confusing to call the function "imerge"
when it doesn't merge everything (including duplicates).  Anyway, both
solutions shed new light and bring us a bit farther.  That's the
beauty of many brains from all over the world collaborating.  Really,
it makes me emotive thinking about it.

For the imerge function, what we really need to make the formulation
clear is a way to look at the next element of an iterable without
consuming it.  Or else, a way to put back "consumed" elements in the
front of an iteration flow, much like the list constructors in FP
languages like Haskell.

It is simple to encapsulate an iterator inside another iterator class
that would do just that.  Here's one.  The additional "fst" method
returns the next element to consume without consuming it, and
"isBottom" checks if there is something left to consume from the
iteration flow, without actually consuming anything.  I named the
class "IteratorDeiterator" for lack of imagination:

-- BEGIN SNAP
class IteratorDeiterator:
    def __init__(self, iterator):
        self._iterator = iterator.__iter__()
        self._firstVal = None  ## Avoid consuming if not requested from outside
                               ## Works only if iterator itself can't return None

    def __iter__(self):
        return self

    def next(self):
        valReturn = self._firstVal
        if valReturn is None:
            valReturn = self._iterator.next()
        self._firstVal = None
        return valReturn

    def fst(self):
        if self._firstVal is None:
            self._firstVal = self._iterator.next()
        return self._firstVal

    def isBottom(self):
        if self._firstVal is None:
            try:
                self._firstVal = self._iterator.next()
            except StopIteration:
                return True
        return False
-- END SNAP

Now the imerge_infinite, which merges infinite lists while allowing
duplicates, is quite straightforward:

-- BEGIN SNAP
def imerge_infinite(*iterators):
    vItTopable = [IteratorDeiterator(it) for it in iterators]
    while vItTopable:
        yield reduce(lambda itSmallest, it:
                         itSmallest.fst() > it.fst() and it or itSmallest,
                     vItTopable).next()
-- END SNAP

To merge finite lists of possibly different lengths is a bit more
work, as you have to eliminate iterators that reached the bottom
before the others.  This is quite easily done by simply filtering them
out.  The imerge_finite function below merges "lists" of possibly
different sizes:

-- BEGIN SNAP
def imerge_finite(*iterators):
    vItDeit = [IteratorDeiterator(it) for it in iterators]
    vItDeit = filter(lambda it: not it.isBottom(), vItDeit)
    while vItDeit:
        yield reduce(lambda itSmallest, it:
                         itSmallest.fst() > it.fst() and it or itSmallest,
                     vItDeit).next()
        vItDeit = filter(lambda it: not it.isBottom(), vItDeit)
-- END SNAP

Now, we want versions of these two merge functions that do not allow
duplicates.  Building upon what we've already done in a semi-FP way,
we just filter out the duplicates from the iteration streams returned
by the above functions.  The job is the same for the infinite and
finite versions, hence the imerge_nodup generic function:

-- BEGIN SNAP
def imerge_nodup(fImerge, *iterators):
    it = fImerge(*iterators)
    el = it.next()
    yield el
    while True:
        el = dropwhile(lambda _next: _next == el, it).next()
        yield el

imerge_inf_nodup = \
    lambda *iterators: imerge_nodup(imerge_infinite, *iterators)
imerge_finite_nodup = \
    lambda *iterators: imerge_nodup(imerge_finite, *iterators)
-- END SNAP

I used the lambda notation for imerge_inf_nodup and
imerge_finite_nodup to avoid another useless for-yield loop.  Here the
notation really just expresses function equivalence with a bound
argument (it would be great to have a notation for this in Python,
i.e. binding a function argument to yield a new function).

The merge function to use with hamming() is imerge_inf_nodup.

Regards,

Francis Girard
FRANCE
Jul 18 '05 #40
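Francis's imerge_nodup strips runs of equal elements from an already-merged, sorted stream. In current Python the same filtering step can be written with itertools.groupby; this is an illustrative sketch (the function name is not from the thread), equivalent for any sorted input, finite or infinite:

```python
from itertools import groupby

def drop_consecutive_dups(merged):
    # Collapse each run of equal consecutive elements to one element,
    # the same job imerge_nodup does with dropwhile. Lazy, so it works
    # on infinite merged streams too.
    return (key for key, _group in groupby(merged))

print(list(drop_consecutive_dups([1, 1, 1, 2, 3, 3, 5])))  # [1, 2, 3, 5]
```

Since groupby only groups adjacent equal elements, this relies on the input being sorted, which is exactly what the merge functions guarantee.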
Francis Girard wrote:

> For the imerge function, what we really need to make the formulation
> clear is a way to look at the next element of an iterable without
> consuming it.  Or else, a way to put back "consumed" elements in the
> front of an iteration flow, much like the list constructors in FP
> languages like Haskell.
>
> It is simple to encapsulate an iterator inside another iterator class
> that would do just that.  Here's one.  The additional "fst" method
> returns the next element to consume without consuming it, and
> "isBottom" checks if there is something left to consume from the
> iteration flow, without actually consuming anything.  I named the
> class "IteratorDeiterator" for lack of imagination :
[snip]

Another implementation is my peekable class:

    http://aspn.activestate.com/ASPN/Coo.../Recipe/304373

It allows you to peek several elements ahead if that's necessary.

Steve
Jul 18 '05 #41
Francis Girard wrote:

> Thank you Nick and Steven for the idea of a more generic imerge.

You are welcome :-)  [It came to me while walking the children to
school!]

[snip]

>     class IteratorDeiterator:
>         def __init__(self, iterator):
>             self._iterator = iterator.__iter__()
>             self._firstVal = None
>             ## Avoid consuming if not requested from outside
>             ## Works only if iterator itself can't return None

You can use a sentinel here if you want to avoid the "can't return
None" limitation.  For a sentinel you need an object your iterator
couldn't possibly return.  You can make one up, eg

    self._sentinel = object()
    self._firstVal = self._sentinel

Or you could use self (but I'm not 100% sure that your recursive
functions wouldn't return it!)

>         def __iter__(self):
>             return self
>
>         def next(self):
>             valReturn = self._firstVal
>             if valReturn is None:

and

            if valReturn is self._sentinel:

>                 valReturn = self._iterator.next()
>             self._firstVal = None

            self._firstVal = self._sentinel

etc..

[snip more code]

Thanks for some more examples of fp-style code.  I find it hard to get
my head round so it's been good exercise!

--
Nick Craig-Wood -- http://www.craig-wood.com/nick
Jul 18 '05 #42
Le jeudi 27 Janvier 2005 10:30, Nick Craig-Wood a écrit :

> Francis Girard wrote:
>> Thank you Nick and Steven for the idea of a more generic imerge.
>
> You are welcome :-)  [It came to me while walking the children to
> school!]

Sometimes fresh air and children purity is all what it takes.  Much
better than coffee, cigarette and flat screen.

> You can use a sentinel here if you want to avoid the "can't return
> None" limitation.  For a sentinel you need an object your iterator
> couldn't possibly return.  You can make one up, eg
>
>     self._sentinel = object()
>     self._firstVal = self._sentinel

Great idea.  I'll use it.

> [snip more code]
>
> Thanks for some more examples of fp-style code.  I find it hard to
> get my head round so it's been good exercise!

  Introduction to functional programming
  Richard Bird and Philip Wadler
  Prentice Hall
  1988

This is the very best intro I ever read.  The book is without hype,
doesn't show its age and is language neutral.  Authors are world
leaders in the field today.  Only really strong guys have the kindness
to do understandable introductions without trying to hide the
difficulties (because they are strong enough to face them with
simplicity).  It's been a real pleasure.

Regards,

Francis Girard
FRANCE
Jul 18 '05 #43
Francis Girard writes:

> Thank you Nick and Steven for the idea of a more generic imerge.

If you want to get fancy, the merge should use a priority queue (like
in the heapsort algorithm) instead of a linear scan through the
incoming iters, to find the next item to output.  That lowers the
running time to O(n log k) instead of O(n*k), where k is the number of
iterators and n is the length.

Jul 18 '05 #44
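The priority-queue merge suggested here later landed in the standard library: heapq.merge (added in Python 2.6, after this thread) lazily merges sorted inputs with a heap of size k, giving the O(n log k) behaviour Paul describes. Combined with a groupby duplicate filter it makes a compact imerge; a sketch, not code from the thread:

```python
import heapq
from itertools import groupby

def imerge(*iterables):
    # heapq.merge lazily interleaves already-sorted inputs, so each
    # output element costs O(log k); groupby then drops the duplicates
    # that the Hamming construction requires.
    return (key for key, _group in groupby(heapq.merge(*iterables)))

print(list(imerge([1, 2, 3, 6], [1, 2, 3, 7], [1, 2, 3, 4, 5])))
# [1, 2, 3, 4, 5, 6, 7]
```

Both heapq.merge and groupby are lazy, so this version also works on the infinite streams the Hamming generator feeds it.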
Paul Rubin wrote:

> Francis Girard writes:
>> Thank you Nick and Steven for the idea of a more generic imerge.
>
> If you want to get fancy, the merge should use a priority queue (like
> in the heapsort algorithm) instead of a linear scan through the
> incoming iters, to find the next item to output.  That lowers the
> running time to O(n log k) instead of O(n*k), where k is the number
> of iterators and n is the length.

I looked at a heapq solution but didn't see any clean way of dealing
with multiple iterators having equal values.  The dict solution below
deals cleanly with that, since one key can be shared by any number of
iterators.  Extracting the minimum, and the associated iterables, is
fast, but the overall solution is still slower than the brute force
approach for the 3 hamming iterators.

    >>> def imerge(*iterables):
    ...     cache = {}
    ...     iterators = map(iter, iterables)
    ...     number = len(iterables)
    ...     exhausted = 0
    ...     while 1:
    ...         # First time through, look at all of them
    ...         # Subsequently, update only the minimum iterators
    ...         for it in iterators:
    ...             try:
    ...                 # Key each iterator by its next() value
    ...                 # Multiple iterators may share the same key
    ...                 cache.setdefault(it.next(), []).append(it)
    ...             except StopIteration:
    ...                 exhausted += 1
    ...                 if exhausted == number:
    ...                     raise StopIteration
    ...         # Get the lowest value
    ...         val = min(cache)
    ...         # and all the iterators that have that value
    ...         iterators = cache.pop(val)
    ...         yield val

    >>> list(imerge([1, 2, 3, 6], [1, 2, 3, 7], [1, 2, 3, 4, 5]))
    [1, 2, 3, 4, 5, 6, 7]

Michael
Jul 18 '05 #45
Le vendredi 28 Janvier 2005 08:27, Paul Rubin a écrit :

> Francis Girard writes:
>> Thank you Nick and Steven for the idea of a more generic imerge.
>
> If you want to get fancy, the merge should use a priority queue (like
> in the heapsort algorithm) instead of a linear scan through the
> incoming iters, to find the next item to output.  That lowers the
> running time to O(n log k) instead of O(n*k), where k is the number
> of iterators and n is the length.

The goal was only to show some FP constructs on small problems.  Here
the number of iterators is intended to be fairly small.

Otherwise, yes, you could exploit the fact that, at one loop
execution, you have already seen most of the elements at the previous
loop execution, storing them in some heap structure and therefore not
having to rescan the whole list of "already seen" iterator values at
each loop execution.

Thank you,

Francis Girard
Jul 18 '05 #46

### This discussion thread is closed

Replies have been disabled for this discussion.