Do you have real-world use cases for map's None fill-in feature?

I am evaluating a request for an alternate version of itertools.izip()
that has a None fill-in feature like the built-in map function:

    >>> map(None, 'abc', '12345') # demonstrate map's None fill-in feature
    [('a', '1'), ('b', '2'), ('c', '3'), (None, '4'), (None, '5')]
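
For readers trying this on Python 3, where map(None, ...) is no longer accepted, here is a minimal sketch of the same padding behaviour. It assumes itertools.zip_longest and its fillvalue parameter; the helper name map_none_fill is made up for illustration and is not part of any library.

```python
from itertools import zip_longest


def map_none_fill(*iterables):
    """Mimic Python 2's map(None, ...): pad the shorter inputs with None."""
    return list(zip_longest(*iterables, fillvalue=None))


pairs = map_none_fill('abc', '12345')
print(pairs)
```

The printed list should match the output shown above, with None standing in for the exhausted first input.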

The motivation is to provide a means for looping over all data elements
when the input lengths are unequal. The question of the day is whether
that is both a common need and a good approach to real-world problems.
The answer to the question can likely be found in results from other
programming languages or from real-world Python code that has used
map's None fill-in feature.

I scanned the docs for Haskell, SML, and Perl and found that the norm
for map() and zip() is to truncate to the shortest input or raise an
exception for unequal input lengths. I scanned the standard library
and found no instances where map's fill-in feature was used. Likewise,
I found no examples in all of the code I've ever written.

The history of Python's current zip() function serves as another
indicator that the proposal is weak. PEP 201 contemplated and rejected
the idea as one that likely had unintended consequences. In the years
since zip() was introduced in Py2.0, SourceForge has shown no requests
for a fill-in version of zip().

My request for readers of comp.lang.python is to search your own code
to see if map's None fill-in feature was ever used in real-world code
(not toy examples). I'm curious about the context, how it was used,
and what alternatives were rejected (i.e. did the fill-in feature
improve the code).

Also, I'm curious as to whether someone has seen a zip fill-in feature
employed to good effect in some other programming language, perhaps
LISP or somesuch?

Maybe a few real-world code examples and experiences from other
languages will shed light on the question of whether lock-step
iteration has meaning beyond the length of the shortest matching
elements. If ordinal position were considered as a record key, then
the proposal equates to a database-style outer join operation (where
data elements with unmatched keys are included) and order is
significant. Does an outer-join have anything to do with lock-step
iteration? Is this a fundamental looping construct or just a
theoretical wish-list item? IOW, does Python really need
itertools.izip_longest() or would that just become a distracting piece
of cruft?
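
To make the outer-join analogy concrete, here is a small Python 3 sketch that treats ordinal position as the record key and performs a full outer join on it. The function name positional_outer_join and its fill parameter are hypothetical; the final check assumes itertools.zip_longest as the fill-in zip being discussed.

```python
from itertools import zip_longest


def positional_outer_join(left, right, fill=None):
    """Full outer join of two sequences, using the index as the record key."""
    left_rows = dict(enumerate(left))
    right_rows = dict(enumerate(right))
    # Every key that appears on either side takes part in the join.
    keys = sorted(set(left_rows) | set(right_rows))
    return [(left_rows.get(k, fill), right_rows.get(k, fill)) for k in keys]


joined = positional_outer_join('abc', '12345')
print(joined)
print(joined == list(zip_longest('abc', '12345')))
```

Under these assumptions the join and the fill-in zip give the same rows, which is the equivalence the paragraph above describes.
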
Raymond Hettinger
P.S. FWIW, the OP's use case involved printing files in multiple
columns:

    for f, g in itertools.izip_longest(file1, file2, fillin_value=''):
        print '%-20s\t|\t%-20s' % (f.rstrip(), g.rstrip())

The alternative was straight-forward but not as terse:

    while 1:
        f = file1.readline()
        g = file2.readline()
        if not f and not g:
            break
        print '%-20s\t|\t%-20s' % (f.rstrip(), g.rstrip())
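
A runnable Python 3 version of the two-column use case might look like the sketch below. It assumes itertools.zip_longest with fillvalue='' in place of the proposed izip_longest(..., fillin_value=''), and uses in-memory io.StringIO objects as stand-ins for the two files; two_columns is a hypothetical helper name.

```python
import io
from itertools import zip_longest


def two_columns(left_lines, right_lines):
    """Yield one formatted row per line pair, padding the shorter input with ''."""
    for f, g in zip_longest(left_lines, right_lines, fillvalue=''):
        yield '%-20s\t|\t%-20s' % (f.rstrip(), g.rstrip())


# Stand-ins for file1 and file2; real code would pass open file objects.
file1 = io.StringIO("alpha\nbeta\n")
file2 = io.StringIO("one\ntwo\nthree\n")
rows = list(two_columns(file1, file2))
for row in rows:
    print(row)
```

The third row has an empty left column because the first input ran out, which is the situation the explicit while loop handles by hand.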

Jan 9 '06 #1
"Raymond Hettinger" <py****@rcn.com > writes:
> I am evaluating a request for an alternate version of itertools.izip()
> that has a None fill-in feature like the built-in map function:
>     >>> map(None, 'abc', '12345') # demonstrate map's None fill-in feature

I think finding different ways to write it was an entertaining
exercise but it's too limited in usefulness to become a standard
feature.

I do think some idiom ought to develop to allow checking whether an
iterator is empty, without consuming an item. Here's an idea:
introduce something like

    iterator = check_empty(iterator)

where check_empty would work roughly like (untested):

    def check_empty(iterator):
        iclass = iterator.__class__
        class buffered(iclass):
            def __init__(self):
                n = iter((self.next(),)) # might raise StopIteration
                self.__save = chain(n, self)
            def next(self):
                return self.__save.next()
            # all other operations are inherited from iclass

        return buffered(iterator)

The idea is you get back a new iterator which yields the same stream
and supports the same operations as the old one, if the old one is
non-empty. Otherwise it raises StopIteration.

There are some obvious problems with the above:

1) the new iterator should support all of the old one's attributes,
not just inherit its operations
2) In the case where the old iterator is already buffered, the
constructor should just peek at the lookahead instead of making
a new object. That means that checking an iterator multiple times
won't burn more and more memory.

Maybe there is some way of doing the above with metaclasses but I've
never been able to wrap my head around those.
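
For comparison, a much simpler Python 3 sketch of the same check avoids subclassing altogether: pull one item and chain it back in front of the rest. It does not address problem 1 above (the result is a plain itertools.chain object, so none of the original iterator's attributes survive), and it is only one possible reading of the check_empty idea.

```python
from itertools import chain


def check_empty(iterator):
    """Return an iterator over the same items, or raise StopIteration if empty."""
    iterator = iter(iterator)
    first = next(iterator)           # raises StopIteration when nothing is left
    return chain([first], iterator)  # put the consumed item back in front


stream = check_empty(iter('abc'))
print(list(stream))

try:
    check_empty(iter([]))
except StopIteration:
    print('empty')
```

Checking the same stream repeatedly with this version wraps it in one more chain object each time, which is the memory concern raised in problem 2.
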
Jan 9 '06 #2
On 7 Jan 2006 23:19:41 -0800, "Raymond Hettinger" <py****@rcn.com> wrote:
> I am evaluating a request for an alternate version of itertools.izip()
> that has a None fill-in feature like the built-in map function:
>     >>> map(None, 'abc', '12345') # demonstrate map's None fill-in feature
>     [('a', '1'), ('b', '2'), ('c', '3'), (None, '4'), (None, '5')]

I don't like not being able to supply my own sentinel. None is too common
a value. Hm, ... <bf warning> unless maybe it can also be a type that we can instantiate with
really-mean-it context level like None(5) ;-)

    >>> map(None(5), 'abc', '12345') # demonstrate map's None fill-in feature
    [('a', '1'), ('b', '2'), ('c', '3'), (None(5), '4'), (None(5), '5')]

But seriously, standard sentinels for "missing data" and "end of data" might be nice to have,
and to have produced in appropriate standard contexts. Singleton string subclass
instances "NOD" and "EOD"? Doesn't fit with EOF=='' though.
</bf warning>

> The motivation is to provide a means for looping over all data elements
> when the input lengths are unequal. The question of the day is whether
> that is both a common need and a good approach to real-world problems.
> The answer to the question can likely be found in results from other
> programming languages or from real-world Python code that has used
> map's None fill-in feature.

What about some semantics like my izip2 in
http://groups.google.com/group/comp....1ddb1f46?hl=en

(which doesn't even need a separate name, since it would be backwards compatible)

Also, what about factoring sequence-related stuff into being methods or attributes
of iter instances? And letting iter take multiple sequences or callable/sentinel pairs,
which could be a substitute for izip and then some? Methods could be called via a returned
iterator before or after the first .next() call, to control various features, such as
sentinel testing by 'is' instead of '==' for callable/sentinel pairs, or buffering n
steps of lookahead supported by a .peek(n) method defaulting to .peek(1), etc. etc.
The point being to have a place to implement universal sequence stuff.
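
As a rough illustration of the .peek(n) part of this idea, here is a Python 3 sketch of a standalone wrapper with n steps of lookahead. The class name Lookahead, and the choice to have peek() return a list of up to n items, are assumptions made for the example; nothing like this is a method of the built-in iter().

```python
from collections import deque


class Lookahead:
    """Iterator wrapper that can look ahead without consuming items."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffer = deque()   # items fetched early but not yet handed out

    def __iter__(self):
        return self

    def __next__(self):
        if self._buffer:
            return self._buffer.popleft()
        return next(self._it)

    def peek(self, n=1):
        """Return up to n upcoming items as a list, leaving them in place."""
        while len(self._buffer) < n:
            try:
                self._buffer.append(next(self._it))
            except StopIteration:
                break            # fewer than n items remain
        return list(self._buffer)[:n]


it = Lookahead('abcd')
print(it.peek())      # look at the next item
print(it.peek(2))     # look at the next two items
print(next(it))       # the peeked item is still delivered
print(list(it))
```

An empty list from peek() then doubles as the "is it exhausted?" test without losing an item.
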

> I scanned the docs for Haskell, SML, and Perl and found that the norm
> for map() and zip() is to truncate to the shortest input or raise an
> exception for unequal input lengths. I scanned the standard library
> and found no instances where map's fill-in feature was used. Likewise,
> I found no examples in all of the code I've ever written.
>
> The history of Python's current zip() function serves as another
> indicator that the proposal is weak. PEP 201 contemplated and rejected
> the idea as one that likely had unintended consequences. In the years
> since zip() was introduced in Py2.0, SourceForge has shown no requests
> for a fill-in version of zip().
>
> My request for readers of comp.lang.python is to search your own code
> to see if map's None fill-in feature was ever used in real-world code
> (not toy examples). I'm curious about the context, how it was used,
> and what alternatives were rejected (i.e. did the fill-in feature
> improve the code).
>
> Also, I'm curious as to whether someone has seen a zip fill-in feature
> employed to good effect in some other programming language, perhaps
> LISP or somesuch?

ISTM in general there is a chicken-egg problem where workarounds are easy.
I.e., the real question is how many workaround situations there are
that would have been handled conveniently with a builtin feature,
and _then_ to see whether the convenience would be worth enough.

> Maybe a few real-world code examples and experiences from other
> languages will shed light on the question of whether lock-step
> iteration has meaning beyond the length of the shortest matching
> elements. If ordinal position were considered as a record key, then
> the proposal equates to a database-style outer join operation (where
> data elements with unmatched keys are included) and order is
> significant. Does an outer-join have anything to do with lock-step
> iteration? Is this a fundamental looping construct or just a
> theoretical wish-list item? IOW, does Python really need
> itertools.izip_longest() or would that just become a distracting piece
> of cruft?

Even if there is little use for continuing in correct code, IWT getting
at the state of the iterator in an erroneous situation would be a benefit.
Being able to see the result of the last attempt at gathering tuple elements
could help. (I can see reasons for wanting variations of trying all streams
vs shortcutting on the first to exhaust though).

Regards,
Bengt Richter
Jan 10 '06 #3
[Bengt Richter]
> What about some semantics like my izip2 in
> http://groups.google.com/group/comp....1ddb1f46?hl=en
>
> (which doesn't even need a separate name, since it would be backwards compatible)
>
> Also, what about factoring sequence-related stuff into being methods or attributes
> of iter instances? And letting iter take multiple sequences or callable/sentinel pairs,
> which could be a substitute for izip and then some? Methods could be called via a returned
> iterator before or after the first .next() call, to control various features, such as
> sentinel testing by 'is' instead of '==' for callable/sentinel pairs, or buffering n
> steps of lookahead supported by a .peek(n) method defaulting to .peek(1), etc. etc.
> The point being to have a place to implement universal sequence stuff.

ISTM, these cures are worse than the disease ;-)

> Even if there is little use for continuing in correct code, IWT getting
> at the state of the iterator in an erroneous situation would be a benefit.
> Being able to see the result of the last attempt at gathering tuple elements
> could help. (I can see reasons for wanting variations of trying all streams
> vs shortcutting on the first to exhaust though).

On the one hand, that seems reasonable. On the other hand, I can't see
how to use it without snarling the surrounding code in which case it is
probably better to explicitly manage individual iterators within a
while loop.
Raymond

Jan 10 '06 #4
[Raymond Hettinger]
> I am evaluating a request for an alternate version of itertools.izip()
> that has a None fill-in feature like the built-in map function:
>     >>> map(None, 'abc', '12345') # demonstrate map's None fill-in feature

[Paul Rubin]
> I think finding different ways to write it was an entertaining
> exercise but it's too limited in usefulness to become a standard
> feature.

Makes sense.

> I do think some idiom ought to develop to allow checking whether an
> iterator is empty, without consuming an item. Here's an idea:
> introduce something like
>
>     iterator = check_empty(iterator)

There are so many varieties of iterator that it's probably not workable
to alter the iterator API for all of them. In any case, a broad
API change like this would need its own PEP.

> There are some obvious problems with the above:
>
> 1) the new iterator should support all of the old one's attributes,
> not just inherit its operations
> 2) In the case where the old iterator is already buffered, the
> constructor should just peek at the lookahead instead of making
> a new object. That means that checking an iterator multiple times
> won't burn more and more memory.
>
> Maybe there is some way of doing the above with metaclasses but I've
> never been able to wrap my head around those.

Metaclasses are unlikely to be of help because there are so many,
unrelated kinds of iterator -- most do not inherit from a common
parent.
Raymond

Jan 10 '06 #5
"Raymond Hettinger" <py****@rcn.com > writes:
> >     iterator = check_empty(iterator)
>
> There are so many varieties of iterator that it's probably not workable
> to alter the iterator API for all of them. In any case, a broad
> API change like this would need its own PEP.

The hope was that it wouldn't be an API change, but rather just a new
function dropped into the existing library, that could wrap any
existing iterator without having to change or break anything that's
already been written. Maybe the resulting iterator couldn't support
every operation, or maybe it could have a __getattr__ that delegates
everything except "next" to the wrapped iterator, or something. The
obvious implementation methods that I can see are very kludgy but
maybe something better is feasible. I defer to your knowledge about
this.
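
One way the __getattr__ delegation could look in Python 3 is sketched below. The class name NonEmpty is hypothetical, and the sketch has the kludginess mentioned above: a delegated method that reads from the underlying object (such as a file's read()) bypasses the one buffered item.

```python
import io


class NonEmpty:
    """Wrap an iterator that is known to hold at least one item."""

    def __init__(self, iterator):
        self._wrapped = iterator
        self._first = [next(iterator)]   # StopIteration propagates if empty

    def __iter__(self):
        return self

    def __next__(self):
        if self._first:
            return self._first.pop()     # hand back the item taken in __init__
        return next(self._wrapped)

    def __getattr__(self, name):
        # Only called for names not found on the wrapper itself.
        if name.startswith('_'):
            raise AttributeError(name)   # avoid recursing on private lookups
        return getattr(self._wrapped, name)


lines = NonEmpty(io.StringIO("a\nb\n"))
print(lines.closed)    # attribute lookup is delegated to the StringIO object
print(list(lines))     # iteration still starts from the first line
```

Nothing in the wrapped object changes, so existing code that only iterates or reads attributes should keep working under these assumptions.
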
Jan 10 '06 #6
[Raymond Hettinger]
> ...
> I scanned the docs for Haskell, SML, and Perl and found that the norm
> for map() and zip() is to truncate to the shortest input or raise an
> exception for unequal input lengths.
> ...
> Also, I'm curious as to whether someone has seen a zip fill-in feature
> employed to good effect in some other programming language, perhaps
> LISP or somesuch?

FYI, Common Lisp's `pairlis` function requires that its first two
arguments be lists of the same length. It's a strain to compare to
Python's zip() though, as the _intended_ use of `pairlis` is to add
new pairs to a Lisp association list. For that reason, `pairlis`
accepts an optional third argument; if present, this should be an
association list, and pairs from zipping the first two arguments are
prepended to it. Also for this reason, the _order_ in which pairs are
taken from the first two arguments isn't defined(!).

http://www.lispworks.com/documentati...li.htm#pairlis

For its intended special-purpose use, it wouldn't make sense to allow
arguments of different lengths.
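
For readers who do not know Lisp, a loose Python 3 analogue of the behaviour described above might look like this. It is only an illustration: raising ValueError on unequal lengths and keeping the pairs in input order are choices made for the sketch, not something the Lisp function promises.

```python
def pairlis(keys, data, alist=()):
    """Pair up keys and data, then prepend the pairs to an association list."""
    keys, data = list(keys), list(data)
    if len(keys) != len(data):
        # Assumption for this sketch: treat unequal lengths as an error.
        raise ValueError("keys and data must have the same length")
    return list(zip(keys, data)) + list(alist)


print(pairlis('ab', [1, 2], [('z', 0)]))
```
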
Jan 10 '06 #7
On 10 Jan 2006 00:47:36 -0800, "Raymond Hettinger" <py****@rcn.com> wrote:
> [Bengt Richter]
> > What about some semantics like my izip2 in
> > http://groups.google.com/group/comp....1ddb1f46?hl=en
> >
> > (which doesn't even need a separate name, since it would be backwards compatible)
> >
> > Also, what about factoring sequence-related stuff into being methods or attributes
> > of iter instances? And letting iter take multiple sequences or callable/sentinel pairs,
> > which could be a substitute for izip and then some? Methods could be called via a returned
> > iterator before or after the first .next() call, to control various features, such as
> > sentinel testing by 'is' instead of '==' for callable/sentinel pairs, or buffering n
> > steps of lookahead supported by a .peek(n) method defaulting to .peek(1), etc. etc.
> > The point being to have a place to implement universal sequence stuff.
>
> ISTM, these cures are worse than the disease ;-)

Are you reacting to my turgidly rambling post, or to

    >>> from ut.izip2 import izip2 as izip
    >>> it = izip('abc','12','ABCD')
    >>> for t in it: print t
    ...
    ('a', '1', 'A')
    ('b', '2', 'B')

Then after a backwards-compatible izip, if the iterator has
been bound, it can be used to continue, with sentinel substitution:

    >>> for t in it.rest('<sentinel>'): print t
    ...
    ('c', '<sentinel>', 'C')
    ('<sentinel>', '<sentinel>', 'D')

or optionally in sentinel substitution mode from the start:

    >>> for t in izip('abc','12','ABCD').rest('<sentinel>'): print t
    ...
    ('a', '1', 'A')
    ('b', '2', 'B')
    ('c', '<sentinel>', 'C')
    ('<sentinel>', '<sentinel>', 'D')

Usage-wise, this seems not too diseased to me, so I guess I want to make sure
this is what you were reacting to ;-)

(Implementation was just to hack together a working demo. I'm sure it can be improved upon ;-)

> > Even if there is little use for continuing in correct code, IWT getting
> > at the state of the iterator in an erroneous situation would be a benefit.
> > Being able to see the result of the last attempt at gathering tuple elements
> > could help. (I can see reasons for wanting variations of trying all streams
> > vs shortcutting on the first to exhaust though).
>
> On the one hand, that seems reasonable. On the other hand, I can't see
> how to use it without snarling the surrounding code in which case it is
> probably better to explicitly manage individual iterators within a
> while loop.

The above would seem to allow separation of concerns. I.e., if you care why
a normal iteration terminated, you can test after the fact. I.e., if all sequences
were the same length, the .rest() iterator will be empty. And if you don't care at
all about possible data, you can just try: it.rest().next() and catch StopIteration
to check.
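
Since the izip2 source lives behind the link and is not shown here, the following Python 3 sketch is a guess at one way to get the behaviour in the session above, not the actual implementation. The class name ZipWithRest and the _MISSING marker are hypothetical; the key assumption is that every stream is tried on each step so that a partially gathered row is kept for rest() rather than lost.

```python
_MISSING = object()  # marks a stream that had nothing left to give


class ZipWithRest:
    """zip() look-alike that remembers the partial row that stopped it."""

    def __init__(self, *iterables):
        self._iters = [iter(it) for it in iterables]
        self._pending = None   # partial row gathered when a stream ran out
        self._done = False

    def __iter__(self):
        return self

    def _gather(self):
        # Try every stream, rather than stopping at the first exhausted one.
        return [next(it, _MISSING) for it in self._iters]

    def __next__(self):
        if self._done or not self._iters:
            raise StopIteration
        row = self._gather()
        if any(value is _MISSING for value in row):
            self._done = True
            self._pending = row   # keep what was gathered for rest()
            raise StopIteration
        return tuple(row)

    def rest(self, fill=None):
        """Yield the remaining rows, substituting fill for missing items."""
        while True:
            if self._pending is not None:
                row, self._pending = self._pending, None
            else:
                row = self._gather()
            if all(value is _MISSING for value in row):
                return
            yield tuple(fill if value is _MISSING else value for value in row)


it = ZipWithRest('abc', '12', 'ABCD')
for t in it:
    print(t)
for t in it.rest('<sentinel>'):
    print(t)
```

With equal-length inputs the pending row contains only missing markers, so rest() yields nothing, matching the after-the-fact test described above.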

BTW, is there any rule against passing information with StopIteration?
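
On the mechanics of that question, a small Python 3 sketch: StopIteration is an ordinary exception class, so it can be raised with arguments, and a caller that drives the iterator by hand can read them back. A for loop discards the exception, so the extra information is only visible with explicit next() calls. The Countdown class is a made-up example.

```python
class Countdown:
    """Counts down from start to 1, then stops with a reason attached."""

    def __init__(self, start):
        self._n = start

    def __iter__(self):
        return self

    def __next__(self):
        if self._n <= 0:
            raise StopIteration('counter reached zero')
        self._n -= 1
        return self._n + 1


it = Countdown(2)
items = []
while True:
    try:
        items.append(next(it))
    except StopIteration as stop:
        reason = stop.args[0]   # the information passed along with the exception
        break

print(items, reason)
```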

Regards,
Bengt Richter
Jan 10 '06 #8
> There are so many varieties of iterator that it's probably not workable
> to alter the iterator API for all of them.

i always wondered if it can be implemented:

there are iterators which have length:

    >>> i = iter([1,2,3])
    >>> len(i)
    3

now isn't there a way to make this length inheritable?
eg. generators could have length in this case:

    >>> g = (x for x in [1,2,3])
    >>> # len(g) == len([1,2,3]) == 3

of course in special cases length would remain undefined:

    >>> f = (x for x in [1,2,3] if x>2)
    >>> # len(f) == ?

IMHO there are special cases when this is useful:

    L=list(it)

here if it has length, then list creation can be more efficient
(required memory is known in advance)
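
A related facility exists in Python 3: as far as I know, len() on a list iterator now raises TypeError, but operator.length_hint() asks an object for an estimate of how many items remain and falls back to a default when it has none. The sketch below assumes Python 3.4 or later; the -1 default is just a marker chosen for the example.

```python
from operator import length_hint

it = iter([1, 2, 3])
print(length_hint(it))        # list iterators report how many items remain
next(it)
print(length_hint(it))        # one item has been consumed

gen = (x for x in [1, 2, 3] if x > 2)
print(length_hint(gen, -1))   # no hint available, so the default comes back
```

The hint is only an estimate, so code using it to presize a container still has to cope with a different final length.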

nsz

Jan 10 '06 #9
