Proposed new PEP: print to expand generators


I would like to champion a proposed enhancement to Python. I describe the
basic idea below, in order to gauge community interest. Right now, it's only
an idea, and I'm sure there's room for improvement. And of course it's
possible there's some serious "gotcha" I've overlooked. Thus I welcome any
and all comments.

If there's some agreement that this proposal is worth further consideration
then I'll re-submit a formal document in official PEP format.

Regards

--jb

PEP -- EXTEND PRINT TO EXPAND GENERATORS

NUTSHELL

I propose that we extend the semantics of "print" such that if the object to
be printed is a generator then print would iterate over the resulting
sequence of sub-objects and recursively print each of the items in order.

E.g.,

print obj

under the proposal would behave something like

    import types

    if type( obj ) == types.GeneratorType:
        for item in obj:
            print item,     # recursive call
        print               # trailing newline
    else:
        print obj           # existing print behavior

I know this isn't precisely how print would work, but I intentionally
simplified the illustration to emphasize the intended change. Nevertheless,
several points above expressly are part of this proposal (subject to
discussion and possible revision):

Print behavior does not change EXCEPT in the case
that the object being printed is a generator.

Enumerated items are printed with intervening spaces
[alternatively: "" or "\n"].

An enumerated sequence ends with a newline
[alternatively: "" or " "].

Iterators themselves could return iterators as elements, and the proposed
change to print would recursively serialize any arbitrary "tree" of iterators.
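
A minimal sketch of the recursive expansion, written for Python 3 (where print is a function) so that it can be run today. The names flatten and expand_print are hypothetical helpers introduced here, not part of the proposal, and the space separator and trailing newline are just the defaults suggested above.

```python
import io
import sys
import types


def flatten(obj):
    """Yield the non-generator leaves of obj, depth first."""
    if isinstance(obj, types.GeneratorType):
        for item in obj:
            # A generator may itself yield generators: expand them in place.
            yield from flatten(item)
    else:
        yield obj


def expand_print(obj, stream=sys.stdout, sep=" ", end="\n"):
    """Write obj to stream, expanding any tree of generators."""
    first = True
    for leaf in flatten(obj):
        if not first:
            stream.write(sep)
        stream.write(str(leaf))
        first = False
    stream.write(end)


def inner():
    yield "b"
    yield "c"


def outer():
    yield "a"
    yield inner()   # nested generator
    yield 4


buf = io.StringIO()
expand_print(outer(), stream=buf)
print(repr(buf.getvalue()))   # 'a b c 4\n'
```

Non-generator objects pass through flatten unchanged, so expand_print("plain") behaves like an ordinary print.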

__str__() for complex user-defined objects then could return iterators, and
arbitrarily complex structures could be printed out without glomming
everything into a huge string -- only to throw it away in the end.

I expect we likely also would want to modify str() itself to embody this
serialization behavior. This additional change would support those cases
where one actually does want the single large string in the end, say, to
store into a UI widget. Still, the string would be constructed once at the
end, much more efficiently than by building a bunch of smaller, intermediate
strings.
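
As a rough illustration of "constructed once at the end", the sketch below (Python 3) collects the leaves of a generator tree and joins them in a single pass. expand_str is a hypothetical stand-in for the modified str(), and the cells and row generators are made up for the example.

```python
import types


def flatten(obj):
    """Yield the non-generator leaves of obj, depth first."""
    if isinstance(obj, types.GeneratorType):
        for item in obj:
            yield from flatten(item)
    else:
        yield obj


def expand_str(obj):
    """Build the final string with one join over all leaves."""
    return "".join(str(leaf) for leaf in flatten(obj))


def cells():
    yield "<td>"
    yield "x"
    yield "</td>"


def row():
    yield "<tr>"
    yield cells()
    yield cells()
    yield "</tr>"


result = expand_str(row())
print(result)   # <tr><td>x</td><td>x</td></tr>
```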

Then, in an abstract sense, we would not be changing print at all -- the new
semantics would be embodied in the change to str(). However, in practice,
we'd also want to modify print, as an important optimization for a more
common use case.

The present behavior (displaying, e.g., "<generator object at 0x016BA288>")
would still be available via

print repr( generator )

Note that this behavior presently results from all three of:

print generator
print str( generator )
print repr( generator )

So, this proposal merely ascribes useful new semantics to the first two of
three redundant language constructs.

MOTIVATION

With increasingly complex objects, the print representation naturally becomes
more complex. In particular, when an object consists of a collection of
sub-objects, it's natural for its string representation to be defined
recursively in terms of the sub-components' string representations, with some
further indication of how they're held together.

This is possible to do with the __str__ overload and the existing print
semantics. However, existing semantics require constructing many otherwise
unnecessary intermediate strings, and, as such, are grossly inefficient.
Worse, each intermediate string is generally the concatenation of several
previous intermediaries, so the volume of intermediate results steadily
increases throughout the conversion. Finally, the cost of string operations
is proportional to the length of the strings in question, so I expect the
overall cost increases significantly faster than in direct proportion to the
size of the output (i.e. it's non-linear).

E.g., instances of the following classes can become arbitrarily expensive to
print out:

    class HtmlTable( object ):
        # ...
        def __str__( self ):
            return ( "<table"
                + str( self.attr )
                + ">\n"
                + "".join([ str( row ) for row in self.head ])
                + "".join([ str( row ) for row in self.rows ])
                + "</table>\n" )

    class HtmlRow( object ):
        # ...
        def __str__( self ):
            return ( "<tr"
                + str( self.attr )
                + ">\n"
                + "".join([ str( cell ) for cell in self.cells ])
                + "</tr>\n" )

    class HtmlCell( object ):
        # ...
        def __str__( self ):
            return ( "<td"
                + str( self.attr )
                + ">\n"
                + "".join([ str( datum ) for datum in self.data ])
                + "</td>\n" )

Clearly, printing an arbitrary HtmlTable might require a LOT of unnecessary
string manipulation.
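
One way to put a number on that overhead is to tally every character that ends up in an intermediate string. The Python 3 sketch below uses nested lists of strings as a stand-in for the Html classes; render and the stats dictionary are hypothetical names introduced for the illustration.

```python
def render(node, stats):
    """Return the string for a nested list of strings, tallying every
    character that is copied into a joined result along the way."""
    if isinstance(node, str):
        return node
    text = "".join(render(child, stats) for child in node)
    stats["copied"] += len(text)
    return text


cell = ["<td>", "x", "</td>"]   # renders to 10 characters
row = [cell] * 3                # renders to 30 characters
table = [row] * 2               # renders to 60 characters

stats = {"copied": 0}
out = render(table, stats)
print(len(out), stats["copied"])   # 60 180
```

In this join-based rendering each character is copied once at every level of nesting above it, so a three-level structure copies three times the size of the final output.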

Using the proposed extension, the above example could be implemented instead
as something like:

    class HtmlTable( object ):
        # ...
        def __str__( self ):
            yield "<table"
            yield str( self.attr )
            yield ">\n"
            for row in self.head:
                yield str( row )
            for row in self.rows:
                yield str( row )
            yield "</table>\n"

    class HtmlRow( object ):
        # ...
        def __str__( self ):
            yield "<tr"
            yield str( self.attr )
            yield ">\n"
            for cell in self.cells:
                yield str( cell )
            yield "</tr>\n"

    class HtmlCell( object ):
        # ...
        def __str__( self ):
            yield "<td"
            yield str( self.attr )
            yield ">\n"
            for datum in self.data:
                yield str( datum )
            yield "</td>\n"

With the new extension, the individual bits of data are simply output in the
proper order, virtually eliminating unnecessary string operations, resulting
in a huge performance improvement. In fact, in the common case where all of
the leaf nodes are literal strings, then the entire HTML table (or page!)
could be written out without any string manipulation -- the existing strings
are simply written out from their present locations in memory!

Furthermore, there's greater clarity and economy of expression in the
proposed new method.
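
The streaming idea can be approximated today without changing the language. In current Python, str() raises TypeError if __str__ returns a generator, so the Python 3 sketch below uses a separate, hypothetical parts() method plus a small write_parts helper. Both names, the constructors, and the use of "yield from" (which did not exist when this was posted) are assumptions made for the example.

```python
import io


class HtmlCell:
    def __init__(self, data, attr=""):
        self.data = data
        self.attr = attr

    def parts(self):
        """Yield the pieces of this cell in output order."""
        yield "<td"
        yield self.attr
        yield ">\n"
        for datum in self.data:
            yield datum
        yield "</td>\n"


class HtmlRow:
    def __init__(self, cells, attr=""):
        self.cells = cells
        self.attr = attr

    def parts(self):
        """Yield the pieces of this row, delegating to each cell."""
        yield "<tr"
        yield self.attr
        yield ">\n"
        for cell in self.cells:
            yield from cell.parts()
        yield "</tr>\n"


def write_parts(node, stream):
    """Write each piece straight to the stream; no combined string is built."""
    for piece in node.parts():
        stream.write(str(piece))


row = HtmlRow([HtmlCell(["a"]), HtmlCell(["b"])])
buf = io.StringIO()
write_parts(row, buf)
print(buf.getvalue(), end="")
```

Passing sys.stdout or an open file as the stream writes the pieces out as they are produced.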

The primary motivation behind this proposal is to eliminate unnecessary
overhead, while retaining all the convenience of the existing semantics of
string representations of custom objects.

While it's not 100% backwards compatible, it assigns a new meaning to one of
several redundant and little-used existing language constructs.

ALTERNATIVES

In lieu of the proposed change, users can define their own auxiliary function
to generate the output. E.g.:

    import sys

    class HtmlTable( object ):
        # ...
        def pr( self, stream=sys.stdout ):
            print >>stream, "<table"
            print >>stream, str( self.attr )
            print >>stream, ">\n"
            for row in self.head:
                print >>stream, row
            for row in self.rows:
                print >>stream, row
            print >>stream, "</table>"

I myself have successfully used this technique in a variety of applications.

Pro:
Requires no changes to Python

Con:
The solution has to be "hand crafted" in each case,
subject to user errors.

The solution only works if the user expressly maintains the
convention throughout the class hierarchy.

The solution is not interchangeable with objects
from other authors.

///

Jun 4 '06 #1
In article <ma***************************************@python. org>,
"James J. Besemer" <jb@cascade-sys.com> wrote:
I propose that we extend the semantics of "print" such that if the object to
be printed is a generator then print would iterate over the resulting
sequence of sub-objects and recursively print each of the items in order.


I believe the functionality you desire already exists, or something very
close to it, in the pprint (pretty printer) module.
Jun 4 '06 #2
James J. Besemer wrote:
I propose that we extend the semantics of "print" such that if the
object to be printed is a generator then print would iterate over the
resulting sequence of sub-objects and recursively print each of the
items in order.


I don't feel like searching for the specific python-dev threads right
now, but something like this has been suggested before (I think with a
"%i" formatting code), and Guido felt strongly that the addition or
removal of a simple print statement shouldn't change the behavior of the
surrounding code.

Consider code like::

    items = get_generator_or_None()
    for item in items:
        do_something(item)

Now let's say I insert a debugging line like::

    items = get_generator_or_None()
    print "make sure this isn't None:", items
    for item in items:
        do_something(item)

My debugging line now just broke the rest of my code. That's not good.
The other reason I don't think this PEP should go forward (at least as
it is) is that Python 3000 is already going to turn the print statement
into a function (though the exact details of that function have not been
hashed out yet). So adding extra cruft to the print statement is kind
of wasted effort.
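
The debugging hazard described above comes from the fact that a generator can only be consumed once. A small Python 3 sketch, with a made-up numbers() generator and list() standing in for what an expanding print would have to do:

```python
def numbers():
    yield 1
    yield 2
    yield 3


items = numbers()

# An expanding print would have to iterate the generator to display it...
debug_view = list(items)

# ...which leaves nothing for the loop that follows.
remaining = list(items)

print(debug_view)   # [1, 2, 3]
print(remaining)    # []
```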

STeVe
Jun 4 '06 #3
James J. Besemer a écrit :
(snip)
PEP -- EXTEND PRINT TO EXPAND GENERATORS

NUTSHELL

I propose that we extend the semantics of "print" such that if the
object to be printed is a generator then print would iterate over the
resulting sequence of sub-objects and recursively print each of the
items in order.


Please, don't:

    from itertools import cycle
    def mygen():
        return cycle('this is a very bad idea'.split())
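
The iterator above never finishes, so any print that tried to expand it would loop forever. A bounded look at it (a Python 3 sketch; the limit of 8 and the name preview are arbitrary choices for the illustration) needs something like itertools.islice:

```python
from itertools import cycle, islice


def mygen():
    return cycle('this is a very bad idea'.split())


# Take only the first 8 items; iterating mygen() directly would never stop.
preview = list(islice(mygen(), 8))
print(" ".join(preview))   # this is a very bad idea this is
```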
Jun 4 '06 #4
