
# Conflicting needs for __init__ method

Here's an example of a problem that I've recently come up against for
the umpteenth time. It's not difficult to solve, but my previous
solutions have never seemed quite right, so I'm writing to ask whether
others have encountered this problem, and if so what solutions they've
come up with.

Suppose you're writing a class "Rational" for rational numbers. The
__init__ function of such a class has two quite different roles to
play. First, it's supposed to allow users of the class to create
Rational instances; in this role, __init__ is quite a complex beast.
It needs to allow arguments of various types---a pair of integers, a
single integer, another Rational instance, and perhaps floats, Decimal
instances, and suitably formatted strings. It has to validate the
input and/or make sure that suitable exceptions are raised on invalid
input. And when initializing from a pair of integers---a numerator
and denominator---it makes sense to normalize: divide both the
numerator and denominator by their greatest common divisor and make
sure that the denominator is positive.

But __init__ also plays another role: it's going to be used by the
other Rational arithmetic methods, like __add__ and __mul__, to return
new Rational instances. For this use, there's essentially no need for
any of the above complications: it's easy and natural to arrange that
the input to __init__ is always a valid, normalized pair of integers.
(You could include the normalization in __init__, but that's wasteful
when gcd computations are relatively expensive and some operations,
like negation or raising to a positive integer power, aren't going to
require it.) So for this use __init__ can be as simple as:

    def __init__(self, numerator, denominator):
        self.numerator = numerator
        self.denominator = denominator

So the question is: (how) do people reconcile these two quite
different needs in one function? I have two possible solutions, but
neither seems particularly satisfactory, and I wonder whether I'm
missing an obvious third way. The first solution is to add an
optional keyword argument "internal = False" to the __init__ routine,
and have all internal uses specify "internal = True"; then the
__init__ function can do all the complicated stuff when internal
is False, and just the quick initialization otherwise. But this seems
rather messy.
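
For concreteness, a minimal sketch of this first solution might look like the following. It assumes a current Python 3 (for math.gcd, which postdates this thread), and the validation shown is only a stand-in for the real argument handling.

```python
from math import gcd  # assumption: Python 3.5+


class Rational(object):
    def __init__(self, numerator, denominator=1, internal=False):
        if not internal:
            # "External" path: validate and normalize.
            if denominator == 0:
                raise ZeroDivisionError("denominator must be nonzero")
            g = gcd(numerator, denominator)
            if denominator < 0:
                g = -g  # forces the stored denominator to be positive
            numerator, denominator = numerator // g, denominator // g
        self.numerator = numerator
        self.denominator = denominator

    def __neg__(self):
        # Negating a normalized fraction keeps it normalized: skip the gcd.
        return Rational(-self.numerator, self.denominator, internal=True)
```

Nothing stops an outside caller from passing internal=True, which is part of why the flag feels messy.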

The other solution is to ask the users of the class not to use
Rational() to instantiate, but to use some other function
(createRational(), say) instead. Then __init__ is just the simple
method above, and createRational does all the complicated stuff to
figure out what the numerator and denominator should be and eventually
calls Rational(numerator, denominator) to create the instance. But
asking users not to call Rational() seems unnatural. Perhaps with
some metaclass magic one can ensure that "external" calls to
Rational() actually go through createRational() instead?

Of course, none of this really has anything to do with rational
numbers. There must be many examples of classes for which internal
calls to __init__, from other methods of the same class, require
minimal argument processing, while external calls require heavier and
possibly computationally expensive processing. What's the usual way
to solve this sort of problem?

Mark

Jan 14 '07 #1
Mark wrote:

[a lot of valid, but long concerns about types that return
an object of their own type from some of their methods]

I think that the best solution is to use an alternative constructor
in your arithmetic methods. That way users don't have to learn about
two different factories for the same type of objects. It also helps
with subclassing, because users have to override only a single method
if they want the results of arithmetic operations to be of their own
type.

For example, if your current implementation looks something like
this:

    class Rational(object):

        # a long __init__ or __new__ method

        def __add__(self, other):
            # compute new numerator and denominator
            return Rational(numerator, denominator)

        # other similar arithmetic methods

then you could use something like this instead:

    class Rational(object):

        # a long __init__ or __new__ method

        def __add__(self, other):
            # compute new numerator and denominator
            return self.result(numerator, denominator)

        # other similar arithmetic methods

        @staticmethod
        def result(numerator, denominator):
            """
            we don't use a classmethod, because users should
            explicitly override this method if they want to
            change the return type of arithmetic operations.
            """
            result = object.__new__(Rational)
            result.numerator = numerator
            result.denominator = denominator
            return result

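A cut-down, runnable version of the same idea is sketched below, to check that result() really does bypass __init__. The `validated` attribute is a hypothetical marker added only to show which path created an instance, and __neg__ stands in for the arithmetic methods.

```python
class Rational(object):
    def __init__(self, numerator, denominator=1):
        # Stand-in for the long, validating constructor.
        if denominator == 0:
            raise ZeroDivisionError("denominator must be nonzero")
        self.numerator = numerator
        self.denominator = denominator
        self.validated = True  # hypothetical marker: set only by __init__

    def __neg__(self):
        return self.result(-self.numerator, self.denominator)

    @staticmethod
    def result(numerator, denominator):
        # object.__new__ allocates the instance without calling __init__.
        obj = object.__new__(Rational)
        obj.numerator = numerator
        obj.denominator = denominator
        return obj


half = Rational(1, 2)
minus_half = -half
```

Here half went through __init__ and carries the marker, while minus_half was built by result() and does not.
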
Hope this helps,
Ziga

Jan 15 '07 #2
At Sunday 14/1/2007 20:32, di******@gmail.com wrote:
>Of course, none of this really has anything to do with rational
>numbers. There must be many examples of classes for which internal
>calls to __init__, from other methods of the same class, require
>minimal argument processing, while external calls require heavier and
>possibly computationally expensive processing. What's the usual way
>to solve this sort of problem?

In some cases you can differentiate by the type or number of
arguments, so __init__ is the only constructor used.
In other cases this can't be done; then you can provide different
constructors (usually class methods or static methods) with different
names, of course. See the datetime class, for example. It has many
constructors (today(), fromtimestamp(), fromordinal()...), all of them
class methods; it is a C module.

For a slightly different approach, see the TarFile class (this is a
Python module). It has many constructors (classmethods) like taropen,
gzopen, etc. but there is a single public constructor, the open()
classmethod. open() is a factory, dispatching to other constructors
depending on the combination of arguments used.
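
As a quick illustration of the datetime constructors mentioned above (a sketch, assuming a current Python 3 for datetime.timezone), each one is called on the class rather than on an instance and returns a new instance:

```python
import datetime

# Alternative constructors: classmethods with different names.
first_day = datetime.date.fromordinal(1)  # day 1 of the proleptic Gregorian calendar
epoch = datetime.datetime.fromtimestamp(0, tz=datetime.timezone.utc)
today = datetime.date.today()

# datetime.datetime is a subclass of datetime.date, so all three are dates.
for obj in (first_day, epoch, today):
    assert isinstance(obj, datetime.date)
```
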
--
Gabriel Genellina
Softlab SRL


Jan 15 '07 #3
On Sun, 14 Jan 2007 15:32:35 -0800, dickinsm wrote:
>Suppose you're writing a class "Rational" for rational numbers. The
>__init__ function of such a class has two quite different roles to
>play. First, it's supposed to allow users of the class to create
>Rational instances; in this role, __init__ is quite a complex beast.
>It needs to allow arguments of various types---a pair of integers, a
>single integer, another Rational instance, and perhaps floats, Decimal
>instances, and suitably formatted strings. It has to validate the
>input and/or make sure that suitable exceptions are raised on invalid
>input. And when initializing from a pair of integers---a numerator
>and denominator---it makes sense to normalize: divide both the
>numerator and denominator by their greatest common divisor and make
>sure that the denominator is positive.
>
>But __init__ also plays another role: it's going to be used by the
>other Rational arithmetic methods, like __add__ and __mul__, to return
>new Rational instances. For this use, there's essentially no need for
>any of the above complications: it's easy and natural to arrange that
>the input to __init__ is always a valid, normalized pair of integers.
>(You could include the normalization in __init__, but that's wasteful

Is it really? Have you measured it or are you guessing? Is it more or less
wasteful than any other solution?

>when gcd computations are relatively expensive and some operations,
>like negation or raising to a positive integer power, aren't going to
>require it.) So for this use __init__ can be as simple as:
>
>    def __init__(self, numerator, denominator):
>        self.numerator = numerator
>        self.denominator = denominator
>
>So the question is: (how) do people reconcile these two quite
>different needs in one function? I have two possible solutions, but
>neither seems particularly satisfactory, and I wonder whether I'm
>missing an obvious third way. The first solution is to add an
>optional keyword argument "internal = False" to the __init__ routine,
>and have all internal uses specify "internal = True"; then the
>__init__ function can do all the complicated stuff when internal
>is False, and just the quick initialization otherwise. But this seems
>rather messy.

Worse than messy. I guarantee you that your class' users will,
deliberately or accidentally, end up calling Rational(10,30, internal=True)
and you'll spend time debugging mysterious cases of instances not being
normalised when they should be.

>The other solution is to ask the users of the class not to use
>Rational() to instantiate, but to use some other function

That's ugly! And they won't listen.

>Of course, none of this really has anything to do with rational
>numbers. There must be many examples of classes for which internal
>calls to __init__, from other methods of the same class, require
>minimal argument processing, while external calls require heavier and
>possibly computationally expensive processing. What's the usual way
>to solve this sort of problem?

    class Rational(object):
        def __init__(self, numerator, denominator):
            print("lots of heavy processing here...")
            # processing ints, floats, strings, special case arguments,
            # blah blah blah...
            self.numerator = numerator
            self.denominator = denominator
        def __copy__(self):
            cls = self.__class__
            obj = cls.__new__(cls)
            obj.numerator = self.numerator
            obj.denominator = self.denominator
            return obj
        def __neg__(self):
            obj = self.__copy__()
            obj.numerator *= -1
            return obj

I use __copy__ rather than copy for the method name, so that the copy
module will do the right thing.
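
A small check of that claim, as a sketch: the `init_calls` counter is a hypothetical addition, used only to observe when __init__ runs. copy.copy() looks up __copy__ on the class, so neither the copy nor the negation goes back through the heavy initialiser.

```python
import copy


class Rational(object):
    init_calls = 0  # hypothetical counter, only for observation

    def __init__(self, numerator, denominator):
        Rational.init_calls += 1  # stands in for the heavy processing
        self.numerator = numerator
        self.denominator = denominator

    def __copy__(self):
        cls = self.__class__
        obj = cls.__new__(cls)  # allocate without calling __init__
        obj.numerator = self.numerator
        obj.denominator = self.denominator
        return obj

    def __neg__(self):
        obj = self.__copy__()
        obj.numerator *= -1
        return obj


half = Rational(1, 2)
clone = copy.copy(half)  # dispatches to Rational.__copy__
minus_half = -half
```

After these three lines __init__ has run exactly once, for half.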

--
Steven D'Aprano

Jan 15 '07 #4
On Mon, 15 Jan 2007 14:43:55 +1100, Steven D'Aprano wrote:
>Of course, none of this really has anything to do with rational
>numbers. There must be many examples of classes for which internal
>calls to __init__, from other methods of the same class, require
>minimal argument processing, while external calls require heavier and
>possibly computationally expensive processing. What's the usual way
>to solve this sort of problem?

>    class Rational(object):
>        def __init__(self, numerator, denominator):
>            print("lots of heavy processing here...")
>            # processing ints, floats, strings, special case arguments,
>            # blah blah blah...
>            self.numerator = numerator
>            self.denominator = denominator
>        def __copy__(self):
>            cls = self.__class__
>            obj = cls.__new__(cls)
>            obj.numerator = self.numerator
>            obj.denominator = self.denominator
>            return obj
>        def __neg__(self):
>            obj = self.__copy__()
>            obj.numerator *= -1
>            return obj

Here's a variation on that which is perhaps better suited for objects with
lots of attributes:

        def __copy__(self):
            cls = self.__class__
            obj = cls.__new__(cls)
            obj.__dict__.update(self.__dict__)  # copy everything quickly
            return obj

--
Steven D'Aprano

Jan 15 '07 #5
On Jan 14, 7:49 pm, "Ziga Seilnacht" <ziga.seilna...@gmail.com> wrote:
>Mark wrote:
>
>[a lot of valid, but long concerns about types that return
>an object of their own type from some of their methods]
>
>I think that the best solution is to use an alternative constructor
>in your arithmetic methods. That way users don't have to learn about
>two different factories for the same type of objects. It also helps
>with subclassing, because users have to override only a single method
>if they want the results of arithmetic operations to be of their own
>type.

Aha. I was wondering whether __new__ might appear in the solution
somewhere, but couldn't figure out how that would work; I'd previously
only ever used it for its advertised purpose of subclassing immutable
types.

>Hope this helps,

It helps a lot. Thank you.

Mark

Jan 15 '07 #6

On Jan 14, 10:43 pm, Steven D'Aprano
<s...@REMOVEME.cybersource.com.au> wrote:
>On Sun, 14 Jan 2007 15:32:35 -0800, dickinsm wrote:
>>(You could include the normalization in __init__, but that's wasteful
>
>Is it really? Have you measured it or are you guessing? Is it more or less
>wasteful than any other solution?

Just guessing :). But when summing the reciprocals of the first 2000
positive integers, for example, with:

    sum((Rational(1, n) for n in range(1, 2001)), Rational(0))

the profile module tells me that the whole calculation takes 8.537
seconds, 8.142 of which are spent in my gcd() function. So it seemed
sensible to eliminate unnecessary calls to gcd() when there's an easy
way to do so.
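
A rough way to see where the gcd() calls come from, without the profiler, is to count them. The sketch below is not my actual class: it assumes positive denominators, uses math.gcd from a current Python 3, and only implements __add__. `counted_gcd` and `gcd_calls` are hypothetical names for the instrumentation.

```python
from math import gcd  # assumption: Python 3.5+

gcd_calls = 0


def counted_gcd(a, b):
    """Wrap gcd so that every call is tallied."""
    global gcd_calls
    gcd_calls += 1
    return gcd(a, b)


class Rational(object):
    def __init__(self, numerator, denominator=1):
        # Normalizes unconditionally; assumes denominator > 0.
        g = counted_gcd(numerator, denominator)
        self.numerator = numerator // g
        self.denominator = denominator // g

    def __add__(self, other):
        return Rational(
            self.numerator * other.denominator + other.numerator * self.denominator,
            self.denominator * other.denominator,
        )


N = 50
total = sum((Rational(1, n) for n in range(1, N + 1)), Rational(0))
```

With every instance going through the normalizing __init__, gcd runs once per term created, once for Rational(0), and once per addition. The calls for Rational(1, n) are of the kind that is known to be redundant in advance.
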
>    def __copy__(self):
>        cls = self.__class__
>        obj = cls.__new__(cls)
>        obj.numerator = self.numerator
>        obj.denominator = self.denominator
>        return obj

Thank you for this.

Mark

Jan 15 '07 #7
di******@gmail.com writes:
>Suppose you're writing a class "Rational" for rational numbers. The
>__init__ function of such a class has two quite different roles to
>play.

That should be your first clue to question whether you're actually
needing separate functions, rather than trying to force one function
to do many different things.

>First, it's supposed to allow users of the class to create Rational
>instances; in this role, __init__ is quite a complex beast.

The __init__ function isn't the "constructor" you find in other
languages. Its only purpose is to initialise an already-created
instance, not make a new one.

>It needs to allow arguments of various types---a pair of integers, a
>single integer, another Rational instance, and perhaps floats, Decimal
>instances, and suitably formatted strings. It has to validate the
>input and/or make sure that suitable exceptions are raised on invalid
>input. And when initializing from a pair of integers---a numerator
>and denominator---it makes sense to normalize: divide both the
>numerator and denominator by their greatest common divisor and make
>sure that the denominator is positive.

All of this points to having a separate constructor function for each
of the inputs you want to handle.

>But __init__ also plays another role: it's going to be used by the
>other Rational arithmetic methods, like __add__ and __mul__, to
>return new Rational instances.

No, it won't; those methods won't "use" the __init__ method. They will
use a constructor, and __init__ is not a constructor (though it does
get *called by* the construction process).

>For this use, there's essentially no need for any of the above
>complications: it's easy and natural to arrange that the input to
>__init__ is always a valid, normalized pair of integers.

Therefore, make your __init__ handle just the default, natural case
you identify.

    class Rational(object):
        def __init__(self, numerator, denominator):
            self.numerator = numerator
            self.denominator = denominator

>So the question is: (how) do people reconcile these two quite
>different needs in one function?

By avoiding the tendency to crowd a single function with disparate
functionality. Every function should do one narrowly-defined task and
no more.

        @classmethod
        def from_string(cls, input):
            (n, d) = parse_elements_of_string_input(input)
            return cls(n, d)

        @classmethod
        def from_int(cls, input):
            return cls(input, 1)

        @classmethod
        def from_rational(cls, input):
            (n, d) = (input.numerator, input.denominator)
            return cls(n, d)

        def __add__(self, other):
            result = perform_addition(self, other)
            return result

        def __sub__(self, other):
            result = perform_subtraction(self, other)
            return result

Put whatever you need in 'parse_elements_of_string_input',
'perform_addition', 'perform_subtraction', etc.: either the calculation
itself, if simple, or a call to a function that can contain the
complexity.

Use Python's exception system to avoid error-checking all over the
place; if there's a problem with the subtraction, for instance, let
the exception propagate up to the code that gave bad input.

The alternate constructors are decorated as '@classmethod' since they
won't be called as instance methods, but rather:

    foo = Rational.from_string("355/113")
    bar = Rational.from_int(17)
    baz = Rational.from_rational(foo)
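
A self-contained sketch of these alternate constructors follows. The string parsing is an assumption (it accepts "n/d" or a bare "n") standing in for the placeholder parsing function, and MyRational is a hypothetical subclass. Because each classmethod builds its result with cls(...), a subclass gets instances of its own type for free.

```python
class Rational(object):
    def __init__(self, numerator, denominator):
        self.numerator = numerator
        self.denominator = denominator

    @classmethod
    def from_string(cls, text):
        # Assumed format: "n/d" or a bare integer "n".
        n, _, d = text.partition("/")
        return cls(int(n), int(d) if d else 1)

    @classmethod
    def from_int(cls, value):
        return cls(value, 1)

    @classmethod
    def from_rational(cls, other):
        return cls(other.numerator, other.denominator)


class MyRational(Rational):
    """Hypothetical subclass, to show what cls buys you."""


foo = Rational.from_string("355/113")
bar = MyRational.from_int(17)
baz = Rational.from_rational(foo)
```
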

--
\ "If you can't beat them, arrange to have them beaten." -- |
`\ George Carlin |
_o__) |
Ben Finney

Jan 15 '07 #8
Steven D'Aprano wrote:
>    class Rational(object):
>        def __init__(self, numerator, denominator):
>            print("lots of heavy processing here...")
>            # processing ints, floats, strings, special case arguments,
>            # blah blah blah...
>            self.numerator = numerator
>            self.denominator = denominator
>        def __copy__(self):
>            cls = self.__class__
>            obj = cls.__new__(cls)
>            obj.numerator = self.numerator
>            obj.denominator = self.denominator
>            return obj
>        def __neg__(self):
>            obj = self.__copy__()
>            obj.numerator *= -1
>            return obj
>
>Here's a variation on that which is perhaps better suited for objects with
>lots of attributes:
>
>        def __copy__(self):
>            cls = self.__class__
>            obj = cls.__new__(cls)
>            obj.__dict__.update(self.__dict__)  # copy everything quickly
>            return obj

I recently had to do something similar for my ORM, where a
user-instantiated object gets expensive default values, but the back
end just overwrites those defaults when "resurrecting" objects, so it
shouldn't pay the price. However (and this is the tricky part), I also
wanted to allow subclasses to extend the __init__ method, so just using
cls.__new__(cls) didn't quite go far enough. Here's what I ended up
with [1]:

    def __init__(self, **kwargs):
        self.sandbox = None

        cls = self.__class__
        if self._zombie:
            # This is pretty tricky, and deserves some detailed explanation.
            # When normal code creates an instance of this class, then the
            # expensive setting of defaults below is performed automatically.
            # However, when a DB recalls a Unit, we have its entire properties
            # dict already and should skip defaults in the interest of speed.
            # Therefore, a DB which recalls a Unit can write:
            #     unit = UnitSubClass.__new__(UnitSubClass)
            #     unit._zombie = True
            #     unit.__init__()
            #     unit._properties = {...}
            # instead of:
            #     unit = UnitSubClass()
            #     unit._properties = {...}
            # If done this way, the caller must make CERTAIN that all of
            # the values in _properties are set, and must call cleanse().
            self._properties = dict.fromkeys(cls.properties, None)
        else:
            # Copy the class properties into self._properties,
            # setting each value to the UnitProperty.default.
            self._properties = dict([(k, getattr(cls, k).default)
                                     for k in cls.properties])

        # Make sure we cleanse before assigning properties from kwargs,
        # or the new unit won't get saved if there are no further changes.
        self.cleanse()

        for k, v in kwargs.items():
            setattr(self, k, v)

The _zombie argument is therefore a flag which allows you to keep the
initialization code inside __init__ (rather than repeating it inside
every method).
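
Stripped of the ORM details, the pattern can be sketched as below. This is a simplified stand-in, not the dejavu code: `defaults`, `resurrect` and the `Tagged` subclass are hypothetical names, and the "expensive" defaults are just a dict lookup here.

```python
class Unit(object):
    _zombie = False  # class-level default: normal construction
    properties = ("name", "size")  # hypothetical property names
    defaults = {"name": "", "size": 0}  # stand-in for expensive defaults

    def __init__(self, **kwargs):
        if self._zombie:
            # The back end will overwrite every value: skip the defaults.
            self._properties = dict.fromkeys(self.properties)
        else:
            self._properties = dict((k, self.defaults[k]) for k in self.properties)
        for k, v in kwargs.items():
            self._properties[k] = v


class Tagged(Unit):
    def __init__(self, **kwargs):
        Unit.__init__(self, **kwargs)
        self.tags = []  # subclass state that resurrected objects need too


def resurrect(cls, stored):
    """Hypothetical back-end helper: rebuild a unit from stored values."""
    unit = cls.__new__(cls)
    unit._zombie = True  # instance attribute shadows the class default
    unit.__init__()  # still runs any subclass __init__ extensions
    unit._properties = dict(stored)
    return unit


fresh = Unit(size=3)
old = resurrect(Tagged, {"name": "a", "size": 1})
```

Because resurrect() still calls __init__, the Tagged extension runs and old.tags exists, which plain cls.__new__(cls) alone would not give you.
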
Robert Brewer
System Architect
Amor Ministries
fu******@amor.org

[1] http://projects.amor.org/dejavu/brow.../units.py#l552

Jan 16 '07 #9
On Tue, 16 Jan 2007 08:54:09 +1100, Ben Finney wrote:
>di******@gmail.com writes:
>>Suppose you're writing a class "Rational" for rational numbers. The
>>__init__ function of such a class has two quite different roles to
>>play.
>
>That should be your first clue to question whether you're actually
>needing separate functions, rather than trying to force one function
>to do many different things.

[snip]

>All of this points to having a separate constructor function for each
>of the inputs you want to handle.

[snip]

>The alternate constructors are decorated as '@classmethod' since they
>won't be called as instance methods, but rather:
>
>    foo = Rational.from_string("355/113")
>    bar = Rational.from_int(17)
>    baz = Rational.from_rational(foo)

That's one way of looking at it. Another way is to consider that __init__
has one function: it turns something else into a Rational. Why should the
public interface of "make a Rational" depend on what you are making it
from?

Think of built-ins like str() and int(). I suggest that people would be
*really* unhappy if we needed to do this:

    str.from_int(45)
    str.from_float(45.0)
    str.from_list([45, 45.5])
etc.

Why do you consider that Rationals are different from built-ins in this
regard?

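>    def __add__(self, other):
>        result = perform_addition(self, other)
>        return result

But that could just as easily be written as:

    def __add__(self, other):
        return perform_addition(self, other)
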
which then raises the question, why delegate the addition out of __add__
to perform_addition? There are at least three distinct costs: a larger
namespace, an extra function to write tests for, and an extra method
call for every addition. What benefit do you gain? Why not put the
addition code directly in __add__?

Just creating an extra layer to contain the complexity of rational
addition doesn't gain you anything -- you haven't done anything to reduce
the complexity of the problem, but you have an extra layer to deal with.

And you still haven't dealt with another problem: coercions from other
types. If you want to be able to add Rationals to (say) floats, ints and
Rationals without having to explicitly convert them then you need some
method of dispatching to different initialiser methods. (You should be
asking whether you really do need this, but let's assume you do.)

Presumably you create a method Rational.dispatch_to_initialisers that
takes any object and tries each initialiser in turn until one succeeds,
then returns the resultant Rational. Or you could just call it
Rational.__init__.

This doesn't mean that __init__ must or even should contain all the
initialisation logic -- it could dispatch to from_string, from_float and
other methods. But the caller doesn't need to call the individual
initialisers -- although of course they are public methods and can be
called if you want -- since __init__ will do the right thing.
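
As a sketch of that arrangement (assuming Python 3, where str is the only string type; `_raw` is a hypothetical helper, and float, Decimal and normalization are left out), __init__ only dispatches on type and copies the result, while the named initialisers stay public:

```python
class Rational(object):
    def __init__(self, value, denominator=1):
        # Dispatch only; the per-type logic lives in the methods below.
        # Sketch simplification: denominator is ignored unless value is an int.
        if isinstance(value, Rational):
            source = value
        elif isinstance(value, str):
            source = self.from_string(value)
        elif isinstance(value, int):
            source = self._raw(value, denominator)
        else:
            raise TypeError("cannot make a Rational from %r" % (value,))
        self.numerator = source.numerator
        self.denominator = source.denominator

    @classmethod
    def _raw(cls, numerator, denominator):
        # Hypothetical helper: builds an instance without re-entering __init__.
        obj = cls.__new__(cls)
        obj.numerator = numerator
        obj.denominator = denominator
        return obj

    @classmethod
    def from_string(cls, text):
        # Assumed format: "n/d" or a bare integer "n".
        n, _, d = text.partition("/")
        return cls._raw(int(n), int(d) if d else 1)


a = Rational("355/113")
b = Rational(3)
c = Rational(Rational(1, 2))
d = Rational.from_string("2/4")  # the named initialiser is still callable
```
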
--
Steven D'Aprano

Jan 16 '07 #10
