Bytes IT Community

Why less emphasis on private data?

Coming from a C++ / C# background, the lack of emphasis on private data
seems weird to me. I've often found wrapping private data useful to
prevent bugs and enforce error checking.

It appears to me (perhaps wrongly) that Python prefers to leave class
data public. What is the logic behind that choice?

Thanks for any insight.

Jan 7 '07 #1
63 Replies


ti********@gmail.com wrote:
> Coming from a C++ / C# background, the lack of emphasis on private data
> seems weird to me. I've often found wrapping private data useful to
> prevent bugs and enforce error checking.
>
> It appears to me (perhaps wrongly) that Python prefers to leave class
> data public. What is the logic behind that choice?
>
> Thanks for any insight.

Python doesn't prefer public data in classes. It leaves the choice to
the programmer. You can define your own private instance variables (or
functions) by using a '__' prefix:

example:

    class Foo:
        def __init__(self, data):
            self.__data = data

        def get_data(self):
            return self.__data

    >>> f = Foo('bar')
    >>> f.__data
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: Foo instance has no attribute '__data'
    >>> f.get_data()
    'bar'
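
For what it's worth, the '__' prefix in the example above does not make the attribute unreachable: inside a class body the compiler rewrites such names (name mangling). The following is only a minimal sketch in current Python 3 syntax, reusing the Foo class from the post; the helper variable names are made up for illustration.

```python
class Foo:
    def __init__(self, data):
        self.__data = data  # stored on the instance as _Foo__data

    def get_data(self):
        return self.__data  # compiled as self._Foo__data


f = Foo('bar')

# Plain access from outside fails: no attribute is literally named '__data'.
try:
    f.__data
    plain_access_failed = False
except AttributeError:
    plain_access_failed = True

# The mangled name is an ordinary attribute and stays reachable.
mangled_value = f._Foo__data
instance_attrs = vars(f)
```

So the prefix guards against accidental access rather than deliberate access.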

Jan 7 '07 #2

ti********@gmail.com wrote:
> Coming from a C++ / C# background, the lack of emphasis on private data
> seems weird to me. I've often found wrapping private data useful to
> prevent bugs and enforce error checking.
>
> It appears to me (perhaps wrongly) that Python prefers to leave class
> data public. What is the logic behind that choice?

Private data is a convention, not a strict enforcement, in both Java
and C++.

Depending on your C++ compiler, a simple

#define private public

will give you access to all the data you want. Besides, casting to a
void* pointer and accessing the private parts directly isn't rocket
science.

The same applies to Java: for whatever reason (I presume
serialization), you can access private fields via reflection.

In Python, private members are usually declared using a single or double
leading underscore. The basic idea is: "if you tamper with this, you've
been warned". Which is the way coding between consenting adults should be.

To be honest: I've stumbled over more cases of unnecessary hoops to jump
through due to private declarations than bugs caused by me exploiting
things the compiler told me not to tamper with.

Summary: not important, forget about it, enjoy Python!

Diez
Jan 7 '07 #3

> Coming from a C++ / C# background, the lack of emphasis on private
> data seems weird to me.

Python doesn't try to protect you from the authors of the code you use. You
should be intelligent enough to use it wisely. On the flip side, the lack
of truly private data and methods means the original author of a piece of
code doesn't need to anticipate all future uses to which the code will be
put. Here are a couple of items along the lines of "we're all adults here".

http://spyced.blogspot.com/2005/06/a...ok-python.html
http://www.mail-archive.com/tu***@py.../msg17806.html

Skip
Jan 7 '07 #4

On 6 Jan 2007 16:07:05 -0800, ti********@gmail.com <ti********@gmail.com> wrote:
> Coming from a C++ / C# background, the lack of emphasis on private data
> seems weird to me. I've often found wrapping private data useful to
> prevent bugs and enforce error checking.
>
> It appears to me (perhaps wrongly) that Python prefers to leave class
> data public. What is the logic behind that choice?

Google for "python for consenting adults".

Or ask yourself the opposite question. Why do C++ and C# prefer more
private data? It is given that emphasizing private data
(encapsulation) leads to more internal complexity and more lines of
code because you have to write getters and setters and stuff. With
that in mind, why do you think that data encapsulation makes code less
error-prone? Can you prove it? Or do you have anecdotal evidence of
where data encapsulation saved your ass?

IMHO, "data hiding is good" is one of those ideas that have been
repeated so much that virtually everyone thinks it is true. But
Python proves that it isn't necessarily so.

--
mvh Björn
Jan 7 '07 #5

On 2007-01-07 01:54, BJörn Lindqvist wrote:
> Google for "python for consenting adults".
>
> Or ask yourself the opposite question. Why do C++ and C# prefer more
> private data? It is given that emphasizing private data
> (encapsulation) leads to more internal complexity and more lines of
> code because you have to write getters and setters and stuff. With
> that in mind, why do you think that data encapsulation makes code less
> error-prone? Can you prove it? Or do you have anecdotal evidence of
> where data encapsulation saved your ass?
>
> IMHO, "data hiding is good" is one of those ideas that have been
> repeated so much that virtually everyone thinks it is true. But
> Python proves that it isn't necessarily so.

I think attributes (callable or not) which relate to the
abstraction of the class should be "public" (special methods
or names without a leading underscore). Attributes that are there
for a specific implementation of the abstraction should be "private".

The internal implementation of a class is changed in incompatible
ways more often than the abstraction, so distinguishing
between a public and a private interface will probably save
you from reworking the clients of a class if you stick to the
public interface. It will also make the client code easier to
understand.

Admittedly, there are special cases where you want to access
private attributes, e.g. debugging; that's ok.

In summary, the distinction between public and non-public
attributes IMHO makes sense, but I don't think that the
distinction should be enforced by the language as in C++
or Java.

Stefan
Jan 7 '07 #6

"BJörn Lindqvist" <bj*****@gmail.com> writes:
> It is given that emphasizing private data (encapsulation) leads to
> more internal complexity and more lines of code because you have to
> write getters and setters and stuff.

You can have public variables in Java if you choose to. Writing
private variables with public setters and getters is just a style choice.

> Or do you have anecdotal evidence of where data encapsulation saved
> your ass?

There are certainly applications that can't live without it, like
browser applets.

As for it saving my ass, there's no way to know, it's like asking
whether garbage collection has saved my ass. Yes I've had plenty of
pointer related bugs in C programs that don't happen in GC'd
languages, so GC in that sense saves my ass all the time. I've also
had bugs in Python programs that would have been prevented by better
use of encapsulation (including in the stdlib). Python certainly
makes you spend more of your attention worrying about possible
attribute name collisions between classes and their superclasses. And
Python's name mangling scheme is leaky and bug-prone if you ever
re-use class names. Overall, I think Python would gain from having
better support for encapsulation and C++-like casting between class
instances.
Jan 7 '07 #7

Paul Rubin wrote:
> Yes I've had plenty of
> pointer related bugs in C programs that don't happen in GC'd
> languages, so GC in that sense saves my ass all the time.

My experience is different: I never suffered much from
leaking or dangling pointers in C++ programs; on the
contrary, I didn't expect that fighting object
leaks in complex Python applications would be that difficult
(I've heard of Zope applications that just gave up and
resorted to the "reboot every now and then" solution).

With a GC, if you don't plan ownership and disposal
carefully and everything works as expected, then you're
saving some thinking and code, but if something goes
wrong then you're totally busted.
The GC "leaky abstraction" requires you to be lucky to
work well, but unfortunately IMO as code complexity
increases one is never lucky enough.

Andrea
Jan 7 '07 #8

Dennis Lee Bieber <wl*****@ix.netcom.com> writes:
> __ (two leading underscores) results in name-mangling. This /may/ be
> used to specify "private" data, but is really more useful when one is
> designing with multiple super classes:

Trouble with this is you can have two classes with the same name,
perhaps because they were defined in different modules, and then the
name mangling fails to tell them apart.
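
A small sketch of the collision described above, written for Python 3. The class name Widget and the helper make_base are hypothetical; the helper only stands in for "a class of the same name defined in another module".

```python
def make_base():
    # Stand-in for a class named Widget that lives in some other module.
    class Widget:
        def __init__(self):
            self.__state = "base"      # mangled to _Widget__state

        def base_state(self):
            return self.__state        # reads _Widget__state

    return Widget


BaseWidget = make_base()


class Widget(BaseWidget):              # same class name as its base
    def __init__(self):
        super().__init__()
        self.__state = "derived"       # also mangled to _Widget__state


w = Widget()
seen_by_base = w.base_state()
attrs = vars(w)
```

Because mangling uses only the bare class name, both classes end up sharing one attribute, and the base class sees the value written by the subclass.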
Jan 7 '07 #9

In article <11**********************@v33g2000cwv.googlegroups.com>,
"ti********@gmail.com" <ti********@gmail.com> wrote:
> Coming from a C++ / C# background, the lack of emphasis on private data
> seems weird to me. I've often found wrapping private data useful to
> prevent bugs and enforce error checking.
>
> It appears to me (perhaps wrongly) that Python prefers to leave class
> data public. What is the logic behind that choice?
>
> Thanks for any insight.

One thing that the other posters didn't mention is that if you access data
members of a class in C++ you end up with a very tight coupling with that class.
If the class later changes so that the data is no longer part of the public
interface, then every user of the class has to change the code and recompile.

In Python, on the other hand, if I have a piece of public data that I later
decide to replace with an accessor method, I can do that without changing any of
the code that uses the class.

So, insistence on private data in C++ is a good thing because it reduces the
level of coupling between a class and its clients. In Python, this is not an
issue, because the same loose coupling can be obtained with data as well as
accessor methods, and therefore public data is used when possible and private
data when necessary.
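
To illustrate the point about loose coupling, here is a minimal Python 3 sketch. The Account and CheckedAccount classes and the deposit function are invented for this example; they show a public attribute being replaced by a property without touching the client code.

```python
class Account:
    """First version: balance is plain public data."""

    def __init__(self, balance):
        self.balance = balance


class CheckedAccount:
    """Later version: same attribute name, now backed by accessors."""

    def __init__(self, balance):
        self.balance = balance  # goes through the setter below

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value


def deposit(account, amount):
    # Client code: identical for both versions of the class.
    account.balance = account.balance + amount
    return account.balance
```

The client keeps writing account.balance in both cases; only the class decides whether that is a plain attribute or a checked accessor.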

hth

Ben

--
If this message helped you, consider buying an item
from my wish list: <http://artins.org/ben/wishlist>

I changed my name: <http://periodic-kingdom.org/People/NameChange.php>
Jan 7 '07 #10

On 07 Jan 2007 02:01:44 -0800, Paul Rubin
<"http://phr.cx"@nospam.invalid> wrote:
> Dennis Lee Bieber <wl*****@ix.netcom.com> writes:
>> __ (two leading underscores) results in name-mangling. This /may/ be
>> used to specify "private" data, but is really more useful when one is
>> designing with multiple super classes:
>
> Trouble with this is you can have two classes with the same name,
> perhaps because they were defined in different modules, and then the
> name mangling fails to tell them apart.

What is the chance of having to inherit from two classes from
different modules but with exactly the same name *and* the same
instance variable name?

Of course you're being very pessimistic or extremely unlucky.

--
Felipe.
Jan 7 '07 #11

"Felipe Almeida Lessa" <fe**********@gmail.com> writes:
> What is the chance of having to inherit from two classes from
> different modules but with exactly the same name *and* the same
> instance variable name?
>
> Of course you're being very pessimistic or extremely unlucky.

If you want to write bug-free code, pessimism is the name of the game.
Jan 7 '07 #12

ti********@gmail.com wrote:
> Coming from a C++ / C# background, the lack of emphasis on private data
> seems weird to me. I've often found wrapping private data useful to
> prevent bugs and enforce error checking.

What is the use of private declarations if the names themselves don't say so?

=> You'll always have to search the class definition/docs to check whether a member is below "private", or wait for compiler errors. If you still want to override, you have to declare 'friends' and all that schoolboy stuff.

=> It's not useful or efficient for programmers; it probably did more to satisfy teachers eager to discipline their students when those languages were invented.

Moreover, in those languages there is more or less a clash of namespaces: all globals, module globals, members, local variables and possibly 'with' variables. This confusion, mixed with private declarations, soon leads to a situation where one loses track of which variable exactly was meant.

The syntax in Python, with _'s and 'self.', true modularization and minimal magic namespace behavior, but with explicit, self-similar access to objects, modules, functions and everything else, is overall the clearest and most effective. I don't know another language that behaves so well in this regard.

Even Ruby (one small plus: it doesn't even have a 'global' variable declaration) comes out much worse on balance, in that modules, classes, methods/functions, etc. are not objects but namespaces, messages and so on. Self-similarity is so broken that this will actually limit the power and scalability of the language.

Robert

> It appears to me (perhaps wrongly) that Python prefers to leave class
> data public. What is the logic behind that choice?
>
> Thanks for any insight.
Jan 7 '07 #13

Paul Rubin:
> Python certainly makes you spend more of your attention worrying
> about possible attribute name collisions between classes and their
> superclasses. And Python's name mangling scheme is leaky and
> bug-prone if you ever re-use class names.

> Trouble with this is you can have two classes with the same name,
> perhaps because they were defined in different modules, and then the
> name mangling fails to tell them apart.

Without changing Python syntax at all I think this situation may be
improved. Instead of Python applying name mangling to names with __
before them, it can manage them as private, a higher level kind of
management. And then if it's useful a new built-in function may be
invented to access such private attributes anyway. I think this may
solve your problem. (This is for Py3.0). Maybe a metaclass can be
invented to simulate such behavior to test and try it before modifying
the language itself.

Bye,
bearophile

Jan 7 '07 #14

On 06 Jan 2007 17:38:06 -0800, Paul Rubin wrote:
> "BJörn Lindqvist" <bj*****@gmail.com> writes:
>> It is given that emphasizing private data (encapsulation) leads to
>> more internal complexity and more lines of code because you have to
>> write getters and setters and stuff.
>
> You can have public variables in Java if you choose to. Writing
> private variables with public setters and getters is just a style choice.

Privates with getters/setters are (as I think someone else hinted) pretty
pointless. The interesting stuff is the private data that *is* private, i.e.
not meant for users at all.

But yes, I don't mind not having 'private:' in Python. I don't have
compile-time type checking anyway. In fact, I don't always know what the
attributes of my objects /are/ until runtime.

And besides, this is pretty close to a compile-time check:

    find . -name \*.py | \
        xargs egrep '\._[_a-z]' | \
        fgrep -v self._
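
The same check can be sketched in Python for a single piece of source text. This is only an approximation of the pipeline above: the function name suspicious_lines is made up, and like the shell version it is a plain text match, so it will also flag legitimate uses such as obj.__class__.

```python
import re

# Same pattern as the egrep call: a dot followed by an underscore-prefixed name.
PRIVATE_ACCESS = re.compile(r"\._[_a-z]")


def suspicious_lines(source):
    """Return (line number, text) pairs that touch _names other than via self."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        # Mirrors 'fgrep -v self._': skip any line that mentions self._
        if PRIVATE_ACCESS.search(line) and "self._" not in line:
            hits.append((lineno, line.strip()))
    return hits


sample = "self._x = 1\nother._y = 2\nobj.__z\nplain = 3\n"
found = suspicious_lines(sample)
```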

/Jorgen

--
// Jorgen Grahn <grahn@ Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.dyndns.org R'lyeh wgah'nagl fhtagn!
Jan 7 '07 #15

Jorgen Grahn wrote:
> On 06 Jan 2007 17:38:06 -0800, Paul Rubin wrote:
>> "BJörn Lindqvist" <bj*****@gmail.com> writes:
>>> It is given that emphasizing private data (encapsulation) leads to
>>> more internal complexity and more lines of code because you have to
>>> write getters and setters and stuff.
>> You can have public variables in Java if you choose to. Writing
>> private variables with public setters and getters is just a style choice.
>
> Privates with getters/setters are (as I think someone else hinted) pretty
> pointless. The interesting stuff is the private data that *is* private, i.e.
> not meant for users at all.

Not really pointless, since you can hide data structures that you
don't want to be fiddled around with (which for me is almost the only
reason to use it).

> But yes, I don't mind not having 'private:' in Python. I don't have
> compile-time type checking anyway. In fact, I don't always know what the
> attributes of my objects /are/ until runtime.

Me neither, although I have to say that the '__' prefix comes pretty
close to being 'private' already. It depends on the definition of
private. For me, private means 'not accessible from outside the
module/class'.

Thomas

Jan 7 '07 #16

Thomas Ploch <Th**********@gmx.net> writes:
> Me neither, although I have to say that the '__' prefix comes pretty
> close to being 'private' already. It depends on the definition of
> private. For me, private means 'not accessible from outside the
> module/class'.

    class A:
        __x = 3

    class B(A):
        __x = 4  # ok

    class C(B):
        __x = 5  # oops!

Consider that the above three class definitions might be in separate
files and you see how clumsy this gets.
Jan 7 '07 #17

Paul Rubin wrote:
> Thomas Ploch <Th**********@gmx.net> writes:
>> Me neither, although I have to say that the '__' prefix comes pretty
>> close to being 'private' already. It depends on the definition of
>> private. For me, private means 'not accessible from outside the
>> module/class'.
>
>     class A:
>         __x = 3
>
>     class B(A):
>         __x = 4  # ok
>
>     class C(B):
>         __x = 5  # oops!
>
> Consider that the above three class definitions might be in separate
> files and you see how clumsy this gets.

I don't understand why this should be oops, even if they are in
different files.

    >>> a = A()
    >>> print a._A__x
    3
    >>> b = B()
    >>> print b._B__x
    4
    >>> c = C()
    >>> print c._C__x
    5
    >>> dir(c)
    ['_A__x', '_B__x', '_C__x', '__doc__', '__module__']
    >>> print c._A__x
    3
    >>> print c._B__x
    4
Jan 7 '07 #18

ti********@gmail.com wrote:
> Coming from a C++ / C# background, the lack of emphasis on private data
> seems weird to me. I've often found wrapping private data useful to
> prevent bugs and enforce error checking.
>
> It appears to me (perhaps wrongly) that Python prefers to leave class
> data public. What is the logic behind that choice?

The designers of Java, C++, C#, Ada95, Delphi, etc. seem to think that
if an object's 'internal' variables or states cannot be kept private,
programmers get an irresistible temptation to mess with them in
malicious ways. But if you are that stupid, should you be programming
in any language? The most widely used language is still C, and there is
no concept of private data in C either, nor is it needed.

As mentioned in other replies, it is not rocket science to access a
class's private data. In C++ you can cast to void*, in Java and C# you
can use reflection. C++ is said to be an "unsafe" language because
programmers can, using a few tricks, mess with the vtables. But how
many really do that?

In Python variables are kept in strict namespaces. You can ask the
compiler to name mangle a variable by prepending underscores. The
variable then becomes just as 'private' as a C++ private variable,
because as previously mentioned, 'private' variables in C++ can be
accessed through a cast to void*.

Jan 7 '07 #19

Paul Rubin wrote:
>
>     class A:
>         __x = 3
>
>     class B(A):
>         __x = 4  # ok
>
>     class C(B):
>         __x = 5  # oops!
>
> Consider that the above three class definitions might be in separate
> files and you see how clumsy this gets.

What are you trying to show with the above? The principal benefit of
using private attributes set on either the class or the instance is to
preserve access, via self, to those attributes defined in association
with (or within) a particular class in the inheritance hierarchy, as
opposed to providing access to the "most overriding" definition of an
attribute. This is demonstrated more effectively with a method on class
A:

    class A:
        __x = 3
        def f(self):
            print self.__x  # should always refer to A.__x

    class B(A):
        __x = 4

    class C(B):
        __x = 5

Here, instances of A, B and C will always print the value of A.__x when
the f method is invoked on them. Were a non-private attribute to be
used instead, instances of A, B and C would print the overridden value
of the attribute when the f method is invoked on them.
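
The behaviour described above can be checked with a short Python 3 sketch. The extra attribute y and the method g are additions for contrast and are not part of the original example; f returns the value instead of printing it so the results can be collected.

```python
class A:
    __x = 3              # stored as A._A__x
    y = 3                # ordinary attribute, overridable by subclasses

    def f(self):
        return self.__x  # compiled as self._A__x, so always A's value

    def g(self):
        return self.y    # normal lookup, finds the most derived value


class B(A):
    __x = 4              # stored as B._B__x, does not replace A._A__x
    y = 4


class C(B):
    __x = 5              # stored as C._C__x
    y = 5


private_results = [A().f(), B().f(), C().f()]
public_results = [A().g(), B().g(), C().g()]
```

With the private name every instance reports A's value, while the plain attribute follows the usual override rules.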

Paul

Jan 7 '07 #20

sturlamolden wrote:
> ti********@gmail.com wrote:
>> Coming from a C++ / C# background, the lack of emphasis on private data
>> seems weird to me. I've often found wrapping private data useful to
>> prevent bugs and enforce error checking.
>>
>> It appears to me (perhaps wrongly) that Python prefers to leave class
>> data public. What is the logic behind that choice?
>
> The designers of Java, C++, C#, Ada95, Delphi, etc. seem to think that
> if an object's 'internal' variables or states cannot be kept private,
> programmers get an irresistible temptation to mess with them in
> malicious ways. But if you are that stupid, should you be programming
> in any language? The most widely used language is still C, and there is
> no concept of private data in C either, nor is it needed.

C has something like this concept in its 'static' declarations.

> As mentioned in other replies, it is not rocket science to access a
> class's private data. In C++ you can cast to void*, in Java and C# you
> can use reflection. C++ is said to be an "unsafe" language because
> programmers can, using a few tricks, mess with the vtables. But how
> many really do that?

Exactly, if they were available, a lot more would do that. I think this
is the point. Programmers who can do that are normally aware that the
people who designed this or that knew what they were doing.
But there are enough people who don't have a clue and _will_ fiddle
around and then flame all kinds of mailing lists with requests for help
because they did it wrong.


Jan 7 '07 #21

Andrea Griffini wrote:
> Paul Rubin wrote:
>> Yes I've had plenty of
>> pointer related bugs in C programs that don't happen in GC'd
>> languages, so GC in that sense saves my ass all the time.
>
> My experience is different: I never suffered much from
> leaking or dangling pointers in C++ programs; on the
> contrary, I didn't expect that fighting object
> leaks in complex Python applications would be that difficult
> (I've heard of Zope applications that just gave up and
> resorted to the "reboot every now and then" solution).

Zope is a special case here, since it relies on an object database...
Jan 7 '07 #22

Paul Rubin wrote:
> "Felipe Almeida Lessa" <fe**********@gmail.com> writes:
>> What is the chance of having to inherit from two classes from
>> different modules but with exactly the same name *and* the same
>> instance variable name?
>>
>> Of course you're being very pessimistic or extremely unlucky.
>
> If you want to write bug-free code, pessimism is the name of the game.

Not to pretend my own code is always totally bug-free, but I found that,
with languages like Python, I usually got better results starting with
the simplest possible implementation, and only then adding some
'defensive' boilerplate where it makes sense (mostly resource
acquisition/release) - an approach that I would certainly not advocate
when it comes to coding in C...
Jan 7 '07 #23

[ Thomas Ploch <Th**********@gmx.net> ]
> sturlamolden wrote:
>> ti********@gmail.com wrote:
>>> Coming from a C++ / C# background, the lack of emphasis on private
>>> data seems weird to me. I've often found wrapping private data
>>> useful to prevent bugs and enforce error checking.
>>>
>>> It appears to me (perhaps wrongly) that Python prefers to leave
>>> class data public. What is the logic behind that choice?
>>
>> The designers of Java, C++, C#, Ada95, Delphi, etc. seem to think
>> that if an object's 'internal' variables or states cannot be kept
>> private, programmers get an irresistible temptation to mess with them
>> in malicious ways. But if you are that stupid, should you be
>> programming in any language? The most widely used language is still
>> C, and there is no concept of private data in C either, nor is it
>> needed.
>
> C has something like this concept in its 'static' declarations.
>
>> As mentioned in other replies, it is not rocket science to access a
>> class's private data. In C++ you can cast to void*, in Java and C# you
>> can use reflection. C++ is said to be an "unsafe" language because
>> programmers can, using a few tricks, mess with the vtables. But how
>> many really do that?
>
> Exactly, if they were available, a lot more would do that. I think
> this is the point. Programmers who can do that are normally aware
> that the people who designed this or that knew what they were
> doing. But there are enough people who don't have a clue and _will_
> fiddle around and then flame all kinds of mailing lists with requests
> for help because they did it wrong.

Those people deserve to fail for being just extraordinarily stupid...

--
Freedom is always the freedom of dissenters.
(Rosa Luxemburg)
Jan 7 '07 #24

Sebastian 'lunar' Wiesner wrote:
> Those people deserve to fail for being just extraordinarily stupid...

Yes, but there are a lot of them around...

Thomas
P.S.: I don't mean they are around here. :-)
Jan 7 '07 #25

sturlamolden wrote:
> ti********@gmail.com wrote:
>> Coming from a C++ / C# background, the lack of emphasis on private data
>> seems weird to me. I've often found wrapping private data useful to
>> prevent bugs and enforce error checking.
>>
>> It appears to me (perhaps wrongly) that Python prefers to leave class
>> data public. What is the logic behind that choice?
>
> The designers of Java, C++, C#, Ada95, Delphi, etc. seem to think that
> if an object's 'internal' variables or states cannot be kept private,
> programmers get an irresistible temptation to mess with them in
> malicious ways.

If you're not clear on encapsulation issues, you probably haven't
done extensive maintenance programming on code written by others.
Finding out who can mess with a variable when debugging the code of
others is not fun.

Because Python doesn't have explicit declarations, scope of variables is
a touchy issue. If you write "x = 1" within a function, that will
create a local "x" if "x" doesn't exist, or alter a global "x" if "x" was
previously created in the global context. But at least global variables
are local to the namespace; we don't have clashes across files. So
it's not too bad. JavaScript has the convention that newly created
variables are global by default. Big mistake.

The underscore thing makes sense. Single underscore
variables are "protected" in the C++ sense, and double underscore
variables are "private", not visible from inherited classes.
It's hard to misuse such variables by accident. I'd be tempted
to prohibit access to underscore variables other than via "self._x"
forms, so they'd be inaccessible outside the object. It's undesirable
from a maintenance standpoint to have an unenforced convention like
a lead underscore. The maintenance programmer can't trust its meaning.

As Python grows up, and larger systems are written in it, these
issues become more important.

John Nagle
Animats
Jan 7 '07 #26

Bruno Desthuilliers wrote:
>> ... on the
>> contrary, I didn't expect that fighting object
>> leaks in complex Python applications would be that difficult
>> (I've heard of Zope applications that just gave up and
>> resorted to the "reboot every now and then" solution).
> Zope is a special case here, since it relies on an object database...

Just to clarify my post... I found out, by being punched in the
nose myself, what it means to have a complex Python
application that suffers from object leaks; it's not
just something I read about Zope programs.

But why would Zope applications be a special case?

Andrea
Jan 7 '07 #27

Thomas Ploch wrote:
> sturlamolden wrote:
(snip)
>> As mentioned in other replies, it is not rocket science to access a
>> class's private data. In C++ you can cast to void*, in Java and C# you
>> can use reflection. C++ is said to be an "unsafe" language because
>> programmers can, using a few tricks, mess with the vtables. But how
>> many really do that?
>
> Exactly, if they were available, a lot more would do that.

Do you have any concrete evidence? FWIW, I've seen a *lot* of Python
code, and very, very few uses of _implementation stuff - most of them
being legitimate.

> I think this
> is the point. Programmers who can do that are normally aware that the
> people who designed this or that knew what they were doing.
> But there are enough people who don't have a clue and _will_ fiddle
> around and then flame all kinds of mailing lists with requests for help
> because they did it wrong.

The fact is that there's no cure for stupidity. If you want a language
explicitly designed to "protect" dummies from themselves, you know where
to find it. Why should normally intelligent people have to suffer from
this? Are you going to forbid hammers because dummies could smash their
fingers and then complain?
Jan 7 '07 #28

On Jan 7, 16:13, John Nagle <n...@animats.com> wrote:
> Because Python doesn't have explicit declarations, scope of variables is
> a touchy issue. If you write "x = 1" within a function, that will
> create a local "x" if "x" doesn't exist, or alter a global "x" if "x" was
> previously created in the global context. But at least global variables
> are local to the namespace; we don't have clashes across files.

No, `x=1` always uses a local variable x, unless an (explicit!) global
statement was in effect in the same block. This, and the explicit self,
make very clear which x you are referring to.
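
A quick way to confirm this rule, as a small Python 3 sketch (the function names are made up for the example):

```python
x = 0  # module-level ("global") x


def assign_without_global():
    x = 1        # binds a new local x; the module-level x is untouched
    return x


def assign_with_global():
    global x     # explicit declaration: assignments now target the module x
    x = 2
    return x


def read_module_x():
    return x     # reading needs no declaration
```

Calling assign_without_global() leaves the module-level x at 0; only assign_with_global() changes it.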

--
Gabriel Genellina

Jan 7 '07 #29

Andrea Griffini wrote:
> Bruno Desthuilliers wrote:
>>> ... on the
>>> contrary, I didn't expect that fighting object
>>> leaks in complex Python applications would be that difficult
>>> (I've heard of Zope applications that just gave up and
>>> resorted to the "reboot every now and then" solution).
>> Zope is a special case here, since it relies on an object database...
>
> Just to clarify my post... I found out, by being punched in the
> nose myself, what it means to have a complex Python
> application that suffers from object leaks; it's not
> just something I read about Zope programs.
>
> But why would Zope applications be a special case?

1/ because of how Zope and the ZODB work
2/ because Zope is an unusually complex Python application.

FWIW, I've never had any memory problem with other Python applications
and/or frameworks I've used so far (ie: in the past seven years) - most
of them being somewhat 'simpler' than if they had been implemented in C
or C++...
Jan 7 '07 #30

John Nagle wrote:
> sturlamolden wrote:
>> ti********@gmail.com wrote:
>>> Coming from a C++ / C# background, the lack of emphasis on private data
>>> seems weird to me. I've often found wrapping private data useful to
>>> prevent bugs and enforce error checking.
>>>
>>> It appears to me (perhaps wrongly) that Python prefers to leave class
>>> data public. What is the logic behind that choice?
>>
>> The designers of Java, C++, C#, Ada95, Delphi, etc. seem to think that
>> if an object's 'internal' variables or states cannot be kept private,
>> programmers get an irresistible temptation to mess with them in
>> malicious ways.
>
> If you're not clear on encapsulation issues,

encapsulation != data hiding

> you probably haven't
> done extensive maintenance programming on code written by others.

I did.

> Finding out who can mess with a variable when debugging the code of
> others is not fun.

    # before
    class Toto(object):
        def __init__(self, x):
            self._x = x

    # after
    class Toto(object):
        def __init__(self, x):
            self._x = x

        # Python 2 idiom: apply() calls _x() right away, so the name _x
        # ends up bound to the property built from fget and fset.
        @apply
        def _x():
            def fget(self):
                return self._real_x
            def fset(self, value):
                # drop into the debugger on every write to _x
                import pdb; pdb.set_trace()
                self._real_x = value
            return property(**locals())

This is of course a braindead implementation - a better one would use
either the inspect module or the sys._getframe() hack to retrieve useful
debug info (left as an exercise to the reader...)
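
One possible take on that exercise, as a hedged Python 3 sketch rather than the author's own solution: instead of stopping in pdb, the setter records the name of the calling function. The writes list and the poke function are hypothetical additions, and sys._getframe() is a CPython implementation detail that other interpreters may not provide.

```python
import sys


class Toto:
    def __init__(self, x):
        self.writes = []   # must exist before the first assignment to _x
        self._x = x        # goes through the setter below

    @property
    def _x(self):
        return self._real_x

    @_x.setter
    def _x(self, value):
        # Frame 0 is this setter; frame 1 is whoever performed the assignment.
        caller = sys._getframe(1)
        self.writes.append((caller.f_code.co_name, value))
        self._real_x = value


def poke(obj):
    obj._x = 99            # an "outside" write that we want to trace


t = Toto(1)
poke(t)
```

After this runs, t.writes lists each write together with the function that made it, which answers the "who can mess with a variable" question at debug time.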

Because Python doesn't have explicit declarations, scope of
variables is
a touchy issue.
???
If you write "x = 1" within a function, that will
create a local "x" if "x" doesn't exist, or alter a global "x" if "x" was
previously created in the global context.
Err... May I suggest you to read these two pages:
http://docs.python.org/ref/assignment.html
http://docs.python.org/ref/global.html#l2h-563
But at least global variables
are local to the namespace; we don't have clashes across files. So
it's not too bad. JavaScript has the convention that newly created
variables are global by default.
Unless preceded by the 'var' keyword...
Big mistake.
Mmm... which one ?
The underscore thing makes sense. Single underscore
variables are "protected" in the C++ sense, and double underscore
variables are "private", not visible from inherited classes.
It's hard to misuse such variables by accident. I'd be tempted
to prohibit access to underscore variables other than via "self._x"
forms, so they'd be inaccessible outside the object.
# foo.py
class Foo(object):
    def __init__(self, x):
        self._x = x
    def __repr__(self):
        return "<Foo %s>" % self._x

# bar.py
def bar(self):
    self.y = self._x

# baaz.py
from foo import Foo
from bar import bar
Foo.bar = bar

f = Foo([42])
f.bar()
f.y.append('gotcha')
print f

It's undesirable
from a maintenance standpoint to have an unenforced convention
If it's a convention, it doesn't have to be enforced. If it's enforced,
it's not a convention anymore.

While we're at it, I've found it very valuable to be able to mess with
implementation when doing maintenance on somewhat large (and somewhat
messy) Python systems...
like
a lead underscore. The maintenance programmer can't trust its meaning.

As Python grows up, and larger systems are written in it, these
issues become more important.
If you go that way, then you'll also want to introduce declarative
static typing and remove all possibility to dynamically modify classes
or add/replace attributes and/or methods on a per-instance basis. If you
want Java, you know where to find it.
Jan 7 '07 #31

P: n/a
On Sun, 07 Jan 2007 04:09:13 -0800, Paul Rubin wrote:
"Felipe Almeida Lessa" <fe**********@gmail.com> writes:
>What is the chance of having to inherit from two classes from
different modules but with exactly the same name *and* the same
instance variable name?

Of course you're being very pessimistic or extremely unlucky.

If you want to write bug-free code, pessimism is the name of the game.
I wonder whether Paul uses snow chains all year round, even in the blazing
summer? After all, "if you want to drive safely, pessimism is the name of
the game".

In the last couple of weeks comp.lang.python has had (at least) two
practical examples of the pros and cons of private attributes.

The pro: there was discussion about replacing the optparse module's
implementation with argparse, leaving the interface the same. This was
complicated by the fact that optparse exposes its internal variables,
making the job of duplicating the interface significantly harder. However
this was surely a design choice, not an accident. Having private
attributes won't save you if you choose not to make your attributes
private.

The con: there was a fellow who (for some reason) actually needed to
access a class' private attributes. To the best of my knowledge, he was
over 18 and, while new to Python, an experienced programmer, so I believe
him when he said he had eliminated all other alternatives. (And if he
were wrong, if he was incompetent -- Not My Problem. It isn't for me to
take a hammer off him so he doesn't hit his thumb with it.) In his case,
Python's name mangling of private attributes was an inconvenience, not a
help.

Compared to all the "what-ifs" and "maybes" and hypotheticals in this
thread, there were two practical cases that revolved around private
variables. In one, we see that they aren't a panacea: data hiding doesn't
help when the data isn't hidden. In the other, we see that one developer's
private attribute is just what another developer needs to solve a problem.
--
Steven.

Jan 7 '07 #32

P: n/a
"Paul Boddie" <pa**@boddie.org.uk> writes:
Consider that the above three class definitions might be in separate
files and you see how clumsy this gets.

What are you trying to show with the above? The principal benefit of
using private attributes set on either the class or the instance is to
preserve access, via self, to those attributes defined in association
with (or within) a particular class in the inheritance hierarchy, as
opposed to providing access to the "most overriding" definition of an
attribute. This is demonstrated more effectively with a method on class A:
Right, the problem is if those methods start changing the "private"
variable. I should have been more explicit about that.

class A:
    def __init__(self):
        self.__x = 3
    def foo(self):
        return self.__x

class B(A): pass

class A(B):
    def bar(self):
        self.__x = 5 # clobbers private variable of earlier class named A
Jan 8 '07 #33

P: n/a
Steven D'Aprano <st***@REMOVE.THIS.cybersource.com.au> writes:
If you want to write bug-free code, pessimism is the name of the game.

I wonder whether Paul uses snow chains all year round, even in the blazing
summer? After all, "if you want to drive safely, pessimism is the name of
the game".
No. I'm willing to accept a 10**-5 chance of hitting a freak
snowstorm in summer, since I drive in summer at most a few hundred
times a year, so it will take me 100's of years before I'm likely to
encounter such a storm. There are millions of drivers, so if they all
take a similar chance, then a few times a year we'll see in the paper
that someone got caught in a storm, which is ok. Usually there's no
real consequence beyond some inconvenience of waiting for a tow truck.

Tow truck or ambulance operators, on the other hand, should keep
chains available all year around, since they have to service the needs
of millions of users, have to be ready for freak summer storms.

As a software developer wanting to deploy code on a wide scale, I'm
more like a tow truck operator than an individual car driver.
Alternatively, as a coder I "drive" a lot more often. If some Python
misfeature introduces a bug with probability 10**-5 per line of code,
then a 100 KLoc program is likely to have such a bug somewhere. It
doesn't take 100's of years.
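
The arithmetic behind both halves of that comparison can be sketched as follows. The figure of 300 summer trips per year is an assumption standing in for "a few hundred", and the calculation treats every trip and every line of code as independent, which is a simplification.

```python
p = 1e-5                      # chance per trip, or per line of code

# Driving: expected number of years before the first freak storm.
trips_per_year = 300          # assumed value for "a few hundred"
years_until_storm = 1 / (p * trips_per_year)

# Coding: a 100 KLoc program with the same per-line chance.
lines = 100_000
expected_bugs = p * lines
p_at_least_one = 1 - (1 - p) ** lines

print(round(years_until_storm), expected_bugs, round(p_at_least_one, 3))
```

Under these assumptions the driver waits a few hundred years on average, while the program is expected to contain about one such bug and has roughly a 63% chance of containing at least one.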
Jan 8 '07 #34

P: n/a
Dennis Lee Bieber <wl*****@ix.netcom.com> writes:
I'd be quite concerned about the design environment rather than the
immediate code... Probably need something ugly like...

from mod1 import B as B1
from mod2 import B as B2
class A(B1, B2):
    ....
Interesting. I just tried that. mod1.py contains:

class B:
    def foo(self): self.__x = 'mod1'

mod2.py contains:

class B:
    def bar(self): self.__x = 'mod2'

And the test is:

from mod1 import B as B1
from mod2 import B as B2

class A(B1, B2): pass

a = A()
a.foo()
print a._B__x
a.bar()
print a._B__x

Sure enough, mod2 messes up mod1's private variable.
Jan 8 '07 #35

P: n/a
On Sun, 07 Jan 2007 19:30:05 -0800, Paul Rubin wrote:
Steven D'Aprano <st***@REMOVE.THIS.cybersource.com.au> writes:
If you want to write bug-free code, pessimism is the name of the game.

I wonder whether Paul uses snow chains all year round, even in the blazing
summer? After all, "if you want to drive safely, pessimism is the name of
the game".

No. I'm willing to accept a 10**-5 chance of hitting a freak
snowstorm in summer, since I drive in summer at most a few hundred
times a year, so it will take me 100's of years before I'm likely to
encounter such a storm. There are millions of drivers, so if they all
take a similar chance, then a few times a year we'll see in the paper
that someone got caught in a storm, which is ok. Usually there's no
real consequence beyond some inconvenience of waiting for a tow truck.

Tow truck or ambulance operators, on the other hand, should keep
chains available all year around, since they have to service the needs
of millions of users, have to be ready for freak summer storms.

As a software developer wanting to deploy code on a wide scale, I'm
more like a tow truck operator than an individual car driver.
Alternatively, as a coder I "drive" a lot more often. If some Python
misfeature introduces a bug with probability 10**-5 per line of code,
then a 100 KLoc program is likely to have such a bug somewhere. It
doesn't take 100's of years.
That's an irrelevant argument. We're not talking about random bugs in
random places of code, we're talking about one specific type of bug which
can only occur in a very restricted set of circumstances, e.g.
you have to inherit from two classes which not only have exactly the same
name but they also have the same private attribute.

Your argument is that Python's strategy for dealing with private
attributes is insufficiently pessimistic, because it doesn't deal with
those circumstances. Fine. I agree. Python isn't pessimistic. Does it need
to be? Just how often do you inherit from two identically-named classes
both of which use identically-named private attributes?

You suggested that coders (and by extension, Python) should behave with
equal pessimism whether they are subclassing two identically-named
classes or not. That's equivalent to the argument that one should use snow
chains whether it is snowing or not -- it only considers the benefit of
the extra protection, without considering the costs.

Python's private attribute handling balances convenience and protection,
giving more weight to convenience, trading off some protection. And
convenience gives increased productivity, easier debugging, fewer bugs
overall, and other Good Things. It would probably change the character of
Python unacceptably much to push that balance the other way.

Don't get me wrong, it is a good thing for you to alert people to the
circumstances in which Python's strategy breaks down, so that they can "use
snow chains" in those circumstances. And, hey, if somebody reads this
thread and is motivated to find a better strategy that doesn't change the
nature of the language by too much, great. (This happened once before:
multiple inheritance was broken in classic classes, and new-style classes were
added partly to fix that.)

But chances are, the majority of Pythonistas will think that having to
use snow chains once in a very great while is an acceptable trade-off to
the smooth ride Python gives the rest of the time.
--
Steven D'Aprano

Jan 8 '07 #36

P: n/a
Steven D'Aprano <st***@REMOVEME.cybersource.com.au> writes:
Just how often do you inherit from two identically-named classes
both of which use identically-named private attributes?
I have no idea how often if ever. I inherit from library classes all
the time, without trying to examine what superclasses they use. If my
subclass happens to have the same name as a superclass of some library
class (say Tkinter) this could happen. Whether it ever DOES happen, I
don't know, I could only find out by examining the implementation
details of every library class I ever use, and I could only prevent it
by remembering those details. That is an abstraction leak and is
dangerous and unnecessary. The name mangling scheme is a crock. How
often does anyone ever have a good reason for using it, except maybe
in something like a debugger that can just as easily reach inside the
actual class descriptors and get all the variables out?
Jan 8 '07 #37

P: n/a
Paul Rubin wrote:
>
Right, the problem is if those methods start changing the "private"
variable. I should have been more explicit about that.

class A:
    def __init__(self):
        self.__x = 3
    def foo(self):
        return self.__x

class B(A): pass

class A(B):
    def bar(self):
        self.__x = 5 # clobbers private variable of earlier class named A
Has this ever been reported as a bug in Python? I could imagine more
sophisticated "name mangling": something to do with the identity of the
class might be sufficient, although that would make the tolerated
"subversive" access to private attributes rather difficult.

Paul

Jan 8 '07 #38

P: n/a
"Paul Boddie" <pa**@boddie.org.uk> writes:
Has this ever been reported as a bug in Python? I could imagine more
sophisticated "name mangling": something to do with the identity of the
class might be sufficient, although that would make the tolerated
"subversive" access to private attributes rather difficult.
If you mean the object id, I don't think you can use it for name
mangling, since the mangled names have to survive code marshalling
and you may end up with different object id's.
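
To make the marshalling point concrete: in CPython the mangled name is fixed when the class body is compiled, so it is already sitting in the code object as a plain string before any class object (or object id) exists. A small sketch, written for Python 3:

```python
import marshal


class A:
    def foo(self):
        return self.__x


# The attribute name stored in the compiled code is already mangled.
names = A.foo.__code__.co_names

# A marshal round trip (what a .pyc file does) carries the same string.
restored = marshal.loads(marshal.dumps(A.foo.__code__))

print(names, restored.co_names)
```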

I've just never encountered any legitimate use for the "subversive"
access and if it's really necessary, it's better to do it through some
kind of well-designed reflection interface in the class, rather than
with a crock like name mangling.
Jan 8 '07 #39

P: n/a
"Paul Boddie" <pa**@boddie.org.uk> wrote:
Paul Rubin wrote:
>>
Right, the problem is if those methods start changing the "private"
variable. I should have been more explicit about that.

class A:
    def __init__(self):
        self.__x = 3
    def foo(self):
        return self.__x

class B(A): pass

class A(B):
    def bar(self):
        self.__x = 5 # clobbers private variable of earlier class named A

Has this ever been reported as a bug in Python? I could imagine more
sophisticated "name mangling": something to do with the identity of the
class might be sufficient, although that would make the tolerated
"subversive" access to private attributes rather difficult.

Paul
If it worries you then you can always check for it and disallow any
hierarchies where it could be a problem. For that matter PyChecker ought to
be able to catch this situation (maybe it already does, I haven't looked).

>>> class SafetyNet(type):
        def __new__(cls, name, bases, dct):
            print "new",name
            c = type.__new__(cls, name, bases, dct)
            assert not name in [b.__name__ for b in c.__mro__[1:]]
            return c

>>> __metaclass__ = SafetyNet
>>> class A: pass
new A
>>> class B(A): pass
new B
>>> class A(B): pass
new A

Traceback (most recent call last):
  File "<pyshell#24>", line 1, in <module>
    class A(B): pass
  File "<pyshell#17>", line 5, in __new__
    assert not name in [b.__name__ for b in c.__mro__[1:]]
AssertionError
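
The same check can be sketched for Python 3, where the metaclass is passed as a keyword in the class statement instead of through a module-level `__metaclass__`. Raising `TypeError` rather than using `assert` is a choice made for this sketch, not something taken from the session above.

```python
class SafetyNet(type):
    """Reject a class whose name matches the name of one of its ancestors."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        ancestor_names = [base.__name__ for base in cls.__mro__[1:]]
        if name in ancestor_names:
            raise TypeError("class name %r repeats the name of an ancestor" % name)
        return cls


class A(metaclass=SafetyNet):
    pass


class B(A):
    pass


rejected = None
try:
    class A(B):       # same name as an ancestor, so mangled names would collide
        pass
except TypeError as exc:
    rejected = str(exc)

print(rejected)
```

Subclasses inherit the metaclass, so only the root of the hierarchy needs to name it.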

Jan 8 '07 #40

P: n/a
On Sun, 07 Jan 2007 23:49:21 -0800, Paul Rubin wrote:
Steven D'Aprano <st***@REMOVEME.cybersource.com.au> writes:
>Just how often do you inherit from two identically-named classes
both of which use identically-named private attributes?

I have no idea how often if ever.
You've established that there's a name conflict when you do so, which
leads to bugs. So how often do you get bitten by that particular type of
bug?

I inherit from library classes all
the time, without trying to examine what superclasses they use. If my
subclass happens to have the same name as a superclass of some library
class (say Tkinter) this could happen. Whether it ever DOES happen, I
don't know, I could only find out by examining the implementation
details of every library class I ever use, and I could only prevent it
by remembering those details.
class MySubClass(SomeSuperclass):
    try:
        __my_private_attribute
    except AttributeError:
        __my_private_attribute = some_value
    else:
        raise ValueError("Name conflict with private attribute!")

Problem solved.

*wink*

That is an abstraction leak and is
dangerous and unnecessary. The name mangling scheme is a crock. How
often does anyone ever have a good reason for using it,
Exactly. I never use it.

The truth of the matter is, MyClass.__private is not private at all. It is
still a public attribute with a slightly unexpected name. In other words,
if you want to code defensively, you should simply assume that Python has
no private attributes, and code accordingly.
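
A short sketch of what "a public attribute with a slightly unexpected name" means in practice (Python 3 syntax; the attribute value is made up for the example):

```python
class MyClass:
    def __init__(self):
        self.__private = "secret"   # stored on the instance as _MyClass__private


obj = MyClass()

try:
    obj.__private                   # no mangling happens outside a class body
    plain_name_works = True
except AttributeError:
    plain_name_works = False

mangled_value = obj._MyClass__private   # the "unexpected" but public name
stored_names = list(vars(obj))

print(plain_name_works, mangled_value, stored_names)
```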

Problem solved.

--
Steven.

Jan 8 '07 #41

P: n/a
Steven D'Aprano wrote:
>
The truth of the matter is, MyClass.__private is not private at all. It is
still a public attribute with a slightly unexpected name. In other words,
if you want to code defensively, you should simply assume that Python has
no private attributes, and code accordingly.

Problem solved.
Well, it isn't really solved - it's more avoided than anything else.
;-)

Still, if one deconstructs the use of private data in various
programming languages, one can identify the following roles (amongst
others):

1. The prevention of access to data from program sections
not belonging to a particular component.
(The classic "keep out" mechanism.)
2. The enforcement of distinct namespaces within components.
(Making sure that subclass attributes and superclass attributes
can co-exist.)
3. To support stable storage layouts and binary compatibility.

Most Python adherents don't care too much about #1, and Python isn't
driven by the need for #3, mostly due to the way structures (modules,
classes, objects) are accessed by the virtual machine. However, one
thing which does worry some people is #2, and in a way it's the
forgotten but more significant benefit of private data.
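
Role #2 is the one that double-underscore names do address in the ordinary case, where the class names differ. A minimal sketch in Python 3, with invented class and method names:

```python
class Base:
    def __init__(self):
        self.__state = "base"        # stored as _Base__state

    def base_state(self):
        return self.__state


class Derived(Base):
    def __init__(self):
        super().__init__()
        self.__state = "derived"     # stored as _Derived__state, no clash

    def derived_state(self):
        return self.__state


d = Derived()
print(d.base_state(), d.derived_state(), sorted(vars(d)))
```

With a single underscore (`self._state`) the second assignment would overwrite the first, and `base_state()` would return "derived".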

Before I became completely aware of the significance of #2, I remember
using various standard library classes which are meant to be subclassed
and built upon, thinking that if I accidentally re-used an attribute
name then the operation of such classes would be likely to fail in
fairly bizarre ways. Of course, a quick browse of the source code for
sgmllib.SGMLParser informed me of the pitfalls, and I'm sure that
various tools could also be informative without the need to load
sgmllib.py into a text editor, but if I had been fully aware of the
benefits of private attributes and could have been sure that such
attributes had been used (again, a tool might have given such
assurances) then I wouldn't have needed to worry.

So I suppose that to "code accordingly" in the context of your advice
involves a manual inspection of the source code of superclasses or the
usage of additional tools. Yet I suppose that this isn't necessarily
unusual behaviour when working with large systems.

Paul

Jan 8 '07 #42

P: n/a
Steven D'Aprano <st***@REMOVE.THIS.cybersource.com.au> writes:
I have no idea how often if ever.

You've established that there's a name conflict when you do so, which
leads to bugs. So how often do you get bitten by that particular type of bug?
I don't know. Likely zero, possibly not. I'm sure I've written many
bugs that have never been detected by me or anyone else. I've
probably written bugs that crashed an application for some user but
they just cursed me out and never bothered to tell me about the crash.
Maybe I've even written bugs that leaked a user's private data without
the user noticing, but discovered by some attacker intercepting the
data who is cackling about the bug while keeping it secret. There's
no way for me to think I'll ever find out.

I'd much prefer to be able to say of any type of bug, "the number is
exactly zero as a known fact, because it's inherent in Python's design
that it's impossible to have that type of bug". Language designs
should aim to let programmers say things like that as often as possible.
class MySubClass(SomeSuperclass): ...
raise ValueError("Name conflict with private attribute!")
Problem solved.
No good, Python allows creating classes and attributes on the fly.
The superclass could create its private variable after the subclass is created.
The truth of the matter is, MyClass.__private is not private at all. It is
still a public attribute with a slightly unexpected name. In other words,
if you want to code defensively, you should simply assume that Python has
no private attributes, and code accordingly.

Problem solved.
Well, "problem moved", not "problem solved". Now you have the problem
of having to know the names of every attribute any related class might
use when you write your own class. That is why other languages have
private variables and Python has name mangling--to solve a real problem.
Except Python's solution is a leaky kludge.
Jan 8 '07 #43

P: n/a
"Paul Rubin" <http://ph****@NOSPAM.invalid> wrote:
If you want to write bug-free code, pessimism is the name of the game.
A healthy touch of paranoia does not come amiss either...

And even then things foul up in strange ways because your head
is never quite literal enough.

When you hear a programmer use the word "probability" -
then it's time to fire him, as in programming even the lowest
probability is a certainty when you are doing millions of
things a second.

But this is off topic, really - I don't think that hiding things make
much difference, especially as the python hiding is not absolute.

- Hendrik
Jan 8 '07 #44

P: n/a
hg
sturlamolden wrote:
The designers of Java, C++, C#, Ada95, Delphi, etc. seem to think that
if an object's 'internal' variables or states cannot be kept private,
programmers get an irresistible temptation to mess with them in
malicious ways. But if you are that stupid, should you be programming
in any language? The most widely used language is still C, and there is
no concept of private data in C either, nor is it needed.

void test(void)
{
    static int i;
}
Do you agree that i is "private" to test ?

hg

Jan 8 '07 #45

P: n/a
On 2007-01-08, Paul Rubin wrote:
Dennis Lee Bieber <wl*****@ix.netcom.com> writes:
>I'd be quite concerned about the design environment rather than the
immediate code... Probably need something ugly like...

from mod1 import B as B1
from mod2 import B as B2
class A(B1, B2):
    ....

Interesting. I just tried that. mod1.py contains:

class B:
    def foo(self): self.__x = 'mod1'

mod2.py contains:

class B:
    def bar(self): self.__x = 'mod2'

And the test is:

from mod1 import B as B1
from mod2 import B as B2

class A(B1, B2): pass

a = A()
a.foo()
print a._B__x
a.bar()
print a._B__x

Sure enough, mod2 messes up mod1's private variable.
When faced with this situation, is there any way to proceed
besides using composition instead?
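
For reference, the composition route might look like the sketch below (Python 3). The two `load_mod` functions stand in for mod1.py and mod2.py so the example fits in one file, and `foo_value` is a hypothetical getter added only so the result can be observed; neither appears in the original modules.

```python
def load_mod1():
    # Stands in for mod1.py: a class named B with a private __x.
    class B:
        def foo(self):
            self.__x = "mod1"

        def foo_value(self):         # hypothetical getter, not in the original
            return self.__x
    return B


def load_mod2():
    # Stands in for mod2.py: another class named B with its own __x.
    class B:
        def bar(self):
            self.__x = "mod2"
    return B


B1, B2 = load_mod1(), load_mod2()


class Inherits(B1, B2):              # one instance, one shared _B__x slot
    pass


class Composes:                      # two instances, each with its own _B__x
    def __init__(self):
        self._first = B1()
        self._second = B2()

    def foo(self):
        self._first.foo()

    def bar(self):
        self._second.bar()

    def foo_value(self):
        return self._first.foo_value()


a = Inherits()
a.foo()
a.bar()
clobbered = a.foo_value()

c = Composes()
c.foo()
c.bar()
preserved = c.foo_value()

print(clobbered, preserved)
```

The price is the hand-written delegation, and `Composes` is no longer an instance of either `B`.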

--
Neil Cerutti
We've got to pause and ask ourselves: How much clean air do we really need?
--Lee Iacocca
Jan 8 '07 #46

P: n/a
On 2007-01-08, hg <hg@nospam.org> wrote:
sturlamolden wrote:
>The designers of Java, C++, C#, Ada95, Delphi, etc. seem to think that
if an object's 'internal' variables or states cannot be kept private,
programmers get an irresistible temptation to mess with them in
malicious ways. But if you are that stupid, should you be programming
in any language? The most widely used language is still C, and there is
no concept of private data in C either, nor is it needed.


void test(void)
{
    static int i;
}
Do you agree that i is "private" to test ?
In C one uses the pointer to opaque struct idiom to hide data.
For example, the standard FILE pointer.

--
Neil Cerutti
Jan 8 '07 #47

P: n/a
Neil Cerutti wrote:
On 2007-01-08, hg <hg@nospam.org> wrote:
>sturlamolden wrote:
>>The designers of Java, C++, C#, Ada95, Delphi, etc. seem to think that
if an object's 'internal' variables or states cannot be kept private,
programmers get an irresistible temptation to mess with them in
malicious ways. But if you are that stupid, should you be programming
in any language? The most widely used language is still C, and there is
no concept of private data in C either, nor is it needed.

void test(void)
{
    static int i;
}
Do you agree that i is "private" to test ?

In C one uses the pointer to opaque struct idiom to hide data.
For example, the standard FILE pointer.
To sturlamolden: I don't know how you define private, but if one defines
in C an external static variable, i.e. a variable outside any function,
at file level, the scope of that variable is that file only.

To hg: In C one does not need the static keyword to make a variable
defined inside a function (a so-called 'automatic variable') private
to that function. Automatic variables are private to their function by
definition. The static keyword makes the variable permanent, i.e. it
keeps its value between calls, but it is of course private as well.

To Neil Cerutti: If a programmer in C has got a pointer to some piece of
memory, that piece is at the mercy of the programmer. There's no data
hiding at all in this case.

To whom it may concern: please stop comparing C and Python with regard
to privacy and safety. They are two different worlds altogether. Believe
me: I've been in this world for 2.5 years now after spending 19 years in
the C world.

Cheers,
Jussi
Jan 8 '07 #48

P: n/a
Private data in the C++ and Java OO worlds is taught so much and
emphasized so often that people have started thinking of it as being
desirable for its own sake. But the primary motivation for it grew out
of the need to maintain compatible interfaces. These languages rely on
a great deal of shared information between providers and clients of
interfaces in order to work correctly - public/private interfaces are
simply a reflection of that requirement (and the fact that your
clients still need to see the stuff you declare as private is an
example of a leak in that abstraction).

Python doesn't have these problems, so the only use for private
information is to warn your clients away from access to certain names.
There's no need for compiler enforcement of that, as a convention is
just as effective.

The remaining arguments are generally outgrowths of "but my code is
SECRET", which just isn't true in general, even less true of Python,
and not really a healthy attitude anyway.
Jan 8 '07 #49

P: n/a
Chris Mellon wrote:
Private data in the C++ and Java OO worlds is taught so much and
emphasized so often that people have started thinking of it as being
desirable for its own sake. But the primary motivation for it grew out
of the need to maintain compatible interfaces.
This is generally true, yes.

[...]
Python doesn't have these problems, so the only use for private
information is to warn your clients away from access to certain names.
There's no need for compiler enforcement of that, as a convention is
just as effective.
You'll have to be more clear, here. If you're writing a subclass of
some other class then any usage of private attributes in the superclass
potentially provides the benefit of a free choice in attribute names in
the subclass. If you wanted to warn people away from certain names, it
would be the public attributes that would require the warning, noting
that "your clients" in this context includes people extending classes
as well as those merely instantiating and using them.
The remaining arguments are generally outgrowths of "but my code is
SECRET", which just isn't true in general, even less true of Python,
and not really a healthy attitude anyway.
I don't care about secret attributes, and the namespace privacy aspect
doesn't bother me enough to use private attributes anyway, especially
since I'm the author of most of the superclasses I'm extending. But
either providing namespace privacy or convenient tools to mitigate
namespace sharing seems fairly important to me, at least.

Paul

Jan 8 '07 #50
