Why does python not have a mechanism for data hiding?

Hi,

First, Python is one of my fav languages, and I'll definitely keep
developing with it. But there's one thing that I -really- miss:
data hiding. I know member vars are private when you prefix them with
2 underscores, but I hate prefixing my vars, I'd rather add a keyword
before it.

Python advertises itself as a full OOP language, but why does it miss
one of the basic principles of OOP? Will it ever be added to Python?

Thanks in advance,
Lucas
Jun 27 '08
On Jun 3, 9:41 am, "Russ P." <Russ.Paie...@gmail.com> wrote:
Here's what I think Python should have. I think it should have a
keyword, something like "priv," to identify data or functions as
"private."
As I stated earlier in this thread, if you want a public interface and
a private implementation, rather than adding another language feature
why not just separate them into two classes? This is exactly what the
Bridge pattern provides and would clearly denote your intention in the
code.
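
A minimal sketch of the two-class split described above, not code from the thread. The names `Counter` and `_CounterImpl` are hypothetical, and this is only one loose reading of the Bridge pattern:

```python
class _CounterImpl:
    """Hypothetical implementation class holding the state."""

    def __init__(self):
        self.count = 0

    def bump(self):
        self.count += 1
        return self.count


class Counter:
    """Public interface; every call is forwarded to an implementation object."""

    def __init__(self, impl=None):
        # Callers may pass in their own implementation instead of the default.
        self._impl = impl if impl is not None else _CounterImpl()

    def increment(self):
        return self._impl.bump()
```

The state is still reachable through `counter._impl`, so this separates interface from implementation by convention rather than by enforcement.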
Jun 27 '08 #51
On Jun 2, 6:21 pm, alex23 <wuwe...@gmail.com> wrote:
On Jun 3, 9:41 am, "Russ P." <Russ.Paie...@gmail.com> wrote:
Here's what I think Python should have. I think it should have a
keyword, something like "priv," to identify data or functions as
"private."

As I stated earlier in this thread, if you want a public interface and
a private implementation, rather than adding another language feature
why not just separate them into two classes? This is exactly what the
Bridge pattern provides and would clearly denote your intention in the
code.
Yes, that looks interesting, but I think it has a couple of drawbacks.
First, it requires another completely separate class for the
"implementation" (although perhaps that could be a nested class).
Secondly, I think it essentially just adds a sort of inner namespace
through which the "private" data is accessed. That might be a good
idea, but I don't think it's quite the same as encapsulation.
Jun 27 '08 #52
On Jun 2, 5:11 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
"Russ P." <Russ.Paie...@gmail.com> writes:
I also realize, by the way, that Python allows a client of a class to
define a new class member from completely outside the class
definition. Obviously, that cannot be declared private.

This is bogus about 95% of the time though. For the cases where it is
really desired, I think it's best to require the target class to
enable it specifically somehow, maybe by inheriting from a special
superclass. That could let the compiler statically resolve member
lookups the rest of the time.
It did seem a bit odd to me when I realized that you can add data
members (or even "methods") to a class from completely outside the
class definition. That can be risky, of course, and as you suggest,
perhaps it shouldn't even be allowed by default.

I usually find that it's safer to initialize in the constructor all
(or nearly all) of the data members that will be needed in a class. If
I need a list that will be populated later, for example, I reserve the
name with an empty list in the constructor. Then, if for some reason
the list gets accessed before it is populated, I don't get an
exception.
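
A small sketch of the two points above, using hypothetical classes `Track` and `Strict`. The `__slots__` part is an addition for illustration: it is the standard opt-in for rejecting unlisted attributes, which is the reverse of the opt-in-to-allow default suggested in the quoted post.

```python
class Track:
    def __init__(self):
        # Reserve the name up front so early access sees an empty list
        # instead of raising AttributeError.
        self.points = []


t = Track()
early = len(t.points)  # safe even before the list is populated

# A client can also bolt a new attribute on from outside the class.
t.label = "added from outside"


class Strict:
    # Only the names listed here may be set on instances.
    __slots__ = ("points",)

    def __init__(self):
        self.points = []


def try_add_outside(obj):
    """Return True if a new attribute could be added to obj, else False."""
    try:
        obj.extra = 1
    except AttributeError:
        return False
    return True
```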
Jun 27 '08 #53
On Jun 3, 2:11 pm, "Russ P." <Russ.Paie...@gmail.com> wrote:
Yes, that looks interesting, but I think it has a couple of drawbacks.
First, it requires another completely separate class for the
"implementation" (although perhaps that could be a nested class).
That's hardly an enormous overhead, and it does clearly separate the
interface you want your "users" to have from the implementation. Even
better, since you seem so concerned with others meddling with your
implementation directly, they could provide their own quite easily if
they so choose.
Secondly, I think it essentially just adds a sort of inner namespace
through which the "private" data is accessed. That might be a good
idea, but I don't think it's quite the same as encapsulation.
It's a clear separation of concerns, check. It removes the underscored
methods you find so aesthetically offensive, check. I have absolutely
no idea what _you_ mean by "encapsulation".

Then again, I have no issue with the current convention and personally
find the idea of adding a "private" keyword makes as much sense as
being able to syntactically define "model", "view" and "controller"
methods.
Jun 27 '08 #54
On Jun 2, 10:23 pm, alex23 <wuwe...@gmail.com> wrote:
On Jun 3, 2:11 pm, "Russ P." <Russ.Paie...@gmail.com> wrote:
Yes, that looks interesting, but I think it has a couple of drawbacks.
First, it requires another completely separate class for the
"implementation" (although perhaps that could be a nested class).

That's hardly an enormous overhead, and it does clearly separate the
interface you want your "users" to have from the implementation. Even
better, since you seem so concerned with others meddling with your
implementation directly, they could provide their own quite easily if
they so choose.
Secondly, I think it essentially just adds a sort of inner namespace
through which the "private" data is accessed. That might be a good
idea, but I don't think it's quite the same as encapsulation.

It's a clear separation of concerns, check. It removes the underscored
methods you find so aesthetically offensive, check. I have absolutely
no idea what _you_ mean by "encapsulation".

Then again, I have no issue with the current convention and personally
find the idea of adding a "private" keyword makes as much sense as
being able to syntactically define "model", "view" and "controller"
methods.
I'm not saying your bridge pattern is no good. It is a form of
encapsulation, I suppose, but it is cumbersome. With my suggested
"priv" keyword, you just use it once (per data member or function) and
you never need to use it again. With your approach, not only do you
need twice the number of classes, you need to use the longer names
every time you access the data.
Jun 27 '08 #55
On Jun 2, 10:23 pm, alex23 <wuwe...@gmail.com> wrote:
Then again, I have no issue with the current convention and personally
find the idea of adding a "private" keyword makes as much sense as
being able to syntactically define "model", "view" and "controller"
methods.
Well, the designers of C++, Java, and Ada, to name just three very
popular languages (well, two) seem to think it makes sense. But maybe
you know more than they know.

Jun 27 '08 #56
On Jun 3, 3:42 pm, "Russ P." <Russ.Paie...@gmail.com> wrote:
Well, the designers of C++, Java, and Ada, to name just three very
popular languages (well, two) seem to think it makes sense. But maybe
you know more than they know.
You do realise the same argument could be made about you and the
designers of Python, right?

But regardless, an argument from authority is a sure sign that this
isn't worth continued debate.

Good luck with the PEP.
Jun 27 '08 #57
On 2008-06-03, Carl Banks <pa************@gmail.com> wrote:
On Jun 2, 10:14 am, Antoon Pardon <apar...@forel.vub.ac.be> wrote:
>On 2008-06-02, Carl Banks <pavlovevide...@gmail.com> wrote:
>Can't you look beyond the specific example? The GetX is just an example.
Any local function of __init__ has access to hidden and its attributes
and could manipulate them, even if the class wouldn't define getters
and setters.

Ok. You could have made the proof-of-concept-ness of your example
more clear by not, you know, binding those local functions to globally
accessible names, thus thwarting your own concept.

But still, we all knew functions could do that.
Well, maybe you all knew that, yet while I read the thread, I saw
no one mentioning this possibility to get what the OP seemed to
want. I also didn't see other possibilities. Your idea as an alternative
only came as a reaction to my idea.

I know my idea as presented was cumbersome. Maybe it can be worked out
more, maybe with the help of decorators to be less cumbersome.

I also know it can be circumvented in CPython.
Yet I think it was useful to explore what is already possible in Python
instead of just arguing about whether people need it or not.
I always find it odd when the "you don't need it" argument is used.
Of course that is correct, but Python has acquired a lot of things
that were originally labeled "you don't need it". So why people
still use it against introducing something is something I don't
entirely understand.
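
A sketch of the idea under discussion, reconstructed from the description only (the original example is not part of this excerpt). The class `Account` and its function names are hypothetical: state lives in a local variable of `__init__`, and only the local functions that close over it are attached to the instance.

```python
class Account:
    def __init__(self, balance):
        hidden = {"balance": balance}  # exists only in this call's closure

        def deposit(amount):
            hidden["balance"] += amount
            return hidden["balance"]

        def report():
            return "balance: %d" % hidden["balance"]

        # Only the functions are bound to the instance, never the data.
        self.deposit = deposit
        self.report = report


acct = Account(100)
acct.deposit(50)

# The circumvention mentioned above: the closure cells are inspectable.
leaked = acct.deposit.__closure__[0].cell_contents
```

The data never shows up in the instance's attributes, yet `__closure__` still hands out the dictionary, so this hides the state without protecting it.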

--
Antoon Pardon
Jun 27 '08 #58
On Mon, Jun 2, 2008 at 10:50 PM, Russ P. <Ru**********@gmail.com> wrote:
On Jun 2, 6:41 am, Carl Banks <pavlovevide...@gmail.com> wrote:
>You are not realizing that the only useful(**) thing about data hiding is
that some code has access to the data, other code does not. If you
"hide" data equally from everyone it's just a useless spelling change.

I think you're missing the point.

As I see it, the primary value of data hiding is that it provides
useful information on which data and methods are intended for the
client and which are intended for internal use. It's like putting a
front panel on a TV set with the main controls intended for the
viewer.
Here's my two cents. First of all, a TV is a bad analogy compared to
reusable software libraries. Really bad analogy. A TV is a horribly
complicated device which has to be dumbed down because otherwise it
would be too hard to use for ordinary people.

A software developer's relation to a third party library is more
similar to a TV repair man trying to repair a TV than to a random
person watching TV. For a repair man, the front panel is just useless
and in the way.

Oh, and to continue on the TV analogy, one of the reasons why a TV is
complicated is because its interface is totally different from its
implementation. Channels are just a bad abstraction for tuning the
receiver to different frequencies and for switching inputs. Merely
using a TV doesn't teach you anything about how it actually works.

KISS: Keep It Simple Stupid. And it is always simpler to not implement
the gunk needed for data hiding than to do it. By keeping things
simple you keep your code easy to implement, easy to understand and
easy to reuse.

Data hiding sacrifices implementation simplicity supposedly to make
the interface simpler and to keep backwards compatibility. It allows
you to change implementation details without affecting the
interface. But do you really want to do that? Consider this silly Java
example:

    class Foo {
        private int bar;
        public int getBar() {
            return bar;
        }
    }

Then for some reason you decide that hm, "bar" is not a good attribute
name so you change it to "babar". And you can do that without changing
the public interface! Woho! So now you have a public getter named
"getBar" that returns an attribute named "babar". That's in reality
just bad and whoever is maintaining the implementation is going to be
annoyed that the getter's name doesn't match the attribute name.

What would have happened without data hiding? Renaming the public
attribute "bar" to "babar" would probably cause some grief for someone
reusing your library, but you would keep your implementation pure.

What about semantic changes? Data hiding doesn't protect you against
that, so you'll have to change your interface anyway. The interface
for a car hasn't changed much in the last 100 years, but the
implementation has. How easy is it to repair a car nowadays compared
to 30 years ago?

And data hiding as a documentation aid is just a sham. "These methods
are public so you can call them, these aren't so hands off!" A reuser
of your library *will* want to know what happens on the inside, by
trying to make stuff impossible to reach you are just making that kind
of information much harder to come by.

The better method is to just write proper docstrings that tell the
user what the methods do and when they can be called.

Another good way to see how useless data hiding is, is to try and unit
test a very encapsulated library. You'll see that it is almost
impossible to write good unit tests unless you publicly export
almost everything in the code. At which point you come to realize that
all the data hiding was for naught.
--
mvh Björn
Jun 27 '08 #59
Lie
On May 24, 9:14 pm, Fuzzyman <fuzzy...@gmail.com> wrote:
On May 24, 2:58 pm, Ben Finney <bignose+hates-s...@benfinney.id.au>
wrote:
Sh4wn <luckyluk...@gmail.com> writes:
First, Python is one of my fav languages, and I'll definitely keep
developing with it. But there's one thing that I -really- miss:
data hiding. I know member vars are private when you prefix them with
2 underscores, but I hate prefixing my vars, I'd rather add a keyword
before it.
From whom are you trying to hide your attributes?

Actually, 'data hiding', although vastly overused by the static crowd
can be a reasonable thing to want.

For example, at Resolver Systems we expose the spreadsheet object
model to our users. It has a public, documented, API - plus a host of
undocumented internally used methods.

We would really *much* rather hide these, because anything our
customers start using (whether documented or not) we will probably
have to continue supporting and maintaining.

The 'we told you not to use that' approach, when applied to paying
customers doesn't really work... all they see is that you broke their
spreadsheet code by changing your API.

You can make members truly private by proxying, but it is a bit
ungainly.
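
One way such a proxy might look, as a rough sketch only. `make_proxy` and `Sheet` are hypothetical names, and nothing here reflects how Resolver Systems actually did it. The proxy forwards a fixed set of public names and refuses everything else:

```python
def make_proxy(target, public_names):
    """Return an object exposing only the listed attributes of target."""
    allowed = frozenset(public_names)

    class Proxy:
        __slots__ = ()  # the proxy itself stores nothing

        def __getattr__(self, name):
            # Reached only when normal lookup fails, i.e. for every
            # forwarded name, since the proxy has no attributes itself.
            if name in allowed:
                return getattr(target, name)
            raise AttributeError(name)

    return Proxy()


class Sheet:
    """Hypothetical stand-in for an object model with internal helpers."""

    def __init__(self):
        self._cells = {}

    def set_cell(self, key, value):
        self._cells[key] = value

    def get_cell(self, key):
        return self._cells[key]

    def _recalculate(self):  # internal, not meant for customers
        return len(self._cells)


sheet = make_proxy(Sheet(), ["set_cell", "get_cell"])
sheet.set_cell("A1", 42)
```

In pure Python this is still not airtight: a forwarded bound method exposes the real object through its `__self__` attribute, which may be part of why the approach is called ungainly.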
Then don't document it, or separate internal documentation (which is
never to pass through the wall) from public documentation (which your
users use). Nobody (apart from your dev team and anyone told by
your dev team, which means you may fire the person for "lack of
discipline") would know that there is such a thing, and in consequence
nobody would use it.

Don't tell your user not to use something, just don't tell them that
it exists and they won't use it.
Jun 27 '08 #60
On Jun 3, 1:42 am, "Russ P." <Russ.Paie...@gmail.com> wrote:
On Jun 2, 10:23 pm, alex23 <wuwe...@gmail.com> wrote:
Then again, I have no issue with the current convention and personally
find the idea of adding a "private" keyword makes as much sense as
being able to syntactically define "model", "view" and "controller"
methods.

Well, the designers of C++, Java, and Ada, to name just three very
popular languages (well, two) seem to think it makes sense. But maybe
you know more than they know.
And even more (well, almost all) languages use explicit delimiters for
defining blocks instead of indentation, so what's your point ?
Jun 27 '08 #61
Lie
On Jun 3, 5:07 pm, "BJörn Lindqvist" <bjou...@gmail.com> wrote:
On Mon, Jun 2, 2008 at 10:50 PM, Russ P. <Russ.Paie...@gmail.com> wrote:
On Jun 2, 6:41 am, Carl Banks <pavlovevide...@gmail.com> wrote:
You are not realizing that the only useful(**) thing about data hiding is
that some code has access to the data, other code does not. If you
"hide" data equally from everyone it's just a useless spelling change.
I think you're missing the point.
As I see it, the primary value of data hiding is that it provides
useful information on which data and methods are intended for the
client and which are intended for internal use. It's like putting a
front panel on a TV set with the main controls intended for the
viewer.

Here's my two cents. First of all, a TV is a bad analogy compared to
reusable software libraries. Really bad analogy. A TV is a horribly
complicated device which has to be dumbed down because otherwise it
would be too hard to use for ordinary people.
I think it's actually quite a good analogy: a class may get quite
complicated, and it may do completely different things than the public
interface seems to imply. Anyway, an analogy is an analogy; don't
expect it to be exactly the same as the case itself, expect it to
illustrate the point well enough, and ignore the differences not being
illustrated.

TV is a good analogy since it illustrates the point quite well: that
there are some things users may freely interact with, some that users
should not mess with, and things that are strictly not for you.
Nevertheless, with the proper knowledge and proper tools, any user could
get the special screwdriver and open the TV case; if all else
fails, he could always get a hammer to break the casing and force his
way through.

Python does not enforce data hiding because it expects people who
ignore the warning and get the special screwdriver to be
knowledgeable enough to mess with it. C/C++ expects people to use a
hammer to break through their casings, and in the end, since the
casing has already been broken, the device may never look as
beautiful as before. In Python, the device may appear to look just as
beautiful.
A software developer's relation to a third party library is more
similar to a TV repair man trying to repair a TV than to a random
person watching TV. For a repair man, the front panel is just useless
and in the way.
No, a knowledgeable man (a TV repairman) would first try to fix the
TV without opening the case (such as checking whether the power cable is
actually plugged in), and if those attempts fail (or if he already knows
where the damage is from the beginning), he'd then undo the screws.
The public interface isn't "useless and in the way".
Oh, and to continue on the TV analogy, one of the reasons why a TV is
complicated is because its interface is totally different from its
implementation. Channels are just a bad abstraction for tuning the
receiver to different frequencies and for switching inputs. Merely
using a TV doesn't teach you anything about how it actually works.
Why couldn't a class have an interface that's a completely different
thing from the implementation?
KISS: Keep It Simple Stupid. And it is always simpler to not implement
the gunk needed for data hiding than to do it. By keeping things
simple you keep your code easy to implement, easy to understand and
easy to reuse.

Data hiding sacrifices implementation simplicity supposedly to make
the interface simpler and to keep backwards compatibility. It allows
you to change implementation details without affecting the
interface. But do you really want to do that? Consider this silly Java
example:

    class Foo {
        private int bar;
        public int getBar() {
            return bar;
        }
    }

Then for some reason you decide that hm, "bar" is not a good attribute
name so you change it to "babar". And you can do that without changing
the public interface! Woho! So now you have a public getter named
"getBar" that returns an attribute named "babar". That's in reality
just bad and whoever is maintaining the implementation is going to be
annoyed that the getter's name doesn't match the attribute name.

What would have happened without data hiding? Renaming the public
attribute "bar" to "babar" would probably cause some grief for someone
reusing your library, but you would keep your implementation pure.

What about semantic changes? Data hiding doesn't protect you against
that, so you'll have to change your interface anyway. The interface
for a car hasn't changed much in the last 100 years, but the
implementation has. How easy is it to repair a car nowadays compared
to 30 years ago?

And data hiding as a documentation aid is just a sham. "These methods
are public so you can call them, these aren't so hands off!" A reuser
of your library *will* want to know what happens on the inside, by
trying to make stuff impossible to reach you are just making that kind
of information much harder to come by.
And we expect those people to be ready for the car to blow up right in
their faces, since they have crossed the line. If they cross the line
and still think that we're to blame for their burnt faces, that's their
problem.
The better method is to just write proper docstrings that tell the
user what the methods do and when they can be called.

Another good way to see how useless data hiding is, is to try and unit
test a very encapsulated library. You'll see that it is almost
impossible to write good unit tests unless you publicly export
almost everything in the code. At which point you come to realize that
all the data hiding was for naught.

--
mvh Björn
Jun 27 '08 #62
On 2008-06-03, George Sakkis <ge***********@gmail.com> wrote:
On Jun 3, 1:42 am, "Russ P." <Russ.Paie...@gmail.com> wrote:
>On Jun 2, 10:23 pm, alex23 <wuwe...@gmail.com> wrote:
Then again, I have no issue with the current convention and personally
find the idea of adding a "private" keyword makes as much sense as
being able to syntactically define "model", "view" and "controller"
methods.

Well, the designers of C++, Java, and Ada, to name just three very
popular languages (well, two) seem to think it makes sense. But maybe
you know more than they know.

And even more (well, almost all) languages use explicit delimiters for
defining blocks instead of indentation, so what's your point ?
Hmm, difficult to react to this. On the one hand I have had people
argue that block delimiting in python is explicit too. So in that
case python doesn't differ from those other languages.

On the other hand if we accept that blocks are delimited implicitly
in python then it seems python doesn't follow its own zen:

Explicit is better than implicit

--
Antoon Pardon
Jun 27 '08 #63
>
Hmm, difficult to react to this. On the one hand I have had people
argue that block delimiting in python is explicit too. So in that
case python doesn't differ from those other languages.

On the other hand if we accept that blocks are delimited implicitly
in python then it seems python doesn't follow its own zen:

Explicit is better than implicit
So also duck typing is against python's philosophy? :)
Jun 27 '08 #64
Lie
On May 24, 9:14 pm, Fuzzyman <fuzzy...@gmail.com> wrote:
On May 24, 2:58 pm, Ben Finney <bignose+hates-s...@benfinney.id.au>
wrote:
Sh4wn <luckyluk...@gmail.com> writes:
First, Python is one of my fav languages, and I'll definitely keep
developing with it. But there's one thing that I -really- miss:
data hiding. I know member vars are private when you prefix them with
2 underscores, but I hate prefixing my vars, I'd rather add a keyword
before it.
From whom are you trying to hide your attributes?

Actually, 'data hiding', although vastly overused by the static crowd
can be a reasonable thing to want.

For example, at Resolver Systems we expose the spreadsheet object
model to our users. It has a public, documented, API - plus a host of
undocumented internally used methods.

We would really *much* rather hide these, because anything our
customers start using (whether documented or not) we will probably
have to continue supporting and maintaining.

The 'we told you not to use that' approach, when applied to paying
customers doesn't really work... all they see is that you broke their
spreadsheet code by changing your API.
The problem is you're not hard enough; you let yourself be enslaved
by your customers. If they have a problem because they used a private
interface, that's their problem; they have to fix it on their side or
go away and use a competing product[1]. Even if they're paying
customers they're not your master or your God, even if they're a
larger company than yours.

Python has an extremely good design because the BDFL doesn't just
listen to everyone and create a product that tries to please
everybody. No, he listens to those who have good ideas, tells the
stupid ideas to go away, and applies a subjective decision which
more often than not leads to a better Python.

[1] In most cases, they would become silent at this point and fix
their code, because they know there is nothing they can do to change
your decision. It's often more expensive to move to competing products,
so they'd either use old versions or fix those places where they've
used a private interface, and avoid using private interfaces in the
future.
You can make members truly private by proxying, but it is a bit
ungainly.

Michael Foord
http://www.ironpythoninaction.com/

(snip)
Jun 27 '08 #65
On May 25, 9:50 pm, miller.pau...@gmail.com wrote:
On May 24, 9:41 am, Sh4wn <luckyluk...@gmail.com> wrote:
Python advertises itself as a full OOP language, but why does it miss
one of the basic principles of OOP? Will it ever be added to Python?

Others have already answered this directly, but I'd like to mention
that languages I know of which have this feature also have a feature
for getting around it. (e.g. C++ and friend classes) I don't know
about you, but I don't want features in the language that make me want
to circumvent them. Do you?
I'm curious as to how 'private' fits with the Open Source philosophy.
Sure, I can (and do) hide stuff with the double underscore technique,
but anyone using my code can open it up and add an "accessor" method
anytime they want, so nothing is really hidden. I think the
"consenting adults" approach is the best one could hope for with FOSS.
Jun 27 '08 #66
Russ P. wrote:
On Jun 2, 6:41 am, Carl Banks <pavlovevide...@gmail.com> wrote:
>You are not realizing that the only useful(**) thing about data hiding is
that some code has access to the data, other code does not. If you
"hide" data equally from everyone it's just a useless spelling change.

I think you're missing the point.

As I see it, the primary value of data hiding is that it provides
useful information on which data and methods are intended for the
client and which are intended for internal use. It's like putting a
front panel on a TV set with the main controls intended for the
viewer.

People seem to be preoccupied with whether or not the back panel of
the TV is locked, but that is not the main issue. Sure, you probably
want to make the back panel removable, but you don't want the viewer
opening it up to change the channel, and you certainly don't want to
put all the internal adjustments for factory technicians together with
the controls for the end user.

As far as I am concerned, the current Python method of using
underscores to distinguish between internal and external methods and
data is an ugly hack that goes completely against the elegance of the
language in other areas.
As far as I'm concerned, it's JustFine(tm). I don't have to ask myself
if an attribute is part of the API or not, I know it immediately.
It is like a TV set with no back cover and
the volume and channel controls intermingled with the factory
controls. The underscores are just an afterthought like a red dot or
something used to tell the TV viewer what to fiddle with.
Your opinion. But beware of leaky TV-Set-metaphor abstractions.
Python is a very nice language overall, but as far as I am concerned
the underscore convention is a blemish. I just wish people wouldn't
get so infatuated with the language that they cannot see the obvious
staring them in the face.
I definitely don't have a problem with this naming convention, which I'd
find useful even with a language having enforced access restrictions. If
that's the only (or worst) wart you find in Python, then it must
surely be a pretty good language !-)
Jun 27 '08 #67
On May 24, 3:41 pm, Sh4wn <luckyluk...@gmail.com> wrote:
First, Python is one of my fav languages, and I'll definitely keep
developing with it. But there's one thing that I -really- miss:
data hiding. I know member vars are private when you prefix them with
2 underscores, but I hate prefixing my vars, I'd rather add a keyword
before it.
Python has no data hiding because C++ has (void *).

Python's double underscores do some name mangling, but do not attempt
any data hiding.
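
To make the mangling concrete, here is a short sketch with a hypothetical class `Widget`. Inside the class body, a name with two leading underscores is rewritten to include the class name, so the attribute is renamed rather than hidden:

```python
class Widget:
    def __init__(self):
        self.__secret = 1   # actually stored as _Widget__secret
        self._internal = 2  # single underscore: convention only, no mangling


def read_unmangled(obj):
    """Try the unmangled name from outside the class; None if it is absent."""
    try:
        return obj.__secret  # not mangled here: no enclosing class body
    except AttributeError:
        return None


w = Widget()
mangled_value = w._Widget__secret  # the mangled name is reachable by anyone
```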

Python and C have about the same approach to data hiding. It is well
tried, and works equally well in both languages:

# this is mine, keep your filthy paws off!!!

Irresponsible programmers should not be allowed near a computer
anyway. If you use data hiding to protect your code from yourself,
what you really need is some time off to reconsider your career.

Jun 27 '08 #68
On Jun 3, 3:07 am, "BJörn Lindqvist" <bjou...@gmail.com> wrote:
On Mon, Jun 2, 2008 at 10:50 PM, Russ P. <Russ.Paie...@gmail.com> wrote:
On Jun 2, 6:41 am, Carl Banks <pavlovevide...@gmail.com> wrote:
You are not realizing that the only useful(**) thing about data hiding is
that some code has access to the data, other code does not. If you
"hide" data equally from everyone it's just a useless spelling change.
I think you're missing the point.
As I see it, the primary value of data hiding is that it provides
useful information on which data and methods are intended for the
client and which are intended for internal use. It's like putting a
front panel on a TV set with the main controls intended for the
viewer.

Here's my two cents. First of all, a TV is a bad analogy compared to
reusable software libraries. Really bad analogy. A TV is a horribly
complicated device which has to be dumbed down because otherwise it
would be too hard to use for ordinary people.

A software developer's relation to a third party library is more
similar to a TV repair man trying to repair a TV than to a random
person watching TV. For a repair man, the front panel is just useless
and in the way.

Oh, and to continue on the TV analogy, one of the reasons why a TV is
complicated is because its interface is totally different from its
implementation. Channels are just a bad abstraction for tuning the
receiver to different frequencies and for switching inputs. Merely
using a TV doesn't teach you anything about how it actually works.

KISS: Keep It Simple Stupid. And it is always simpler to not implement
the gunk needed for data hiding than to do it. By keeping things
simple you keep your code easy to implement, easy to understand and
easy to reuse.

Data hiding sacrifices implementation simplicity supposedly to make
the interface simpler and to keep backwards compatibility. It allows
you to change implementation details without affecting the
interface. But do you really want to do that? Consider this silly Java
example:

    class Foo {
        private int bar;
        public int getBar() {
            return bar;
        }
    }

Then for some reason you decide that hm, "bar" is not a good attribute
name so you change it to "babar". And you can do that without changing
the public interface! Woho! So now you have a public getter named
"getBar" that returns an attribute named "babar". That's in reality
just bad and whoever is maintaining the implementation is going to be
annoyed that the getter's name doesn't match the attribute name.

What would have happened without data hiding? Renaming the public
attribute "bar" to "babar" would probably cause some grief for someone
reusing your library, but you would keep your implementation pure.

What about semantic changes? Data hiding doesn't protect you against
that, so you'll have to change your interface anyway. The interface
for a car hasn't changed much in the last 100 years, but the
implementation has. How easy is it to repair a car nowadays compared
to 30 years ago?

And data hiding as a documentation aid is just a sham. "These methods
are public so you can call them, these aren't so hands off!" A reuser
of your library *will* want to know what happens on the inside, by
trying to make stuff impossible to reach you are just making that kind
of information much harder to come by.

The better method is to just write proper docstrings that tell the
user what the methods do and when they can be called.

Another good way to see how useless data hiding is, is to try and unit
test a very encapsulated library. You'll see that it is almost
impossible to write good unit tests unless you publicly export
almost everything in the code. At which point you come to realize that
all the data hiding was for naught.

--
mvh Björn
I really like this message and find it very true. Writing unit tests
for private data is nigh impossible. You end up either creating
accessors, or passing in parameters via the constructor (resulting in
a huge constructor). Personally, I'd rather have better test coverage
than data hiding.

Second, private vars with third party libs suck, and are nothing but
an infuriating frustration. I'm currently dealing with about 3 or 4
different libs, one of them uses private variables and it's a huge
headache. I have to access some of those private vars occasionally to
make my thing work. The other libs I'm using don't have any private
vars (__) (only a couple protected ones, _), and it's a breeze. The
docs say "this does x" or there's a comment that says "don't use this
unless you really know what you're doing," and I respect their
warnings.

When I was fooling around with sqlalchemy, it made heavy use of
protected vars but had a straightforward public api. Unfortunately,
writing plugins for it required access to some of those protected
vars. It wouldn't be possible if they were strictly controlled and
restricted by the language itself. Whenever I'd use those protected
vars, I expected an odd behavior or two. When using private vars, I
don't expect it to work at all, and really, refrain from using them
unless I've grokked the source.

My point is that I currently like the private/protected/public scheme
python has going on. It lets me fix or alter things if I have to, but
also provides a warning that I shouldn't be doing this.

As for customers using the internals and worrying about an upgrade
breaking them, it seems like a silly issue, at least in python. If
there are internals that the customer would be playing with, then they
should be exposed publicly, since they want it that way to begin
with. If they're using defunct variables or methods, you use
properties and __getattr__ to maintain backwards compatibility for a
version or two.
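
A minimal sketch of that last point, assuming Python 3. Nothing here comes from the libraries mentioned above: the class `Shape`, the retired names `w` and `getWidth`, and the `_RENAMED` table are all hypothetical, made up for illustration.

```python
import warnings


class Shape:
    # Hypothetical table of retired method names -> current names.
    _RENAMED = {"getWidth": "get_width"}

    def __init__(self, width):
        self.width = width  # current public attribute

    def get_width(self):
        return self.width

    @property
    def w(self):
        # Old attribute name, kept alive as a read-only alias.
        warnings.warn("'w' is deprecated; use 'width'",
                      DeprecationWarning, stacklevel=2)
        return self.width

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        new_name = self._RENAMED.get(name)
        if new_name is None:
            raise AttributeError(name)
        warnings.warn(f"{name!r} is deprecated; use {new_name!r}",
                      DeprecationWarning, stacklevel=2)
        return getattr(self, new_name)


# Old client code keeps working, but each old name emits a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    s = Shape(3)
    old_values = (s.w, s.getWidth())
```

The property covers a renamed attribute and `__getattr__` covers renamed methods; after a version or two both can simply be deleted.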
Jun 27 '08 #69
On Jun 2, 12:40 pm, Antoon Pardon <apar...@forel.vub.ac.be> wrote:
I think you completely missed the point.

This is just a proof of concept thing. In a real example there would
of course be no Set and Get methods but just methods that in the course
of their execution would access or update the hidden attributes
I have to agree with Banks here, you have not provided an example of
data hiding. It does not discriminate between attribute access from
within and from outside the class. You just assume that the attribute
named 'hidden' will be left alone. Also naming it hidden is stupid as
it is visible.

What you need is a mechanism that will throw an exception whenever an
attribute is accessed from outside the class, but not from inside.

The mechanism must also be impossible to override with additional
code.

If Ada is what you want, Ada is what you should use.



Jun 27 '08 #70
On Jun 3, 4:21 am, George Sakkis <george.sak...@gmail.com> wrote:
On Jun 3, 1:42 am, "Russ P." <Russ.Paie...@gmail.com> wrote:
On Jun 2, 10:23 pm, alex23 <wuwe...@gmail.com> wrote:
Then again, I have no issue with the current convention and personally
find the idea of adding a "private" keyword makes as much sense as
being able to syntactically define "model", "view" and "controller"
methods.
Well, the designers of C++, Java, and Ada, to name just three very
popular languages (well, two) seem to think it makes sense. But maybe
you know more than they know.

And even more (well, almost all) languages use explicit delimiters for
defining blocks instead of indentation, so what's your point ?
You are comparing a syntactic convention with a more fundamental
aspect of the language. But beyond that, I dislike braces as
delimiters for the same reason I dislike leading underscores: both are
unnecessary syntactic noise. And the whole idea of encoding properties
of an object in its name just seems tacky to me.

What is it about leading underscores that bothers me? To me, they are
like a small pebble in your shoe while you are on a hike. Yes, you can
live with it, and it does no harm, but you still want to get rid of it.
Jun 27 '08 #71
On Jun 3, 11:02 am, Richard Levasseur <richard...@gmail.com> wrote:
On Jun 3, 3:07 am, "BJörn Lindqvist" <bjou...@gmail.com> wrote:
On Mon, Jun 2, 2008 at 10:50 PM, Russ P. <Russ.Paie...@gmail.com> wrote:
On Jun 2, 6:41 am, Carl Banks <pavlovevide...@gmail.com> wrote:
You are not realizing that the only useful(**) thing about data hiding is
that some code has access to the data, other code does not. If you
"hide" data equally from everyone it's just a useless spelling change.
I think you're missing the point.
As I see it, the primary value of data hiding is that it provides
useful information on which data and methods are intended for the
client and which are intended for internal use. It's like putting a
front panel on a TV set with the main controls intended for the
viewer.
Here's my two cents. First of all, a TV is a bad analogy compared to
reusable software libraries. Really bad analogy. A TV is a horribly
complicated device which has to be dumbed down because otherwise it
would be too hard to use for ordinary people.
A software developer's relation to a third party library is more
similar to a TV repair man trying to repair a TV than to a random
person watching TV. For a repair man, the front panel is just useless
and in the way.
Oh, and to continue on the TV analogy, one of the reasons why a TV is
complicated is because its interface is totally different from its
implementation. Channels are just a bad abstraction for tuning the
receiver to different frequencies and for switching inputs. Merely
using a TV doesn't teach you anything about how it actually works.
KISS: Keep It Simple Stupid. And it is always simpler to not implement
the gunk needed for data hiding than to do it. By keeping things
simple you keep your code easy to implement, easy to understand and
easy to reuse.
Data hiding sacrifices implementation simplicity supposedly to make
the interface simpler and to keep backwards compatibility. It allows
you to change implementation details without affecting the
interface. But do you really want to do that? Consider this silly Java
example:
class Foo {
    private int bar;

    public int getBar() {
        return bar;
    }
}
Then for some reason you decide that hm, "bar" is not a good attribute
name so you change it to "babar". And you can do that without changing
the public interface! Woho! So now you have a public getter named
"getBar" that returns an attribute named "babar". That's in reality
just bad and whoever is maintaining the implementation is going to be
annoyed that the getter's name doesn't match the attribute name.
What would have happened without data hiding? Renaming the public
attribute "bar" to "babar" would probably cause some grief for someone
reusing your library, but you would keep your implementation pure.
What about semantic changes? Data hiding doesn't protect you against
that, so you'll have to change your interface anyway. The interface
for a car hasn't changed much in the last 100 years, but the
implementation has. How easy is it to repair a car nowadays compared
to 30 years ago?
And data hiding as a documentation aid is just a sham. "These methods
are public so you can call them, these aren't so hands off!" A reuser
of your library *will* want to know what happens on the inside, by
trying to make stuff impossible to reach you are just making that kind
of information much harder to come by.
The better method is to just write proper docstrings that tell the
user what the methods do and when they can be called.
Another good way to see how useless data hiding is, is to try and unit
test a very encapsulated library. You'll see that it is almost
impossible to write good unit tests unless you publicly export
almost everything in the code. At which point you come to realize that
all the data hiding was for naught.
--
mvh Björn

I really like this message and find it very true. Writing unit tests
for private data is nigh impossible. You end up either creating
accessors, or passing in parameters via the constructor (resulting in
a huge constructor). Personally, I'd rather have better test coverage
than data hiding.

Second, private vars with third party libs suck, and are nothing but
an infuriating frustration. I'm currently dealing with about 3 or 4
different libs, one of them uses private variables and it's a huge
headache. I have to access some of those private vars occasionally to
make my thing work. The other libs I'm using don't have any private
vars (__) (only a couple protected ones, _), and it's a breeze. The
docs say "this does x" or there's a comment that says "don't use this
unless you really know what you're doing," and I respect their
warnings.

When I was fooling around with sqlalchemy, it made heavy use of
protected vars but had a straightforward public api. Unfortunately,
writing plugins for it required access to some of those protected
vars. It wouldn't be possible if they were strictly controlled and
restricted by the language itself. Whenever I'd use those protected
vars, I expected an odd behavior or two. When using private vars, I
don't expect it to work at all, and really, refrain from using them
unless I've grokked the source.

My point is that I currently like the private/protected/public scheme
python has going on. It lets me fix or alter things if I have to, but
also provides a warning that I shouldn't be doing this.

As for customers using the internals and worrying about an upgrade
breaking them, it seems like a silly issue, at least in python. If
there are internals that the customer would be playing with, then they
should be exposed publicly, since they want it that way to begin
with. If they're using defunct variables or methods, you use
properties and __getattr__ to maintain backwards compatibility for a
version or two.
If you think that private data and methods should not be allowed
because they complicate unit testing, then I suggest you take a look
at how unit testing is done in C++, Java, and Ada. They seem to do
just fine. Also, I have stated several times now that "back door"
access should be allowed. That should satisfy any need for access to
"private" data in unit testing.

But I think there is a more fundamental issue here. You complain about
problems with software that uses data encapsulation. So two
possibilities exist here: either the designers of the code were not
smart enough to understand what data or methods the client would need,
or the client is not smart enough to understand what they need. Maybe
the solution is smarter programmers and clients rather than a dumber
language.
Jun 27 '08 #72
On Jun 4, 4:29 am, "Russ P." <Russ.Paie...@gmail.com> wrote:
If you think that private data and methods should not be allowed
because they complicate unit testing, then I suggest you take a look
at how unit testing is done in C++, Java, and Ada. They seem to do
just fine.
Nice to put the burden of evidence back onto everyone else, but doing
a bit of searching I found the following "answers" to the question of
unit-testing private functions & methods:
I suggest that tests should be written only for the public methods.
You can use a debugger, probably Carbide. That way you can see
all the variables. Otherwise, write the values to a log or EMCT.
You can make the logging only happen for debug builds if you don't
want the logging in the production code. If you really need to
see the private variables from your code, declare them public in
debug builds.
Problem is testing private functions. Some can be fixed by
promoting private to protected, inheriting the class adding
testing in the class. Others get refactored out the classes
they reside in and get put into their own functor classes[...]
So the basic answers I'm seeing that "do just fine" are:

1. Don't test private functions.
2. Add functionality _to_ the private functions for testing.
3. Change the interface for the purpose of testing.

All of which seem exceptionally inefficient and run counter to the
whole purpose of unit testing.
But I think there is a more fundamental issue here. You complain about
problems with software that uses data encapsulation. So two
possibilities exist here: either the designers of the code were not
smart enough to understand what data or methods the client would need,
or the client is not smart enough to understand what they need. Maybe
the solution is smarter programmers and clients rather than a dumber
language.
This is the most ludicrous argument I've ever heard. Of _course_ we
can't predict every possible usage of our code that others might want
it for. If someone can easily extend code that I've written to improve
or increase its functionality, why would I want to prevent them from
doing so?

Then again, I tend to think of other programmers as "peers" rather
than clients. YMMV.
Jun 27 '08 #73
alex23 <wu*****@gmail.com> writes:
So the basic answers I'm seeing that "do just fine" are:

1. Don't test private functions.
2. Add functionality _to_ the private functions for testing.
3. Change the interface for the purpose of testing.

All of which seem exceptionally inefficient and run counter to the
whole purpose of unit testing.
It seems you have a different idea of what unit testing is for from
me.

Isn't the entire point of encapsulation to separate internal
components from the external interface?

Why would a unit test, the whole purpose of which is to assert some
aspect of the external behaviour of the unit of code, care about how
that code unit is implemented internally?

If changing the internal, encapsulated components of a unit causes its
external behaviour to change, that's a bug; either in the change made
(it shouldn't have altered the external behaviour), or in the unit
test asserting the wrong thing (it shouldn't be asserting anything
about internal state of the code).

--
\ “Try to become not a man of success, but try rather to become |
`\ a man of value.” -- Albert Einstein |
_o__) |
Ben Finney
Jun 27 '08 #74
On Jun 3, 8:50 pm, Ben Finney <bignose+hates-s...@benfinney.id.au>
wrote:
alex23 <wuwe...@gmail.com> writes:
So the basic answers I'm seeing that "do just fine" are:
1. Don't test private functions.
2. Add functionality _to_ the private functions for testing.
3. Change the interface for the purpose of testing.
All of which seem exceptionally inefficient and run counter to the
whole purpose of unit testing.

It seems you have a different idea of what unit testing is for from
me.

Isn't the entire point of encapsulation to separate internal
components from the external interface?

Why would a unit test, the whole purpose of which is to assert some
aspect of the external behaviour of the unit of code, care about how
that code unit is implemented internally?

If changing the internal, encapsulated components of a unit causes its
external behaviour to change, that's a bug; either in the change made
(it shouldn't have altered the external behaviour), or in the unit
test asserting the wrong thing (it shouldn't be asserting anything
about internal state of the code).

--
\ “Try to become not a man of success, but try rather to become |
`\ a man of value.” -- Albert Einstein |
_o__) |
Ben Finney
Thank you. Let me just add that, as I said before, I think "private"
data (if it were added to Python) should be accessible through some
sort of "indirect" mechanism akin to the double-leading-underscore
rule. Then, even if it *is* needed for unit testing, it can be
accessed.

As for unit testing in C++, Java, and Ada, I confess I know nothing
about it, but I assume it gets done. Considering that Ada is used to
manage and control fighter jets, cruise missiles, and nuclear
arsenals, let's hope it gets done right.
Jun 27 '08 #75
On Wed, 04 Jun 2008 13:50:42 +1000, Ben Finney wrote:
alex23 <wu*****@gmail.com> writes:
So the basic answers I'm seeing that "do just fine" are:

1. Don't test private functions.
2. Add functionality _to_ the private functions for testing.
3. Change the interface for the purpose of testing.

All of which seem exceptionally inefficient and run counter to the
whole purpose of unit testing.

It seems you have a different idea of what unit testing is for from
me.
For me it's about finding bugs where documentation and implementation
disagree. And if you document private functions it makes sense to me to
also test if they work as documented. Because the official API relies on
the correct implementation of the private parts it uses under the hood.
Isn't the entire point of encapsulation to separate internal
components from the external interface?

Why would a unit test, the whole purpose of which is to assert some
aspect of the external behaviour of the unit of code, care about how
that code unit is implemented internally?
One part of writing unit tests is invoking functions with arguments that
you think are "corner cases". For example test if a function that takes a
list doesn't bomb out when you feed the empty list into it. Or if it
handles all errors correctly.

If a function `f()` calls internally `_g()` and that function might even
call other private functions, then you have to know how `f()` works
internally to create input that checks if error handling in `_g()` works
correctly. So it goes against your understanding of unit tests.

What do you do in such a situation? Build something from untested private
parts and just test the assembled piece? I prefer to test the private
functions too. After all, the private functions are not private to
everybody; there *are* functions that rely on them working correctly.
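
A rough sketch of the situation, assuming Python 3. The functions `total_quantity` (playing the role of `f()`) and `_parse_quantity` (playing the role of `_g()`) are hypothetical, made up for illustration:

```python
def _parse_quantity(text):
    """Private helper: turn a string into a non-negative int."""
    text = text.strip()
    if not text:
        raise ValueError("empty quantity")
    value = int(text)
    if value < 0:
        raise ValueError("negative quantity")
    return value


def total_quantity(fields):
    """Public API: sum the quantities, skipping blank fields."""
    return sum(_parse_quantity(f) for f in fields if f.strip())


def raises_value_error(func, *args):
    """Tiny test helper: True if calling func(*args) raises ValueError."""
    try:
        func(*args)
    except ValueError:
        return True
    return False


# Test through the public API only: blank fields are filtered out first,
# so the helper's "empty quantity" branch is never reached this way.
public_result = total_quantity(["1", "  ", "2"])

# Test the private helper directly: its error handling is exercised
# without reverse-engineering which public inputs would reach it.
empty_rejected = raises_value_error(_parse_quantity, "   ")
negative_rejected = raises_value_error(_parse_quantity, "-1")
```

Here the "empty quantity" branch cannot be triggered through the public function at all, so only a direct test of the helper covers it.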

Ciao,
Marc 'BlackJack' Rintsch
Jun 27 '08 #76
On 2008-06-03, Lie <Li******@gmail.com> wrote:
Python has an extremely good design because the BDFL doesn't just
listen to everyone and create a product that tries to please
everybody, no, he listens to those that have good ideas and tells the
stupid ideas to go away and he applies a subjective decision which
more often than not leads to a better python.
I agree that Guido van Rossum has done an excellent job. That doesn't
mean he has to be painted as infallible, such that the ideas he accepts
are good ideas and those he rejects are bad ideas almost by definition.

Guido has been known to change his mind, which is an admirable quality,
but it does show that at some point he rejected a good idea or accepted
a bad idea.

--
Antoon Pardon
Jun 27 '08 #77
On 2008-06-03, sturlamolden <st**********@yahoo.no> wrote:
On Jun 2, 12:40 pm, Antoon Pardon <apar...@forel.vub.ac.be> wrote:
I think you completely missed the point.

This is just a proof of concept thing. In a real example there would
of course be no Set and Get methods but just methods that in the course
of their execution would access or update the hidden attributes

I have to agree with Banks here, you have not provided an example of
data hiding. It does not discriminate between attribute access from
within and from outside the class. You just assume that the attribute
named 'hidden' will be left alone. Also naming it hidden is stupid as
it is visible.
No, I don't assume that hidden will be left alone. hidden is a free
variable in a closure and thus simply can't be accessed except by
local functions that were made accessible (and some mechanism
dependent on the CPython implementation).
What you need is a mechanism that will throw an exception whenever an
attribute is accessed from outside the class, but not from inside.
And my example does this. It threw an AttributeError.
The mechanism must also be impossible to override with additional
code.
Which as far as I know it is.
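
The original proof of concept is not reproduced in this part of the thread, so the following is only a sketch of the general closure technique, assuming Python 3 (`nonlocal` did not exist in the Python 2 releases current at the time). The names `make_counter`, `Counter` and `hidden` are made up for illustration.

```python
def make_counter():
    hidden = 0  # free variable: lives in the closure, not on the instance

    class Counter:
        def increment(self):
            nonlocal hidden
            hidden += 1
            return hidden

    return Counter()


counter = make_counter()
counter.increment()
counter.increment()

# From outside, there is no attribute to reach for.
try:
    counter.hidden
    reachable = True
except AttributeError:
    reachable = False

# The implementation-dependent back door: the function object
# still exposes its closure cells to introspection.
cell = counter.increment.__func__.__closure__[0]
peeked = cell.cell_contents
```

Ordinary attribute access fails with AttributeError, but the value can still be read through the function's closure cells, which is the implementation-level mechanism alluded to above.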

--
Antoon Pardon
Jun 27 '08 #78
On 2008-06-04, Marc 'BlackJack' Rintsch <bj****@gmx.net> wrote:
it makes sense to me to also test if they work as documented.

If they affect the behaviour of some public component, that's where
the documentation should be.

As I said they are public themselves for someone.
Isn't that contradictory? "Public for someone"? I always
thought "public" meant accessible to virtually anyone,
not only to someone.

--
Antoon Pardon
Jun 27 '08 #79
On May 26, 7:32 am, "Joe P. Cool" <joe.p.c...@googlemail.com> wrote:
I saw this "don't need it" pattern in discussions about the ternary
"if..else" expression and about "except/finally on the same block
level".
Now Python has both.
if/else was added solely because people kept coming up with ways of
embedding a pseudo conditional inside expressions and writing buggy
code in the process. All it really saves you in practice is a bit of
vertical whitespace, so, no, you still don't need it - but if you
insist on doing it, at least there's now an easy way to do it
correctly.
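
For anyone who hasn't run into it, one of the common pseudo conditionals was the `cond and a or b` idiom. A small sketch of why it bites, assuming Python 2.5 or later (the function names are made up for illustration):

```python
def pick_and_or(cond, a, b):
    # Old workaround: silently wrong whenever `a` is falsy.
    return cond and a or b


def pick_ternary(cond, a, b):
    # Conditional expression: picks `a` whenever cond is true.
    return a if cond else b


buggy = pick_and_or(True, 0, 5)     # falls through to b because 0 is falsy
correct = pick_ternary(True, 0, 5)  # returns a as intended
```

The two agree for truthy values of `a`, which is exactly why the bug tended to go unnoticed.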

except/finally on the same block level was trivial to implement once
the reference interpreter switched to an AST based compiler for 2.5.
If you look at the AST, you'll find that it still only has TryExcept
and TryFinally, so again, you still don't need except/finally on the
same block level - all the syntax allows you to do is omit the second
try: line and its associated indentation.
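
To make the equivalence concrete, here is a small sketch (function names made up for illustration). It only shows that the two spellings behave the same; it says nothing about how any particular interpreter version represents them internally.

```python
def unified(log):
    # except and finally on the same block level
    try:
        1 / 0
    except ZeroDivisionError:
        log.append("except")
    finally:
        log.append("finally")
    return log


def nested(log):
    # the older spelling: try/except wrapped in try/finally
    try:
        try:
            1 / 0
        except ZeroDivisionError:
            log.append("except")
    finally:
        log.append("finally")
    return log


same = unified([]) == nested([])
```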
Actually it is very useful to be able to distinguish between inside and
outside. This is obvious for real world things e.g. your TV. Nobody likes
to open the rear cover to switch the channel. Similar arguments apply to
software objects. "data hiding" is a harsh name, I would call it "telling
what matters". The need for this becomes indispensable in really big
software packages like the Eclipse framework with approx. 100000 classes.
If you cannot tell the difference between inside and outside you are lost.
In Python, the philosophy "we're all consenting adults here" applies.

Please don't sell a missing feature as a philosophy. Say you don't
need/want it. But don't call it philosophy.
Gosh, and here I thought treating programmers as non-idiots was
actually one of the guiding philosophies in the discussion on python-
dev. Good thing we have you here to tell us we're only imagining that.
You shouldn't pretend to know, at the time you write it, all the uses
to which your code will be put.

It's *your* *decision* which uses will be available. Your explanation
appears to me as a fear to decide.
Are you writing application code or library code? For application
code, you have a much greater idea of the uses for your code, so you
can be confident in your decision as to what should and should not be
visible. For library code, however, it's fairly common for a library
to provide something which is almost, but not quite, what the user
needs. Letting users poke around at their own risk is a nice courtesy
that can save them a lot of work in the long run.

So the decision to hide something is still made (by using an
underscore prefix), but an easy mechanism is provided for the library
user to override that decision.
If you want the users of your code to know that an attribute should
not be used as a public API for the code, use the convention of naming
the attribute with a single leading underscore.

Littering your class definition with dozens of underscores is exactly
the line noise we love to criticize in Perl.
Using underscores in names (leading or otherwise) separated by
plaintext keywords is a far cry from multiple different symbols that
mean different things in different contexts and can be chained
together fairly arbitrarily.
Python advertises himself as a full OOP language, but why does it
miss one of the basic principles of OOP?
Who taught you that enforced restrictions on attribute access was a
"basic principle" of OO?

Nearly every introduction to OOP? Please don't tell me that
encapsulation does not mean "enforced restriction". If the language has
no syntactic support for encapsulation then it does not have encapsulation.
Module globals aren't visible outside the module without importing it.
Class attributes aren't visible outside the class without dereferencing
it.
Instance attributes aren't visible outside an instance without
dereferencing one.

*That* is the encapsulation/data hiding which OOP requires, and is the
kind which Python enforces. What you're asking for is encapsulation of
class and instance attributes based on the context in which the
dereferencing occurs (inside the class, inside a subclass of that
class, inside an instance of that class, inside an instance of a
subclass of that class, somewhere else entirely), and that has nothing
to do with the basics of OOP.

On the other hand, if you're so keen on this feature, perhaps you'd
like to make a concrete proposal regarding how you would like the
semantics to work in light of Python's dynamic typing model. What will
it do when a method is invoked via the class dict rather than via
attribute retrieval? Can unbound methods access protected or private
attributes? How about descriptor get, set and delete methods? What
happens when a function is added to a class definition after creation
as a new method?
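
The existing double-underscore mechanism already shows how awkward the last question gets. A small sketch, with the hypothetical names `Account`, `peek` and `peek_mangled`: name mangling is applied when a class body is compiled, so a function written outside the class and attached afterwards is not treated as being "inside".

```python
class Account:
    def __init__(self):
        self.__balance = 100  # stored on the instance as _Account__balance


def peek(self):
    # Defined outside any class body, so __balance is NOT mangled here.
    return self.__balance


def peek_mangled(self):
    # Spelling out the mangled name by hand works from anywhere.
    return self._Account__balance


# Both functions become methods after the class has been created.
Account.peek = peek
Account.peek_mangled = peek_mangled

acct = Account()
try:
    acct.peek()
    late_method_sees_private = True
except AttributeError:
    late_method_sees_private = False

balance = acct.peek_mangled()
```

Any enforced "private" keyword would need an answer for this case: either the late-added method is locked out of the class's own data, or anyone can get in by attaching a function.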

Cheers,
Nick.
Jun 27 '08 #80
On May 26, 2:49 pm, "Russ P." <Russ.Paie...@gmail.com> wrote:
I am also bothered a bit by the seeming inconsistency of the rules for
the single underscore. When used at file scope, they make the variable
or function invisible outside the module, but when used at class
scope, the "underscored" variables or functions are still fully
visible. For those who claim that the client should be left to decide
what to use, why is the client prohibited from using underscored
variables at file scope?
They aren't - the only thing that won't see the underscore prefixed
names is "from x import *". If you do "import x" instead, all the
underscored names will be accessible as attributes of the module.
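
A quick way to check this without creating files, assuming Python 3. The module `demo_mod` is a throwaway built in memory purely for illustration, and it defines no `__all__`:

```python
import sys
import types

# Build a throwaway module in memory and register it so it can be imported.
demo = types.ModuleType("demo_mod")
exec("public = 1\n_private = 2", demo.__dict__)
sys.modules["demo_mod"] = demo

# "from demo_mod import *" skips the underscore-prefixed name...
namespace = {}
exec("from demo_mod import *", namespace)
star_names = sorted(n for n in namespace if n != "__builtins__")

# ...but a plain import exposes it as an ordinary module attribute.
import demo_mod
private_value = demo_mod._private
```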
Jun 27 '08 #81
NickC <nc******@gmail.com> writes:
if/else was added solely because people kept coming up with ways of
embedding a pseudo conditional inside expressions and writing buggy
code in the process. All it really saves you in practice is a bit of
vertical whitespace, so, no, you still don't need it - but if you
insist on doing it, at least there's now an easy way to do it
correctly.
Come on, it's more than vertical whitespace, it's extraneous variables
and sometimes even extraneous functions and function call overhead.
And Python is supposed to be unbureaucratic. People kept looking for
ways to write conditional expressions instead of spewing the logic
across multiple statements for a reason: the code is often cleaner
that way.
Jun 27 '08 #82
On Jun 4, 4:09 am, "Russ P." <Russ.Paie...@gmail.com> wrote:
What is it about leading underscores that bothers me? To me, they are
like a small pebble in your shoe while you are on a hike. Yes, you can
live with it, and it does no harm, but you still want to get rid of it.
With leading underscores, you can see *at the point of dereference*
that the code is accessing private data. With a "this is private"
keyword you have no idea whether you're accessing private or public
data, because the two namespaces get conflated together.

I'll keep my pebble, thanks.

Cheers,
Nick.
Jun 27 '08 #83
On May 25, 8:01 pm, Fuzzyman <fuzzy...@gmail.com> wrote:
Python was not really written with 'difficult' customers in mind ;-)

True. It's extremely suited to what we do though. Minor difficulties
like this are vastly outweighed by advantages. The difficulties are
real though.
It's interesting to take a look at some of the work Brett Cannon has
done trying to come up with a sandbox for executing Python code that
actually manages to block access to dangerous functions like file() or
urllib.urlopen(). Powerful introspection capabilities and restricted
access to methods and attributes don't really play well together.

http://svn.python.org/view/python/br...85&view=markup

(I believe that work is on hiatus while he's been busy with other
projects, such as a more flexible Python-based reimplementation of the
import mechanism that would make it possible to implement the
security restrictions needed to permit limited imports in a sandboxed
interpreter)
One could largely hide private vars with a program that substituted random
names for single _ names, and removed the doc strings for functions,
classes, and methods with such names.

We need to *use* those names to display the spreadsheet once the
calculation has finished (and their code has run).
Such a program could even put such names in a separate module imported as
'_private_do_not_use_'.

Splitting more of the functionality out is probably part of the best
solution.
Yeah, at this point your only hope is going to be making them go
through such wild contortions to get at the internal data they think
better of it. Actually blocking all access to something written in
Python is fairly tough (you generally need an extension class written
in non-Python code that hides access to certain attributes).

Cheers,
Nick.
Jun 27 '08 #84
On Jun 4, 4:41 pm, Antoon Pardon <apar...@forel.vub.ac.be> wrote:
Guido has been known to change his mind, which is an admirable quality,
but it does show that at some point he rejected a good idea or accepted
a bad idea.
And sometimes the person that talked him into accepting the bad idea
in the first place ends up agreeing with him when he eventually
rejects it ;)

Cheers,
Nick.

P.S. Read the list of references in PEP 343 if you want to know what
I'm talking about *cough*

Jun 27 '08 #85
Marc 'BlackJack' Rintsch <bj****@gmx.net> writes:
On Wed, 04 Jun 2008 15:55:38 +1000, Ben Finney wrote:
By definition, "private" functions are not part of the publicly
documented behaviour of the unit. Any behaviour exhibited by some
private component is seen externally as a behaviour of some public
component.

But only indirectly
No, that's the point: externally, such behaviour is exhibited by the
public API. Whatever internal moving parts actually lead to the
behaviour doesn't make any difference: it's the external behaviour
that's being discussed in the above.
and it's often harder to predict the corner cases that might trigger
bugs or to test error testing in dependent private functions.
Indeed. The person writing the unit tests should do so with full
knowledge of what the implementation looks like. This allows coverage
of those corner cases you rightly point out exist in many
implementations.

Those unit tests should *not*, though, exercise anything but the
public API, otherwise they're breaking encapsulation. Their assertion
should continue to be just as true after a refactoring of the internal
components as before.
Private functions offer an API that's public to someone, so they
ought to be documented and tested.
No, that's pretty much the point: private functions are intended for
use by nothing except other functions at the same scope. If they're to
be used in other contexts, they're public API, not private.

It's a feature of Python that such bad API design doesn't lead to
hideous workarounds: one can still easily get at the parts of an API
that the programmer mistakenly marked "private". That doesn't make it
any less a mistake to break encapsulation, but it does make it much
more easily fixed.

--
\ "During the Middle Ages, probably one of the biggest mistakes |
`\ was not putting on your armor because you were 'just going down |
_o__) to the corner.'" -- Jack Handey |
Ben Finney
Jun 27 '08 #86
On Jun 4, 9:24 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
NickC <ncogh...@gmail.com> writes:
if/else was added solely because people kept coming up with ways of
embedding a pseudo conditional inside expressions and writing buggy
code in the process. All it really saves you in practice is a bit of
vertical whitespace, so, no, you still don't need it - but if you
insist on doing it, at least there's now an easy way to do it
correctly.

Come on, it's more than vertical whitespace, it's extraneous variables
and sometimes even extraneous functions and function call overhead.
And Python is supposed to be unbureaucratic. People kept looking for
ways to write conditional expressions instead of spewing the logic
across multiple statements for a reason: the code is often cleaner
that way.
True, but it really was the multitude of buggy workarounds for the
lack of a ternary expression that sealed the deal, rather than the
benefits of ternary expressions in their own right :)

Given that I personally use ternary expressions solely as the right
hand side of an assignment statement, the reduction in vertical
whitespace usage really is the only thing they gain me. I guess if you
embedded them as an argument to a function call or other more
complicated expression then there may be additional savings. I prefer
not to do that though, since such things can get quite difficult to
parse mentally when reading them later.

Cheers,
Nick.
Jun 27 '08 #87
On Jun 4, 9:56 pm, Ben Finney <bignose+hates-s...@benfinney.id.au>
wrote:
Those unit tests should *not*, though, exercise anything but the
public API, otherwise they're breaking encapsulation. Their assertion
should continue to be just as true after a refactoring of the internal
components as before.
Python must have bad unit tests then - the CPython test suite
explicitly tests private methods all the time.

There's actually an extremely good reason for doing it that way: when
the implementation of an internal method gets broken, the unit tests
flag it explicitly, rather than having to derive the breakage from the
breakage of 'higher level' unit tests (after all, you wouldn't factor
something out into its own method or function if you weren't using it
in at least a couple of different places).

Black box testing (testing only the public API) is certainly
important, but grey box and white box testing that either exploits
knowledge of the implementation when crafting interesting test cases,
or explicitly tests internal APIs can be highly beneficial in
localising faults quickly when something does break (and as any
experienced maintenance programmer will tell you, figuring out what
you actually broke is usually harder than fixing it after you find it).
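
A minimal sketch of the distinction, assuming a made-up Tokenizer class with one public method and one private helper (nothing here is taken from the CPython test suite):

```python
import unittest

class Tokenizer:
    """Hypothetical class: one public method built on a private helper."""

    def tokens(self, text):
        return [self._normalise(word) for word in text.split()]

    def _normalise(self, word):
        # Internal helper: strip surrounding punctuation and lower-case.
        return word.strip(".,!?").lower()

class BlackBoxTest(unittest.TestCase):
    def test_tokens(self):
        # Exercises only the public API.
        self.assertEqual(Tokenizer().tokens("Hello, World!"), ["hello", "world"])

class WhiteBoxTest(unittest.TestCase):
    def test_normalise_strips_punctuation(self):
        # Calls the private helper directly, so a failure points straight at it.
        self.assertEqual(Tokenizer()._normalise("World!"), "world")
```

Saved in a module, these can be run with "python -m unittest". If _normalise breaks, both tests fail, but the white box one names the culprit.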
Jun 27 '08 #88
On 2008-06-04, NickC <nc******@gmail.com> wrote:
On May 26, 7:32 am, "Joe P. Cool" <joe.p.c...@googlemail.com> wrote:
I saw this "don't need it" pattern in discussions about the ternary
"if..else" expression and about "except/finally on the same block
level".
Now Python has both.

if/else was added solely because people kept coming up with ways of
embedding a pseudo conditional inside expressions and writing buggy
code in the process. All it really saves you in practice is a bit of
vertical whitespace, so, no, you still don't need it - but if you
insist on doing it, at least there's now an easy way to do it
correctly.
If I remember correctly it was added because one of the python
developers was bitten by a bug in the standard library code
that was caused by the use of the and-or emulation, mentioned
in the FAQ.
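
For reference, a minimal sketch of how the and-or emulation goes wrong. The function names are made up for illustration, and the actual standard library bug is not reproduced here:

```python
def pick_and_or(cond, a, b):
    # Pre-2.5 idiom: only correct when `a` is truthy.
    return cond and a or b

def pick_ternary(cond, a, b):
    # Conditional expression, available since Python 2.5.
    return a if cond else b

# With a falsy "true" branch the emulation silently picks the wrong value:
# (True and 0) is 0, which is falsy, so the `or` falls through to 99.
wrong = pick_and_or(True, 0, 99)
right = pick_ternary(True, 0, 99)
```

Wrapping the idiom in a function evaluates both branches eagerly, which the inline forms do not; the wrappers are only there to make the comparison easy to read.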

And although one indeed doesn't need this, there are a lot
of things in Python one doesn't need. Python could be limited
to single operator expressions. You don't need:

x = a * b + c

You can write it just like this:

x = a * b
x = x + c
And if you want a list comprehension like the following:

ls = [ x * x + 4 for x in xrange(10)]

You can of course write it as follows:

def sqrplus4(a):
    rs = a * a
    return rs + 4

ls = [sqrplus4(x) for x in xrange(10)]
Now of course no one would defend such a limitation on the grounds
that one doesn't need the general case and that the general case
will only save you some vertical space.

But when it came to the ternary operator, that was exactly the
argument used to defend the lack of it.
In Python, the philosophy "we're all consenting adults here" applies.

Please don't sell a missing feature as a philosophy. Say you don't
need/want it. But don't call it philosophy.

Gosh, and here I thought treating programmers as non-idiots was
actually one of the guiding philosophies in the discussion on python-
dev.
I have heard the argument: "Such a feature will be abused too easily"
and similar too many times to find this credible.

--
Antoon Pardon
Jun 27 '08 #89
On 2008-06-04, NickC <nc******@gmail.com> wrote:
On Jun 4, 4:09 am, "Russ P." <Russ.Paie...@gmail.com> wrote:
What is it about leading underscores that bothers me? To me, they are
like a small pebble in your shoe while you are on a hike. Yes, you can
live with it, and it does no harm, but you still want to get rid of it.

With leading underscores, you can see *at the point of dereference*
that the code is accessing private data.
But the leading underscore doesn't tell you whether it is your own
private data, which you can use as you see fit, or those of someone
else, which you have to be very careful with.

--
Antoon Pardon
Jun 27 '08 #90
But the leading underscore doesn't tell you whether it is your own
private data, which you can use as you see fit, or those of someone
else, which you have to be very careful with.

--
Antoon Pardon
Well, how is that different from public accessors and mutators of
private variables?
Jun 27 '08 #91
On 2008-06-04, co*********@gmail.com <co*********@gmail.com> wrote:
But the leading underscore doesn't tell you whether it is your own
private data, which you can use as you see fit, or those of someone
else, which you have to be very careful with.

--
Antoon Pardon

Well, how is that different from public accessors and mutators of
private variables?
Public accessors and mutators for private variables are a bad idea.
So I don't understand what point you are trying to make by suggesting
that the use of an underscore is just like it in this regard.

--
Antoon Pardon
Jun 27 '08 #92
In article <87************@benfinney.id.au>,
Ben Finney <bi****************@benfinney.id.au> wrote:
By definition, "private" functions are not part of the publicly
documented behaviour of the unit. Any behaviour exhibited by some
private component is seen externally as a behaviour of some public
component.
You know the difference between theory and reality? In theory, there is
none... Sometimes it's useful to test internal components. Imagine this
class:

import random

class ArmegeddonMachine:
    def pushTheButton(self):
        "Destroy a random city"
        city = self._pickCity()
        self._destroy(city)

    def _pickCity(self):
        cities = ['New York', 'Moscow', 'Tokyo', 'Beijing', 'Mumbai']
        thePoorSchmucks = random.choice(cities)
        return 'New York'   # ignores the random choice made on the line above

    def _destroy(self, city):
        missle = ICBM()     # ICBM is assumed to be defined elsewhere
        missle.aim(city)
        missle.launch()

The only externally visible interface is pushTheButton(), yet you don't
really want to call that during testing. What you do want to do is test
that a random city really does get picked.

You can do one of two things at this point. You can say, "But, that's not
part of the externally visible interface" and refuse to test it, or you can
figure out a way to test it. Up to you.
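
One way to figure it out, sketched under a few assumptions: the two picker classes and the picks_vary helper below are invented for this example, and the check is statistical rather than exact (Python 3 syntax):

```python
import random

CITIES = ['New York', 'Moscow', 'Tokyo', 'Beijing', 'Mumbai']

class BuggyPicker:
    def _pickCity(self):
        thePoorSchmucks = random.choice(CITIES)
        return 'New York'   # same bug as in the class above

class FixedPicker:
    def _pickCity(self):
        return random.choice(CITIES)

def picks_vary(picker, draws=200):
    """Return True if repeated calls yield more than one distinct city."""
    return len({picker._pickCity() for _ in range(draws)}) > 1
```

picks_vary(BuggyPicker()) is always False, since only one city ever comes back. For the fixed version, the chance of 200 identical draws from five cities is vanishingly small, so a True result is a reasonable sign that a choice is really being made. No missile is launched either way.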
Jun 27 '08 #93
On Jun 4, 4:29 am, NickC <ncogh...@gmail.com> wrote:
On Jun 4, 4:09 am, "Russ P." <Russ.Paie...@gmail.com> wrote:
What is it about leading underscores that bothers me? To me, they are
like a small pebble in your shoe while you are on a hike. Yes, you can
live with it, and it does no harm, but you still want to get rid of it.

With leading underscores, you can see *at the point of dereference*
that the code is accessing private data. With a "this is private"
keyword you have no idea whether you're accessing private or public
data, because the two namespaces get conflated together.
That is true. But with the "priv" keyword you'll discover quickly
enough that you are trying to access private data (as soon as you run
the program). And even if a "priv" keyword is added, you are still
free to use the leading underscore convention if you wish.

Discerning properties of an object from its name alone is not
something that is normally done in programming. Yes, of course you
should choose identifiers that describe what they represent in the
real world, but you don't use names like "intCount," "floatWeight," or
"MyClassMyObject," do you? Why not? That would tell you the type of the
object at the "point of dereferencing," wouldn't it?
Jun 27 '08 #94
On Jun 4, 2:58 pm, "Russ P." <Russ.Paie...@gmail.com> wrote:
On Jun 4, 4:29 am, NickC <ncogh...@gmail.com> wrote:
On Jun 4, 4:09 am, "Russ P." <Russ.Paie...@gmail.com> wrote:
What is it about leading underscores that bothers me? To me, they are
like a small pebble in your shoe while you are on a hike. Yes, you can
live with it, and it does no harm, but you still want to get rid of it.
With leading underscores, you can see *at the point of dereference*
that the code is accessing private data. With a "this is private"
keyword you have no idea whether you're accessing private or public
data, because the two namespaces get conflated together.

That is true. But with the "priv" keyword you'll discover quickly
enough that you are trying to access private data (as soon as you run
the program). And even if a "priv" keyword is added, you are still
free to use the leading underscore convention if you wish.

Discerning properties of an object from its name alone is not
something that is normally done in programming. Yes, of course you
should choose identifiers that describe what they represent in the
real world, but you don't use names like "intCount," "floatWeight," or
"MyClassMyObject," do you? Why not? That would tell you the type of the
object at the "point of dereferencing," wouldn't it?
Sounds familiar.
http://en.wikipedia.org/wiki/Hungarian_notation
Jun 27 '08 #95
Roy Smith <ro*@panix.com> writes:
The only externally visible interface is pushTheButton(), yet you
don't really want to call that during testing. What you do want to
do is test that a random city really does get picked.
Then what you're really testing is the interactions of the "push the
button" function with its external interface: you're asserting that
the "push the red button" function actually uses the result from "pick
a random city" as its target.

Thus, the "pick a random city" function is being defined by you as
*interface* for the "push the button" function. Interfaces do need to
be unit tested.

This is done by having the unit test substitute a test double for the
"pick a random city" function, rigging that double so that its
behaviour is deterministic, and asserting that the "push the button"
function uses that deterministically-generated result.

It's at this point, of course, that the "pick a random city" function
has come rather close to being public API. The designer needs to have
a fairly good reason not to simply expose the "pick a random city"
function in the API.
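
A minimal sketch of the test-double approach, using unittest.mock from the Python 3 standard library (which postdates this thread). The Machine class is a simplified stand-in, and 'Testville' is a made-up value:

```python
import random
from unittest import mock

class Machine:
    """Stand-in for the class under discussion; _destroy only records its target."""

    def __init__(self):
        self.destroyed = []

    def pushTheButton(self):
        city = self._pickCity()
        self._destroy(city)

    def _pickCity(self):
        return random.choice(['New York', 'Moscow', 'Tokyo', 'Beijing', 'Mumbai'])

    def _destroy(self, city):
        self.destroyed.append(city)   # harmless replacement for the launch

machine = Machine()

# Replace both collaborators with doubles: one rigged to give a fixed answer,
# one that merely records how it was called.
with mock.patch.object(machine, '_pickCity', return_value='Testville') as pick, \
     mock.patch.object(machine, '_destroy') as destroy:
    machine.pushTheButton()

# The public method must have used the rigged result as its target.
pick.assert_called_once_with()
destroy.assert_called_once_with('Testville')
```

The assertions say nothing about how a city is picked, only that pushTheButton() passes the picker's result on to the destroyer. That is the interaction being described, and it holds however the picker is later rewritten.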
You can do one of two things at this point. You can say, "But,
that's not part of the externally visible interface" and refuse to
test it, or you can figure out a way to test it. Up to you.
Note that the only thing I'm saying one shouldn't do is unit test the
private function *directly*, since the design decision has been made
that it's not part of the API. The *behaviour* of the function, as
exposed via the "push the button" public API, should certainly be unit
tested.

Any behaviour of that function that's *not* exhibited through the
behaviour of some public API should *not* be unit tested, and should
in fact be removed during refactoring -- which will not break the unit
test suite since no unit tests depend on it.

Alternatively, as above, the design decision can be made that, in
fact, this function *is* part of the public API since external things
are depending on it directly. Then it needs full direct unit test
coverage.

--
\ "I got contacts, but I only need them when I read, so I got |
`\ flip-ups." -- Steven Wright |
_o__) |
Ben Finney
Jun 27 '08 #96
Lie wrote:
On May 24, 9:14 pm, Fuzzyman <fuzzy...@gmail.com> wrote:
For example, at Resolver Systems we expose the spreadsheet object
model to our users. It has a public, documented, API - plus a host of
undocumented internally used methods.

We would really *much* rather hide these, because anything our
customers start using (whether documented or not) we will probably
have to continue supporting and maintaining.
Then don't document it, or keep internal documentation (which never
leaves the company) separate from public documentation (which your
users read). Apart from your dev team, and anyone told by your dev
team (in which case you may fire the person for "lack of discipline"),
nobody would know that such a thing exists, and in consequence nobody
would use it.

Don't tell your user not to use something, just don't tell them that
it exists and they won't use it.
I am not familiar with the actual software, but judging from "we expose the
spreadsheet object model to our users", I assume that users can discover the
undocumented attributes, using Python's introspection features, like dir(obj),
obj.__dict__, the inspect module, etc. So in this case, not telling them that
the attributes exist will not stop them from finding out.
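
A minimal sketch of that kind of discovery. The Sheet class is made up and has nothing to do with the actual Resolver object model:

```python
import inspect

class Sheet:
    """Hypothetical object handed to users."""

    def __init__(self):
        self.cells = {}
        self._dirty = False          # undocumented, internal

    def set_cell(self, ref, value):
        self.cells[ref] = value
        self._mark_dirty()

    def _mark_dirty(self):           # undocumented, internal
        self._dirty = True

sheet = Sheet()

# Single-underscore names found with dir(), skipping the double-underscore ones.
hidden = sorted(n for n in dir(sheet)
                if n.startswith('_') and not n.startswith('__'))

# Instance attributes, straight from the instance dictionary.
instance_attrs = sorted(vars(sheet))     # same keys as sheet.__dict__

# Bound methods, as reported by the inspect module.
methods = sorted(name for name, _ in inspect.getmembers(sheet, inspect.ismethod))
```

Here hidden comes out as ['_dirty', '_mark_dirty'], so anyone with an interactive prompt and a minute to spare finds the internals whether or not they are documented.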

--
Hans Nowak (zephyrfalcon at gmail dot com)
http://4.flowsnake.org/
Jun 27 '08 #97
On Wed, 04 Jun 2008 09:34:58 +0000, Antoon Pardon wrote:
On 2008-06-04, Marc 'BlackJack' Rintsch <bj****@gmx.net> wrote:
it makes sense to me to also test if they work as documented.

If they affect the behaviour of some public component, that's where
the documentation should be.

As I said they are public themselves for someone.

Isn't that contradictory: "public for someone"? I always
thought "public" meant accessible to virtually anyone,
not to only someone.
For the programmer who writes or uses the private API it isn't really
"private"; he must document it or know how it works. And he should IMHO
write tests for it and expect "private" functions written by others to be
tested.

Ciao,
Marc 'BlackJack' Rintsch
Jun 27 '08 #98
Ben Finney wrote:
Roy Smith <ro*@panix.com> writes:

The only externally visible interface is pushTheButton(), yet you
don't really want to call that during testing. What you do want to
do is test that a random city really does get picked.


Then what you're really testing is the interactions of the "push the
button" function with its external interface: you're asserting that
the "push the red button" function actually uses the result from "pick
a random city" as its target.

Thus, the "pick a random city" function is being defined by you as
*interface* for the "push the button" function. Interfaces do need to
be unit tested.

This is done by having the unit test substitute a test double for the
"pick a random city" function, rigging that double so that its
behaviour is deterministic, and asserting that the "push the button"
function uses that deterministically-generated result.

It's at this point, of course, that the "pick a random city" function
has come rather close to being public API. The designer needs to have
a fairly good reason not to simply expose the "pick a random city"
function in the API.

You can do one of two things at this point. You can say, "But,
that's not part of the externally visible interface" and refuse to
test it, or you can figure out a way to test it. Up to you.


Note that the only thing I'm saying one shouldn't do is unit test the
private function *directly*, since the design decision has been made
that it's not part of the API. The *behaviour* of the function, as
exposed via the "push the button" public API, should certainly be unit
tested.

Any behaviour of that function that's *not* exhibited through the
behaviour of some public API should *not* be unit tested, and should
in fact be removed during refactoring -- which will not break the unit
test suite since no unit tests depend on it.

Alternatively, as above, the design decision can be made that, in
fact, this function *is* part of the public API since external things
are depending on it directly. Then it needs full direct unit test
coverage.
I must be missing something in this discussion. Perhaps it's the
appropriate point of view. At any rate, it seems to me that any and
every function should be tested to ensure proper results. It's my
understanding that unit testing (a.k.a. PyUnit) is designed for just
such a purpose.

So is this argument simply over *who* should be (unit) testing the
internals, i.e. the fellow that wrote the code library vs. the other
fellow that wants to use the library? Or is it actually, as it seems,
over the internals being tested at all?
--
Ethan
Jun 27 '08 #99
Ethan Furman <et***@stoneleaf.us> writes:
I must be missing something in this discussion. Perhaps it's the
appropriate point of view. At any rate, it seems to me that any and
every function should be tested to ensure proper results.
I restrict that to "every proper behaviour the system is expected to
provide should be tested".

The corollary is that every behaviour is either:

* part of an expected external behaviour, and thus unit tests need
to assert that behaviour through the unit's public interface

* not part of an expected external behaviour, and thus needs to be
removed from the system

This also forces a decision about "private" functionality: Either it's
part of some public functionality, and thus needs to be tested via
that public functionality; or it's not part of any public
functionality, and needs to be removed.
It's my understanding that unit testing (a.k.a. PyUnit) is designed
for just such a purpose.
Yes.

--
\ "I was in the first submarine. Instead of a periscope, they had |
`\ a kaleidoscope. 'We're surrounded.'" -- Steven Wright |
_o__) |
Ben Finney
Jun 27 '08 #100
