
Why less emphasis on private data?

Coming from a C++ / C# background, the lack of emphasis on private data
seems weird to me. I've often found wrapping private data useful to
prevent bugs and enforce error checking.

It appears to me (perhaps wrongly) that Python prefers to leave class
data public. What is the logic behind that choice?

Thanks for any insight.

Jan 7 '07
63 Replies


On 2007-01-08, Jussi Salmela <ti*********@hotmail.com> wrote:
Neil Cerutti wrote:
>In C one uses the pointer to opaque struct idiom to hide data.
For example, the standard FILE pointer.

To Neil Cerutti: If a programmer in C has got a pointer to some
piece of memory, that piece is at the mercy of the programmer.
There's no data hiding at all in this case.
That's somewhat disingenuous. You get just as much data hiding
with an opaque data type in C as you get in C++ or Java.

--
Neil Cerutti
Potluck supper: prayer and medication to follow. --Church Bulletin Blooper
Jan 8 '07 #51

Wow, I got a lot more feedback than I expected!

I can see both sides of the argument, both on technical merits, and
more philosophical merits. When I first learned C++ I felt
setters/getters were a waste of my time and extra code. When I moved
to C# I still felt that, and with their 'Property' syntax I perhaps
felt it more. What changed my mind is when I started placing logic in
them to check for values and throw exceptions or (hopefully) correct
the data. That's probably reason one why I find it weird in Python.

Reason two is, as the user of a class or API I *don't care* what is
going on inside. All I want visible is the data that I can change. The
'_' convention is nice. I do see that. I guess my old OOP classes are
hard to forget about. I feel that the state of an object should be
"stable" and "valid" at all times, and if it's going into an unstable
state - error then, not later. That's why I like being able to protect
parts of an instance's state. If, as a provider of a calculation engine,
I let the user change the internal state of the engine, I have no
assurances that my product (the engine) is doing its job...

<shrugs>

I appreciate all the feedback and enjoyed reading the discussion. It
helps me understand why the Python community has chosen the road they have.
- Thanks.

Jan 8 '07 #52

ti********@gmail.com wrote:
Wow, I got a lot more feedback than I expected!

I can see both sides of the argument, both on technical merits, and
more philosophical merits. When I first learned C++ I felt
setters/getters were a waste of my time and extra code. When I moved
to C# I still felt that, and with their 'Property' syntax I perhaps
felt it more. What changed my mind is when I started placing logic in
them to check for values and throw exceptions or (hopefully) correct
the data. That's probably reason one why I find it weird in Python.
Python does have properties too. The point is that you can as well start
with a plain attribute, then turn it into a computed one when (and if)
needed.
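A minimal sketch of that point (the Product class is invented for illustration, not from the thread): client code writes `p.price` either way, so nothing breaks when a plain attribute later grows validation.

```python
# Version 1: plain public attribute -- clients just write p.price
class Product:
    def __init__(self, price):
        self.price = price

# Version 2: same client code, but assignment is now validated.
# Only the class changes; callers still write p.price = value.
class Product:
    def __init__(self, price):
        self.price = price          # goes through the setter below

    @property
    def price(self):
        return self._price

    @price.setter
    def price(self, value):
        if value < 0:
            raise ValueError("price must be non-negative")
        self._price = value

p = Product(10)
p.price = 20                        # validated assignment, same syntax as before
```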
Reason two is, as the user of a class or API I *don't care* what is
going on inside.
Very true... until you have a legitimate reason to mess with
implementation because of a use case the author did not expect.
All I want visible is the data that I can change. The
'_' convention is nice.. I do see that. I guess my old OOP classes are
hard to forget about.
Access restriction is not a mandatory part of OO. Of course, objects are
supposed to be treated as "black boxes", but that's also true of a
binary executable, and nothing (technically) prevents you from opening it
with a hex editor and hacking it as you see fit... But then, you would not
complain about strange bugs, would you ?-)
I feel that the state of an object should be
"stable" and "valid" at all times,
That's fine. Just remember that, in Python, methods are attributes too,
and can be dynamically modified too. So when thinking about "object
state", don't assume it only implies "data" attributes. Heck, you can
even dynamically change the *class* of a Python object...
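That last remark can be demonstrated in a couple of lines (a toy sketch; the class names are invented):

```python
class Draft:
    def status(self):
        return "draft"

class Final:
    def status(self):
        return "final"

doc = Draft()
doc.__class__ = Final    # the class of an existing instance can be reassigned
print(doc.status())      # prints "final": method lookup now goes through Final
```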
and if it's going into an unstable
state - error then, not later. That's why I like being able to protect
parts of an instance's state. If, as a provider of a calculation engine,
I let the user change the internal state of the engine, I have no
assurances that my product (the engine) is doing its job...
If you follow the convention, you are not responsible for what happens
to people messing with the implementation. Period. Just like you're not
responsible for what happens if someone hacks your binary executable with
a hex editor.

Welcome to Python, anyway.
Jan 8 '07 #53

On Mon, 08 Jan 2007 13:11:14 +0200, Hendrik van Rooyen wrote:
When you hear a programmer use the word "probability" -
then it's time to fire him, as in programming even the lowest
probability is a certainty when you are doing millions of
things a second.
That is total and utter nonsense and displays the most appalling
misunderstanding of probability, not to mention a shocking lack of common
sense.
--
Steven.

Jan 8 '07 #54


Jussi Salmela wrote:
To surlamolden: I don't know how you define private, but if one defines
in C an external static variable i.e. a variable outside any functions,
on the file level, the scope of the variable is that file only.
Sure, in C you can hide instances inside an object image by declaring
them static. But the real virtue of static declarations is to assist
the compiler.

My definition of 'private' for this thread is the private attribute
provided by C++, Java and C#. When I program C I use another idiom,

/* THIS IS MINE, KEEP YOUR PAWS OFF */

and it works just as well. The same idiom works for Python as well.
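In Python the conventional spelling of that comment is a single leading underscore (a sketch; the Engine class is hypothetical):

```python
class Engine:
    def __init__(self):
        self._cache = {}     # leading underscore: "THIS IS MINE, KEEP YOUR PAWS OFF"
        self.result = None   # public: part of the supported interface

e = Engine()
e._cache["k"] = 1            # nothing stops you -- the underscore is a social
                             # contract, not a lock
```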


To hg: One does not need the static keyword in C to make a variable
defined inside a function (a so-called 'automatic variable') private
to that function. Automatic variables are private to their function by
definition. The static keyword makes the variable permanent, i.e. it
keeps its value between calls, but it is of course private also.

To Neil Cerutti: If a programmer in C has got a pointer to some piece of
memory, that piece is at the mercy of the programmer. There's no data
hiding at all in this case.

To whom it may concern: please stop comparing C and Python with regard
to privacy and safety. They are two different worlds altogether. Believe
me: I've been in this world for 2.5 years now after spending 19 years in
the C world.

Cheers,
Jussi
Jan 8 '07 #55

Steven D'Aprano wrote:
That is total and utter nonsense and displays the most appalling
misunderstanding of probability, not to mention a shocking lack of common
sense.
While I agree that the programming job itself is not
a program, and hence "consider any possibility"
simply doesn't make sense, I can find a bit of
truth in the general idea that *in programs* it is
dangerous to be deceived by probability.

When talking about correctness (which should be the
main concern), for a programmer "almost never" means
"yes" and "almost always" means "no" (probability
does, of course, matter for things like efficiency).

Like I said, however, this reasoning doesn't work
well applied to the programming process itself
(which is not a program... as programmers are not
CPUs, no matter what bigots of software engineering
approaches are hoping for).
Private variables are about the programming process,
not the program itself; and in my experience the
added value of C++ private machinery is very low
(and the added cost not invisible).
When working in C++ I like much more using
all-public abstract interfaces and module-level
all-public concrete class definitions (the so
called "compiler firewall" idiom).

Another thing along the same "line of thought" as
private members (which should "help programmers"),
but for which I never saw *anything but costs*,
is the broken idea of "const correctness" in C++.
Unfortunately that is not something that can be
avoided completely in C++, as it roots in the core
of the language.

Andrea
Jan 9 '07 #56

"Steven D'Aprano" <st***@REMOVE.THIS.cybersource.com.au> wrote:

On Mon, 08 Jan 2007 13:11:14 +0200, Hendrik van Rooyen wrote:
When you hear a programmer use the word "probability" -
then it's time to fire him, as in programming even the lowest
probability is a certainty when you are doing millions of
things a second.

That is total and utter nonsense and displays the most appalling
misunderstanding of probability, not to mention a shocking lack of common
sense.
Really?

Strong words.

If you don't understand you need merely ask, so let me elucidate:

If there is some small chance of something occurring at run time that can
cause code to fail - a "low probability" in all the accepted senses of the
word - and a programmer declaims - "There is such a low probability of
that occurring and it's so difficult to cater for that I won't bother"
- then am I supposed to congratulate him on his wisdom and outstanding
common sense?

Hardly. - If anything can go wrong, it will. - to paraphrase Murphy's law.

To illustrate:
If there is one place in any piece of code that is critical and not protected,
even if it's in a relatively rarely called routine, then because of the high
speed of operations, and the fact that time is essentially infinite, it WILL
fail, sooner or later, no matter how minuscule the apparent probability
of it occurring on any one iteration is.

How is this a misunderstanding of probability? - probability applies to
any one trial, so in a series of trials, when the number of trials is
large enough - in the order of the inverse of the probability - then
one's expectation must be that the rare occurrence should occur...

There is a very low probability that any one gas molecule will collide with any
other one in a container - and "Surprise! Surprise! " there is nevertheless
something like the mean free path...

That kind of covers the math, albeit in a non algebraic way, so as not to
confuse what Newton used to call "Little Smatterers"...

Now how does all this show a shocking lack of common sense?

- Hendrik
Jan 9 '07 #57


ti********@gmail.com wrote:
I let the user change the internal state of the engine, I have no
assurances that my product (the engine) is doing its job...
How would you proceed to protect these inner states? In C++, private
members can be accessed through a cast to a void pointer. In Java it
can be done through introspection. In C# it can be done through
introspection, or through pointer casts in an 'unsafe' block. There is
no way you can protect the inner state of an object; it is just an
illusion you may have.

Python has properties as well. Properties have nothing to do with
hiding attributes.
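Python is no different on this score: even the double-underscore form is only name mangling, and the mangled name stays plainly reachable (a small sketch, not code from the thread; the class is made up):

```python
class Engine:
    def __init__(self):
        self.__state = "idle"    # stored as _Engine__state via name mangling

e = Engine()
# e.__state would raise AttributeError here -- outside the class body,
# the short name is not mangled and does not resolve...
print(e._Engine__state)          # ...but the mangled name works: prints "idle"
```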

Jan 9 '07 #58

On Tue, 09 Jan 2007 10:27:56 +0200, Hendrik van Rooyen wrote:
"Steven D'Aprano" <st***@REMOVE.THIS.cybersource.com.au> wrote:

>On Mon, 08 Jan 2007 13:11:14 +0200, Hendrik van Rooyen wrote:
When you hear a programmer use the word "probability" -
then it's time to fire him, as in programming even the lowest
probability is a certainty when you are doing millions of
things a second.

That is total and utter nonsense and displays the most appalling
misunderstanding of probability, not to mention a shocking lack of common
sense.

Really?

Strong words.

If you don't understand you need merely ask, so let me elucidate:

If there is some small chance of something occurring at run time that can
cause code to fail - a "low probability" in all the accepted senses of the
word - and a programmer declaims - "There is such a low probability of
that occurring and it's so difficult to cater for that I won't bother"
- then am I supposed to congratulate him on his wisdom and outstanding
common sense?

Hardly. - If anything can go wrong, it will. - to paraphrase Murphy's law.

To illustrate:
If there is one place in any piece of code that is critical and not protected,
even if it's in a relatively rarely called routine, then because of the high
speed of operations, and the fact that time is essentially infinite,
Time is essentially infinite? Do you really expect your code will still be
in use fifty years from now, let alone a billion years?

I know flowcharts have fallen out of favour in IT, and rightly so -- they
don't model modern programming techniques very well, simply because modern
programming techniques would lead to a chart far too big to be practical.
But for the sake of the exercise, imagine a simplified flowchart of some
program, one with a mere five components, such that one could take any of
the following paths through the program:

START -> A -> B -> C -> D -> E
START -> A -> C -> B -> D -> E
START -> A -> C -> D -> B -> E
...
START -> E -> D -> C -> B -> A

There are 5! (five factorial) = 120 possible paths through the program.

Now imagine one where there are just fifty components, still quite a
small program, giving 50! = 3e64 possible paths. Now suppose that there is
a bug that results from following just one of those paths. That would
match your description of "lowest probability" -- any lower and it would
be zero.

If all of the paths are equally likely to be taken, and the program takes
a billion different paths each millisecond, on average it would take about
1.5e55 milliseconds to hit the bug -- or about 5e44 YEARS of continual
usage. If every person on Earth did nothing but run this program 24/7, it
would still take on average almost sixty million billion billion billion
years to discover the bug.

But of course in reality some paths are more likely than others. If the
bug happens to exist in a path that is executed often, or if it exists
in many paths, then the bug will be found quickly. On the other hand, if
the bug is in a path that is rarely executed, your buggy program may be
more reliable than the hardware you run it on. (Cynics may say that isn't
hard.)

You're the project manager for the development team. Your lead developer tells
you that he knows this bug exists (never mind how, he's very clever) and
that the probability of reaching that bug in use is about 3e-64.

If it were easy to fix, the developer wouldn't even have mentioned it.
This is a really hard bug to fix, it's going to require some major
changes to the program, maybe even a complete re-think of the program.
Removing this bug could even introduce dozens, hundreds of new bugs.

So okay Mister Project Manager. What do you do? Do you sack the developer,
like you said? How many dozens or hundreds of man-hours are you prepared
to put into this? If the money is coming out of your pocket, how much are
you willing to spend to fix this bug?
[snip]
How is this a misunderstanding of probability? - probability applies to
any one trial, so in a series of trials, when the number of trials is
large enough - in the order of the inverse of the probability - then
one's expectation must be that the rare occurrence should occur...
"Even the lowest probability is a certainty" is mathematically nonsense:
it just isn't true -- no matter how many iterations, the probability is
always a little less than one. And you paper over a hole in your argument
with "when the number of trials is large enough" -- if the probability is
small enough, "large enough" could be unimaginably huge indeed.

Or, to put it another way, while anything with a non-zero probability
_might_ happen (you might drop a can of soft drink on your computer,
shorting it out and _just by chance_ causing it to fire off a perfectly
formatted email containing a poem about penguins) we are justified in
writing off small enough probabilities as negligible. It's not that they
can't happen, but the chances of doing so are so small that we can rightly
expect to never see them happen.

You might like to read up on Borel's "Law" (not really a law at all,
really just a heuristic for judging when probabilities are negligible).
Avoid the nonsense written about Borel and his guideline by Young Earth
Creationists, they have given him an undeserved bad name.

http://www.talkorigins.org/faqs/abioprob/borelfaq.html

There is a very low probability that any one gas molecule will collide
with any other one in a container
Not so. There is a very low probability that one gas molecule will collide
with a _specific_ other molecule -- but the probability of colliding with
_any_ other molecule is very high.

- and "Surprise! Surprise! " there
is nevertheless something like the mean free path...
Yes. And that mean free path increases without limit as the volume of the
gas increases. Take your molecule into the space between stars, and the
mean free path might be dozens of lightyears -- even though there is
actually more gas in total than in the entire Earth.

Now how does all this show a shocking lack of common sense?
You pay no attention to the economics of programming. Programming doesn't
come for free. It is always a trade-off for the best result with the least
effort. Any time people start making absolute claims about fixing every
possible bug, no matter how obscure or unlikely or how much work it will
take, I know that they aren't paying for the work to be done.

--
Steven.

Jan 9 '07 #59

"Steven D'Aprano" <st***@REMOVE.THIS.cybersource.com.au> wrote:
On Tue, 09 Jan 2007 10:27:56 +0200, Hendrik van Rooyen wrote:
"Steven D'Aprano" <st***@REMOVE.THIS.cybersource.com.au> wrote:

On Mon, 08 Jan 2007 13:11:14 +0200, Hendrik van Rooyen wrote:

When you hear a programmer use the word "probability" -
then it's time to fire him, as in programming even the lowest
probability is a certainty when you are doing millions of
things a second.

That is total and utter nonsense and displays the most appalling
misunderstanding of probability, not to mention a shocking lack of common
sense.
Really?

Strong words.

If you don't understand you need merely ask, so let me elucidate:

If there is some small chance of something occurring at run time that can
cause code to fail - a "low probability" in all the accepted senses of the
word - and a programmer declaims - "There is such a low probability of
that occurring and it's so difficult to cater for that I won't bother"
- then am I supposed to congratulate him on his wisdom and outstanding
common sense?

Hardly. - If anything can go wrong, it will. - to paraphrase Murphy's law.

To illustrate:
If there is one place in any piece of code that is critical and not
protected,
even if it's in a relatively rarely called routine, then because of the high
speed of operations, and the fact that time is essentially infinite,

Time is essentially infinite? Do you really expect your code will still be
in use fifty years from now, let alone a billion years?
My code does not suffer from bit rot, so it should outlast the hardware...

But seriously - for the sort of mistakes we make as programmers - it does
not actually need infinite time for the lightning to strike - most things that
will actually run overnight are probably stable - and if it takes say a week
of running for the bug to raise its head - it is normally a very difficult
problem to find and fix. A case in point - One of my first postings to
this newsgroup concerned an intermittent failure on a serial port - It was
never resolved in a satisfactory manner - eventually I followed my gut
feel, made some changes, and it seems to have gone away - but I expect
it to bite me anytime - I don't actually *know* that it's fixed, and there is
not, as a corollary to your sum below here, any real way to know for
certain.
>
I know flowcharts have fallen out of favour in IT, and rightly so -- they
don't model modern programming techniques very well, simply because modern
programming techniques would lead to a chart far too big to be practical.
I actually like drawing data flow diagrams, even if they are sketchy, primitive
ones, to try to model the inter process communications (where a "process"
may be just a python thread) - I find it useful to keep an overall perspective.
But for the sake of the exercise, imagine a simplified flowchart of some
program, one with a mere five components, such that one could take any of
the following paths through the program:

START -> A -> B -> C -> D -> E
START -> A -> C -> B -> D -> E
START -> A -> C -> D -> B -> E
...
START -> E -> D -> C -> B -> A

There are 5! (five factorial) = 120 possible paths through the program.

Now imagine one where there are just fifty components, still quite a
small program, giving 50! = 3e64 possible paths. Now suppose that there is
a bug that results from following just one of those paths. That would
match your description of "lowest probability" -- any lower and it would
be zero.

If all of the paths are equally likely to be taken, and the program takes
a billion different paths each millisecond, on average it would take about
1.5e55 milliseconds to hit the bug -- or about 5e44 YEARS of continual
usage. If every person on Earth did nothing but run this program 24/7, it
would still take on average almost sixty million billion billion billion
years to discover the bug.
In something with just 50 components it is, I believe, better to try to
inspect the quality in than to hope that random testing will show up
errors - but I suppose this is all about design, and about avoiding
doing known no-nos.
>
But of course in reality some paths are more likely than others. If the
bug happens to exist in a path that is executed often, or if it exists
in many paths, then the bug will be found quickly. On the other hand, if
the bug is in a path that is rarely executed, your buggy program may be
more reliable than the hardware you run it on. (Cynics may say that isn't
hard.)
Oh, I am of the opposite conviction - like the fellow of the Circuit Cellar,
I forget his name ( Steve Circia (?) ), who said: "My favourite Programming
Language is Solder"... I find that when I start blaming the hardware
for something that is going wrong, I am seldom right...

And this is true also for hardware that we make ourselves, that one would
expect to be buggy, because it is new and untested. It is almost as if the
tools used in hardware design are somehow less buggy than a programmer's
fumbling attempts at producing something logical.
>
You're project manager for the development team. Your lead developer tells
you that he knows this bug exists (never mind how, he's very clever) and
that the probability of reaching that bug in use is about 3e-64.
This is too convenient - This lead developer is about as likely as
my infinite time...
>
If it were easy to fix, the developer wouldn't even have mentioned it.
This is a really hard bug to fix, it's going to require some major
changes to the program, maybe even a complete re-think of the program.
Removing this bug could even introduce dozens, hundreds of new bugs.

So okay Mister Project Manager. What do you do? Do you sack the developer,
like you said? How many dozens or hundreds of man-hours are you prepared
to put into this? If the money is coming out of your pocket, how much are
you willing to spend to fix this bug?
Do a design review, Put in a man with some experience,
and hope for the best - in reality what else can you do, short
of trying to do it all yourself?
>
[snip]
How is this a misunderstanding of probability? - probability applies to
any one trial, so in a series of trials, when the number of trials is
large enough - in the order of the inverse of the probability - then
one's expectation must be that the rare occurrence should occur...

"Even the lowest probability is a certainty" is mathematically nonsense:
it just isn't true -- no matter how many iterations, the probability is
always a little less than one. And you paper over a hole in your argument
with "when the number of trials is large enough" -- if the probability is
small enough, "large enough" could be unimaginably huge indeed.
*grin* sure - this is not the maths tripos...

But I am willing to lay a bet, that over an evening's play at roulette, the
red will come up at least once. I would expect to win too.
>
Or, to put it another way, while anything with a non-zero probability
_might_ happen (you might drop a can of soft drink on your computer,
shorting it out and _just by chance_ causing it to fire off a perfectly
formatted email containing a poem about penguins) we are justified in
writing off small enough probabilities as negligible. It's not that they
can't happen, but the chances of doing so are so small that we can rightly
expect to never see them happen.
I promise I won't hold my breath...

<joke>
Man inspecting the work of a bunch of monkeys with Keyboards:

"Hey Harry - I think we might have something here - check this:

To be, or not to be, that is the iuuiihiuweriopuqewt"

<end joke>
>
You might like to read up on Borel's "Law" (not really a law at all,
really just a heuristic for judging when probabilities are negligible).
Avoid the nonsense written about Borel and his guideline by Young Earth
Creationists, they have given him an undeserved bad name.

http://www.talkorigins.org/faqs/abioprob/borelfaq.html
ok will have a look later

8<--------------
Now how does all this show a shocking lack of common sense?

You pay no attention to the economics of programming. Programming doesn't
come for free. It is always a trade-off for the best result with the least
effort. Any time people start making absolute claims about fixing every
possible bug, no matter how obscure or unlikely or how much work it will
take, I know that they aren't paying for the work to be done.
Too much assumption from too little data. Have actually been the part owner
of a small company for the last two decades or so - I am paying, all right,
I am paying, and paying...

Which maybe is why I want perfection...

- Hendrik
Jan 10 '07 #60

At Wednesday 10/1/2007 04:33, Hendrik van Rooyen wrote:
>Oh I am of the opposite conviction - Like the fellow of the Circuit Cellar
I forget his name ( Steve Circia (?) ) who said: "My favourite Programming
Language is Solder"..
Almost right: Steve Ciarcia.
--
Gabriel Genellina
Softlab SRL



Jan 10 '07 #61

Paul Boddie wrote:
Paul Rubin wrote:
>Right, the problem is if those methods start changing the "private"
variable. I should have been more explicit about that.

class A:
    def __init__(self):
        self.__x = 3
    def foo(self):
        return self.__x

class B(A): pass

class A(B):
    def bar(self):
        self.__x = 5 # clobbers private variable of earlier class named A

Has this ever been reported as a bug in Python? I could imagine more
sophisticated "name mangling": something to do with the identity of the
class might be sufficient, although that would make the tolerated
"subversive" access to private attributes rather difficult.

Paul
It would also force the mangling to take place at run-time, which would
probably affect efficiency pretty adversely (thinks: should really
check that mangling is a static mechanism before posting this).

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Blog of Note: http://holdenweb.blogspot.com
See you at PyCon? http://us.pycon.org/TX2007

Feb 5 '07 #62

On Jan 7, 1:07 am, "time.sw...@gmail.com" <time.sw...@gmail.com>
wrote:
Coming from a C++ / C# background, the lack of emphasis on private data
seems weird to me. I've often found wrapping private data useful to
prevent bugs and enforce error checking..

It appears to me (perhaps wrongly) that Python prefers to leave class
data public. What is the logic behind that choice?
Often it's a question of efficiency. Function calls in Python are
bloody slow. There is no "inline" directive, since it's interpreted,
not compiled. E.g. consider code like this:

class MyWhatever:
    ...
    def getSomeAttr(self):
        return self._someAttr
    def getSomeOtherAttr(self):
        return self._someOtherAttr

[x.getSomeAttr() for x in listOfMyWhatevers
 if x.getSomeOtherAttr() == 'whatever']

You'd get it running hundreds of times faster doing it the "wrong" way:

[x._someAttr for x in listOfMyWhatevers
 if x._someOtherAttr == 'whatever']
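The difference is real, though the exact ratio depends on interpreter and workload; a self-contained sketch of the comparison (the class body and data are made up to match the post):

```python
import timeit

class MyWhatever:
    def __init__(self, a, b):
        self._someAttr = a
        self._someOtherAttr = b
    def getSomeAttr(self):
        return self._someAttr
    def getSomeOtherAttr(self):
        return self._someOtherAttr

# hypothetical sample data: half the objects match the filter
listOfMyWhatevers = [MyWhatever(i, 'whatever' if i % 2 else 'other')
                     for i in range(1000)]

via_getters = [x.getSomeAttr() for x in listOfMyWhatevers
               if x.getSomeOtherAttr() == 'whatever']
direct = [x._someAttr for x in listOfMyWhatevers
          if x._someOtherAttr == 'whatever']
assert via_getters == direct   # identical results either way

t_getters = timeit.timeit(
    "[x.getSomeAttr() for x in L if x.getSomeOtherAttr() == 'whatever']",
    globals={'L': listOfMyWhatevers}, number=200)
t_direct = timeit.timeit(
    "[x._someAttr for x in L if x._someOtherAttr == 'whatever']",
    globals={'L': listOfMyWhatevers}, number=200)
print(t_getters, t_direct)     # direct attribute access is the faster of the two
```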

Feb 5 '07 #63

Steve Holden <st***@holdenweb.com> writes:
class A(B):
    def bar(self):
        self.__x = 5 # clobbers private variable of earlier class named A
Has this ever been reported as a bug in Python? I could imagine more
sophisticated "name mangling": something to do with the identity of the
class might be sufficient, although that would make the tolerated
"subversive" access to private attributes rather difficult.
It would also force the mangling to take place at run-time, which
would probably affect efficiency pretty adversely (thinks: should
really check that mangling is a static mechanism before posting this).
I think it could still be done statically. For example, the mangling
could include a random number created at compile time when the class
definition is compiled, that would also get stored in the class object.

I guess there are other ways to create classes than class statements
and those would have to be addressed too.
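Steve's hunch is easy to check: in CPython the mangling is a purely compile-time, textual rewrite based on the enclosing class's name (a quick probe):

```python
class A:
    def __init__(self):
        self.__x = 3

a = A()
print(a.__dict__)   # {'_A__x': 3}: the name was rewritten when the class
                    # body was compiled, using the literal class name 'A'
# Consequently a later, unrelated class that also happens to be named A
# mangles to the very same '_A__x' -- which is exactly the clobbering
# shown in the class A(B) example quoted above.
```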
Feb 5 '07 #64
