Encapsulation and Operator[]

Roger Lakner

I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
const Foo& operator[] (unsigned index) const {return
array[index];};
Foo& operator[] (unsigned index) {return
array[index];};
private:
Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation. Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?
Please pitch responses to someone whose level of knowledge is about
one year of C++ experience.

Thank you,

Roger

Mar 18 '06

Subscribe Reply

3363

« First
<
3
4
5

Daniel T.

In article <js************ *************** *****@4ax.com>,
Bob Hairgrove <in*****@bigfoo t.com> wrote:

On Tue, 21 Mar 2006 14:00:29 GMT, "Daniel T."
<po********@ear thlink.net> wrote:
As Bjarne Stroustrup says, C++ wasn't designed to be non-hackable.
Things like encapsulation and public/private access were meant to
avoid errors, not to prevent malicious coding. So if what you mean by
adherence to UAP is to avoid the latter, C++ isn't going to do it. And
if it is the former, then we all have to play by the rules -- which
includes reading the documentation.
Not at all. Nothing can prevent a programmer from hacking into the
memory footprint of an object and access anything he wants. I am talking
about the legions of programmers who provide reference returns and
*think* they are properly encapsulating their data.

So we agree that encapsulation cannot be achieved 100% purely by using
the language features in C++, but only through proper design?

I don't agree with the above. No design can achieve encapsulation 100%
because the language features in C++ allow intentional break-ins.

A prime example:

class Foo {
int bar;
public:
Foo(): bar( 0 ) { }
int getBar() const { return bar; }
};

int main()
{
Foo f;
std::cout << f.getBar() << '\n';
int* b = reinterpret_cas t<int*>( &f );
*b = 5;
std::cout << f.getBar() << '\n';
}

The above is what Stroustrup &al. was talking about.

If you
agree with that, I still do not understand why you categorically
stated earlier that returning a reference breaks encapsulation?
Because a reference return breaks the UAP. The client knows that the
value returned is stored in RAM somewhere, and not computed on the fly.

As Meyers himself wrote: "Unfortunat ely, the presence of [a member
function returning a non-const reference] defeats the purpose of making
[a private member-variable] private." (Meyers, "Effective C++" item 30)

No. This is more like what he says:

"Unfortunat ely, the presence of [a member function returning a
non-const reference TO A PRIVATE MEMBER OF THE CLASS WHICH CAN CHANGE
THE CLASS STATE] defeats the purpose of making [THAT private
member-variable] private." (Meyers, "Effective C++" item 30)

The fact that the return value is a reference is irrelevant by itself.
It all depends on what the reference refers to. As I pointed out
before, it could be a reference to some dummy variable which is kept
solely for the purpose of satisfying clients who need some kind of
non-const lvalue to write to. What actually is written (or not, as the
case may be) is solely under control of the class containing the
member and enforced through the implementation of the function
returning the reference.

class Foo {
public:
Foo();
int& getBar();
};

int main()
{
Foo f;

assert( f.getBar() == 0 );
int& b = f.getBar();
b = 5;
assert( f.getBar() == 5 );
}

If the above doesn't work (ie if the last assert fires,) then getBar was
implemented incorrectly, if the assert doesn't fire, then encapsulation
was broken. I don't see why that is so hard for you to grasp...

And even if it is a reference to some meaningful member of the class,
the function can be implemented in a discretionary manner. For
example, operator= often checks for "this==&argumen t". If the check
proves true, the behavior is different than if it isn't.

Why are you so afraid of references? It's all about design and not
about the language itself. Implementing operator[] as in the OP's
original example DOES break encapsulation. But it can be implemented
differently in the background using the same interface. So it is the
implementation, and not the interface, which makes the difference.

Please Bob, I'm not afraid of references, I use them whenever
appropriate. It is sometimes appropriate to break encapsulation, and
therefore it is sometimes appropriate to return references.
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 22 '06 #41

Noah Roberts

Daniel T. wrote:

In article <js************ *************** *****@4ax.com>,
Bob Hairgrove <in*****@bigfoo t.com> wrote:

The fact that the return value is a reference is irrelevant by itself.
It all depends on what the reference refers to. As I pointed out
before, it could be a reference to some dummy variable which is kept
solely for the purpose of satisfying clients who need some kind of
non-const lvalue to write to. What actually is written (or not, as the
case may be) is solely under control of the class containing the
member and enforced through the implementation of the function
returning the reference.

class Foo {
public:
Foo();
int& getBar();
};

int main()
{
Foo f;

assert( f.getBar() == 0 );
int& b = f.getBar();
b = 5;
assert( f.getBar() == 5 );
}

If the above doesn't work (ie if the last assert fires,) then getBar was
implemented incorrectly, if the assert doesn't fire, then encapsulation
was broken. I don't see why that is so hard for you to grasp...

It's hard to grasp because it isn't true. We don't know what "bar" is,
where it is stored, how it is accessed, or anything about it except
that changes we do to it are guaranteed to remain...in other words we
are getting a reference to something changeable, which is what the
return value says (however, the assert can fail and not be implemented
incorrectly as there is nothing in the definition that says bar will
not be changed between calls or as a consequence of them or even that
the returned reference will always be the same). We have no idea about
the internals of class Foo and in fact bar may not even be in class Foo
but be in some other class either contained within Foo or accessed by
getBar(). Foo could easily be implemented in terms like this:

struct BarHolder
{
int x;
};

class Foo
{
BarHolder holder;
public:
int& getBar() { return holder.x; }
};

Bar could also be a global value (which is of course a total waste of
time), or less obviously a static global in the same compile object as
Foo::getBar(). Maybe as a member of an array? Or it could be a
reference return from some other class and Foo also has no idea where
bar is located.

Fact is that the client does not know, nor care, how the internals of
Foo are represented or where the return of bar comes from. All it
cares about are the guarantees spelled out by the signature of the
member function. This is the definition of encapsulation.

Now, I'll grant you that in many cases there is a smell to returning a
non-const reference, especially how you have spelled it above. This
doesn't mean that doing so always breaks encapsulation.

Mar 22 '06 #42

Bob Hairgrove

On 22 Mar 2006 08:47:06 -0800, "Noah Roberts" <ro**********@g mail.com>
wrote:

Fact is that the client does not know, nor care, how the internals of
Foo are represented or where the return of bar comes from. All it
cares about are the guarantees spelled out by the signature of the
member function. This is the definition of encapsulation.

Thank you.

--
Bob Hairgrove
No**********@Ho me.com

Mar 22 '06 #43

Daniel T.

In article <ic************ *************** *****@4ax.com>,
Bob Hairgrove <in*****@bigfoo t.com> wrote:

On 22 Mar 2006 08:47:06 -0800, "Noah Roberts" <ro**********@g mail.com>
wrote:
Fact is that the client does not know, nor care, how the internals of
Foo are represented or where the return of bar comes from. All it
cares about are the guarantees spelled out by the signature of the
member function. This is the definition of encapsulation.

Thank you.

Not quite, because the guarantee spelled out by the signature of the
member function is that the thing returned exists in RAM. And that is
the definition of breaking the UAP, an important aspect of encapsulation.

Let's go back to the example:

class Foo {
public:
int& bar();
};

No matter how Foo is implemented, it must guarantee that the the int it
returns is stored in a particular place in RAM that it knows about. It
must guarantee that if a client does:

myFoo.bar() = 5;

Then that place in ram will now hold the value 5.

The function signature itself *requires* that we break the UAP. The bar
function cannot change its implementation to calculate a value on the
fly and return it. The best it can do is calculate the value and store
it in a place in RAM, then return a reference to that place in RAM. No
matter how creative you get, this fundamental fact remains.

Granted, we can do something like calculate the value each time bar is
called and ignore any value that the client may put in the RAM location
returned, however I remind you of the title of this thread... A
non-const operator[] that ignores any assignment we make to the
reference returned would break every assumption that any reasonable
programmer can expect from that particular operator. So, even if you
completely ignore everything I wrote above, you are still left with the
fact that op[]'s semantics allow for very little lee-way in how it is
implemented. There is some there, but not the freedom that you would
otherwise have.
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 23 '06 #44

Kai-Uwe Bux

Daniel T. wrote:

In article <ic************ *************** *****@4ax.com>,
Bob Hairgrove <in*****@bigfoo t.com> wrote:
On 22 Mar 2006 08:47:06 -0800, "Noah Roberts" <ro**********@g mail.com>
wrote:
>Fact is that the client does not know, nor care, how the internals of
>Foo are represented or where the return of bar comes from. All it
>cares about are the guarantees spelled out by the signature of the
>member function. This is the definition of encapsulation.
Thank you.

Not quite, because the guarantee spelled out by the signature of the
member function is that the thing returned exists in RAM. And that is
the definition of breaking the UAP, an important aspect of encapsulation.

Sorry for butting in. Just a few remarks: if UAP was broken any time a
non-const reference is returned, then this principle is violated so
frequently as to cast doubt whether it is really a such important aspect of
encapsulation.

Let's go back to the example:

class Foo {
public:
int& bar();
};

No matter how Foo is implemented, it must guarantee that the the int it
returns is stored in a particular place in RAM that it knows about. It
must guarantee that if a client does:

myFoo.bar() = 5;

Then that place in ram will now hold the value 5.

The function signature itself *requires* that we break the UAP. The bar
function cannot change its implementation to calculate a value on the
fly and return it. The best it can do is calculate the value and store
it in a place in RAM, then return a reference to that place in RAM. No
matter how creative you get, this fundamental fact remains.

Granted, we can do something like calculate the value each time bar is
called and ignore any value that the client may put in the RAM location
returned, however I remind you of the title of this thread... A
non-const operator[] that ignores any assignment we make to the
reference returned would break every assumption that any reasonable
programmer can expect from that particular operator.
I cannot reconcile these two paragraphs: Above you claim that the
*signature* of the function requires breaking UAP and down here you concede
that an implementation could, indeed, recalculate the value each time the
method is called. In other words: it is *not* the signature but the
contract or the reasonable expectations of a client (i.e., that changes
from assignments like myFoo.bar() = 5 are preserved) that brings about the
violation of UAP. If there is no promise on the part of Foo that values
written into the returned memory locations by the client are not ignored,
then there appears to be no violation of UAP.
So, even if you
completely ignore everything I wrote above, you are still left with the
fact that op[]'s semantics allow for very little lee-way in how it is
implemented. There is some there, but not the freedom that you would
otherwise have.

True for operator[] with the "canonical" contract.
Finally, a question: would the following break UAP:

struct Foo {

typedef int & reference;

reference bar();

};

in a case where the *documentation* states that "reference" is an
implementation defined type.

Best

Kai-Uwe Bux

Mar 23 '06 #45

Noah Roberts

Daniel T. wrote:

In article <ic************ *************** *****@4ax.com>,
Bob Hairgrove <in*****@bigfoo t.com> wrote:
On 22 Mar 2006 08:47:06 -0800, "Noah Roberts" <ro**********@g mail.com>
wrote:
Fact is that the client does not know, nor care, how the internals of
Foo are represented or where the return of bar comes from. All it
cares about are the guarantees spelled out by the signature of the
member function. This is the definition of encapsulation.
Thank you.

Not quite, because the guarantee spelled out by the signature of the
member function is that the thing returned exists in RAM. And that is
the definition of breaking the UAP, an important aspect of encapsulation.

You need to do more to show how this is so. Lots of things exist in
RAM, that doesn't mean you break encapsulation by using them.
Let's go back to the example:

class Foo {
public:
int& bar();
};

No matter how Foo is implemented, it must guarantee that the the int it
returns is stored in a particular place in RAM that it knows about.
No it doesn't. Foo doesn't have to know anything about where it is.
Even if that statement where true it STILL doesn't break encapsulation
because the CLIENT doesn't know.

It must guarantee that if a client does:

myFoo.bar() = 5;

Then that place in ram will now hold the value 5.
So?
The function signature itself *requires* that we break the UAP.
I think you must be confused about what the UAP is.

"All services offered by a module should be available through a uniform
notation, which does not betray whether they are implemented through
storage or through computation."
http://en.wikipedia.org/wiki/Uniform_access_principle

It doesn't speak of encapsulation at all. It doesn't say what you are
saying it says, speaking to syntax and language design more than
anything else, and I also don't see that it is necissarily valid.
There are people who want properties added to the C++ language...I
don't like the idea. VB has them, C++.NET has them, I don't think they
are that hot...but I guess you do.

I'll offer you more ammo though I don't see the relevance to the
argument: C++ violates the UAP!

The bar function cannot change its implementation to calculate a value on the
fly and return it. The best it can do is calculate the value and store
it in a place in RAM, then return a reference to that place in RAM. No
matter how creative you get, this fundamental fact remains.
And? Returning a value will never be able to return a location in RAM
either. In fact, without providing documentation saying,
"reinterpret_ca st this to a pointer" you will _never_, no matter how
much hackery, be able to return a location in ram if your signature
specifies a reference. Whereas you just came up with a way to
calculate a value and return it (another is the thread unsafe static
variable).

The point of fact simply is that the two signatures are completely
different and are not interchangeable . If you want them to be you are
using the wrong language and I can't think of ANY that provide that
ability (not saying there isn't). int& specifies that the returned
'value' is a mutable object in memory where int says that it isn't.
Granted, we can do something like calculate the value each time bar is
called and ignore any value that the client may put in the RAM location
returned, however I remind you of the title of this thread... A
non-const operator[] that ignores any assignment we make to the
reference returned would break every assumption that any reasonable
programmer can expect from that particular operator. So, even if you
completely ignore everything I wrote above, you are still left with the
fact that op[]'s semantics allow for very little lee-way in how it is
implemented. There is some there, but not the freedom that you would
otherwise have.

Yes, the common meaning to operator[] is expected to be a certain
thing. I don't see the relevance.

Mar 23 '06 #46

Bob Hairgrove

On 23 Mar 2006 00:41:25 -0800, "Noah Roberts" <ro**********@g mail.com>
wrote:

Daniel T. wrote:
In article <ic************ *************** *****@4ax.com>,
Bob Hairgrove <in*****@bigfoo t.com> wrote:
> On 22 Mar 2006 08:47:06 -0800, "Noah Roberts" <ro**********@g mail.com>
> wrote:
>
> >Fact is that the client does not know, nor care, how the internals of
> >Foo are represented or where the return of bar comes from. All it
> >cares about are the guarantees spelled out by the signature of the
> >member function. This is the definition of encapsulation.
>
> Thank you.

Not quite, because the guarantee spelled out by the signature of the
member function is that the thing returned exists in RAM. And that is
the definition of breaking the UAP, an important aspect of encapsulation.

You need to do more to show how this is so. Lots of things exist in
RAM, that doesn't mean you break encapsulation by using them.

Don't waste your time, I'm beginning to think that this guy is a
troll.

--
Bob Hairgrove
No**********@Ho me.com

Mar 23 '06 #47

Greg

Noah Roberts wrote:

Greg wrote:
Roger Lakner wrote:
"Greg" <gr****@pacbell .net> wrote in message
news:11******** **************@ i40g2000cwc.goo glegroups.com.. .
> There is a huge difference between using operator[] and making the
> array public. With a public array, clients can bypass FooList's
> interface (and FooList's methods) and obtain the data FooList
> stores -
> directly.

I guess I don't see, in practical terms, the difference. If an address
is returned, without any bounds checking as in my example, then
supposedly one could access any of the data FooList stores and,
consequently, any of the data Foo stores, bypassing FooList's
interface. Or perhaps I'm not understanding. Perhaps an example would
help.

Sure, let's have FooList provide the array interface and have no
persistent storage at all:

struct Foo { };

struct FooList
{
Foo* operator[](int index);
};
Foo *FooList::opera tor[](int index)
{
return new Foo;
}

int main()
{
FooList fooList;

Foo *f1 = fooList[3];
Foo *f2 = fooList[12];
Foo *f3 = fooList[15];
}

Now clients can still "retrieve" Foo objects from fooList - even though
fooList has no array at all - it simply returns a new object at any
index.

A more realistic example would have fooList obtain the objects from
disk or over a network - but the point is that the code in main() can
treat fooList as if it were an array. But fooList does not need to use
an array in its implementation - but can store the objects however it
likes, or, as in this example, not store them at all.

Whereas the original example, which returned references, did not break
encapsulation, this one does.

Foo * f1 = fooList[3];
Foo * f2 = f1 + 2;

This would be ok for a client of the class with an internal array but
not otherwise. The class's interface does nothing to suggest that
would not be valid.

Sure it does. Pointer arithmatic on f1 would always be invalid for any
kind of array that stores Foo pointers - whether that array is real or
simulated. And the array in this example is an array of Foo pointers (I
should have pointed out that change from the original). So the Foo
pointer returned is actually a copy of the value (and not a pointer to
the value) stored at that "index" in the array. It is not the address
of the item at that index - which in fact the client has no way of
obtaining.

To return a reference in this example, operator[] would have to be
declared like so:

Foo*& operator[](int index)const;

By returning a reference to the pointer the client is then able to
change that pointer's value as it is stored within the array. Returning
a Foo pointer by value, on the other hand, does not offer the client
the opportunity to change the Foo pointer value in the array. Whether
the client should be offered that ability is of course a design
decision.

Greg

Mar 24 '06 #48

Similar topics