Encapsulation and Operator[]

Roger Lakner

I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
const Foo& operator[] (unsigned index) const {return
array[index];};
Foo& operator[] (unsigned index) {return
array[index];};
private:
Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation. Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?
Please pitch responses to someone whose level of knowledge is about
one year of C++ experience.

Thank you,

Roger

Mar 18 '06

Subscribe Reply

3366

« First
<
2
3
4
5
>

Daniel T.

In article <11************ **********@u72g 2000cwu.googleg roups.com>,
"Greg" <gr****@pacbell .net> wrote:

Daniel T. wrote:
In article <r3************ *************** *****@4ax.com>,
Bob Hairgrove <in*****@bigfoo t.com> wrote:
On Sun, 19 Mar 2006 08:36:33 +0100, Bob Hairgrove
<in*****@bigfoo t.com> wrote:

>On Sat, 18 Mar 2006 19:31:25 GMT, "Daniel T."
><po********@ea rthlink.net> wrote:
>
>>> For example, if index is out of range, an exception can be thrown. The
>>> implementation can also be very complex. Consider that there might not
>>> even be a member "array", but operator[] does a database lookup
>>> instead (somehow). Or that the real array is held in another class,
>>> and FooList holds a pointer or reference to that class to which it
>>> forwards the call. There are many possibilities here.
>>
>>The above is not quite true. The Foos in FooList *must* be objects in
>>RAM because clients of FooList may keep a pointer/reference to the value
>>returned, or modify the state of a FooList object by modifying the Foo
>>returned. That breaks the UAP (clients know that the return value was
>>not computed, but rather stored,) thus encapsulation is broken.
>
>I'm sorry, but you are wrong. The reference returned may not
>necessarily even be a reference (see section 23.2.5, paragraph 2 of
>the C++ standard for what it says about
>std::vector<bo ol>::operator[]). This is probably an example of
>encapsulatio n at its best!

I would like to amend this a little ... of course, for the example
given by the OP, you are correct. I was only trying to illustrate that
an implementation of operator[] can be done in other, non-trivial
ways, and that having an operator[] which returns a non-const lvalue
doesn't necessarily break encapsulation.

But let's also consider that it would be perfectly legal for
operator[] to return a reference to a static object or a dummy member
variable which acts as a proxy for the real array element. One could
then document this fact somewhere so that clients would know not to
attempt to store a pointer or reference to the object.

// assume the methods below update some file or database such that
// assigning a value to foo.bar() calls update and retrieving a value
// from foo.bar() calls get_value()
void update( unsigned id, int i );
int get( unsigned id );

class Foo {
public:
Foo( unsigned id );
~Foo();

int& bar();
};

int main() {
Foo foo1( 1 );
foo1.bar() = 1963;
Foo foo2( 1 ); // note, same ID
assert( foo2.bar() == 1963 );
assert( get( 1 ) == 1963 );
}

Implement Foo however you see fit such that the asserts in main won't
fire...

OK, I did:

#include <map>

class Foo
{
public:
Foo( unsigned id ) : mIndex(id)
{
}

~Foo() {};

int& bar()
{
return implementations[mIndex].value;
}
int mIndex;

struct FooImpl { int value; };
static std::map<int, FooImpl> implementations ;
};

std::map<int, Foo::FooImpl> Foo::implementa tions;

void update( unsigned id, int i )
{
Foo::implementa tions[id].value = i;
}

int get( unsigned id )
{
return Foo::implementa tions[id].value;
}

int main()
{
Foo foo1( 1 );
foo1.bar() = 1963;
Foo foo2( 1 ); // note, same ID
assert( foo2.bar() == 1963 );
assert( get( 1 ) == 1963 );
}

Greg

By storing the value in RAM... This is the only way it can be done and
is my point exactly.
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 20 '06 #31

Daniel T.

In article <po************ *************** ***@news.east.e arthlink.net>,
"Daniel T." <po********@ear thlink.net> wrote:

In article <11************ **********@i40g 2000cwc.googleg roups.com>,
"Greg" <gr****@pacbell .net> wrote:
Daniel T. wrote:

The above is not quite true. The Foos in FooList *must* be objects in
RAM because clients of FooList may keep a pointer/reference to the value
returned, or modify the state of a FooList object by modifying the Foo
returned. That breaks the UAP (clients know that the return value was
not computed, but rather stored,) thus encapsulation is broken.

Of course the client knows that any Foo object that FooList returns is
in memory, because that is exactly where the client (and not the
FooList object) allocated it - and did so before the Foo object was
ever stored in a FooList container. A container is a storage class, it
never "computes" a contained item that it returns, because the items it
contains are all provided by the client.

The Uniform Access Principle is not relevant to containers, because it
applies to a (read-only) attribute of an object. The items that a
container contains are not its attributes (nor are the contents often
read-only) - they are independent objects that can exist (and in fact
are created) outside of any container. In fact a FooList has no
client-accessible attributes which means that it perfectly encapsulates
its internal implementation.

By George, I think Greg's got it! This is basically what I have been
saying from the beginning. UAP is broken, but it's OK in some cases...
See my first post in this thread.

One last nail in the coffin. Don't believe me, believe Scott Meyers. In
"Effective C++" item 29 and 30 he says the same thing in a different
way, "Avoid returning "handles" to internal data." and "Avoid member
functions that return non-const pointers or references to members less
accessible than themselves."

The question in this case is... Are the Foos that a FooList contains
really "less accessible" than the FooList?
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 20 '06 #32

Bob Hairgrove

On Mon, 20 Mar 2006 17:21:01 GMT, "Daniel T."
<po********@ear thlink.net> wrote:

One last nail in the coffin. Don't believe me, believe Scott Meyers. In
"Effective C++" item 29 and 30 he says the same thing in a different
way, "Avoid returning "handles" to internal data." and "Avoid member
functions that return non-const pointers or references to members less
accessible than themselves."
You said earlier: "returning a reference always breaks encapsulation".
Scott Meyers is saying something entirely different here. As I tried
to point out before, the reference is only dangerous as long as it is
a reference to an object which could change the state of FooList. And
it doesn't have to be. That is why it is important to have an
interface, even if it involves returning a reference -- because it
could be a non-state-changing object to which it refers.

Don Box, in the first chapter of his excellent book "Essential COM",
has pointed out very succinctly the fact that C++ does not provide
encapsulation at the binary level. So the question remains: in what
context do you put the UAP as far as C++ is concerned?
The question in this case is... Are the Foos that a FooList contains
really "less accessible" than the FooList?

They might or might not be. It all depends on the total design. And
that includes the documentation.

Let's talk about documentation. I strongly believe that the
documentation for a class or framework can be just as much a part of
the design as the source code itself. Consider the case of
std::vector. The C++ standard makes the guarantee that on a conforming
implementation, elements of a vector are stored contiguously in
memory. This gives us additional information which could play an
important role in how elements of the vector are accessed. Give me a
pointer or iterator to the first element, and I can access all the
rest. The same could hold true for std::list or whatever other
container is available where iterators are defined.

But this doesn't hold true for vector<bool>. How do I know? The only
reason is that I read about it in the C++ standard. I consider that
"documentat ion" as well.

As Bjarne Stroustrup says, C++ wasn't designed to be non-hackable.
Things like encapsulation and public/private access were meant to
avoid errors, not to prevent malicious coding. So if what you mean by
adherence to UAP is to avoid the latter, C++ isn't going to do it. And
if it is the former, then we all have to play by the rules -- which
includes reading the documentation.

--
Bob Hairgrove
No**********@Ho me.com

Mar 20 '06 #33

Bob Hairgrove

On Mon, 20 Mar 2006 13:26:11 GMT, "Daniel T."
<po********@ear thlink.net> wrote:

the
defining characteristic of op[] is invalid in your code, namely that
consecutive calls to op[] with the same index will return the same
object.

Who says?

--
Bob Hairgrove
No**********@Ho me.com

Mar 20 '06 #34

Noah Roberts

Greg wrote:

Roger Lakner wrote:
"Greg" <gr****@pacbell .net> wrote in message
news:11******** **************@ i40g2000cwc.goo glegroups.com.. .
There is a huge difference between using operator[] and making the
array public. With a public array, clients can bypass FooList's
interface (and FooList's methods) and obtain the data FooList
stores -
directly.

I guess I don't see, in practical terms, the difference. If an address
is returned, without any bounds checking as in my example, then
supposedly one could access any of the data FooList stores and,
consequently, any of the data Foo stores, bypassing FooList's
interface. Or perhaps I'm not understanding. Perhaps an example would
help.

Sure, let's have FooList provide the array interface and have no
persistent storage at all:

struct Foo { };

struct FooList
{
Foo* operator[](int index);
};
Foo *FooList::opera tor[](int index)
{
return new Foo;
}

int main()
{
FooList fooList;

Foo *f1 = fooList[3];
Foo *f2 = fooList[12];
Foo *f3 = fooList[15];
}

Now clients can still "retrieve" Foo objects from fooList - even though
fooList has no array at all - it simply returns a new object at any
index.

A more realistic example would have fooList obtain the objects from
disk or over a network - but the point is that the code in main() can
treat fooList as if it were an array. But fooList does not need to use
an array in its implementation - but can store the objects however it
likes, or, as in this example, not store them at all.

Whereas the original example, which returned references, did not break
encapsulation, this one does.

Foo * f1 = fooList[3];
Foo * f2 = f1 + 2;

This would be ok for a client of the class with an internal array but
not otherwise. The class's interface does nothing to suggest that
would not be valid. On the other hand, returning a reference does. It
says that this is not something that should be considered a pointer so
don't try anything pointerish with it. A client could still do
something similar with references:

Foo & f1 = fooList[3];
Foo * f2 = &f1 + 2;

You can't stop that kind of thing really, unless you return a reference
proxy. There may be reason enough to do so. However, this is
obviously much more hackish and there can be no doubt in the mind of
the developer that they are purposefully breaking encapsulation for
some stupid reason. With a pointer return this is not so obvious.

The preferable interface of course returns copies or proxies that act
like copies...maybe copy on write type of reference counting mechanism.
With a proxy you could even change the semantics of the class to allow
the creation of new Foo's because maybe the Foo is something on some
other server and it needs to be copied. Both references and pointers
pose problems for this need.

About the class not doing any bounds checking, it easily could. There
is nothing saying that operator[] doesn't throw. You could add that
functionality and not hurt clients, assuming they are already exception
safe (and they should be). std::vector doesn't check either but still
offers a lot in comparison to a public array.

Mar 20 '06 #35

Daniel T.

In article <1g************ *************** *****@4ax.com>,
Bob Hairgrove <in*****@bigfoo t.com> wrote:

On Mon, 20 Mar 2006 17:21:01 GMT, "Daniel T."
<po********@ear thlink.net> wrote:
One last nail in the coffin. Don't believe me, believe Scott Meyers. In
"Effective C++" item 29 and 30 he says the same thing in a different
way, "Avoid returning "handles" to internal data." and "Avoid member
functions that return non-const pointers or references to members less
accessible than themselves."
You said earlier: "returning a reference always breaks encapsulation".
Scott Meyers is saying something entirely different here. As I tried
to point out before, the reference is only dangerous as long as it is
a reference to an object which could change the state of FooList. And
it doesn't have to be. That is why it is important to have an
interface, even if it involves returning a reference -- because it
could be a non-state-changing object to which it refers.

Again, this is exactly my point. If logically, FooList is simply holding
the Foos for some other object and doesn't actually own them, then
breaking UAP is OK, but UAP is still broken. That's why I said to the OP
that although he was right and encapsulation was broken, that isn't
necessarily a problem.

Don Box, in the first chapter of his excellent book "Essential COM",
has pointed out very succinctly the fact that C++ does not provide
encapsulation at the binary level. So the question remains: in what
context do you put the UAP as far as C++ is concerned?
The question in this case is... Are the Foos that a FooList contains
really "less accessible" than the FooList?
They might or might not be. It all depends on the total design. And
that includes the documentation.

Let's talk about documentation. I strongly believe that the
documentation for a class or framework can be just as much a part of
the design as the source code itself. Consider the case of
std::vector. The C++ standard makes the guarantee that on a conforming
implementation, elements of a vector are stored contiguously in
memory. This gives us additional information which could play an
important role in how elements of the vector are accessed. Give me a
pointer or iterator to the first element, and I can access all the
rest. The same could hold true for std::list or whatever other
container is available where iterators are defined.

But this doesn't hold true for vector<bool>. How do I know? The only
reason is that I read about it in the C++ standard. I consider that
"documentat ion" as well.

All of this proves my point. The interface gives away the implementation
(its not encapsulated.) That's OK in this case though because the docs
require a particular implementation anyway.

As Bjarne Stroustrup says, C++ wasn't designed to be non-hackable.
Things like encapsulation and public/private access were meant to
avoid errors, not to prevent malicious coding. So if what you mean by
adherence to UAP is to avoid the latter, C++ isn't going to do it. And
if it is the former, then we all have to play by the rules -- which
includes reading the documentation.

Not at all. Nothing can prevent a programmer from hacking into the
memory footprint of an object and access anything he wants. I am talking
about the legions of programmers who provide reference returns and
*think* they are properly encapsulating their data.

As Meyers himself wrote: "Unfortunat ely, the presence of [a member
function returning a non-const reference] defeats the purpose of making
[a private member-variable] private." (Meyers, "Effective C++" item 30)

--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 21 '06 #36

Daniel T.

In article <al************ *************** *****@4ax.com>,
Bob Hairgrove <in*****@bigfoo t.com> wrote:

On Mon, 20 Mar 2006 13:26:11 GMT, "Daniel T."
<po********@ear thlink.net> wrote:
the
defining characteristic of op[] is invalid in your code, namely that
consecutive calls to op[] with the same index will return the same
object.

Who says?

Herb Sutter, Scott Meyers, Bjarne Stroustrup among others... A common
quote "when using operator overloading or any other language feature for
your own classes, when in doubt always make your class follow the same
semantics as the builtin and standard library types."

If your op[] does not follow the same semantics as op[] does on an
array, then I suggest you rename your function so you won't make life
harder on your users. Remember, operator overloading is exists to make
life easer on the users of your class.
<http://www.parashift.c om/c++-faq-lite/operator-overloading.htm l>
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 21 '06 #37

Bob Hairgrove

On Tue, 21 Mar 2006 14:00:29 GMT, "Daniel T."
<po********@ear thlink.net> wrote:

As Bjarne Stroustrup says, C++ wasn't designed to be non-hackable.
Things like encapsulation and public/private access were meant to
avoid errors, not to prevent malicious coding. So if what you mean by
adherence to UAP is to avoid the latter, C++ isn't going to do it. And
if it is the former, then we all have to play by the rules -- which
includes reading the documentation.
Not at all. Nothing can prevent a programmer from hacking into the
memory footprint of an object and access anything he wants. I am talking
about the legions of programmers who provide reference returns and
*think* they are properly encapsulating their data.

So we agree that encapsulation cannot be achieved 100% purely by using
the language features in C++, but only through proper design? If you
agree with that, I still do not understand why you categorically
stated earlier that returning a reference breaks encapsulation?
As Meyers himself wrote: "Unfortunat ely, the presence of [a member
function returning a non-const reference] defeats the purpose of making
[a private member-variable] private." (Meyers, "Effective C++" item 30)

No. This is more like what he says:

"Unfortunat ely, the presence of [a member function returning a
non-const reference TO A PRIVATE MEMBER OF THE CLASS WHICH CAN CHANGE
THE CLASS STATE] defeats the purpose of making [THAT private
member-variable] private." (Meyers, "Effective C++" item 30)

The fact that the return value is a reference is irrelevant by itself.
It all depends on what the reference refers to. As I pointed out
before, it could be a reference to some dummy variable which is kept
solely for the purpose of satisfying clients who need some kind of
non-const lvalue to write to. What actually is written (or not, as the
case may be) is solely under control of the class containing the
member and enforced through the implementation of the function
returning the reference.

And even if it is a reference to some meaningful member of the class,
the function can be implemented in a discretionary manner. For
example, operator= often checks for "this==&argumen t". If the check
proves true, the behavior is different than if it isn't.

Why are you so afraid of references? It's all about design and not
about the language itself. Implementing operator[] as in the OP's
original example DOES break encapsulation. But it can be implemented
differently in the background using the same interface. So it is the
implementation, and not the interface, which makes the difference.

--
Bob Hairgrove
No**********@Ho me.com

Mar 21 '06 #38

Bob Hairgrove

On Tue, 21 Mar 2006 14:17:49 GMT, "Daniel T."
<po********@ear thlink.net> wrote:

In article <al************ *************** *****@4ax.com>,
Bob Hairgrove <in*****@bigfoo t.com> wrote:
On Mon, 20 Mar 2006 13:26:11 GMT, "Daniel T."
<po********@ear thlink.net> wrote:
>the
>defining characteristic of op[] is invalid in your code, namely that
>consecutive calls to op[] with the same index will return the same
>object.

Who says?

Herb Sutter, Scott Meyers, Bjarne Stroustrup among others... A common
quote "when using operator overloading or any other language feature for
your own classes, when in doubt always make your class follow the same
semantics as the builtin and standard library types."

If your op[] does not follow the same semantics as op[] does on an
array, then I suggest you rename your function so you won't make life
harder on your users. Remember, operator overloading is exists to make
life easer on the users of your class.
<http://www.parashift.c om/c++-faq-lite/operator-overloading.htm l>

That seems awfully limiting to me ... look at std::map::opera tor[],
for example. Does that act like an array??

--
Bob Hairgrove
No**********@Ho me.com

Mar 21 '06 #39

Phlip

Bob Hairgrove wrote:

If your op[] does not follow the same semantics as op[] does on an
array, then I suggest you rename your function so you won't make life
harder on your users. Remember, operator overloading is exists to make
life easer on the users of your class.
<http://www.parashift.c om/c++-faq-lite/operator-overloading.htm l>

That seems awfully limiting to me ... look at std::map::opera tor[],
for example. Does that act like an array??

That is a good example of the difference between syntax and semantics. Map[]
doesn't use the same syntax, but it gives the same warm-fuzzies. Semantics.

--
Phlip
http://www.greencheese.org/ZeekLand <-- NOT a blog!!!

Mar 21 '06 #40

Similar topics