Encapsulation and Operator[]

Roger Lakner

I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
    const Foo& operator[] (unsigned index) const { return array[index]; }
    Foo& operator[] (unsigned index) { return array[index]; }
private:
    Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation. Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?
Please pitch responses to someone whose level of knowledge is about
one year of C++ experience.

Thank you,

Roger

Mar 18 '06 #1

Bob Hairgrove

On Fri, 17 Mar 2006 21:57:20 -0800, "Roger Lakner"
<rl*****@adelphia.net> wrote:

I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
    const Foo& operator[] (unsigned index) const { return array[index]; }
    Foo& operator[] (unsigned index) { return array[index]; }
private:
    Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation. Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?
Please pitch responses to someone whose level of knowledge is about
one year of C++ experience.

Why do you say that it wrecks encapsulation? Admittedly, the
implementation shown above doesn't buy you anything over direct public
access to the array member variable, but that doesn't mean that it
can't be done differently.

For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.
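
A minimal sketch of the bounds-checking idea mentioned above. The original post leaves Foo and num unspecified, so the one-member Foo, the size of 10, and the choice of std::out_of_range as the exception type are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for the Foo that the original post leaves unspecified.
struct Foo {
    int value = 0;
};

class FooList {
public:
    static constexpr std::size_t num = 10;  // assumed size, not from the post

    // Checked read access: throws instead of reading past the end.
    const Foo& operator[](std::size_t index) const {
        if (index >= num) {
            throw std::out_of_range("FooList index out of range");
        }
        return array[index];
    }

    // Checked write access, same rule.
    Foo& operator[](std::size_t index) {
        if (index >= num) {
            throw std::out_of_range("FooList index out of range");
        }
        return array[index];
    }

private:
    Foo array[num];
};
```

Client code still writes f[3], but an index of 10 or more now raises an exception rather than touching memory outside the array.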

--
Bob Hairgrove
No**********@Home.com

Mar 18 '06 #2

Bo Persson

"Roger Lakner" <rl*****@adelphia.net> skrev i meddelandet
news:Tq********************@adelphia.com...

I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
    const Foo& operator[] (unsigned index) const { return array[index]; }
    Foo& operator[] (unsigned index) { return array[index]; }
private:
    Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation. Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use?

That is an important feature for an interface!

Or do you just bite the bullet and accept it?

It is not about accepting or not, it is about what you want to do to
an object. Without knowing what a Foo is, it is hard to tell if

FooList f;

f[5] = someFoo;

is "natural" or not. It depends!

Considering that C++ provides a std::vector with this kind of
interface, it must be correct some of the time. :-)

Please pitch responses to someone whose level of knowledge is about
one year of C++ experience.

There aren't always universal rules for all situations. You have to
consider each one individually.

Bo Persson

Mar 18 '06 #3

Greg

Roger Lakner wrote:

I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
    const Foo& operator[] (unsigned index) const { return array[index]; }
    Foo& operator[] (unsigned index) { return array[index]; }
private:
    Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation. Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?

"Wrecks encapsulation?" On the contrary, FooList demonstrates exactly
how a class should encapsulate its data - with private data members and
a public interface. Note that clients cannot access FooList's data
member directly, instead they must invoke methods in FooList's public
interface to access FooList's data. In other words, FooList has
encapsulated its data by mediating all access to it.

Encapsulation makes it possible for FooList to change its underlying
storage model without affecting its clients - and that quality is the
primary benefit of encapsulation. For instance we could imagine an
implementation in which FooList accessed a network server, or a
database, or some other kind of store to retrieve the Foo object
returned by operator[]. To the client, this change to FooList would go
undetected - because even though its data representation may have
changed - its public interface, which its clients all use - would not
have changed.
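
As a rough illustration of a storage change that clients cannot detect, the sketch below swaps the fixed array for a std::map that creates entries on first use. The one-member Foo, the map-based storage, and the client function sumOfFirst are hypothetical and not part of the original example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical stand-in for the Foo that the original post leaves unspecified.
struct Foo {
    int value = 0;
};

class FooList {
public:
    // Read access: entries that were never written read as a default Foo.
    const Foo& operator[](std::size_t index) const {
        static const Foo empty{};
        auto it = items.find(index);
        return it == items.end() ? empty : it->second;
    }

    // Write access: the map creates a default Foo on first use.
    Foo& operator[](std::size_t index) {
        return items[index];
    }

private:
    std::map<std::size_t, Foo> items;  // replaces 'Foo array[num]'
};

// Client code written purely against operator[]; it compiles unchanged
// whether FooList uses an array or a map internally.
inline int sumOfFirst(const FooList& list, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += list[i].value;
    }
    return total;
}
```

Because the operator returns a real reference, whatever storage is chosen still has to keep actual Foo objects alive in memory, which is the constraint raised later in the thread.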

Greg

Mar 18 '06 #4

Daniel T.

In article <Tq********************@adelphia.com>,
"Roger Lakner" <rl*****@adelphia.net> wrote:

I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
    const Foo& operator[] (unsigned index) const { return array[index]; }
    Foo& operator[] (unsigned index) { return array[index]; }
private:
    Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation.

Returning a reference/pointer to a member object breaks both the LSP
(Liskov Substitution Principle) and the UAP (Uniform Access Principle),
and thus "wrecks encapsulation". You are entirely correct on that
point. However, there are times when both can be broken... Take
std::vector for example, the vector does not logically own the objects
it contains, it is simply in charge of deleting said objects at the
appropriate time. The object that owns the vector is the one who
actually owns the objects contained in the vector. In other words:

class Foo {
vector<Bar> itsBars;
};

Logically, Foo objects own the Bar objects in the vector, *not* the
vector. If you were to write the above in UML it would look like:

0..n
[Foo]<#>--------->[Bar]

Note how the vector<Bar> is not expressed in the diagram, it is simply
an implementation artifact.

Also, although returning a const reference does break the OO principles
above, as long as it is only used as a performance optimization over
returning an object, it's OK to do. In other words, changing "const T&"
to "T" should not break client code, only slow down the function call.
Any client code that *does* break as a result of such change should be
modified.

In summary, by returning a Foo& in FooList::op[], you (as the designer
of FooList) are saying that FooList doesn't actually own the Foos it
contains, it is simply managing their lifetime for some client who
*does* own them.

Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?

First you have to ask yourself, "who owns the Foos that FooList
contains?" If the answer is "FooList" then you should not provide an
op[] for the contained objects (although if you want, you can provide an
op[] const.) Use something like this instead:

class FooList {
    Foo array[num];
public:
    const Foo operator[](unsigned index) const { return array[index]; }
    void setFoo( unsigned id, const Foo& foo ) {
        array[id] = foo;
    }
};

(If Foo's are expensive to copy and you find that performance is
suffering because of the return by const value, you can later change the
code to "const Foo& operator[](unsigned)const".)

This way, encapsulation is preserved. You can replace 'array' with any
other class, set of classes, or remove it completely as your needs
dictate without affecting FooList clients.
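
To show what "remove it completely" could look like, here is a sketch in which no Foo object is stored at all. The one-member Foo, the constructor taking a count, and the std::vector<int> storage are assumptions for illustration; only the by-value operator[] and setFoo come from the suggestion above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the Foo that the original post leaves unspecified.
struct Foo {
    int value = 0;
};

class FooList {
public:
    explicit FooList(std::size_t count) : values(count, 0) {}

    // Returns a copy built on demand; clients cannot tell that no Foo
    // object exists inside the list.
    Foo operator[](std::size_t index) const {
        Foo result;
        result.value = values.at(index);  // at() throws on a bad index
        return result;
    }

    // All modification goes through the list's own interface.
    void setFoo(std::size_t index, const Foo& foo) {
        values.at(index) = foo.value;
    }

private:
    std::vector<int> values;  // replaces 'Foo array[num]'
};
```

Changing the copy a client receives has no effect on the list; only setFoo can alter its state.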

--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 18 '06 #5

Daniel T.

In article <fn********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:

On Fri, 17 Mar 2006 21:57:20 -0800, "Roger Lakner"
<rl*****@adelphia.net> wrote:
I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
    const Foo& operator[] (unsigned index) const { return array[index]; }
    Foo& operator[] (unsigned index) { return array[index]; }
private:
    Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation. Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?
Please pitch responses to someone whose level of knowledge is about
one year of C++ experience.

Why do you say that it wrecks encapsulation? Admittedly, the
implementation shown above doesn't buy you anything over direct public
access to the array member variable, but that doesn't mean that it
can't be done differently.

For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.

The above is not quite true. The Foos in FooList *must* be objects in
RAM because clients of FooList may keep a pointer/reference to the value
returned, or modify the state of a FooList object by modifying the Foo
returned. That breaks the UAP (clients know that the return value was
not computed, but rather stored,) thus encapsulation is broken.
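
The point about clients keeping a reference can be sketched with the original FooList. The one-member Foo, the size of 4, and the client function pokeThroughReference are hypothetical names added for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the Foo that the original post leaves unspecified.
struct Foo {
    int value = 0;
};

class FooList {
public:
    static constexpr std::size_t num = 4;  // assumed size

    const Foo& operator[](std::size_t index) const { return array[index]; }
    Foo& operator[](std::size_t index) { return array[index]; }

private:
    Foo array[num];
};

// Hypothetical client: it keeps the returned reference and later writes
// through it, changing the list's state without calling any FooList method.
inline int pokeThroughReference(FooList& list) {
    Foo& kept = list[1];
    kept.value = 1967;
    return list[1].value;  // the list observes the change
}
```

For this to work, every call to operator[] with the same index has to hand back the same stored object, which is why the element cannot simply be computed on the fly.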
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 18 '06 #6

Roger Lakner

Bob,
I see your point about the fact that more can go on in the op[]
implementation than just passing a reference. And in that context, it
makes some sense, though I still think encapsulation is violated. But
I have seen several contexts in which nothing goes on except as I've
illustrated (e.g., in std::vector), and it is in this context that it
seems to me just a conceit to make the array private.

Roger

"Bob Hairgrove" <in*****@bigfoot.com> wrote in message
news:fn********************************@4ax.com...

On Fri, 17 Mar 2006 21:57:20 -0800, "Roger Lakner"
<rl*****@adelphia.net> wrote:
I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
    const Foo& operator[] (unsigned index) const { return array[index]; }
    Foo& operator[] (unsigned index) { return array[index]; }
private:
    Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation. Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?
Please pitch responses to someone whose level of knowledge is about
one year of C++ experience.

Why do you say that it wrecks encapsulation? Admittedly, the
implementation shown above doesn't buy you anything over direct public
access to the array member variable, but that doesn't mean that it
can't be done differently.

For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.

--
Bob Hairgrove
No**********@Home.com

Mar 19 '06 #7

Roger Lakner

"Bo Persson" <bo*@gmb.dk> wrote in message
news:48************@individual.net...

There aren't always universal rules for all situations. You have to
consider each one individually.

Amen to that. But if one of the primary advantages of C++ over C is
encapsulation, and that feature is routinely and easily subverted,
then...

Roger

Mar 19 '06 #8

Roger Lakner

"Greg" <gr****@pacbell.net> wrote in message
news:11**********************@i40g2000cwc.googlegroups.com...

On the contrary, FooList demonstrates exactly
how a class should encapsulate its data - with private data members and
a public interface. Note that clients cannot access FooList's data
member directly, instead they must invoke methods in FooList's public
interface to access FooList's data. In other words, FooList has
encapsulated its data by mediating all access to it.

I guess I don't see the difference between op[] as instantiated here
and making array public.

Encapsulation makes it possible for FooList to change its underlying
storage model without affecting its clients - and that quality is the
primary benefit of encapsulation.

I agree that is one of the primary benefits of encapsulation, but so
is data hiding. The data is not being hidden in any robust sense here.
I don't see the difference, in my example, between op[] and making
array public. It's disheartening to me to see that one of the
much-vaunted advantages of C++ over C is so easily, so naturally, so
intuitively and quite often subverted.

Roger

Mar 19 '06 #9

Roger Lakner

"Daniel T." <po********@earthlink.net> wrote in message
news:postmaster-

Returning a reference/pointer to a member object breaks both the LSP and
UAP (and thus "wrecks encapsulation".) You are entirely correct on that
point. However there are times when both can be broken...

What sense of "can" do you mean in "can be broken"? I know it is
possible to break both. Do you mean "should" be broken? Should be
broken only in rare cases? Should be broken when other considerations
warrant?

Take std::vector for example, the vector does not logically own the
objects it contains, it is simply in charge of deleting said objects at the
appropriate time. The object that owns the vector is the one who
actually owns the objects contained in the vector. In other words:

class Foo {
vector<Bar> itsBars;
};

Logically, Foo objects own the Bar objects in the vector, *not* the
vector. If you were to write the above in UML it would look like:

0..n
[Foo]<#>--------->[Bar]

Note how the vector<Bar> is not expressed in the diagram, it is simply
an implementation artifact.

Also, although returning a const reference does break the OO principles
above, as long as it is only used as a performance optimization over
returning an object, it's OK to do. In other words, changing "const T&"
to "T" should not break client code, only slow down the function call.
Any client code that *does* break as a result of such change should be
modified.

So performance is what makes it OK to break a primary feature of OO
principles? I agree it's advantageous to the programmer, but at what
cost?

In summary, by returning a Foo& in FooList::op[], you (as the designer
of FooList) are saying that FooList doesn't actually own the Foos it
contains, it is simply managing their lifetime for some client who
*does* own them.

This is a very interesting way of looking at this. I need to give this
some more thought.

Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?

First you have to ask yourself, "who owns the Foos that FooList
contains?" If the answer is "FooList" then you should not provide an
op[] for the contained objects (although if you want, you can provide an
op[] const.) Use something like this instead:

class FooList {
    Foo array[num];
public:
    const Foo operator[](unsigned index) const { return array[index]; }
    void setFoo( unsigned id, const Foo& foo ) {
        array[id] = foo;
    }
};

(If Foo's are expensive to copy and you find that performance is
suffering because of the return by const value, you can later change the
code to "const Foo& operator[](unsigned)const".)

This way, encapsulation is preserved. You can replace 'array' with any
other class, set of classes, or remove it completely as your needs
dictate without affecting FooList clients.

Thank you very much, this is very helpful.

Roger

Mar 19 '06 #10

Daniel T.

In article <gK********************@adelphia.com>,
"Roger Lakner" <rl*****@adelphia.net> wrote:

"Daniel T." <po********@earthlink.net> wrote in message
news:postmaster-
Returning a reference/pointer to a member object breaks both the LSP and
UAP (and thus "wrecks encapsulation".) You are entirely correct on that
point. However there are times when both can be broken...

What sense of "can" do you mean in "can be broken"? I know it is
possible to break both. Do you mean "should" be broken? Should be
broken only in rare cases? Should be broken when other considerations
warrant?

What I mean to say is that, at times, other conditions warrant breaking
LSP and/or UAP. One issue that can cause us to break encapsulation is,
as I have already mentioned, performance. C++ is, first and foremost I
should think, designed for high-performance, time-critical applications,
more so than any other OO language. Another reason to break
encapsulation, which I have also alluded to, has to do with
lifetime issues and the feature unique to C++ among OO languages: the
destructor.

Take std::vector for example, the vector does not logically own the
objects it contains, it is simply in charge of deleting said objects at the
appropriate time. The object that owns the vector is the one who
actually owns the objects contained in the vector. In other words:

class Foo {
vector<Bar> itsBars;
};

Logically, Foo objects own the Bar objects in the vector, *not* the
vector. If you were to write the above in UML it would look like:

0..n
[Foo]<#>--------->[Bar]

Note how the vector<Bar> is not expressed in the diagram, it is simply
an implementation artifact.

Also, although returning a const reference does break the OO principles
above, as long as it is only used as a performance optimization over
returning an object, it's OK to do. In other words, changing "const T&"
to "T" should not break client code, only slow down the function call.
Any client code that *does* break as a result of such change should be
modified.

So performance is what makes it OK to break a primary feature of OO
principles?

The cost in other matters (such as maintainability, reuse, extensibility
&c.) is irrelevant if the system performs so slowly that it cannot be
used for its intended purpose. Please don't get me wrong, I heartily
agree that premature optimization is the root of many problems in
programming in general and possibly especially so in C++, but let's not
throw the baby out with the bath water...

I agree it's advantageous to the programmer, but at what cost?

You don't agree with me. I think breaking encapsulation is
disadvantageous to the programmer. However, it is sometimes a necessary
evil.

In summary, by returning a Foo& in FooList::op[], you (as the designer
of FooList) are saying that FooList doesn't actually own the Foos it
contains, it is simply managing their lifetime for some client who
*does* own them.

This is a very interesting way of looking at this. I need to give this
some more thought.

Well then, let me expound on it some more. C++ is unique (in my
experience at least) among OO languages in that every class has one
member function whose semantics are such that it must be the last method
called on the object, and it *must* be called. This unique requirement
means that C++ has some correspondingly unique idioms to deal with it
that other languages need not deal with.

The best and brightest in the C++ community have found over the years
that it isn't necessarily advantageous for the parent object in a
composite relationship to also be the object responsible for ensuring
that the destructor is properly called. C++ has a plethora of classes
whose sole responsibility is to ensure the destructor is called on one
or more objects at the appropriate time (namely smart pointers and the
standard containers.) So, we often find classes in C++ sharing their
aggregates with others, the latter of which have the sole responsibility
of monitoring the aggregate's lifetime.
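
A small sketch of that division of labour, assuming a hypothetical Bar that counts its live instances. Foo logically owns its Bars, yet it never writes a destructor; the std::vector member makes sure each Bar's destructor runs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical Bar that counts how many instances are currently alive.
struct Bar {
    static int live;
    Bar() { ++live; }
    Bar(const Bar&) { ++live; }
    ~Bar() { --live; }
};
int Bar::live = 0;

// Foo is the logical owner of the Bars, but it delegates lifetime
// management to the vector and declares no destructor of its own.
class Foo {
public:
    explicit Foo(std::size_t count) : itsBars(count) {}
    std::size_t barCount() const { return itsBars.size(); }

private:
    std::vector<Bar> itsBars;
};
```

When a Foo goes out of scope, the implicitly generated destructor destroys itsBars, and the vector in turn destroys every Bar it holds.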

Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?

First you have to ask yourself, "who owns the Foos that FooList
contains?" If the answer is "FooList" then you should not provide an
op[] for the contained objects (although if you want, you can provide an
op[] const.) Use something like this instead:

class FooList {
    Foo array[num];
public:
    const Foo operator[](unsigned index) const {
        return array[index];
    }
    void setFoo( unsigned id, const Foo& foo ) {
        array[id] = foo;
    }
};

(If Foo's are expensive to copy and you find that performance is
suffering because of the return by const value, you can later change
the code to "const Foo& operator[](unsigned)const".)

This way, encapsulation is preserved. You can replace 'array' with any
other class, set of classes, or remove it completely as your needs
dictate without affecting FooList clients.

Thank you very much, this is very helpful.

It is my pleasure.
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 19 '06 #11

Bob Hairgrove

On Sat, 18 Mar 2006 19:31:25 GMT, "Daniel T."
<po********@earthlink.net> wrote:

For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.

The above is not quite true. The Foos in FooList *must* be objects in
RAM because clients of FooList may keep a pointer/reference to the value
returned, or modify the state of a FooList object by modifying the Foo
returned. That breaks the UAP (clients know that the return value was
not computed, but rather stored,) thus encapsulation is broken.

I'm sorry, but you are wrong. The reference returned may not
necessarily even be a reference (see section 23.2.5, paragraph 2 of
the C++ standard for what it says about
std::vector<bool>::operator[]). This is probably an example of
encapsulation at its best!
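
The std::vector<bool> case can be checked directly. This sketch relies on static_assert and <type_traits>, which arrived with C++11 and so postdate this thread; the function name flipThroughProxy is made up for illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// The nested 'reference' type of std::vector<bool> is a proxy class,
// not a real bool&.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "vector<bool>::reference is a proxy class, not bool&");

// Writes one element through the proxy and reads it back from the vector.
inline bool flipThroughProxy() {
    std::vector<bool> bits(8, false);
    std::vector<bool>::reference r = bits[3];  // proxy object, held by value
    r = true;                                  // writes through to the vector
    return bits[3];
}
```

Note that this only works because the return type is a class standing in for a reference; a function declared to return a genuine bool& could not be implemented this way.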

--
Bob Hairgrove
No**********@Home.com

Mar 19 '06 #12

Bob Hairgrove

[top-posting corrected]

On Sat, 18 Mar 2006 18:36:26 -0800, "Roger Lakner"
<rl*****@adelphia.net> wrote:

"Bob Hairgrove" <in*****@bigfoot.com> wrote in message
news:fn********************************@4ax.com...
On Fri, 17 Mar 2006 21:57:20 -0800, "Roger Lakner"
<rl*****@adelphia.net> wrote:
I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
    const Foo& operator[] (unsigned index) const { return array[index]; }
    Foo& operator[] (unsigned index) { return array[index]; }
private:
    Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation. Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?
Please pitch responses to someone whose level of knowledge is about
one year of C++ experience.

Why do you say that it wrecks encapsulation? Admittedly, the
implementation shown above doesn't buy you anything over direct public
access to the array member variable, but that doesn't mean that it
can't be done differently.

For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.

--
Bob Hairgrove
No**********@Home.com

Bob,
I see your point about the fact that more can go on in the op[]
implementation than just passing a reference. And in that context, it
makes some sense, though I still think encapsulation is violated. But
I have seen several contexts in which nothing goes on except as I've
illustrated (e.g., in std::vector), and it is in this context that it
seems to me just a conceit to make the array private.

It does leave room for later modifications to the implementation
without having to change the interface. Often, you will see things
like these seemingly silly "do-nothing" implementations in the
pre-production stages of code. When the preliminary testing phase
passes, programmers can "harden up" the code by adding stuff to the
implementation body. As long as the function bodies live in a source
file rather than inline in the header, clients only see the headers, so
they wouldn't have to recompile their own code.

--
Bob Hairgrove
No**********@Home.com

Mar 19 '06 #13

Bob Hairgrove

On Sat, 18 Mar 2006 18:39:58 -0800, "Roger Lakner"
<rl*****@adelphia.net> wrote:

"Bo Persson" <bo*@gmb.dk> wrote in message
news:48************@individual.net...

There aren't always universal rules for all situations. You have to
consider each one individually.

Amen to that. But if one of the primary advantages of C++ over C is
encapsulation, and that feature is routinely and easily subverted,
then...

Encapsulation doesn't mean "non-hackable". In "The Design and
Evolution of C++" by Bjarne Stroustrup, there is a passage about
public/private access mechanisms on page 55 which addresses a
similar issue (quoted from his "Annotated C++ Reference Manual"):

"The C++ access control mechanisms provide protection against accident
-- not against fraud. Any programming language that supports access to
raw memory will leave data open to deliberate tampering in ways that
violate the explicit type rules specified for a given data item."

--
Bob Hairgrove
No**********@Home.com

Mar 19 '06 #14

Bob Hairgrove

On Sun, 19 Mar 2006 08:36:33 +0100, Bob Hairgrove
<in*****@bigfoot.com> wrote:

On Sat, 18 Mar 2006 19:31:25 GMT, "Daniel T."
<po********@earthlink.net> wrote:
For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.

The above is not quite true. The Foos in FooList *must* be objects in
RAM because clients of FooList may keep a pointer/reference to the value
returned, or modify the state of a FooList object by modifying the Foo
returned. That breaks the UAP (clients know that the return value was
not computed, but rather stored,) thus encapsulation is broken.

I'm sorry, but you are wrong. The reference returned may not
necessarily even be a reference (see section 23.2.5, paragraph 2 of
the C++ standard for what it says about
std::vector<bool>::operator[]). This is probably an example of
encapsulation at its best!

I would like to amend this a little ... of course, for the example
given by the OP, you are correct. I was only trying to illustrate that
an implementation of operator[] can be done in other, non-trivial
ways, and that having an operator[] which returns a non-const lvalue
doesn't necessarily break encapsulation.

But let's also consider that it would be perfectly legal for
operator[] to return a reference to a static object or a dummy member
variable which acts as a proxy for the real array element. One could
then document this fact somewhere so that clients would know not to
attempt to store a pointer or reference to the object.
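
One way to sketch the proxy idea is shown below. It swaps in a proxy object returned by value, which is a different technique from a reference to a static or dummy member, and every name in it (BitList, BitRef) is hypothetical. The index is assumed to be smaller than the number of bits in an unsigned.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical container that packs its elements into the bits of one word,
// so there is no bool object in memory for operator[] to refer to.
class BitList {
public:
    // Proxy standing in for a single element.
    class BitRef {
    public:
        BitRef(unsigned& word, std::size_t bit) : word_(word), mask_(1u << bit) {}

        // Assigning through the proxy sets or clears the underlying bit.
        BitRef& operator=(bool on) {
            if (on) {
                word_ |= mask_;
            } else {
                word_ &= ~mask_;
            }
            return *this;
        }

        // Reading through the proxy tests the underlying bit.
        operator bool() const { return (word_ & mask_) != 0; }

    private:
        unsigned& word_;
        unsigned mask_;
    };

    BitRef operator[](std::size_t index) { return BitRef(bits_, index); }
    bool operator[](std::size_t index) const { return ((bits_ >> index) & 1u) != 0; }

private:
    unsigned bits_ = 0;
};
```

Clients still write list[5] = true, but they cannot bind a bool& to the result, so the packed storage stays hidden.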

--
Bob Hairgrove
No**********@Home.com

Mar 19 '06 #15

Daniel T.

In article <16********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:

On Sat, 18 Mar 2006 19:31:25 GMT, "Daniel T."
<po********@earthlink.net> wrote:
For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.

The above is not quite true. The Foos in FooList *must* be objects in
RAM because clients of FooList may keep a pointer/reference to the value
returned, or modify the state of a FooList object by modifying the Foo
returned. That breaks the UAP (clients know that the return value was
not computed, but rather stored,) thus encapsulation is broken.

I'm sorry, but you are wrong. The reference returned may not
necessarily even be a reference (see section 23.2.5, paragraph 2 of
the C++ standard for what it says about
std::vector<bool>::operator[]). This is probably an example of
encapsulation at its best!

By all means, show me some code. I'd love to be proven wrong...

#include <cassert>

class Foo {
    // implement as you see fit.
public:
    int& bar(); // implement as you see fit.
};

int main() {
    Foo f;
    int& i = f.bar();
    i = 1967;
    assert( f.bar() == 1967 );
    i = 1942;
    assert( f.bar() == 1942 );
}

If you can implement the Foo interface such that the int returned is not
stored in RAM and the assertions in main don't abort the program, I'd
love to see how. The learning experience would be wonderful.
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 19 '06 #16

Greg

Roger Lakner wrote:

"Greg" <gr****@pacbell.net> wrote in message
news:11**********************@i40g2000cwc.googlegroups.com...
On the contrary, FooList demonstrates exactly
how a class should encapsulate its data - with private data members and
a public interface. Note that clients cannot access FooList's data
member directly, instead they must invoke methods in FooList's public
interface to access FooList's data. In other words, FooList has
encapsulated its data by mediating all access to it.

I guess I don't see the difference between op[] as instantiated here
and making array public.

There is a huge difference between using operator[] and making the
array public. With a public array, clients can bypass FooList's
interface (and FooList's methods) and obtain the data FooList stores -
directly. And without FooList's interface interposing itself between
its stored data and its clients there is no easy way for FooList to use
a different data storage mechanism in the future, since its clients
will all be relying on the data being stored in an array.

Accessing the data objects through the operator[] is a completely
different story. The operator[] is a function - it is therefore code in
FooList that clients must call in order to retrieve data from a FooList
object. Since FooList can implement operator[] however it likes, it can
get the data it returns from anywhere it likes. And since every client
must call this method to get the data, no client is relying on any
detail of how FooList actually stores its data. In this example FooList
happens to use an array data member - but it need not. For clients,
the overloaded operator[] creates the illusion that FooList is - or has
- an array. But an interface is separate from the implementation. In
fact, with operator[] access, FooList clients have no way of knowing
how FooList actually stores its data.

Therefore we can conclude from these two cases, that FooList properly
encapsulates the details of its data storage implementation when making
public the operator[] but not when making its array data member public.
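
To make that concrete, here is a minimal sketch (my own illustration, not the original poster's code). It keeps the same operator[] signatures but stores the objects in a std::deque and adds a range check. The `value` member of Foo, the `size()` function and the constructor taking a count are assumptions made up for the example.

```cpp
#include <cstddef>
#include <deque>
#include <stdexcept>

struct Foo { int value = 0; };  // hypothetical payload, for illustration only

class FooList
{
public:
    explicit FooList(std::size_t count) : items(count) {}

    // Same signatures a client would see with a plain array member.
    // at() throws std::out_of_range for a bad index.
    const Foo& operator[](std::size_t index) const { return items.at(index); }
    Foo& operator[](std::size_t index) { return items.at(index); }

    std::size_t size() const { return items.size(); }

private:
    // Storage detail: a deque instead of an array. Client code that only
    // uses operator[] does not need to change.
    std::deque<Foo> items;
};
```

Client code such as `list[1].value = 42;` compiles unchanged against either version, which is the property being argued for here.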

Encapsulation makes it possible for FooList to change its underlying
storage model without affecting its clients - and that quality is
the
primary benefit of encapsulation.

I agree that is one of the primary benefits of encapsulation, but so
is data hiding. The data is not being hidden in any robust sense here.
I don't see the difference, in my example, between op[] and making
array public. It's disheartening to me to see that one of the
much-vaunted advantages of C++ over C is so easily, so naturally, so
intuitively and quite often subverted.

It's important to ask what exactly FooList aims to encapsulate - and
the answer is not Foo objects, not by any means. Simply put, FooList
does not encapsulate the Foo objects that it stores. Containment has
nothing to do with encapsulation. Since FooList does not implement Foo,
FooList cannot encapsulate Foo. Foo is an independent class with its
own interface and its own, encapsulated implementation.

So what then does FooList encapsulate? It encapsulates just what it
implements: a storage mechanism for Foo objects. That's it. But
otherwise FooList knows almost nothing about Foo objects. The classes
are not related, so FooList is just a client (and barely one at that)
of Foo.

So does the fact that FooList returns Foo objects break its
encapsulation? Absolutely not. FooList is a container - a container is
expected to return whatever it logically contains. That's why it's
called a container after all. As we can conclude from the FooList
example, a container does not encapsulate the items that it stores -
only the way that it stores them. Encapsulation is not about data
organization or relationships between classes - it describes solely the
relationship (within a single class) between a public interface and an
implementation.

Greg

Mar 19 '06 #17

Bob Hairgrove

On Sun, 19 Mar 2006 14:23:51 GMT, "Daniel T."
<po********@earthlink.net> wrote:

In article <16********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:
On Sat, 18 Mar 2006 19:31:25 GMT, "Daniel T."
<po********@earthlink.net> wrote:
>> For example, if index is out of range, an exception can be thrown. The
>> implementation can also be very complex. Consider that there might not
>> even be a member "array", but operator[] does a database lookup
>> instead (somehow). Or that the real array is held in another class,
>> and FooList holds a pointer or reference to that class to which it
>> forwards the call. There are many possibilities here.
>
>The above is not quite true. The Foos in FooList *must* be objects in
>RAM because clients of FooList may keep a pointer/reference to the value
>returned, or modify the state of a FooList object by modifying the Foo
>returned. That breaks the UAP (clients know that the return value was
>not computed, but rather stored,) thus encapsulation is broken.
I'm sorry, but you are wrong. The reference returned may not
necessarily even be a reference (see section 23.2.5, paragraph 2 of
the C++ standard for what it says about
std::vector<bool>::operator[]). This is probably an example of
encapsulation at its best!

By all means, show me some code. I'd love to be proven wrong...

Look at your favorite STL's implementation of vector<bool>...
class Foo {
    // implement as you see fit.
public:
    int& bar(); // implement as you see fit.
};

int main() {
    Foo f;
    int& i = f.bar();
    i = 1967;
    assert( f.bar() == 1967 );
    i = 1942;
    assert( f.bar() == 1942 );
}

If you can implement the Foo interface such that the int returned is not
stored in RAM and the assertions in main don't abort the program, I'd
love to see how. The learning experience would be wonderful.

First, please read my follow-up post which you probably didn't see
before posting this.

Also, we were talking about operator[] which is overloaded for
const/non-const. They are two separate functions and don't necessarily
have to return the same reference at all (unless you expect them to be
consistent, which needs to have a requirement/business rule).

As to the challenge, those assertions of yours are new requirements,
and I didn't say that the object whose reference is
returned wasn't stored in RAM ... that's not really possible. It just
doesn't have to reference the private data member (array) contained
within the class. It can be a static object or a dummy member
variable, for example. Or you could even implement operator[] to
return a temporary proxy object which has an automatic conversion to
the reference type.

I've actually done this before for an ODBC class library wrapper I
wrote about two or three years ago. The cursor class had overloaded
operator[] which was used for reading and writing to the columns of
the current row. Since each column can have a completely different
data type, it wasn't possible to have it return a reference to the
real type at all, so we returned a proxy class which handled the
actual conversion. The gory details were all in the proxy, but
transparent to the clients. And it worked just fine, although there
was a lot of casting from raw memory buffers going on behind the
scenes. Clients could write stuff like:

OdbcConnection db(/*...*/);
// x is an int, y is a string, and z is a double...
OdbcCursor cr(db, "SELECT x,y,z FROM my_table;");
// prepare or execute query...
while(!cr.EOF()) {
    cr.Edit();
    cr[0] = 123;
    cr[1] = "some string";
    cr[2] = 3.14159;
    cr.Update();
    cr.Next();
}
cr.Close();

Operator[] was also overloaded to take a string argument so that
columns could be accessed by name, of course. I just used the above
for illustration purposes. (Now if I had been a little more STL savvy
at the time, I would have probably implemented iterators for
OdbcCursor...)
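
For anyone who wants to see the shape of such a proxy, here is a stripped-down sketch. It is not the ODBC wrapper itself: Row, Cell and ColumnProxy are hypothetical names invented for this post, and the "database" is just a vector in memory. The point is only that operator[] returns a small object by value, and that object decides what an assignment means.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one typed column value.
struct Cell
{
    enum Kind { Empty, Int, Double, Text };
    Kind kind = Empty;
    int i = 0;
    double d = 0.0;
    std::string s;
};

class Row
{
public:
    explicit Row(std::size_t columns) : cells_(columns) {}

    // Returned by operator[]; handles the per-type conversion on assignment.
    class ColumnProxy
    {
    public:
        explicit ColumnProxy(Cell& cell) : cell_(cell) {}

        ColumnProxy& operator=(int v)         { cell_.kind = Cell::Int;    cell_.i = v; return *this; }
        ColumnProxy& operator=(double v)      { cell_.kind = Cell::Double; cell_.d = v; return *this; }
        ColumnProxy& operator=(const char* v) { cell_.kind = Cell::Text;   cell_.s = v; return *this; }

        Cell::Kind kind() const { return cell_.kind; }
        int asInt() const { return cell_.i; }
        double asDouble() const { return cell_.d; }
        const std::string& asText() const { return cell_.s; }

    private:
        Cell& cell_;
    };

    // Returns the proxy by value, not a reference to any stored type.
    ColumnProxy operator[](std::size_t index) { return ColumnProxy(cells_.at(index)); }

private:
    std::vector<Cell> cells_;
};
```

With this, `row[0] = 123; row[1] = "some string"; row[2] = 3.14159;` all compile, even though the three columns hold different types.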

Even then, Foo::bar() doesn't necessarily have to return the same
value that you assign to i; it can even silently change i inside its
implementation...which is why we need to document the fact or disallow
keeping a pointer or reference to the object returned. Think about
vector<bool>::operator[]. It must act like a reference to bool, yet we
know that it is not possible to take the address of a single bit.
(Hmmm ... how do they do it??)
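
A short sketch of the answer, using only the standard library: for a non-const vector<bool>, operator[] returns a class type, std::vector<bool>::reference, which stands in for the bit. The function name below is made up for the example.

```cpp
#include <type_traits>
#include <vector>

bool set_and_read_bit()
{
    std::vector<bool> bits(8, false);

    // The return type is a proxy class, not bool&.
    static_assert(!std::is_same<decltype(bits[0]), bool&>::value,
                  "operator[] does not return bool&");
    static_assert(std::is_same<decltype(bits[0]),
                               std::vector<bool>::reference>::value,
                  "operator[] returns vector<bool>::reference");

    bits[3] = true;      // assignment goes through the proxy
    bool b = bits[3];    // the proxy converts to bool
    // bool* p = &bits[3];  // would not compile: there is no bool object
    return b && !bits[2];
}
```

So client code that only reads and assigns through operator[] never notices, while code that tries to keep a bool* or bool& to an element is rejected by the compiler.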

I will show you another example of what I am saying. Consider having a
class which controls access to elements of a vector -- conceptually
speaking, that is. Internally it doesn't have to be a vector at all
unless maybe there are O(1) time random-access requirements on it. We
want to grant access according to user permissions. You want some
users to be able to read and write all values via operator[], but
limit what other users with fewer permissions can read and write.
Furthermore, the stored values are encrypted, but you want to read and
write plain-text (e.g. passwords). Also, you want each user to think
that they are accessing values at index 0..n, where in reality these
are located just about anywhere in the vector or might be looked up
somewhere else. Finally, in order to prohibit doing pointer arithmetic
to access elements, just in case we are using a vector, we use proxy
elements which serialize access (i.e. one user at a time) by
implementing a locking mechanism. This prevents clients who ARE being
sneaky and holding a reference to what Foo::bar() returns from
accessing the next user's password or code -- we allocate the dummy or
proxy object dynamically for one call only, then delete it and
reallocate it on the next call, so that the previous references or
pointers would dangle (of course, we will have to document that
somewhere...<g>)

Shall I continue? I think you get where I am going... but if you want,
I'll take a few more minutes and post some code. It's not hard, but
I'm a little lazy. <g> Somehow I think we aren't really talking about
the same thing, though...

--
Bob Hairgrove
No**********@Home.com

Mar 19 '06 #18

Daniel T.

In article <r3********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:

On Sun, 19 Mar 2006 08:36:33 +0100, Bob Hairgrove
<in*****@bigfoot.com> wrote:
On Sat, 18 Mar 2006 19:31:25 GMT, "Daniel T."
<po********@earthlink.net> wrote:
For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.

The above is not quite true. The Foos in FooList *must* be objects in
RAM because clients of FooList may keep a pointer/reference to the value
returned, or modify the state of a FooList object by modifying the Foo
returned. That breaks the UAP (clients know that the return value was
not computed, but rather stored,) thus encapsulation is broken.

I'm sorry, but you are wrong. The reference returned may not
necessarily even be a reference (see section 23.2.5, paragraph 2 of
the C++ standard for what it says about
std::vector<bool>::operator[]). This is probably an example of
encapsulation at its best!

I would like to amend this a little ... of course, for the example
given by the OP, you are correct. I was only trying to illustrate that
an implementation of operator[] can be done in other, non-trivial
ways, and that having an operator[] which returns a non-const lvalue
doesn't necessarily break encapsulation.

But let's also consider that it would be perfectly legal for
operator[] to return a reference to a static object or a dummy member
variable which acts as a proxy for the real array element. One could
then document this fact somewhere so that clients would know not to
attempt to store a pointer or reference to the object.

// assume the functions below update some file or database such that
// assigning a value to foo.bar() calls update() and retrieving a value
// from foo.bar() calls get()
void update( unsigned id, int i );
int get( unsigned id );

class Foo {
public:
    Foo( unsigned id );
    ~Foo();

    int& bar();
};

int main() {
    Foo foo1( 1 );
    foo1.bar() = 1963;
    Foo foo2( 1 ); // note, same ID
    assert( foo2.bar() == 1963 );
    assert( get( 1 ) == 1963 );
}

Implement Foo however you see fit such that the asserts in main won't
fire...
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 19 '06 #19

Daniel T.

In article <th********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:

On Sun, 19 Mar 2006 14:23:51 GMT, "Daniel T."
<po********@earthlink.net> wrote:
In article <16********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:
On Sat, 18 Mar 2006 19:31:25 GMT, "Daniel T."
<po********@earthlink.net> wrote:

>> For example, if index is out of range, an exception can be thrown. The
>> implementation can also be very complex. Consider that there might not
>> even be a member "array", but operator[] does a database lookup
>> instead (somehow). Or that the real array is held in another class,
>> and FooList holds a pointer or reference to that class to which it
>> forwards the call. There are many possibilities here.
>
>The above is not quite true. The Foos in FooList *must* be objects in
>RAM because clients of FooList may keep a pointer/reference to the value
>returned, or modify the state of a FooList object by modifying the Foo
>returned. That breaks the UAP (clients know that the return value was
>not computed, but rather stored,) thus encapsulation is broken.

I'm sorry, but you are wrong. The reference returned may not
necessarily even be a reference (see section 23.2.5, paragraph 2 of
the C++ standard for what it says about
std::vector<bool>::operator[]). This is probably an example of
encapsulation at its best!
By all means, show me some code. I'd love to be proven wrong...

Look at your favorite STL's implementation of vector<bool>...
class Foo {
    // implement as you see fit.
public:
    int& bar(); // implement as you see fit.
};

int main() {
    Foo f;
    int& i = f.bar();
    i = 1967;
    assert( f.bar() == 1967 );
    i = 1942;
    assert( f.bar() == 1942 );
}

If you can implement the Foo interface such that the int returned is not
stored in RAM and the assertions in main don't abort the program, I'd
love to see how. The learning experience would be wonderful.

First, please read my follow-up post which you probably didn't see
before posting this.

Also, we were talking about operator[] which is overloaded for
const/non-const. They are two separate functions and don't necessarily
have to return the same reference at all (unless you expect them to be
consistent, which needs to have a requirement/business rule).

As to the challenge, those assertions of yours are new requirements,
and I didn't say that the object whose reference is
returned wasn't stored in RAM ... that's not really possible.

And there you go. You might want to look up the UAP (which is what I
said a reference return breaks.) The Uniform Access Principle says, "All
services offered by a module should be available through a uniform
notation, which does not betray whether they are implemented through
storage or through computation."

Obviously, and by your own admission, returning a reference from a
member-function betrays whether the return value is implemented through
storage or through computation; you can't switch from one to the other.
Sure, you can go through all kinds of contortions to try to keep the RAM
location returned synchronized with some computation, and if the client
only uses the reference in a prescribed set of ways, your contortions
will work, but it is so much easier to just follow the UAP in the first
place.

Shall I continue? I think you get where I am going... but if you want,
I'll take a few more minutes and post some code. It's not hard, but
I'm a little lazy. <g> Somehow I think we aren't really talking about
the same thing, though...

No need for you to continue, I think we are basically agreeing, it's
just that you are trying to put a more positive spin on reference
returns than I.
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 19 '06 #20

Daniel T.

In article <11**********************@i40g2000cwc.googlegroups.com>,
"Greg" <gr****@pacbell.net> wrote:

Roger Lakner wrote:
"Greg" <gr****@pacbell.net> wrote in message
news:11**********************@i40g2000cwc.googlegroups.com...
On the contrary, FooList demonstrates exactly
how a class should encapsulate its data - with private data members
and
a public interface. Note that clients cannot access FooList's data
member directly, instead they must invoke methods in FooList's
public
interface to access FooList's data. In other words, FooList has
encapsulated its data by mediating all access to it.

I guess I don't see the difference between op[] as instantiated here
and making array public.

There is a huge difference between using operator[] and making the
array public. With a public array, clients can bypass FooList's
interface (and FooList's methods) and obtain the data FooList stores -
directly. And without FooList's interface interposing itself between
its stored data and its clients there is no easy way for FooList to use
a different data storage mechanism in the future, since its clients
will all be relying on the data being stored in an array.

Accessing the data objects through the operator[] is a completely
different story. The operator[] is a function - it is therefore code in
FooList that clients must call in order to retrieve data from a FooList
object. Since FooList can implement operator[] however it likes, it can
get the data it returns from anywhere it likes. And since every client
must call this method to get the data, no client is relying on any
detail of how FooList actually stores its data. In this example FooList
happens to use an array data member - but it need not. For clients,
the overloaded operator[] creates the illusion that FooList is - or has
- an array. But an interface is separate from the implementation. In
fact, with operator[] access, FooList clients have no way of knowing
how FooList actually stores its data.

Please be more clear. The above is simply not the case, as I have already
shown, *unless* we document that clients of FooList only use the op[] to
access the data. In other words, what keeps the encapsulation intact is
the documentation, not the function.

I agree that is one of the primary benefits of encapsulation, but so
is data hiding. The data is not being hidden in any robust sense here.
I don't see the difference, in my example, between op[] and making
array public. It's disheartening to me to see that one of the
much-vaunted advantages of C++ over C is so easily, so naturally, so
intuitively and quite often subverted.

It's important to ask what exactly FooList aims to encapsulate - and
the answer is not Foo objects, not by any means. Simply put, FooList
does not encapsulate the Foo objects that it stores. Containment has
nothing to do with encapsulation. Since FooList does not implement Foo,
FooList cannot encapsulate Foo. Foo is an independent class with its
own interface and its own, encapsulated implementation.

So what then does FooList encapsulate? It encapsulates just what it
implements: a storage mechanism for Foo objects. That's it. But
otherwise FooList knows almost nothing about Foo objects. The classes
are not related, so FooList is just a client (and barely one at that)
of Foo.

So does the fact that FooList returns Foo objects break its
encapsulation? Absolutely not. FooList is a container - a container is
expected to return whatever it logically contains. That's why it's
called a container after all. As we can conclude from the FooList
example, a container does not encapsulate the items that it stores -
only the way that it stores them. Encapsulation is not about data
organization or relationships between classes - it describes solely the
relationship (within a single class) between a public interface and an
implementation.

Now the above I think is a good write-up, and one that I can generally
agree with. Except as follows, the FooList's sole responsibility is to
ensure that the Foos it holds have their destructors called at the
appropriate time (as the last thing that happens to the Foo object
contained.) So we must ask ourselves, can FooList make such a guarantee
with the interface provided? Well, if some member-function returns a
pointer/reference to one of the contained Foos, it can't. It's up to
the clients of FooList to refrain from using the free access that
FooList provides.

So yes, encapsulation is broken, but that (as I have already explained)
is not necessarily a bad thing, because FooList doesn't really own the
objects it contains anyway... Ultimately, the responsibility is on the
owner of the Foos.
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 19 '06 #21

Bob Hairgrove

On Sun, 19 Mar 2006 16:49:12 GMT, "Daniel T."
<po********@earthlink.net> wrote:

And there you go. You might want to look up the UAP (which is what I
said a reference return breaks.) The Uniform Access Principle says, "All
services offered by a module should be available through a uniform
notation, which does not betray whether they are implemented through
storage or through computation."

If you want to be consistent with this, then even having an assignment
operator "breaks encapsulation", as you say. Don't you think you are
taking this a little too far?

--
Bob Hairgrove
No**********@Home.com

Mar 19 '06 #22

Daniel T.

In article <93********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:

On Sun, 19 Mar 2006 16:49:12 GMT, "Daniel T."
<po********@earthlink.net> wrote:
And there you go. You might want to look up the UAP (which is what I
said a reference return breaks.) The Uniform Access Principle says, "All
services offered by a module should be available through a uniform
notation, which does not betray whether they are implemented through
storage or through computation."

If you want to be consistent with this, then even having an assignment
operator "breaks encapsulation", as you say. Don't you think you are
taking this a little too far?

How does having an assignment operator break the UAP? You have me
stumped...

class Range {
    // invariant: range() == high() - low()
public:
    int low() const;
    int high() const;
    int range() const;
    Range& operator=( const Range& );
    void low( int v );
    void high( int v );
    void range( int v );
};

The op= above in no way tells me what values are stored in range
objects, and what values are computed. In fact, none of low, high, or
range may be stored in RAM, all three of them may be computed.
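
One possible implementation, as a sketch only: here low and high are stored and range is computed, but storing low and range and computing high would satisfy the same interface. The two-argument constructor and the choice that range(v) moves high while keeping low fixed are assumptions of mine, not part of the interface above; the compiler-generated op= is used.

```cpp
class Range
{
    // invariant: range() == high() - low()
public:
    Range(int low, int high) : low_(low), high_(high) {}

    int low() const { return low_; }
    int high() const { return high_; }
    int range() const { return high_ - low_; }  // computed, not stored

    void low(int v) { low_ = v; }
    void high(int v) { high_ = v; }
    void range(int v) { high_ = low_ + v; }     // keeps low, moves high

private:
    int low_;
    int high_;
};
```

Nothing in the public signatures reveals which two of the three values are the stored ones, and assignment copies whatever happens to be stored.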

--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 20 '06 #23

Roger Lakner

"Bob Hairgrove" <in*****@bigfoot.com> wrote in message
news:og********************************@4ax.com...

[top-posting corrected]
It does leave room for later modifications to the implementation
without having to change the interface. Often, you will see things
like these seemingly silly "do-nothing" implementations in the
pre-production stages of code. When the preliminary testing phase
passes, programmers can "harden up" the code by adding stuff to the
implementation body. Clients only see the headers, so they wouldn't
have to recompile their own code.

Yes, that makes sense to me. But, supposedly, std::vector is no longer
in the early stages of development. And, though your exchange with
Daniel T. was a little over my head, it was very instructive. Thanks
for your response.

Roger

Mar 20 '06 #24

Roger Lakner

"Greg" <gr****@pacbell.net> wrote in message
news:11**********************@i40g2000cwc.googlegroups.com...

There is a huge difference between using operator[] and making the
array public. With a public array, clients can bypass FooList's
interface (and FooList's methods) and obtain the data FooList
stores -
directly.

I guess I don't see, in practical terms, the difference. If an address
is returned, without any bounds checking as in my example, then
supposedly one could access any of the data FooList stores and,
consequently, any of the data Foo stores, bypassing FooList's
interface. Or perhaps I'm not understanding. Perhaps an example would
help.

Roger

Mar 20 '06 #25

Greg

Roger Lakner wrote:

"Greg" <gr****@pacbell.net> wrote in message
news:11**********************@i40g2000cwc.googlegroups.com...
There is a huge difference between using operator[] and making the
array public. With a public array, clients can bypass FooList's
interface (and FooList's methods) and obtain the data FooList
stores -
directly.

I guess I don't see, in practical terms, the difference. If an address
is returned, without any bounds checking as in my example, then
supposedly one could access any of the data FooList stores and,
consequently, any of the data Foo stores, bypassing FooList's
interface. Or perhaps I'm not understanding. Perhaps an example would
help.

Sure, let's have FooList provide the array interface and have no
persistent storage at all:

struct Foo { };

struct FooList
{
    Foo* operator[](int index);
};

Foo *FooList::operator[](int index)
{
    return new Foo;
}

int main()
{
    FooList fooList;

    Foo *f1 = fooList[3];
    Foo *f2 = fooList[12];
    Foo *f3 = fooList[15];
}

Now clients can still "retrieve" Foo objects from fooList - even though
fooList has no array at all - it simply returns a new object at any
index.

A more realistic example would have fooList obtain the objects from
disk or over a network - but the point is that the code in main() can
treat fooList as if it were an array. But fooList does not need to use
an array in its implementation - but can store the objects however it
likes, or, as in this example, not store them at all.

Greg

Mar 20 '06 #26

Greg

Daniel T. wrote:

In article <r3********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:
On Sun, 19 Mar 2006 08:36:33 +0100, Bob Hairgrove
<in*****@bigfoot.com> wrote:
On Sat, 18 Mar 2006 19:31:25 GMT, "Daniel T."
<po********@earthlink.net> wrote:

>> For example, if index is out of range, an exception can be thrown. The
>> implementation can also be very complex. Consider that there might not
>> even be a member "array", but operator[] does a database lookup
>> instead (somehow). Or that the real array is held in another class,
>> and FooList holds a pointer or reference to that class to which it
>> forwards the call. There are many possibilities here.
>
>The above is not quite true. The Foos in FooList *must* be objects in
>RAM because clients of FooList may keep a pointer/reference to the value
>returned, or modify the state of a FooList object by modifying the Foo
>returned. That breaks the UAP (clients know that the return value was
>not computed, but rather stored,) thus encapsulation is broken.

I'm sorry, but you are wrong. The reference returned may not
necessarily even be a reference (see section 23.2.5, paragraph 2 of
the C++ standard for what it says about
std::vector<bool>::operator[]). This is probably an example of
encapsulation at its best!

I would like to amend this a little ... of course, for the example
given by the OP, you are correct. I was only trying to illustrate that
an implementation of operator[] can be done in other, non-trivial
ways, and that having an operator[] which returns a non-const lvalue
doesn't necessarily break encapsulation.

But let's also consider that it would be perfectly legal for
operator[] to return a reference to a static object or a dummy member
variable which acts as a proxy for the real array element. One could
then document this fact somewhere so that clients would know not to
attempt to store a pointer or reference to the object.

// assume the functions below update some file or database such that
// assigning a value to foo.bar() calls update() and retrieving a value
// from foo.bar() calls get()
void update( unsigned id, int i );
int get( unsigned id );

class Foo {
public:
    Foo( unsigned id );
    ~Foo();

    int& bar();
};

int main() {
    Foo foo1( 1 );
    foo1.bar() = 1963;
    Foo foo2( 1 ); // note, same ID
    assert( foo2.bar() == 1963 );
    assert( get( 1 ) == 1963 );
}

Implement Foo however you see fit such that the asserts in main won't
fire...

OK, I did:

#include <cassert>
#include <map>

class Foo
{
public:
    Foo( unsigned id ) : mIndex(id)
    {
    }

    ~Foo() {}

    // returns a reference into the shared map, keyed by this object's id
    int& bar()
    {
        return implementations[mIndex].value;
    }
    int mIndex;

    struct FooImpl { int value; };
    static std::map<int, FooImpl> implementations;
};

std::map<int, Foo::FooImpl> Foo::implementations;

void update( unsigned id, int i )
{
    Foo::implementations[id].value = i;
}

int get( unsigned id )
{
    return Foo::implementations[id].value;
}

int main()
{
    Foo foo1( 1 );
    foo1.bar() = 1963;
    Foo foo2( 1 ); // note, same ID
    assert( foo2.bar() == 1963 );
    assert( get( 1 ) == 1963 );
}

Greg

Mar 20 '06 #27

Greg

Daniel T. wrote:

In article <fn********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:
On Fri, 17 Mar 2006 21:57:20 -0800, "Roger Lakner"
<rl*****@adelphia.net> wrote:
I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
    const Foo& operator[] (unsigned index) const { return array[index]; }
    Foo& operator[] (unsigned index) { return array[index]; }
private:
    Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation. Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?
Please pitch responses to someone whose level of knowledge is about
one year of C++ experience.

Why do you say that it wrecks encapsulation? Admittedly, the
implementation shown above doesn't buy you anything over direct public
access to the array member variable, but that doesn't mean that it
can't be done differently.

For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.

The above is not quite true. The Foos in FooList *must* be objects in
RAM because clients of FooList may keep a pointer/reference to the value
returned, or modify the state of a FooList object by modifying the Foo
returned. That breaks the UAP (clients know that the return value was
not computed, but rather stored,) thus encapsulation is broken.

Of course the client knows that any Foo object that FooList returns is
in memory, because that is exactly where the client (and not the
FooList object) allocated it - and did so before the Foo object was
ever stored in a FooList container. A container is a storage class, it
never "computes" a contained item that it returns, because the items it
contains are all provided by the client.

The Uniform Access Principle is not relevant to containers, because it
applies to a (read-only) attribute of an object. The items that a
container contains are not its attributes (nor are the contents often
read-only) - they are independent objects that can exist (and in fact
are created) outside of any container. In fact a FooList has no
client-accessible attributes which means that it perfectly encapsulates
its internal implementation.

Greg

Mar 20 '06 #28

Daniel T.

In article <11**********************@j33g2000cwa.googlegroups.com>,
"Greg" <gr****@pacbell.net> wrote:

Roger Lakner wrote:
"Greg" <gr****@pacbell.net> wrote in message
news:11**********************@i40g2000cwc.googlegroups.com...
There is a huge difference between using operator[] and making the
array public. With a public array, clients can bypass FooList's
interface (and FooList's methods) and obtain the data FooList
stores -
directly.

I guess I don't see, in practical terms, the difference. If an address
is returned, without any bounds checking as in my example, then
supposedly one could access any of the data FooList stores and,
consequently, any of the data Foo stores, bypassing FooList's
interface. Or perhaps I'm not understanding. Perhaps an example would
help.

Sure, let's have FooList provide the array interface and have no
persistent storage at all:

struct Foo { };

struct FooList
{
    Foo* operator[](int index);
};

Foo *FooList::operator[](int index)
{
    return new Foo;
}

int main()
{
    FooList fooList;

    Foo *f1 = fooList[3];
    Foo *f2 = fooList[12];
    Foo *f3 = fooList[15];
}

Now clients can still "retrieve" Foo objects from fooList - even though
fooList has no array at all - it simply returns a new object at any
index.

A more realistic example would have fooList obtain the objects from
disk or over a network - but the point is that the code in main() can
treat fooList as if it were an array. But fooList does not need to use
an array in its implementation - but can store the objects however it
likes, or, as in this example, not store them at all.

Minor nit, Greg: you changed the signature of op[] to return a pointer,
not a reference. Major nit: you changed the semantics of op[]; the
defining characteristic of op[] is invalid in your code, namely that
consecutive calls to op[] with the same index will return the same
object.
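
The two positions can be reconciled. A minimal sketch, assuming Foo
objects can be created on demand: keep the array-like interface with no
fixed array behind it, but cache each object so that the same index
always yields the same object by reference. LazyFooList, its cache
member and sameObjectEachTime are hypothetical names, not part of the
code quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

struct Foo
{
    int value;
    Foo() : value(0) {}
};

// Hypothetical sketch: Foo objects are created on first access and cached,
// so repeated calls with the same index refer to the same object.
class LazyFooList
{
public:
    Foo& operator[](std::size_t index)
    {
        // std::map::operator[] default-constructs a Foo the first time an
        // index is seen; references to map elements stay valid afterwards.
        return cache[index];
    }

private:
    std::map<std::size_t, Foo> cache;
};

// Usage: a write through the returned reference is seen by a later call.
inline bool sameObjectEachTime()
{
    LazyFooList list;
    Foo& first = list[3];
    first.value = 42;
    assert(list[3].value == 42);
    return &list[3] == &first;
}
```

The client code still reads like array access, and the class remains free
to replace the map with a disk or network lookup that fills the cache.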
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 20 '06 #29

Daniel T.

In article <11**********************@i40g2000cwc.googlegroups.com>,
"Greg" <gr****@pacbell.net> wrote:

Daniel T. wrote:

The above is not quite true. The Foos in FooList *must* be objects in
RAM because clients of FooList may keep a pointer/reference to the value
returned, or modify the state of a FooList object by modifying the Foo
returned. That breaks the UAP (clients know that the return value was
not computed, but rather stored,) thus encapsulation is broken.

Of course the client knows that any Foo object that FooList returns is
in memory, because that is exactly where the client (and not the
FooList object) allocated it - and did so before the Foo object was
ever stored in a FooList container. A container is a storage class, it
never "computes" a contained item that it returns, because the items it
contains are all provided by the client.

The Uniform Access Principle is not relevant to containers, because it
applies to a (read-only) attribute of an object. The items that a
container contains are not its attributes (nor are the contents often
read-only) - they are independent objects that can exist (and in fact
are created) outside of any container. In fact a FooList has no
client-accessible attributes which means that it perfectly encapsulates
its internal implementation.

By George, I think Greg's got it! This is basically what I have been
saying from the beginning. UAP is broken, but it's OK in some cases...
See my first post in this thread.
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 20 '06 #30

Daniel T.

In article <11**********************@u72g2000cwu.googlegroups.com>,
"Greg" <gr****@pacbell.net> wrote:

Daniel T. wrote:
In article <r3********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:
On Sun, 19 Mar 2006 08:36:33 +0100, Bob Hairgrove
<in*****@bigfoot.com> wrote:

>On Sat, 18 Mar 2006 19:31:25 GMT, "Daniel T."
><po********@earthlink.net> wrote:
>
>>> For example, if index is out of range, an exception can be thrown. The
>>> implementation can also be very complex. Consider that there might not
>>> even be a member "array", but operator[] does a database lookup
>>> instead (somehow). Or that the real array is held in another class,
>>> and FooList holds a pointer or reference to that class to which it
>>> forwards the call. There are many possibilities here.
>>
>>The above is not quite true. The Foos in FooList *must* be objects in
>>RAM because clients of FooList may keep a pointer/reference to the value
>>returned, or modify the state of a FooList object by modifying the Foo
>>returned. That breaks the UAP (clients know that the return value was
>>not computed, but rather stored,) thus encapsulation is broken.
>
>I'm sorry, but you are wrong. The reference returned may not
>necessarily even be a reference (see section 23.2.5, paragraph 2 of
>the C++ standard for what it says about
>std::vector<bool>::operator[]). This is probably an example of
>encapsulation at its best!

I would like to amend this a little ... of course, for the example
given by the OP, you are correct. I was only trying to illustrate that
an implementation of operator[] can be done in other, non-trivial
ways, and that having an operator[] which returns a non-const lvalue
doesn't necessarily break encapsulation.

But let's also consider that it would be perfectly legal for
operator[] to return a reference to a static object or a dummy member
variable which acts as a proxy for the real array element. One could
then document this fact somewhere so that clients would know not to
attempt to store a pointer or reference to the object.

// assume the functions below update some file or database such that
// assigning a value to foo.bar() calls update() and retrieving a value
// from foo.bar() calls get()
void update( unsigned id, int i );
int get( unsigned id );

class Foo {
public:
Foo( unsigned id );
~Foo();

int& bar();
};

int main() {
Foo foo1( 1 );
foo1.bar() = 1963;
Foo foo2( 1 ); // note, same ID
assert( foo2.bar() == 1963 );
assert( get( 1 ) == 1963 );
}

Implement Foo however you see fit such that the asserts in main won't
fire...

OK, I did:

#include <cassert>
#include <map>

class Foo
{
public:
    Foo( unsigned id ) : mIndex(id)
    {
    }

    ~Foo() {}

    // Returns a reference to the value stored for this ID.
    int& bar()
    {
        return implementations[mIndex].value;
    }
    unsigned mIndex;

    struct FooImpl { int value; };
    // Shared by all Foo objects, so two Foos with the same ID see one value.
    static std::map<unsigned, FooImpl> implementations;
};

std::map<unsigned, Foo::FooImpl> Foo::implementations;

void update( unsigned id, int i )
{
    Foo::implementations[id].value = i;
}

int get( unsigned id )
{
    return Foo::implementations[id].value;
}

int main()
{
    Foo foo1( 1 );
    foo1.bar() = 1963;
    Foo foo2( 1 ); // note, same ID
    assert( foo2.bar() == 1963 );
    assert( get( 1 ) == 1963 );
}

Greg

By storing the value in RAM... This is the only way it can be done and
is my point exactly.
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 20 '06 #31

Daniel T.

In article <po******************************@news.east.earthlink.net>,
"Daniel T." <po********@earthlink.net> wrote:

In article <11**********************@i40g2000cwc.googlegroups.com>,
"Greg" <gr****@pacbell.net> wrote:
Daniel T. wrote:

The above is not quite true. The Foos in FooList *must* be objects in
RAM because clients of FooList may keep a pointer/reference to the value
returned, or modify the state of a FooList object by modifying the Foo
returned. That breaks the UAP (clients know that the return value was
not computed, but rather stored,) thus encapsulation is broken.

Of course the client knows that any Foo object that FooList returns is
in memory, because that is exactly where the client (and not the
FooList object) allocated it - and did so before the Foo object was
ever stored in a FooList container. A container is a storage class, it
never "computes" a contained item that it returns, because the items it
contains are all provided by the client.

The Uniform Access Principle is not relevant to containers, because it
applies to a (read-only) attribute of an object. The items that a
container contains are not its attributes (nor are the contents often
read-only) - they are independent objects that can exist (and in fact
are created) outside of any container. In fact a FooList has no
client-accessible attributes which means that it perfectly encapsulates
its internal implementation.

By George, I think Greg's got it! This is basically what I have been
saying from the beginning. UAP is broken, but it's OK in some cases...
See my first post in this thread.

One last nail in the coffin. Don't believe me, believe Scott Meyers. In
"Effective C++" items 29 and 30 he says the same thing in a different
way, "Avoid returning "handles" to internal data." and "Avoid member
functions that return non-const pointers or references to members less
accessible than themselves."

The question in this case is... Are the Foos that a FooList contains
really "less accessible" than the FooList?
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 20 '06 #32

Bob Hairgrove

On Mon, 20 Mar 2006 17:21:01 GMT, "Daniel T."
<po********@earthlink.net> wrote:

One last nail in the coffin. Don't believe me, believe Scott Meyers. In
"Effective C++" items 29 and 30 he says the same thing in a different
way, "Avoid returning "handles" to internal data." and "Avoid member
functions that return non-const pointers or references to members less
accessible than themselves."

You said earlier: "returning a reference always breaks encapsulation".
Scott Meyers is saying something entirely different here. As I tried
to point out before, the reference is only dangerous as long as it is
a reference to an object which could change the state of FooList. And
it doesn't have to be. That is why it is important to have an
interface, even if it involves returning a reference -- because it
could be a non-state-changing object to which it refers.

Don Box, in the first chapter of his excellent book "Essential COM",
has pointed out very succinctly the fact that C++ does not provide
encapsulation at the binary level. So the question remains: in what
context do you put the UAP as far as C++ is concerned?

The question in this case is... Are the Foos that a FooList contains
really "less accessible" than the FooList?

They might or might not be. It all depends on the total design. And
that includes the documentation.

Let's talk about documentation. I strongly believe that the
documentation for a class or framework can be just as much a part of
the design as the source code itself. Consider the case of
std::vector. The C++ standard makes the guarantee that on a conforming
implementation, elements of a vector are stored contiguously in
memory. This gives us additional information which could play an
important role in how elements of the vector are accessed. Give me a
pointer or iterator to the first element, and I can access all the
rest. The same could hold true for std::list or whatever other
container is available where iterators are defined.

But this doesn't hold true for vector<bool>. How do I know? The only
reason is that I read about it in the C++ standard. I consider that
"documentation" as well.
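
A small sketch of what that wording means in practice, offered as an
illustration rather than as part of the original post. The function
name writeThroughProxy is made up; std::vector<bool>::reference is the
proxy class the standard library defines for this specialization.

```cpp
#include <cassert>
#include <vector>

// Returns true if writing through the object returned by operator[]
// changes the element, even though that object is not a bool&.
inline bool writeThroughProxy()
{
    std::vector<bool> flags(8, false);
    assert(!flags[3]);

    // reference is a class type defined by vector<bool>, not bool&.
    std::vector<bool>::reference bit = flags[3];
    bit = true;  // the proxy forwards the assignment to the stored bit

    // By contrast, "bool* p = &flags[0];" does not compile, because
    // operator[] returns a temporary proxy object rather than an lvalue.
    return flags[3] && !flags[2];
}
```

Client code that only reads and assigns elements cannot tell the
difference; code that relies on contiguous bool storage can.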

As Bjarne Stroustrup says, C++ wasn't designed to be non-hackable.
Things like encapsulation and public/private access were meant to
avoid errors, not to prevent malicious coding. So if what you mean by
adherence to UAP is to avoid the latter, C++ isn't going to do it. And
if it is the former, then we all have to play by the rules -- which
includes reading the documentation.

--
Bob Hairgrove
No**********@Home.com

Mar 20 '06 #33

Bob Hairgrove

On Mon, 20 Mar 2006 13:26:11 GMT, "Daniel T."
<po********@earthlink.net> wrote:

the
defining characteristic of op[] is invalid in your code, namely that
consecutive calls to op[] with the same index will return the same
object.

Who says?

--
Bob Hairgrove
No**********@Home.com

Mar 20 '06 #34

Noah Roberts

Greg wrote:

Roger Lakner wrote:
"Greg" <gr****@pacbell.net> wrote in message
news:11**********************@i40g2000cwc.googlegroups.com...
There is a huge difference between using operator[] and making the
array public. With a public array, clients can bypass FooList's
interface (and FooList's methods) and obtain the data FooList
stores -
directly.

I guess I don't see, in practical terms, the difference. If an address
is returned, without any bounds checking as in my example, then
supposedly one could access any of the data FooList stores and,
consequently, any of the data Foo stores, bypassing FooList's
interface. Or perhaps I'm not understanding. Perhaps an example would
help.

Sure, let's have FooList provide the array interface and have no
persistent storage at all:

struct Foo { };

struct FooList
{
Foo* operator[](int index);
};
Foo *FooList::operator[](int index)
{
return new Foo;
}

int main()
{
FooList fooList;

Foo *f1 = fooList[3];
Foo *f2 = fooList[12];
Foo *f3 = fooList[15];
}

Now clients can still "retrieve" Foo objects from fooList - even though
fooList has no array at all - it simply returns a new object at any
index.

A more realistic example would have fooList obtain the objects from
disk or over a network - but the point is that the code in main() can
treat fooList as if it were an array. But fooList does not need to use
an array in its implementation - but can store the objects however it
likes, or, as in this example, not store them at all.

Whereas the original example, which returned references, did not break
encapsulation, this one does.

Foo * f1 = fooList[3];
Foo * f2 = f1 + 2;

This would be ok for a client of the class with an internal array but
not otherwise. The class's interface does nothing to suggest that
would not be valid. On the other hand, returning a reference does. It
says that this is not something that should be considered a pointer so
don't try anything pointerish with it. A client could still do
something similar with references:

Foo & f1 = fooList[3];
Foo * f2 = &f1 + 2;

You can't stop that kind of thing really, unless you return a reference
proxy. There may be reason enough to do so. However, this is
obviously much more hackish and there can be no doubt in the mind of
the developer that they are purposefully breaking encapsulation for
some stupid reason. With a pointer return this is not so obvious.

The preferable interface of course returns copies, or proxies that act
like copies, perhaps backed by a copy-on-write, reference-counted mechanism.
With a proxy you could even change the semantics of the class to allow
the creation of new Foo's because maybe the Foo is something on some
other server and it needs to be copied. Both references and pointers
pose problems for this need.
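
One way such a proxy could look, as a rough sketch rather than a
definitive design. Store, IntList, Ref and proxyRoundTrip are
hypothetical names; Store stands in for whatever file, database or
server actually holds the values.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical backing store standing in for a file, database or server.
class Store
{
public:
    int get(std::size_t id) const
    {
        std::map<std::size_t, int>::const_iterator it = data.find(id);
        return it == data.end() ? 0 : it->second;
    }
    void update(std::size_t id, int value) { data[id] = value; }

private:
    std::map<std::size_t, int> data;
};

class IntList
{
public:
    // Proxy returned by operator[]: reads call get(), writes call update().
    class Ref
    {
    public:
        Ref(Store& s, std::size_t i) : store(s), id(i) {}
        operator int() const { return store.get(id); }
        Ref& operator=(int value)
        {
            store.update(id, value);
            return *this;
        }
        // Assigning one proxy to another copies the value, not the proxy.
        Ref& operator=(const Ref& other)
        {
            store.update(id, other.store.get(other.id));
            return *this;
        }

    private:
        Store& store;
        std::size_t id;
    };

    Ref operator[](std::size_t index) { return Ref(store, index); }
    int operator[](std::size_t index) const { return store.get(index); }

private:
    Store store;
};

// Usage: element access reads like an array, yet every read and write
// goes through the Store interface.
inline bool proxyRoundTrip()
{
    IntList list;
    list[1] = 1963;      // goes through Store::update
    int copy = list[1];  // goes through Store::get
    assert(copy == 1963);
    return list[2] == 0; // an index never written reads as 0 in this sketch
}
```

Because no client ever holds a reference or pointer to stored data, the
list is free to change where and how the values live.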

About the class not doing any bounds checking, it easily could. There
is nothing saying that operator[] doesn't throw. You could add that
functionality and not hurt clients, assuming they are already exception
safe (and they should be). std::vector's operator[] doesn't check either,
but std::vector still offers a lot in comparison to a public array.
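
For illustration, a bounds-checked version of the original FooList
might look like the sketch below. The class name CheckedFooList, the
capacity of 4 and the choice of std::out_of_range are assumptions; the
interface seen by clients is the same pair of operator[] overloads.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

struct Foo
{
    int value;
    Foo() : value(0) {}
};

class CheckedFooList
{
public:
    static const std::size_t num = 4;  // hypothetical fixed capacity

    Foo& operator[](std::size_t index)
    {
        checkIndex(index);
        return array[index];
    }
    const Foo& operator[](std::size_t index) const
    {
        checkIndex(index);
        return array[index];
    }

private:
    // Throws instead of letting the caller read past the end of the array.
    static void checkIndex(std::size_t index)
    {
        if (index >= num)
            throw std::out_of_range("CheckedFooList: index out of range");
    }

    Foo array[num];
};

// Usage: a valid index works as before, an invalid one throws.
inline bool throwsWhenOutOfRange()
{
    CheckedFooList list;
    list[3].value = 1;
    assert(list[3].value == 1);
    try {
        list[4];
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

This is the same trade std::vector offers between operator[] and at(),
except that here the check is always on.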

Mar 20 '06 #35

Daniel T.

In article <1g********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:

On Mon, 20 Mar 2006 17:21:01 GMT, "Daniel T."
<po********@earthlink.net> wrote:
One last nail in the coffin. Don't believe me, believe Scott Meyers. In
"Effective C++" items 29 and 30 he says the same thing in a different
way, "Avoid returning "handles" to internal data." and "Avoid member
functions that return non-const pointers or references to members less
accessible than themselves."

You said earlier: "returning a reference always breaks encapsulation".
Scott Meyers is saying something entirely different here. As I tried
to point out before, the reference is only dangerous as long as it is
a reference to an object which could change the state of FooList. And
it doesn't have to be. That is why it is important to have an
interface, even if it involves returning a reference -- because it
could be a non-state-changing object to which it refers.

Again, this is exactly my point. If logically, FooList is simply holding
the Foos for some other object and doesn't actually own them, then
breaking UAP is OK, but UAP is still broken. That's why I said to the OP
that although he was right and encapsulation was broken, that isn't
necessarily a problem.

Don Box, in the first chapter of his excellent book "Essential COM",
has pointed out very succinctly the fact that C++ does not provide
encapsulation at the binary level. So the question remains: in what
context do you put the UAP as far as C++ is concerned?

The question in this case is... Are the Foos that a FooList contains
really "less accessible" than the FooList?

They might or might not be. It all depends on the total design. And
that includes the documentation.

Let's talk about documentation. I strongly believe that the
documentation for a class or framework can be just as much a part of
the design as the source code itself. Consider the case of
std::vector. The C++ standard makes the guarantee that on a conforming
implementation, elements of a vector are stored contiguously in
memory. This gives us additional information which could play an
important role in how elements of the vector are accessed. Give me a
pointer or iterator to the first element, and I can access all the
rest. The same could hold true for std::list or whatever other
container is available where iterators are defined.

But this doesn't hold true for vector<bool>. How do I know? The only
reason is that I read about it in the C++ standard. I consider that
"documentation" as well.

All of this proves my point. The interface gives away the implementation
(it's not encapsulated). That's OK in this case, though, because the docs
require a particular implementation anyway.

As Bjarne Stroustrup says, C++ wasn't designed to be non-hackable.
Things like encapsulation and public/private access were meant to
avoid errors, not to prevent malicious coding. So if what you mean by
adherence to UAP is to avoid the latter, C++ isn't going to do it. And
if it is the former, then we all have to play by the rules -- which
includes reading the documentation.

Not at all. Nothing can prevent a programmer from hacking into the
memory footprint of an object and accessing anything he wants. I am talking
about the legions of programmers who provide reference returns and
*think* they are properly encapsulating their data.

As Meyers himself wrote: "Unfortunately, the presence of [a member
function returning a non-const reference] defeats the purpose of making
[a private member-variable] private." (Meyers, "Effective C++" item 30)

--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 21 '06 #36

Daniel T.

In article <al********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:

On Mon, 20 Mar 2006 13:26:11 GMT, "Daniel T."
<po********@earthlink.net> wrote:
the
defining characteristic of op[] is invalid in your code, namely that
consecutive calls to op[] with the same index will return the same
object.

Who says?

Herb Sutter, Scott Meyers, Bjarne Stroustrup among others... A common
quote "when using operator overloading or any other language feature for
your own classes, when in doubt always make your class follow the same
semantics as the builtin and standard library types."

If your op[] does not follow the same semantics as op[] does on an
array, then I suggest you rename your function so you won't make life
harder on your users. Remember, operator overloading exists to make
life easier for the users of your class.
<http://www.parashift.com/c++-faq-lite/operator-overloading.html>
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 21 '06 #37

Bob Hairgrove

On Tue, 21 Mar 2006 14:00:29 GMT, "Daniel T."
<po********@earthlink.net> wrote:

As Bjarne Stroustrup says, C++ wasn't designed to be non-hackable.
Things like encapsulation and public/private access were meant to
avoid errors, not to prevent malicious coding. So if what you mean by
adherence to UAP is to avoid the latter, C++ isn't going to do it. And
if it is the former, then we all have to play by the rules -- which
includes reading the documentation.

Not at all. Nothing can prevent a programmer from hacking into the
memory footprint of an object and accessing anything he wants. I am talking
about the legions of programmers who provide reference returns and
*think* they are properly encapsulating their data.

So we agree that encapsulation cannot be achieved 100% purely by using
the language features in C++, but only through proper design? If you
agree with that, I still do not understand why you categorically
stated earlier that returning a reference breaks encapsulation?

As Meyers himself wrote: "Unfortunately, the presence of [a member
function returning a non-const reference] defeats the purpose of making
[a private member-variable] private." (Meyers, "Effective C++" item 30)

No. This is more like what he says:

"Unfortunately, the presence of [a member function returning a
non-const reference TO A PRIVATE MEMBER OF THE CLASS WHICH CAN CHANGE
THE CLASS STATE] defeats the purpose of making [THAT private
member-variable] private." (Meyers, "Effective C++" item 30)

The fact that the return value is a reference is irrelevant by itself.
It all depends on what the reference refers to. As I pointed out
before, it could be a reference to some dummy variable which is kept
solely for the purpose of satisfying clients who need some kind of
non-const lvalue to write to. What actually is written (or not, as the
case may be) is solely under control of the class containing the
member and enforced through the implementation of the function
returning the reference.

And even if it is a reference to some meaningful member of the class,
the function can be implemented in a discretionary manner. For
example, operator= often checks for "this==&argument". If the check
proves true, the behavior is different than if it isn't.

Why are you so afraid of references? It's all about design and not
about the language itself. Implementing operator[] as in the OP's
original example DOES break encapsulation. But it can be implemented
differently in the background using the same interface. So it is the
implementation, and not the interface, which makes the difference.

--
Bob Hairgrove
No**********@Home.com

Mar 21 '06 #38

Bob Hairgrove

On Tue, 21 Mar 2006 14:17:49 GMT, "Daniel T."
<po********@earthlink.net> wrote:

In article <al********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:
On Mon, 20 Mar 2006 13:26:11 GMT, "Daniel T."
<po********@earthlink.net> wrote:
>the
>defining characteristic of op[] is invalid in your code, namely that
>consecutive calls to op[] with the same index will return the same
>object.

Who says?

Herb Sutter, Scott Meyers, Bjarne Stroustrup among others... A common
quote "when using operator overloading or any other language feature for
your own classes, when in doubt always make your class follow the same
semantics as the builtin and standard library types."

If your op[] does not follow the same semantics as op[] does on an
array, then I suggest you rename your function so you won't make life
harder on your users. Remember, operator overloading exists to make
life easier for the users of your class.
<http://www.parashift.com/c++-faq-lite/operator-overloading.html>

That seems awfully limiting to me ... look at std::map::operator[],
for example. Does that act like an array??
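
A short sketch of one way std::map::operator[] differs from array
indexing: looking up a missing key inserts a value-initialized element
instead of being an error. The function name sizesAroundLookup and the
sample keys are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Returns the map's size before and after a lookup of a missing key.
inline std::pair<std::size_t, std::size_t> sizesAroundLookup()
{
    std::map<std::string, int> ages;
    ages["alice"] = 30;

    std::size_t before = ages.size();
    int missing = ages["bob"];  // no such key: an element with value 0 is inserted
    assert(missing == 0);
    std::size_t after = ages.size();

    return std::make_pair(before, after);
}
```

This is also why std::map has no const operator[]: the call may have to
modify the container.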

--
Bob Hairgrove
No**********@Home.com

Mar 21 '06 #39

Phlip

Bob Hairgrove wrote:

If your op[] does not follow the same semantics as op[] does on an
array, then I suggest you rename your function so you won't make life
harder on your users. Remember, operator overloading exists to make
life easier for the users of your class.
<http://www.parashift.com/c++-faq-lite/operator-overloading.html>

That seems awfully limiting to me ... look at std::map::operator[],
for example. Does that act like an array??

That is a good example of the difference between syntax and semantics. Map[]
doesn't use the same syntax, but it gives the same warm-fuzzies. Semantics.

--
Phlip
http://www.greencheese.org/ZeekLand <-- NOT a blog!!!

Mar 21 '06 #40

Daniel T.

In article <js********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:

On Tue, 21 Mar 2006 14:00:29 GMT, "Daniel T."
<po********@earthlink.net> wrote:
As Bjarne Stroustrup says, C++ wasn't designed to be non-hackable.
Things like encapsulation and public/private access were meant to
avoid errors, not to prevent malicious coding. So if what you mean by
adherence to UAP is to avoid the latter, C++ isn't going to do it. And
if it is the former, then we all have to play by the rules -- which
includes reading the documentation.

Not at all. Nothing can prevent a programmer from hacking into the
memory footprint of an object and accessing anything he wants. I am talking
about the legions of programmers who provide reference returns and
*think* they are properly encapsulating their data.

So we agree that encapsulation cannot be achieved 100% purely by using
the language features in C++, but only through proper design?

I don't agree with the above. No design can achieve encapsulation 100%
because the language features in C++ allow intentional break-ins.

A prime example:

#include <iostream>

class Foo {
    int bar;
public:
    Foo(): bar( 0 ) { }
    int getBar() const { return bar; }
};

int main()
{
    Foo f;
    std::cout << f.getBar() << '\n';
    // Deliberately bypass the interface and write to the private member.
    int* b = reinterpret_cast<int*>( &f );
    *b = 5;
    std::cout << f.getBar() << '\n';
}

The above is what Stroustrup et al. were talking about.

If you
agree with that, I still do not understand why you categorically
stated earlier that returning a reference breaks encapsulation?

Because a reference return breaks the UAP. The client knows that the
value returned is stored in RAM somewhere, and not computed on the fly.

As Meyers himself wrote: "Unfortunately, the presence of [a member
function returning a non-const reference] defeats the purpose of making
[a private member-variable] private." (Meyers, "Effective C++" item 30)

No. This is more like what he says:

"Unfortunately, the presence of [a member function returning a
non-const reference TO A PRIVATE MEMBER OF THE CLASS WHICH CAN CHANGE
THE CLASS STATE] defeats the purpose of making [THAT private
member-variable] private." (Meyers, "Effective C++" item 30)

The fact that the return value is a reference is irrelevant by itself.
It all depends on what the reference refers to. As I pointed out
before, it could be a reference to some dummy variable which is kept
solely for the purpose of satisfying clients who need some kind of
non-const lvalue to write to. What actually is written (or not, as the
case may be) is solely under control of the class containing the
member and enforced through the implementation of the function
returning the reference.

class Foo {
public:
Foo();
int& getBar();
};

int main()
{
Foo f;

assert( f.getBar() == 0 );
int& b = f.getBar();
b = 5;
assert( f.getBar() == 5 );
}

If the above doesn't work (i.e. if the last assert fires), then getBar was
implemented incorrectly; if the assert doesn't fire, then encapsulation
was broken. I don't see why that is so hard for you to grasp...

And even if it is a reference to some meaningful member of the class,
the function can be implemented in a discretionary manner. For
example, operator= often checks for "this==&argument". If the check
proves true, the behavior is different than if it isn't.

Why are you so afraid of references? It's all about design and not
about the language itself. Implementing operator[] as in the OP's
original example DOES break encapsulation. But it can be implemented
differently in the background using the same interface. So it is the
implementation, and not the interface, which makes the difference.

Please Bob, I'm not afraid of references, I use them whenever
appropriate. It is sometimes appropriate to break encapsulation, and
therefore it is sometimes appropriate to return references.
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 22 '06 #41

Noah Roberts

Daniel T. wrote:

In article <js********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:

The fact that the return value is a reference is irrelevant by itself.
It all depends on what the reference refers to. As I pointed out
before, it could be a reference to some dummy variable which is kept
solely for the purpose of satisfying clients who need some kind of
non-const lvalue to write to. What actually is written (or not, as the
case may be) is solely under control of the class containing the
member and enforced through the implementation of the function
returning the reference.

class Foo {
public:
Foo();
int& getBar();
};

int main()
{
Foo f;

assert( f.getBar() == 0 );
int& b = f.getBar();
b = 5;
assert( f.getBar() == 5 );
}

If the above doesn't work (i.e. if the last assert fires), then getBar was
implemented incorrectly; if the assert doesn't fire, then encapsulation
was broken. I don't see why that is so hard for you to grasp...

It's hard to grasp because it isn't true. We don't know what "bar" is,
where it is stored, how it is accessed, or anything else about it, except
that changes we make to it are guaranteed to persist. In other words, we
are getting a reference to something changeable, which is what the
return type says. (The assert can nonetheless fail without getBar being
implemented incorrectly: nothing in the definition says that bar will
not be changed between calls or as a consequence of them, or even that
the returned reference will always be the same.) We have no idea about
the internals of class Foo; in fact, bar may not even be in class Foo
but in some other class, either contained within Foo or accessed by
getBar(). Foo could easily be implemented like this:

struct BarHolder
{
int x;
};

class Foo
{
BarHolder holder;
public:
int& getBar() { return holder.x; }
};

Bar could also be a global value (which is of course a total waste of
time) or, less obviously, a static variable in the same translation unit
as Foo::getBar(). It could be a member of an array. Or it could be a
reference returned from some other class, in which case Foo itself has
no idea where bar is located.

Fact is that the client does not know, nor care, how the internals of
Foo are represented or where the return of bar comes from. All it
cares about are the guarantees spelled out by the signature of the
member function. This is the definition of encapsulation.

Now, I'll grant you that in many cases there is a smell to returning a
non-const reference, especially how you have spelled it above. This
doesn't mean that doing so always breaks encapsulation.

Mar 22 '06 #42

Bob Hairgrove

On 22 Mar 2006 08:47:06 -0800, "Noah Roberts" <ro**********@gmail.com>
wrote:

Fact is that the client does not know, nor care, how the internals of
Foo are represented or where the return of bar comes from. All it
cares about are the guarantees spelled out by the signature of the
member function. This is the definition of encapsulation.

Thank you.

--
Bob Hairgrove
No**********@Home.com

Mar 22 '06 #43

Daniel T.

In article <ic********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:

On 22 Mar 2006 08:47:06 -0800, "Noah Roberts" <ro**********@gmail.com>
wrote:
Fact is that the client does not know, nor care, how the internals of
Foo are represented or where the return of bar comes from. All it
cares about are the guarantees spelled out by the signature of the
member function. This is the definition of encapsulation.

Thank you.

Not quite, because the guarantee spelled out by the signature of the
member function is that the thing returned exists in RAM. And that is
the definition of breaking the UAP, an important aspect of encapsulation.

Let's go back to the example:

class Foo {
public:
int& bar();
};

No matter how Foo is implemented, it must guarantee that the int it
returns is stored in a particular place in RAM that it knows about. It
must guarantee that if a client does:

myFoo.bar() = 5;

Then that place in RAM will now hold the value 5.

The function signature itself *requires* that we break the UAP. The bar
function cannot change its implementation to calculate a value on the
fly and return it. The best it can do is calculate the value and store
it in a place in RAM, then return a reference to that place in RAM. No
matter how creative you get, this fundamental fact remains.

Granted, we can do something like calculate the value each time bar is
called and ignore any value that the client may put in the RAM location
returned, however I remind you of the title of this thread... A
non-const operator[] that ignores any assignment we make to the
reference returned would break every assumption that any reasonable
programmer can expect from that particular operator. So, even if you
completely ignore everything I wrote above, you are still left with the
fact that op[]'s semantics allow for very little leeway in how it is
implemented. There is some there, but not the freedom that you would
otherwise have.
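
A minimal sketch of the "calculate, store, then return a reference" case described above. The members base_ and cache_ and the helper write_then_read are invented for illustration; only Foo and bar come from the example.

```cpp
#include <cassert>

// Hypothetical Foo whose bar() recomputes its result on every call.
// Because the signature is int&, the result still has to live somewhere,
// so it is parked in a member (cache_, an invented name) first.
class Foo {
public:
    explicit Foo(int base) : base_(base), cache_(0) {}

    int& bar() {
        cache_ = base_ * 2;   // recompute, overwriting whatever was there
        return cache_;        // hand out a reference to the storage
    }

private:
    int base_;
    int cache_;
};

// Writes through the returned reference, then reads again.
// The write is lost because the next call to bar() recomputes the value.
inline int write_then_read(Foo& f, int value) {
    f.bar() = value;
    return f.bar();
}
```

With a Foo built from 21, write_then_read(f, 5) yields 42 rather than 5, which is exactly the surprising behaviour a client of a non-const operator[] would not expect.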
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Mar 23 '06 #44

Kai-Uwe Bux

Daniel T. wrote:

In article <ic********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:
On 22 Mar 2006 08:47:06 -0800, "Noah Roberts" <ro**********@gmail.com>
wrote:
>Fact is that the client does not know, nor care, how the internals of
>Foo are represented or where the return of bar comes from. All it
>cares about are the guarantees spelled out by the signature of the
>member function. This is the definition of encapsulation.
Thank you.

Not quite, because the guarantee spelled out by the signature of the
member function is that the thing returned exists in RAM. And that is
the definition of breaking the UAP, an important aspect of encapsulation.

Sorry for butting in. Just a few remarks: if the UAP were broken every
time a non-const reference is returned, then this principle is violated so
frequently as to cast doubt on whether it is really such an important
aspect of encapsulation.

Let's go back to the example:

class Foo {
public:
int& bar();
};

No matter how Foo is implemented, it must guarantee that the int it
returns is stored in a particular place in RAM that it knows about. It
must guarantee that if a client does:

myFoo.bar() = 5;

Then that place in RAM will now hold the value 5.

The function signature itself *requires* that we break the UAP. The bar
function cannot change its implementation to calculate a value on the
fly and return it. The best it can do is calculate the value and store
it in a place in RAM, then return a reference to that place in RAM. No
matter how creative you get, this fundamental fact remains.

Granted, we can do something like calculate the value each time bar is
called and ignore any value that the client may put in the RAM location
returned, however I remind you of the title of this thread... A
non-const operator[] that ignores any assignment we make to the
reference returned would break every assumption that any reasonable
programmer can expect from that particular operator.

I cannot reconcile these two paragraphs: Above you claim that the
*signature* of the function requires breaking UAP and down here you concede
that an implementation could, indeed, recalculate the value each time the
method is called. In other words: it is *not* the signature but the
contract or the reasonable expectations of a client (i.e., that changes
from assignments like myFoo.bar() = 5 are preserved) that brings about the
violation of UAP. If there is no promise on the part of Foo that values
written into the returned memory locations by the client are not ignored,
then there appears to be no violation of UAP.

So, even if you
completely ignore everything I wrote above, you are still left with the
fact that op[]'s semantics allow for very little leeway in how it is
implemented. There is some there, but not the freedom that you would
otherwise have.

True for operator[] with the "canonical" contract.

Finally, a question: would the following break UAP:

struct Foo {

typedef int & reference;

reference bar();

};

in a case where the *documentation* states that "reference" is an
implementation-defined type?
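
To illustrate what such a documented, implementation-defined "reference" could allow, here is a sketch in which the type is a small proxy class rather than int&. The doubled_ member and the way values are stored are assumptions made up for the example; the standard library uses a comparable proxy for std::vector<bool>::reference.

```cpp
#include <cassert>

// Sketch only: Foo documents "reference" as an implementation-defined type.
// Here it is a proxy class instead of int&. Everything except the names
// Foo, reference and bar is invented for illustration.
struct Foo {
    class reference {
    public:
        explicit reference(Foo& owner) : owner_(owner) {}

        // Assignment is routed through a computation (value is stored doubled).
        reference& operator=(int value) {
            owner_.doubled_ = value * 2;
            return *this;
        }

        // Reading converts back on the fly.
        operator int() const { return owner_.doubled_ / 2; }

    private:
        Foo& owner_;
    };

    Foo() : doubled_(0) {}

    reference bar() { return reference(*this); }

private:
    int doubled_;   // internal representation, never exposed as an int&
};
```

Client code such as myFoo.bar() = 5; followed by int x = myFoo.bar(); still works and preserves the written value, yet no int holding 5 exists anywhere inside Foo.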

Best

Kai-Uwe Bux

Mar 23 '06 #45

Noah Roberts

Daniel T. wrote:

In article <ic********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:
On 22 Mar 2006 08:47:06 -0800, "Noah Roberts" <ro**********@gmail.com>
wrote:
Fact is that the client does not know, nor care, how the internals of
Foo are represented or where the return of bar comes from. All it
cares about are the guarantees spelled out by the signature of the
member function. This is the definition of encapsulation.
Thank you.

Not quite, because the guarantee spelled out by the signature of the
member function is that the thing returned exists in RAM. And that is
the definition of breaking the UAP, an important aspect of encapsulation.

You need to do more to show how this is so. Lots of things exist in
RAM; that doesn't mean you break encapsulation by using them.

Let's go back to the example:

class Foo {
public:
int& bar();
};

No matter how Foo is implemented, it must guarantee that the int it
returns is stored in a particular place in RAM that it knows about.

No it doesn't. Foo doesn't have to know anything about where it is.
Even if that statement were true it STILL doesn't break encapsulation
because the CLIENT doesn't know.

It must guarantee that if a client does:

myFoo.bar() = 5;

Then that place in RAM will now hold the value 5.

So?

The function signature itself *requires* that we break the UAP.

I think you must be confused about what the UAP is.

"All services offered by a module should be available through a uniform
notation, which does not betray whether they are implemented through
storage or through computation."
http://en.wikipedia.org/wiki/Uniform_access_principle

It doesn't speak of encapsulation at all. It doesn't say what you are
saying it says; it speaks to syntax and language design more than
anything else, and I also don't see that it is necessarily valid.
There are people who want properties added to the C++ language...I
don't like the idea. VB has them, C++.NET has them, I don't think they
are that hot...but I guess you do.

I'll offer you more ammo though I don't see the relevance to the
argument: C++ violates the UAP!
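
A small sketch of what that means in practice. The two point classes and the reader functions are invented for illustration; the point is only that the client has to spell the two accesses differently.

```cpp
#include <cassert>

// Two hypothetical classes offering the same service, "x".
struct StoredPoint {
    int x;                                  // implemented through storage
};

struct ComputedPoint {
    int offset;
    int x() const { return offset + 1; }    // implemented through computation
};

// The notation at the call site differs (p.x versus p.x()),
// so it betrays which implementation strategy was chosen.
inline int read_stored(const StoredPoint& p)     { return p.x; }
inline int read_computed(const ComputedPoint& p) { return p.x(); }
```

Switching StoredPoint to a computed x would force every client to add the parentheses, which is the kind of notational leak the UAP is talking about.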

The bar function cannot change its implementation to calculate a value on the
fly and return it. The best it can do is calculate the value and store
it in a place in RAM, then return a reference to that place in RAM. No
matter how creative you get, this fundamental fact remains.

And? Returning a value will never be able to return a location in RAM
either. In fact, without providing documentation saying,
"reinterpret_cast this to a pointer" you will _never_, no matter how
much hackery, be able to return a location in ram if your signature
specifies a reference. Whereas you just came up with a way to
calculate a value and return it (another is the thread unsafe static
variable).
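
For reference, the static-variable approach might look roughly like this. The function name square_of is hypothetical, and the sketch shows why the technique is not thread safe: every caller shares the same int.

```cpp
#include <cassert>

// Hypothetical free function: computes a value on the fly but still
// satisfies an int& signature by parking the result in a static local.
// All callers share that one int, so concurrent calls would race on it.
inline int& square_of(int n) {
    static int result = 0;
    result = n * n;
    return result;
}
```

A reference obtained from an earlier call silently changes when the function is called again, because both references name the same static object.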

The point of fact simply is that the two signatures are completely
different and are not interchangeable. If you want them to be, you are
using the wrong language, and I can't think of ANY that provide that
ability (not saying there isn't one). int& specifies that the returned
'value' is a mutable object in memory, whereas int says that it isn't.

Granted, we can do something like calculate the value each time bar is
called and ignore any value that the client may put in the RAM location
returned, however I remind you of the title of this thread... A
non-const operator[] that ignores any assignment we make to the
reference returned would break every assumption that any reasonable
programmer can expect from that particular operator. So, even if you
completely ignore everything I wrote above, you are still left with the
fact that op[]'s semantics allow for very little leeway in how it is
implemented. There is some there, but not the freedom that you would
otherwise have.

Yes, the common meaning to operator[] is expected to be a certain
thing. I don't see the relevance.

Mar 23 '06 #46

Bob Hairgrove

On 23 Mar 2006 00:41:25 -0800, "Noah Roberts" <ro**********@gmail.com>
wrote:

Daniel T. wrote:
In article <ic********************************@4ax.com>,
Bob Hairgrove <in*****@bigfoot.com> wrote:
> On 22 Mar 2006 08:47:06 -0800, "Noah Roberts" <ro**********@gmail.com>
> wrote:
>
> >Fact is that the client does not know, nor care, how the internals of
> >Foo are represented or where the return of bar comes from. All it
> >cares about are the guarantees spelled out by the signature of the
> >member function. This is the definition of encapsulation.
>
> Thank you.

Not quite, because the guarantee spelled out by the signature of the
member function is that the thing returned exists in RAM. And that is
the definition of breaking the UAP, an important aspect of encapsulation.

You need to do more to show how this is so. Lots of things exist in
RAM, that doesn't mean you break encapsulation by using them.

Don't waste your time, I'm beginning to think that this guy is a
troll.

--
Bob Hairgrove
No**********@Home.com

Mar 23 '06 #47

Greg

Noah Roberts wrote:

Greg wrote:
Roger Lakner wrote:
"Greg" <gr****@pacbell.net> wrote in message
news:11**********************@i40g2000cwc.googlegroups.com...
> There is a huge difference between using operator[] and making the
> array public. With a public array, clients can bypass FooList's
> interface (and FooList's methods) and obtain the data FooList
> stores -
> directly.

I guess I don't see, in practical terms, the difference. If an address
is returned, without any bounds checking as in my example, then
supposedly one could access any of the data FooList stores and,
consequently, any of the data Foo stores, bypassing FooList's
interface. Or perhaps I'm not understanding. Perhaps an example would
help.

Sure, let's have FooList provide the array interface and have no
persistent storage at all:

struct Foo { };

struct FooList
{
Foo* operator[](int index);
};
Foo *FooList::operator[](int index)
{
return new Foo;
}

int main()
{
FooList fooList;

Foo *f1 = fooList[3];
Foo *f2 = fooList[12];
Foo *f3 = fooList[15];
}

Now clients can still "retrieve" Foo objects from fooList - even though
fooList has no array at all - it simply returns a new object at any
index.

A more realistic example would have fooList obtain the objects from
disk or over a network - but the point is that the code in main() can
treat fooList as if it were an array. But fooList does not need to use
an array in its implementation - but can store the objects however it
likes, or, as in this example, not store them at all.

Whereas the original example, which returned references, did not break
encapsulation, this one does.

Foo * f1 = fooList[3];
Foo * f2 = f1 + 2;

This would be ok for a client of the class with an internal array but
not otherwise. The class's interface does nothing to suggest that
would not be valid.

Sure it does. Pointer arithmetic on f1 would always be invalid for any
kind of array that stores Foo pointers - whether that array is real or
simulated. And the array in this example is an array of Foo pointers (I
should have pointed out that change from the original). So the Foo
pointer returned is actually a copy of the value (and not a pointer to
the value) stored at that "index" in the array. It is not the address
of the item at that index - which in fact the client has no way of
obtaining.

To return a reference in this example, operator[] would have to be
declared like so:

Foo*& operator[](int index);

By returning a reference to the pointer the client is then able to
change that pointer's value as it is stored within the array. Returning
a Foo pointer by value, on the other hand, does not offer the client
the opportunity to change the Foo pointer value in the array. Whether
the client should be offered that ability is of course a design
decision.
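
A rough sketch of the difference between the two return types. The slots_ member, the fixed size of three, and the get() accessor are assumptions made up for the illustration; only Foo, FooList and operator[] come from the discussion.

```cpp
#include <cassert>

struct Foo { int id; };

// Hypothetical list holding three Foo pointers.
class FooList {
public:
    FooList() { slots_[0] = slots_[1] = slots_[2] = 0; }

    // Returns a copy of the stored pointer: the client cannot reseat the slot.
    Foo* get(int index) const { return slots_[index]; }

    // Returns a reference to the stored pointer: the client can reseat it.
    Foo*& operator[](int index) { return slots_[index]; }

private:
    Foo* slots_[3];
};
```

Assigning to the pointer obtained from get() changes only the client's local copy, while assigning through list[0] changes what the list itself holds.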

Greg

Mar 24 '06 #48
