Designing for Polymorphism in C++

weaknessforcats
Introduction

Polymorphism is the defining feature of Object-Oriented Programming (OOP). In C++, run-time polymorphism is implemented by virtual functions.

This article uses a simple example hierarchy which you may have seen many times in one form or another. An analysis of this example produces several problems that are not obvious but which will seriously limit your ability to use hierarchies like the example in a real program. Then, the article presents proposed solutions followed by some conclusions.

Object-Oriented Programming (OOP) requires a hierarchy of classes where a general class (the C++ base class) is at the top of the hierarchy. Specific classes (the C++ derived classes) inherit from the general class.

At run time, specific objects (C++ derived objects) are created and passed to methods expecting general objects (C++ functions with base object pointers or references as arguments). Then, when these functions call methods on what is believed to be a general object, the call actually is made to the method for the specific object since that's what the object really is.

From this point on, this article will use the C++ terms base class and derived class.

The public methods of a class are called the interface to the class. That's because any access to the class private members has to be done by calling one of the public methods.

When you have a polymorphic hierarchy, the public methods of the base class are referred to as the interface of the hierarchy. That's because to call a derived class method using a base class pointer or reference, you have to call a base class method. It is the C++ virtual function mechanism that allows a derived class to override the base class method so the base class method call is re-directed to the derived class method.

Virtual functions implement Object-Oriented Programming (OOP) in C++. That is, if your hierarchy does not have virtual base class methods, then your C++ program is not object-oriented. That does not mean your program is incorrectly designed or is invalid in some way. It just means you cannot substitute derived objects for base objects.
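
To make the substitution point concrete, here is a minimal sketch. The class names (PlainBase, PolyBase and so on) and the Describe function are hypothetical and are not part of the Shape example used later. It contrasts a hierarchy without virtual functions, where the base method always runs, with one that has a virtual function, where the derived method runs.

```cpp
#include <cassert>
#include <string>

// Hypothetical hierarchy WITHOUT virtual functions.
class PlainBase
{
public:
    std::string Name() const { return "PlainBase"; }
};
class PlainDerived : public PlainBase
{
public:
    std::string Name() const { return "PlainDerived"; }   // hides, does not override
};

// Hypothetical hierarchy WITH a virtual function.
class PolyBase
{
public:
    virtual ~PolyBase() {}
    virtual std::string Name() const { return "PolyBase"; }
};
class PolyDerived : public PolyBase
{
public:
    virtual std::string Name() const { return "PolyDerived"; }   // overrides
};

// Application code written for base objects only.
std::string Describe(const PlainBase& obj) { return obj.Name(); }
std::string Describe(const PolyBase& obj)  { return obj.Name(); }

// Small self-check showing which method each call reaches.
void CheckSubstitution()
{
    PlainDerived plain;
    PolyDerived  poly;
    assert(Describe(plain) == "PlainBase");    // the derived method is never reached
    assert(Describe(poly)  == "PolyDerived");  // the call is re-directed to the derived method
}
```

Only the second hierarchy lets the application treat a derived object as a base object and still get the derived behavior.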

The advantage of polymorphism is that the application code need be written for base objects only. For this to work, the application functions must use base class pointers or references as arguments. You see, when you substitute a derived object for a base object, it is still a derived object. When the application calls a base class method, the call has to be re-directed to the derived class method. This is done using the virtual function table (VTBL). The VTBL for the derived class has the correct addresses for the functions to be called using a derived object. The address of the VTBL is embedded by the compiler inside each derived object. When you call a virtual method, the compiler has generated code to look up the address of the method in the VTBL and call it.

As the name implies, the VTBL has the addresses of all virtual functions of the class, whether inherited or defined by the class. There is one VTBL for each class.
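
The following sketch imitates the mechanism by hand, purely as an illustration. Real compilers are free to lay out the VTBL however they choose, and the names FakeShape, FakeVtbl and CallName are invented for this sketch. It shows one table per class, a table address stored in each object, and a call that goes through the table.

```cpp
#include <cassert>
#include <string>

// A hand-rolled imitation of virtual dispatch. All names are hypothetical.
struct FakeShape;                                   // forward declaration
typedef std::string (*NameFn)(const FakeShape&);    // type of one table slot

struct FakeVtbl
{
    NameFn name;                                    // one slot per "virtual" function
};

struct FakeShape
{
    const FakeVtbl* vptr;                           // table address stored inside each object
};

std::string ShapeName(const FakeShape&)  { return "Shape"; }
std::string CircleName(const FakeShape&) { return "Circle"; }

// One table per class, shared by every object of that class.
const FakeVtbl shapeVtbl  = { &ShapeName };
const FakeVtbl circleVtbl = { &CircleName };

FakeShape MakeShape()  { FakeShape s = { &shapeVtbl };  return s; }
FakeShape MakeCircle() { FakeShape s = { &circleVtbl }; return s; }

// Roughly what a call such as obj.Name() turns into:
// fetch the function address from the object's table, then call it.
std::string CallName(const FakeShape& obj)
{
    return obj.vptr->name(obj);
}

void CheckDispatch()
{
    assert(CallName(MakeShape())  == "Shape");
    assert(CallName(MakeCircle()) == "Circle");
}
```

CallName never mentions Circle, yet it reaches CircleName when handed an object whose table pointer refers to circleVtbl.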

When you hide the name of the derived class from the application, you are free to add new derived classes later, perhaps years later, and to pass these new objects into that old application code. The effect is the new objects call their new methods because the VTBL code to do this already is in the application. The life of the application code has been extended.

Pre-Requisite Knowledge

This article assumes you are familiar with:
1) C++ classes and inheritance
2) C++ classes and the public/private/protected access specifiers
3) The virtual function mechanism, including the virtual function table (VTBL)
4) The rules for C++ function overloading and overriding

The Example
The example is a familiar application:
1) A Shape hierarchy
2) An application function that takes a Shape reference (or pointer) as an
argument and calls a virtual method. The virtual method does something or
other and returns the results in a stringstream. In the example, the virtual
method returns its own name.
3) A main() that calls the application function.

Here is the hierarchy:

    #include <fstream>
    #include <iostream>
    #include <sstream>
    using namespace std;

    class Shape
    {
    public:
        virtual stringstream& AMethod(stringstream& str);
        //assume other methods...
    };
    class Circle : public Shape
    {
    public:
        virtual stringstream& AMethod(stringstream& str);
        int Area();
    };
    class Cylinder : public Shape
    {
    public:
        virtual stringstream& AMethod(stringstream& str);
        int Volume();
    };
    class Point : public Shape
    {
    public:
    };
    stringstream& Shape::AMethod(stringstream& str)
    {
        str << "Shape::AMethod()" << endl;
        return str;
    }
    stringstream& Circle::AMethod(stringstream& str)
    {
        str << "Circle::AMethod()" << endl;
        return str;
    }
    stringstream& Cylinder::AMethod(stringstream& str)
    {
        str << "Cylinder::AMethod()" << endl;
        return str;
    }

The Application function takes a base object by reference (or by pointer) and calls AMethod. Because AMethod is a virtual function in class Shape and is overridden in the derived classes, passing a Circle object to the Application function will result in a call to AMethod() of the Circle object.

In the example, the Application function merely inserts the stringstream as a string into an ostream.

    ostream& Application(Shape& obj, ostream& os)
    {
        stringstream ss;
        obj.AMethod(ss);    //virtual call: reaches the derived class override

        os << ss.str() << endl;

        return os;
    }

Then in main(), Circle, Cylinder and Point objects are created and used as Shape objects. You can see the output directed once to the screen and once to a disk file.

    int main()
    {
        ofstream outfile("Data.txt");
        Circle c;
        Application(c, cout);
        Application(c, outfile);
        Cylinder cyl;
        Application(cyl, cout);
        Application(cyl, outfile);
        Point p;
        Application(p, cout);
        Application(p, outfile);

        return 0;
    }

You can compile and execute this code yourself and verify that the correct displays appear. So far, so good.

Problem 1: Not All Classes Can Support the Shape Interface

When you run the example code, the AMethod for a Point object is reported as Shape::AMethod(). That's because the base function was not overridden by Point. You see, the reason it was not overridden by Point is that AMethod is not appropriate for Point. The other Shape methods (the ones the example says to assume exist) may be appropriate for Point, but not AMethod.

Also, if Shape::AMethod were a pure virtual function, the class Point would be forced to implement AMethod whether it wanted to or not.

The Shape public virtual functions are the interface of the Shape hierarchy and are inherited by the derived classes. That makes all of the derived classes have the same interface as the Shape class. This is a bad assumption. Consider if the Shape hierarchy were a Vehicle hierarchy with Car, Boat and Airplane as derived classes: all Cars, Boats and Airplanes would have to have the same features, like wings on a Boat or a retractable sun roof on an Airplane.

The use of public virtual functions presupposes a homomorphic hierarchy. That is, one where all of the classes have exactly the same public methods. Rarely does this occur outside of textbooks or the classroom.


What is really happening here is that Shape::AMethod is being used for two things. On the one hand it is part of the interface of the Shape hierarchy. On the other hand, it is there for the derived class to override with custom behavior. So either the derived class does not override and Point gets bad results or it does override and the code for Shape::AMethod is lost to the derived class. This second phenomenon causes the derived class to code this way:

    stringstream& Circle::AMethod(stringstream& str)
    {
        Shape::AMethod(str);    //Recover base class implementation.
        str << "Circle::AMethod()" << endl;
        return str;
    }

Here the overridden Shape::AMethod is called by the Circle class method to recover the Shape class implementation. As shown above, the Shape class logic is assumed a pre-condition to the derived class logic so the base class function is called before the derived class logic. However, it could be that the Shape class logic is a post-condition to the Circle class logic. In that case, the Shape class function would be called after the derived class logic.

Now there is ambiguity and that is a real weakness.
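
A minimal sketch of that ambiguity follows. The classes Base, BaseFirst, BaseLast and BaseForgotten and the helper Run are hypothetical stand-ins for Shape and its derived classes. Nothing in a public virtual function tells the derived class author which ordering the base class designer intended, or stops the author from leaving the base call out altogether.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

class Base
{
public:
    virtual ~Base() {}
    virtual void Report(std::ostream& os) { os << "[base]"; }
};

// Treats the base logic as a pre-condition: base first, then derived.
class BaseFirst : public Base
{
public:
    virtual void Report(std::ostream& os)
    {
        Base::Report(os);
        os << "[derived]";
    }
};

// Treats the base logic as a post-condition: derived first, then base.
class BaseLast : public Base
{
public:
    virtual void Report(std::ostream& os)
    {
        os << "[derived]";
        Base::Report(os);
    }
};

// Never calls the base version, so the base logic is lost.
class BaseForgotten : public Base
{
public:
    virtual void Report(std::ostream& os) { os << "[derived]"; }
};

// Application code: sees only a Base reference.
std::string Run(Base& obj)
{
    std::ostringstream os;
    obj.Report(os);
    return os.str();
}

void CheckOrdering()
{
    BaseFirst a;
    BaseLast b;
    BaseForgotten c;
    assert(Run(a) == "[base][derived]");
    assert(Run(b) == "[derived][base]");
    assert(Run(c) == "[derived]");
}
```

All three derived classes compile cleanly, yet each produces a different result for the same call.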

If you use a pure virtual function in Shape, you can avoid this problem but it does not resolve the issue where Point does not want to implement AMethod. Remember, you use a pure virtual function only when you want to force the derived class to implement a method.
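
As a sketch of what that forcing looks like, assume hypothetical classes AbstractShape, RoundShape and DotShape standing in for Shape, Circle and Point. The signature is simplified to return a string. DotShape has nothing sensible to report, but it cannot be instantiated unless it supplies some implementation.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

class AbstractShape
{
public:
    virtual ~AbstractShape() {}
    virtual std::string AMethod() const = 0;    // pure virtual: every concrete class must implement it
};

class RoundShape : public AbstractShape
{
public:
    virtual std::string AMethod() const { return "RoundShape::AMethod()"; }
};

// AMethod is not appropriate here, but without this override
// DotShape would itself be abstract and could not be created.
class DotShape : public AbstractShape
{
public:
    virtual std::string AMethod() const { return ""; }    // placeholder written only to satisfy the compiler
};

// The base class cannot be instantiated; the two derived classes can.
static_assert(std::is_abstract<AbstractShape>::value, "AbstractShape has a pure virtual function");
static_assert(!std::is_abstract<DotShape>::value, "DotShape was forced to implement AMethod");

std::string Describe(const AbstractShape& obj) { return obj.AMethod(); }

void CheckForcedOverride()
{
    RoundShape r;
    DotShape d;
    assert(Describe(r) == "RoundShape::AMethod()");
    assert(Describe(d).empty());
}
```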

Resolution of Problem 1: Separate the Interface from the Implementation

Your base class public functions are hierarchy interface functions and should not be overridden by derived classes. Your base class virtual functions are the ones the derived classes are to override. You separate these by making the interface functions public and the virtual functions private.

When you make the Shape::AMethod private, there is now no method for the Application to call. That requires that you write an interface method for the Shape class that is public and non-virtual. This new Shape method (Shape::Show) can call the private Shape::AMethod and can also be called by the Application.

Now the interface is the public and non-virtual functions and the implementation is the private virtual functions. The interface is public. The implementation is private. They are separate.

In C++, when you override a method, the access specifiers (public/private/protected) are ignored. That is, a derived class can override a private virtual method of its base class. For a Circle object, the VTBL contains the address of Circle::AMethod because that method overrides Shape::AMethod. When you call AMethod using a Circle object, it causes the VTBL to be accessed for the address of AMethod. The address in the VTBL for AMethod is the address of Circle::AMethod, so that is what is called.

When you identify a Circle object with a Shape class pointer or reference and a public Shape method makes a call to a private virtual Shape method, it results in a call down the hierarchy to the Circle class. This needs repeating: A call made through a base class pointer or reference can end up executing a derived object's private methods. Sometimes this is called the Hollywood Principle: Don't call us, we'll call you.

Next, a little software engineering magic will help out the Point class. Since Shape::AMethod is now private, you can write a default implementation that will execute when classes like Point choose not to override.

A default, or null, implementation of a virtual method is an example of a thing called a hook. A derived class may override a hook, but is not required to do so. The Shape::AMethod now is a hook.

Here is the modified hierarchy:

    #include <fstream>
    #include <iostream>
    #include <sstream>
    using namespace std;

    class Shape
    {
    private:
        //Implementation
        //Hook method
        virtual stringstream& AMethod(stringstream& str);

    public:
        //Interface
        stringstream& Show(stringstream& str);
        //assume other methods...
    };
    class Circle : public Shape
    {
    private:
        virtual stringstream& AMethod(stringstream& str);
    public:
        int Area();
    };
    class Cylinder : public Shape
    {
    private:
        virtual stringstream& AMethod(stringstream& str);
    public:
        int Volume();
    };
    class Point : public Shape
    {
    public:
    };
    stringstream& Shape::Show(stringstream& str)
    {
        this->AMethod(str);
        return str;
    }
    //A default implementation. This is a hook.
    stringstream& Shape::AMethod(stringstream& str)
    {
        str << "Unsupported" << endl;
        return str;
    }
    stringstream& Circle::AMethod(stringstream& str)
    {
        str << "Circle::AMethod()" << endl;
        return str;
    }
    stringstream& Cylinder::AMethod(stringstream& str)
    {
        str << "Cylinder::AMethod()" << endl;
        return str;
    }

    ostream& Application(Shape& obj, ostream& os)
    {
        stringstream ss;
        obj.Show(ss);
        os << ss.str() << endl;

        return os;
    }
    int main()
    {
        ofstream outfile("Data.txt");
        Circle c;
        Application(c, cout);
        Application(c, outfile);
        Cylinder cyl;
        Application(cyl, cout);
        Application(cyl, outfile);
        Point p;
        Application(p, cout);
        Application(p, outfile);

        return 0;
    }

Problem 2: A Base Class Change Must be Implemented in the Derived Classes

In the original example each derived class overrides Shape::AMethod. That forces any changes required of Shape::AMethod to be implemented in the derived class.

Let's suppose requirements change and Shape::AMethod must insert the stringstream to an output stream inside a window box. That is, the contents of the stringstream must be surrounded by asterisks.

    stringstream& Circle::AMethod(stringstream& str)
    {
        str << "*********************" << endl
            << "* Circle::AMethod() *" << endl
            << "*********************" << endl;
        return str;
    }

To do this, the override of Shape::AMethod in every derived class will need to be changed to incorporate this new feature and changed in exactly the same way, possibly by different developers employed by different companies. The chance of this happening is zero.

It is better for the Shape base class to provide the window box, and the Shape::Show method you had to write to solve the previous problem is the place to do this. Derived classes, like Circle, can hook in to customize Shape::Show by overriding Shape::AMethod, which is called by Shape::Show.

The Shape::Show method now contains the logic steps to be followed by all derived classes with the derived classes allowed to tailor the output. Shape::Show has become an algorithm.

The only change required to insert the AMethod results in a window box is to the Shape::Show method:

    ostream& Shape::Show(ostream& os)
    {
        stringstream str;
        this->AMethod(str);     //The Hollywood Principle: Shape calls derived method
        //Find the number of characters in the stringstream
        int len;
        str.seekg(0, ios_base::end);
        len = static_cast<int>(str.tellg());
        str.seekg(0, ios_base::beg);
        len -= static_cast<int>(str.tellg());

        if (!len)
        {
            return os;
        }

        //Top of window box
        os << "**";
        for (int i = 0; i < len; ++i)
        {
            os << "*";
        }
        os << "**" << endl;

        //Middle of window box
        os << "* " << str.str() << " *" << endl;

        //Bottom of window box
        os << "**";
        for (int i = 0; i < len; ++i)
        {
            os << "*";
        }
        os << "**" << endl;

        //End
        return os;
    }

Design Patterns: The Template Method

The notion of non-virtual public methods is not new. In 1994 Erich Gamma and the "gang of four" wrote the first definitive book on design patterns. It is a collection of 23 commonly encountered design problems solved using the rules of object-oriented programming with the examples coded in C++ and Smalltalk.

One design pattern is named the Template Method. This is the name of the design pattern itself and has nothing whatsoever to do with C++ templates.

In this pattern, the base class has the public interface methods and it uses private virtual hooks for optional customization and private pure virtual functions for mandatory customization.

The base class method, illustrated in this article by Shape::Show, contains all of the logic except the stringstream contents. Those contents are obtained from Shape::AMethod. Either the default value is used, or the function is overridden by a derived class for specific contents.

This allows the base class to control the steps, or algorithm, to be followed by the application. In this case it prepares the window box. The derived classes can hook in to tailor the contents of the window box but cannot do away with it.

The public interface is completely separate from the private implementation functions.
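
Here is a minimal sketch of the pattern with both kinds of customization point. The Report classes and the Render, Header and Body names are hypothetical and were chosen only for this illustration. Header is a hook with a default, so overriding it is optional. Body is pure virtual, so every concrete class must supply it.

```cpp
#include <cassert>
#include <string>

class Report
{
public:
    virtual ~Report() {}

    // Public, non-virtual interface. It fixes the order of the steps,
    // and derived classes cannot change or remove the footer.
    std::string Render() const
    {
        return Header() + Body() + "\n--end--";
    }

private:
    virtual std::string Header() const { return ""; }   // hook: optional customization
    virtual std::string Body() const = 0;               // mandatory customization
};

// Supplies only the mandatory step and accepts the default header.
class PlainReport : public Report
{
private:
    virtual std::string Body() const { return "plain body"; }
};

// Overrides the hook as well as the mandatory step.
class TitledReport : public Report
{
private:
    virtual std::string Header() const { return "TITLE\n"; }
    virtual std::string Body() const { return "titled body"; }
};

// Application code: knows only the base class interface.
std::string Application(const Report& r) { return r.Render(); }

void CheckTemplateMethod()
{
    PlainReport plain;
    TitledReport titled;
    assert(Application(plain)  == "plain body\n--end--");
    assert(Application(titled) == "TITLE\ntitled body\n--end--");
}
```

The application can call only Render, so the algorithm in the base class always runs and the derived classes fill in the parts they are allowed to change.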

Researching this design pattern is well worth the effort.


Problem 3: Derived Classes May Have Methods Not In the Base Class

Normal inheritance has derived classes adding methods to those inherited from the base class. That saves recoding those base methods in the derived class. This works well when the application is not object-oriented as when derived objects are created and used as derived objects. That is, when the program is object-based rather than object-oriented.

When the program is object-oriented (using polymorphism), all methods must be declared public in the base class because derived objects will be identified only by base class pointers or references. That means any derived methods to be executed must also be public in the base class.

However, practically speaking, a class may wish to be polymorphic for some methods but may have additional methods that are not part of the base class. There is simply no way to call these methods unless they are declared in the base class. Again, derived classes are forced to implement methods they cannot use or the program just won't compile.

In the example, the class Circle has a Circle::Area() method and the class Cylinder has a Cylinder::Volume() method. Because these methods are not in the base class, they are not available to the application function:

    void Application(Shape& obj)
    {
        obj.Show(cout);

        obj.Area();    //ERROR: 'Area' : is not a member of 'Shape'
        obj.Volume();  //ERROR: 'Volume' : is not a member of 'Shape'
    }

Actually, if obj is a Point, then neither Area() nor Volume() is possible. Area() is only possible if obj is a Circle and Volume() requires obj to be a Cylinder.

Unfortunately, the Application() function was written with a Shape argument thereby making access to methods in the derived class that are not in the base class impossible.

This is not a signal to use RTTI. If that is done, a run-time check will have to be made on obj's real type. If it is a Circle, then the address of obj must be typecast to a Circle* just so the Area() method can be called.

This is gruesomely ugly for many reasons:
a) the Application function would need to be changed every time a new derived class was engineered with methods in addition to those in the Shape base class,
b) the Application function would also need to be changed whenever new methods were added to existing derived classes
c) RTTI requires type information to be maintained for every polymorphic class, and each run-time check of an object's real type slows execution
d) a typecast is required and that involves human intervention
e) the Shape hierarchy becomes exposed which violates encapsulation
f) this only works when the Shape base class has at least one virtual method.

Plus the Application function Shape& argument provides no clue about any derived classes. Hiding the derived classes is what object-oriented programming is all about. Exposing them violates the reason for having object-oriented programming in the first place.
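
For reference, this is roughly what the RTTI approach being warned against would look like. It is a sketch only: the Measure function, the return values and the -1 sentinel are assumptions made for illustration. Note how the application function has to name every derived class it wants to support.

```cpp
#include <cassert>

class Shape
{
public:
    virtual ~Shape() {}    // dynamic_cast needs at least one virtual function
};
class Circle : public Shape
{
public:
    int Area() const { return 3; }      // placeholder value
};
class Cylinder : public Shape
{
public:
    int Volume() const { return 6; }    // placeholder value
};
class Point : public Shape
{
};

// The approach the article advises against: test the real type at
// run time, then typecast just to reach the derived class method.
int Measure(Shape& obj)
{
    if (Circle* c = dynamic_cast<Circle*>(&obj))
    {
        return c->Area();
    }
    if (Cylinder* cyl = dynamic_cast<Cylinder*>(&obj))
    {
        return cyl->Volume();
    }
    return -1;    // any class this function does not know about
}

void CheckMeasure()
{
    Circle c;
    Cylinder cyl;
    Point p;
    assert(Measure(c) == 3);
    assert(Measure(cyl) == 6);
    assert(Measure(p) == -1);
}
```

Every new derived class with its own extra method means another if block in Measure, which is exactly the maintenance burden described in the list above.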

The Visitor design pattern is used to access derived class methods that are not in the base class when the object is known only by a base class reference or pointer.

Any base class being designed as a polymorphic base class should be designed for a Visitor.

At the end of this article, you can see a ShapeVisitor class in the final version of the original hierarchy.

Rather than repeat the details here, there is an article you can read in the C/C++ HowTos on the Visitor design pattern and how to implement it.
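
As a rough orientation only, a Visitor for this hierarchy might be arranged as sketched below. This is one possible arrangement and not necessarily the one used in the referenced article. The private Accept function, the VisitCircle, VisitCylinder and VisitPoint names, the MeasureVisitor class and the placeholder return values are all assumptions. Only the ShapeVisitor class name and the public Shape::Visitor(ShapeVisitor*) declaration come from this article.

```cpp
#include <cassert>

class Circle;
class Cylinder;
class Point;

// One visit function per derived class. The defaults do nothing, so a
// concrete visitor overrides only the cases it cares about.
class ShapeVisitor
{
public:
    virtual ~ShapeVisitor() {}
    virtual void VisitCircle(Circle&) {}
    virtual void VisitCylinder(Cylinder&) {}
    virtual void VisitPoint(Point&) {}
};

class Shape
{
public:
    virtual ~Shape() {}
    // Public and non-virtual, in keeping with the Template Method.
    void Visitor(ShapeVisitor* v) { if (v) Accept(*v); }
private:
    virtual void Accept(ShapeVisitor& v) = 0;    // each class reports its own real type
};

class Circle : public Shape
{
public:
    int Area() { return 3; }      // placeholder value
private:
    virtual void Accept(ShapeVisitor& v) { v.VisitCircle(*this); }
};
class Cylinder : public Shape
{
public:
    int Volume() { return 6; }    // placeholder value
private:
    virtual void Accept(ShapeVisitor& v) { v.VisitCylinder(*this); }
};
class Point : public Shape
{
private:
    virtual void Accept(ShapeVisitor& v) { v.VisitPoint(*this); }
};

// A concrete visitor that reaches the methods missing from Shape.
class MeasureVisitor : public ShapeVisitor
{
public:
    MeasureVisitor() : result(-1) {}
    virtual void VisitCircle(Circle& c) { result = c.Area(); }
    virtual void VisitCylinder(Cylinder& c) { result = c.Volume(); }
    int result;    // stays -1 for shapes with nothing to measure
};

// Application code: still written purely in terms of Shape.
int Measure(Shape& obj)
{
    MeasureVisitor v;
    obj.Visitor(&v);
    return v.result;
}

void CheckVisitor()
{
    Circle c;
    Cylinder cyl;
    Point p;
    assert(Measure(c) == 3);
    assert(Measure(cyl) == 6);
    assert(Measure(p) == -1);
}
```

No typecast and no run-time type test appear in the application function. The object itself selects the right visit function through an ordinary virtual call.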


Problem 4: Derived Class Destructors May Not Be Called

This problem is not obvious in the example. Here the designer didn't use destructors at all since there was nothing to clean up.

However, if you later write a derived class that does require cleanup and your derived object is deleted through a base class pointer, the destructor for your derived object will not be called because the object is deleted as a base object. When you do that, only the base class destructor is called (formally, the behavior is undefined).

In the original example, Shape does not have a destructor. That causes the compiler to use the C++ default, which is to call the destructor only on the Shape data members. In the example there are no Shape data members so nothing is actually done. If your derived class has allocated resources that need to be released when the derived object is deleted through a Shape pointer, your derived class destructor will not be called and you will leak.

You solve this problem easily by declaring a virtual destructor in the Shape base class. That prompts the compiler to call the destructor on your derived class first and then to call the destructor on the Shape base class.

Whenever a hierarchy is designed for polymorphism, there is always a possibility that a derived class will allocate resources that need to be cleaned up by a call to the derived class destructor. Therefore, all polymorphic base classes must have a virtual destructor.

Restated: Any polymorphic base class without a virtual destructor is a design error.

    class Shape
    {
    public:
        ostream& Show(ostream& os);
        virtual ~Shape();

    private:
        virtual stringstream& AMethod(stringstream& str);
    };
    Shape::~Shape()
    {

    }

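To see the virtual destructor at work, here is a small sketch. The Polygon class, the derivedDestroyed counter and the DestroyThroughBase function are hypothetical and exist only for this demonstration. A derived object that owns heap memory is destroyed through a pointer to its base class, and the counter confirms that the derived destructor ran.

```cpp
#include <cassert>
#include <memory>

static int derivedDestroyed = 0;    // counts derived destructor calls, for demonstration only

class Shape
{
public:
    virtual ~Shape() {}             // virtual: deleting through a Shape* runs the derived destructor first
};

// A derived class that owns a resource needing cleanup.
class Polygon : public Shape
{
public:
    Polygon() : vertices(new int[16]) {}
    virtual ~Polygon()
    {
        delete[] vertices;          // release the resource
        ++derivedDestroyed;
    }
private:
    Polygon(const Polygon&);              // copying disabled to keep the sketch simple
    Polygon& operator=(const Polygon&);
    int* vertices;
};

void DestroyThroughBase()
{
    std::unique_ptr<Shape> p(new Polygon);    // known only as a Shape from here on
}   // p is destroyed here: ~Polygon runs, then ~Shape

void CheckDestructor()
{
    derivedDestroyed = 0;
    DestroyThroughBase();
    assert(derivedDestroyed == 1);
}
```

If the Shape destructor were not virtual, the same deletion would have undefined behavior and, in practice, the Polygon cleanup would typically be skipped.
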
Conclusions

1) Virtual functions should be private.

The fact that coursework and textbooks use them publicly is really an artifact of the 20th century. In the beginning, this was thought to be the correct thing to do. However, by 1994 it was becoming clear that public virtual functions were a problem. The Template Method design pattern was developed to avoid them.

As a general rule any polymorphic base class with public virtual functions is a design error.

2) Polymorphic base classes should be set up to accommodate a Visitor.

This should be done even if it is considered that a Visitor will never be used. Almost certainly, if it appears a Visitor will never be needed, then it will be needed. Try to design with an eye to the future by making your current design extendable.

3) All polymorphic base classes require a virtual destructor.

That is, any polymorphic base class without a virtual destructor is a design error.

The Final Modified Example
Here is the original example containing the modifications required to solve the problems in the original design discussed above.

    #include <fstream>
    #include <iostream>
    #include <sstream>
    using namespace std;

    class ShapeVisitor;
    class Shape
    {
    public:
        ostream& Show(ostream& os);
        void Visitor(ShapeVisitor*);
        virtual ~Shape();

    private:
        virtual stringstream& AMethod(stringstream& str);
    };
    class Circle : public Shape
    {
    public:
        int Area();
    private:
        virtual stringstream& AMethod(stringstream& str);
    };
    class Cylinder : public Shape
    {
    public:
        int Volume();
    private:
        virtual stringstream& AMethod(stringstream& str);
    };
    class Point : public Shape
    {

    };

    stringstream& Shape::AMethod(stringstream& str)
    {
        //This is a hook method.
        //It provides a default value if AMethod() is
        //not supported by the derived class
        str << "Unsupported";
        return str;
    }
    Shape::~Shape()
    {

    }

    stringstream& Circle::AMethod(stringstream& str)
    {
        str << "Circle::AMethod()";
        return str;
    }

    stringstream& Cylinder::AMethod(stringstream& str)
    {
        str << "Cylinder::AMethod()";
        return str;
    }

    ostream& Application(Shape& obj, ostream& os)
    {
        obj.Show(os);

        return os;
    }
    ostream& Shape::Show(ostream& os)
    {
        stringstream str;
        this->AMethod(str);
        //Find the number of characters in the stringstream
        int len;
        str.seekg(0, ios_base::end);
        len = static_cast<int>(str.tellg());
        str.seekg(0, ios_base::beg);
        len -= static_cast<int>(str.tellg());

        if (!len)
        {
            return os;
        }

        //Top of window box
        os << "**";
        for (int i = 0; i < len; ++i)
        {
            os << "*";
        }
        os << "**" << endl;

        //Middle of window box
        os << "* " << str.str() << " *" << endl;

        //Bottom of window box
        os << "**";
        for (int i = 0; i < len; ++i)
        {
            os << "*";
        }
        os << "**" << endl;

        //End
        return os;
    }

    int main()
    {
        ofstream outfile("Data.txt");
        Circle c;
        Application(c, cout);
        Application(c, outfile);
        Cylinder cyl;
        Application(cyl, cout);
        Application(cyl, outfile);
        Point p;
        Application(p, cout);
        Application(p, outfile);

        return 0;
    }

Further Information
Refer to the book Design Patterns by Erich Gamma, et al, Addison-Wesley 1994.

This article shows only the conceptual basis of the Template Method design pattern, not the motivations for and ramifications of using this pattern.

Copyright 2008 Buchmiller Technical Associates North Bend WA USA
Apr 15 '08 #1
1 Comment


Polymorphism is an excellent tool, but research the vthunk table on how virtual functions utilize it. There is a tad bit more overhead than you may think.

So if your application uses timing-critical functions (such as writing to graphics memory, or a function which runs in a loop 500 times), a callback function (i.e. a pointer to a function) should be used, or do what you can to avoid even using that.

But otherwise, for code clarity and for object re-using, Polymorphism--using virtual functions--is the way to go : )
Aug 19 '08 #2