The C# return statement

John Bailo

The c# *return* statement has been bothering me the past few months.

I don't like the fact that you can have different code paths in a method
and have multiple return statements. To me, it would be more orthogonal
if a method could only have one return statement.

--
W '04 <:> Open Source

Jul 21 '05 #1


Tom B.

John Bailo wrote:

The c# *return* statement has been bothering me the past few months.

I don't like the fact that you can have different code paths in a method
and have multiple return statements. To me, it would be more orthogonal
if a method could only have one return statement.

To some extent, I share this sentiment :-) I find that
Delphi's/Pascal's 'result' returning mechanism is more elegant, for the
most part. However, I think there are /some/ instances where returning
before the end of a method can produce more straight-forward code.

Example of Delphi's/Pascal's returning mechanism:

function Foo: Boolean;
begin
  Result := True;
  if FooBar then
    Result := GiveMeABoolean();
  DoSomethingElse();
end; // return value is value of Result

Of course, this can easily be simulated in C# (but it involves more
statements, generally):

bool Foo()
{
    bool result = true;

    if (fooBar)
        result = GiveMeABoolean();
    DoSomethingElse();

    return result;
}

Or even better, in this case:

bool Foo()
{
    bool result = fooBar ? GiveMeABoolean() : true;
    DoSomethingElse();
    return result;
}

I guess I don't mind the C-style returning mechanism so much if it's not
abused (in my opinion, according to my own personal style), but find
Delphi's/Pascal's equivalent more elegant in most cases.

Jul 21 '05 #2

mlw

John Bailo wrote:

The c# *return* statement has been bothering me the past few months.

I don't like the fact that you can have different code paths in a method
and have multiple return statements. To me, it would be more orthogonal
if a method could only have one return statement.

The C# language is very much based on C++ and Java, multiple return
statements are part of the language.

If you have a problem with multiple return statements, don't use them.

Jul 21 '05 #3

Donovan Rebbechi

In article <26******************************@news.teranews.com>, John Bailo wrote:

The c# *return* statement has been bothering me the past few months.

I don't like the fact that you can have different code paths in a method
and have multiple return statements. To me, it would be more orthogonal
if a method could only have one return statement.

If you have exceptions, you can have different paths in a method. So unless
you're prepared to do without exceptions and check return codes every time
you tie your shoelaces, you're stuck with the perceived evils of multiple paths.

Cheers,
--
Donovan Rebbechi
http://pegasus.rutgers.edu/~elflord/

Jul 21 '05 #4

Tom B.

Donovan Rebbechi wrote:

In article <26******************************@news.teranews.com>, John Bailo wrote:
The c# *return* statement has been bothering me the past few months.

I don't like the fact that you can have different code paths in a method
and have multiple return statements. To me, it would be more orthogonal
if a method could only have one return statement.

If you have exceptions, you can have different paths in a method. So unless
you're prepared to do without exceptions and check return codes every time
you tie your shoelaces, you're stuck with the perceived evils of multiple paths.

As I say, I agree with JB to some extent in that I don't generally like
returning from methods /before/ the end, with a 'return' statement.
However, I don't feel the same with throwing exceptions since, if one's
using them correctly, one'll only be throwing them in /exceptional/
circumstances anyway, in which case one would have no desire *not* to
return from the method there and then.

Jul 21 '05 #5

Milo T.

On Thu, 08 Apr 2004 22:33:22 GMT, John Bailo wrote:

The c# *return* statement has been bothering me the past few months.

I don't like the fact that you can have different code paths in a method
and have multiple return statements. To me, it would be more orthogonal
if a method could only have one return statement.

You mean "simpler", not "orthogonal".

And it's only simpler if the logic in the function is simple. Once it gets
complex, it makes more sense to return at the point a return is required -
not waiting until later and setting up all kinds of flags and variables to
stash state until you hit the return statement.

Unless you're suggesting that a goto statement to the return would be ok?
--
People in the killfile (and whose posts I won't read) as of 4/8/2004
6:10:03 PM:
Peter Kohlmann, T.Max Devlin. Matt Templeton (scored down)

Jul 21 '05 #6

Donovan Rebbechi

In article <1m**************************@fanatastical.malaprop.net>, Milo T. wrote:

On Thu, 08 Apr 2004 22:33:22 GMT, John Bailo wrote:

The c# *return* statement has been bothering me the past few months.

I don't like the fact that you can have different code paths in a method
and have multiple return statements. To me, it would be more orthogonal
if a method could only have one return statement.

You mean "simpler", not "orthogonal".

And it's only simpler if the logic in the function is simple. Once it gets
complex, it makes more sense to return at the point a return is required -
not waiting until later and setting up all kinds of flags and variables to
stash state until you hit the return statement.

To me, very long and complex functions in an OOP language usually indicate
messy coding. One can usually break up the logic into smaller functions.
If the complexity of passing all the local data to these functions is
prohibitive (this is one of the main reasons functions aren't subdivided,
especially in C), it's usually a sign that you need to group some of this data
into objects.

Unless you're suggesting that a goto statement to the return would be ok?

Nah. He was thinking of wrapping the function body in a try block.

Cheers,
--
Donovan Rebbechi
http://pegasus.rutgers.edu/~elflord/

Jul 21 '05 #7

Linønut

Error BR-549: MS DRM 1.0 rejects the following post from Tom B.:

John Bailo wrote:

Example of Delphi's/Pascal's returning mechanism:

function Foo: Boolean;
begin
  Result := True;
  if FooBar then
    Result := GiveMeABoolean();
  DoSomethingElse();
end; // return value is value of Result

Get that ugly crap offa my screen!

<hyperventilates>

I'm no longer using Builder.
I'm no longer using Builder.
I'm no longer using Builder.

--
I tried to read "Dune" but found it a little dry.

Jul 21 '05 #8

Linønut

Error BR-549: MS DRM 1.0 rejects the following post from Donovan Rebbechi:

In article <26******************************@news.teranews.com>, John Bailo
wrote:
I don't like the fact that you can have different code paths in a method and
have multiple return statements. To me, it would be more orthogonal if a
method could only have one return statement.

If you have exceptions, you can have different paths in a method. So unless
you're prepared to do without exceptions and check return codes every time
you tie your shoelaces, you're stuck with the perceived evils of multiple paths

Multiple paths can be evil in complex code or code that is dominated by such
paths. I can still remember unrolling some Cosmic FORTRAN code into decent
structured C code.

If a routine fits on the screen, you can pretty much get away with anything.

--
Trust your data to a Linux server or desktop!

Jul 21 '05 #9

Jon Skeet [C# MVP]

[Removed comp.os.linux.advocacy, which is completely irrelevant here.]

John Bailo <ja*****@earthlink.net> wrote:

The c# *return* statement has been bothering me the past few months.

I don't like the fact that you can have different code paths in a method
and have multiple return statements. To me, it would be more orthogonal
if a method could only have one return statement.

I disagree. For instance, here's a piece of code I've been using in a
thread on the C# newsgroup recently:

public static bool IsDecimal (string data)
{
    bool gotPoint = false;

    foreach (char c in data)
    {
        if (c=='.')
        {
            if (gotPoint)
            {
                return false;
            }
            gotPoint = true;
            continue;
        }
        if (c < '0' || c > '9')
        {
            return false;
        }
    }
    return true;
}

(This isn't designed to be culture-sensitive, or work with +/- etc -
it's just an example.)

To avoid multiple returns, you end up having to introduce another local
variable, and break out of the loop when you know what the result is
going to be. You could do the breaking part using a while loop instead,
but then the iteration becomes less readable. Either way, you've still
got the extra local variable.
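
For comparison, a single-return version might look roughly like the following. This is only a sketch, and it is written in C++ rather than C# so that it compiles on its own; the name isDecimalSingleExit is made up for the illustration. The extra local (result) and the two breaks are exactly the overhead described above.

```cpp
#include <cassert>
#include <string>

// Single-return variant of the IsDecimal example, ported to C++.
// `result` and the `break` statements stand in for the early returns.
bool isDecimalSingleExit(const std::string& data)
{
    bool gotPoint = false;
    bool result = true;  // assumed valid until a bad character is seen

    for (char c : data)
    {
        if (c == '.')
        {
            if (gotPoint)
            {
                result = false;  // second decimal point
                break;
            }
            gotPoint = true;
        }
        else if (c < '0' || c > '9')
        {
            result = false;  // not a digit
            break;
        }
    }
    return result;  // the only return in the function
}
```

Like the original, this accepts an empty string and a lone "." as valid; it is not meant to be a real parser.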

The way I see it, "return" is a powerful way of saying "If you've got
here, you know the result of the method - none of the rest of the
code/state is relevant."

As others have mentioned, if you don't like using it, you don't have to
- you can always put that extra local variable in, along with all the
breaks you need, and then return at the end of the method. No-one's
forcing you to write code like the above - but I'm glad I'm not being
forced *not* to write it, either.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too

Jul 21 '05 #10

Milo T.

On Fri, 9 Apr 2004 02:39:43 +0000 (UTC), Donovan Rebbechi wrote:

In article <1m**************************@fanatastical.malaprop.net>, Milo T. wrote:
On Thu, 08 Apr 2004 22:33:22 GMT, John Bailo wrote:

The c# *return* statement has been bothering me the past few months.

I don't like the fact that you can have different code paths in a method
and have multiple return statements. To me, it would be more orthogonal
if a method could only have one return statement.

You mean "simpler", not "orthogonal".

And it's only simpler if the logic in the function is simple. Once it gets

To me, very long and complex functions in an OOP language usually indicate
messy coding. One can usually break up the logic into smaller functions.
If the complexity of passing all the local data to these functions is
prohibitive (this is one of the main reasons functions aren't subdivided,
especially in C), it's usually a sign that you need to group some of this data
into objects.

Yes, generally I would agree but sometimes breaking things down into
objects is something you don't want to or cannot do (for performance
reasons).

Besides, I find the first clearer than the second personally...

ValueType Function(EnumType a) {

    switch(a)
    {
    case 1:
        {
            DoSomething();
            DoSomethingElse();
            return DoSomethingNew();
        }
    default:
        ASSERT("Shouldn't reach here - bad switch value");
        // intentional fallthrough
    case 2:
        {
            DoSomething();
            DoSomethingElse();
            return DoSomethingNew();
        }
    }

}

ValueType Function(EnumType a) {

    ValueType v;

    switch(a)
    {
    case 1:
        {
            DoSomething();
            DoSomethingElse();
            v=DoSomethingNew();
            break;
        }
    default:
        ASSERT("Shouldn't reach here - bad switch value");
        // intentional fallthrough
    case 2:
        {
            DoSomething();
            DoSomethingElse();
            v=DoSomethingNew();
            break;
        }
    }

    return v;

}
--
People in the killfile (and whose posts I won't read) as of 4/9/2004
12:11:27 AM:
Peter Kohlmann, T.Max Devlin. Matt Templeton (scored down)

Jul 21 '05 #11

John Bailo

Milo T. wrote:

Besides, I find the first clearer than the second personally...


See, that's exactly what I don't like -- having conditional multiple
returns.

What I propose is the following. A method declaration, which is normally:

public type funname(type param)

would now be

public @returnVar=null type funname(type param)

in this case, @returnVar is declared as part of the method declaration, along
with its default value, and so is inherent in the method. thus funname can
never fail to return a value or have an ambiguous code path. yet, within the
method, the value of @returnVar can be reassigned as often as needed.

--
W '04 <:> Open

Jul 21 '05 #12

Whelp

Tom B. wrote:

Or even better, in this case:

bool Foo()
{
    bool result = fooBar ? GiveMeABoolean() : true;
    DoSomethingElse();
    return result;
}

I like:

bool Foo()
{
    bool result = !fooBar || GiveMeABoolean();
    DoSomethingElse();
    return result;
}
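
The two forms should agree for every combination of inputs, and in both of them GiveMeABoolean() is only called when fooBar is true. A small sketch to check that, in C++ so it compiles standalone; giveMeABoolean, viaConditional and viaShortCircuit are hypothetical stand-ins, and the call counter exists only to make the short-circuiting visible:

```cpp
#include <cassert>

// Hypothetical stand-in for GiveMeABoolean(): returns the value it is given
// and counts how many times it was called.
static int g_calls = 0;
bool giveMeABoolean(bool value)
{
    ++g_calls;
    return value;
}

// Conditional form, as in the quoted post.
bool viaConditional(bool fooBar, bool value)
{
    return fooBar ? giveMeABoolean(value) : true;
}

// Short-circuit form: the right operand of || is evaluated only when
// !fooBar is false, i.e. when fooBar is true.
bool viaShortCircuit(bool fooBar, bool value)
{
    return !fooBar || giveMeABoolean(value);
}
```

Which of the two reads better is a matter of taste; the behaviour is the same.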

ACK on the C/Pascal style thing.

Regards,
Whelp

Jul 21 '05 #13

cody

> The c# *return* statement has been bothering me the past few months.

I don't like the fact that you can have different code paths in a method
and have multiple return statements. To me, it would be more orthogonal
if a method could only have one return statement.

even in pascal/delphi you can exit a method prematurely (with Exit), which is
often useful.

--
cody

[Freeware, Games and Humor]
www.deutronium.de.vu || www.deutronium.tk

Jul 21 '05 #14

Donovan Rebbechi

In article <1w**************************@fanatastical.malaprop.net>, Milo T. wrote:

Yes, generally I would agree but sometimes breaking things down into
objects is something you don't want to or cannot do (for performance
reasons).

Seriously ? I mean, can't a switch be performed via table lookup as opposed to
an if/then/else ?

switch is one of my least favourite constructs in code. It can always be
replaced with table lookup or polymorphism. Both of these would better
address the "bad switch value" issue.

Cheers,
--
Donovan Rebbechi
http://pegasus.rutgers.edu/~elflord/

Jul 21 '05 #15

Milo T.

On Fri, 9 Apr 2004 13:57:04 +0000 (UTC), Donovan Rebbechi wrote:

In article <1w**************************@fanatastical.malaprop.net>, Milo T. wrote:
Yes, generally I would agree but sometimes breaking things down into
objects is something you don't want to or cannot do (for performance
reasons).

Seriously ? I mean, can't a switch be performed via table lookup as opposed to
an if/then/else ?

switch is one of my least favourite constructs in code. It can always be
replaced with table lookup or polymorphism. Both of these would better
address the "bad switch value" issue.

Well, any good compiler will take a switch statement and turn it into a
lookup table for you.

As for polymorphism - in C#, I guess there's no real impact. In C++, you're
now carrying around a vtable.

*shrugs* All depends on what you're optimizing for, I guess.
--
People in the killfile (and whose posts I won't read) as of 4/9/2004
9:13:52 AM:
Peter Kohlmann, T.Max Devlin. Matt Templeton (scored down)

Jul 21 '05 #16

Milo T.

On Fri, 9 Apr 2004 13:57:04 +0000 (UTC), Donovan Rebbechi wrote:

switch is one of my least favourite constructs in code. It can always be
replaced with table lookup or polymorphism. Both of these would better
address the "bad switch value" issue.

Oh, and the other reason to use a switch:

Object* ConstructMeAnObject(unsigned short idvalue)
{
    switch (idvalue)
    {
    case Square:
        return new Square();
    case Circle:
        return new Circle();
    case Pentagon:
        return new Pentagon();
    case Sheep:
        return new Sheep();
    }
}

... which is the kind of code that gets important pretty quickly once you
start worrying about serialization of data over a network, or to and from
files.

Sure, you can craft your own table of values and function pointers, but
it's not as readable - and will never be as readable until C++/C/C# gets
better support for table-driven data structures.

Although I'd find better table-driven support a bit of a boon right now. At
the moment I have to tie a UI control to an internal lookup ID, to a member
on a network serialized structure, to a scaling ID, to an output data
structure.

Sure would be nice to be able to just fill in a table, and get that to
spill out all of the interconnections instead of having to hack them in
manually.

--
People in the killfile (and whose posts I won't read) as of 4/9/2004
9:21:11 AM:
Peter Kohlmann, T.Max Devlin. Matt Templeton (scored down)

Jul 21 '05 #17

Donovan Rebbechi

In article <1f*************************@fanatastical.malaprop.net>, Milo T. wrote:

On Fri, 9 Apr 2004 13:57:04 +0000 (UTC), Donovan Rebbechi wrote:
switch is one of my least favourite constructs in code. It can always be
replaced with table lookup or polymorphism. Both of these would better
address the "bad switch value" issue.

Oh, and the other reason to use a switch:

Object* ConstructMeAnObject(unsigned short idvalue)
{
    switch (idvalue)
    {
    case Square:
        return new Square();
    case Circle:
        return new Circle();
    case Pentagon:
        return new Pentagon();
    case Sheep:
        return new Sheep();
    }
}

... which is the kind of code that gets important pretty quickly once you
start worrying about serialization of data over a network, or to and from
files.

Better, because this kind of code usually consolidates the switch in the one
place. Note that this sort of construction doesn't come up that many times
in the same program, because you only have one of these for each family of
classes.

Sure, you can craft your own table of values and function pointers, but
it's not as readable - and will never be as readable until C++/C/C# gets
better support for table-driven data structures.

see std::map. The clean way to implement the above in c++ is to have a
std::map<int, Object*> and use a clone() method to get your prototype.
The translation unit that defines the subclass in question takes care of
inserting the instance into the map.

e.g.

template <class T> struct insert_into_map {
    insert_into_map(const std::string& s) {
        T* x = new T;
        the_map().insert(std::make_pair(s, x)); // the_map() returns the global table
    }
};

class Circle : public Object { // assumed to derive from Object and implement clone()

....

};
namespace {
    insert_into_map<Circle> x("Circle");
}

Object* make_object(const std::string& id) {
    std::map<std::string, Object*>::iterator it = the_map().find(id);
    if (it == the_map().end())
        return NULL;
    else
        return it->second->clone();
}

This also has the advantage that you can populate the map at runtime if you
dynamically load the code that defines circle (so you get runtime plugin
support)
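
Filled out so that it compiles on its own, the idea might look something like the sketch below. The Object base class, its name() method and the register_circle variable are assumptions made for the illustration (the post only names Object and clone()), and the_map() is written with a function-local static so that registration does not depend on initialization order between translation units.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Assumed base class; only `Object` and `clone()` appear in the post.
struct Object
{
    virtual ~Object() {}
    virtual Object* clone() const = 0;
    virtual std::string name() const = 0;  // added so the result can be inspected
};

// The global table of prototypes, created on first use.
std::map<std::string, Object*>& the_map()
{
    static std::map<std::string, Object*> table;
    return table;
}

// Constructing one of these registers a prototype of T under the key `id`.
// The prototypes live for the whole program and are never deleted here.
template <class T>
struct insert_into_map
{
    explicit insert_into_map(const std::string& id)
    {
        the_map().insert(std::make_pair(id, static_cast<Object*>(new T)));
    }
};

struct Circle : Object
{
    Object* clone() const override { return new Circle(*this); }
    std::string name() const override { return "Circle"; }
};

namespace
{
    // Runs during static initialization of this translation unit.
    insert_into_map<Circle> register_circle("Circle");
}

// Returns a fresh copy of the registered prototype, or nullptr for an
// unknown id. The caller owns the returned object.
Object* make_object(const std::string& id)
{
    std::map<std::string, Object*>::iterator it = the_map().find(id);
    if (it == the_map().end())
        return nullptr;
    return it->second->clone();
}
```

With this in place, make_object("Circle") hands back a new Circle that the caller must delete, and an unregistered name such as "Sheep" gives a null pointer.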
Cheers,
--
Donovan Rebbechi
http://pegasus.rutgers.edu/~elflord/

Jul 21 '05 #18

Ian Hilliard

On Fri, 09 Apr 2004 16:26:05 +0000, Milo T. wrote:

On Fri, 9 Apr 2004 13:57:04 +0000 (UTC), Donovan Rebbechi wrote:
switch is one of my least favourite constructs in code. It can always be
replaced with table lookup or polymorphism. Both of these would better
address the "bad switch value" issue.

Oh, and the other reason to use a switch:

Object* ConstructMeAnObject(unsigned short idvalue)
{
    switch (idvalue)
    {
    case Square:
        return new Square();
    case Circle:
        return new Circle();
    case Pentagon:
        return new Pentagon();
    case Sheep:
        return new Sheep();
    }
}

... which is the kind of code that gets important pretty quickly once you
start worrying about serialization of data over a network, or to and from
files.

Sure, you can craft your own table of values and function pointers, but
it's not as readable - and will never be as readable until C++/C/C# gets
better support for table-driven data structures.

Although I'd find better table-driven support a bit of a boon right now. At
the moment I have to tie a UI control to an internal lookup ID, to a member
on a network serialized structure, to a scaling ID, to an output data
structure.

Sure would be nice to be able to just fill in a table, and get that to
spill out all of the interconnections instead of having to hack them in
manually.

This is a good one. If the idvalue is other than Square, Circle, Pentagon
or Sheep, then you have no return. To that I have to say that one of the
tenets of good coding states that I cannot trust an object where I do
not control the memory in which it resides. The returned object is not in
the caller's control and hence may be invalid. This may be fast but it is
very dangerous. There are far better patterns where reliability is required,
and isn't that all the time?

Ian

Jul 21 '05 #19

Daniel O'Connell [C# MVP]

"Donovan Rebbechi" <ab***@aol.com> wrote in message
news:sl******************@panix2.panix.com...

In article <1f*************************@fanatastical.malaprop.net>, Milo T. wrote:

Sure, you can craft your own table of values and function pointers, but
it's not as readable - and will never be as readable until C++/C/C# gets
better support for table-driven data structures.

see std::map. The clean way to implement the above in c++ is to have a
std::map<int, Object*> and use a clone() method to get your prototype.
The translation unit that defines the subclass in question takes care of
inserting the instance into the map.

This also has the advantage that you can populate the map at runtime if
you
dynamically load the code that defines circle (so you get runtime plugin
support)

IMHO, any solution to a problem so simple as choosing a path from 3-5
static options that uses templates and complex logic is really overthinking
the problem. The resultant code is going to be slower, the time to fix bugs
is going to increase, and the code understandability goes down.

Also, in the case you want dynamic lookups, I would recommend a factory
approach over a cloning approach. That not only allows you dynamic lookups,
but also allows dynamic parameters and saves you from having to design a
class so it can be instantiated without actually doing its work.
The equivalent C# code (using generics partially) would be...

using System.Collections.Generic;

public interface IObjectFactory
{
    object CreateObject(object[] arguments);
}

public class MapLookupClass
{
    Dictionary<short, IObjectFactory> dict = new Dictionary<short, IObjectFactory>();

    public void InsertIntoMap(short id, IObjectFactory factory)
    {
        dict.Add(id, factory);
    }

    public object MakeObject(short id, object[] arguments)
    {
        IObjectFactory fac;
        fac = dict[id]; // throws KeyNotFoundException for an unregistered id
        return fac.CreateObject(arguments);
    }
}

It is more readable than the C++ approach, IMHO, but it is still
considerably less readable than a simple switch.
It doesn't strive to achieve automatic registration as yours does, although
I think that is possible I don't have a compiler on hand to write something
to test it and get the syntax right ATM. I would probably use attributes and
reflection anyway.


Jul 21 '05 #20

Milo T.

On Fri, 09 Apr 2004 21:03:10 +0200, Ian Hilliard wrote:


This is a good one. If the idvalue is other than Square, Circle, Pentagon
or Sheep, then you have no return. To that I have to say that one of the
tenets of good coding states that I cannot trust an object where I do
not control the memory in which it resides. The returned object is not in
the caller's control and hence may be invalid. This may be fast but it is
very dangerous. There are far better patterns where reliability is required,
and isn't that all the time?

Well, I wasn't trying to write solid and robust code, I was trying to spend
5 seconds coming up with examples where switch statements would be useful.

However, I appreciate your willingness in becoming a code reviewer for me,
though I would like to point out that the correct way to solve the above
problem is merely to add a default: return NULL; or default: throw
exception("Unknown object type in deserialization"); to the bottom of the
statement.
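
Spelled out, the default-branch fix might look like this. It is a sketch only: the ShapeId enum, the two-class hierarchy and the name() method are invented for the illustration, since the original snippet leaves Square, Circle and friends undefined, and std::runtime_error stands in for the unspecified exception type.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical ids and classes, just enough to make the switch compile.
enum ShapeId : unsigned short { SquareId = 1, CircleId = 2 };

struct Object
{
    virtual ~Object() {}
    virtual std::string name() const = 0;
};
struct Square : Object { std::string name() const override { return "Square"; } };
struct Circle : Object { std::string name() const override { return "Circle"; } };

// Every path through the switch now either returns an object or throws,
// so an unknown id can no longer fall off the end of the function.
Object* constructMeAnObject(unsigned short idvalue)
{
    switch (idvalue)
    {
    case SquareId:
        return new Square();
    case CircleId:
        return new Circle();
    default:
        throw std::runtime_error("Unknown object type in deserialization");
    }
}
```

Returning NULL from the default branch instead would keep the function exception-free, at the cost of every caller having to check the result.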

In future, I will remember to provide you with a fully compiled, zipped up
source tree with a make file instead of just a quick off the cuff example.

By the way, you missed the fact that I was missing a main() function as
well.
--
People in the killfile (and whose posts I won't read) as of 4/9/2004
1:52:27 PM:
Peter Kohlmann, T.Max Devlin. Matt Templeton (scored down)

Jul 21 '05 #21

Milo T.

On Fri, 9 Apr 2004 18:51:24 +0000 (UTC), Donovan Rebbechi wrote:

see std::map. The clean way to implement the above in c++ is to have a
std::map<int, Object*> and use a clone() method to get your prototype.
The translation unit that defines the subclass in question takes care of
inserting the instance into the map.

This also has the advantage that you can populate the map at runtime if you
dynamically load the code that defines circle (so you get runtime plugin
support)

It also has the disadvantage that you're using variable length strings as
your key for your map, you're using a tree for the lookup instead of a
hashtable or a trie (woe betide if you ever have to use this in a fast
fashion), and it adds to your initialization time. Not to mention the
potential memory cost of all of these Object's that you need to clone
later.

There's no point making code extensible unless you plan to extend it. Wrap
it in a function, and you can replace with an extensible version later.

Most cases, however, will survive perfectly well with the much faster
switch() statement.
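
If the extensible version is ever wanted, one way to address those particular objections is a hash table keyed by the numeric id that stores factory functions rather than prototype objects. The sketch below assumes C++11's std::unordered_map (TR1 and vendor hash_map extensions are the older equivalents); the ids, the classes and the create<T> helper are all made up for the example, and whether it actually beats std::map or a switch would have to be measured.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical class family.
struct Object
{
    virtual ~Object() {}
    virtual std::string name() const = 0;
};
struct Square : Object { std::string name() const override { return "Square"; } };
struct Sheep : Object { std::string name() const override { return "Sheep"; } };

// A factory is a plain function pointer, so nothing is constructed until
// an object is actually requested.
typedef Object* (*Factory)();

template <class T>
Object* create() { return new T(); }

// Hypothetical numeric ids, fixed-size keys instead of strings.
const unsigned short kSquare = 1;
const unsigned short kSheep = 2;

std::unordered_map<unsigned short, Factory>& factories()
{
    static std::unordered_map<unsigned short, Factory> table = {
        { kSquare, &create<Square> },
        { kSheep, &create<Sheep> },
    };
    return table;
}

// Returns a new object for a known id, or nullptr otherwise.
// The caller owns the result.
Object* make_object(unsigned short id)
{
    auto it = factories().find(id);
    return it == factories().end() ? nullptr : it->second();
}
```

Wrapped behind the same function signature, this could replace the switch later without touching the callers.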

--
People in the killfile (and whose posts I won't read) as of 4/9/2004
1:56:00 PM:
Peter Kohlmann, T.Max Devlin. Matt Templeton (scored down)

Jul 21 '05 #22

The Ghost In The Machine

In comp.os.linux.advocacy, John Bailo
<ja*****@earthlink.net>
wrote
on Thu, 08 Apr 2004 22:33:22 GMT
<26******************************@news.teranews.com>:

The c# *return* statement has been bothering me the past few months.

I don't like the fact that you can have different code paths in a method
and have multiple return statements. To me, it would be more orthogonal
if a method could only have one return statement.

Java has the exact same problem.

I'll admit to some curiosity but one way around this issue may be
at the code generation level of the compiler: briefly, if one codes

public int doSomething()
{
    if(badThing())
        return -1;

    /* something */

    if(anotherbadThing())
        return -2;

    int retv = /* whatever */;

    /* more something */

    return retv;
}

one might get:

* * *

_doSomething:

    CALLS #0, badThing
    TST.L R0
    JZ $1
    MOV.L #-1, R0
    JMP _doSomething$Return

$1:
    /* something */

    CALLS #0, anotherbadThing
    TST.L R0
    JZ $2
    MOV.L #-2, R0
    JMP _doSomething$Return

$2:

    /* whatever */

    MOV.L (whatever),R1 ; we're assuming retv is cached in a reg here

    /* more something */

    MOV.L R1, R0

_doSomething$Return:

    RET

* * *

where only one RET is in the routine -- and the compiler is responsible
for ensuring that each other return statement has R0 loaded properly
and jumping thereto.

(The assembly language is a corruption of VAX assembly, which I happen
to like. MOV a, b stores a into b. '#' indicates immediate.
R0 and R1 are registers. TST.L tests a value and sets condition
flags which JZ tests. JMP, erm, jumps. $n is a local label, a
useful concept in some assemblers. CALLS calls a routine, using
parameters on the stack; VAX also supports CALLG, which uses
a preformatted argument list -- useful for very old FORTRAN and
COBOL dialects that did not support recursion. RET of course returns.)
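
At source level, the same shape can be approximated by hand with a result variable and a jump to a single exit label. This is a sketch in C++, not what any particular compiler emits; the two bool parameters stand in for the badThing() and anotherbadThing() calls, and 42 is a placeholder for "whatever".

```cpp
#include <cassert>

// `result` plays the role of R0; `done` plays the role of
// _doSomething$Return. It is declared without an initializer so that the
// gotos do not jump over an initialization.
int doSomething(bool badThing, bool anotherBadThing)
{
    int result;

    if (badThing)
    {
        result = -1;
        goto done;
    }

    /* something */

    if (anotherBadThing)
    {
        result = -2;
        goto done;
    }

    result = 42;  // placeholder for "whatever"

    /* more something */

done:
    return result;  // the only return in the routine
}
```

Whether this is any clearer than the three-return original is, of course, the philosophical part.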

Of course I for one feel this is mostly a philosophical dispute.
If the issue is code clarity, one can work around it in various fashions.

--
#191, ew****@earthlink.net
It's still legal to go .sigless.

Jul 21 '05 #23

John Bailo

The Ghost In The Machine wrote:

In comp.os.linux.advocacy, John Bailo
<ja*****@earthlink.net>
wrote
on Thu, 08 Apr 2004 22:33:22 GMT
<26******************************@news.teranews.com>:

The c# *return* statement has been bothering me the past few months.

I don't like the fact that you can have different code paths in a method
and have multiple return statements. To me, it would be more orthogonal
if a method could only have one return statement.

Java has the exact same problem.

I'll admit to some curiosity but one way around this issue may be
at the code generation level of the compiler: briefly, if one codes

public int doSomething()
{
    if(badThing())
        return -1;

    /* something */

    if(anotherbadThing())
        return -2;

    int retv = /* whatever */;

    /* more something */

    return retv;
}

one might get:

* * *

_doSomething:

CALLS #0, badThing
TST.L R0
JZ $1
MOV.L #-1, R0
JMP _doSomething$Return

$1:
/* something */

CALLS #0, anotherbadThing
TST.L R0
JZ $2
MOV.L #-2, R0
JMP _doSomething$Return

$2:

/* whatever */

MOV.L (whatever),R1 ; we're assuming retv is cached in a reg here

/* more something */

MOV.L R1, R0

_doSomething$Return:

RET

* * *

where only one RET is in the routine -- and the compiler is responsible
for ensuring that each other return statement has R0 loaded properly
and jumping thereto.

(The assembly language is a corruption of VAX assembly, which I happen
to like. MOV a, b stores a into b. '#' indicates immediate.
R0 and R1 are registers. TST.L tests a value and sets condition
flags which JZ tests. JMP, erm, jumps. $n is a local label, a
useful concept in some assemblers. CALLS calls a routine, using
parameters on the stack; VAX also supports CALLG, which uses
a preformatted argument list -- useful for very old FORTRAN and
COBOL dialects that did not support recursion. RET of course returns.)

Of course I for one feel this is mostly a philosophical dispute.
If the issue is code clarity, one can work around it in various fashions.

what bothers me is that /return/ is both a *path* and a *value*

it sets a value to be passed to a method

*and*

it is a control statement -- offering break;

to me -- not good...

--
W '04 <:> Open

Jul 21 '05 #24

Donovan Rebbechi

In article <ke**************************@fanatastical.malaprop.net>, Milo T. wrote:

It also has the disadvantage that you're using variable length strings as
your key for your map, you're using a tree for the lookup instead of a
hashtable or a trie (woe betide if you ever have to use this in a fast
fashion), and it adds to your initialization time. Not to mention the

My understanding is that dynamic allocation is pretty slow so it should
dwarf the other stuff, but I don't really know because I haven't benchmarked
it -- never needed this idiom for fast allocation of small objects (I try to
avoid heavily dynamic code for very small objects that need to be processed
quickly). Could also depend partly on how good the allocator is.

Cheers,
--
Donovan Rebbechi
http://pegasus.rutgers.edu/~elflord/

Jul 21 '05 #25

cody

I don't like the fact that you can have different code paths in a method
and have multiple return statements. To me, it would be more orthogonal
if a method could only have one return statement.

Java has the exact same problem.

I'll admit to some curiosity but one way around this issue may be
at the code generation level of the compiler: briefly, if one codes

[..] where only one RET is in the routine -- and the compiler is responsible
for ensuring that each other return statement has R0 loaded properly
and jumping thereto.

(The assembly language is a corruption of VAX assembly, which I happen
to like. MOV a, b stores a into b. '#' indicates immediate.
R0 and R1 are registers.

Since when does the JVM use registers? It is stack-based.

--
cody

[Freeware, Games and Humor]
www.deutronium.de.vu || www.deutronium.tk

Jul 21 '05 #26

Daniel O'Connell [C# MVP]

"cody" <pl*************************@gmx.de> wrote in message
news:%2****************@tk2msftngp13.phx.gbl...

I don't like the fact that you can have different code paths in a method
and have multiple return statements. To me, it would be more orthogonal
if a method could only have one return statement.

Java has the exact same problem.

I'll admit to some curiosity but one way around this issue may be
at the code generation level of the compiler: briefly, if one codes

[..]
where only one RET is in the routine -- and the compiler is responsible
for ensuring that each other return statement has R0 loaded properly
and jumping thereto.

(The assembly language is a corruption of VAX assembly, which I happen
to like. MOV a, b stores a into b. '#' indicates immediate.
R0 and R1 are registers.

Since when does the JVM use registers? It is stackbased.

JIT'ers are pretty much free to use whatever registers the CPU exposes as they
wish; only the virtual machine itself is stack-based. The above assembly
is a pretty bad choice to illustrate the point, considering it's considerably
different from the architecture we are used to and it doesn't illustrate the
bytecode emitted by C#\Java\what have you. However, the same thing could be
shown in IL or x86 assembly. You rarely see more than one ret instruction
in a routine, although I'm pretty sure that's nothing more than an assembler
restriction in most cases. I believe some assemblers I've used use ret to
close a proc instead of a specific end proc keyword or whatnot. There is no
restriction in any CPU I know of that states only one ret per proc (in fact,
no CPUs I know well enough to say much about know what a proc is). Virtual
machines like the CLR and the JVM may require only one ret per routine, but
that would likely be determined by the verifier, not the execution unit
itself.

I would think the most common reason for ret being at the end is that much
of the time, a return requires restoring the register state or any other
state to what it was before the routine call, excluding of course the
changes the routine did. It is considerably more efficient to place that code
in the routine once instead of writing it for every return point. Hence a
jump is logical.
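
At the C++ source level, the "write the restore code once" idea is usually expressed with a destructor rather than a jump. What follows is a minimal sketch under stated assumptions: g_depth, DepthGuard and classify() are made-up names for illustration, and g_depth merely stands in for whatever state a routine has to put back before it returns.

```cpp
#include <vector>

// Hypothetical piece of state a routine must restore before returning.
static int g_depth = 0;

// Guard whose destructor plays the role of the shared epilogue.
struct DepthGuard {
    DepthGuard()  { ++g_depth; }
    ~DepthGuard() { --g_depth; }
};

// Several return statements; the restore code is written only once.
int classify(const std::vector<int>& values) {
    DepthGuard guard;
    if (values.empty())
        return -1;                            // guard restores g_depth here
    if (values.front() < 0)
        return -2;                            // ... and here
    return static_cast<int>(values.size());   // ... and here
}
```

Whichever return is taken, g_depth is back at its previous value once classify() has returned, so the number of return statements no longer affects how often the cleanup has to be written.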

This isn't the case in a higher level language, where the compiler worries
about maintaining state.

--
cody

[Freeware, Games and Humor]
www.deutronium.de.vu || www.deutronium.tk

Jul 21 '05 #29

Ian Hilliard

On Fri, 09 Apr 2004 20:54:52 +0000, Milo T. wrote:

On Fri, 09 Apr 2004 21:03:10 +0200, Ian Hilliard wrote:
On Fri, 09 Apr 2004 16:26:05 +0000, Milo T. wrote:
On Fri, 9 Apr 2004 13:57:04 +0000 (UTC), Donovan Rebbechi wrote:
switch is one of my least favourite constructs in code. It can always
be replaced with table lookup or polymorphism. Both of these would
better address the "bad switch value" issue.

Oh, and the other reason to use a switch:

Object* ConstructMeAnObject(unsigned short idvalue) {
    switch (idvalue)
    {
        case Square:
            return new Square();
        case Circle:
            return new Circle();
        case Pentagon:
            return new Pentagon();
        case Sheep:
            return new Sheep();
    }
}

... which is the kind of code that gets important pretty quickly once
you start worrying about serialization of data over a network, or to and
from files.

Sure, you can craft your own table of values and function pointers,
but it's not as readable - and will never be as readable until
C++/C/C# gets better support for table-driving data structures.

Although I'd find better table-driven support a bit of a boon right
now. At the moment I have to tie a UI control to an internal lookup
ID, to a member on a network serialized structure, to a scaling ID, to
an output data structure.

Sure would be nice to be able to just fill in a table, and get that to
spill out all of the interconnections instead of having to hack them
in manually.
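
For comparison, a table of id values and function pointers might look roughly like the sketch below in C++14. None of it comes from the posts: the class names, the id values and the make<T>() helper are all hypothetical, and unknown ids are reported with an exception, which is one of the two fixes suggested elsewhere in the thread.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical hierarchy standing in for Object/Square/Circle.
struct Object {
    virtual ~Object() = default;
    virtual std::string name() const = 0;
};
struct SquareObj : Object { std::string name() const override { return "Square"; } };
struct CircleObj : Object { std::string name() const override { return "Circle"; } };

// Hypothetical type ids as they might appear in a serialized stream.
enum TypeId : unsigned short { kSquare = 1, kCircle = 2 };

// Each table entry is a plain function pointer that builds one type.
using Factory = std::unique_ptr<Object> (*)();

template <typename T>
std::unique_ptr<Object> make() { return std::make_unique<T>(); }

// The table itself: one row per type id.
const std::map<unsigned short, Factory>& factoryTable() {
    static const std::map<unsigned short, Factory> table = {
        {kSquare, &make<SquareObj>},
        {kCircle, &make<CircleObj>},
    };
    return table;
}

// Looks the id up in the table; unknown ids are reported, not ignored.
std::unique_ptr<Object> constructObject(unsigned short id) {
    auto it = factoryTable().find(id);
    if (it == factoryTable().end())
        throw std::invalid_argument("Unknown object type in deserialization");
    return it->second();
}
```

Adding a type means adding one row to the table; whether that reads better or worse than the switch is exactly the judgment call being argued here.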

This is a good one. If the idvalue is other than Square, Circle,
Pentagon or Sheep, then you have no return. To that I have to say that
one of the tenets of good coding states that I cannot trust an object
where I do not control the memory in which it resides. The returned
object is not in the caller's control and hence may be invalid. This may
be fast but it is very dangerous. There are far better patterns where
reliability is required, and isn't that all the time?

Well, I wasn't trying to write solid and robust code, I was trying to
spend 5 seconds coming up with examples where switch statements would be
useful.

However, I appreciate your willingness in becoming a code reviewer for
me, though I would like to point out that the correct way to solve the
above problem is merely to add a default: return NULL; or default: throw
exception("Unknown object type in deserialization"); to the bottom of
the statement.

In future, I will remember to provide you with a fully compiled, zipped
up source tree with a make file instead of just a quick off the cuff
example.

By the way, you missed the fact that I was missing a main() function as
well.

I didn't know you were looking for a code review for this code fragment.
Returning NULL could cause undefined results in the calling class. An
exception should only be used in exceptional cases. The best solution is
to have a null class, which inherits from Object, which is to be returned
in the error case. The null class can be tested on return to ensure that
it is NULL, without causing anything to break should it be used without
first being tested.

i.e.
//Allocation of static
NullObject MyClass::nullObject;

Object* MyClass::ConstructMeAnObject(unsigned short idvalue) {
    switch (idvalue)
    {
        case Square:
            return new Square();
        case Circle:
            return new Circle();
        case Pentagon:
            return new Pentagon();
        case Sheep:
            return new Sheep();
        default:
            // Unknown id: hand back the shared null object instead of NULL.
            return &nullObject;
    }
}

I won't go into the fact that returning a reference is considered more
reliable than returning a pointer. Also, it is often better to use
precreated objects from a pool rather than creating new ones.
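
A self-contained version of the null-object idea, combined with the reference and pre-created-object remarks, could look something like this. It is a sketch only: isNull(), describe(), lookupObject() and the id value 1 are assumptions made for the example and are not part of the fragment above.

```cpp
#include <string>

// Hypothetical base class; isNull() is the test performed on return.
class Object {
public:
    virtual ~Object() = default;
    virtual bool isNull() const { return false; }
    virtual std::string describe() const = 0;
};

class Sheep : public Object {
public:
    std::string describe() const override { return "sheep"; }
};

// Null object: safe to call, but reports that it is only a placeholder.
class NullObject : public Object {
public:
    bool isNull() const override { return true; }
    std::string describe() const override { return "(nothing)"; }
};

// Returns a reference to a pre-created object; unknown ids get the
// shared null object rather than a null pointer.
const Object& lookupObject(unsigned short id) {
    static const Sheep sheep{};
    static const NullObject nullObject{};
    switch (id) {
        case 1:  return sheep;
        default: return nullObject;
    }
}
```

One caveat with mixing the two styles: if some cases return objects created with new and the default case returns the address of a static, the caller cannot tell from the pointer alone whether it is supposed to delete the result, which is why this sketch hands out references to pre-created objects in every case.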

Ian

Jul 21 '05 #30

Milo T.

On Sat, 10 Apr 2004 22:18:16 +0200, Ian Hilliard wrote:

On Fri, 09 Apr 2004 20:54:52 +0000, Milo T. wrote:
On Fri, 09 Apr 2004 21:03:10 +0200, Ian Hilliard wrote:
On Fri, 09 Apr 2004 16:26:05 +0000, Milo T. wrote:

On Fri, 9 Apr 2004 13:57:04 +0000 (UTC), Donovan Rebbechi wrote:
> switch is one of my least favourite constructs in code. It can always
> be replaced with table lookup or polymorphism. Both of these would
> better address the "bad switch value" issue.

Oh, and the other reason to use a switch:

Object* ConstructMeAnObject(unsigned short idvalue) {
    switch (idvalue)
    {
        case Square:
            return new Square();
        case Circle:
            return new Circle();
        case Pentagon:
            return new Pentagon();
        case Sheep:
            return new Sheep();
    }
}

... which is the kind of code that gets important pretty quickly once
you start worrying about serialization of data over a network, or to and
from files.

Sure, you can craft your own table of values and function pointers,
but it's not as readable - and will never be as readable until
C++/C/C# gets better support for table-driving data structures.

Although I'd find better table-driven support a bit of a boon right
now. At the moment I have to tie a UI control to an internal lookup
ID, to a member on a network serialized structure, to a scaling ID, to
an output data structure.

Sure would be nice to be able to just fill in a table, and get that to
spill out all of the interconnections instead of having to hack them
in manually.

This is a good one. If the idvalue is other than Square, Circle,
Pentagon or Sheep, then you have no return. To that I have to say that
one of the tenets of good coding states that I cannot trust an object
where I do not control the memory in which it resides. The returned
object is not in the caller's control and hence may be invalid. This may
be fast but it is very dangerous. There are far better patterns where
reliability is required, and isn't that all the time?

Well, I wasn't trying to write solid and robust code, I was trying to
spend 5 seconds coming up with examples where switch statements would be
useful.

However, I appreciate your willingness in becoming a code reviewer for
me, though I would like to point out that the correct way to solve the
above problem is merely to add a default: return NULL; or default: throw
exception("Unknown object type in deserialization"); to the bottom of
the statement.

In future, I will remember to provide you with a fully compiled, zipped
up source tree with a make file instead of just a quick off the cuff
example.

By the way, you missed the fact that I was missing a main() function as
well.

I didn't know you were looking for a code review for this code fragment.
Returning NULL could cause undefined results in the calling class.

Which is why as part of the contract of the method, you specify that NULL
indicates an invalid type was requested.

An exception should only be used in exceptional cases.

Which, if unknown types are found in the serialized stream, is indeed an
exceptional case.

The best solution is to have a null class, which inherits from Object,
which is to be returned in the error case.

Ahhh... fail silently... ok. That sounds like a wonderful idea.

The null class can be tested on return to ensure that it is NULL, without
causing anything to break should it be used without first being tested.

You're doubling the chances of a programming error - because not only now
can the user keep using the object without knowing that it's invalid, but
you're asking them to perform an entirely separate test for whether or not
the object is valid before using it. If you pass them a NULL pointer,
however, they cannot use it without testing it for NULL - if they use it
without testing, they get very visible feedback of the flaw.

i.e.
//Allocation of static
NullObject MyClass::nullObject;

Object* MyClass::ConstructMeAnObject(unsigned short idvalue) {
    switch (idvalue)
    {
        case Square:
            return new Square();
        case Circle:
            return new Circle();
        case Pentagon:
            return new Pentagon();
        case Sheep:
            return new Sheep();
        default:
            // Unknown id: hand back the shared null object instead of NULL.
            return &nullObject;
    }
}

I won't go into the fact that returning a reference is considered more
reliable than returning a pointer.

Really? Why? If your base classes have virtual destructors (which all base
classes should have), then there's no difference. Also note that we're
creating new instances of objects here - a case for which passing
references is not designed.

Also, it is often better to use precreated objects from a pool rather than
creating new ones.

I welcome you to generate precreated paragraphs of text from a pool for use
in a wordprocessor application. There's a world of difference between using
preallocated memory chunks from a pool using placement-new and using
pre-constructed objects.
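
To make that distinction concrete, here is a rough sketch of the first approach: a pool that owns raw, unconstructed storage and builds each object on demand with placement-new. RawPool and Paragraph are hypothetical names, and the pool is deliberately simplistic (it only hands out slots in order and never reuses one).

```cpp
#include <cstddef>
#include <new>
#include <string>
#include <utility>

// Hypothetical document element whose contents are only known at run time.
struct Paragraph {
    std::string text;
    explicit Paragraph(std::string t) : text(std::move(t)) {}
};

// Hypothetical fixed-size pool handing out raw, unconstructed storage.
template <typename T, std::size_t N>
class RawPool {
public:
    // Constructs a fresh T in the next free slot using placement new.
    template <typename... Args>
    T* create(Args&&... args) {
        if (used_ == N)
            return nullptr;                        // pool exhausted
        void* slot = &storage_[used_ * sizeof(T)];
        ++used_;
        return ::new (slot) T(std::forward<Args>(args)...);
    }

    // Runs the destructor; the storage itself stays with the pool.
    void destroy(T* p) { p->~T(); }

    std::size_t used() const { return used_; }

private:
    alignas(T) unsigned char storage_[N * sizeof(T)];
    std::size_t used_ = 0;
};
```

The memory is allocated once, up front, but each Paragraph is still constructed with its own text at the moment it is needed, which is something a pool of already-constructed objects cannot offer without some kind of reset step.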

As I said, next time I'll provide you with a full application + makefile,
plus an "I'm A Pedant On A Tangent Rampage" pin badge.

--
People in the killfile (and whose posts I won't read) as of 4/10/2004
2:06:07 PM:
Peter Kohlmann, T.Max Devlin. Matt Templeton (scored down)

Jul 21 '05 #31
