Bytes IT Community

Simple function arguments

Take the following simple function:

unsigned long Plus5Percent(unsigned long input)
{
    return ( input + input / 20 );
}

Do yous ever consider the possibly more efficient:

unsigned long Plus5Percent(const unsigned long& input)
{
    return ( input + input / 20 );
}
Specifically I'm referring to turning:
unsigned long Plus5Percent(unsigned long);

into

unsigned long Plus5Percent(const unsigned long&);
-JKop
Jul 22 '05 #1
11 Replies


"JKop" <NU**@NULL.NULL> wrote in message
news:Ty*****************@news.indigo.ie...
> Do yous ever consider the possibly more efficient:
>
> unsigned long Plus5Percent(const unsigned long& input)
> {
>     return ( input + input / 20 );
> }


"Yous"? Yo, yous wouldn't happen to be from Philly, would yous? :-)

I don't usually consider using a const reference, for the purposes of
efficiency, with something as small as a long. I simply always do it, out
of habit, whenever I'm not changing the value I'm using. For parameters
which are objects, though, I'd assume that efficiency *is* a reason, since a
const reference doesn't require a copy of the object to be made. But a copy
of a long or a reference or pointer to a long probably all take the same
space and time. (I can't say that for sure, but it'd be my guess, at least
on machines where pointers and longs are the same size.)

-Howard

Jul 22 '05 #2

JKop wrote:
> Do yous ever consider the possibly more efficient:
>
> unsigned long Plus5Percent(const unsigned long& input)
> {
>     return ( input + input / 20 );
> }
> ...


In the general case it depends heavily on several implementation-related
factors. At the language level there's no definitive answer to this question.

--
Best regards,
Andrey Tarasevich

Jul 22 '05 #3

"JKop" <NU**@NULL.NULL> wrote in message
news:Ty*****************@news.indigo.ie...
> unsigned long Plus5Percent(unsigned long input)
> {
>     return ( input + input / 20 );

1. 'return' is a keyword, not a function, so the
parentheses are unnecessary (but innocuous).

2. May I assume you realize that your result will
be truncated (the fractional part is discarded)?

More below.

> }
>
> Do yous ever consider the possibly more efficient:

The key word here is 'possibly'. Whether it is
or not depends upon the implementation and target
platform. The only way to know for sure is to
measure.

> unsigned long Plus5Percent(const unsigned long& input)
> {
>     return ( input + input / 20 );
> }
> Specifically I'm referring to turning:
>
> unsigned long Plus5Percent(unsigned long);
>
> into
>
> unsigned long Plus5Percent(const unsigned long&);

For a built-in integer type, there's probably no
difference (or perhaps pass by value could be
faster, since it will typically fit in a register.)
But for something like this, if there is indeed
a performance difference, I doubt it's significant.

While we're talking of 'performance', division is
typically more expensive than other numerical operations
(barring clever optimizations), so you might consider:

return input * 1.05;

instead.

-Mike
Jul 22 '05 #4

Why don't you measure the relative performance
of these two implementations and see? I'm very doubtful
that the difference is going to be significant, since in
one case you are copying a 4-byte value (typically)
for the long, and in the other you are copying a 4-byte value
for the reference. Perhaps dereferencing might add
a few picoseconds to the run time.

I would not use the const reference implementation
for simple built-in types; it seems more straightforward to me
to use the pass-by-value mechanism.

"JKop" <NU**@NULL.NULL> wrote in message
news:Ty*****************@news.indigo.ie...
> Do yous ever consider the possibly more efficient:
>
> unsigned long Plus5Percent(const unsigned long& input)
> {
>     return ( input + input / 20 );
> }
> -JKop

Jul 22 '05 #5

JKop wrote:
> Specifically I'm referring to turning:
>
> unsigned long Plus5Percent(unsigned long);
>
> into
>
> unsigned long Plus5Percent(const unsigned long&);


You have already received a number of answers. All of them
are good and revolve around the actual argument passing.
I'd like to throw in another thought: you don't just
pass something for fun. The function actually wants
to work with the passed value.

Assumption at this stage: no inlining takes place.

As we all know, a reference is just another name for an
existing object. But somehow, some information about that
existing object has to be passed to the function. And as
we all know, the common way to do that is to implement the
reference passing by using a hidden pointer. It need not
be that way, but AFAWK all compilers do it that way as of today.

So what does this mean for the function?

In

unsigned long Plus5Percent(unsigned long input)
{
    return ( input + input / 20 );
}

the argument is passed and is available for use in the function
without much runtime overhead in accessing the value.

In

unsigned long Plus5Percent(const unsigned long& input)
{
    return ( input + input / 20 );
}

and with the assumption of implementing the reference with
a hidden pointer, the situation changes. To access the actual
value, a pointer lookup and dereference take place. So under this
assumption, this code should be a tick slower than the pass-by-value
case.

Personally I use the rule set:

* is it necessary to express the 'no valid argument' case?
    -> pass a pointer
* else
    * does the function need to change the caller's argument?
        -> pass by reference
    * else (the function does not need to change the caller's argument)
        * is the argument an elementary built-in type like char, int, pointer?
            -> pass by value
        * else
            -> pass by const reference
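Spelled out as signatures, the rule set might look like this (the type and function names here are made up purely for illustration):

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Widget { std::vector<std::string> parts; };

// 'no valid argument' must be expressible -> pass a pointer (null means "none")
std::size_t countParts(const Widget* maybeWidget) {
    return maybeWidget ? maybeWidget->parts.size() : 0;
}

// the function changes the caller's argument -> pass by reference
void addPart(Widget& w, const std::string& part) {
    w.parts.push_back(part);
}

// elementary built-in type, read only -> pass by value
unsigned long plus5Percent(unsigned long input) {
    return input + input / 20;
}

// class type, read only -> pass by const reference (no copy is made)
std::size_t totalLength(const Widget& w) {
    std::size_t n = 0;
    for (const std::string& p : w.parts)
        n += p.size();
    return n;
}
```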

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #6

"JKop" <NU**@NULL.NULL> wrote in message news:Ty*****************@news.indigo.ie...
> Specifically I'm referring to turning:
>
> unsigned long Plus5Percent(unsigned long);
>
> into
>
> unsigned long Plus5Percent(const unsigned long&);
> -JKop


Look at the article "[Comparative performance] Argument passing (built-in types)"
http://groups.google.com/groups?selm....uni-berlin.de

<QUOTE from the article>

----------------
Tested functions
----------------
int x = rand();
static int s = 0;

--- fno-function : no arguments ---
void fno () { s += x; } // no arguments
--- f1-functions : an argument is never actually used ---
void f1_val (T) { s += x; } // by value

void f1_ref (T&) { s += x; } // by reference
void f1_cref (const T&) { s += x; } // by const reference

void f1_ptr (T*) { s += x; } // by pointer
void f1_cptr (const T*) { s += x; } // by const pointer
--- f2-functions : an argument is actually used ---
void f2_val (T t) { s += t; } // by value

void f2_ref (T& t) { s += t; } // by reference
void f2_cref (const T& t) { s += t; } // by const reference

void f2_ptr (T* t) { s += *t; } // by pointer
void f2_cptr (const T* t) { s += *t; } // by const pointer

-------------------------
Tested built-in types T :
-------------------------
* bool
* char
* short
* int
* long
* long long
----------------------

</QUOTE>
--
Alex Vinokur
http://mathforum.org/library/view/10978.html
http://sourceforge.net/users/alexvn


Jul 22 '05 #7

Karl Heinz Buchegger posted:
> Personally I use the rule set
> [snip]


Looks good.

I'm currently reading Bjarne's book, 3rd Edition; I'm on page 277 so far. He
alluded earlier in the book that he would explain this later on, but what
the heck, I thought it would be worth a bit of a debate.

It's a great book, although I would never recommend it to a beginner.

I wasn't actually sure of whether a pointer would have to be used. Like
consider:

int main()
{
    int j;

    j = 4;
}
No pointer is needed to set it to 4. I was thinking that maybe a pointer
wouldn't be needed either when you pass by reference to another function?
Anyway, I haven't a clue how function calls operate under the hood. I hear
people bandy the term "stack" around, but haven't a clue what it means!
Something to do with how or where the local variables are stored?
-JKop

Jul 22 '05 #8

* Karl Heinz Buchegger:
> Personally I use the rule set
> [snip]

Such a rule set can be automated, and Andrei Alexandrescu showed how to
do exactly that in his book "Modern C++ Design".

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Jul 22 '05 #9

JKop wrote:
[snip]
> I'm currently reading Bjarne's book, 3rd Edition; I'm on page 277 so far. He
> alluded earlier in the book that he would explain this later on, but what
> the heck, I thought it would be worth a bit of a debate.
>
> It's a great book, although I would never recommend it to a beginner.
Agreed.

> I wasn't actually sure of whether a pointer would have to be used. Like
> consider:
>
> int main()
> {
>     int j;
>
>     j = 4;
> }
>
> No pointer is needed to set it to 4. I was thinking that maybe a pointer
> wouldn't be needed either when you pass by reference to another function?
Well. Yes. Technically there is no requirement to use a hidden pointer.
In practice the reference has to be implemented somehow. Things are easy
if the function gets inlined: the compiler can record the fact that there
are 2 names referring to the same memory location and use that. If the
function does not get inlined, then some other mechanism has to take place.
Currently all compilers use hidden pointers. I don't even know if there
is another mechanism available that can do the same thing. But I think
that the technique of hidden pointers is the simplest one (both in
implementing it in a compiler and at runtime) and it would be hard
to beat it.
> Anyway, I haven't a clue how function calls operate under the hood. I hear
> people bandy the term "stack" around, but haven't a clue what it means!
> Something to do with how or where the local variables are stored?


That could become a lengthy explanation right now. Most of it is really OT
and some assumptions will be made. So everybody not really interested, stop
reading right now :-)

In the following I assume you have never looked at a CPU at the instruction
level, so be patient :-) Also: different CPUs work differently in their internals,
so what I'd like to present is what I encountered during my studies, tests and
implementations.

In the internals of a CPU there are various registers. A register is nothing more
than a memory cell inside the CPU chip, where it can be electrically connected to
other parts of the CPU (such as the arithmetic unit). Some of those registers have
a special purpose; some are just used to give the program a way to park values
and be able to access them quickly.

One of those special registers is the PC (PC == Program Counter). It stores the
memory address from where the next instruction is read and executed.

Another one is the SP (SP == Stack Pointer). It contains a memory address. There are
special CPU instructions which use the SP, namely PUSH and POP. A PUSH operation stores
a value at the address given by SP and decrements the value stored in the SP. A POP
operation does the opposite: increment the SP and fetch the value from memory given
by the new content of SP. Both operations (plus some additional ones), paired with the SP,
form what is generally referred to as 'the stack'. It is a simple way of 'dynamic
memory allocation' which grows and shrinks very easily, without much overhead.
Typically the memory layout of a program at work looks like this:

+----------------+---------------------+-----------------------------+-----------------+
| Code section | Data section | free Memory pool | Stack section |
+----------------+---------------------+-----------------------------+-----------------+
^
|
SP |
+-------|-+
| o |
+---------+

with the SP pointing at the last available memory cell (Yes, the SP can be thought of as a
pointer which is located inside the CPU).

Now back to the PC. It also is logically a pointer. The CPU uses it when the next instruction
cycle begins. The PC contains the address of the next instruction to execute (which is just
a number, like anything at this level). OK. So the byte gets loaded and fed into the decoding
logic, which figures out what parts of the CPU need to be connected with other parts based
on the individual bits in that instruction byte. At the same time the PC gets incremented
by 1. (If the instruction is a multibyte instruction, the logic detects this and loads the
next few bytes to get at the complete instruction sequence, each time incrementing the PC.)
So how is a JUMP instruction implemented? Simple. After decoding the instruction it is executed
and the jump target (the address of where to jump to) is loaded. This address is simply loaded
into the PC. In this way the next instruction that the CPU will execute is the one given by
the jump target. Simple, isn't it? (At the CPU level things get incredibly simple, as long as
no caching or pipelining comes into play.)
OK. What about a CALL instruction? (JUMP is the equivalent of goto, while CALL is a function
call.) Well, the CPU has no knowledge of parameters; that has to be implemented in other stages.
I'll ignore that for now and come back to it later. Nevertheless a CALL instruction is still
nothing more than a goto (a JUMP), with the added benefit that it is recorded somewhere where
the goto came from, such that it is possible to return to that location later. Now where
is this information stored? You guessed it: on the stack. When a CALL instruction is executed,
things start out the very same as in the JUMP case: the address is loaded. But instead of just
assigning this address to the PC, the previous content of the PC is stored with a push on the stack.
Only after that does the PC get its new value. When the CPU then executes the corresponding RETURN
instruction, things go the other way round: a number is fetched from the stack (pop) and stored
in the PC.

Let's say the CPU has this instruction sequence (all numbers are hex):

0000 LD SP,FFFF ; set the stack pointer to the end of memory
0003 CALL 0007 ; call the function at memory location 0007
0006 HALT ; halt the CPU, the CPU effectively freezes
...
0007 LD B,AA ; load register B with AA
0009 RET

(LD == Load)

The CPU The memory
+-----------------------+
| PC: 0000 | 0000
| SP: 0000 | +----+----+----+----+----+----+----+----+----+----+--...
| | | 02 | FF | FF | CD | 07 | 00 | 76 | 08 | AA | C9 |
| Register B: 00 | +----+----+----+----+----+----+----+----+----+----+
+-----------------------+
FFFF
....+----+----+----+----+----+---+----+----+----+
| | | | | | | | | |
....+----+----+----+----+----+---+----+----+----+
You see the CPU on the left side, with the memory (RAM) on the right side. The RAM has already
been loaded with the above program (02 is the opcode for LD, CD for CALL, C9 for RETurn, etc.)

The CPU starts executing. The first instruction is fetched from memory location 0000 (the content
of PC equals 0000), which is 02, a Load-register-SP instruction. The logic inside the CPU knows
that the next 2 bytes after that 02 contain the value with which to load the SP. So this value
is also fetched and the SP loaded with it. After that instruction has executed, the situation is:

The CPU The memory
+-----------------------+
| PC: 0003 | 0000
| SP: FFFF | +----+----+----+----+----+----+----+----+----+----+--...
| | | 02 | FF | FF | CD | 07 | 00 | 76 | 08 | AA | C9 |
| Register B: 00 | +----+----+----+----+----+----+----+----+----+----+
+-----------------------+
FFFF
....+----+----+----+----+----+---+----+----+----+
| | | | | | | | | |
....+----+----+----+----+----+---+----+----+----+
The CPU starts its next cycle by loading the instruction based on the value in PC (which is 0003).
Loading from the memory at address 0003 brings up the opcode CD, which is a CALL instruction.
Again: the logic inside the CPU loads the next 2 bytes from RAM as well (because they contain the
call target address). Due to this loading the PC has increased to a value of 0006.

intermediate state !

The CPU The memory
+-----------------------+
| PC: 0006 | 0000
| SP: FFFF | +----+----+----+----+----+----+----+----+----+----+--...
| | | 02 | FF | FF | CD | 07 | 00 | 76 | 08 | AA | C9 |
| Register B: 00 | +----+----+----+----+----+----+----+----+----+----+
+-----------------------+
FFFF
....+----+----+----+----+----+---+----+----+----+
| | | | | | | | | |
....+----+----+----+----+----+---+----+----+----+
but since this is a CALL instruction, the value of the PC is pushed onto the stack: The value
of the PC is stored at the memory address contained in SP

The CPU The memory
+-----------------------+
| PC: 0006 | 0000
| SP: FFFF | +----+----+----+----+----+----+----+----+----+----+--...
| | | 02 | FF | FF | CD | 07 | 00 | 76 | 08 | AA | C9 |
| Register B: 00 | +----+----+----+----+----+----+----+----+----+----+
+-----------------------+
FFFF
....+----+----+----+----+----+---+----+----+----+
| | | | | | | | 06 | 00 |
....+----+----+----+----+----+---+----+----+----+

and the SP is decremented

The CPU The memory
+-----------------------+
| PC: 0006 | 0000
| SP: FFFD | +----+----+----+----+----+----+----+----+----+----+--...
| | | 02 | FF | FF | CD | 07 | 00 | 76 | 08 | AA | C9 |
| Register B: 00 | +----+----+----+----+----+----+----+----+----+----+
+-----------------------+
FFFF
....+----+----+----+----+----+---+----+----+----+
| | | | | | | | 06 | 00 |
....+----+----+----+----+----+---+----+----+----+

Only after that, the previously read call target address value is assigned to the PC (that was
0007)

The CPU The memory
+-----------------------+
| PC: 0007 | 0000
| SP: FFFD | +----+----+----+----+----+----+----+----+----+----+--...
| | | 02 | FF | FF | CD | 07 | 00 | 76 | 08 | AA | C9 |
| Register B: 00 | +----+----+----+----+----+----+----+----+----+----+
+-----------------------+
FFFF
....+----+----+----+----+----+---+----+----+----+
| | | | | | | | 06 | 00 |
....+----+----+----+----+----+---+----+----+----+
Instruction finished. Next instruction. Again: use PC as the address from where to load the next
instruction: 0007, and the instruction is 08, which is a Load-register-B instruction.
Decoding, ... you already know what is going on; it is always the same cycle. After that
instruction the situation looks like this:

The CPU The memory
+-----------------------+
| PC: 0009 | 0000
| SP: FFFD | +----+----+----+----+----+----+----+----+----+----+--...
| | | 02 | FF | FF | CD | 07 | 00 | 76 | 08 | AA | C9 |
| Register B: AA | +----+----+----+----+----+----+----+----+----+----+
+-----------------------+
FFFF
....+----+----+----+----+----+---+----+----+----+
| | | | | | | | 06 | 00 |
....+----+----+----+----+----+---+----+----+----+
Next instruction from memory address 0009 -> C9, which is a return instruction.
Return takes the stack pointer SP, increments it and fetches the value from there. In our
example the SP contains FFFD, thus the popped values are 00 06, and the SP gets incremented
to FFFF.

The CPU The memory
+-----------------------+
| PC: 0009 | 0000
| SP: FFFF | +----+----+----+----+----+----+----+----+----+----+--...
| | | 02 | FF | FF | CD | 07 | 00 | 76 | 08 | AA | C9 |
| Register B: AA | +----+----+----+----+----+----+----+----+----+----+
+-----------------------+
FFFF
....+----+----+----+----+----+---+----+----+----+
| | | | | | | | 06 | 00 |
....+----+----+----+----+----+---+----+----+----+

The read value is assigned to the PC

The CPU The memory
+-----------------------+
| PC: 0006 | 0000
| SP: FFFF | +----+----+----+----+----+----+----+----+----+----+--...
| | | 02 | FF | FF | CD | 07 | 00 | 76 | 08 | AA | C9 |
| Register B: AA | +----+----+----+----+----+----+----+----+----+----+
+-----------------------+
FFFF
....+----+----+----+----+----+---+----+----+----+
| | | | | | | | 06 | 00 |
....+----+----+----+----+----+---+----+----+----+
and when the CPU starts its next instruction cycle, program execution continues at memory
address 0006 (the content of PC).

So this is basically how 'function calling' works at the CPU level and what role the Stack has.

I did a Z80 emulation some time ago (to help in programming a small home-built microcomputer,
which controls some hardware). Here are the corresponding functions in that emulation (I leave
out some of the low-level functions; I guess you can deduce what they do from their names).

typedef unsigned char  BYTE;
typedef unsigned short WORD;

class CPU {
public:
    void Run();

protected:
    void PushStack( WORD Adr );
    WORD PopStack();

    WORD FetchAdress();                 // low-level helpers; definitions omitted
    static WORD To16( BYTE hi, BYTE lo );

    void JMP(), CALL(), RET();

    void (CPU::*m_Fnktions[256])();     // opcode -> handler dispatch table

    BYTE m_Ram[0x10000];

    WORD m_PC;
    WORD m_SP;

    BYTE m_RegB;
    BYTE m_RegC;
};

void CPU::Run()
{
    BYTE OpCode;

    while( true ) {
        OpCode = m_Ram[m_PC++];
        (this->*m_Fnktions[OpCode])();
    }
}

void CPU::JMP()
{
    WORD Adr;
    Adr = FetchAdress();
    m_PC = Adr;
}

void CPU::RET()
{
    m_PC = PopStack();
}

void CPU::CALL()
{
    WORD Adress = FetchAdress();
    PushStack( m_PC );
    m_PC = Adress;
}

void CPU::PushStack( WORD Adr )
{
    BYTE byte1, byte2;

    byte1 = Adr & 0xFF;
    byte2 = (Adr >> 8) & 0xFF;

    m_Ram[--m_SP] = byte2;
    m_Ram[--m_SP] = byte1;
}

WORD CPU::PopStack()
{
    BYTE byte1, byte2;
    WORD adr;

    byte1 = m_Ram[m_SP++];
    byte2 = m_Ram[m_SP++];

    adr = To16( byte2, byte1 );
    return adr;
}

But we are not finished yet. You will say: all fine, but what about parameters?
Well, the CPU doesn't know anything about parameters. It is the compiler's job to translate
parameter passing into something that, when executed, gives the illusion that some parameters
have been passed. There are multiple ways to do that. Two commonly used ones are: pass by
register and pass by pushing onto the stack.
Pass by register is simple: the opcodes simply load one of the registers with the required value,
and the function is built in such a way as to expect the value in exactly that register. But
the number of registers inside the CPU is usually limited, so only a handful of parameters
can be passed that way! What to do then?
E.g. pass by pushing onto the stack.
The CPU has other instructions which push and pop values to and from the stack (just as the PC
was pushed during the CALL instruction):

LD B, AA   // load register with the required value (may be a constant or a load
           // from a specific memory cell, or not done at all, e.g. if the passed
           // value is the result of a computation and already in that register)
PUSH B     // push value on the stack ( Memory[SP] <- value; SP-- )
CALL 0100  // call the function
POP B      // pop the pushed argument from the stack

...

0100 // this is the function and we know:
// if we do a pop right now, we get at the return address,
// but a second pop would hand us the passed argument.
// This also means we can use the value of the stack pointer SP
// to access the passed argument.

... // this function could also decide to allocate some local variables
// one way would be, to just decrease the stack pointer and thus
// reserve some memory cells.
//
// the memory layout then looks like this
//
// ...+----+----+----+----+----+----+----+----+----+....
// | | | | | | | | | |
// ...+----+----+----+----+----+----+----+----+----+....
// ^ | | |----| | |
// | +-------------+ | +----|
// | local variables | |
// | return address |
// SP |
// passed arguments
//
// when the function does this, it is clear, that it has to
// correct the SP back to the location of the return address
// before executing the RET instruction. But since the function
// knows how many bytes it has moved the SP to the left, it is also
// able to move the SP back to the right.
...
RET
There is much more to say about this, but then this reply would break all records in length.
So let me stop right now and suggest continuing your studies with some literature
about assembly (which would also have the benefit of being tailored to your specific
type of CPU).

Basically: At the CPU level, things get really simple. After all it is just a machine at work.

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #10

Thanks a million for the reply. I'll have to read that over a few times.

About 2 weeks ago, someone posted a link to a PDF file that basically
teaches you assembly, but I don't think the author truly understood how
unreadable it was to a total beginner, just as Bjarne's book would seem
to be full of holes to a beginner: it doesn't even tell you how to declare
variables, it goes straight into classes!

Again thanks for your time, I appreciate it.
-JKop
Jul 22 '05 #11


This discussion thread is closed; replies have been disabled.