I have a hard time understanding the run-time environment. Let's assume that I
have a program that has a simple variable alpha. When this variable is
statically allocated, the compiler can use the absolute address of alpha to
access it. What confuses me is that when the variable is dynamically
allocated, how does the compiler implement it? We know the address of the
variable until run-time. During compilation, how can we access the
alpha variable since we don't know its address yet?
Thanks in advance
"newbiecpp" <ne*******@yahoo.com> wrote in message
news:u9F%c.5283$dC4.1197@trndny06... I have hard time to understand run-time environment. Let assume that I
have a program that has a simple variable alpha. When this variable is statically allocated, the compiler can use the absolute address of alpha
to access to it. What confuses me is that when the variable is dynamically allocated, how does the compiler implement it? We know the address of the variable until run-time. During the compilation, how can we access to the alpha variable since we don't know its address yet?
Thanks in advance
Your assumption about the "statically allocated" variable is incorrect. The
compiler has no idea where in memory the entire program will be loaded, so
no "absolute address" can be given.
How a variable is addressed depends greatly on where it is declared, not
just on whether new was used to allocate space for it.
For example, a local variable is likely to be addressed by using an offset
from the "stack" pointer. But since there is no actual requirement for the
existence of a "stack" in the C++ language specification, compiler writers
are technically free to implement such details as they see fit.
To answer your question about dynamically allocated variables, what you're
describing is the use of a pointer variable and the operator new (or malloc
or calloc, but let's stick with new, ok?). The pointer variable is just like
any other variable. If it's a local pointer, then it may be addressed as I
described above (using an offset from the stack pointer). But it's the
*contents* of that variable that get changed dynamically when allocating the
object it points to. An object gets space allocated for it when new is
called, the object's constructor gets called, and the address of the
allocated space gets put into the pointer variable, as its *value*, not its
location!
The compiler doesn't actually arrange any of the memory, outside of these
local stack offsets I mentioned and perhaps a few other cases. The linker
comes into play, as does the operating system's loading mechanism. You
might want to get a book on compiler theory if you're really interested.
It's at least a whole semester's worth of study in college.
-Howard
newbiecpp posted:
> I have a hard time understanding the run-time environment. Let's assume that I have a program that has a simple variable alpha.
unsigned int alpha;
> When this variable is statically allocated, the compiler can use the absolute address of alpha to access it.
Not really. "alpha" may have a different address each time the program is
run.
> What confuses me is that when the variable is dynamically allocated, how does the compiler implement it?
unsigned int* p_alpha = new unsigned int;
It stores its address. Or then again, it can mysteriously achieve it in
whatever way it sees fit in the following:
unsigned int& alpha = *new unsigned int;
> We DON'T know the address of the variable until run-time. During compilation, how can we access the alpha variable since we don't know its address yet?
The program starts up. Somehow it asks the system where it can store certain
data in memory. The system responds by specifying addresses. From here on, the
program accesses those specific addresses.
Hope that helps.
-JKop
> I have a hard time understanding the run-time environment.
It is possible I messed up some things in my explanations; I am no
compiler implementor nor an operating system developer. Wait for some
corrections to appear before believing all of it.
Let's review the common steps :
1) Writing source code
You specify, in a given language, the commands to be sent to the
computer. In C++, that means creating classes and functions and
using them. An object O, for example, is only a name, something
for the programmer to make his life easier.
Always think of a programming language in terms of assembly
language : the machine does not understand classes, objects,
inheritance or whatever. All this high-level code is translated
into machine code, so all your variable names, functions, and cool
class hierarchies will be replaced by memory locations, additions
and subtractions. In fact, high-level languages like C++ could
be viewed as mere "syntactic sugar", since everything you do
could be done in assembly language (that's what your code will
become ultimately anyway).
2) Compiling
Nothing really interesting here, syntax checking and traduction
into intermediary code. The only thing here is that all names
are checked, so for example
// main.cpp
int main()
{
a = 2;
}
will fail to compile since 'a' does not exist anywhere. The
compiler makes sure every name is _potentially_ defined, so the
linker will only have to try to find them. So when you write
// main.cpp
extern int a;
int main()
{
a = 2;
}
the compiler tells the linker : "'a' refers to some int defined
somewhere else, you'll have to find it. good luck".
3) Linking
That's getting interesting. The linker makes sure everything is
in place : each use of a name is resolved to its definition. If
that definition does not exist or if it exists more than once,
an error is generated (in most cases). The linker uses the
information generated by the compiler to associate names. By
using the example above, it searches other files (actually
"translation units") for a definition of 'a'. If it finds it,
for example with
// another file.cpp
int a=0;
, it associates the 'a' in main.cpp with the 'a' in the other file. If 'a'
is not found anywhere, it stops. If two or more 'a's are found,
it stops since there is ambiguity. That's the basis of the ODR
(One Definition Rule).
The linker then translates each data access into an address, but
don't get it wrong : that address is not the real address in
memory. Actually, you could see that address as an offset into the
real memory. When the operating system runs the program, it
loads it somewhere and reserves some space in memory for that
program. Each variable/address/offset is resolved by the OS to
that memory space.
How? An easy answer could be "automagically". The real, long
answer could perhaps be provided by someone in a newsgroup
supporting your operating system. The thing is, that kind of
detail is left to the implementation : the C++ standard only
mandates _behavior_, not implementation. So as long as it
behaves as specified, an implementation can do anything it wants.
You must understand that the only time memory is allocated, and
therefore real addresses are defined, is when the program is
executed, not during compilation or linking.
Please note that this is a bit of an oversimplification.

> Let's assume that I have a program that has a simple variable alpha. When this variable is statically allocated, the compiler can use the absolute address of alpha to access it.
What's important to understand is that nothing is allocated when you
compile/link a program. The program merely becomes some machine code.
The operating system is then in charge of running it. Yes, the linker
assigns some addresses to your variables, but these do not refer to a
specific place in memory. As I said, you could consider these
"addresses" to be offsets in memory, the starting address being defined
by the operating system.
For what you said specifically, know that static data is handled
differently by most implementations. Actually, three types of data are
typically recognized : the stack, the heap and the static data.
After re-reading, what do you mean by "statically allocated"? On the
stack such as
int main()
{
int i = 0;
}
or really statically allocated such as
int main()
{
static int i = 0;
}
Be sure to understand the difference between the two.
> What confuses me is that when the variable is dynamically allocated, how does the compiler implement it?
That's something else, though not entirely different. What I described
about the linker only applies to the stack. For the heap, it works a
bit differently.
The operating system manages a pool of memory typically called the "heap" or
"free store". That memory differs from the stack, first of all, in
its longevity. For example, by using the stack :
void f()
{
int i=0; // i is on stack
}
you have no way of extending the life of 'i'. Once the function
terminates, 'i' is destroyed. This is mandated by the C++ standard and
it is the usual practice in all languages, including assembly, so it's
no big deal.
The heap, however, is a pool of memory. You can reserve and release
memory whenever you want. Typically, in practice, the operating system has
some functions for allocating and freeing memory at low level,
typically in big chunks (several kilobytes). These functions are then
used by the malloc() system, which usually maintains another pool of
memory itself. Finally, operator new is usually implemented in terms of
malloc().
The low level OS functions return the address of the allocated memory on
the heap. That address can be absolute (which is pretty rare today) or
relative, allowing in particular some sort of protection. Relative
addressing (also called virtual addressing) behaves like the relative
stack address I described earlier.
So actually, what the OS returns is only an address; neither the
compiler nor the linker has anything to do with that. That address is
determined by the OS at run-time, depending on the loaded programs and
the contents of the heap. These functions can fail if the heap is full.
> We know the address of the variable until run-time.
That should read : "We don't know the real address of the variable until
run-time", since the program is not running. For memory to be
allocated, the program must run! Compiling it only translates it into
machine code and assigns "dummy" addresses which have to be resolved by
the operating system.
> During compilation, how can we access the alpha variable since we don't know its address yet?
As I said earlier, the linker uses some kind of relative address which
is resolved by the operating system. If you are asking what the
compiler/linker does with
void f(int &i)
{
i = 2;
}
to know what 'i' refers to, well it is only a matter of keeping a list
of the variables in a given scope with their names. When you do
int main()
{
int a = 10;
f(a);
}
The compiler enters 'a' in its list for main() and associates 'i' in f()
with it. That's a simple association map. Once every name has been
resolved, the linker only has to take that list of variables and give
them addresses. The operating system is then in charge, when running
the program, of allocating memory and resolving the addresses made by
the linker to the real addresses in memory.
It is important for you to understand that all the things I just
explained are not specified by the C++ standard, and therefore you
cannot rely on them, although it is commonly implemented that way.
Remember : C++ describes behavior, not implementation.
Jonathan
"Jonathan Mcdougall" <jo***************@DELyahoo.ca> wrote in message
news:URK%c.77536 2) Compiling Nothing really interesting here, syntax checking and traduction
?What's "traduction"? (It's not in my dictionary.)
into intermediary code. The only thing here is that all names are checked, so for example
Howard wrote:
> What's "traduction"? (It's not in my dictionary.)
Try the Oxford English Dictionary. It lists as the second main (though
obsolete) meaning:
{dag}2. Translation into another language; concr. a translation. Obs. or
arch.
Dates: 1549 a1533 1663 1716 1822 1823
a1533 LD. BERNERS Gold. Bk. M. Aurel. (1546) Bv, I confesse to deserue no
merytes for my traduction. 1549 Compl. Scot. To Rdr. 10 He that hes the
gyft of traductione, compiling or teching, his faculte is..honest. 1663
COWLEY Pind. Odes Pref., The verbal Traduction of him into Latin Prose.
1716 M. DAVIES Athen. Brit. III. 5 The Jesuit Rapin's Critical Parallels
(whereof the English Traduction was so greedily bought up). 1822 SCOTT
Nigel xxxii, Whilk we do not perceive even in the Latin version of the
Septuagint, much less in the English traduction. 1823 BYRON Juan XI. xix.
note, If there be any gem'man so ignorant as to require a traduction.
Best
Kai-Uwe Bux
"Kai-Uwe Bux" <jk********@gmx.net> wrote in message
news:ch**********@murdoch.acc.Virginia.EDU... Howard wrote:
"Jonathan Mcdougall" <jo***************@DELyahoo.ca> wrote in message news:URK%c.77536 2) Compiling Nothing really interesting here, syntax checking and traduction
?What's "traduction"? (It's not in my dictionary.)
Try the Oxford English Dictionary. It lists as the second main (though obsolete) meaning:
{dag}2. Translation into another language; concr. a translation. Obs. or arch.
Dates: 1549 a1533 1663 1716 1822 1823
a1533 LD. BERNERS Gold. Bk. M. Aurel. (1546) Bv, I confesse to deserue no merytes for my traduction. 1549 Compl. Scot. To Rdr. 10 He that hes the gyft of traductione, compiling or teching, his faculte is..honest. 1663 COWLEY Pind. Odes Pref., The verbal Traduction of him into Latin Prose. 1716 M. DAVIES Athen. Brit. III. 5 The Jesuit Rapin's Critical Parallels (whereof the English Traduction was so greedily bought up). 1822 SCOTT Nigel xxxii, Whilk we do not perceive even in the Latin version of the Septuagint, much less in the English traduction. 1823 BYRON Juan XI. xix. note, If there be any gem'man so ignorant as to require a traduction.
You sure that's not the Olde English dictionary? :-)
Howard wrote:
> What's "traduction"? (It's not in my dictionary.)

Transforming C++ code into machine code, though I think it was clear in
the context.
Jonathan
"Jonathan Mcdougall" <jo***************@DELyahoo.ca> wrote in message
news:UR*********************@wagner.videotron.net. .. I have hard time to understand run-time environment.
It is possible I messed up some things in my explanations, I am no compiler implementor nor an operating system developper. Wait for some corrections to appear before believing all that.
Let's review the common steps :
1) Writing source code You specify, in a given language, the commands to be sent to the computer. In C++, that means creating classes and functions and using them. An object O, for example, is only a name, something for the programmer to make his life easier.
Always think of a programming language in term of assembly language : the machine does not understand classes, objects, inheritance or whatever. All this high-level code is translated into machine code, so all your variable names, functions, cool class hierarchies will be replaced by memory locations, additions and substractions. In fact, high level languages like C++ could be viewed only as "sytactic sugar", as far as everything you do could be done in assembly language (that's what your code will become ultimately anyways).
2) Compiling Nothing really interesting here, syntax checking and traduction into intermediary code. The only thing here is that all names are checked, so for example
// main.cpp int main() { a = 2; }
will fail to compile since 'a' does not exist anywhere. The compiler makes sure every name is _potentially_ defined, so the linker will only have to try to find them. So when you write
// main.cpp extern int a;
int main() { a = 2; }
the compiler tells the linker : "'a' refers to some int defined somewhere else, you'll have to find it. good luck".
3) Linking That's getting interesting. The linker makes sure everything is in place : each use of a name is resolved to its definition. If that definition does not exist or if it exists more than once, an error is generated (in most cases). The linker uses the information generated by the compiler to associate names. By using the example above, it searches other files (actually "translation units") for a definition of 'a'. If it finds it, for example with
// another file.cpp int a=0;
, it associates 'a' in main.cpp with the 'a' in file.cpp. If 'a' is not found anywhere, it stops. If two or more 'a's are found, it stops since there is ambiguity. The the basis of the ODR (One Definition Rule).
The linker then translates all data access by an address, but don't get it wrong : that address is not the real address in memory. Actually, you could see that address as an offset in the real memory. When the operating system runs the program, it loads it somewhere and reserves some space in memory for that program. Each variable/address/offset is resolved by the OS to that memory space.
How? An easy answer could be "automagically". The real long answer perhaps could be provided by someone in a newgroup supporting your operating system. The thing is, that kind of detail is left to the implementation : the C++ standard only mandates _behavior_, not implementation. So as long as it behaves as specified, an implementation can do anything it wants.
You must understand that the only time memory is allocated, and therefore real addresses are defined, is when the program is executed, not during compilation or linking.
Please, note that this is a bit of oversimplification.
> Let assume that I have a program that has a simple variable alpha. When this variable is statically allocated, the compiler can use the absolute address of alpha
to access to it.
What's important to understand is that nothing is allocated when you compile/link a program. The program merely becomes some machine code. The operating system is then in charge of running it. Yes, the linker assigns some addresses to your variables, but these do not refer to a specific place in memory. As I said, you could consider these "addresses" to be offsets in memory, the starting address being defined by the operating system.
For what you said specifically, know that static data is handled differently by most implementations. Actually, three types of data are typically recognized : the stack, the heap and the static data.
After re-reading, what do you mean by "statically allocated"? On the stack such as
int main() { int i = 0; }
or really statically allocated such as
int main() { static int i = 0; }
Be sure to make the difference between the two.
What confuses me is that when the variable is dynamically allocated, how does the compiler implement it?
That's something else, though not entirely different. What I described about the linker only applies to the stack. For the heap, it works a bit differently.
The operating system manages a pool of memory typically called "heap" or "free store". That memory is different from the stack because, first of all, of its longevity. For example, by using the stack :
void f() { int i=0; // i is on stack }
you have no way of extending the life of 'i'. Once the function terminates, 'i' is destroyed. This is mandated by the C++ standard and it is the usual practice in all languages, including assembly, so it's no big deal.
The heap however is a pool of memory. You can reserve and release memory when you want. Typically, in pratice, the operating system has some functions for allocating and deleting memory at low-level, typically in big chunks (several kilobytes). These functions are then used by the malloc() system, which usually maintains another pool of memory itself. Finally, operator new is usually implemented in terms of malloc().
The low level OS functions return the address of the allocated memory on the heap. That address can be absolute (which is pretty rare today) or relative, allowing in particular some sort of protection. Relative addressing (also called virtual addressing) behaves like the relative stack address I described earlier.
So actually, what the OS returns is only an address, so neither the compiler or the linker has nothing to do with that. That address is determined by the OS at run-time, depending on the loaded programs and the content of the heap. These functions can fail if the heap is full.
> We know the address of the variable until run-time.
That should read : "We don't know the real address of the variable until run-time", since the program is not running. For memory to be allocated, the program must run! Compiling it only translates it into machine code and assigns "dummy" addresses which have to be resolved by the operating system.
> During the compilation, how can we access to the alpha variable since we don't know its address yet?
As I said earlier, the linker uses some kind of relative address which is resolved by the operating system. If you are asking how the compiler /linker do with
void f(int &i) { i = 2; }
to know what 'i' refers to, well it is only a matter of keeping a list of the variables in a given scope with their names. When you do
int main() { int a = 10; f(a); }
The compiler enters 'a' in its list for main() and associates 'i' in f() with it. That's a simple assocation map. Once every name has been resolved, the linker only has to take that list of variable and give them addresses. The operating system is then in charge, when running the program, of allocating memory and resolving the addresses made by the linker to the real addresses in memory.
It is important for you to understand that all the things I just explained are not specified by the C++ standard, and therefore you cannot rely on it, altough it is commonly implemented that way. Remember : C++ describes behavior, not implementation.
Jonathan
Thank you very much. I really appreciate your time and help.
My confusion is from here: C++ says that static allocation, such as
static int i;
is binding at compile-time, while dynamic allocation, such as
int* pi = new int;
is binding at run-time. I understand now that for i, the compiler can put an
offset relative to some location (like the stack base) somewhere. But I still
have some confusion about dynamic binding. To me, the compiler may put an
offset from the heap into pi. But why do we call it run-time binding? To me,
they are all decided at compile-time; that is, the compiler records an offset
from the stack or the heap. At run-time, the OS will decide the addresses of
the stack and the heap so that we can have the real addresses of i and pi. So
why do we call one compile-time binding and the other run-time binding? Or is
my understanding wrong?
I appreciate your insight and your time.
"Jonathan Mcdougall" <jo***************@DELyahoo.ca> wrote in message
news:HG*********************@weber.videotron.net.. . Howard wrote: "Jonathan Mcdougall" <jo***************@DELyahoo.ca> wrote in message news:URK%c.77536
2) Compiling Nothing really interesting here, syntax checking and traduction
?What's "traduction"? (It's not in my dictionary.)
Transforming C++ code into machine code, though I think it was clear in the context.
Well, I agree it was fairly clear what he was trying to say, but he could
have said "splurbification" and accomplished that. :-) It doesn't make the
word itself legitimate. I checked several on-line dictionaries and found no
such word. I only found it on-line by doing a French-to-English translation
on AltaVista (BabelFish). As the other responder noted, it's listed as
"obscure" or "archaic" even in the Oxford dictionary. So you can't blame me
for asking, can you? (Besides, it's how I learn!)
-Howard
In message <IG*********************@bgtnsc04-news.ops.worldnet.att.net>,
Howard <al*****@hotmail.com> writes:
[snip]
It does have a current meaning, but you have to look under "traduce".
Personally I wouldn't want a compiler that did that to my code, though
the Italians have a saying for it: "traduttore, traditore".
--
Richard Herring
"newbiecpp" <ne*******@yahoo.com> wrote in message news:<tTY%c.7827$wF4.5049@trndny09>...
[snip] My confusion is from here: C++ says that static allocation, such as
static int i;
is binding at compile-time, while dynamic allocation, such as
int* pi = new int;
is binding at run-time.
Actually there are three types of storage in C and C++: static, dynamic,
and automatic. Following are examples of all three:
// The integer g has static storage duration, like all global variables.
int g;
int main()
{
// The integer s has static duration because of the 'static' keyword.
static int s = 0;
// The integer a has automatic duration, which is the default for
// local variables.
int a = 0;
// The pointer p has automatic duration, but the integer it points
// to has dynamic duration.
int* p = new int(0);
// The following statement frees the integer that p points to, which
// means the integer no longer exists and the memory it occupied can
// be reclaimed for other purposes. The pointer p still exists, but
// it is now an invalid pointer, which means you're not allowed to
// do anything with it except assign to it.
delete p;
return 0;
}
The amount of static storage, and the number of static variables, in a
program is known "in advance" (specifically, at link time) and does not
change over the life of the program -- hence the name "static". In other
words, the lifetime of a static variable is the lifetime of the program.
Automatic variables are created and destroyed "automatically" based on
the flow of control. An automatic variable is created when the control
reaches the point where it is declared, and is destroyed when the flow
of control leaves the block where it is declared. Thus, unlike static
variables, automatic variables may be bound to different addresses at
different times.
Dynamic objects are allocated on the free store (informally known as the
heap) and are allocated and freed explicitly by the programmer -- though
perhaps indirectly, e.g., via objects with constructors and destructors.
As with automatic objects, the number of dynamic objects changes over
time. A reference variable may be bound to different dynamic objects at
different times, and a pointer may point to different dynamic objects at
different times.
> I understand now that for i, the compiler can put an offset relative to some location (like the stack base) somewhere.
In your example, i is static so the stack is not involved. The actual
address of a static variable may not be known until the program is
loaded, but from that point forward it does not change. The way this
works on some common platforms (e.g., Windows) is that the linker
includes "relocation information" in an executable file, which the
operating system then uses to fix up all references to static variables
when it loads the program. Function calls work in a similar manner.
Automatic variables work quite differently. The C++ standard does not
require it, but most implementations for common hardware platforms
allocate automatic variables on the stack. This takes advantage of
the fact that automatic variables are always destroyed in the reverse
order they are created (i.e., LIFO, last-in, first-out). Stacks are
well suited to this. A function can reserve space for its local
variables by simply moving the stack pointer (usually a register)
when it is called, and restoring the previous stack pointer when it
returns.
Following is a more detailed description of how the stack might be
managed in a typical x86 implementation.
Two special registers are used to manage the stack: the stack pointer
(ESP), which points to the top of the stack; and the base pointer (EBP),
which points to the current "stack frame", i.e., the part of the stack
reserved for a particular function invocation. A typical function call
sequence might work like this:
1. The caller pushes parameters onto the stack.
2. The caller pushes the return address (i.e., the address of
the next instruction) onto the stack, and transfers control
to the entry point of the function.
3. The function (i.e., the callee) saves the old value of the
base pointer by pushing it onto the stack, and stores the
current stack pointer in the base pointer; it then adjusts
the stack pointer to make room for its local variables.
4. Local variables, parameters, and the old base pointer all
now have known offsets from the current base pointer.
5. Before the function returns it restores the old stack and
base pointers (reversing step 3). It then "returns", i.e.,
pops the return address off the stack and transfers control
there (reversing step 2).
6. The caller then fixes the stack pointer (reversing step 1).
There are many variations on this idea. For example, parameters may
be passed in registers instead of on the stack; step 6 may be carried
out by the callee instead of the caller; etc. The specific protocol
used is known as a calling convention, but the basic idea is usually
fairly close to what I described above.
Note that each function invocation has its own stack frame. This
explains how recursion is possible. Consider the following example:
#include <iostream>
int factorial(int n)
{
return (n > 2) ?
n * factorial(n - 1) :
n;
}
int main()
{
std::cout << factorial(3) << std::endl;
return 0;
}
In the above program, main calls factorial with n=3. This first
invocation of factorial calls factorial again with n=2. The second
invocation then returns control to the first invocation, which in
turn returns control to main. Assuming the compiler translates the
program quite literally and uses the calling convention I described
above, here's what the stack looks like during the second
invocation of the factorial function:
(Note: ASCII art best viewed in a monospaced font; the
diagram assumes the stack grows downward.)
+------------------------------------------+
| params to main | stack
| | frame for
| return address (to startup code?) | main
| |
? <--- old base pointer |
| |
+--> | local vars for main |
| | |
| +------------------------------------------+
| | n=3 |
| | | stack
| | return address (somewhere in main) | frame for
| | | factorial
+----- old base pointer | (1st call)
| |
+--> | (local vars if there were any) |
| +------------------------------------------+
| | n=2 |
| | | stack
| | return address (somewhere in factorial) | frame for
| | | factorial
+----- old base pointer | (2nd call)
| |
EBP -> | (local vars if there were any) |
+------------------------------------------+
ESP -> (top of the stack)
Mind you, things are less clear cut in practice. For example,
the recursive call in the factorial function is almost a tail
call (the multiplication still happens after the call returns),
and a compiler may be able to turn the recursion into a jump or
a loop instead of an actual sequence of function calls.
But I still have some confusion about dynamic binding. To me, the compiler could just record an offset from the heap for pi.
It is important to distinguish between the pointer itself (pi) and the
object it points to (*pi). If pi is a local variable it is allocated in
the manner of all automatic variables (e.g., on the stack as described
above).
The memory for the pointee (i.e., *pi) is allocated by a function
called operator new. How that works is quite complex and implementation
dependent, but basically memory management functions like operator new,
operator delete, malloc, and free must maintain some sort of data
structures to keep track of which chunks of memory are in use and which
are free. Something more complex than a stack is required because
(unlike automatic objects) objects on the free store may be freed in
any order. Some cooperation with the operating system is usually
involved.
In any case, pi can't just point to some static offset from the heap
because pi may point to different objects at different times. Consider
the following example:
struct Node {
Node* next;
};
int main()
{
Node* first = 0;
for (int i = 0; i < 1000; ++i)
{
Node* p = new Node;
p->next = first;
first = p;
}
while (first != 0)
{
Node* p = first->next;
delete first;
first = p;
}
return 0;
}
In the above program, the local variable first points, at different times,
to 1000 different objects. Obviously, then, its value (i.e., the address
of the node it points to) cannot be determined at compile time -- even as
an offset from the beginning of the heap.
But why do we call it run-time binding? To me, everything is decided at compile time; that is, the compiler records an offset from the stack or the heap.
A static variable is always bound to a single object at a fixed memory
address. The address may be determined at program load time (or in some
other way) rather than compile time; however, it does not change during
the execution of the program.
In contrast, the location of an automatic variable is relative to a
particular stack frame (assuming a stack is even used, which is not
required). Parameters work like automatic variables so consider the
parameter n in the example above. At one point n is bound to a memory
address in one stack frame (containing the value 2). Later it is bound
to another memory address in a different stack frame (containing the
value 3). In fact, both stack frames exist at the same time, although
only the top-most stack frame is active.
In the case of the free store, see the example above. The same pointer
may point to many different objects at different times during a
program's execution.
Well, I think I've gone on long enough. I hope this helps. :-)
--Nick
"newbiecpp" <ne*******@yahoo.com> wrote in message
news:tTY%c.7827$wF4.5049@trndny09...
My confusion is from here: C++ says that static allocation, such as
static int i;
is binding at compile-time, while dynamic allocation, such as
int* pi = new int;
is binding at run-time.
I've never heard the term binding used in this way.
I understand now that for i, the compiler can put an offset relative to some location (like the stack base) somewhere. But I still have some confusion about dynamic binding. To me, the compiler could just record an offset from the heap for pi. But why do we call it run-time binding?
You seem to have a confusion between the variable pi and the memory it
points to.
int a;
int* b;
int* c = &a;
int *d = new int;
a, b, c and d are all local variables. The compiler treats them all the
same, allocating space for them at compile time (as an offset from the stack
pointer). The fact that at run time d is initialised to point to dynamically
allocated memory is completely irrelevant. The memory occupied by d itself
and the memory d points to are different things.
To me, they are all decided at compile time; that is, the compiler records an offset from the stack or the heap. At run time, the OS decides the addresses of the stack and the heap, so we can get the real addresses of i and pi. But why do we call one compile-time binding and the other run-time binding? Or is my understanding wrong?
As I said I've never heard the term binding used like that. But in any case
the addresses of i and pi are worked out at compile time. The address pi points
to is worked out at run time.
john
> My confusion is from here: C++ says that static allocation, such as
static int i;
is binding at compile-time, while dynamic allocation, such as
int* pi = new int;
is binding at run-time.
The C++ standard is completely silent on the way the compiler and system
implement allocation or "binding". C++ mandates behavior, not
implementation. An implementation would be free to implement the stack
and the heap the same way, as long as everything behaves as the standard
states. That is important.
However...
I understand now that for i, the compiler can put an offset relative to some location (like the stack base) somewhere. But I still have some confusion about dynamic binding. To me, the compiler could just record an offset from the heap for pi.
It is possible to use "offsets" when using the stack, since the system
uses a stack pointer to know exactly what starting point you are using.
The offsets are always the same, but the starting point will change.
The thing is, you only have one starting point in the heap, which is the
start of the heap. Consider it to be address 1.
When you allocate an int on the heap, it is allocated at address 1. The
next free chunk will be at address 5 (if an int is four bytes). If you
allocate 10 ints on the heap, then the next free chunk will be at address
41. If you then free ints number 2 and 5, you get some empty spaces
scattered in the heap. If you then allocate another int, it will have
address 5, which is the address of the first int you freed.
The heap is a _disorganized_ system. You never know what is free and
what is allocated, that is why the compiler has no way, nor the system,
to know in advance the address of a heap allocated object. It depends
on what is in the heap right now.
If you run two instances of your program at the same time, both will
have the same "offsets" on stack, that is, their structure on the stack
will be exactly the same (if they get the same input of course). But
they will obviously get different addresses from the heap.
Let's try something else.
You need to understand a little more about how a computer works. Take
for example
void f(int a, int b)
{
}
f() resides somewhere in memory as machine code. When it is executed,
that is, when the instruction pointer gets to its address, the values
passed to f() are pushed on the stack and you can access 'a' by doing
something like *(stack_pointer) and 'b' by doing *(stack_pointer +
sizeof(int)). That is what I meant by 'offset'. The compiler/linker
have no idea about where the stack_pointer will point at the time of
executing f(), but it knows that if it goes to the address
(stack_pointer + sizeof(int)), it will find 'b'.
The stack has a very rigid structure, since it must be constructed from
the bottom to the top, by pushing and popping values. This structure
allows the system to keep a pointer to the current position in the stack.
The heap is not that organized. In modern operating systems, an
application has a given amount of memory it can use as it wishes. In
that space, it can allocate and free memory independently of what is
going on on the stack.
Since a stack is last-in-first-out, it is impossible to force some
memory to be kept. Imagine a pile of plates. When you call a function,
you push, for example, 5 plates on the top of the current stack. When
that function returns, you remove these plates to get back to where you
were. If you had some important information you wanted to keep in one
of these 5 plates, it is now lost.
The solution is to put that important information/plate aside, not in
the stack of plates. The system then tries to find a free space on the
side, in what is called the heap. If it finds the space, it returns a
pointer to that memory space.
But since the heap is "disorganized", that is, your application could
free and allocate memory in a random order, it is impossible for the
compiler or even the system to know in advance what memory will be used.
To make the explanation simpler, imagine the heap as a linear chunk of
memory, say of 10 MB. When an application requests some memory on the
heap, to bypass the rigid stack memory management, the system does a
linear search in that chunk to find some free memory and returns a
pointer to that place. It then marks that place as "allocated".
If 10 applications are running at the same time, requesting and freeing
memory randomly (from the system's point of view), the memory will
become fragmented. Allocated and free memory will be randomly scattered
in the 10 MB chunk.
At that point, when you decide to run your application, you have
absolutely no way to know anything about the current memory state. You
only ask the system for a free space and the system answers with a valid
space, or fails.
That is why stack-allocated objects are taken care of by the system,
since it knows exactly what is allocated. By having the stack of plates
before you, you know how many plates you have to take off to get back to
where you were and to free the memory used by objects. In the heap,
however, it is impossible to know when the memory must be freed. And it
is your task, as the programmer, to keep track of every chunk of memory
you allocated in order to be able to free them.
But why do we call it run-time binding? To me, everything is decided at compile time; that is, the compiler records an offset from the stack or the heap.
The heap is a disorganized pool of memory. Asking for free memory today
will not return the same thing as tomorrow. It depends on what is in
the heap right now. The stack, however, will _always_ have the same
structure and the same behavior. Arguments will always be at fixed
offsets from the current stack pointer, wherever the stack pointer points.
I hope it is a little clearer now. If not, continue asking and wait for
somebody else to answer, since I don't know if I will be able to be
clearer than that.
Jonathan
"newbiecpp" <ne*******@yahoo.com> wrote in message news:<u9F%c.5283$dC4.1197@trndny06>...
I have a hard time understanding the run-time environment. Let's assume that I have a program that has a simple variable alpha. When this variable is statically allocated, the compiler can use the absolute address of alpha to access it. What confuses me is, when the variable is dynamically allocated, how does the compiler implement it? We don't know the address of the variable until run-time. During compilation, how can we access the alpha variable since we don't know its address yet?
Thanks in advance
A lot has been written already, just a small hint if you are
working under UNIX (for Windows a similar tool might exist):
UNIX> man nm
linux> nm -C program
It has helped me a lot in understanding such issues.
Stephan Brönnimann br****@osb-systems.com
Open source rating and billing engine for communication networks.
"Jonathan Mcdougall" <jo***************@DELyahoo.ca> wrote in message
news:6q********************@wagner.videotron.net...
I hope it is a little clearer now. If not, continue asking and wait for somebody else to answer, since I don't know if I will be able to be clearer than that.
Jonathan
Thanks again. I am clearer than before now. By the way, can you recommend
some good books about assembly and compiler theory? I appreciate. Have a
wonderful weekend.
>>I hope it is a little clearer now. If not, continue asking and wait for somebody else to answer, since I don't know if I will be able to be clearer than that.
Thanks again. I am clearer than before now. By the way, can you recommend some good books about assembly and compiler theory?
No, nor am I a good source of information about it. You would really be
better off in a newsgroup servicing your platform.
Jonathan
"newbiecpp" <ne*******@yahoo.com> wrote in message news:igq0d.61$W73.19@trndny03...
|
| "Jonathan Mcdougall" <jo***************@DELyahoo.ca> wrote in message
| news:6q********************@wagner.videotron.net...
| > I hope it is a little clearer now. If not, continue asking and wait for
| > somebody else to answer, since I don't know if I will be able to be
| > clearer than that.
| >
| > Jonathan
|
| Thanks again. I am clearer than before now. By the way, can you recommend
| some good books about assembly and compiler theory? I appreciate. Have a
| wonderful weekend.
You can go to the following link: http://webster.cs.ucr.edu/
...and download "The Art of Assembly" book, written
by "Randall Hyde". It is now also available in hard
copy, and it is one of the best books on the
topic available to beginners.
Additionally, you can download many other tools from
Randall's site above.
Cheers.
Chris Val