arrays = pointers?

Zytan

I know there are no pointers in C#, but if you do:
a = b;
and a and b are both arrays, they now both point to the same memory
(changing one changes the other). So, it makes them seem like
pointers.

Can someone please explain why? thanks.

Zytan

Feb 25 '07 #1

Arne Vajhøj

Zytan wrote:

I know there are no pointers in C#, but if you do:
a = b;
and a and b are both arrays, they now both point to the same memory
(changing one changes the other). So, it makes them seem like
pointers.

Can someone please explain why? thanks.

It is a reference, which indeed is very similar to a
C/C++ pointer in some aspects.

But you can not do pointer manipulation with it.

Arne

PS: You do have pointers in C# in unsafe mode.
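
A minimal sketch of the aliasing behaviour described in the question, using only the base class library. The class and method names are made up for illustration.

```csharp
using System;

// Hypothetical demo class, not from the thread.
public static class ArrayAliasDemo
{
    // Assigning one array variable to another copies the reference,
    // so both variables refer to the same array object.
    public static bool AssignmentSharesStorage()
    {
        int[] b = { 1, 2, 3 };
        int[] a = b;   // copies the reference, not the elements
        a[0] = 99;     // writes into the single array both variables refer to
        return b[0] == 99 && object.ReferenceEquals(a, b);
    }

    // Clone() allocates a new array, so the copy is independent.
    public static bool CloneIsIndependent()
    {
        int[] b = { 1, 2, 3 };
        int[] a = (int[])b.Clone();
        a[0] = 99;     // only the clone changes
        return b[0] == 1 && !object.ReferenceEquals(a, b);
    }
}
```

Both methods should return true: the first because a and b are two names for one array, the second because Clone() produces a separate array.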

Feb 25 '07 #2

Zytan

It is a reference, which indeed is very similar to a

C/C++ pointer in some aspects.

But you can not do pointer manipulation with it.

Arne

PS: You do have pointers in C# in unsafe mode.

Ok. thanks.

OK, I just did some testing: when you pass an array into a
function, whether or not the parameter is marked ref, any changes the
function makes to the array's contents are still there when the function
returns. So, they are like pointers.

Zytan

Feb 25 '07 #3

Bruce Wood

On Feb 24, 6:49 pm, "Zytan" <zytanlith...@yahoo.com> wrote:

It is a reference, which indeed is very similar to a
C/C++ pointer in some aspects.

But you can not do pointer manipulation with it.

Arne

PS: You do have pointers in C# in unsafe mode.

Ok. thanks.

Ok, i just did some testing, and when you pass an array into a
function, regardless if it is ref or not, when the function changes
the array's contents, they are still changed when the function
returns. So, they are like pointers.

Zytan

Yes, they are. I think of them as pointers, because that's a familiar
idiom for me.

However, as Arne pointed out, you can't do pointer arithmetic with
them, and, although you'll never notice it, unlike pointers references
can change at any moment as the Garbage Collector compacts the heap
and moves objects around. As I said, though, you won't notice, because
the reference will always refer to the same object instance, even if
it doesn't always point to the same place in memory.

Feb 25 '07 #4

Zytan

Yes, they are. I think of them as pointers, because that's a familiar

idiom for me.

However, as Arne pointed out, you can't do pointer arithmetic with
them, and, although you'll never notice it, unlike pointers references
can change at any moment as the Garbage Collector compacts the heap
and moves objects around. As I said, though, you won't notice, because
the reference will always refer to the same object instance, even if
it doesn't always point to the same place in memory.

Bruce,

Thanks for being so clear. I understand just what you're saying.

They are like pointers, but you can't think of them as actual memory
addresses, since C# doesn't give you that kind of access, for good
reason: the GC moves things around on you, as you say, and
this means pointer arithmetic is no good. Also, since it's not really
a pointer, you never need to use ptr-> or *ptr. You always use ptr.
(ptr being an incorrect abbreviation here, since it's not a pointer.)

So, it's like a pointer. But, it's really a reference.

I just did a test and found that a class that has a private array, if
it passes this out in a property, the caller has the 'pointer' and
thus can change the contents of the private array. (I guess you'd
have to make a copy to pass back in this case.) So, in this case it
acts like a pointer in C, so it's a nice way to think of them, but it
is not a direct analogy. Got it.

So, they are called 'references'? Is this the proper term?

Zytan

Feb 26 '07 #5

Tom Leylan

"Zytan" <zy**********@yahoo.com> wrote...

Hi Zytan:

They are like pointers, but you can't think of it as an actual memory
addresses, since C# doesn't give you that kind of access, for good
reason -- since the GC moves things around on you, as you say, and
this means pointer arithmetic is no good. Also, since it's not really
a pointer, you never need to use ptr-> or *ptr. you always use ptr.
(ptr being an incorrect abbr. here, since it's not a pointer).

So, it's like a pointer. But, it's really a reference.

A pointer in C is a reference. It's a reference to an area in memory (as
you know) and the act of getting the value at the address of the pointer is
called "dereferencing".

I just did a test and found that a class that has a private array, if
it passes this out in a property, the caller has the 'pointer' and
thus can change the contents of the private array. (I guess you'd
have to make a copy to pass back in this case.) So, in this case it
acts like a pointer in C, so it's a nice way to think of them, but it
is not a direct analogy. Got it.

So, they are called 'references'? Is this the proper term?

I'd tend to get out of the habit of thinking of C# (and VB) references as
pointers. It could mess you up and they aren't pointers. They are called
references because that is what they are, and the term itself is used in other
situations, such as reference counting. They are an indirect route to a stored value.

If you pass an array or any object as a parameter to another method you
generally pass it by value. This value is a reference to the array and as
such you can change the contents of the array but not the reference itself.
If on the other hand you pass the actual reference to the array the method
has access to the contents as well as to the location that holds the
reference which means it can replace the original array with another one.
Sometimes you want that but often you don't. (See: pass by value; pass by
reference)
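
To make the distinction concrete, here is a small sketch. The class and method names are hypothetical; only the `ref` keyword and array semantics come from the language itself.

```csharp
// Hypothetical demo class, not from the thread.
public static class ParameterDemo
{
    // The array reference is passed by value: element writes are visible
    // to the caller, but reassigning the parameter only changes the
    // method's local copy of the reference.
    public static void ModifyAndReplace(int[] values)
    {
        values[0] = 42;                   // caller sees this
        values = new int[] { 7, 7, 7 };   // caller does not see this
    }

    // With ref, the caller's variable itself is passed, so the method
    // can make it refer to a different array.
    public static void Replace(ref int[] values)
    {
        values = new int[] { 7, 7, 7 };   // caller sees the new array
    }
}
```

After calling ModifyAndReplace(data) on { 1, 2, 3 }, the caller still holds the original array, now { 42, 2, 3 }. After Replace(ref data), the caller's variable refers to the new { 7, 7, 7 } array.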

Hope this helps.

Feb 26 '07 #6

Göran Andersson

Zytan wrote:

Yes, they are. I think of them as pointers, because that's a familiar
idiom for me.

However, as Arne pointed out, you can't do pointer arithmetic with
them, and, although you'll never notice it, unlike pointers references
can change at any moment as the Garbage Collector compacts the heap
and moves objects around. As I said, though, you won't notice, because
the reference will always refer to the same object instance, even if
it doesn't always point to the same place in memory.

Bruce,

Thanks for being so clear. I understand just what you're saying.

They are like pointers, but you can't think of it as an actual memory
addresses, since C# doesn't give you that kind of access, for good
reason -- since the GC moves things around on you, as you say, and
this means pointer arithmetic is no good. Also, since it's not really
a pointer, you never need to use ptr-> or *ptr. you always use ptr.
(ptr being an incorrect abbr. here, since it's not a pointer).

So, it's like a pointer. But, it's really a reference.

I just did a test and found that a class that has a private array, if
it passes this out in a property, the caller has the 'pointer' and
thus can change the contents of the private array. (I guess you'd
have to make a copy to pass back in this case.)

Yes, reference types (objects) are never copied automatically. If you
want a copy you have to explicitly create one.

An alternative to creating a copy would be to return a wrapper object
that contains the array but only allows read access to it.
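
A rough sketch of the three options (handing out the array itself, a copy, or a read-only wrapper). The Scores class and its property names are invented for illustration; Array.AsReadOnly and ReadOnlyCollection<T> are from the base class library.

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical class, not from the thread.
public class Scores
{
    private readonly int[] values = { 10, 20, 30 };

    // Hands out the private array itself: callers can overwrite elements.
    public int[] Leaky { get { return values; } }

    // Defensive copy: callers get their own array to modify freely.
    public int[] Copy { get { return (int[])values.Clone(); } }

    // Read-only wrapper around the same array: no copy is made, and the
    // wrapper exposes no way to assign elements.
    public ReadOnlyCollection<int> ReadOnly
    {
        get { return Array.AsReadOnly(values); }
    }
}
```

Note that the wrapper is a view, not a snapshot: if the class later changes its private array, anyone holding the wrapper sees the new contents.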

So, in this case it
acts like a pointer in C, so it's a nice way to think of them, but it
is not a direct analogy. Got it.

So, they are called 'references'? Is this the proper term?

Yes. It's the proper term.

--
Göran Andersson
_____
http://www.guffa.com

Feb 26 '07 #7

Bruce Wood

On Feb 25, 9:59 pm, "Tom Leylan" <tley...@nospam.net> wrote:

"Zytan" <zytanlith...@yahoo.com> wrote...

Hi Zytan:

They are like pointers, but you can't think of it as an actual memory
addresses, since C# doesn't give you that kind of access, for good
reason -- since the GC moves things around on you, as you say, and
this means pointer arithmetic is no good. Also, since it's not really
a pointer, you never need to use ptr-> or *ptr. you always use ptr.
(ptr being an incorrect abbr. here, since it's not a pointer).

Pointers are at arm's length in C# for security reasons, as well. You
can do truly nasty, dirty things with pointer arithmetic and aliasing
(casting) in C. You aren't allowed to do any of that in C# because
it's not verifiably correct, and would be a massive security hole.
(Well, you can, but you have to flag the code as "unsafe", which means
exactly what it sounds like: "security hole".)

So, it's like a pointer. But, it's really a reference.

A pointer in C is a reference. It's a reference to an area in memory (as
you know) and the act of getting the value at the address of the pointer is
called "dereferencing".

I just did a test and found that a class that has a private array, if
it passes this out in a property, the caller has the 'pointer' and
thus can change the contents of the private array. (I guess you'd
have to make a copy to pass back in this case.) So, in this case it
acts like a pointer in C, so it's a nice way to think of them, but it
is not a direct analogy. Got it.

So, they are called 'references'? Is this the proper term?

I'd tend to get out of the habit of thinking of C# (and VB) references as
pointers. It could mess you up and they aren't pointers.

Oh, c'mon... they _are_ pointers. They're just pointers that are under
total control of the CLR, not under your control. I, too, come from a
C background, and I had no problem understanding what was going on
with references because I immediately saw them as good ol' pointers.
Hands-off pointers, to be sure, but pointers just the same. And, in
fact, if you could snoop inside the stack frame / heap at runtime,
you'd see... pointers. It's just that in C you can play all sorts of
games with them, and in C# they're so far outside your control that
they're almost invisible, but they _are_ there.

If you pass an array or any object as a parameter to another method you
generally pass it by value. This value is a reference to the array and as
such you can change the contents of the array but not the reference itself.

Absolutely, and the analogy in C is that if you pass a pointer to some
structure then you can modify the contents of the structure, but you
cannot change to which structure the pointer points. In order to do
the latter in C you have to pass a pointer to the pointer. In C# it's
no different: reference types (classes) by default have their
references passed by value, which is to say that the runtime passes a
pointer to them. If you want to change the reference to point to a
different object, then you have to pass the reference type by "ref":
in other words, you have to pass a reference to the reference, or a
pointer to the pointer.

What's really different in C# is the interpretation of what is a
"struct": beware, beware of this one! "struct" in C# is very, very
different from "class". Value types (which include "structs") are
truly passed by value: the value is pushed on the stack, so the called
method gets a copy of the value. Changes to the value are not
reflected in the caller's argument.
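
A short sketch of that difference, with made-up type and method names:

```csharp
// Hypothetical types, not from the thread.
public struct PointValue { public int X; }   // value type
public class PointRef { public int X; }      // reference type

public static class StructVsClassDemo
{
    // The struct is copied into the parameter, so the caller's value
    // is untouched.
    public static void BumpCopy(PointValue p) { p.X++; }

    // The reference is copied, but it still refers to the caller's
    // object, so the change is visible to the caller.
    public static void BumpObject(PointRef p) { p.X++; }

    // Passing the struct by ref lets the method modify the caller's
    // variable directly.
    public static void BumpByRef(ref PointValue p) { p.X++; }
}
```

Starting from X == 0 in each case, BumpCopy leaves the caller's X at 0, while BumpObject and BumpByRef each leave it at 1.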

If on the other hand you pass the actual reference to the array the method
has access to the contents as well as to the location that holds the
reference which means it can replace the original array with another one.

Be careful with the wording there! I think you wanted to say, "If on
the other hand you pass the array by reference..." which means using
the "ref" keyword, which means passing a reference to the array by
reference. When I hear "pass a reference to the array," I think of
passing the reference by value, which is the usual case.

Feb 26 '07 #8

Göran Andersson

Tom Leylan wrote:

I'd tend to get out of the habit of thinking of C# (and VB) references as
pointers.

Me too.

It could mess you up and they aren't pointers.

Well, yes and no. :)

Under the hood references _are_ just pointers.

On the other hand, the concept of a reference is not the same as the
concept of a pointer. A reference is actually a much easier concept to
work with.

--
Göran Andersson
_____
http://www.guffa.com

Feb 26 '07 #9

Zytan

A pointer in C is a reference. It's a reference to an area in memory (as

you know) and the act of getting the value at the address of the pointer is
called "dereferencing".

I guess i think of C references as pointers internally. I think of
them as one and the same, so it doesn't make any difference to me.
But, for C#, I'd like to get in-the-know with what the proper
terminology is.

I'd tend to get out of the habit of thinking of C# (and VB) references as
pointers. It could mess you up and they aren't pointers. They are called
references because they are and the term itself is used in other situation,
reference counting, etc. They are an indirect route to a stored value.

yes. They are references in a literal sense of the term, so it all
makes sense. I'll stop using the term 'pointer'.

If you pass an array or any object as a parameter to another method you
generally pass it by value. This value is a reference to the array and as
such you can change the contents of the array but not the reference itself.
If on the other hand you pass the actual reference to the array the method
has access to the contents as well as to the location that holds the
reference which means it can replace the original array with another one.
Sometimes you want that but often you don't. (See: pass by value; pass by
reference)

I understand all of this. Exactly like C++ with a pointer to some
data. This is why probably many people think of C# references as
pointers. Really, they are almost the same thing, but since C#
references can move about, they cannot be pointers. Pointers are more
strict and absolute. So, as long as this is known, i think it is
clear why C# references aren't pointers, although since references are
so similar to pointers in many ways, it helps to know what pointers
can/can't do since it can be applied to references, as you've shown
above.

Hope this helps.

it does, thanks.

Zytan

Feb 26 '07 #10

Zytan

Yes, reference types (objects) are never copied automatically. If you

want a copy you have to explicitly create one.

Yes, the reference itself is copied, but not what it references. How
can I tell which types are reference types? I know arrays
would be, from my C background, but how else would I know? Would it
be proper to assume all native types (int, double) are not, and
everything else (objects) is?

An alternative to creating a copy would be to return a wrapper object
that contains the array but only allows read access to it.

Great idea.

So, they are called 'references'? Is this the proper term?

Yes. It's the proper term.

Thanks, Göran.

Zytan

Feb 26 '07 #11

Zytan

Pointers are at arms' length in C# for security reasons, as well. You

can do truly nasty, dirty things with pointer arithmetic and aliasing
(casting) in C. You aren't allowed to do any of that in C# because
it's not verifiably correct, and would be a massive security hole.
(Well, you can, but you have to flag the code as "unsafe", which means
exactly what it sounds like: "security hole".)

Yes, i was just reading about 'unsafe'. So, you can get access to
pointers, but they don't want you to, because of the demons you can
produce with it.

I'd tend to get out of the habit of thinking of C# (and VB) references as
pointers. It could mess you up and they aren't pointers.

Oh, c'mon... they _are_ pointers. They're just pointers that are under
total control of the CLR, not under your control. I, too, come from a
C background, and I had no problem understanding what was going on
with references because I immediately saw them as good ol' pointers.
Hands-off pointers, to be sure, but pointers just the same. And, in
fact, if you could snoop inside the stack frame / heap at runtime,
you'd see... pointers. It's just that in C you can play all sorts of
games with them, and in C# they're so far outside your control that
they're almost invisible, but they _are_ there.

Right. Basically, they are pointers.

But if you were to grab a pointer, and store the address it points to,
and later return to use that stored address, in C# you cannot be 100%
sure that it still points to the object it used to. Is that right?

Absolutely, and the analogy in C is that if you pass a pointer to some
structure then you can modify the contents of the structure, but you
cannot change to which structure the pointer points. In order to do
the latter in C you have to pass a pointer to the pointer. In C# it's
no different: reference types (classes) by default have their
references passed by value, which is to say that the runtime passes a
pointer to them. If you want to change the reference to point to a
different object, then you have to pass the reference type by "ref":
in other words, you have to pass a reference to the reference, or a
pointer to the pointer.

Yes, Tom showed an exact analog to C++ pointers (passing by ref or
val) or C (using a pointer to a pointer, to emulate C++'s pass by ref
or val).

What's really different in C# is the interpretation of what is a
"struct": beware, beware of this one! "struct" in C# is very, very
different from "class". Value type (which includes "structs") are
truly passed by value: the value is pushed on the stack, so the called
method gets a copy of the value. Changes to the value are not
reflected in the caller's argument.

Ok, so structs are like native types, that are always passed by
value? And classes are like non-native types that are always passed
by reference? I noticed Petzold touch upon this in the intro to
his .NET Book Zero (available for free):
http://www.charlespetzold.com/dotnet/index.html
so it must be a big deal.

If on the other hand you pass the actual reference to the array the method
has access to the contents as well as to the location that holds the
reference which means it can replace the original array with another one.

Be careful with the wording there! I think you wanted to say, "If on
the other hand you pass the array by reference..." which means using
the "ref" keyword, which means passing a reference to the array by
reference. When I hear "pass a reference to the array," I think of
passing the reference by value, which is the usual case.

yes, nice catch.

Zytan

Feb 26 '07 #12

Zytan

It could mess you up and they aren't pointers.

Well, yes and no. :)

Under the hood references _are_ just pointers.

On the other hand, the concept of a reference is not the same as the
concept of a pointer. A reference is actually a much easier concept to
work with.

yes, good point. thanks.

Zytan

Feb 26 '07 #13

Jon Skeet [C# MVP]

Zytan <zy**********@yahoo.com> wrote:

Yes, reference types (objects) are never copied automatically. If you
want a copy you have to explicitly create one.

Yes, the reference itself is copied, but not what it references. How
can i can tell what types are reference types or not? I know arrays
would be, from my C background, but how else would i know? Would it
be proper to assume all native types (int, double) are not, and
everything else (objects) are?

Not quite. There are other value types - DateTime, Guid etc. Basically,
in MSDN, if the description is of a "struct" that means a value type.
If it's of a "class" that means a reference type.
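
If you would rather ask the runtime than the documentation, the Type.IsValueType property gives the same answer. The helper below is only an illustrative wrapper; its name is invented.

```csharp
using System;

// Hypothetical helper, not from the thread.
public static class TypeKindDemo
{
    // True for structs (including int, double, DateTime, Guid) and enums,
    // false for classes, arrays, strings, interfaces and delegates.
    public static bool IsValueType<T>()
    {
        return typeof(T).IsValueType;
    }
}
```

For example, IsValueType<int>(), IsValueType<DateTime>() and IsValueType<Guid>() are true, while IsValueType<string>() and IsValueType<int[]>() are false.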

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

Feb 26 '07 #14

Jon Skeet [C# MVP]

Zytan <zy**********@yahoo.com> wrote:

What's really different in C# is the interpretation of what is a
"struct": beware, beware of this one! "struct" in C# is very, very
different from "class". Value type (which includes "structs") are
truly passed by value: the value is pushed on the stack, so the called
method gets a copy of the value. Changes to the value are not
reflected in the caller's argument.

Ok, so structs are like native types, that are always passed by
value? And classes are like non-native types that are always passed
by reference?

No, classes are *not* passed by reference. References (to instances of
classes) are passed by value. It's well worth making the distinction.
See http://pobox.com/~skeet/csharp/parameters.html

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

Feb 26 '07 #15

Bruce Wood

On Feb 26, 10:05 am, "Zytan" <zytanlith...@yahoo.com> wrote:

Pointers are at arms' length in C# for security reasons, as well. You
can do truly nasty, dirty things with pointer arithmetic and aliasing
(casting) in C. You aren't allowed to do any of that in C# because
it's not verifiably correct, and would be a massive security hole.
(Well, you can, but you have to flag the code as "unsafe", which means
exactly what it sounds like: "security hole".)

Yes, i was just reading about 'unsafe'. So, you can get access to
pointers, but they don't want you to, because of the demons you can
produce with it.

Yes, and that's why as soon as your program contains an "unsafe"
section it needs elevated privileges to run that section. The CLR
won't allow any Tom-Dick-or-Harry program from the Web to run unsafe
code (unless you're foolish enough to set up your machine's code-based
security that way).

<snip>

But if you were to grab a pointer, and store the address it points to,
and later return to use that stored address, in C# you cannot be 100%
sure that it still points to the object it used to. Is that right?

Yes, and that's why C# also provides mechanisms to "pin" objects when
calling across to unmanaged (good ol' Win32) code: while code outside
the CLR is looking at / manipulating managed data, the GC has to be
told not to move that data during compaction or... boom!
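
One such mechanism is GCHandle with GCHandleType.Pinned. The sketch below pins an int array, reads its first element through the raw address, and unpins it again. The class and method names are invented, and reading the value back with Marshal.ReadInt32 merely stands in for whatever unmanaged code would do with the address.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical demo class, not from the thread.
public static class PinningDemo
{
    public static int ReadFirstWhilePinned(int[] data)
    {
        // Pin the array so the GC will not move it while we hold the address.
        GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        try
        {
            // For a pinned array this is the address of the first element.
            IntPtr address = handle.AddrOfPinnedObject();
            return Marshal.ReadInt32(address);
        }
        finally
        {
            // Always release the handle, or the array stays pinned.
            handle.Free();
        }
    }
}
```

The address is only meaningful between Alloc and Free; once the handle is released the GC is free to move the array again.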

What's really different in C# is the interpretation of what is a
"struct": beware, beware of this one! "struct" in C# is very, very
different from "class". Value type (which includes "structs") are
truly passed by value: the value is pushed on the stack, so the called
method gets a copy of the value. Changes to the value are not
reflected in the caller's argument.

Ok, so structs are like native types, that are always passed by
value? And classes are like non-native types that are always passed
by reference? I noticed Petzold touch upon this in the intro to
his .NET Book Zero (available for free): http://www.charlespetzold.com/dotnet/index.html
so it must be a big deal.

It is for ex-C hacks like us. I freely admit that I, too, saw "struct"
and thought, "Oh, a lightweight class." No, it's not. It acts, as you
said, like a native type, which is tremendously useful, but if you
abuse it by trying to create a "light" class then you will suddenly
find your code becoming tortured and incomprehensible.

There are lots of threads here about value versus reference
semantics... I don't really want to hijack this thread with that
discussion. Suffice to say that the distinction _is_ a big deal, but
once you "get it" it's not complicated.

Feb 26 '07 #16

Göran Andersson

Zytan wrote:

How
can i can tell what types are reference types or not? I know arrays
would be, from my C background, but how else would i know? Would it
be proper to assume all native types (int, double) are not, and
everything else (objects) are?

Yes, the native types are value types, but not everything else is a
reference type.

You can easily see the difference in the documentation. Value types are
listed as structures, for example "Int32 structure" and reference types
are listed as classes, for example "String class".

--
Göran Andersson
_____
http://www.guffa.com

Feb 26 '07 #17

Ben Voigt

"Zytan" <zy**********@yahoo.com> wrote in message
news:11*********************@z35g2000cwz.googlegroups.com...

Pointers are at arms' length in C# for security reasons, as well. You
can do truly nasty, dirty things with pointer arithmetic and aliasing
(casting) in C. You aren't allowed to do any of that in C# because
it's not verifiably correct, and would be a massive security hole.
(Well, you can, but you have to flag the code as "unsafe", which means
exactly what it sounds like: "security hole".)

Yes, i was just reading about 'unsafe'. So, you can get access to
pointers, but they don't want you to, because of the demons you can
produce with it.

I'd tend to get out of the habit of thinking of C# (and VB) references as
pointers. It could mess you up and they aren't pointers.

Oh, c'mon... they _are_ pointers. They're just pointers that are under
total control of the CLR, not under your control. I, too, come from a
C background, and I had no problem understanding what was going on
with references because I immediately saw them as good ol' pointers.
Hands-off pointers, to be sure, but pointers just the same. And, in
fact, if you could snoop inside the stack frame / heap at runtime,
you'd see... pointers. It's just that in C you can play all sorts of
games with them, and in C# they're so far outside your control that
they're almost invisible, but they _are_ there.

Right. Basically, they are pointers.

But if you were to grab a pointer, and store the address it points to,
and later return to use that stored address, in C# you cannot be 100%
sure that it still points to the object it used to. Is that right?

Right. You can get the integral value of a reference using IntPtr... but
there's no typesafe way to make an integer back into a reference (though you
can use an unsafe pointer to do so). Still, you can see the actual
in-memory address, and watch to see how it moves (hint, one easy way to make
it move is pinning it with the GCHandle constructor overload).

As long as you have a reference, C# (actually the CLR) tracks it and keeps
it up-to-date. As soon as you put the address in some other variable, like
an integer, the CLR no longer knows you're using it, so the object can be
moved or deleted. C++/CLI actually has a pointer type (interior_ptr) which
allows pointer arithmetic and automatically follows objects moved by the
CLR.

Feb 26 '07 #18

Willy Denoyette [MVP]

"Ben Voigt" <rb*@nospam.nospam> wrote in message
news:ug****************@TK2MSFTNGP04.phx.gbl...

"Zytan" <zy**********@yahoo.com> wrote in message
news:11*********************@z35g2000cwz.googlegroups.com...

As long as you have a reference, C# (actually the CLR) tracks it and keeps it up-to-date.
As soon as you put the address in some other variable, like an integer, the CLR no longer
knows you're using it, so the object can be moved or deleted. C++/CLI actually has a
pointer type (interior_ptr) which allows pointer arithmetic and automatically follows
objects moved by the CLR.

True, but this pointer arithmetic renders the code just as unsafe as the /unsafe pointer
support in C#.

Willy.

Feb 26 '07 #19

Zytan

Ok, so structs are like native types, that are always passed by

value? And classes are like non-native types that are always passed
by reference?

No, classes are *not* passed by reference. References (to instances of
classes) are passed by value. It's well worth making the distinction.

Right, the reference (pointer) is passed by value, and cannot change,
but what it refers to can change. This is what i *meant* by 'classes
are passed by reference', and i see the difference.

See http://pobox.com/~skeet/csharp/parameters.html

thanks for the link. it links to many other pages, too, and i'll read
them all.

Zytan

Feb 26 '07 #20

Zytan

Yes, and that's why as soon as your program contains an "unsafe"

section it needs elevated privileges to run that section. The CLR
won't allow any Tom-Dick-or-Harry program from the Web to run unsafe
code (unless you're foolish enough to set up your machine's code-based
security that way).

Yes, i see.

Yes, and that's why C# also provides mechanisms to "pin" objects when
calling across to unmanaged (good ol' Win32) code: while code outside
the CLR is looking at / manipulating managed data, the GC has to be
told not to move that data during compaction or... boom!

Cool!

It is for ex-C hacks like us. I freely admit that I, too, saw "struct"
and thought, "Oh, a lightweight class." No, it's not. It acts, as you
said, like a native type, which is tremendously useful, but if you
abuse it by trying to create a "light" class then you will suddenly
find your code becoming tortured and incomprehensible.

Well, i never use struct as a class, so i won't run into that
problem. But, realizing structs were lightweight classes for C++
helped me a lot. But, this changes for C#.

There are lots of threads here about value versus reference
semantics... I don't really want to hijack this thread with that
discussion. Suffice to say that the distinction _is_ a big deal, but
once you "get it" it's not complicated.

Ok. I will research this. thanks.

Zytan

Feb 26 '07 #21

Zytan

Right. You can get the integral value of a reference using IntPtr... but

there's no typesafe way to make an integer back into a reference (though you
can use an unsafe pointer to do so). Still, you can see the actual
in-memory address, and watch to see how it moves (hint, one easy way to make
it move is pinning it with the GCHandle constructor overload).

Cool!

As long as you have a reference, C# (actually the CLR) tracks it and keeps
it up-to-date. As soon as you put the address in some other variable, like
an integer, the CLR no longer knows you're using it, so the object can be
moved or deleted. C++/CLI actually has a pointer type (interior_ptr) which
allows pointer arithmetic and automatically follows objects moved by the
CLR.

I see. thanks for the information!

Zytan

Feb 26 '07 #22

Zytan

Yes, the reference itself is copied, but not what it references. How

can i can tell what types are reference types or not? I know arrays
would be, from my C background, but how else would i know? Would it
be proper to assume all native types (int, double) are not, and
everything else (objects) are?

Not quite. There are other value types - DateTime, Guid etc. Basically,
in MSDN, if the description is of a "struct" that means a value type.
If it's of a "class" that means a reference type.

Yup, got it. This is what I was thinking. Thanks!

Zytan

Feb 26 '07 #23

Zytan

Yes, the native types are value types, but not everything else are

reference types.

You can easily see the difference in the documentation. Value types are
listed as structures, for example "Int32 structure" and reference types
are listed as classes, for example "String class".

Thank you, Göran. I was just looking there and noticed Int32, and I
got it, and you guys have confirmed it. structs are value, classes
are references. Easy enough, and makes sense. thanks again!

Zytan

Feb 26 '07 #24

Arne Vajhøj

Zytan wrote:

Ok, so structs are like native types, that are always passed by
value? And classes are like non-native types that are always passed
by reference?

No, classes are *not* passed by reference. References (to instances of
classes) are passed by value. It's well worth making the distinction.

Right, the reference (pointer) is passed by value, and cannot change,
but what it refers to can change. This is what i *meant* by 'classes
are passed by reference', and i see the difference.

Maybe the following comparison would give some insight:

C#                        C
void m(valtyp v)          void f(typ v)
void m(ref valtyp v)      void f(typ *v)
void m(reftyp v)          void f(typ *v)
void m(ref reftyp v)      void f(typ **v)

Arne

Feb 27 '07 #25

Ben Voigt

Well, i never use struct as a class, so i won't run into that
problem. But, realizing structs were lightweight classes for C++
helped me a lot. But, this changes for C#.

C++ structs aren't lightweight classes, they are fully classes. The *only*
difference is that the struct body begins with an implicit "public:" while
the class body begins with implicit "private:". Structs have all the
capabilities that a "heavyweight" class does: virtual functions, multiple
inheritance, virtual inheritance, pointers to members, templated members,
etc.

C# structs are different though... and C++/CLI did a good job of making this
evident. The rough correspondence is as follows:

C# struct (default internal) = C++/CLI value class (default private) = C++/CLI value struct (default public)
C# class (default internal) = C++/CLI ref class (default private) = C++/CLI ref struct (default public)

Feb 27 '07 #26

Zytan

Maybe the following comparison would give some insight:

C#                      C
void m(valtyp v)        void f(typ v)
void m(ref valtyp v)    void f(typ *v)
void m(reftyp v)        void f(typ *v)
void m(ref reftyp v)    void f(typ **v)

Yes, Arne, I follow. And it's basically true. But note that this
*implies*:
reference passing of a value type = value passing of a reference type.
and that's incorrect.

http://www.yoda.arachsys.com/csharp/parameters.html
scroll down to:
Sidenote: what is the difference between passing a value object by
reference and a reference object by value?

Zytan

Feb 27 '07 #27

Zytan

C++ structs aren't lightweight classes, they are fully classes. The *only*
difference is that the struct body begins with an implicit "public:" while
the class body begins with implicit "private:". Structs have all the
capabilities that a "heavyweight" class does, virtual functions, multiple
inheritance, virtual inheritance, pointers to members, templated members,
etc.

Right. I knew that. I was just using another poster's term of
'lightweight'. I agree they are exactly the same besides for the
default privileges. And this helped me a lot to understand what
classes were all about.

Zytan

Feb 27 '07 #28

Arne Vajhøj

Zytan wrote:

>Maybe the following comparison would give some insight:

C#                      C
void m(valtyp v)        void f(typ v)
void m(ref valtyp v)    void f(typ *v)
void m(reftyp v)        void f(typ *v)
void m(ref reftyp v)    void f(typ **v)

Yes, Arne, I follow. And it's basically true. But note that this
*implies*:
reference passing of a value type = value passing of a reference type.
and that's incorrect.

It does not imply that. It implies that the best C equivalent
of both is a pointer.

http://www.yoda.arachsys.com/csharp/parameters.html
scroll down to:
Sidenote: what is the difference between passing a value object by
reference and a reference object by value?

It is a very good description of the difference between
reference types and value types in C#, but in the comparison
with C the point is slightly different.

If we use C++ to avoid malloc (and ignore the
fact that C++ has the ref specifier &):

void m(reftyp v)
{
v = new reftyp();
}

is:

void m(typ *v)
{
v = new typ();
}

but:

void m(ref valtyp v)
{
v = new valtyp();
}

is:

void m(typ *v)
{
*v = *(new typ());
}

The difference is not so much in the declaration but
in the meaning of the = operator.
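
The same contrast can be observed from the C# side alone. This is only a sketch, and `Counter` and `CounterBox` are hypothetical types introduced for the example. Both methods assign a fresh instance to their parameter, but only the `ref` struct version changes what the caller sees.

```csharp
using System;

public struct Counter { public int Count; }     // value type (hypothetical)
public class CounterBox { public int Count; }   // reference type (hypothetical)

public static class AssignmentMeaning
{
    // Comparable to "*v = ..." in C: the assignment overwrites the
    // caller's storage in place, because c aliases the caller's variable.
    static void Reset(ref Counter c) { c = new Counter(); }

    // Comparable to "v = ..." in C: the assignment rebinds only the
    // local copy of the reference; the caller's object is untouched.
    static void Reset(CounterBox b) { b = new CounterBox(); }

    // Returns the two counts as the caller sees them after both resets.
    public static int[] Demo()
    {
        Counter c = new Counter { Count = 5 };
        CounterBox b = new CounterBox { Count = 5 };

        Reset(ref c);
        Reset(b);

        return new[] { c.Count, b.Count };
    }
}
```

`AssignmentMeaning.Demo()` should return 0 and 5: the struct was overwritten through the `ref` parameter, while the class instance the caller holds was never touched.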

Arne

Feb 28 '07 #29

Peter Duniho

On Tue, 27 Feb 2007 01:54:38 +0800, Zytan <zy**********@yahoo.com> wrote:

I guess i think of C references as pointers internally. I think of
them as one and the same, so it doesn't make any difference to me.
But, for C#, I'd like to get in-the-know with what the proper
terminology is.

I don't think that's really what Tom was talking about though. A "C
reference" is a particular syntax that hides the pointer syntax, but still
acts like a pointer. On the other hand, a general "reference" is simply a
way of saying that one thing refers to another. I think it was this more
general meaning that Tom was talking about.

>I'd tend to get out of the habit of thinking of C# (and VB) references
as pointers. It could mess you up and they aren't pointers. They are
called references because they are and the term itself is used in other
situation, reference counting, etc. They are an indirect route to a
stored value.

yes. They are references in a literal sense of the term, so it all
makes sense. I'll stop using the term 'pointer'.

Personally, I think that it's hard to say that they are "references in a
literal sense of the term". After all, as far as I know all uses of the
word "reference" in programming use it in a literal sense of the term.
That is, they refer to something else (as opposed to being the thing to
which they refer).

I think that the real reason to not use the term "pointer" is that that's
not the term that the .NET documentation uses, nor is it the term that all
the other .NET programmers are using. It's really just about using the
right jargon given the context.

As far as conceptualizing what a .NET "reference" is, I suppose that's
more open. Personally, I tend to think of them as "handles", because
that's the term the Mac OS used way back when, when I first ran into the
idea of a relocatable pointer. Of course, in that situation you had to
dereference the handles manually, but the concept was the same.

I guess in reality, the best way to conceptualize a .NET reference is as a
relocatable pointer, since that's really what it is, somehow. I don't
actually know the underlying implementation. Is it actually a pointer to
a pointer, where .NET automatically handles the double dereference for
you? Is it a simple pointer, but one for which .NET maintains some table
of references to copies of the pointer and updates if the underlying block
of memory has to be relocated? The latter seems overly complicated, but
does have certain run-time performance advantages over the former. For
all I know it's some entirely different mechanism from either of those two.

But regardless of the mechanism, a .NET reference is a pointer to a block
of memory that is relocatable. Thus, a relocatable pointer.

Pete

Feb 28 '07 #30

Göran Andersson

Peter Duniho wrote:

I think that the real reason to not use the term "pointer" is that
that's not the term that the .NET documentation uses, nor is it the term
that all the other .NET programmers are using. It's really just about
using the right jargon given the context.

The documentation actually uses the term "pointer", but that is used for
pointers. :)

I guess in reality, the best way to conceptualize a .NET reference is as
a relocatable pointer, since that's really what it is, somehow. I don't
actually know the underlying implementation. Is it actually a pointer
to a pointer, where .NET automatically handles the double dereference
for you? Is it a simple pointer, but one for which .NET maintains some
table of references to copies of the pointer and updates if the
underlying block of memory has to be relocated? The latter seems overly
complicated, but does have certain run-time performance advantages over
the former. For all I know it's some entirely different mechanism from
either of those two.

Yes, it's much simpler than either of those two. The inner workings of a
reference are just a pointer, nothing more. What makes a reference
different from a pointer is simply how the compiler allows you to use it.

--
Göran Andersson
_____
http://www.guffa.com

Feb 28 '07 #31

Zytan

I don't think that's really what Tom was talking about though. A "C
reference" is a particular syntax that hides the pointer syntax, but still
acts like a pointer. On the other hand, a general "reference" is simply a
way of saying that one thing refers to another. I think it was this more
general meaning that Tom was talking about.

Understood. I am stuck thinking 'reference = pointer', since in C
that's what it is internally. But, you're right, the word reference
just means a reference to something. In C, it happens to be
implemented as a pointer.

Personally, I think that it's hard to say that they are "references in a
literal sense of the term". After all, as far as I know all uses of the
word "reference" in programming use it in a literal sense of the term.
That is, they refer to something else (as opposed to being the thing to
which they refer).

You're right.

My meaning of 'reference' was 'C style pointer hidden reference', and
that's wrong.

I think that the real reason to not use the term "pointer" is that that's
not the term that the .NET documentation uses, nor is it the term that all
the other .NET programmers are using. It's really just about using the
right jargon given the context.

Precisely why I am asking the question. Thanks.

As far as conceptualizing what a .NET "reference" is, I suppose that's
more open. Personally, I tend to think of them as "handles", because
that's the term the Mac OS used way back when, when I first ran into the
idea of a relocatable pointer. Of course, in that situation you had to
dereference the handles manually, but the concept was the same.

I see, much like Win32. It was about the only way to implement data
encapsulation with no OOP in C. Handles would be a good term to use,
as well.

I guess in reality, the best way to conceptualize a .NET reference is as a
relocatable pointer, since that's really what it is, somehow. I don't
actually know the underlying implementation.

Yes, I agree.

But regardless of the mechanism, a .NET reference is a pointer to a block
of memory that is relocatable. Thus, a relocatable pointer.

Thanks Pete

Feb 28 '07 #32

Arne Vajhøj

Peter Duniho wrote:

On Tue, 27 Feb 2007 01:54:38 +0800, Zytan <zy**********@yahoo.com> wrote:
>I guess i think of C references as pointers internally. I think of
them as one and the same, so it doesn't make any difference to me.
But, for C#, I'd like to get in-the-know with what the proper
terminology is.

I don't think that's really what Tom was talking about though. A "C
reference" is a particular syntax that hides the pointer syntax, but
still acts like a pointer.

A C++ reference - it is not in C.

Arne

Mar 1 '07 #33

Zytan

I guess i think of C references as pointers internally. I think of
them as one and the same, so it doesn't make any difference to me.
But, for C#, I'd like to get in-the-know with what the proper
terminology is.

I don't think that's really what Tom was talking about though. A "C
reference" is a particular syntax that hides the pointer syntax, but
still acts like a pointer.

A C++ reference - it is not in C.

Yes, sorry. I think I sometimes call C/C++ just as C by accident.
But I did mean C++ reference.

Zytan

Mar 1 '07 #34

Peter Duniho

On Thu, 01 Mar 2007 03:57:17 +0800, Göran Andersson <gu***@guffa.com>
wrote:

The documentation actually uses the term "pointer", but that is used for
pointers. :)

Sorry...I didn't think it was necessary to explicitly state that I was
simply talking about what the .NET documentation uses to describe
references. Of course it uses the word "pointer" elsewhere, just as it
uses all sorts of other words that also do not describe .NET references.

Yes, it's much simpler than either of those two. The inner workings of a
reference are just a pointer, nothing more. What makes a reference
different from a pointer is simply how the compiler allows you to use it.

It is obviously not "just a pointer". Either the compiler, or the .NET
Framework, or both, hide the inner workings from the programmer. But for
the reference to be relocatable, *some* kind of extra work has to be
done. Whether that is for the reference to be a pointer to a pointer, or
if there is some table of relocatable pointers somewhere, or something
else, I don't know. But clearly for .NET to be able to relocate the
object that a reference refers to, there must be more than just a simple
pointer to that object.

Pete

Mar 1 '07 #35

Göran Andersson

Peter Duniho wrote:

On Thu, 01 Mar 2007 03:57:17 +0800, Göran Andersson <gu***@guffa.com>
wrote:
>>
The documentation actually uses the term "pointer", but that is used
for pointers. :)

Sorry...I didn't think it was necessary to explicitly state that I was
simply talking about what the .NET documentation uses to describe
references. Of course it uses the word "pointer" elsewhere, just as it
uses all sorts of other words that also do not describe .NET references.

What I mean to say is that if you want to call references pointers, then
it will conflict with pointers, as they are also called pointers.

>Yes, it's much simpler than either of those two. The inner workings of a
reference are just a pointer, nothing more. What makes a reference
different from a pointer is simply how the compiler allows you to use it.

It is obviously not "just a pointer". Either the compiler, or the .NET
Framework, or both, hide the inner workings from the programmer.

Yes, it's just a pointer.

Actually it's the garbage collector that hides the inner workings. It
uses the type information for the data (the same that is used by
reflection) to recognise the references as references. It's only the
garbage collector that moves objects around, so that's the only place
where references have to be handled differently from pointers.

The compiler only limits what you can do with the reference/pointer, it
doesn't add any extra code for it. Once the code is compiled, it doesn't
care if it's a reference or a pointer. The garbage collector takes care
of everything that has to do with moving objects and updating
references, so the application code never has to worry about that.
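
From the application's side this is easy to observe, at least indirectly. The sketch below uses an invented class name and shows a behavioural check only: it cannot prove that the array was actually moved. It keeps two references to one array, forces a collection, and then confirms that both references still lead to the same, intact object without the program doing any bookkeeping.

```csharp
using System;

public static class GcSurvival
{
    // Returns true if both references still denote the same array
    // after a forced collection and a write through one of them.
    public static bool Demo()
    {
        int[] a = { 1, 2, 3 };
        int[] b = a;   // second reference to the same array

        // Create some short-lived garbage, then force a collection.
        // The runtime is free to move 'a' while compacting; whether it
        // does so here is not something this code can observe.
        for (int i = 0; i < 1000; i++)
        {
            byte[] junk = new byte[1024];
            junk[0] = 1;
        }
        GC.Collect();
        GC.WaitForPendingFinalizers();

        b[0] = 99;     // write through one reference...
        return object.ReferenceEquals(a, b) && a[0] == 99 && a[2] == 3;
    }
}
```

`GcSurvival.Demo()` should return true on any conforming runtime. The point is that the application code contains no step that "fixes up" `a` or `b`; whatever updating is needed happens inside the collector.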

But for
the reference to be relocatable, *some* kind of extra work has to be
done. Whether that is for the reference to be a pointer to a pointer,
or if there is some table of relocatable pointers somewhere, or
something else, I don't know.

No, it's just a pointer. The only difference is that the data type is
different. Just as you can have a pointer to a string or a pointer to an
array, you have a reference to a string or a reference to an array. It's
just a pointer that is flagged to be a reference.

But clearly for .NET to be able to
relocate the object that a reference refers to, there must be more than
just a simple pointer to that object.

Nope, there is nothing more. The only difference is the type information
for it. As all variables have type information anyway, there is nothing
extra added.

--
Göran Andersson
_____
http://www.guffa.com

Mar 1 '07 #36

Peter Duniho

On Thu, 01 Mar 2007 18:28:18 +0800, Göran Andersson <gu***@guffa.com>
wrote:

What I mean to say is that if you want to call references pointers, then
it will conflict with pointers, as they are also called pointers.

I don't want to call references pointers. I don't really see how your
previous reply to my post makes any sense. Maybe you meant to respond to
someone else?

Yes, it's just a pointer.

Obviously, it is not "just a pointer".

Just because the compiler knows it as only a pointer, that does not mean
it's "just a pointer".

Actually it's the garbage collector that hides the inner workings. It
uses the type information for the data (the same that is used by
reflection) to recognise the references as references. It's only the
garbage collector that moves objects around, so that's the only place
where references have to be handled differently from pointers.

In other words, .NET has knowledge of where these references are, and when
an object is relocated, it has to go update those values to reflect the
change in address of the object.

Which is essentially what I wrote (as one possible solution to the
implementation of references) in the post to which you disagreed. The
fact that the compiler isn't doing the work does not change the fact that
the reference is not simply a pointer. It's a special type of memory
address that refers to data in memory that is treated differently from an
actual pointer by the .NET Framework. It's not "just a pointer".

[...]
>But clearly for .NET to be able to relocate the object that a reference
refers to, there must be more than just a simple pointer to that object.

Nope, there is nothing more.

Of course there is something more. You went to great lengths to describe
what more there is. Just because the compiler isn't responsible for
handling the "something more", that doesn't mean it's not there. It just
means the compiler doesn't do it.

The only difference is the type information for it. As all variables
have type information anyway, there is nothing extra added.

Of course there is something extra added. The garbage collector moving
things around and updating the references to them is extra.

Pete

Mar 1 '07 #37

Göran Andersson

Peter Duniho wrote:

On Thu, 01 Mar 2007 18:28:18 +0800, Göran Andersson <gu***@guffa.com>
wrote:

>What I mean to say is that if you want to call references pointers,
then it will conflict with pointers, as they are also called pointers.

I don't want to call references pointers. I don't really see how your
previous reply to my post makes any sense. Maybe you meant to respond
to someone else?

>Yes, it's just a pointer.

Obviously, it is not "just a pointer".

Just because the compiler knows it as only a pointer, that does not mean
it's "just a pointer".

Well, there is nothing more than a pointer there, there is no pointer to
a pointer, or a table of pointers to references. Just the pointer.

In the same way, a pointer is just an integer. The compiler knows that
it can be used to point to a memory location, but it's still just a 32
bit number (on a 32 bit system, of course).

>Actually it's the garbage collector that hides the inner workings. It
uses the type information for the data (the same that is used by
reflection) to recognise the references as references. It's only the
garbage collector that moves objects around, so that's the only place
where references have to be handled differently from pointers.

In other words, .NET has knowledge of where these references are, and
when an object is relocated, it has to go update those values to reflect
the change in address of the object.

Which is essentially what I wrote (as one possible solution to the
implementation of references) in the post to which you disagreed. The
fact that the compiler isn't doing the work does not change the fact
that the reference is not simply a pointer. It's a special type of
memory address that refers to data in memory that is treated differently
from an actual pointer by the .NET Framework. It's not "just a pointer".

>[...]
>>But clearly for .NET to be able to relocate the object that a
reference refers to, there must be more than just a simple pointer to
that object.

Nope, there is nothing more.

Of course there is something more. You went to great lengths to
describe what more there is. Just because the compiler isn't
responsible for handling the "something more", that doesn't mean it's
not there. It just means the compiler doesn't do it.

>The only difference is the type information for it. As all variables
have type information anyway, there is nothing extra added.

Of course there is something extra added. The garbage collector moving
things around and updating the references to them is extra.

Yes, but that is part of the garbage collector. There is nothing extra
added to the reference.

--
Göran Andersson
_____
http://www.guffa.com

Mar 1 '07 #38

Zytan

No, it's just a pointer.

This would explain why C# is fast. And, if the GC handles the details
(while pausing the program in the meantime) then the program need know
nothing more than that it is just a pointer.

If it were double indirection, then I think we'd see it being slower
than it is.

Zytan

Mar 1 '07 #39

Zytan

You're speaking the same thing in two different languages.

To the program, it's just a pointer, like a C++ program sees a
pointer. One level of indirection. When it gets to the 32-bit
number, it knows wherever that points to directly in the data. That's
(one reason) why C# is fast.

But, the GC in C# contains all of the extra crap to handle 'more than
just a pointer' on its end, by moving things around (and updating the
pointers, etc).

Does that suffice as a simple explanation?

Zytan

Mar 1 '07 #40

Peter Duniho

On Thu, 01 Mar 2007 21:38:56 +0800, Göran Andersson <gu***@guffa.com>
wrote:

Well, there is nothing more than a pointer there, there is no pointer to
a pointer, or a table of pointers to references. Just the pointer.

Of course there is. The "table" is inherent in the memory manager (aka
garbage collector). The garbage collector obviously has some mechanism by
which it finds and updates references when objects are relocated. This
mechanism is the "table" (whether it is a literal table, or simply some
other data structure that allows the GC to find the references is
irrelevant...the mechanism exists, regardless).

Yes, but that is part of the garbage collector. There is nothing extra
added to the reference.

Who said anything about "extra added"? The only thing at issue is whether
some mechanism beyond the simple pointer exists. And some mechanism
beyond the simple pointer does exist.

Pete

Mar 2 '07 #41

Peter Duniho

On Fri, 02 Mar 2007 00:15:22 +0800, Zytan <zy**********@yahoo.com> wrote:

>No, it's just a pointer.

This would explain why C# is fast.

Except when garbage collecting, of course. With the application having
just a single pointer, that means that when the GC relocates something, it
needs to go around updating all of the pointers that refer to it. If the
application had to do the double-dereference, then the GC would be able to
just update a single pointer and be done.
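
The trade-off described here can be put into a toy model. Everything below is an assumption made for illustration: "addresses" are plain integers, the names are invented, and nothing reflects how the CLR or the old Mac OS memory manager is really implemented. It only counts how many slots have to be written when one object moves under each scheme.

```csharp
using System;

// Toy model only: addresses are ints, and variables are array slots.
public static class RelocationModels
{
    // Direct references: every variable stores the object's address,
    // so a move must patch each variable that holds the old address.
    // Returns the number of slots that were rewritten.
    public static int MoveWithDirectReferences(int[] variables, int oldAddr, int newAddr)
    {
        int writes = 0;
        for (int i = 0; i < variables.Length; i++)
        {
            if (variables[i] == oldAddr)
            {
                variables[i] = newAddr;
                writes++;
            }
        }
        return writes;
    }

    // Handles: variables store an index into a master table and only
    // the table entry holds the address, so a move is a single write.
    public static int MoveWithHandles(int[] handleTable, int handle, int newAddr)
    {
        handleTable[handle] = newAddr;
        return 1;
    }

    // The price of handles: every access pays for this extra lookup.
    public static int Resolve(int[] handleTable, int handle)
    {
        return handleTable[handle];
    }
}
```

With three variables referring to the moved object, `MoveWithDirectReferences` reports three writes where `MoveWithHandles` reports one, but only the handle scheme needs `Resolve` on every access. A real collector also has to find the variables in the first place, which this model skips entirely.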

I presume that the thought is that garbage collection can happen
infrequently enough, and at least to some extent in the background, that
speeding the common case execution nets a gain. Of course, it does also
create synchronization issues, since while an object is being relocated
and all those pointers are being updated, .NET needs to ensure that no
code using that reference tries to access it.

Performance might be better the way .NET does it, but certainly the old
Mac-style "handle" paradigm was much simpler, and required much less
memory overhead (and back then, with memory space always being at a
premium, reducing data structure and code size was always the highest
priority). :)

If it were double indirection, then I think we'd see it being slower
than it is.

Execution of one's own code would definitely be slower, I agree. But
garbage collection could be much faster. It's just a matter of optimizing
for the most effective case (one hopes that in this scenario, the .NET
designers got it right...I assume they did :) ).

Pete

Mar 2 '07 #42

Göran Andersson

Peter Duniho wrote:

On Thu, 01 Mar 2007 21:38:56 +0800, Göran Andersson <gu***@guffa.com>
wrote:
>>
Well, there is nothing more than a pointer there, there is no pointer
to a pointer, or a table of pointers to references. Just the pointer.

Of course there is. The "table" is inherent in the memory manager (aka
garbage collector). The garbage collector obviously has some mechanism
by which it finds and updates references when objects are relocated.
This mechanism is the "table" (whether it is a literal table, or simply
some other data structure that allows the GC to find the references is
irrelevant...the mechanism exists, regardless).

>Yes, but that is part of the garbage collector. There is nothing extra
added to the reference.

Who said anything about "extra added"? The only thing at issue is
whether some mechanism beyond the simple pointer exists. And some
mechanism beyond the simple pointer does exist.

Pete

Yes, there is a mechanism, but that's all in the garbage collector, as I
am saying over and over. There is no mechanism in the application code
that handles a reference any different from a pointer.

There is type information that the garbage collector uses, but that is
nothing special for references, that information exists for all
variables. The reference is just a pointer, there is nothing more than
the pointer, and the only thing that makes it a reference is how it's used.

--
Göran Andersson
_____
http://www.guffa.com

Mar 2 '07 #43

Peter Duniho

On Fri, 02 Mar 2007 14:28:30 +0800, Göran Andersson <gu***@guffa.com>
wrote:

Yes, there is a mechanism, but that's all in the garbage collector, as I
am saying over and over. There is no mechanism in the application code
that handles a reference any different from a pointer.

I never said the mechanism had to be in the application. I am simply at a
loss as to why you would invest so much time rebutting a point that I
never actually made.

There is type information that the garbage collector uses, but that is
nothing special for references, that information exists for all
variables.

Yet, the mechanism by which the garbage collector *uses* this information
is entirely specific to the implementation of references. There's a bunch
of code in there that exists for the sole purpose of updating references
when the objects are relocated. It's disingenuous to claim that there's
"nothing extra".

The reference is just a pointer, there is nothing more than the pointer,
and the only thing that makes it a reference is how it's used.

There is plenty more than the pointer. The code in the garbage collector,
and the data it uses, are all "more than the pointer".

Pete

Mar 2 '07 #44

Göran Andersson

Peter Duniho wrote:

On Fri, 02 Mar 2007 00:15:22 +0800, Zytan <zy**********@yahoo.com> wrote:

>>No, it's just a pointer.

This would explain why C# is fast.

Except when garbage collecting, of course. With the application having
just a single pointer, that means that when the GC relocates something,
it needs to go around updating all of the pointers that refer to it. If
the application had to do the double-dereference, then the GC would be
able to just update a single pointer and be done.

I presume that the thought is that garbage collection can happen
infrequently enough, and at least to some extent in the background, that
speeding the common case execution nets a gain.

Yes, as the program code has to do nothing at all to manage the
references, that means that updating a reference is perhaps twice as
efficient in .NET as in an environment that uses for example reference
counting. If let's say 5% of the code is doing this, then you save 5%
execution time that can be used for garbage collection, which of course
is way more than is normally used.

Of course, it does also
create synchronization issues, since while an object is being relocated
and all those pointers are being updated, .NET needs to ensure that no
code using that reference tries to access it.

This is done by simply freezing the application during the garbage
collection. This is normally not a problem at all, as the same thing is
done all the time by Windows on a thread level for multitasking. The
only difference when the garbage collector does it from when the Windows
dispatcher does it, is that it freezes all the threads in the
application. So, when an application is running in a single thread,
there is no difference at all.

Performance might be better the way .NET does it, but certainly the old
Mac-style "handle" paradigm was much simpler, and required much less
memory overhead (and back then, with memory space always being at a
premium, reducing data structure and code size was always the highest
priority). :)

Well, how much memory overhead is there really? The garbage collected
model doesn't need a table of handles, and there is no reference counter
in each object. There is just type information for the variables, and
that can't add up to much.

With the garbage collected model, all the memory management is done by
the garbage collector. Hopefully this means that as the garbage
collector has all the information, it is able to make better decisions
about the memory management.

In a reference counting model, each object would be freed at the instant
that the reference counter reaches zero (which means that clearing a
reference is very unpredictable timewise), while in the garbage
collected model an entire heap generation is cleared all at once.
Usually this means moving away the few objects that are still in use so
that the entire heap generation can be wiped clean.

As the most active heap generation is regularly cleaned out completely,
this vastly reduces memory fragmentation.
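
The generations mentioned here are visible through the `System.GC` class. As a rough sketch (the class name is invented, and the exact generation numbers are up to the runtime), the following records which generation an object is in before and after a forced collection that it survives.

```csharp
using System;

public static class GenerationDemo
{
    // Returns { generation before GC, generation after GC, GC.MaxGeneration }.
    public static int[] Demo()
    {
        object survivor = new object();

        // Freshly allocated small objects normally start in generation 0.
        int before = GC.GetGeneration(survivor);

        // The object is still referenced, so it survives the collection
        // and is typically promoted to an older generation.
        GC.Collect();
        int after = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor);
        return new[] { before, after, GC.MaxGeneration };
    }
}
```

On the usual desktop runtime one would expect something like 0 before and 1 after, with a maximum generation of 2, but those numbers are an implementation detail and should not be relied upon.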

>If it were double indirection, then I think we'd see it being slower
than it is.

Execution of one's own code would definitely be slower, I agree. But
garbage collection could be much faster. It's just a matter of
optimizing for the most effective case (one hopes that in this scenario,
the .NET designers got it right...I assume they did :) ).

Yes, probably. As the application code is running most of the time, and
the garbage collector only occasionally (from a computer point of view),
optimising the application code seems to be the obvious choice.

Even if the garbage collector has to do ten times the work to make up
for the lack of an object handle table or a reference reference table,
that should be nothing compared to making every reference operation in
the application code twice as fast.

--
Göran Andersson
_____
http://www.guffa.com

Mar 2 '07 #45

Peter Duniho

On Fri, 02 Mar 2007 17:05:42 +0800, Göran Andersson <gu***@guffa.com>
wrote:

Yes, as the program code has to do nothing at all to manage the
references, that means that updating a reference is perhaps twice as
efficient in .NET as in an environment that uses for example reference
counting.

It seems you are again responding to points I did not make. I never once
brought up the question of "reference counting", nor do I find it relevant
to the discussion I'm participating in.

The .NET method is nowhere near as efficient in *updating* as the old Mac
"handles" method. Using handles, a single value needs to be updated.
That's it. It's a few instructions at most. This is WAY faster than .NET
running through all of the data structures in the memory manager, finding
those that refer to a given object and modifying them.

[...]
>Performance might be better the way .NET does it, but certainly the old
Mac-style "handle" paradigm was much simpler, and required much less
memory overhead (and back then, with memory space always being at a
premium, reducing data structure and code size was always the highest
priority). :)

Well, how much memory overhead is there really?

Above and beyond what .NET already requires? Practically none in the form
of data storage, as far as I know. But that only happens because .NET has
an enormous data overhead already. It's only because each and every
variable "knows" what it's doing that the garbage collector can avoid
having to store any additional data.

However, there is still the code required to do the actual garbage
collection. Even when one discounts the data storage requirements, the
code required to navigate the existing data and update the references is
far greater than the handle method used in the old Mac OS.

With the garbage collected model, all the memory management is done by
the garbage collector. Hopefully this means that as the garbage
collector has all the information, it is able to make better decisions
about the memory management.

I see at least two major advantages to a garbage collection paradigm:

* Lazy freeing of resources means that in many cases, memory
management overhead is lower during key performance-critical code paths

* Reduced address space fragmentation (ie the virtual memory
equivalent to the reduced RAM fragmentation that the old Mac OS "handle"
paradigm was so important for)

The latter, of course, has only recently been very important. When
Win32 first came along, no application came close to using the full 2GB of
virtual address space, nor did any application allocate large enough
blocks of virtual address space to risk failing due to address space
fragmentation. Things have changed, of course (though, ironically .NET has
only started to gain a foothold just as 64-bit Windows is nearing
achieving mainstream status, eliminating the address space
fragmentation issue again :) ).

I see as a minor benefit the memory manager's ability to use object type
information for the purpose of memory management. Yes the additional
information is useful, but I don't see it as a huge leap. We got along
fine without taking advantage of that information...applications would
generally implement their own layer on top of the Windows memory
management, if they needed (for example) to keep objects of the same type
within a single block of virtual address space (one way to help avoid
fragmentation issues, also for example).

In a reference counting model, each object would be freed at the instant
that the reference counter reaches zero (which means that clearing a
reference is very unpredictable timewise), while in the garbage
collected model an entire heap generation is cleared all at once.
Usually this means moving away the few objects that are still in use so
that the entire heap generation can be wiped clean.

Not that I mentioned reference counting in the first place, but yes...I
agree that the .NET paradigm by its very design avoids some of the
pitfalls of reference counting (one major one being the classic problem of
the reference count being incorrect).
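
For what it's worth, the generational behaviour described in the quote can be watched from C# itself. The following is only a sketch: the class and method names are made up for illustration, and which generation an object ends up in after a collection is the runtime's decision, so no particular output is promised. GC.GetGeneration, GC.Collect and GC.KeepAlive are the real System.GC calls.

```csharp
using System;

static class GenerationSketch
{
    // Returns { generation before a collection, generation after it }
    // for an object that is kept alive across the collection.
    public static int[] Observe()
    {
        object survivor = new object();

        int before = GC.GetGeneration(survivor);
        GC.Collect();                       // force a collection (demo only)
        int after = GC.GetGeneration(survivor);

        // Make sure the JIT treats 'survivor' as live up to this point.
        GC.KeepAlive(survivor);
        return new[] { before, after };
    }
}
```

An object that survives a collection is normally promoted to an older generation, which is why the youngest generation can be cleared wholesale. Forcing a collection like this is for experiments only, not something to do in production code.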

As the most active heap generation is regularly cleaned out completely,
this vastly reduces memory fragmentation.

In theory, garbage collection should practically eliminate virtual
address space fragmentation (the virtual memory manager handles the
avoidance of physical memory fragmentation). Only objects that have been
locked will present a problem, and one hopes that an application is wise
enough to minimize those. That said, I don't see that as relevant to the
question of reference counting (since you brought it up). That is, even a
garbage collection scheme based on reference counting can avoid
fragmentation in the same way. It's the garbage collection that is
important, not how memory objects are tracked.
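
Assuming "locked" above corresponds to what .NET calls pinning, this is roughly what a short-lived pin looks like. It is a sketch under that assumption: PinningSketch and PinBriefly are hypothetical names, while GCHandle.Alloc, GCHandleType.Pinned, AddrOfPinnedObject and Free are the actual System.Runtime.InteropServices API.

```csharp
using System;
using System.Runtime.InteropServices;

static class PinningSketch
{
    // Pins a buffer for as short a time as possible and reports
    // whether a usable address was obtained while it was pinned.
    public static bool PinBriefly()
    {
        byte[] buffer = new byte[256];

        // While this handle is allocated the GC will not move 'buffer'.
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            IntPtr address = handle.AddrOfPinnedObject();
            // ... hand 'address' to native code here ...
            return address != IntPtr.Zero;
        }
        finally
        {
            // Unpin promptly so the compactor can move the buffer again.
            handle.Free();
        }
    }
}
```

The try/finally is the point of the example: the shorter an object stays pinned, the fewer holes the compactor has to work around.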

[...]
Yes, probably. As the application code is running most of the time, and
the garbage collector only occasionally (from a computer's point of view),
optimising the application code seems to be the obvious choice.

One hopes. The obvious counter-example is where the garbage collection,
though it happens infrequently, takes an inordinate amount of time.
Presumably this isn't the case in actual implementation, but the "most of
the time" versus "only occasionally" difference in no way guarantees a
desirable performance outcome.

Even if the garbage collector has to do ten times the work to make up
for the lack of an object handle table or a reference reference table,
that should be nothing compared to making every reference operation in
the application code twice as fast.

Well, I haven't measured, but I'd say the garbage collector has to work a
LOT harder than just ten times, at least compared to the old Mac OS
"handle" paradigm.

In the "handle" paradigm, the garbage collector is basically O(n); the
collector can simply scan through the handle table, coalescing the objects
and updating each handle once. In the .NET paradigm, the collector has to
either scan the entire collection of data for each object it wants to
move, or it has to maintain some sort of hash table or other fast-access
data structure in which it stores (at least temporarily) all of the
current references to each given object (which is essentially the
"reference reference table" method anyway). In the former case, that's
essentially an O(n^2) algorithm, which as you know is much slower than an
O(n) algorithm. In the latter case, the premise that the garbage
collector requires no extra data is invalidated (in other words, is
irrelevant to this discussion, since you have made the claim that no extra
data is required by the .NET scheme).

(The O() above obviously doesn't take into account the actual copying of
data when moving objects...that work is identical regardless of the
collection scheme, so I don't find it relevant).
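
To make the O(n) claim for the "handle" scheme concrete, here is a toy model in C#. It is neither the old Mac OS memory manager nor anything .NET does; every name in it (HandleHeapSketch, Entry, Compact) is invented for illustration, and it assumes the live table entries are already in ascending offset order.

```csharp
using System;

// Toy model only: the "heap" is an int array, an "object" is a run of
// cells, and a handle is simply an index into the table.
public static class HandleHeapSketch
{
    // One entry per handle. Offset < 0 marks a freed handle.
    public struct Entry
    {
        public int Offset;
        public int Length;
    }

    // Slides every live block towards offset 0 and patches each table
    // entry exactly once, so the bookkeeping is a single pass over the
    // table. Assumes live entries are sorted by ascending Offset.
    // Returns the index of the first free cell after compaction.
    public static int Compact(int[] heap, Entry[] table)
    {
        int next = 0;
        for (int h = 0; h < table.Length; h++)
        {
            if (table[h].Offset < 0) continue;          // freed handle
            if (table[h].Offset != next)
            {
                // Array.Copy copes with overlapping ranges in one array.
                Array.Copy(heap, table[h].Offset, heap, next, table[h].Length);
            }
            table[h].Offset = next;                     // the only fix-up needed
            next += table[h].Length;
        }
        return next;
    }
}
```

With blocks of length 2, 3 and 1 at offsets 0, 4 and 8, one pass leaves them at offsets 0, 2 and 5 and returns 6 as the first free cell. Application code never sees the move because it only holds the handle index, which is exactly the extra indirection being argued about in this thread.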

I don't actually know which scheme is used by .NET, but it's clear to me
that however it is done, garbage collection is a lot more costly under
.NET than it would be under a "handle" scheme. Likely way more than just
10x difference. Do we still come out ahead? Probably. Heck, I have to
assume so because I love using .NET and would hate to learn that it's
moved backwards in any way. :) But I can't say that it's an obvious
given.

Pete

Mar 2 '07 #46

Zytan

Even if the garbage collector has to do ten times the work to make up
for the lack of an object handle table or a reference reference table,
that should be nothing compared to making every reference operation in
the application code twice as fast.

And in most cases, I bet there is only 1 reference to the object, so
if the GC really needs to move things, which I imagine it doesn't very
often, there's only 1 32-bit pointer it needs to update.

You guys are very knowledgeable, and are really speaking the same
language. I think any argument between you is more about
definitions. I tend to think like Goran, in the way that, from the
program's perspective (or from the CPU's perspective), a pointer is
just a 32-bit address stating where an object is. Yes, there's "more"
to it on the GC side, but there's not "more" to it in terms of the CPU
saying "hey, here's a pointer, let's go to the address it has stored
to get my object". And in that manner, C# is FAST.

Yes, possibly, it is doubly-indirected meaning that the GC need only
update ONE reference to it, rather than potentially many, but, how
often does that occur, and how much quicker would that have to be
to overcome the slowdown of dual indirection for every reference?

I do have faith that the C# designers, being so smart as they are, got
it right.

Thanks for the very informative discussion, guys. I am enjoying this.

Zytan

Mar 2 '07 #47

Zytan

I don't actually know which scheme is used by .NET, but it's clear to me
that however it is done, garbage collection is a lot more costly under
.NET than it would be under a "handle" scheme. Likely way more than just
10x difference. Do we still come out ahead? Probably. Heck, I have to
assume so because I love using .NET and would hate to learn that it's
moved backwards in any way. :) But I can't say that it's an obvious
given.

Pete, consider that it is optimized for the most gain as a whole.
Meaning, if most often, things are only referenced once or twice, then
that's more important to make them fast than anything else, since
that's where the speed will be gained or lost. 10x to update a single
reference? Yes, I know that single reference must be stored somehow
that allows other references to be in there, as well, in some kind of
data structure, or hash, or something, but, still. And, as you said,
even if it was 10x as bad as the "handle" scheme, we're probably still
ahead.

Also, isn't it unfair to compare it to the "handle" scheme? Shouldn't
you be comparing it to whatever other alternative C# could have
used? Or, are you claiming the "handle" scheme IS something C# could
have used (I may have missed that)?

Zytan

Mar 2 '07 #48

Willy Denoyette [MVP]

"Zytan" <zy**********@yahoo.com> wrote in message
news:11**********************@z35g2000cwz.googlegroups.com...

>Even if the garbage collector has to do ten times the work to make up
for the lack of an object handle table or a reference reference table,
that should be nothing compared to making every reference operation in
the application code twice as fast.

And in most cases, I bet there is only 1 reference to the object, so
if the GC really needs to move things, which I imagine it doesn't very
often, there's only 1 32-bit pointer it needs to update.

You guys are very knowledgeable, and are really speaking the same
language. I think any argument between you is more about
definitions. I tend to think like Goran, in the way that, from the
program's perspective (or from the CPU's perspective), a pointer is
just a 32-bit address stating where an object is. Yes, there's "more"
to it on the GC side, but there's not "more" to it in terms of the CPU
saying "hey, here's a pointer, let's go to the address it has stored
to get my object". And in that manner, C# is FAST.

Yes, possibly, it is doubly-indirected meaning that the GC need only
update ONE reference to it, rather than potentially many, but, how
often does that occur, and how much quicker would that have to be
to overcome the slowdown of dual indirection for every reference?

I do have faith that the C# designers, being so smart as they are, got
it right.

Thanks for the very informative discussion, guys. I am enjoying this.

Zytan

Sorry, I don't want to sound rude, but you got it wrong: each object reference is held in a
program variable, and this variable can actually exist in a register, on the stack, in the
Finalizer list or in the Handle table.
N references can point to the same object in the GC heap, see sample [1].
The JIT helps the GC by updating a table (the GCInfo table) in which it stores the
liveness state of the variables holding object references at JIT compile time (per method).
Note that the JIT doesn't keep track of this for each machine instruction; only those that
can possibly trigger a GC are kept in the GCInfo table.
All the GC has to do is inspect the GCInfo table and start walking the stack(s) and the
handle table to find the references to dead objects and reset these references (set to
null). When done, it can start a compaction of the heap, thereby updating the live
references to the moved objects in the stack and the handle table. Note that I've left out
some details, but by and large that's it.

[1]
class C {
    int i;
}
// ...
void Foo()
{
    C c = new C();
    C c1 = c;
    // (1)
    // ...
}
Here at point (1), the stack (or registers) will hold (at least) two references to the same
instance of C.
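
A runnable variant of sample [1], in case anyone wants to check the aliasing for themselves. This is just a sketch: the field is made public and the AliasDemo class and Run method are added purely for illustration; object.ReferenceEquals is the standard way to ask whether two variables refer to the same instance.

```csharp
using System;

class C
{
    public int i;
}

static class AliasDemo
{
    // Returns true when both variables refer to one and the same object
    // and a write through one of them is visible through the other.
    public static bool Run()
    {
        C c = new C();
        C c1 = c;        // copies the reference, not the object
        c1.i = 42;       // write through the second reference

        return object.ReferenceEquals(c, c1) && c.i == 42;
    }
}
```

The same thing happens with the arrays in the original question: assigning one array variable to another copies the reference, so both names see the same elements.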

Willy.

Mar 2 '07 #49

Zytan

Sorry, I don't want to sound rude, but you got it wrong,

Telling me I'm wrong is not rude, so don't worry :)

each object reference is held in a
program variable, and this variable can actually exist in a register, on the stack, in the
Finalizer list or in the Handle table.

Ok.

N references can point to the same object in the GC heap, see sample [1].

Yes.

The JIT helps the GC by updating a table (the GCInfo table) in which it stores the
liveness state of the variables holding object references at JIT compile time (per method).

Ok.

Note that the JIT doesn't keep track of this for each machine instruction; only those that
can possibly trigger a GC are kept in the GCInfo table.
All the GC has to do is inspect the GCInfo table and start walking the stack(s) and the
handle table to find the references to dead objects and reset these references (set to
null). When done, it can start a compaction of the heap, thereby updating the live
references to the moved objects in the stack and the handle table. Note that I've left out
some details, but by and large that's it.

[1]
class C {
    int i;
}
// ...
void Foo()
{
    C c = new C();
    C c1 = c;
    // (1)
    // ...
}

Here at point (1), the stack (or registers) will hold (at least) two references to the same
instance of C.

Yes.

Ok. My explanation was very general, and as far as I can tell
you explained the same thing, except in much more technical detail
(details that I didn't know). In general, this is what I suspected
was going on. Maybe I was unclear.

Thanks,

Zytan

Mar 2 '07 #50
