Garbage collectable pinned arrays!

Atmapuri

Hi!

It would be great if pinned arrays were garbage
collectable. This would greatly reduce the amount of copying
back and forth between managed and unmanaged memory.
I can't think of a single reason why the GC should not allow
this.

It would be also great, if the memory could already be allocated
as pinned:

double[] array= new pinned uninitialized double [500];

where uninitialized would mean that it is not set to zero, thus
saving processing power where GC pressure is high.
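
For what it's worth, later runtimes expose something close to this request. A minimal sketch, assuming .NET 5 or later (which this thread predates): `GC.AllocateUninitializedArray<T>` with `pinned: true` allocates on the pinned object heap and may skip zeroing the contents. The class and method names below are hypothetical.

```csharp
using System;

public static class PinnedAllocationSketch
{
    // Allocates a double[] that is pinned for its whole lifetime.
    // Assumes .NET 5+; the contents are undefined until written,
    // so every element must be assigned before it is read.
    public static double[] AllocatePinnedUninitialized(int length)
    {
        return GC.AllocateUninitializedArray<double>(length, pinned: true);
    }
}
```

Usage would be `double[] array = PinnedAllocationSketch.AllocatePinnedUninitialized(500);`. The array is still collected normally once it becomes unreachable.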

Thanks!
Atmapuri

Feb 12 '08 #1


Jesse McGrew

On Feb 11, 11:51 pm, "Atmapuri" <janez.makov...@usa.net> wrote:

Hi!

It would be great if pinned arrays were garbage
collectable. This would greatly reduce the amount of copying
back and forth between managed and unmanaged memory.
I can't think of a single reason why the GC should not allow
this.

It would be also great, if the memory could already be allocated
as pinned:

double[] array= new pinned uninitialized double [500];

where uninitialized would mean that it is not set to zero, thus
saving processing power where GC pressure is high.

As I understand it, pinning is an attribute of the *reference*, not
the object itself. The object is pinned when there's a pinning
reference to it anywhere on the call stack.

Thus, it doesn't make sense for a pinned object to be garbage
collectible. The object can't be considered garbage anyway while
you're still holding a reference to it, and once you let go of the
last reference, it's no longer pinned.

I don't see why you'd even want it to be collectible, actually. The
point of pinning is to let unmanaged code access the object without
worrying that the GC will move it. But collecting the object and
letting its memory be used for something else is just as dangerous as
moving it!
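
A minimal sketch of that point, assuming a `GCHandle` of type `Pinned` instead of a `fixed` statement (so no unsafe code is needed): as long as the pinning handle exists, the array stays alive across a full collection even though no ordinary reference to it remains. The class and method names are hypothetical.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinKeepsAliveSketch
{
    // Returns true if the array is still alive after a full collection
    // while only a pinning handle (and a weak reference) refers to it.
    public static bool SurvivesWhilePinned()
    {
        GCHandle handle = AllocatePinnedArray(out WeakReference weak);
        try
        {
            GC.Collect();
            GC.WaitForPendingFinalizers();
            return weak.IsAlive;
        }
        finally
        {
            // Once the handle is freed the array is an ordinary,
            // unreferenced object and may be collected.
            handle.Free();
        }
    }

    // Kept in a separate method so no local variable holds the array
    // in the caller's frame.
    private static GCHandle AllocatePinnedArray(out WeakReference weak)
    {
        double[] data = new double[500];
        weak = new WeakReference(data);
        return GCHandle.Alloc(data, GCHandleType.Pinned);
    }
}
```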

Jesse

Feb 12 '08 #2

Jesse McGrew

On Feb 12, 3:55 am, "Atmapuri" <janez.makov...@usa.net> wrote:

Hi!

As I understand it, pinning is an attribute of the *reference*, not
the object itself. The object is pinned when there's a pinning
reference to it anywhere on the call stack.

Thus, it doesn't make sense for a pinned object to be garbage
collectible. The object can't be considered garbage anyway while
you're still holding a reference to it, and once you let go of the
last reference, it's no longer pinned.

I don't see why you'd even want it to be collectible, actually. The
point of pinning is to let unmanaged code access the object without
worrying that the GC will move it. But collecting the object and
letting its memory be used for something else is just as dangerous as
moving it!

You are mixing two points:

- reference to unmanaged memory, where the word pinned also
means that it won't be collected.
- location of the array in GC (Heap or else).

When the array is pinned it is copied to heap. All I would like
to see is an option to allocate the array on the heap initially.

The array is automatically allocated on the heap once it exceeds
a certain size and thus becomes "pinned", because the heap is
never compacted and all addresses are absolute.

Really? I thought arrays were always allocated on the heap, like any
other reference type. And why do you say the heap is never compacted?

Jesse

Feb 12 '08 #3

Atmapuri

Hi!

Where is that copying taking place though? Pinning doesn't inherently
involve copying.

Yes. It does. All arrays shorter than the limit are allocated inside
the "compactable" part of the GC heap and thus have to be copied
out of the "compactable" heap before they can have a "fixed" address.

Thanks!
Atmapuri

Feb 12 '08 #4

Jon Skeet [C# MVP]

On Feb 12, 1:55 pm, "Atmapuri" <janez.makov...@usa.net> wrote:

Where is that copying taking place though? Pinning doesn't inherently
involve copying.

Yes. It does. All arrays shorter than the limit are allocated inside
the "compactable" part of the GC heap and thus have to be copied
out of the "compactable" heap before they can have a "fixed" address.

I was under the impression that pinned objects live where they are,
but potentially cause heap fragmentation:
http://msdn.microsoft.com/msdnmag/is...t/default.aspx

That's still a performance hit, but it's not copying. Do you have
evidence that pinning involves copying?

Jon

Feb 12 '08 #5

Atmapuri

Hi!

That's still a performance hit, but it's not copying. Do you have
evidence that pinning involves copying?

It does copying and it does cause "fragmentation" and a performance hit
due to fragmentation. Both at once. (But that fragmentation issue is
actually the same fragmentation issue as with all unmanaged code apps,
including Windows.)

However, the copying is from the small object heap to the large object heap,
and the fragmentation is in the large object heap, because the small object
heap, which is compactable, of course cannot be fragmented.

You can check that arrays are copied by allocating ever
larger arrays and pinning them down and measuring the time it takes
to do that for each array size.

You will see that beyond certain array size, the pinning cost becomes
zero. The timings must be normalized with the array length however,
otherwise
it is harder to see. When the pinning cost becomes zero this means that
the array is so large that it is now allocated on the large object heap
from the start and needs not to be copied there anymore.

If however, there would be a language feature which would allow you
to specify that the array could be allocated on the large object
heap regardless of its size.... you could save yourself a lot of
copy operations when interfacing unmanaged code.

Currently the GC decides where the array goes:
- in the small object heap which is compactable
- large object heap which is not compactable

Please give us a C# language feature where the programmer
decides where the arrays go, because only the programmer
knows how will they be used.

Thanks!
Atmapuri

Feb 12 '08 #6

Atmapuri

Hi!

Frankly, when you require that much control over how memory is used,
I'd consider writing unmanaged code instead.

I thought Microsoft is willing to listen to the problems of the customers
and work together to improve both their products and customer
satisfaction.

Thanks!
Atmapuri

Feb 12 '08 #7

Atmapuri

Hi!

I'd very much like to know where you have this from.

Common knowledge? <g>

I tried the following code:

Object o = 10;
Debug.WriteLine(o);
GCHandle h = GCHandle.Alloc(o, GCHandleType.Pinned);
Debug.WriteLine(o);
Text = ((Int32)h.AddrOfPinnedObject()).ToString("X8");
h.Free();

// Just for security
GC.KeepAlive(o);
GC.KeepAlive(h);

I am using arrays larger than 80 bytes up to 100 KBytes.
Pinning 4-byte objects or arrays is handled vastly
differently by the GC. Time something like this:

for (int k = 0; k < GCIterCount; k++)
{
testArray = new double[testArrayLength];
testArray[2] = 2;
}

Such that GCIterCount*testArrayLength is a constant value.
When testArrayLength reaches 1024 elements (8 KBytes) you will see a big
jump in the cost of the allocation. All timings must be normalized with
(GCIterCount*testArrayLength). That will give allocation cost
per element as a function of the array length.

Then add a second series to the chart:

for (int k = 0; k < GCIterCount; k++)
{
testArray = new double[testArrayLength];
GCHandle h = GCHandle.Alloc(testArray, GCHandleType.Pinned);
testArray[2] = 2;
h.Free();
}

and compare it with the first series. Beyond 1024 elements,
there won't be any difference in the cost. But before array
length of 1024, the difference will be large and fairly constant.
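
The two loops above can be wrapped into one harness roughly like this. It is only a sketch under stated assumptions: the class name, the `totalElements` parameter, and the list of array lengths are hypothetical, and `Stopwatch` timings of this kind are noisy, so treat the numbers as indicative at best.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

public static class PinningCostSketch
{
    // Nanoseconds per element for allocating arrays of the given length,
    // optionally pinning and unpinning each one. arrayLength must be >= 3
    // because element 2 is written, as in the loops above.
    public static double NanosecondsPerElement(int arrayLength, bool pin, int totalElements)
    {
        // Keep iterations * arrayLength roughly constant across lengths.
        int iterations = Math.Max(1, totalElements / arrayLength);
        double[] testArray = null;

        Stopwatch timer = Stopwatch.StartNew();
        for (int k = 0; k < iterations; k++)
        {
            testArray = new double[arrayLength];
            if (pin)
            {
                GCHandle h = GCHandle.Alloc(testArray, GCHandleType.Pinned);
                testArray[2] = 2;
                h.Free();
            }
            else
            {
                testArray[2] = 2;
            }
        }
        timer.Stop();
        GC.KeepAlive(testArray);

        // Normalize by the total number of elements allocated.
        double totalNs = timer.Elapsed.TotalMilliseconds * 1000000.0;
        return totalNs / ((double)iterations * arrayLength);
    }

    // Prints both series so they can be charted side by side.
    public static void PrintSeries(int totalElements)
    {
        foreach (int length in new[] { 16, 64, 256, 512, 1024, 4096, 16384 })
        {
            Console.WriteLine("{0,6}: plain {1:F3} ns/elem, pinned {2:F3} ns/elem",
                length,
                NanosecondsPerElement(length, false, totalElements),
                NanosecondsPerElement(length, true, totalElements));
        }
    }
}
```

Calling `PinningCostSketch.PrintSeries(50000000)` prints one line per array length. Because the fixed cost of `GCHandle.Alloc` is divided by the array length, a per-element difference that shrinks as the arrays grow does not by itself show that anything was copied.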

Thanks!
Atmapuri

Feb 12 '08 #8

Willy Denoyette [MVP]

"Atmapuri" <ja************@usa.net> wrote in message
news:E6**********************************@microsoft.com...

Hi!

That's still a performance hit, but it's not copying. Do you have
evidence that pinning involves copying?

It does copying and it does cause "fragmentation" and a performance hit
due to fragmentation. Both at once. (But that fragmentation issue is
actually the same fragmentation issue as with all unmanaged code apps,
including Windows.)

However, the copying is from the small object heap to the large object heap,
and the fragmentation is in the large object heap, because the small object
heap, which is compactable, of course cannot be fragmented.

What makes you believe that a small object is moved to the LOH when it gets
pinned? Any proof or evidence?

You can check that arrays are copied by allocating ever
larger arrays and pinning them down and measuring the time it takes
to do that for each array size.

Arrays are created anew when enlarged, and finally, when they exceed
85 KBytes, they get allocated on the LOH, but this has nothing to do with
pinning.

You will see that beyond certain array size, the pinning cost becomes
zero. The timings must be normalized with the array length however,
otherwise
it is harder to see. When the pinning cost becomes zero this means that
the array is so large that it is now allocated on the large object heap
from the start and needs not to be copied there anymore.

What are you talking about? What's the pinning cost? How did you measure it?

If however, there would be a language feature which would allow you
to specify that the array could be allocated on the large object
heap regardless of its size.... you could save yourself a lot of
copy operations when interfacing unmanaged code.

The LOH is meant to be used for large objects, and is never compacted, which
means it's candidate number one for fragmentation issues. The generation
0, 1 and 2 heaps get compacted and are quasi free from fragmentation issues,
unless you keep objects pinned for a (too) long period of time.

Currently the GC decides where the array goes:
- in the small object heap which is compactable
- large object heap which is not compactable

Please give us a C# language feature where the programmer
decides where the arrays go, because only the programmer
knows how will they be used.

To summarize, it doesn't copy the array when pinning, it simply sets a bit in
the object header and stores the reference to the object in the "Object
Reference" table. This tells the GC that the object is pinned and should not be
moved. When the CLR unpins the object it resets the header bit and nulls the
reference in the "Object Reference" table.
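
One way to see that the pinned address refers to the array's own storage, and not to a separate copy, is to write through the address and read back through the managed reference. This is only a sketch with hypothetical names. It does not observe the header bit or the handle table described above, and it cannot rule out that the object was moved before pinning.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinningAliasSketch
{
    // Writes through the pinned address and reads back through the
    // managed reference; returns true when both see the same memory.
    public static bool PinnedAddressAliasesArray()
    {
        int[] data = { 1, 2, 3, 4 };
        GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        try
        {
            // For an array this is the address of its first element.
            IntPtr address = handle.AddrOfPinnedObject();
            Marshal.WriteInt32(address, 42);
            return data[0] == 42;
        }
        finally
        {
            handle.Free();
        }
    }
}
```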

Willy.

Feb 12 '08 #9

Ben Voigt [C++ MVP]

The LOH is meant to be used for large objects, and is never
compacted, which means it's candidate number one for fragmentation
issues. The generation 0, 1 and 2 heaps get compacted and are quasi
free from fragmentation issues, unless you keep objects pinned for a
(too) long period of time.

And for interop buffers with a long lifetime, the cost of pinning is very
high, yet the fragmentation impact wouldn't be a problem at all.

For example, I have an application which uses glVertexPointer. I need to
interop that buffer every single frame for the life of my program, there's
no reason it shouldn't sit in the LOH and avoid the overhead of pinning
(both direct cost and extra fragmentation of Gen0 heap, because the buffer
is pinned it can never move to Gen2).
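
For a buffer like that, the usual workaround today is to pin once and hold the handle for the buffer's lifetime. A minimal sketch, assuming a `float` vertex buffer; the `PinnedBuffer` class is hypothetical and not taken from any library:

```csharp
using System;
using System.Runtime.InteropServices;

// Owns a managed array that stays pinned until Dispose is called.
public sealed class PinnedBuffer : IDisposable
{
    private GCHandle _handle;

    public float[] Data { get; }

    // Stable address of Data[0], valid until Dispose.
    public IntPtr Address { get; }

    public PinnedBuffer(int length)
    {
        Data = new float[length];
        _handle = GCHandle.Alloc(Data, GCHandleType.Pinned);
        Address = _handle.AddrOfPinnedObject();
    }

    public void Dispose()
    {
        // Freeing twice would throw, so check first.
        if (_handle.IsAllocated)
        {
            _handle.Free();
        }
    }
}
```

The managed side fills `Data` every frame and the native side is handed `Address`. The pinning cost is paid once, at construction, but the fragmentation concern raised in this thread still applies wherever the array happens to sit.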

Feb 12 '08 #10

Atmapuri

Hi!

And for interop buffers with a long lifetime, the cost of pinning is very
high, yet the fragmentation impact wouldn't be a problem at all.

For example, I have an application which uses glVertexPointer. I need to
interop that buffer every single frame for the life of my program, there's
no reason it shouldn't sit in the LOH and avoid the overhead of pinning
(both direct cost and extra fragmentation of Gen0 heap, because the buffer
is pinned it can never move to Gen2).

God bless you :) Finally a real man (!)
Atmapuri

Feb 12 '08 #11

Ben Voigt [C++ MVP]

122100: 0,297616

123100: 0,31328
124100: 0,297616
125100: 0,297616
126100: 0,297616
127100: 0,281952
128100: 0,31328
129100: 0,297616
130100: 0,31328
131100: 0,297616
132100: 0,31328
133100: 0,297616
134100: 0,297616
135100: 0,31328
136100: 0,297616
137100: 0,297616
138100: 0,297616
139100: 0,328944
140100: 0,297616
141100: 0,297616
142100: 0,31328
143100: 0,31328
144100: 0,297616
145100: 0,31328

The values are around 0.3 regardless of how large or small the array
is, so clearly (to me), pinning seems to have a reasonable constant
cost.

From the first column, it looks like index is plenty big to put the arrays
in the LOH.

Feb 12 '08 #12

Jeroen Mostert

Ben Voigt [C++ MVP] wrote:

And I still don't understand why .NET considers use of a pointer unsafe...
only casts or pointer arithmetic can ever be unsafe, only the operations
which create a new pointer, and only if not bounds checked.

Which leaves... what useful operation you could do with pointers that
couldn't be done with references?

You *can* eliminate all unsafe operations (and there sure are a lot of
them), but when you've done that, what you get is managed code. Safe managed
pointers are references... OK, garbage collection is orthogonal to that, but
the discussion's already been there. Pointers as they are are not safe. You
could have references to unmoveable objects, but it would be more akin to a
C++ reference than a C++ pointer.

--
J.

Feb 13 '08 #13

Ben Voigt [C++ MVP]

Jeroen Mostert wrote:

Ben Voigt [C++ MVP] wrote:
And I still don't understand why .NET considers use of a pointer
unsafe... only casts or pointer arithmetic can ever be unsafe, only
the operations which create a new pointer, and only if not bounds
checked.
Which leaves... what useful operation you could do with pointers that
couldn't be done with references?

Return a pointer to a single element of a member array.

Feb 13 '08 #14

Ben Voigt [C++ MVP]

That's not an unusual case. I've already given two examples of APIs
in widespread use which require a buffer to stay in one position
after the initial function call which accepts the pointer.

True, but if you need this, why is the cost of pinning so important?
The cost of GCHandle.Alloc is ~5500 cycles. That means a one time
cost to pin a buffer that lives until the end of the process, if you
do this early in the process you won't suffer from fragmentation of
the gen0 heap as this object will end on the gen2 heap anyway.

That's what I do now.

But doesn't the object need to be moved to end up in gen2 data space? Won't
the pinning reference prevent that?

Also, you keep ignoring my remark that the fact that addresses of
*Large* objects are fixed is a convenience of the current version of
the CLR, nothing stops MS from changing this.

Which is why the OP is asking for a keyword / MSIL flag that will
let the runtime know that the object is intended to be fixed for as
long as it lives. It would be an implementation detail whether the
memory is allocated from the LOH, OLE task allocator, etc, etc. Also I
don't think that sacrificing GC for such objects would
necessarily be a big loss, they either will live to the end of the
process anyway, or they can be explicitly freed.

But this is what "fixed" is meant for. Sure, its scope is limited
by its containing function scope, but you can perfectly well pin an
object across several unmanaged function calls.

But it needs an "unsafe" block, for no apparent reason.

Feb 13 '08 #15

Willy Denoyette [MVP]

"Ben Voigt [C++ MVP]" <rb*@nospam.nospam> wrote in message
news:%2****************@TK2MSFTNGP02.phx.gbl...

That's not an unusual case. I've already given two examples of APIs
in widespread use which require a buffer to stay in one position
after the initial function call which accepts the pointer.

True, but if you need this, why is the cost of pinning so important?
The cost of GCHandle.Alloc is ~5500 cycles. That means a one time
cost to pin a buffer that lives until the end of the process, if you
do this early in the process you won't suffer from fragmentation of
the gen0 heap as this object will end on the gen2 heap anyway.

That's what I do now.

But doesn't the object need to be moved to end up in gen2 data space?
Won't the pinning reference prevent that?

No, but it all depends on the sizes of the individual generations, that's why I
say pin your buffers early in the process and after a GC.Collect().
You can verify the location of your pinned buffer by attaching a native
debugger like ntsd or windbg.
Once attached, enter:
!dumpheap -type <yourPinnedType>
this returns the object's address (and some other info), something like:
....
Address MT Size
02721660 79131a68 1612
....

To find the generation where your object "lives", you'll have to compare
its address with the start address of the generations returned by:

!eeheap -gc
....
Number of GC Heaps: 1
generation 0 starts at 0x0272d230
generation 1 starts at 0x02721cac
generation 2 starts at 0x02721000
....

here the object at address 02721660 falls in the Gen2 range.
Pinning an object that follows a couple of MB's of other live objects, has
little chance to end on the Gen2, unless the GC can move the object to gen2
at the moment of "pinning".
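
If attaching a debugger is inconvenient, `GC.GetGeneration` gives a rough programmatic check of the same thing. A sketch only, with hypothetical names; which generation is reported depends on the runtime version and heap state, so no particular result is promised.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinnedGenerationSketch
{
    // Pins a small buffer right after a collection, forces two more
    // collections, and reports the generation the buffer is in.
    public static int GenerationOfPinnedBuffer()
    {
        GC.Collect();
        byte[] buffer = new byte[1600];
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            GC.Collect();
            GC.Collect();
            return GC.GetGeneration(buffer);
        }
        finally
        {
            handle.Free();
        }
    }
}
```

`Console.WriteLine(PinnedGenerationSketch.GenerationOfPinnedBuffer());` prints a value between 0 and `GC.MaxGeneration`.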

Also, you keep ignoring my remark that the fact that addresses of
*Large* objects are fixed is a convenience of the current version of
the CLR, nothing stops MS from changing this.

Which is why the OP is asking for a keyword / MSIL flag that will
let the runtime know that the object is intended to be fixed for as
long as it lives. It would be an implementation detail whether the
memory is allocated from the LOH, OLE task allocator, etc, etc. Also I
don't think that sacrificing GC for such objects would
necessarily be a big loss, they either will live to the end of the
process anyway, or they can be explicitly freed.

But this is what "fixed" is meant for. Sure, its scope is limited
by its containing function scope, but you can perfectly well pin an
object across several unmanaged function calls.

But it needs an "unsafe" block, for no apparent reason.

"unsafe" blocks are needed when using pointers!
This produces non-verifiable code, so no surprises here, you are warned.....

Willy.

Feb 13 '08 #16

Ben Voigt [C++ MVP]

Willy Denoyette [MVP] wrote:

"Ben Voigt [C++ MVP]" <rb*@nospam.nospam> wrote in message
news:%2****************@TK2MSFTNGP02.phx.gbl...

That's not an unusual case. I've already given two examples of
APIs in widespread use which require a buffer to stay in one
position after the initial function call which accepts the pointer.
True, but if you need this, why is the cost of pinning so important?
The cost of GCHandle.Alloc is ~5500 cycles. That means a one time
cost to pin a buffer that lives until the end of the process, if you
do this early in the process you won't suffer from fragmentation of
the gen0 heap as this object will end on the gen2 heap anyway.

That's what I do now.

But doesn't the object need to be moved to end up in gen2 data space?
Won't the pinning reference prevent that?

No, but it all depends on the sizes of the individual generations,
that's why I say pin your buffers early in the process and after a
GC.Collect(). You can verify the location of your pinned buffer by
attaching a native debugger like ntsd or windbg.
Once attached, enter:
!dumpheap -type <yourPinnedType>
this returns the object's address (and some other info), something
like: ...
Address MT Size
02721660 79131a68 1612
...

To find the generation where your object "lives", you'll have to
compare its address with the start address of the generations
returned by:
!eeheap -gc
...
Number of GC Heaps: 1
generation 0 starts at 0x0272d230
generation 1 starts at 0x02721cac
generation 2 starts at 0x02721000
...

here the object at address 02721660 falls in the Gen2 range.
Pinning an object that follows a couple of MB's of other live
objects, has little chance to end on the Gen2, unless the GC can move
the object to gen2 at the moment of "pinning".

Ok, so changing the generation of an object is done by moving the generation
boundary in memory, not by actually changing the object's address. I guess
that makes good sense.

Also, you keep ignoring my remark that the fact that addresses of
*Large* objects are fixed is a convenience of the current version
of the CLR, nothing stops MS from changing this.

Which is why the OP is asking for a keyword / MSIL flag that will
let the runtime know that the object is intended to be fixed for as
long as it lives. It would be an implementation detail whether the
memory is allocated from the LOH, OLE task allocator, etc, etc.
Also I don't think that sacrificing GC for such objects would
necessarily be a big loss, they either will live to the end of the
process anyway, or they can be explicitly freed.
But this is what "fixed" is meant for. Sure, its scope is limited
by its containing function scope, but you can perfectly well pin an
object across several unmanaged function calls.

But it needs an "unsafe" block, for no apparent reason.

"unsafe" blocks are needed when using pointers!
This produces non-verifiable code, so no surprises here, you are
warned.....

GCHandle.Alloc and Marshal.UnsafeAddrOfPinnedArrayElement are verifiable and
need no unsafe block.
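
A short sketch of that combination, with hypothetical class and method names: the array is pinned with `GCHandle.Alloc`, the address of one element comes from `Marshal.UnsafeAddrOfPinnedArrayElement`, and nothing here needs the `unsafe` keyword.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ElementPointerSketch
{
    // Pins the array, takes the address of values[index] as an IntPtr,
    // and reads the element back through that address.
    public static int ReadElementViaPointer(int[] values, int index)
    {
        GCHandle handle = GCHandle.Alloc(values, GCHandleType.Pinned);
        try
        {
            IntPtr element = Marshal.UnsafeAddrOfPinnedArrayElement(values, index);
            return Marshal.ReadInt32(element);
        }
        finally
        {
            // The IntPtr must not be used after this point.
            handle.Free();
        }
    }
}
```

Despite needing no `unsafe` block, the returned `IntPtr` is only meaningful while the handle is allocated; the method does not check that the array is actually pinned or that the index is in range.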

Willy.

Feb 14 '08 #17
