Bytes | Developer Community

Roll your own std::vector ???

I need std::vector-like capability for several custom classes. I already
discussed this extensively in the thread named ArrayList without Boxing and
Unboxing. The solution was to simply create non-generic (non-C++-template)
std::vector-like capability for each of these custom classes. (The solution
must work in Visual Studio 2002.)

Since I have already written a std::vector for a Ye Olde C++ compiler (Borland
C++ 1.0) that had neither templates nor the STL, I know how to do this. What I
don't know is how to directly re-allocate memory in the garbage-collected C#.

I have written what I need in pseudocode, what is the correct C# syntax for
this?

int Size;
int Capacity;

bool AppendDataItem(DataItemType Data) {
    if (Size == Capacity) {
        (1) Capacity = Capacity * 2; // Or * 1.5
        (2) Temp = MemoryPointer;
        (3) MemoryPointer = Allocate(Capacity);
        (4) Copy Data from Temp to MemoryPointer;
        (5) DeAllocate(Temp);
    }
    (6) MemoryPointer[Size] = Data;
    (7) Size++;
}
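In C#, the pseudocode above translates to something like the following sketch (the DataItem element type and the class name are placeholders for illustration). The main change is in steps (2)-(5): there is no Allocate/DeAllocate pair; you allocate a new array, copy into it, and drop the reference to the old one, which the garbage collector then reclaims.

```csharp
using System;

// Hypothetical element type, standing in for DataItemType above.
public struct DataItem
{
    public int Value;
    public DataItem(int value) { Value = value; }
}

public class DataItemVector
{
    private DataItem[] items = new DataItem[4]; // initial Capacity of 4
    private int size;                           // Size

    public int Size { get { return size; } }
    public int Capacity { get { return items.Length; } }

    public void AppendDataItem(DataItem data)
    {
        if (size == items.Length)
        {
            // (1) grow capacity; (2)-(5) allocate, copy, drop the old array
            DataItem[] newItems = new DataItem[items.Length * 2];
            Array.Copy(items, newItems, size);
            items = newItems; // old array becomes garbage; the GC reclaims it
        }
        items[size] = data; // (6)
        size++;             // (7)
    }

    public DataItem this[int index] { get { return items[index]; } }
}
```

In .NET 2.0 and later, Array.Resize does the allocate-and-copy in one call, and List&lt;T&gt; already implements exactly this doubling strategy internally.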

Dec 17 '06
Peter Olcott <No****@SeeScreen.com> wrote:
Changing the benchmark code to use a reference type instead of a value
type (i.e. removing the boxing/unboxing entirely) makes very little
difference to the performance of the code. (In fact, on my box it
actually makes it worse - I'm not entirely sure why at the moment.)

The most time critical aspect of my application spends about 99% of its time
comparing elements in one array to elements in other arrays. This subsystem can
tolerate no additional overhead at all, having a real-time constraint of 1/10
second.
a) arrays aren't the same as lists
b) comparing elements in an array isn't the same as adding/removing
elements to/from the start of a list

So the benchmark you've been basing your 500% figure on is irrelevant
to your code.

Try benchmarking your *actual code* rather than relying on external
benchmarks.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Dec 19 '06 #51
"Peter Olcott" <No****@SeeScreen.comwrote
>
The most time critical aspect of my application spends about 99% of its
time comparing elements in one array to elements in other arrays. This
subsystem can tolerate no additional overhead at all, having a real-time
constraint of 1/10 second.
So why are you so worried about the "adding into a list" performance case?
It seems as if you would be much more worried about traversal and
comparison, rather than just inserting.

Also, if you are inserting into an array (be it List<>, or something else)
you're always going to have hits while it resizes the array - as long as
you're doing insertions or deletions, this is going to be a problem. Use a
linked list. You've got to pick the right data structure for your problem.

To further confuse issues, if this is the most time critical piece of your
application, you need to spend some very serious time looking at a better
algorithm. You're going to want to partition your comparisons across
processors using multiple threads (possibly setting affinity), and do so in
a way that avoids locking in the general case, and is intelligent about use
of processor cache (aka: avoids blowing the cache constantly).

Parallel Algorithms are going to be (well, should be) a major part of your
solution. Depending on your use case and processor counts, you may also want
to look into the various lock-free list implementations.
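As a rough illustration of that partitioning idea (the method and its scheme are a sketch, not code from this thread): give each thread its own contiguous slice of the arrays, so the inner comparison loop needs no locks and each core mostly touches its own cache lines.

```csharp
using System;
using System.Threading;

class ParallelCompare
{
    // Count matching elements between a and b, split across threadCount threads.
    // Each thread scans its own contiguous slice, so no locking is needed
    // inside the loop; per-thread tallies are combined after Join.
    public static long CountMatches(int[] a, int[] b, int threadCount)
    {
        long[] partial = new long[threadCount]; // one slot per thread, no sharing
        Thread[] threads = new Thread[threadCount];
        int chunk = a.Length / threadCount;

        for (int t = 0; t < threadCount; t++)
        {
            int id = t; // capture the loop variable for the closure
            int start = id * chunk;
            int end = (id == threadCount - 1) ? a.Length : start + chunk;
            threads[id] = new Thread(delegate()
            {
                long count = 0;
                for (int i = start; i < end; i++)
                    if (a[i] == b[i]) count++;
                partial[id] = count; // written once, read after Join
            });
            threads[id].Start();
        }

        long total = 0;
        for (int t = 0; t < threadCount; t++)
        {
            threads[t].Join();
            total += partial[t];
        }
        return total;
    }
}
```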

These problems are in no way language specific. C / C++ / Java / C# / VB.Net
will all do a fine job. It's really up to you to determine the best
algorithm given your problem.

--
Chris Mullins, MCSD.NET, MCPD:Enterprise
http://www.coversant.net/blogs/cmullins
Dec 19 '06 #52

"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.c om...
Peter Olcott <No****@SeeScreen.com> wrote:
>The main reason that generics were created was because the boxing/unboxing
overhead of ArrayList was too expensive.

Well, that was *one* reason. The other (principal, IMO) reasons were
type safety and better expression of ideas.

Generics are in Java 5+ as well, without removing the impact of
boxing/unboxing.

Of course, even if generics *had* primarily been invented to alleviate
the performance hit of boxing/unboxing, that still wouldn't make the
benchmark you're looking at any more relevant, because boxing/unboxing
simply isn't a significant factor in it.
If one must convert the form of the data on every access, then this must take
much longer than the simple access by itself. The last time we had this
discussion, all kinds of extra overhead (such as bounds checking) was thrown
in and assumed. I am talking about the bare-minimum simple access that is
translated into native code as a simple memory comparison such as
CMP EAX, [EBX].
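For what it's worth, C# does provide an escape hatch close to this: inside an unsafe block with fixed pointers, element access compiles to raw loads with no bounds checks, much like the bare CMP-against-memory described above. A hedged sketch (the method is illustrative and must be compiled with /unsafe):

```csharp
// Compile with: csc /unsafe Compare.cs
public static class Compare
{
    // Returns the index of the first element where a and b differ, or -1.
    // Inside the fixed block, *pa and *pb are raw pointer dereferences:
    // no array bounds checks are emitted for them.
    public static unsafe int FirstDifference(int[] a, int[] b)
    {
        int len = a.Length < b.Length ? a.Length : b.Length;
        fixed (int* pa0 = a, pb0 = b) // pin both arrays so the GC cannot move them
        {
            int* pa = pa0, pb = pb0;
            for (int i = 0; i < len; i++, pa++, pb++)
                if (*pa != *pb) return i;
        }
        return -1;
    }
}
```

In safe code the JIT can already elide bounds checks for the canonical `for (int i = 0; i < arr.Length; i++)` pattern, so unsafe code is rarely the first resort.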
>
--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

Dec 19 '06 #53

"Chris Mullins" <cm******@yahoo.comwrote in message
news:%2****************@TK2MSFTNGP04.phx.gbl...
"Peter Olcott" <No****@SeeScreen.comwrote
>>
The most time critical aspect of my application spends about 99% of its time
comparing elements in one array to elements in other arrays. This subsystem
can tolerate no additional overhead at all, having a real-time constraint of
1/10 second.

So why are you so worried about the "adding into a list" performance case? It
seems as if you would be much more worried about traversal and comparison,
rather than just inserting.

Also, if you are inserting into an array (be it List<>, or something else)
you're always going to have hits while it resizes the array - as long as
you're doing insertions or deletions, this is going to be a problem. Use a
linked list. You've got to pick the right data structure for your problem.

To further confuse issues, if this is the most time critical piece of your
application, you need to spend some very serious time looking at a better
algorithm. You're going to want to partition your comparisons across
processors using multiple threads (possibly setting affinity), and do so in a
way that avoids locking in the general case, and is intelligent about use of
processor cache (aka: avoids blowing the cache constantly).
It works fine now in native C++; I just don't want the port to .NET to break
this performance. Unless I can know in advance that .NET will not break this
performance, I don't have time to learn .NET.
>
Parallel Algorithms are going to be (well, should be) a major part of your
solution. Depending on your use case and processor counts, you may also want
to look into the various lock-free list implementations.
The simple thing to do is to use dual core processors and allocate one core to
this process, since this process must achieve its 1/10 second real time limit
concurrently with other processes. In actuality this probably will not be
required for most applications because most of the time the other process will
not be using very much CPU time during the execution of this process.

>
These problems are in no way language specific. C / C++ / Java / C# / VB.Net
will all do a fine job. It's really up to you to determine the best algorithm
given your problem.

--
Chris Mullins, MCSD.NET, MCPD:Enterprise
http://www.coversant.net/blogs/cmullins


Dec 19 '06 #54
Peter Olcott <No****@SeeScreen.com> wrote:
Generics are in Java 5+ as well, without removing the impact of
boxing/unboxing.

Of course, even if generics *had* primarily been invented to alleviate
the performance hit of boxing/unboxing, that still wouldn't make the
benchmark you're looking at any more relevant, because boxing/unboxing
simply isn't a significant factor in it.

If one must convert the form of the data for every access to this data, then
this must take much longer than the simple access by itself. The last time that
we had this discussion there was all kinds of extra overhead (such as bounds
checking) that was thrown in and assumed. I am talking about the bare minimum
simple access that is translated into native code as a simple memory comparison
such as CMP EAX, MEM[EBX].
I'm not sure what the relevance of that is to anything I wrote...

If you really want unsafe code that could dive into the middle of
*anywhere* if you give it an invalid array index, you should indeed
stick to unmanaged code.

If you want code that performs extremely well anyway, whilst being much
more robust and easier to develop, go with .NET.

I'm not sure why you're continuing this line of enquiry, to be honest -
you seem to have made up your mind long ago that .NET wouldn't perform
appropriately, which is presumably why you haven't accepted that the
benchmarks you've been using as a basis for rejecting C# are entirely
inappropriate.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Dec 19 '06 #55
"Peter Olcott" <No****@SeeScreen.comwrote
>
"Chris Mullins" <cm******@yahoo.comwrote in message
>To further confuse issues, if this is the most time critical piece of
your application, you need to spend some very serious time looking at a
better algorithm. You're going to want to partition your comparisions
across processors using multiple threads (possibly setting affinity), and
do so in a way that avoids locking in the general case, and is
intelligent about use of processor cache (aka: avoids blowing the cache
constantly).

It works fine now in native C++, I just don't want the port to .NET to
break this performance. Unless I can know in advance that .NET will not
break this performance I don't have time to learn .NET.
You decided long ago that .Net wasn't a platform you wanted to use. You
found a set of benchmarks that supported this stance, and have proceeded to
cling to them despite a number of people showing you huge and obvious flaws
in those benchmarks.

You've then changed tactics, and talked about how it's actually not just
those benchmarks, but other technologies like boxing and unboxing that are
really going to be your problem. Now it's not about memory allocation
anymore, but about traversing arrays and data structures for comparisons.

At this point, honestly, you're just being stubborn.
>>
Parallel Algorithms are going to be (well, should be) a major part of
your solution.
The simple thing to do is to use dual core processors and allocate one
core to this process, since this process must achieve its 1/10 second real
time limit concurrently with other processes.
That's not the simplest thing to do at all. In fact, all that tells me is
that you've not written multi-threaded code before, aren't familiar with
parallel algorithms, and don't really know how to take advantage of multiple
CPUs. It also says you don't really know how threading works under Windows,
in terms of scheduling.

Given the problem you describe, these really are technologies you need to
master - regardless of C++, Java, .Net, or whatever other language you
choose.

--
Chris Mullins, MCSD.NET, MCPD:Enterprise
http://www.coversant.net/blogs/cmullins
Dec 19 '06 #56

"Chris Mullins" <cm******@yahoo.comwrote in message
news:uu**************@TK2MSFTNGP04.phx.gbl...
"Peter Olcott" <No****@SeeScreen.comwrote
>>
"Chris Mullins" <cm******@yahoo.comwrote in message
>>To further confuse issues, if this is the most time critical piece of your
application, you need to spend some very serious time looking at a better
algorithm. You're going to want to partition your comparisions across
processors using multiple threads (possibly setting affinity), and do so in
a way that avoids locking in the general case, and is intelligent about use
of processor cache (aka: avoids blowing the cache constantly).

It works fine now in native C++, I just don't want the port to .NET to break
this performance. Unless I can know in advance that .NET will not break this
performance I don't have time to learn .NET.

You decided long ago that .Net wasn't a platform you wanted to use. You found
a set of benchmarks that supported this stance, and have proceeded to cling to
them despite a number of people showing you huge and obvious flaws in those
benchmarks.

You've then changed tactics, and talked about how it's actually not just those
benchmarks, but other technologies like boxing and unboxing that are really
going to be your problem. Now it's not about memory allocation anymore, but
about traversing arrays and datastructures for comparisons.

At this point, honestly, you're just being stubborn.
Actually it is just the opposite of your incorrect assumptions. .NET is a
technology that I have definitely decided to support. In theory there is no
fundamental reason why this technology would need to remain any slower than
traditional forms of native code execution.

Because of the truly brilliant software engineering that went into .NET there
will definitely be a point in time when the 500% productivity gains are worth
the learning curve. For most applications programming that point in time was a
few years ago.

For critical systems programming that point in time must wait until the
performance is up to snuff. I would tentatively estimate that this point in time
may have arrived at the same time that .NET 2.0 and generics arrived. I must
make sure of this before I can proceed.
>
>>>
Parallel Algorithms are going to be (well, should be) a major part of your
solution.
>The simple thing to do is to use dual core processors and allocate one core
to this process, since this process must achieve its 1/10 second real time
limit concurrently with other processes.

That's not the simplest thing to do at all. In fact, all that tells me is that
you've not written multi-threaded code before, aren't familiar with parallel
algorithms, and don't really know how to take advantage of multiple CPUs. It
also says you don't really know how threading works under Windows, in terms of
scheduling.

Given the problem you describe, these really are technologies you need to
master - regardless of C++, Java, .Net, or whatever other language you choose.

--
Chris Mullins, MCSD.NET, MCPD:Enterprise
http://www.coversant.net/blogs/cmullins

Dec 19 '06 #57
"Peter Olcott" <No****@SeeScreen.comwrote in message
news:u6*******************@newsfe13.phx...
Because of the truly brilliant software engineering that went into .NET
there will definitely be a point in time when the 500% productivity gains
are worth the learning curve.
The learning curve to go from C++ to C# just isn't all that high - in my
experience, at least.

Programmers should learn a new language every year, anyway.

///ark
Dec 20 '06 #58

"Mark Wilden" <mw*****@communitymtm.comwrote in message
news:OL**************@TK2MSFTNGP04.phx.gbl...
"Peter Olcott" <No****@SeeScreen.comwrote in message
news:u6*******************@newsfe13.phx...
>Because of the truly brilliant software engineering that went into .NET there
will definitely be a point in time when the 500% productivity gains are worth
the learning curve.

The learning curve to go from C++ to C# just isn't all that high - in my
experience, at least.

Programmers should learn a new language every year, anyway.

///ark
Bjarne Stroustrup (the creator of C++) once said that he still has not learned
all of C++. Trying to learn a new programming language every year would then
necessarily result in a very shallow understanding.
Dec 20 '06 #59
"Peter Olcott" <No****@SeeScreen.comwrote in message
news:Zx********************@newsfe07.phx...
>
Bjarne Stroustrup (the creator of C++) once said that he has not yet
learned all of C++ yet. Trying to learn a new computer programming
language every year would then necessarily result in a very shallow
understanding.
It's not necessary to learn "all" of anything. But being afraid to invest
time in a new skill is a sure way to make yourself extinct.

This year I learned Ruby. Do I have a shallow understanding of it? Perhaps.
But I bet it's deeper than yours. :)

///ark
Dec 20 '06 #60

"Mark Wilden" <mw*****@communitymtm.comwrote in message
news:uv****************@TK2MSFTNGP04.phx.gbl...
"Peter Olcott" <No****@SeeScreen.comwrote in message
news:Zx********************@newsfe07.phx...
>>
Bjarne Stroustrup (the creator of C++) once said that he has not yet learned
all of C++ yet. Trying to learn a new computer programming language every
year would then necessarily result in a very shallow understanding.

It's not necessary to learn "all" of anything. But being afraid to invest time
in a new skill is a sure way to make yourself extinct.

This year I learned Ruby. Do I have a shallow understanding of it? Perhaps.
But I bet it's deeper than yours. :)

///ark
I am starting a business on very limited resources and must work 40 hours every
day (16 more hours than there are in a day) to get enough done each day. That
tends to leave me with about 24 hours every day of less than no spare time at
all.
Dec 20 '06 #61
Peter Olcott <No****@SeeScreen.com> wrote:
I am starting a business on very limited resources and must work 40 hours every
day (16 more hours than there are in a day) to get enough done each day. That
tends to leave me with about 24 hours every day of less than no spare time at
all.
If your business is relying on the impossible to start with, is it
really realistically viable, whatever platform you use? Whatever you
do, you *cannot* work 40 hours a day, obviously - so either your
"must" is wrong, or you will fail. Logically, I can't see any other
option.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Dec 20 '06 #62
"Peter Olcott" <No****@SeeScreen.coma écrit dans le message de news:
Zx********************@newsfe07.phx...

| Bjarne Stroustrup (the creator of C++) once said that he has not yet
learned all
| of C++ yet. Trying to learn a new computer programming language every year
would
| then necessarily result in a very shallow understanding.

Well C# only has 80 keywords, 54 operators and no need to understand
pointers and indirection; assuming that you already know the basic
programming constructs, that really isn't a lot to learn.

Having said that, mastering topics like generics has certainly taken me more
time than I had anticipated.

Joanna

--
Joanna Carter [TeamB]
Consultant Software Engineer
Dec 20 '06 #63
"Joanna Carter [TeamB]" <jo****@not.for.spamwrote
>
| Bjarne Stroustrup (the creator of C++) once said that he has not yet
| learned all of C++ yet. Trying to learn a new computer programming
| language every year would
| then necessarily result in a very shallow understanding.
Well C# only has 80 keywords, 54 operators and no need to understand
pointers and indirection; assuming that you already know the basic
programming constructs, that really isn't a lot to learn.
Well, it's not really the keywords and operators that take a long time to
learn in any language.

In C++ the area that took me the longest to really wrap my head around was
the combination of multiple inheritance and the STL - especially as it
worked with and related to streams and the magic they can work.

In .Net, learning C# (or VB.Net) is (I think) pretty easy. Learning enough
about the 65K classes in the FCL to get things done efficiently has taken a
while.

In both languages, (if you were coming in cold) learning OO and how it works
would be (I believe) by far the biggest hurdle. The realization that you can
make your complex class (or set of classes) implement this silly little
interface (pick one: IEnumerable, IComparable, ICloneable, IEquatable,
IPrincipal, etc, etc) and get all this great stuff to happen as a result is
quite a moment.

--
Chris Mullins, MCSD.NET, MCPD:Enterprise
http://www.coversant.net/blogs/cmullins

Dec 20 '06 #64

"Chris Mullins" <cm******@yahoo.comwrote in message
news:%2****************@TK2MSFTNGP04.phx.gbl...
"Joanna Carter [TeamB]" <jo****@not.for.spamwrote
>>
| Bjarne Stroustrup (the creator of C++) once said that he has not yet
| learned all of C++ yet. Trying to learn a new computer programming
| language every year would
| then necessarily result in a very shallow understanding.
>Well C# only has 80 keywords, 54 operators and no need to understand
pointers and indirection; assuming that you already know the basic
programming constructs, that really isn't a lot to learn.

Well, it's not really the keywords and operators that take a long time to
learn in any language.

In C++ the area that took me the longest to really wrap my head around was the
combination of multiple inheritance and the STL - especially as it worked with
and related to streams and the magic they can work.

In .Net, learning C# (or VB.Net) is (I think) pretty easy. Learning enough
about the 65K classes in the FCL to get things done efficiently has taken a
while.
This is the huge learning curve that I am referring to. C# by itself is
essentially a subset of C++. The trickiest things that I have not yet
completely mastered about C# are those pertaining to its interface with the
.NET architecture. I still don't understand why there is any need for the
boxing and unboxing stuff.
>
In both languages, (if you were coming in cold) learning OO and how it works
would be (I believe) by far the biggest hurdle. The realization that you can
make your complex class (or set of classes) implement this silly little
interface (pick one: IEnumerable, IComparable, ICloneable, IEquatable,
IPrincipal, etc, etc) and get all this great stuff to happen as a result is
quite a moment.

--
Chris Mullins, MCSD.NET, MCPD:Enterprise
http://www.coversant.net/blogs/cmullins

Dec 20 '06 #65
"Peter Olcott" <No****@SeeScreen.comwrote in message
news:0b*******************@newsfe12.phx...
>
I am starting a business on very limited resources and must work 40 hours
every day (16 more hours than there are in a day) to get enough done each
day. That tends to leave me with about 24 hours every day of less than no
spare time at all.
I'm sure you have very strong reasons for this decision, and I wish you
luck!

///ark
Dec 20 '06 #66

"Mark Wilden" <mw*****@communitymtm.comwrote in message
news:Oq**************@TK2MSFTNGP06.phx.gbl...
"Peter Olcott" <No****@SeeScreen.comwrote in message
news:0b*******************@newsfe12.phx...
>>
I am starting a business on very limited resources and must work 40 hours
every day (16 more hours than there are in a day) to get enough done each
day. That tends to leave me with about 24 hours every day of less than no
spare time at all.

I'm sure you have very strong reasons for this decision, and I wish you luck!

///ark
Its the culmination of twenty years worth of work. I am actively looking for a
marketing partner to delegate at least half this work to.
Dec 20 '06 #67
"Peter Olcott" <No****@SeeScreen.coma écrit dans le message de news:
ct******************@newsfe19.lga...

| This is the huge learning curve that I am referring to. C# by itself is
| essentially for the most part a subset of C++. The trickiest things that I
have
| not quite completely mastered about C# are those things pertaining to its
| interface with the .NET architecture. I still don't understand why there
is any
| need for the boxing and unboxing stuff.

If you go straight to .NET 2.0, you should never need to use boxing and
unboxing.
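Concretely, the difference being described can be sketched with the standard ArrayList and List&lt;int&gt; types (the helper methods are illustrative):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class BoxingDemo
{
    // .NET 1.x style: ArrayList stores object, so every int is boxed
    // on Add and unboxed (with a cast) on the way back out.
    public static int RoundTripOld(int value)
    {
        ArrayList list = new ArrayList();
        list.Add(value);      // boxes the int into a heap object
        return (int)list[0];  // unboxes it back
    }

    // .NET 2.0 style: List<int> stores an int[] internally - no boxing,
    // no cast, and the compiler enforces the element type.
    public static int RoundTripNew(int value)
    {
        List<int> list = new List<int>();
        list.Add(value);      // stored directly as an int
        return list[0];
    }
}
```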

Joanna

--
Joanna Carter [TeamB]
Consultant Software Engineer
Dec 21 '06 #68
If you go straight to .NET 2.0, you should never need to use boxing and
unboxing.
For rolling containers, agreed...

...unless you code against the component model (e.g. providing dynamic
properties) or use reflection; for instance UI binding - in both
cases setters, getters and list access tend to be as "object", so
boxed for the value-type primitives.

I'm not disagreeing... just adding a caveat... ;-p

Marc

Dec 21 '06 #69

"Joanna Carter [TeamB]" <jo****@not.for.spamwrote in message
news:uV**************@TK2MSFTNGP06.phx.gbl...
"Peter Olcott" <No****@SeeScreen.coma écrit dans le message de news:
ct******************@newsfe19.lga...

| This is the huge learning curve that I am referring to. C# by itself is
| essentially for the most part a subset of C++. The trickiest things that I
have
| not quite completely mastered about C# are those things pertaining to its
| interface with the .NET architecture. I still don't understand why there
is any
| need for the boxing and unboxing stuff.

If you go straight to .NET 2.0, you should never need to use boxing and
unboxing.
So boxing and unboxing was merely a design anomaly of the earlier versions?
>
Joanna

--
Joanna Carter [TeamB]
Consultant Software Engineer


Dec 21 '06 #70

"Joanna Carter [TeamB]" <jo****@not.for.spamwrote in message
news:Oo*************@TK2MSFTNGP06.phx.gbl...
"Peter Olcott" <No****@SeeScreen.coma écrit dans le message de news:
XB****************@newsfe14.lga...

| Or even better, use the actual std::vector itself that is now available
| with the currently released version of Visual Studio; that way I further
| reduce my learning curve. There are two reasons why I don't want to
| upgrade yet: (1) price, (2) the exams refer to the older version.

I think you will find that List<T> should give you most of the functionality
you need. (1) You can get a copy of Visual Studio 2005 Express for free. (2)
Are you more interested in writing code and achieving results, or passing
exams? :-)

Joanna
Thank you very much for this excellent advice. It looks like all of the Visual
Studio 2005 Express compilers include the optimizer and are free to be used
commercially. The C++ version includes STL .NET, and the C# version includes
generics.

Because most of the complexity has been stripped from these compilers while
still retaining the most useful functionality the learning curve is finally
minimized.

>
--
Joanna Carter [TeamB]
Consultant Software Engineer


Dec 22 '06 #71
It looks like all of the Visual
Studio 2005 Express compilers include the optimizer and are free to be used
commercially. The C++ version includes STL .NET, and the C# version includes
generics.
[refers to C#; I have no clue about C++]
Strictly speaking, these products don't actually contain the compiler
at all... the compiler is part of the framework itself, so you actually
have *full* access to the compiler ("csc" for c#, etc) *just* by
installing the framework. At a push, all you need is notepad.
VS/VSExpress provide an IDE to make development more productive.

Marc

Dec 22 '06 #72
Peter Olcott wrote:
If you go straight to .NET 2.0, you should never need to use boxing and
unboxing.

So boxing and unboxing was merely a design anomaly of the earlier versions?
No. Boxing and unboxing are fundamental to the unified type system.

I disagree with Joanna's characterisation with regards to generics -
you should rarely if ever need to use boxing and unboxing for
collections, but there are plenty of other places where a method may
have a type of System.Object and call ToString on it, for instance.

Jon

Dec 22 '06 #73

"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:11**********************@h40g2000cwb.googlegr oups.com...
Peter Olcott wrote:
If you go straight to .NET 2.0, you should never need to use boxing and
unboxing.

So boxing and unboxing was merely a design anomaly of the earlier versions?

No. Boxing and unboxing are fundamental to the unified type system.

I disagree with Joanna's characterisation with regards to generics -
you should rarely if ever need to use boxing and unboxing for
collections, but there are plenty of other places where a method may
have a type of System.Object and call ToString on it, for instance.

Jon
Can you explain the underlying reasons why boxing and unboxing are required,
and not merely unnecessary overhead?
Dec 22 '06 #74
"Peter Olcott" <No****@SeeScreen.coma écrit dans le message de news:
O9*****************@newsfe12.phx...

| Can you explain the underlying details why boxing and unboxing are
required, and
| not merely unnecessary overhead?

Unlike reference types, value types are "optimised" to be held as their
expected native types (4 bytes, 2 bytes, etc) until they are needed as their
full System.Object derived type. When this transformation takes place, this
is known as boxing; taking an object reference to a value type and assigning
it to a "native" type is known as unboxing.

In effect, the following code happens:

{
    int i = 123; // native type, 4 bytes, no methods or properties

    string s = i.ToString();

    // equivalent to
    //
    // Int32 boxed = new Int32(i); // Int32 is a struct, so boxing occurs here
    // string s = boxed.ToString();
    //

    object o = i;

    // equivalent to
    //
    // object o = new Int32(i);
    //

    int j = (int) o;

    // equivalent to
    //
    // int j = ((Int32) o).m_value; // pseudocode, but unboxing occurs like this
    //
}

Joanna

--
Joanna Carter [TeamB]
Consultant Software Engineer
Dec 22 '06 #75

"Joanna Carter [TeamB]" <jo****@not.for.spamwrote in message
news:eI**************@TK2MSFTNGP04.phx.gbl...
"Peter Olcott" <No****@SeeScreen.coma écrit dans le message de news:
O9*****************@newsfe12.phx...

| Can you explain the underlying details why boxing and unboxing are
required, and
| not merely unnecessary overhead?

Unlike reference types, value types are "optimised" to be held as their
expected native types (4 bytes, 2 bytes, etc) until they are needed as their
full System.Object derived type. When this transformation takes place, this
is known as boxing; taking an object reference to a value type and assigning
it to a "native" type is known as unboxing.
So the compiler designers couldn't figure out a way to provide member
functions on intrinsic types without a lot of extra overhead? It would seem
that all of this capability could easily be provided at compile time, with no
need for run-time overhead.
>
In effect, the following code happens:

{
    int i = 123; // native type, 4 bytes, no methods or properties

    string s = i.ToString();

    // equivalent to
    //
    // Int32 boxed = new Int32(i); // Int32 is a struct, so boxing occurs here
    // string s = boxed.ToString();
    //

    object o = i;

    // equivalent to
    //
    // object o = new Int32(i);
    //

    int j = (int) o;

    // equivalent to
    //
    // int j = ((Int32) o).m_value; // pseudocode, but unboxing occurs like this
    //
}

Joanna

--
Joanna Carter [TeamB]
Consultant Software Engineer


Dec 22 '06 #76
"Peter Olcott" <No****@SeeScreen.coma écrit dans le message de news:
_3*****************@newsfe14.phx...

| So boxing and unboxing was merely a design anomaly of the earlier
versions?

No, it's still an essential part of .NET 2.0, it's just that it is no longer
required when creating typesafe collection classes. See my other post for an
explanation of where (un)boxing is still used.

Joanna

--
Joanna Carter [TeamB]
Consultant Software Engineer
Dec 22 '06 #77
"Marc Gravell" <ma**********@gmail.coma écrit dans le message de news:
11**********************@f1g2000cwa.googlegroups.c om...

|If you go straight to .NET 2.0, you should never need to use boxing and
| unboxing.
| For rolling containers, agreed...
|
| ...unless you code against the component model (e.g. providing dynamic
| properties) or use reflection; for instance UI binding - in both
| cases setters, getters and list access tend to be as "object", so
| boxed for the value-type primitives.
|
| I'm not disagreeing... just adding a caveat... ;-p

Caveat away, kind sir :-) Good point. Although I have made some generic
extensions to the ComponentModel classes to avoid this where possible.

Joanna

--
Joanna Carter [TeamB]
Consultant Software Engineer
Dec 22 '06 #78
Peter Olcott wrote:
So the compiler designers couldn't figure out a way to provide member
functions on intrinsic types without a lot of extra overhead? It would seem
that all of this capability could easily be provided at compile time, with no
need for run-time overhead.
No, that's not it at all.

If you call (for instance) myInt.ToString() then boxing isn't involved.

However, if you want to provide a method which can call ToString on
*anything*, then the type of the parameter needs to be Object. As
Object is a reference type, you need a way to convert the integer value
into a reference type value - and that's what boxing does.
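A minimal illustration of that point (the method name is ours, not from any .NET API):

```csharp
using System;

class Unification
{
    // The parameter type is object, so a value-type argument must be
    // boxed at the call site before the method can receive it.
    static string Describe(object value)
    {
        return value.GetType().Name + ": " + value.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Describe(42));      // the int is boxed here
        Console.WriteLine(Describe("text"));  // string is a reference type: no boxing
    }
}
```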

I suggest you read some good books on C# and .NET to understand why
it's a good thing to have both value types and reference types, and why
it's also useful to be able to create a reference type value from a
value type value in certain situations.

It's certainly *not* designer laziness.

Jon

Dec 22 '06 #79

"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:11*********************@i12g2000cwa.googlegro ups.com...
Peter Olcott wrote:
>So the compiler designers couldn't figure out a way to provide the capability
>of member functions to intrinsic types without a lot of extra overhead? It
>would seem that all of this capability could easily be provided at
>compile-time with no need for run-time overhead.

No, that's not it at all.

If you call (for instance) myInt.ToString() then boxing isn't involved.

However, if you want to provide a method which can call ToString on
*anything*, then the type of the parameter needs to be Object. As
Object is a reference type, you need a way to convert the integer value
into a reference type value - and that's what boxing does.

I suggest you read some good books on C# and .NET to understand why
it's a good thing to have both value types and reference types, and why
it's also useful to be able to create a reference type value from a
value type value in certain situations.

It's certainly *not* designer laziness.
It would still seem like far less than the best possible design of all possible
designs. It still seems like run-time overhead that could have been done at
compile-time.

Dec 22 '06 #80
Peter Olcott wrote:
It's certainly *not* designer laziness.

It would still seem like far less than the best possible design of all possible
designs. It still seems like run-time overhead that could have been done at
compile-time.
I strongly suggest that you wait until you've got a bit more experience
with the type system as a whole before passing judgements like that.
Considering you were still unsure about the differences between value
types and reference types until a few days ago (at least, so this
thread suggests) you're not in an ideal situation to comment on design
decisions.

In particular, you've been convinced before now that the
boxing/unboxing overhead was prohibitively expensive due to a benchmark
which actually demonstrated nothing of the kind.

The actual overhead involved due to boxing and unboxing in most .NET
applications is miniscule, but the advantages provided by it in terms
of type unification are very great.
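Type unification is what lets values of unrelated types share one collection at all; a sketch of our own (note the double's output formatting depends on the current culture):

```csharp
using System;
using System.Collections;

class MixedBag
{
    static void Main()
    {
        // ArrayList stores object, so the int and the double are boxed
        // on Add, while the string reference is stored directly.
        ArrayList items = new ArrayList();
        items.Add(1);        // boxed
        items.Add(2.5);      // boxed
        items.Add("three");  // not boxed

        foreach (object item in items)
            Console.WriteLine(item);
    }
}
```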

Jon

Dec 22 '06 #81
"Peter Olcott" <No****@SeeScreen.coma écrit dans le message de news:
8k********************@newsfe07.phx...

| If you call (for instance) myInt.ToString() then boxing isn't involved.

My apologies, I assumed boxing occurred here.

| It would still seem like far less than the best possible design of all
possible
| designs. It still seems like run-time overhead that could have been done
at
| compile-time.

Rather, it is usually down to developer laziness, using object
parameters to methods where strictly typed methods would be more efficient
:-)

Many times, developers, wanting to do the same thing to different types,
will use the following method signature :

string ConvertToString(object value)
{
return value.ToString();
}

Now, this means that any type may be passed to the method, but that boxing
will occur. However if, in .NET 1.1, we use overloaded method signatures :

string ConvertToString(int value)
{
return value.ToString();
}
string ConvertToString(double value)
{
return value.ToString();
}

void Test()

{
int i = 123;

string s = ConvertToString(i);
}

... the correct overloaded method will be called for the true type of the
parameter and the native version of ToString() for each type will be called.

Or you could use generics to simplify this even further :
string ConvertToString<T>(T value)
{
return value.ToString();
}
void Test()

{
int i = 123;

string s = ConvertToString(i);
}

The compiler infers the type of the generic method from the type being
passed to the value parameter. A true int gets passed to the method and
ToString() gets called on the native type with no boxing.

So, you can see, for a little more effort on the part of the developer,
boxing can be avoided again.
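Tying this back to the original question: with generics, the OP's AppendDataItem pseudocode becomes a small typed buffer with no boxing anywhere. A sketch, not production code (no bounds checks; names follow the pseudocode, and List<T> already does all of this for you):

```csharp
using System;

class GrowableBuffer<T>
{
    private T[] data = new T[4];
    private int size;

    public int Size { get { return size; } }

    public void AppendDataItem(T item)
    {
        if (size == data.Length)
        {
            // Double the capacity and copy the elements across; the old
            // array simply becomes garbage, so there is no explicit
            // DeAllocate step in C# - the GC reclaims it.
            T[] temp = new T[data.Length * 2];
            Array.Copy(data, temp, size);
            data = temp;
        }
        data[size++] = item;
    }

    public T this[int index] { get { return data[index]; } }
}

class Demo
{
    static void Main()
    {
        GrowableBuffer<int> buffer = new GrowableBuffer<int>();
        for (int i = 0; i < 10; i++)
            buffer.AppendDataItem(i);  // no boxing: T is int throughout

        Console.WriteLine(buffer.Size);  // 10
        Console.WriteLine(buffer[9]);    // 9
    }
}
```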

Joanna

--
Joanna Carter [TeamB]
Consultant Software Engineer
Dec 22 '06 #82

"Joanna Carter [TeamB]" <jo****@not.for.spamwrote in message
news:%2****************@TK2MSFTNGP02.phx.gbl...
"Peter Olcott" <No****@SeeScreen.coma écrit dans le message de news:
8k********************@newsfe07.phx...

| If you call (for instance) myInt.ToString() then boxing isn't involved.

[...]

So, you can see, for a little more effort on the part of the developer,
boxing can be avoided again.

Joanna
This would now seem to make much more sense. As long as the unnecessary overhead
can be easily avoided, and its purpose is to make things a little easier on the
programmer in applications where performance is not important, then there is no
problem. Now that generics exist, the fast way is also just as easy. I was
pleasantly surprised to find that managed code in 2005 seemed to perform better
than native code under Visual C++ 6.0 on two critical benchmarks.


Dec 22 '06 #83
