Boxing and Unboxing ??

According to Troelsen in "C# and the .NET Platform"
"Boxing can be formally defined as the process of explicitly converting a value
type into a corresponding reference type."

I think that my biggest problem with this process is that the terms "value type"
and "reference type" mean something entirely different than what they mean on
every other platform in every other language. Normally a value type is the
actual data itself stored in memory, (such as an integer) and a reference type
is simply the address of this data.

It seems that .NET has made at least one of these two terms mean something
entirely different. Can someone please give me a quick overview of what the
terms "value type" and "reference type" actually mean in terms of their
underlying architecture?
Jan 12 '07
Peter Olcott wrote:
> So with Generics, Boxing and UnBoxing become obsolete?

Not in general.

Generics make boxing and unboxing obsolete in the context
of storing value types in collections.

Arne
Jan 14 '07 #31
Peter Olcott wrote:
> What I am looking for is all of the extra steps that form what is referred to as
> boxing and unboxing. In C/C++ converting a value type to a reference type is a
> very simple operation and I don't think that there are any runtime steps at all.
> All the steps are done at compile time. Likewise for converting a reference type
> to a value type.
>
> in C/C++
> int X = 56;
> int *Y = &X;
> Now both X and *Y hold 56, and Y is a reference to X;

That code is not equivalent to what we are discussing in C#.

In fact it does not really have any equivalent in C# (not using
unsafe code).

Arne
Jan 14 '07 #32

Peter Olcott wrote:
> "Barry Kelly" <ba***********@ gmail.com> wrote in message
> news:2d******** *************** *********@4ax.c om...
>> Peter Olcott wrote:
>>> "Barry Kelly" <ba***********@ gmail.com> wrote:
>>>
>>> So a member function can not add two integer members without unboxing them
>>> first? That would sound like horrendous design.
>>
>> If you carefully read what I wrote, you'll notice:
>>
>>>> From the point of view of C#, an integer (or any other value type) is
>>>> only boxed if it's been assigned to a location of type 'object' -
>>>> whether local variable, argument or field.
>>
>> You *cannot*, I repeat *CANNOT*, have two boxed integer members in C# -
>> the members would need to be of type *object*, not int, in order for
>> them to be boxed.
>>
>> -- Barry
>
> I carefully read it, yet, did not fully understand the meaning of all of the
> terminology that was used. For one thing, I don't see why there is ever any need
> for boxing and unboxing. I know that there is no such need in C++.

That's because in C# (and Java) you _can't_ say:

int x = 3;
int *p = &x;

because the "&" operator simply doesn't exist. You can't take the
address of an arbitrary variable.

> I also know that it must somehow support GC, and that is why it is needed.

Well, more to the point, a language that supports garbage collection
can't allow one to take addresses of arbitrary memory locations, as the
garbage collector could then never determine what objects were
referenced and which weren't (because an address into the midst of an
object would then be legal).

> Is it something like maintaining a chain of pointers indicating who owns what?

Well, sort of. The GC walks the stack and all static objects, looking
for references to objects on the heap. It then follows references
stored in those objects, etc, until it exhausts the network of
references. Any objects left thus unmarked are available for
collection.

Of course, it's rather more complex than that, but you get the idea. If
you allow references into the midst of objects, then it's much more
difficult to decide what is referenced and what isn't.

In C# and Java, every reference that you can directly manipulate in
code is to a valid object on the heap. That's why, if you want to treat
an int as an object (and thus have a reference to it) then the CLR has
to create an object wrapper for it and put it on the heap.
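
A minimal sketch of what that wrapper means in practice, assuming a hypothetical `BoxingDemo` class that does not appear anywhere in the thread. Assigning an int to a location of type object copies the value into a new heap object, so a later change to the original variable should not show through the box:

```csharp
using System;

public static class BoxingDemo
{
    // Returns the value seen through the box after the original variable changes.
    public static int BoxedCopyIsIndependent()
    {
        int x = 3;
        object boxed = x;   // boxing: the value 3 is copied into a new heap object
        x = 4;              // changes only the local variable, not the box
        return (int)boxed;  // unboxing: copies the value back out of the box
    }

    public static void Main()
    {
        Console.WriteLine(BoxedCopyIsIndependent()); // prints 3
    }
}
```

The box behaves like a snapshot of the value, not like a C pointer to `x`, which is why the `int *p = &x;` comparison does not carry over.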

Jan 14 '07 #33

"Barry Kelly" <ba***********@ gmail.com> wrote in message
news:bv******** *************** *********@4ax.c om...
> Peter Olcott wrote:
>>> You can, but in a strictly downwards (call stack) fashion, via the 'ref'
>>> modifier on arguments. You can't safely store the address.
>>
>> So I can call one member function from another member function of a different
>> class and pass the address of a struct to the second member function so that
>> the second member function can directly update the contents of the struct,
>> without any boxing and unboxing overhead?
>
> Yes.
>
>> If the answer is yes, then what is the
>> syntax for doing this?
>
> void Foo(ref MyStruct value) { } // declaration
>
> // ...
> MyStruct myStructValue;
> // ...
> Foo(ref myStructValue); // usage
>
> There is also an 'out', which is similar but (a) the argument need not be
> definitely assigned when passed in (but it will be definitely assigned
> after the call) and (b) it is treated as unassigned in the body of the
> method taking the parameter, and will be so treated until it's assigned
> (and must be assigned before the method returns).
>
> But be sure to measure that:
>
> 1) MyStruct being a struct (value type) is the right thing to do.
> Typically, if sizeof(MyStruct) is greater than (say) 16 bytes, it's
> looking like it might be too big. Of course, there are exceptions to
> this, like in all performance work. Measure, etc.
>
> 2) The savings by passing by-ref outweigh the fact that it's a mutable
> reference. In other words, beware that there's no const by-ref
> mechanism.

There is no inherent reason why this could not be added to the language as a
compile time feature later on. It might be simpler to stick with the established
convention and simply make an [in] equivalent of an [out] parameter, instead of
using the somewhat less obvious [const]. There would be no reason to distinguish
between [in] by reference and [in] by value, they could all be passed by
reference, or anything larger than [int] could always be passed by reference.

It is good to know that aggregate data can be passed by reference without the
boxing and unboxing overhead, if need be.

> -- Barry
>
> --
> http://barrkel.blogspot.com/
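
A minimal sketch of the 'ref' and 'out' syntax discussed in this post, written out as a complete program. The `Point` struct and the `RefDemo` class are hypothetical names chosen for illustration; they stand in for `MyStruct` and `Foo` above:

```csharp
using System;

public struct Point   // hypothetical value type, for illustration only
{
    public int X;
    public int Y;
}

public static class RefDemo
{
    // 'ref': the caller's struct is updated in place; no copy and no boxing.
    public static void MoveRight(ref Point p, int dx)
    {
        p.X += dx;
    }

    // 'out': the method must assign the parameter before it returns.
    public static void MakeOrigin(out Point p)
    {
        p = new Point();
    }

    public static void Main()
    {
        Point a = new Point();
        a.X = 1;
        a.Y = 2;
        MoveRight(ref a, 10);           // 'ref' is required at the call site too
        Console.WriteLine(a.X);         // prints 11

        Point b;                        // not assigned yet; allowed for 'out'
        MakeOrigin(out b);
        Console.WriteLine(b.X + b.Y);   // prints 0
    }
}
```

Because `MoveRight` lives in a different class from the caller's variable, this also covers the "member function of a different class" case asked about above.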

Jan 14 '07 #34

"Barry Kelly" <ba***********@ gmail.com> wrote in message
news:u9******** *************** *********@4ax.c om...
> Peter Olcott wrote:
>> I carefully read it, yet, did not fully understand the meaning of all of the
>> terminology that was used. For one thing, I don't see why there is ever any
>> need for boxing and unboxing.
>
> Consider how these things would be implemented in a memory-safe[1]
> manner without boxing (whether manual boxing like Java 1.4, or
> autoboxing like C# and Java 1.5+):
>
> * IEnumerable
> * ArrayList or List<object> (take your pick)
> * Component.Tag
>
>> I know that there is no such need in C++.
>
> If you try to create a C++ analogue of IEnumerable in a memory-safe way,
> you'll need to reinvent boxing. In other words, you'll need some way of
> unifying all values into some interface that can be queried for type and
> safely converted into its actual value.
>
> And just because a feature is useful doesn't mean that it is necessary.
> C++ isn't memory-safe.
>
>> I also know
>> that it must somehow support GC, and that is why it is needed.
>
> GC is an orthogonal issue to boxing per se. Autoboxing, however,
> requires some kind of GC, even if it's as dumb as reference counting, if
> it's to be sane (IMHO).
>
>> I don't see how it supports GC. Is it something like maintaining a chain
>> of pointers indicating who owns what?
>
> GC in no way requires boxing. GC follows the same references you program
> with. There are no magic references behind the scenes.
>
> [1] By "memory-safe", I mean that it's provably impossible to violate
> the language's memory model. See e.g. type safety on Wikipedia for more
> info:
>
> http://en.wikipedia.org/wiki/Type_safety

A strongly typed language like C++ effectively prevents any accidental type
errors, why bother with more than this?

> -- Barry
>
> --
> http://barrkel.blogspot.com/

Jan 14 '07 #35

"Arne Vajhøj" <ar**@vajhoej.d k> wrote in message
news:45******** *************** @news.sunsite.d k...
> Peter Olcott wrote:
>> What I am looking for is all of the extra steps that form what is referred to
>> as boxing and unboxing. In C/C++ converting a value type to a reference type
>> is a very simple operation and I don't think that there are any runtime steps
>> at all. All the steps are done at compile time. Likewise for converting a
>> reference type to a value type.
>>
>> in C/C++
>> int X = 56;
>> int *Y = &X;
>> Now both X and *Y hold 56, and Y is a reference to X;
>
> That code is not equivalent to what we are discussing in C#.
>
> In fact it does not really have any equivalent in C# (not using
> unsafe code).
>
> Arne

Couldn't there possibly be a way to create safe code that does not ever require
any extra runtime overhead? Couldn't all the safety checking somehow be done at
compile time?
Jan 14 '07 #36

"Bruce Wood" <br*******@cana da.com> wrote in message
news:11******** **************@ a75g2000cwd.goo glegroups.com.. .
> Peter Olcott wrote:
>> "Barry Kelly" <ba***********@ gmail.com> wrote in message
>> news:2d******* *************** **********@4ax. com...
>>> Peter Olcott wrote:
>>>> "Barry Kelly" <ba***********@ gmail.com> wrote:
>>>>
>>>> So a member function can not add two integer members without unboxing them
>>>> first? That would sound like horrendous design.
>>>
>>> If you carefully read what I wrote, you'll notice:
>>>
>>>>> From the point of view of C#, an integer (or any other value type) is
>>>>> only boxed if it's been assigned to a location of type 'object' -
>>>>> whether local variable, argument or field.
>>>
>>> You *cannot*, I repeat *CANNOT*, have two boxed integer members in C# -
>>> the members would need to be of type *object*, not int, in order for
>>> them to be boxed.
>>>
>>> -- Barry
>>
>> I carefully read it, yet, did not fully understand the meaning of all of the
>> terminology that was used. For one thing, I don't see why there is ever any
>> need for boxing and unboxing. I know that there is no such need in C++.
>
> That's because in C# (and Java) you _can't_ say:
>
> int x = 3;
> int *p = &x;
>
> because the "&" operator simply doesn't exist. You can't take the
> address of an arbitrary variable.

I still don't see any reason why a completely type safe language can not be
constructed without the need for any runtime overhead. You could even allow
constructs such as the above, and still be completely type safe, merely disallow
type casting.

>> I also know that it must somehow support GC, and that is why it is needed.
>
> Well, more to the point, a language that supports garbage collection
> can't allow one to take addresses of arbitrary memory locations, as the
> garbage collector could then never determine what objects were
> referenced and which weren't (because an address into the midst of an
> object would then be legal).

It could do this, but, then you have the issue of reference counting, more extra
overhead. You don't have this problem when data is simply passed by address with
no assignment to another pointer variable.

>> Is it something like maintaining a chain of pointers indicating who owns
>> what?
>
> Well, sort of. The GC walks the stack and all static objects, looking
> for references to objects on the heap. It then follows references
> stored in those objects, etc, until it exhausts the network of
> references. Any objects left thus unmarked are available for
> collection.

Global data is disallowed?

> Of course, it's rather more complex than that, but you get the idea. If
> you allow references into the midst of objects, then it's much more
> difficult to decide what is referenced and what isn't.
>
> In C# and Java, every reference that you can directly manipulate in
> code is to a valid object on the heap. That's why, if you want to treat
> an int as an object (and thus have a reference to it) then the CLR has
> to create an object wrapper for it and put it on the heap.

I still don't see any need for a wrapper. Do you mean for reference counting?
Jan 14 '07 #37
Peter Olcott wrote:
>> [1] By "memory-safe", I mean that it's provably impossible to violate
>> the language's memory model. See e.g. type safety on Wikipedia for more
>> info:
>>
>> http://en.wikipedia.org/wiki/Type_safety
>
> A strongly typed language like C++ effectively prevents any accidental type
> errors, why bother with more than this?

Because this also prevents *intentional* type errors, which is
important for running code in a sandbox. Your web browser can guarantee
that a Java applet embedded into a page won't crash your system or
delete all your files, because Java enforces type safety at all levels;
this is the same sort of thing.

Jesse

Jan 14 '07 #38
Peter Olcott wrote:
> "Bruce Wood" <br*******@cana da.com> wrote in message
> news:11******** **************@ a75g2000cwd.goo glegroups.com.. .
> [...]
>> That's because in C# (and Java) you _can't_ say:
>>
>> int x = 3;
>> int *p = &x;
>>
>> because the "&" operator simply doesn't exist. You can't take the
>> address of an arbitrary variable.
>
> I still don't see any reason why a completely type safe language can not be
> constructed without the need for any runtime overhead. You could even allow
> constructs such as the above, and still be completely type safe, merely disallow
> type casting.

If you disallow type casting, you neuter the language. You need to be
able to cast instances of derived classes to their bases and back. You
can do the first kind of cast without any runtime overhead, but you
need *some* runtime overhead to cast a base instance to its actual
derived class, even in C++ with dynamic_cast<>.

(The overhead in C++ isn't for performing the actual cast, but for
verifying that the cast is valid - that the object actually belongs to
the class you're casting it to. The same is usually true in C#, but for
unboxing casts there's also overhead for copying the value out of its
box.)
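
A minimal sketch of the validity check described above, as it shows up for unboxing casts. The `UnboxDemo` class and `TryUnboxLong` helper are hypothetical names used only for illustration. Unboxing needs the exact boxed type, so a boxed int cannot be unboxed directly as a long:

```csharp
using System;

public static class UnboxDemo
{
    // Tries to unbox 'boxed' as a long; reports failure instead of throwing.
    public static bool TryUnboxLong(object boxed, out long value)
    {
        try
        {
            value = (long)boxed;   // runtime check: is this really a boxed long?
            return true;
        }
        catch (InvalidCastException)
        {
            value = 0;
            return false;
        }
    }

    public static void Main()
    {
        object boxedInt = 42;        // boxes an int
        int back = (int)boxedInt;    // check passes, then the value is copied out
        Console.WriteLine(back);     // prints 42

        long result;
        Console.WriteLine(TryUnboxLong(boxedInt, out result));   // prints False
        Console.WriteLine(TryUnboxLong(42L, out result));        // prints True
    }
}
```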
>> Well, more to the point, a language that supports garbage collection
>> can't allow one to take addresses of arbitrary memory locations, as the
>> garbage collector could then never determine what objects were
>> referenced and which weren't (because an address into the midst of an
>> object would then be legal).
>
> It could do this, but, then you have the issue of reference counting, more extra
> overhead. You don't have this problem when data is simply passed by address with
> no assignment to another pointer variable.

The desire to avoid that overhead (as well as other problems with
reference counting) is, presumably, why .NET uses a garbage collector
instead.

>> Well, sort of. The GC walks the stack and all static objects, looking
>> for references to objects on the heap. It then follows references
>> stored in those objects, etc, until it exhausts the network of
>> references. Any objects left thus unmarked are available for
>> collection.
>
> Global data is disallowed?

No, that's what "static objects" refers to. In C#, you typically only
store global data by putting it in the static fields of a class. (There
are a couple other types of global data used with C++/CLI: bare global
variables and gcroots.)

>> Of course, it's rather more complex than that, but you get the idea. If
>> you allow references into the midst of objects, then it's much more
>> difficult to decide what is referenced and what isn't.
>>
>> In C# and Java, every reference that you can directly manipulate in
>> code is to a valid object on the heap. That's why, if you want to treat
>> an int as an object (and thus have a reference to it) then the CLR has
>> to create an object wrapper for it and put it on the heap.
>
> I still don't see any need for a wrapper. Do you mean for reference counting?

The wrapper is there so that the int on the heap can be treated like
any other object, with a type pointer, virtual methods, etc. If it were
just stored on the heap as a plain integer, there'd be no way for your
code (and more importantly, the garbage collector) to tell it apart
from a float or an object reference at runtime.

Boxing lets you write a method like this:

public static void PrintIt(object foo)
{
    Console.WriteLine("Thanks for this " + foo.GetType().Name + ": " +
        foo.ToString());
}

And then pass in *any* value, whether it's an integer, a structure, or
an object reference. An unboxed integer is just a number, with no type
information other than that stored in the compiler's internals; a boxed
integer is a full-fledged instance of a class derived from
System.Object.
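
A minimal usage sketch for a method like PrintIt, assuming a hypothetical `Celsius` struct and a `Describe` helper that builds the same message but returns it instead of printing it (neither name comes from the thread):

```csharp
using System;

public struct Celsius   // hypothetical struct, for illustration only
{
    public int Degrees;
    public override string ToString() { return Degrees + " C"; }
}

public static class PrintDemo
{
    // Same message as PrintIt in the post, returned so it can be inspected.
    public static string Describe(object foo)
    {
        return "Thanks for this " + foo.GetType().Name + ": " + foo.ToString();
    }

    public static void Main()
    {
        Celsius temp = new Celsius();
        temp.Degrees = 21;
        Console.WriteLine(Describe(42));        // int is boxed:    Thanks for this Int32: 42
        Console.WriteLine(Describe(temp));      // struct is boxed: Thanks for this Celsius: 21 C
        Console.WriteLine(Describe("hello"));   // no boxing:       Thanks for this String: hello
    }
}
```

The first two calls box their argument because the parameter type is object; the string is already a reference type, so it is passed as-is.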

Jesse

Jan 14 '07 #39

Peter Olcott wrote:
> "Bruce Wood" <br*******@cana da.com> wrote in message
> news:11******** **************@ a75g2000cwd.goo glegroups.com.. .
>> That's because in C# (and Java) you _can't_ say:
>>
>> int x = 3;
>> int *p = &x;
>>
>> because the "&" operator simply doesn't exist. You can't take the
>> address of an arbitrary variable.
>
> I still don't see any reason why a completely type safe language can not be
> constructed without the need for any runtime overhead. You could even allow
> constructs such as the above, and still be completely type safe, merely disallow
> type casting.
>
>>> I also know that it must somehow support GC, and that is why it is needed.
>>
>> Well, more to the point, a language that supports garbage collection
>> can't allow one to take addresses of arbitrary memory locations, as the
>> garbage collector could then never determine what objects were
>> referenced and which weren't (because an address into the midst of an
>> object would then be legal).
>
> It could do this, but, then you have the issue of reference counting, more extra
> overhead. You don't have this problem when data is simply passed by address with
> no assignment to another pointer variable.

C# supports pass-by-reference using the "ref" keyword.

However, I don't see how a language that allowed one to take the
address of arbitrary data could implement garbage collection. Even with
reference counting, the theory is that an _object_ counts references to
itself. An int, however, isn't an object. You're faced with the problem
of an object counting references to itself _or to a piece of data that it
holds_. How could you engineer a system whereby object A could keep
track of this sort of thing:

int *p = &(A.X);
int *q = p;

How does the object A now know that there are two references to it, p
and q, which point to a field inside A and not to A itself?

I don't see how you could automate this kind of reference counting,
even in C++, but then I'm no C++ guru.

>>> Is it something like maintaining a chain of pointers indicating who owns
>>> what?
>>
>> Well, sort of. The GC walks the stack and all static objects, looking
>> for references to objects on the heap. It then follows references
>> stored in those objects, etc, until it exhausts the network of
>> references. Any objects left thus unmarked are available for
>> collection.
>
> Global data is disallowed?

No. Global data is allowed. That's what I meant by "static".

>> Of course, it's rather more complex than that, but you get the idea. If
>> you allow references into the midst of objects, then it's much more
>> difficult to decide what is referenced and what isn't.
>>
>> In C# and Java, every reference that you can directly manipulate in
>> code is to a valid object on the heap. That's why, if you want to treat
>> an int as an object (and thus have a reference to it) then the CLR has
>> to create an object wrapper for it and put it on the heap.
>
> I still don't see any need for a wrapper. Do you mean for reference counting?

C# and Java don't do reference counting. They walk the network of
object references at garbage collection time. "Mark and sweep."
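
A toy sketch of the mark-and-sweep idea, assuming a hypothetical `Node` class that stands in for heap objects and a `ToyCollector` that only reports what it would reclaim. This is a simplified model for illustration, not how the CLR collector is actually implemented:

```csharp
using System;
using System.Collections.Generic;

// Toy model of a heap object: it only knows which other objects it references.
public class Node
{
    public string Name;
    public bool Marked;
    public List<Node> References = new List<Node>();
    public Node(string name) { Name = name; }
}

public static class ToyCollector
{
    // Mark phase: follow references from the roots until none are left.
    static void Mark(IEnumerable<Node> roots)
    {
        Stack<Node> pending = new Stack<Node>(roots);
        while (pending.Count > 0)
        {
            Node n = pending.Pop();
            if (n.Marked) continue;          // already visited
            n.Marked = true;
            foreach (Node child in n.References) pending.Push(child);
        }
    }

    // Sweep phase: anything left unmarked is unreachable and could be reclaimed.
    public static List<string> Collect(List<Node> heap, IEnumerable<Node> roots)
    {
        foreach (Node n in heap) n.Marked = false;
        Mark(roots);
        List<string> garbage = new List<string>();
        foreach (Node n in heap)
            if (!n.Marked) garbage.Add(n.Name);
        return garbage;
    }

    public static void Main()
    {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.References.Add(b);   // a -> b
        c.References.Add(d);   // c -> d
        d.References.Add(c);   // d -> c: a cycle that nothing else points to
        List<Node> heap = new List<Node> { a, b, c, d };

        // 'a' plays the role of a stack or static root.
        List<string> garbage = Collect(heap, new[] { a });
        Console.WriteLine(string.Join(", ", garbage));   // prints c, d
    }
}
```

The unreachable cycle between c and d is found without any per-object counters, which is one of the cases where plain reference counting runs into trouble.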

I guess a good summary would be to say that the more regular the
situation, the easier it is to write good code to deal with it. By
forcing every collectable object to be the same, and allowing
references only to objects on the heap (apart from pass-by-ref, which
doesn't enter into garbage collection), C# and Java make it easier on
the garbage collector, which allows the GC to be more efficient.

Once you open up the language to allow arbitrary addressing of objects
and the values within them, you create a nightmare situation for the
garbage collector. Not that a sufficiently clever team of people
couldn't do it, I suppose, but it adds a lot of additional complexity,
and one has to ask exactly what would be gained? Java has demonstrated
that you can write perfectly good code without the ability to take
arbitrary addresses, pointer arithmetic, and the other stuff that C and
C++ pointers provide. There are some domains where the power of C / C++
pointers is arguably a great boon, but for most programming problems it
isn't required. So, you don't lose very much, and you gain a much
simpler garbage collector and better run-time security.

And yes, in .NET 2.0 you can pretty much avoid boxing (and unboxing)
altogether. It was difficult in .NET 1.1 because all of the standard
collections were collections of Object, and so storing values in a
Hashtable or an ArrayList (roughly std::vector in C++) meant incurring boxing
overhead. Even in .NET 1.1, however, you could roll your own
collections that didn't box or unbox, but they had to be type-specific.
.NET 2.0's generics (aka templates in C++) eliminate this problem. I
wouldn't say that boxing is a thing of the past, but more than 90% of
boxing in .NET 1.1 was in collections, and that's no longer necessary.

So the runtime penalty is almost non-existent, assuming that you use
appropriate language constructs.
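
A minimal sketch of the difference between the two collection styles, assuming a hypothetical `CollectionsDemo` class. Both methods compute the same sum; only the first one boxes each int on the way in and unboxes it on the way out:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CollectionsDemo
{
    // .NET 1.1 style: ArrayList stores object, so each int is boxed on Add
    // and must be unboxed with a cast when read back.
    public static int SumArrayList(int count)
    {
        ArrayList list = new ArrayList();
        for (int i = 1; i <= count; i++) list.Add(i);   // boxes i
        int sum = 0;
        foreach (object o in list) sum += (int)o;       // unboxes
        return sum;
    }

    // .NET 2.0 style: List<int> stores the ints directly, with no boxing.
    public static int SumGenericList(int count)
    {
        List<int> list = new List<int>();
        for (int i = 1; i <= count; i++) list.Add(i);
        int sum = 0;
        foreach (int value in list) sum += value;
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumArrayList(100));     // prints 5050
        Console.WriteLine(SumGenericList(100));   // prints 5050
    }
}
```

How much the boxing in the first version costs depends on the workload, so it is worth measuring before rewriting existing code.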

Personally, I'm glad that arbitrary addressing was never put into Java
or C#. When I moved from C / C++ to Java I wondered how I would ever do
without the "&" operator, but I quickly realized that for the type of
software I write (business software) it really isn't needed. If,
however, I ever go back to writing real-time switching systems, I will
no doubt want C++ back again. Each tool has its uses, and C# is, in my
opinion, better suited to most day-to-day programming problems than is
C++. However, there are places that C# won't take you, where C++ is
much better suited.

Jan 14 '07 #40
