ArrayList vs. List<>

Zytan

The docs for List say "The List class is the generic equivalent of the
ArrayList class." Since List<> is strongly typed, and ArrayList has
no type (is that called weakly typed?), I would assume List<> is far
better. So, why do people use ArrayList so often? Am I missing
something? What's the difference between them?

Zytan

May 7 '07 #1

Jon Skeet [C# MVP]

Zytan <zy**********@gmail.com> wrote:

The docs for List say "The List class is the generic equivalent of the
ArrayList class." Since List<> is strongly typed, and ArrayList has
no type (is that called weakly typed?), I would assume List<> is far
better. So, why do people use ArrayList so often? Am I missing
something? What's the difference between them?

ArrayList was available from the start - List<T> only arrived in .NET
2.0, along with generics.

There are a lot of old samples, and samples which want to stay version-
neutral.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

May 7 '07 #2

Alvin Bruney [MVP]

There is a whole lot of performance difference between the two in favor of
lists. Mostly, people use arraylist because of habit and it is functionally
more familiar than lists are. Eventually, that should change though.

--
Regards,
Alvin Bruney
------------------------------------------------------
Shameless author plug
Excel Services for .NET is coming...
OWC Black book on Amazon and
www.lulu.com/owc
Professional VSTO 2005 - Wrox/Wiley
"Zytan" <zy**********@gmail.com> wrote in message
news:11**********************@q75g2000hsh.googlegroups.com...

The docs for List say "The List class is the generic equivalent of the
ArrayList class." Since List<> is strongly typed, and ArrayList has
no type (is that called weakly typed?), I would assume List<> is far
better. So, why do people use ArrayList so often? Am I missing
something? What's the difference between them?

Zytan

May 7 '07 #3

Crash

On May 7, 3:52 pm, Zytan <zytanlith...@gmail.com> wrote:

The docs for List say "The List class is the generic equivalent of the
ArrayList class." Since List<> is strongly typed, and ArrayList has
no type (is that called weakly typed?), I would assume List<> is far
better. So, why do people use ArrayList so often? Am I missing
something? What's the difference between them?

Zytan

Version 1.x of the .NET framework did not support generics, so you had
no option other than to use the non-type-safe ArrayList. Generics are
generally preferred as they promote type safety and also because they
eliminate the need to box/unbox collections of value types...
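
A minimal sketch of the type-safety and boxing point above. The class and method names (BoxingSketch, SumArrayList, SumGenericList) are hypothetical and only serve the illustration; the collection calls are the standard ArrayList and List<T> members.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical demo class; the names are illustrative only.
public static class BoxingSketch
{
    // ArrayList stores object references, so each int is boxed on Add
    // and has to be cast (unboxed) when it is read back.
    public static int SumArrayList()
    {
        ArrayList list = new ArrayList();
        list.Add(1);              // boxes the int
        list.Add(2);
        int sum = 0;
        foreach (object o in list)
        {
            sum += (int)o;        // unboxes; fails at run time if o is not an int
        }
        return sum;
    }

    // List<int> stores the ints directly: no boxing and no cast.
    public static int SumGenericList()
    {
        List<int> list = new List<int>();
        list.Add(1);
        list.Add(2);
        // list.Add("three");     // would not compile: checked at compile time
        int sum = 0;
        foreach (int i in list)
        {
            sum += i;
        }
        return sum;
    }
}
```

Both methods return the same sum; the difference is where the type check happens and whether boxing is involved.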

May 7 '07 #4

Zytan

ArrayList was available from the start - List<T> only arrived in .NET
2.0, along with generics.

There are a lot of old samples, and samples which want to stay version-
neutral.

Yes, tell me about it. But I suppose that's due to the quick
popularity of the language, so I can't be mad at that.

So, you are saying that List<T> is indeed better? Was I right about
that?

Zytan

May 7 '07 #5

Zytan

There is a whole lot of performance difference between the two in favor of
lists. Mostly, people use arraylist because of habit and it is functionally
more familiar than lists are. Eventually, that should change though.

Alvin, thanks, so List<T> is faster, type safe (strongly typed), and
easier to use, so, hm, which should I use? haha. thanks

Zytan

May 7 '07 #6

Zytan

Version 1.x of the .NET framework did not support generics so you had
no option other than to use the non-type safe ArrayList. Generics are
generally preferred as they promote type safety and also because they
eliminate the need to box/unbox collections of value type...

Ok, thanks for the reply!

Zytan

May 7 '07 #7

Peter Duniho

On Mon, 07 May 2007 15:52:35 -0700, Zytan <zy**********@gmail.com> wrote:

The docs for List say "The List class is the generic equivalent of the
ArrayList class." Since List<> is strongly typed, and ArrayList has
no type (is that called weakly typed?), I would assume List<> is far
better. So, why do people use ArrayList so often? Am I missing
something? What's the difference between them?

The difference? The List class is the generic equivalent of the ArrayList
class. :)

The only reason I can think of off the top of my head to use the ArrayList
class is when you need to pass one to an existing .NET method. Otherwise,
I'd use the List<> class (or even its relatives, such as LinkedList<>).

Have I mentioned lately how much I love generics in C#? :)

Pete

May 7 '07 #8

Carl Daniel [VC++ MVP]

Peter Duniho wrote:

Have I mentioned lately how much I love generics in C#? :)

... just don't try to do generic math in them. :) (Yes, there are ways,
but it's a pain).

-cd

May 8 '07 #9

Jon Harrop

Zytan wrote:

is that called weakly typed?

No, that is called "dynamically" typed, meaning type checks are deferred
until run-time, slowing down both development and execution. The antonym
is "statically typed", meaning the types are checked at compile time and
run-time type checks are removed.

Python, Ruby, Perl, Lisp and Scheme are dynamically typed languages. C, C++,
C#, Java (mainly), F#, OCaml and Haskell are statically typed languages.
The last three use state-of-the-art techniques to combine the brevity of
dynamic typing with the performance of static typing.

Strong and weak typing refers to the ability to misinterpret a value as
being of another type. For example, you can accidentally get the 64-bit int
representing the bits of a 64-bit float in the C language. Modern languages
almost always try to be strongly typed to avoid such errors. Then you must
explicitly invoke a function to get at the bits of a floating point number.
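
As a small sketch of that last point in C#: an ordinary cast converts the number, while reading the raw bits takes an explicit call. The class and method names are hypothetical; BitConverter.DoubleToInt64Bits is the standard framework method.

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
public static class FloatBitsSketch
{
    // Numeric conversion: the value 1.0 becomes the integer 1.
    public static long ConvertedValue()
    {
        double d = 1.0;
        return (long)d;
    }

    // Explicit reinterpretation: the IEEE 754 bit pattern of 1.0.
    public static long RawBits()
    {
        double d = 1.0;
        return BitConverter.DoubleToInt64Bits(d);
    }
}
```

The two results differ, which is the point: the language does not let the bits be read as an integer by accident.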

--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/product...ournal/?usenet

May 8 '07 #10

Marc Scheuner

>The only reason I can think of off the top of my head to use the ArrayList
>class is when you need to pass one to an existing .NET method.

Or if you need to have a list of objects which are not all of the same
type - that won't jive in a List<T> :-)

Marc

May 8 '07 #11

Göran Andersson

Marc Scheuner wrote:

>The only reason I can think of off the top of my head to use the ArrayList
class is when you need to pass one to an existing .NET method.

Or if you need to have a list of objects which are not all of the same
type - that won't jive in a List<T> :-)

Marc

Oh, but it will. Just make a List<object>. :)
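
A minimal sketch of that suggestion, with a hypothetical class name and sample values. Value types stored this way are still boxed, exactly as they would be in an ArrayList.

```csharp
using System.Collections.Generic;

// Hypothetical demo class; the names and values are illustrative only.
public static class MixedListSketch
{
    public static int CountStrings()
    {
        List<object> items = new List<object>();
        items.Add("text");
        items.Add(42);        // boxed, just as in an ArrayList
        items.Add(3.5);       // boxed as well

        int count = 0;
        foreach (object item in items)
        {
            if (item is string)   // run-time type test, since T is only object
            {
                count++;
            }
        }
        return count;
    }
}
```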

--
Göran Andersson
_____
http://www.guffa.com

May 8 '07 #12

Cor Ligthert [MVP]

"Alvin, thanks, so List<T> is faster,

No, it is in fact slower; a strongly typed type always has extra code.
However, it is type safe and helps you to make more secure programs.

If you use the ArrayList only inside a method with one attribute and let it
go out of scope afterwards, the ArrayList is probably the fastest.

Cor

May 8 '07 #13

Jon Skeet [C# MVP]

Zytan <zy**********@gmail.com> wrote:

ArrayList was available from the start - List<T> only arrived in .NET
2.0, along with generics.

There are a lot of old samples, and samples which want to stay version-
neutral.

Yes, tell me about it. But I suppose that's due to the quick
popularity of the language, so I can't be mad at that.

So, you are saying that List<T> is indeed better? Was I right about
that?

Yup. I can't think of any reason - off the top of my head, anyway - to
use ArrayList rather than List<T> if you know you *can* use List<T> in
all target environments.

May 8 '07 #14

Jon Skeet [C# MVP]

Jon Harrop <jo*@ffconsultancy.com> wrote:

Zytan wrote:
is that called weakly typed?

No, that is called "dynamically" typed, meaning type checks are deferred
until run-time, slowing down both development and execution. The antonym
is "statically typed", meaning the types are checked at compile time and
run-time type checks are removed.

Python, Ruby, Perl, Lisp and Scheme are dynamically typed languages. C, C++,
C#, Java (mainly), F#, OCaml and Haskell are statically typed languages.
The last three use state-of-the-art techniques to combine the brevity of
dynamic typing with the performance of static typing.

Strong and weak typing refers to the ability to misinterpret a value as
being of another type. For example, you can accidentally get the 64-bit int
representing the bits of a 64-bit float in the C language. Modern languages
almost always try to be strongly typed to avoid such errors. Then you must
explicitly invoke a function to get at the bits of a floating point number.

I've been looking into this recently, and the latter point is usually
called safe or unsafe typing, IMO. Strong vs weak typing is very poorly
defined - different people use it to mean all kinds of different
things.

A few useful links to read up on:
http://en.wikipedia.org/wiki/Strong_typing
http://en.wikipedia.org/wiki/Type_system
http://eli.thegreenplace.net/2006/11...yping-systems/

May 8 '07 #15

Jon Skeet [C# MVP]

Cor Ligthert [MVP] <no************@planet.nl> wrote:

"Alvin, thanks, so List<T> is faster,

No it is in fact slower, a strongly typed type has always extra code.
However it is type safe and helps you to make more secure programs.

It entirely depends on what you're storing in the ArrayList. If you're
storing value types, that will be slower and more memory hungry because
it will be boxing for every store and unboxing on every fetch.

For reference types, I would still expect it to be faster to use a
List<T> because of the lack of need for a runtime cast while fetching,
but I haven't tried it. I'd expect the difference in this case either
way to be smaller than the difference between using value types in an
arraylist and in a List<T>.
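
A minimal sketch of the reference-type case, with hypothetical class and method names. No boxing is involved for strings, but the ArrayList version still needs a checked cast on every fetch.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical demo class; the names are illustrative only.
public static class FetchSketch
{
    public static int LengthFromArrayList()
    {
        ArrayList names = new ArrayList();
        names.Add("Zytan");              // no boxing: string is a reference type
        string s = (string)names[0];     // cast is checked at run time
        return s.Length;
    }

    public static int LengthFromList()
    {
        List<string> names = new List<string>();
        names.Add("Zytan");
        string s = names[0];             // no cast: the element type is known
        return s.Length;
    }
}
```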

May 8 '07 #16

Christof Nordiek

"Cor Ligthert [MVP]" <no************@planet.nl> wrote in message
news:uS**************@TK2MSFTNGP04.phx.gbl...

"Alvin, thanks, so List<T> is faster,

No it is in fact slower, a strongly typed type has always extra code.

Do you really think so?
I can't imagine any reason why generic lists should be slower than
ArrayList.
Can you demonstrate this with some sample code?

I don't know where there would be extra code. The only difference is that
each method is compiled (from IL to native) once for each value type it is
used with, and once for all reference types (IIRC). But the use with value
types surely is faster with List<> than with ArrayList because of boxing
and unboxing.

Christof

May 8 '07 #17

Marc Gravell

Maybe for 1.1-style typed collections etc... but for generics?

Marc

May 8 '07 #18

Zytan

Have I mentioned lately how much I love generics in C#? :)

You're not the only one! :)

Zytan

May 8 '07 #19

Ben Voigt

"Marc Gravell" <ma**********@gmail.com> wrote in message
news:eY**************@TK2MSFTNGP02.phx.gbl...

Maybe for 1.1-style typed collections etc... but for generics?

Compile-time enforcement (ala generics) is by nature cheaper than run-time
enforcement. And just try using ArrayList without some sort of runtime
enforcement (you do need to cast back from object eventually, unless it's a
list of hashtable keys or some equally bizarre circumstance).

That said, the CLR with JIT blurs the line between compile-time and
run-time. All bets are off concerning which IL will verify faster. So if
it's only run once, it's anybody's guess which is faster. But if it only
runs once, why is anyone concerned about performance. If you have a loop,
the generic wins by a huge margin.

May 8 '07 #20

Zytan

http://en.wikipedia.org/wiki/Strong_typing

"Programming language expert Benjamin C. Pierce ... has said: "I spent
a few weeks... trying to sort out the terminology of "strongly typed,"
"statically typed," "safe," etc., and found it amazingly difficult....
The usage of these terms is so various as to render them almost
useless.""

Wow.

Zytan

May 8 '07 #21

Nicholas Paldino [.NET/C# MVP]

When storing value types in an ArrayList, the boxing overhead makes the
ArrayList perform approximately 2x as slow. For reference types, a List<T>
is still faster, but only by about 5-10%.
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP********************@msnews.microsoft.com...

Cor Ligthert [MVP] <no************@planet.nl> wrote:
"Alvin, thanks, so List<T> is faster,

No it is in fact slower, a strongly typed type has always extra code.
However it is type safe and helps you to make more secure programs.

It entirely depends on what you're storing in the ArrayList. If you're
storing value types, that will be slower and more memory hungry because
it will be boxing for every store and unboxing on every fetch.

For reference types, I would still expect it to be faster to use a
List<T> because of the lack of need for a runtime cast while fetching,
but I haven't tried it. I'd expect the difference in this case either
way to be smaller than the difference between using value types in an
arraylist and in a List<T>.

May 8 '07 #22

Jon Harrop

Cor Ligthert [MVP] wrote:

"Alvin, thanks, so List<T> is faster,

No it is in fact slower, a strongly typed type has always extra code.
However it is type safe and helps you to make more secure programs.

No. Static typing results in fewer run-time checks and faster code as well
as improved reliability.

May 8 '07 #23

Peter Duniho

On Mon, 07 May 2007 21:48:41 -0700, Marc Scheuner <no*****@for.me> wrote:

>The only reason I can think of off the top of my head to use the
>ArrayList class is when you need to pass one to an existing .NET
>method.

Or if you need to have a list of objects which are not all of the same
type - that won't jive in a List<T> :-)

As Göran points out, if all of the objects have the same base type, then
you just use List<T> where T is the least-derived type common to all of
the objects.

If all of the objects don't have the same base type, I would question
whether they really ought to be found in the same array. :)

And yes, I suppose you can always fall back to the Object base class, but
if that's all the objects have in common I think it would be unusual for
them to be collected together (one obvious exception being something that
involves itself specifically with the Object-ness of the objects, such as
the garbage collection for example).

Pete

May 8 '07 #24

Marc Scheuner

>Oh, but it will. Just make a List<object>. :)

True - but then you've basically thrown away all the benefits of a
type-safe list :-) I see your point though

Marc

May 9 '07 #25

Jon Davis

"Cor Ligthert [MVP]" <no************@planet.nl> wrote in message
news:uS**************@TK2MSFTNGP04.phx.gbl...

"Alvin, thanks, so List<T> is faster,

No it is in fact slower, a strongly typed type has always extra code.

Where did you come up with that idea? Strong typing enforces type checking,
but that's faster than unboxing. ArrayList is nothing but a List<object>
with different semantics. Returning a string that was boxed to System.Object
and then cast back to a string is slower than returning a string that was
never boxed.

Jon

May 16 '07 #26

Frans Bouma [C# MVP]

Jon Davis wrote:

"Cor Ligthert [MVP]" <no************@planet.nl> wrote in message
news:uS**************@TK2MSFTNGP04.phx.gbl...

"Alvin, thanks, so List<T> is faster,

No it is in fact slower, a strongly typed type has always extra
code.

Where did you come up with that idea? Strong typing enforces type
checking, but that's faster than unboxing. ArrayList is nothing but a
List<object> with different semantics. Returning a string that was
boxed to System.Object and then cast back to a string is slower
than returning a string that was never boxed.

Strings aren't boxed, they're reference types under the hood.

Generics aren't always faster. A good example is the Dictionary<K,V>
class, which is slower than the Hashtable in many situations.

FB

--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------

May 16 '07 #27

Jon Harrop

Frans Bouma [C# MVP] wrote:

Strings aren't boxed

Can you elaborate on this? Presumably they end up getting passed as a
pointer to a char array?

Generics aren't always faster. A good example is the Dictionary<K,V>
class, which is slower than the Hashtable in many situations.

I just timed this and the generic Dictionary was much faster than the
non-generic Hashtable for double->double mappings (timings are best of 3):

Insert:
Hashtable 31.0s
Dictionary 4.90s

Fetch:
Hashtable 6.09s
Dictionary 3.20s
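
The benchmark code itself was not posted, so the following is only a sketch of how such a comparison could be set up, assuming double->double mappings with sequential keys. The class name, method names and the Stopwatch-based timing are assumptions, and the numbers it produces will not match the timings above.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical benchmark sketch; not the code used for the timings above.
public static class MapTimingSketch
{
    // Inserts n double->double pairs, fetches them all, returns elapsed ms.
    public static long TimeHashtable(int n, out double total)
    {
        Stopwatch sw = Stopwatch.StartNew();
        Hashtable table = new Hashtable();
        for (int i = 0; i < n; i++)
        {
            table[(double)i] = (double)i;        // key and value are both boxed
        }
        total = 0.0;
        for (int i = 0; i < n; i++)
        {
            total += (double)table[(double)i];   // boxes the key, unboxes the value
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Same work with the generic Dictionary: no boxing, no casts.
    public static long TimeDictionary(int n, out double total)
    {
        Stopwatch sw = Stopwatch.StartNew();
        Dictionary<double, double> map = new Dictionary<double, double>();
        for (int i = 0; i < n; i++)
        {
            map[i] = i;
        }
        total = 0.0;
        for (int i = 0; i < n; i++)
        {
            total += map[i];
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

Each method should be run several times with a large n and the best result kept, as in the "best of 3" figures above; a single run includes JIT compilation time.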

Can you cite some benchmark results where Hashtable is faster?

May 16 '07 #28

Jon Skeet [C# MVP]

On May 16, 10:45 am, Jon Harrop <j...@ffconsultancy.com> wrote:

Strings aren't boxed

Can you elaborate on this? Presumably they end up getting passed as a
pointer to a char array?

They end up getting passed in the same way as any other reference
type. The reference is passed by value.

Strings aren't just reference types "under the hood" - they're
reference types from every perspective. They happen to be immutable,
of course, but that's a separate matter.

Jon

May 16 '07 #29

Jon Harrop

Jon Skeet [C# MVP] wrote:

The reference is passed by value.

Right. So in what sense are they not "boxed"?

May 16 '07 #30

Jon Harrop

Cor Ligthert [MVP] wrote:

"Alvin, thanks, so List<T> is faster,

No it is in fact slower

Can you cite anything to back this up? All I can find is overwhelming
evidence to the contrary.

a strongly typed type has always extra code.

Are you referring to IL or native code? Can you cite a reference on this
because it doesn't make any sense to me. For one thing, surely it depends
entirely upon the compilers.

However it is type safe

Are they not both type safe?

If you use the arraylist only inside a method with one attribute and let
it go out of scope afterwards, the arraylist is probably the fastest.

Why?

May 16 '07 #31

Christof Nordiek

"Jon Harrop" <jo*@ffconsultancy.com> wrote in message
news:46**********************@ptn-nntp-reader02.plus.net...

>The reference is passed by value.

Right. So in what sense are they not "boxed"?

They are "boxed" in the sense that any reference type is always "boxed":
they never occur unboxed.
So while you *could* say they are boxed, no boxing takes place when
casting from string to object, and no unboxing but only type checking takes
place when casting from object to string.

Note that the word "boxed" is seldom if ever used in the way I used it
above, but only in connection with value types, where boxing occurs.

Christof

May 16 '07 #32

Jon Skeet [C# MVP]

On May 16, 12:39 pm, Jon Harrop <j...@ffconsultancy.com> wrote:

The reference is passed by value.

Right. So in what sense are they not "boxed"?

In the sense that boxing only applies to value types in the first
place. The process of boxing takes a value type value, creates an
object on the heap containing that value, and returns a reference to
that object.

None of that happens when you're using a reference type to start with.
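
A minimal sketch of that distinction, with hypothetical class and method names. Boxing an int creates a separate heap object holding a copy, whereas assigning a string to an object variable only copies the reference.

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
public static class BoxCopySketch
{
    public static int BoxedValueAfterChange()
    {
        int i = 4;
        object boxed = i;    // boxing: new heap object holding a copy of 4
        i = 5;               // changes the local variable only
        return (int)boxed;   // unboxing: the boxed copy is still 4
    }

    public static bool StringConversionKeepsReference()
    {
        string s = "hello";
        object o = s;        // reference conversion: no new object is created
        return object.ReferenceEquals(s, o);
    }
}
```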

Jon

May 16 '07 #33

Jon Harrop

Jon Skeet [C# MVP] wrote:

In the sense that boxing only applies to value types in the first
place. The process of boxing takes a value type value, creates an
object on the heap containing that value, and returns a reference to
that object.

None of that happens when you're using a reference type to start with.

So C# programmers don't regard reference types as boxed. I didn't know that.

Frans' statement seems to have been about referential transparency then.

May 16 '07 #34

Jon Skeet [C# MVP]

Jon Harrop <jo*@ffconsultancy.com> wrote:

Jon Skeet [C# MVP] wrote:
In the sense that boxing only applies to value types in the first
place. The process of boxing takes a value type value, creates an
object on the heap containing that value, and returns a reference to
that object.

None of that happens when you're using a reference type to start with.

So C# programmers don't regard reference types as boxed. I didn't
know that.

Well, it's not just a C# thing - boxing/unboxing is a CLR concept too.
(FWIW, the same terminology applies in Java - I've never heard of
reference-to-reference conversions being described as boxing before.)

Frans' statement seems to have been about referential transparency
then.

Do you mean Jon Davis' statement? The only statement I saw Frans making
was to say that strings aren't boxed because they're already reference
types.

May 16 '07 #35

Jon Harrop

Jon Skeet [C# MVP] wrote:

Well, it's not just a C# thing - boxing/unboxing is a CLR concept too.
(FWIW, the same terminology applies in Java - I've never heard of
reference-to-reference conversions being described as boxing before.)

I'm from a functional programming background most recently and, to me,
boxing means referring to a block of memory by pointer and it is only used
in reference to the internal run-time representations of a language
implementation.

For example, if you write a complex number implementation in C# then (to me)
using a class results in boxing and using a struct avoids boxing, i.e. an
array of complex numbers is represented as a float array internally in the
case of a struct Complex, but an array of pointers to C-like structs in the
case of an object Complex.

Moreover, the ubiquity of referential transparency in functional programming
languages leads to the notion of boxing only being useful when discussing
performance: it makes no difference to the semantics whatsoever.

In the case of C# (and the CLR), the word "box" seems to have quite a
different meaning where (correct me if I'm wrong) some types have been
given special status as so-called "value" types (probably in the interests
of performance) that are not inherited from the almost-universal type
Object, and regaining the flexibility of a universal type by wrapping a
value type in an object is referred to as "boxing".

I've done a little C# and Java but most of my work is in OCaml, F# and C++.
This is all hidden in F# because all value types are immutable and
parametric polymorphism (generics) obviates the need for a universal type.
This does appear sometimes when you want to pass a parameter to a
dynamically-typed C# interface (e.g. Office automation) and you must
write "box 1" instead of "1" because the value type "int" does not unify
with the universal type "obj".

>Frans' statement seems to have been about referential transparency
then.

Do you mean Jon Davis' statement? The only statement I saw Frans making
was to say that strings aren't boxed because they're already reference
types.

Frans' statement really confused me because I understood the first half to
mean that strings are handled as an inline char array (unboxed) but the
second half means that strings are handled as a pointer to a char array.

In fact, I think he was saying that strings are immutable values passed by
reference. That's pretty much the simplest decent representation of a
string so it makes sense. If you wanted to handle huge strings with
variable-length chars then a more sophisticated data structure could be
swapped in transparently.

May 16 '07 #36

Jon Skeet [C# MVP]

Jon Harrop <jo*@ffconsultancy.com> wrote:

Jon Skeet [C# MVP] wrote:
Well, it's not just a C# thing - boxing/unboxing is a CLR concept too.
(FWIW, the same terminology applies in Java - I've never heard of
reference-to-reference conversions being described as boxing before.)

I'm from a functional programming background most recently and, to me,
boxing means referring to a block of memory by pointer and it is only used
in reference to the internal run-time representations of a language
implementation.

Right. It's certainly nothing like that.

In the case of C# (and the CLR), the word "box" seems to have quite a
different meaning where (correct me if I'm wrong) some types have been
given special status as so-called "value" types (probably in the interests
of performance) that are not inherited from the almost-universal type
Object, and regaining the flexibility of a universal type by wrapping a
value type in an object is referred to as "boxing".

Well, it's not just about performance. See
http://pobox.com/~skeet/csharp/memory.html
and
http://pobox.com/~skeet/csharp/references.html
for more description - but there may be "gotchas" in there where the
same word is used for different meanings in functional code.

Frans' statement seems to have been about referential transparency
then.
Do you mean Jon Davis' statement? The only statement I saw Frans making
was to say that strings aren't boxed because they're already reference
types.

Frans' statement really confused me because I understood the first half to
mean that strings are handled as an inline char array (unboxed) but the
second half means that strings are handled as a pointer to a char array.

In fact, I think he was saying that strings are immutable values passed by
reference. That's pretty much the simplest decent representation of a
string so it makes sense. If you wanted to handle huge strings with
variable-length chars then a more sophisticated data structure could be
swapped in transparently.

Strings are indeed immutable, but they're not passed by reference in
the formal description of those semantics. Instead, the reference is
passed by value.

See http://pobox.com/~skeet/csharp/parameters.html for a rather more
detailed description.

May 16 '07 #37

Barry Kelly

Jon Harrop wrote:

Jon Skeet [C# MVP] wrote:
Well, it's not just a C# thing - boxing/unboxing is a CLR concept too.
(FWIW, the same terminology applies in Java - I've never heard of
reference-to-reference conversions being described as boxing before.)

I'm from a functional programming background most recently and, to me,
boxing means referring to a block of memory by pointer and it is only used
in reference to the internal run-time representations of a language
implementation.

Because functional languages typically use immutable values in a GC
environment, whether a type is implemented as a value type passed by
value or by reference, or as a heap-allocated reference type passed by
value, it does indeed matter little which representation is used, and
boxing there would be an implementation detail. Not so when values can
be arbitrarily modified - modifying a copy is quite different from
modifying a reference.
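
A minimal sketch of why the representation matters once values are mutable. The two types and the class name are hypothetical; they differ only in being a struct versus a class.

```csharp
// Hypothetical types; they differ only in struct versus class.
public struct PointValue { public int X; }
public class PointRef { public int X; }

public static class CopySemanticsSketch
{
    public static int ModifyStructCopy()
    {
        PointValue a = new PointValue();
        a.X = 1;
        PointValue b = a;   // copies the whole value
        b.X = 2;            // changes the copy only
        return a.X;         // still 1
    }

    public static int ModifyThroughReference()
    {
        PointRef a = new PointRef();
        a.X = 1;
        PointRef b = a;     // copies the reference, not the object
        b.X = 2;            // changes the one shared object
        return a.X;         // now 2
    }
}
```

With immutable values neither assignment to X would be possible, so the two representations could not be told apart by the program's behavior.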

-- Barry

--
http://barrkel.blogspot.com/

May 16 '07 #38

Jon Harrop

Jon Skeet [C# MVP] wrote:

Strings are indeed immutable, but they're not passed by reference in
the formal description of those semantics. Instead, the reference is
passed by value.

Which brings me to another question: how is passing a reference by value not
pass-by-reference?

See http://pobox.com/~skeet/csharp/parameters.html for a rather more
detailed description.

I'll check it out, thanks for the links.

May 16 '07 #39

Jon Skeet [C# MVP]

Jon Harrop <jo*@ffconsultancy.com> wrote:

Jon Skeet [C# MVP] wrote:
Strings are indeed immutable, but they're not passed by reference in
the formal description of those semantics. Instead, the reference is
passed by value.

Which brings me to another question: how is passing a reference by value not
pass-by-reference?

See http://pobox.com/~skeet/csharp/parameters.html for a rather more
detailed description.

I'll check it out, thanks for the links.

The last link should answer your previous question - if it doesn't,
I'll explain in more detail (and update the article). Please let me
know either way!

May 16 '07 #40

Peter Duniho

On Wed, 16 May 2007 15:19:47 -0700, Jon Harrop <jo*@ffconsultancy.com>
wrote:

Which brings me to another question: how is passing a reference by value
not pass-by-reference?

In some sense it is. Consider, for example, the C way of doing things in
which everything is really passed by value, but you can effectively pass
something by reference by explicitly passing a reference to it. But of
course when you did that, you had to explicitly dereference the passed in
reference value. Even though C always passes things by value, you could
still essentially pass things by reference by doing it explicitly:

    #include <stdio.h>

    void SomeFunction(int *pi)
    {
        // The parameter is "pi"...to modify the original, you need to
        // explicitly dereference "pi" using the "*" operator
        *pi = 5;
    }

    void OtherFunction()
    {
        int i = 4;

        SomeFunction(&i);
        printf("%d\r\n", i);
    }

Output:
5

Later on, we got true "passing by reference" in C++, where you simply used
the parameter identifier, but it was a reference to the original value
from the caller. This is like using the "Var" keyword in Pascal, for
example. And, of course, like using the "ref" or "out" keywords in C#.
For example:

    void SomeFunction(int &i)
    {
        // The parameter is "i"...no extra work is necessary to modify
        // the original.
        i = 5;
    }

    void OtherFunction()
    {
        int i = 4;

        // Pass the variable itself; the reference parameter binds to it.
        SomeFunction(i);
        printf("%d\r\n", i);
    }

Output:
5

IMHO, one very nice thing about distinguishing between passing a reference
and passing BY reference is that it makes it simpler when you are passing
a reference to a reference. In C, you have to do this by passing a
pointer to a pointer, which makes some people's heads hurt. Granted, maybe
passing references by reference in C# makes some people's heads hurt too,
but I think maybe it makes fewer people's heads hurt. :)

All that said, note that I only said "in some sense". I don't believe
it's true in the most useful sense...just in a certain way of looking at
it. In the most useful sense, passing a reference by value is definitely
*not* the same as passing by reference. That is, when you pass a
reference by value, the object to which the reference refers can be
directly modified. But the original reference that you passed in is *not*
changeable. A copy of that reference was made, and if the called function
changes the reference itself, that change will be reflected only within
the called function.

Only by passing a reference by reference can the called function modify
the original reference. And that's the difference between passing a
reference by value and passing it by reference.

For what it's worth, I still think Jon's write-up is a great place to
figure this stuff out, but since I went to the trouble of writing my own
little summary in a previous post, I am copying it here in case you find
it useful:

---- begin quoted text ----
By default passing is always "by value". The "ref" and "out" keywords
cause things to be passed "by reference". Not to be confused with the
difference between "value types" and "reference types". Again, I think
Jon's write-up does a good job of outlining the various combinations of
value and reference types being passed by value and by reference. A quick
summary though:

1) pass value type by value: a copy of the value type is made and
passed to the method
example:
    void method1(int x)
    {
        x = 5;
    }
    void method2()
    {
        int y = 4;

        method1(y);
        Console.WriteLine(y);
    }

output:
4

2) pass reference type by value: a copy of the reference is made and
passed to the method, but it's important to note that it's the *reference*
being copied, not the object itself. So unless the reference type is
immutable (like String, for example), the object itself can be modified by
the called method (but the reference to the object cannot be)
example:
    class A
    {
        public int i;

        public A(int j)
        {
            i = j;
        }
    }
    void method1(A x)
    {
        x.i = 5;        // modifies the caller's object
        x = new A(6);   // rebinds only the local copy of the reference
    }
    void method2()
    {
        A y = new A(4);

        method1(y);
        Console.WriteLine(y.i);
    }

output:
5

3) pass value type by reference: a reference to the value type is
passed, and the called method can modify the original value type instance
example:
    void method1(ref int x)
    {
        x = 5;
    }
    void method2()
    {
        int y = 4;

        method1(ref y);
        Console.WriteLine(y);
    }

output:
5

4) pass reference type by reference: a reference to the reference is
made; not only can the reference type object be modified as in case #2, but
the reference *to* that object can be modified as well.
example:
    class A
    {
        public int i;

        public A(int j)
        {
            i = j;
        }
    }
    void method1(ref A x)
    {
        x.i = 5;        // modifies the caller's original object
        x = new A(6);   // rebinds the caller's variable to a new object
    }
    void method2()
    {
        A y = new A(4);

        method1(ref y);
        Console.WriteLine(y.i);
    }

output:
6
---- end quoted text ----
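
Case 2 above mentions immutable reference types like String. A small sketch of what that means in practice follows. ImmutabilityDemo and its method names are invented for the example; String and StringBuilder are the real framework types.

```csharp
using System;
using System.Text;

static class ImmutabilityDemo
{
    // String is a reference type, but nothing can change its contents:
    // "+=" builds a new string and rebinds only the local copy of the reference.
    public static void Append(string s)
    {
        s += " world";
    }

    // StringBuilder is mutable, so the caller sees the change made through the copy.
    public static void Append(StringBuilder sb)
    {
        sb.Append(" world");
    }

    // With "ref", rebinding the parameter rebinds the caller's variable too.
    public static void AppendByRef(ref string s)
    {
        s += " world";
    }

    static void Main()
    {
        string s = "hello";
        Append(s);
        Console.WriteLine(s);   // prints hello

        StringBuilder sb = new StringBuilder("hello");
        Append(sb);
        Console.WriteLine(sb);  // prints hello world

        AppendByRef(ref s);
        Console.WriteLine(s);   // prints hello world
    }
}
```

So an immutable reference type passed by value behaves, from the caller's point of view, much like a value type passed by value, even though only a reference is copied.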

Pete

May 16 '07 #41

Jon Davis

Off-topic,

I think we should start a Jon with no 'h' club.

Jon

May 17 '07 #42

Frans Bouma [C# MVP]

Jon Harrop wrote:

Jon Skeet [C# MVP] wrote:
In the sense that boxing only applies to value types in the first
place. The process of boxing takes a value type value, creates an
object on the heap containing that value, and returns a reference to
that object.

None of that happens when you're using a reference type to start
with.

So C# programmers don't regard reference types as boxed. I didn't
know that.

Frans' statement seems to have been about referential transparency
then.

What I wanted to make clear is that a string isn't a value type,
although it acts like one. People often mistakenly think a string is a
value type and, since value types are 'boxed' when added to an
object-typed data structure like ArrayList, they assume strings are
boxed as well and thus incur a performance hit. This isn't the case:
strings aren't value types, so no boxing takes place. 'Boxing' in this
context is the objectification of a value type, which doesn't happen
here, so there is no performance loss.

That was the cause of my reply :).
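
One way to see the difference, as a minimal sketch: add the same int and the same string to an ArrayList twice and compare the stored objects. BoxingDemo and its method names are made up for the illustration; ArrayList and object.ReferenceEquals are the real framework members.

```csharp
using System;
using System.Collections;

static class BoxingDemo
{
    // An int is a value type: every Add boxes it into a fresh heap object.
    public static bool IntSlotsShareObject()
    {
        int n = 42;
        ArrayList list = new ArrayList();
        list.Add(n);   // box #1
        list.Add(n);   // box #2, a different object holding the same value
        return object.ReferenceEquals(list[0], list[1]);
    }

    // A string is a reference type: Add stores the existing reference, no boxing.
    public static bool StringSlotsShareObject()
    {
        string s = "hello";
        ArrayList list = new ArrayList();
        list.Add(s);
        list.Add(s);
        return object.ReferenceEquals(list[0], list[1]);
    }

    static void Main()
    {
        Console.WriteLine(IntSlotsShareObject());     // prints False
        Console.WriteLine(StringSlotsShareObject());  // prints True
    }
}
```

With List<int> the values would be stored directly in the list's internal int[] array, so there are no boxes to compare in the first place.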

FB

--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------

May 17 '07 #43

Frans Bouma [C# MVP]

Jon Harrop wrote:

Frans Bouma [C# MVP] wrote:
Strings aren't boxed

Can you elaborate on this? Presumably they end up getting passed as a
pointer to a char array?

see my reply deeper in the thread and Jon's explanation.

Generics aren't always faster. A good example is the Dictionary<K,V>
class, which is slower than the Hashtable in many situations.

I just timed this and the generic Dictionary was much faster than the
non-generic Hashtable for double->double mappings (timings are best
of 3):

Insert:
Hashtable 31.0s
Dictionary 4.90s

Fetch:
Hashtable 6.09s
Dictionary 3.20s

Can you cite some benchmark results where Hashtable is faster?

With objects I see different results:
Iteration: 0
1000000 inserts into Hashtable took: 1587ms
1000000 inserts into dictionary took: 1256ms
Iteration: 1
1000000 inserts into Hashtable took: 1737ms
1000000 inserts into dictionary took: 1248ms
Iteration: 2
1000000 inserts into Hashtable took: 1612ms
1000000 inserts into dictionary took: 1844ms

code:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;

public class A
{
    private string _foo;

    public string Foo
    {
        get
        {
            return _foo;
        }
        set
        {
            _foo = value;
        }
    }
}

public class Program
{
    public static void Main(string[] args)
    {
        Stopwatch sw = new Stopwatch();

        int amount = 1000000;

        for(int l = 0; l < 3; l++)
        {
            sw.Reset();
            Console.WriteLine("Iteration: {0}", l);

            // Both containers are pre-sized, so no resizing is measured.
            Hashtable ht = new Hashtable(amount);
            Dictionary<string, A> dt = new Dictionary<string, A>(amount);

            // Time the Hashtable inserts.
            sw.Start();
            for(int i = 0; i < amount; i++)
            {
                string name = i.ToString();
                A a = new A();
                a.Foo = name;
                ht.Add(name, a);
            }

            sw.Stop();
            Console.WriteLine("{0} inserts into Hashtable took: {1}ms", amount,
                sw.ElapsedMilliseconds);

            // Time the Dictionary inserts.
            sw.Reset();
            sw.Start();
            for(int i = 0; i < amount; i++)
            {
                string name = i.ToString();
                A a = new A();
                a.Foo = name;
                dt.Add(name, a);
            }

            sw.Stop();
            Console.WriteLine("{0} inserts into dictionary took: {1}ms", amount,
                sw.ElapsedMilliseconds);
        }
    }
}

Though this isn't scientific. What I've seen in our framework is that
where we store fieldname-index tuples in a dictionary, it is slower in
profiling than with a hashtable. Which I found odd, of course, but I
couldn't explain why. I probably should have said "some" situations
instead of "many".
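
For anyone who wants to measure the fetch side themselves, here is a rough sketch of timing fieldname-to-index lookups in both containers. The field names ("Field0", "Field1", ...) and the LookupTiming helper are invented for the example, not taken from the framework, and the numbers will vary by machine and runtime, so none are claimed here.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;

static class LookupTiming
{
    // Hypothetical field names; the real keys are not shown in this thread.
    public static string[] MakeFieldNames(int count)
    {
        string[] names = new string[count];
        for (int i = 0; i < count; i++)
        {
            names[i] = "Field" + i;
        }
        return names;
    }

    // Hashtable stores each index as a boxed object, so every fetch needs a cast.
    public static long TimeHashtable(string[] names, int rounds, out long checksum)
    {
        Hashtable table = new Hashtable(names.Length);
        for (int i = 0; i < names.Length; i++)
        {
            table.Add(names[i], i);
        }

        checksum = 0;
        Stopwatch sw = Stopwatch.StartNew();
        for (int r = 0; r < rounds; r++)
        {
            foreach (string name in names)
            {
                checksum += (int)table[name];
            }
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Dictionary<string, int> stores the index unboxed.
    public static long TimeDictionary(string[] names, int rounds, out long checksum)
    {
        Dictionary<string, int> table = new Dictionary<string, int>(names.Length);
        for (int i = 0; i < names.Length; i++)
        {
            table.Add(names[i], i);
        }

        checksum = 0;
        Stopwatch sw = Stopwatch.StartNew();
        for (int r = 0; r < rounds; r++)
        {
            foreach (string name in names)
            {
                checksum += table[name];
            }
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        string[] names = MakeFieldNames(1000);
        long sumH, sumD;
        long msH = TimeHashtable(names, 1000, out sumH);
        long msD = TimeDictionary(names, 1000, out sumD);
        Console.WriteLine("Hashtable: {0}ms, Dictionary: {1}ms, same results: {2}",
            msH, msD, sumH == sumD);
    }
}
```

The checksum is only there so the lookups produce a result that is actually used, and so the two runs can be checked against each other.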

One serious bottleneck in the dictionary is binary serialization. It's
so incredibly slow, and produces so much more data than the hashtable,
that it's not even funny. For that I've written a 'FastDictionary':

using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

/// <summary>
/// Utility class which can be used instead of a normal Dictionary class
/// when the Dictionary class is serialized. This class is faster and has
/// much less overhead than the normal dictionary class, as it doesn't use
/// generic types.
/// </summary>
/// <typeparam name="TKey">Key type</typeparam>
/// <typeparam name="TValue">Value type</typeparam>
[Serializable]
public class FastDictionary<TKey, TValue> : Dictionary<TKey, TValue>
{
    #region Class Member Declarations
    [NonSerialized]
    private SerializationInfo _info;
    #endregion

    /// <summary>
    /// CTor
    /// </summary>
    /// <param name="capacity"></param>
    public FastDictionary( int capacity )
        : base( capacity )
    {
    }

    /// <summary>
    /// CTor
    /// </summary>
    public FastDictionary() : base()
    {
    }

    /// <summary>
    /// CTor
    /// </summary>
    /// <param name="d"></param>
    public FastDictionary(IDictionary<TKey, TValue> d)
        : base( d )
    {
    }

    /// <summary>
    /// Deserialization CTor
    /// </summary>
    /// <param name="info"></param>
    /// <param name="context"></param>
    protected FastDictionary(SerializationInfo info, StreamingContext context)
    {
        // Keep the info around; the entries are restored in OnDeserialization.
        _info = info;
    }

    /// <summary>
    /// Implements the <see cref="T:System.Runtime.Serialization.ISerializable"></see>
    /// interface and returns the data needed to serialize the
    /// <see cref="T:System.Collections.Generic.Dictionary`2"></see> instance.
    /// </summary>
    /// <param name="info">A <see cref="T:System.Runtime.Serialization.SerializationInfo"></see>
    /// object that contains the information required to serialize the
    /// <see cref="T:System.Collections.Generic.Dictionary`2"></see> instance.</param>
    /// <param name="context">A <see cref="T:System.Runtime.Serialization.StreamingContext"></see>
    /// structure that contains the source and destination of the serialized
    /// stream associated with the
    /// <see cref="T:System.Collections.Generic.Dictionary`2"></see> instance.</param>
    /// <exception cref="T:System.ArgumentNullException">info is null.</exception>
    public override void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        // Store keys and values as two plain object arrays.
        object[] keys = new object[base.Count];
        object[] values = new object[base.Count];

        int i = 0;
        foreach(KeyValuePair<TKey, TValue> pair in this)
        {
            keys[i] = pair.Key;
            values[i] = pair.Value;
            i++;
        }

        info.AddValue("keys", keys);
        info.AddValue("values", values);
    }

    /// <summary>
    /// Implements the <see cref="T:System.Runtime.Serialization.ISerializable"></see>
    /// interface and raises the deserialization event when the
    /// deserialization is complete.
    /// </summary>
    /// <param name="sender">The source of the deserialization event.</param>
    /// <exception cref="T:System.Runtime.Serialization.SerializationException">The
    /// <see cref="T:System.Runtime.Serialization.SerializationInfo"></see> object
    /// associated with the current
    /// <see cref="T:System.Collections.Generic.Dictionary`2"></see> instance is
    /// invalid.</exception>
    public override void OnDeserialization(object sender)
    {
        object[] keys = (object[])_info.GetValue("keys", typeof(object[]));
        object[] values = (object[])_info.GetValue("values", typeof(object[]));

        // Re-add every pair, casting back to the generic types.
        for(int i = 0; i < keys.Length; i++)
        {
            base.Add((TKey)keys[i], (TValue)values[i]);
        }
    }
}

FB

--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------

May 17 '07 #44

Alvin Bruney [MVP]

>In some sense it is. Consider, for example, the C way of doing things in
"C"? Is that still around? Yuck.

--
Regards,
Alvin Bruney
------------------------------------------------------
Shameless author plug
Excel Services for .NET is coming...
https://www.microsoft.com/MSPress/books/10933.aspx
OWC Black Book www.lulu.com/owc
Professional VSTO 2005 - Wrox/Wiley
"Peter Duniho" <Np*********@nnowslpianmk.com> wrote in message
news:op***************@petes-computer.local...
On Wed, 16 May 2007 15:19:47 -0700, Jon Harrop <jo*@ffconsultancy.com>
wrote:

Which brings me to another question: how is passing a reference by value
not pass-by-reference?

May 17 '07 #45
