473,407 Members | 2,312 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,407 software developers and data experts.

Efficiency about hashtable, arraylist, string and synchronization

Hi,

Here is the scenario. I have a list of IDs and there are multiple threads
trying to add/remove/read from this list. I can do in C#

1. create Hashtable hList = Hashtable.Synchronized(new Hashtable());
2. create ArrayList aList = ArrayList.Synchronized(new ArrayList());
3. create a string sList = "";

For 1 and 2, since the list is synced, many threads can directly
add/remvoe/read without worrying about the synchrionization.
For 3, I could just
lock(sList)
sList.Replace(id,"");
for removing.

lock(sList)
sList += id;
for addition.

lock(sList)
sList.Indexof(id) >= 0
for searching availability.

My question is

Which way is the thread-safe and the most efficient way to do my work?

If one method is preferred at certain list length, is there a breaking point
based on teh length of the list, by then one method may start to get better
than another one.

The situation is very simple but I face it in most of my applications. Very
interested on how MS implement it behind the scene and whehter any evaluation
is done on this.

Thanks a lot

Chris

Oct 27 '06 #1
10 8478
chrisben wrote:
Here is the scenario. I have a list of IDs and there are multiple threads
trying to add/remove/read from this list. I can do in C#

1. create Hashtable hList = Hashtable.Synchronized(new Hashtable());
2. create ArrayList aList = ArrayList.Synchronized(new ArrayList());
3. create a string sList = "";

For 1 and 2, since the list is synced, many threads can directly
add/remvoe/read without worrying about the synchrionization.
So long as you don't try to iterate through the list/table... only
*individual* operations are synchronized.
For 3, I could just
lock(sList)
sList.Replace(id,"");
for removing.
That would be a no-op - string.Replace returns a new string.
lock(sList)
sList += id;
for addition.
This changes sList to refer to a different string.
lock(sList)
sList.Indexof(id) >= 0
for searching availability.

My question is

Which way is the thread-safe and the most efficient way to do my work?
Don't use locking at all for strings. They're threadsafe in themselves,
but you need to be aware that anything which "changes" the string
doesn't actually change the string - it just retuns a new string.
If one method is preferred at certain list length, is there a breaking point
based on teh length of the list, by then one method may start to get better
than another one.

The situation is very simple but I face it in most of my applications. Very
interested on how MS implement it behind the scene and whehter any evaluation
is done on this.
If you *actually* want to synchronize operations on the *variable*
sList, you should create a separate lock object, and lock on that
whenever you want to change sList's value.

Jon

Oct 27 '06 #2
Hashtable, but not your way ;-p

The string option is bad. Don't go near it. Firstly you would need a
delimiter, but mainly because strings are immutable (and hence your code is
actually wrong). You would need sList = sList.Replace(id,""); but now you
have multiple lock objects; OK you could avoid this with a separate lock,
but you are still making *lots* of strings.

Personally I also wouldn't go near the .Synchronized options... I'd just:

lock(hList) {
hList.Add(id);
}

etc; this way you know the duration of the lock, so you can safely do
multiple steps as an atomic unit:
lock(hList) {
if(!hList.Contains(id)) {
hList.Add(id,id); // value is dummy
}
}

Note also that (assuming ID is an integer) this will involve boxing; if you
can, I would suggest moving to 2.0 and using a Dictionary<int,int>.

Marc
Oct 27 '06 #3
Thanks Jon. Actually, I used either arraylist and hashtable in my
applications, but now I think it may be less efficient especially if my list
is small.

I am wrong on string sample, should be like

lock(myObject)
sList = sList.Replace(id,"");

lock(myObject)
sList += id;

sList.Indexob(id) 0 (should I lock here?)
My feeling is that for a short list, efficiency will be

string ArrayList Hashtable

but I am not sure.

What do you think?

Thanks

Chris
"Jon Skeet [C# MVP]" wrote:
chrisben wrote:
Here is the scenario. I have a list of IDs and there are multiple threads
trying to add/remove/read from this list. I can do in C#

1. create Hashtable hList = Hashtable.Synchronized(new Hashtable());
2. create ArrayList aList = ArrayList.Synchronized(new ArrayList());
3. create a string sList = "";

For 1 and 2, since the list is synced, many threads can directly
add/remvoe/read without worrying about the synchrionization.

So long as you don't try to iterate through the list/table... only
*individual* operations are synchronized.
For 3, I could just
lock(sList)
sList.Replace(id,"");
for removing.

That would be a no-op - string.Replace returns a new string.
lock(sList)
sList += id;
for addition.

This changes sList to refer to a different string.
lock(sList)
sList.Indexof(id) >= 0
for searching availability.

My question is

Which way is the thread-safe and the most efficient way to do my work?

Don't use locking at all for strings. They're threadsafe in themselves,
but you need to be aware that anything which "changes" the string
doesn't actually change the string - it just retuns a new string.
If one method is preferred at certain list length, is there a breaking point
based on teh length of the list, by then one method may start to get better
than another one.

The situation is very simple but I face it in most of my applications. Very
interested on how MS implement it behind the scene and whehter any evaluation
is done on this.

If you *actually* want to synchronize operations on the *variable*
sList, you should create a separate lock object, and lock on that
whenever you want to change sList's value.

Jon

Oct 27 '06 #4
Thanks Marc. My ids are strings. For long list, I agree, but if it is only
3-5 ids one time (maybe 100 chars total at most), do you think string
approach is better? How about stringbuffer? would that be better?
Yes, my studio is still for 1.1 .NET :-(

"Marc Gravell" wrote:
Hashtable, but not your way ;-p

The string option is bad. Don't go near it. Firstly you would need a
delimiter, but mainly because strings are immutable (and hence your code is
actually wrong). You would need sList = sList.Replace(id,""); but now you
have multiple lock objects; OK you could avoid this with a separate lock,
but you are still making *lots* of strings.

Personally I also wouldn't go near the .Synchronized options... I'd just:

lock(hList) {
hList.Add(id);
}

etc; this way you know the duration of the lock, so you can safely do
multiple steps as an atomic unit:
lock(hList) {
if(!hList.Contains(id)) {
hList.Add(id,id); // value is dummy
}
}

Note also that (assuming ID is an integer) this will involve boxing; if you
can, I would suggest moving to 2.0 and using a Dictionary<int,int>.

Marc
Oct 27 '06 #5
No, I don't think that strings are a good solution here.

Yes, for very short sets, lists can be quicker.
If you are worried about different performance at different scales, then
look at HybridDictionary; this wee beastie starts life using a list
implementation, and if the list gets to the pivot point it switches to a
hashtable.

Best of both. But personally I still prefer to keep things simple and use a
Dictionary<,>; generally, if I am worried about performance, it is when the
list gets too *big*, not too small.

Marc
Oct 27 '06 #6

Here is my problem Marc.

My list is small, but I may have many objects (hundreds), and each one will
hold a list like this. Hastable and ArrayList are complex data structure (I
think), which will require more resource regardless how small your list is.
That is why I start to think about string or stringbuffer, which I assume
less in term of resource and little or no difference interm of performance on
manipulation. For a short list, I am wondering whether it will be an overkill
to use ArrayList or Hashtable, especially that I may have hundreds of the
objects which hold this list structure.

What do you think?

Thanks

Chris
"Marc Gravell" wrote:
No, I don't think that strings are a good solution here.

Yes, for very short sets, lists can be quicker.
If you are worried about different performance at different scales, then
look at HybridDictionary; this wee beastie starts life using a list
implementation, and if the list gets to the pivot point it switches to a
hashtable.

Best of both. But personally I still prefer to keep things simple and use a
Dictionary<,>; generally, if I am worried about performance, it is when the
list gets too *big*, not too small.

Marc
Oct 27 '06 #7
I think "hundreds" is barely worth worrying about given modern PCs; and no,
I don't advocate lazy memoery swamping code, but there is a middle ground.
Developer time is more expensive than the memory you are using, and doing
things a convoluted way is going to make it harder to develop, test,
support, migrate, etc...

Now, if that was hundreds of /thousands/ of object... then maybe... but even
then "keep it simple" is worth a lot. Plus I'd probably mention words like
"database".

In short, I wouldn't hesitate to start with the list/hashtable whatever
implementation, and then optimise it *if* profiling shows it to be an issue.
I doubt it would.

;-p

Marc

"chrisben" <ch******@discussions.microsoft.comwrote in message
news:0F**********************************@microsof t.com...
>
Here is my problem Marc.

My list is small, but I may have many objects (hundreds), and each one
will
hold a list like this. Hastable and ArrayList are complex data structure
(I
think), which will require more resource regardless how small your list
is.
That is why I start to think about string or stringbuffer, which I assume
less in term of resource and little or no difference interm of performance
on
manipulation. For a short list, I am wondering whether it will be an
overkill
to use ArrayList or Hashtable, especially that I may have hundreds of the
objects which hold this list structure.

What do you think?

Thanks

Chris
"Marc Gravell" wrote:
>No, I don't think that strings are a good solution here.

Yes, for very short sets, lists can be quicker.
If you are worried about different performance at different scales, then
look at HybridDictionary; this wee beastie starts life using a list
implementation, and if the list gets to the pivot point it switches to a
hashtable.

Best of both. But personally I still prefer to keep things simple and use
a
Dictionary<,>; generally, if I am worried about performance, it is when
the
list gets too *big*, not too small.

Marc

Oct 27 '06 #8
But, if you often change the list, you will allways copy the whole list and
you will create many new instances of string.
The overhead of ArrayList is mainly in the storage, the acces should be
almost as fast as your string solution. The adding and removing will be much
faster.
This could have only advantages, if the lists are static most the time and
occupie the most part of the memory your application uses.
In any case, i would recommend your string-solution only, if the application
really has performance problems and this way its remarkably better.

"chrisben" <ch******@discussions.microsoft.comschrieb im Newsbeitrag
news:0F**********************************@microsof t.com...
>
Here is my problem Marc.

My list is small, but I may have many objects (hundreds), and each one
will
hold a list like this. Hastable and ArrayList are complex data structure
(I
think), which will require more resource regardless how small your list
is.
That is why I start to think about string or stringbuffer, which I assume
less in term of resource and little or no difference interm of performance
on
manipulation. For a short list, I am wondering whether it will be an
overkill
to use ArrayList or Hashtable, especially that I may have hundreds of the
objects which hold this list structure.

What do you think?

Thanks

Chris
"Marc Gravell" wrote:
>No, I don't think that strings are a good solution here.

Yes, for very short sets, lists can be quicker.
If you are worried about different performance at different scales, then
look at HybridDictionary; this wee beastie starts life using a list
implementation, and if the list gets to the pivot point it switches to a
hashtable.

Best of both. But personally I still prefer to keep things simple and use
a
Dictionary<,>; generally, if I am worried about performance, it is when
the
list gets too *big*, not too small.

Marc

Oct 27 '06 #9
Thanks for the help. I am convinced to stay with Hash/Array plan.

Thanks for the help

"chrisben" wrote:
Hi,

Here is the scenario. I have a list of IDs and there are multiple threads
trying to add/remove/read from this list. I can do in C#

1. create Hashtable hList = Hashtable.Synchronized(new Hashtable());
2. create ArrayList aList = ArrayList.Synchronized(new ArrayList());
3. create a string sList = "";

For 1 and 2, since the list is synced, many threads can directly
add/remvoe/read without worrying about the synchrionization.
For 3, I could just
lock(sList)
sList.Replace(id,"");
for removing.

lock(sList)
sList += id;
for addition.

lock(sList)
sList.Indexof(id) >= 0
for searching availability.

My question is

Which way is the thread-safe and the most efficient way to do my work?

If one method is preferred at certain list length, is there a breaking point
based on teh length of the list, by then one method may start to get better
than another one.

The situation is very simple but I face it in most of my applications. Very
interested on how MS implement it behind the scene and whehter any evaluation
is done on this.

Thanks a lot

Chris
Oct 27 '06 #10

First, ArrayList and Hashtable are two very different types of
collections. You shouldn't be choosing between the two in terms of
performance but rather functionality. Do you need to keep a list of
items which you'll refer to by index and just loop through or do you
need a dictionary of items so you can look up some value by a custom
key? The answer to that will determine if you an ArrayList or
Hashtable (or a variant of each).

Also there's a big difference between complex functionality and
complex data storage (memory usage).

ArrayList may be more complicated than a simple array and it provides
a lot more functionality, but the memory overhead is only two
additional ints and two object references (extra object reference for
the array itself, extra object for the syncroot, an int for the size,
and an int for the internal version). So the extra memory overhead is
very minor compared to the actual contents of an array (normally at
least). If you're ever going to be adding/removing elements from the
array, then definitely want to use arraylist.

However, if once the array items are created the array is static, then
you can use ToArray to convert it to a standard array and store a
regular typed array. We use this approach a lot with public
properties and return variables--we don't want to return ArrayLists
because they don't provide type, so we use ArrayLists internal to a
method to generate a list and then convert to an ArrayList (which has
a copy overhead of course).

Hashtable is a lot more complicated in both functionality and data
storage. It uses 4 int variables and six object references. If each
Hashtable only stores a small number of items, it's a lot of overhead.
If each hashtable has many items, then the overhead is meaninglist.

If you truely have a lot of dictionaries with only a few items in
each, then look at ListDictionary. It uses only two ints and two
object references for it's data storage and provides a very fast
implementation for a dictionary as long as the data set is small. We
have a calculation engine that performs calculations on a dynamic
hierarchy of data for reporting. We typically load about 300,000
ListDictionary instances with 3-5 objects and the whole process of
loading the data from a DataReader into the dictionaries and then
running the required calculations is under 30ms (compared to 1-2
seconds for running the sql and 10-40 seconds for the 3rd party
reporting engine to generate the reports).

HTH,

Sam
------------------------------------------------------------
We're hiring! B-Line Medical is seeking Mid/Sr. .NET
Developers for exciting positions in medical product
development in MD/DC. Work with a variety of technologies
in a relaxed team environment. See ads on Dice.com.
On Fri, 27 Oct 2006 08:31:01 -0700, chrisben
<ch******@discussions.microsoft.comwrote:
>
Here is my problem Marc.

My list is small, but I may have many objects (hundreds), and each one will
hold a list like this. Hastable and ArrayList are complex data structure (I
think), which will require more resource regardless how small your list is.
That is why I start to think about string or stringbuffer, which I assume
less in term of resource and little or no difference interm of performance on
manipulation. For a short list, I am wondering whether it will be an overkill
to use ArrayList or Hashtable, especially that I may have hundreds of the
objects which hold this list structure.

What do you think?

Thanks

Chris
Oct 27 '06 #11

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

6
by: Dan V. | last post by:
I would like to create a 2D string list (2D ArrayList ???). I would like to pass in a table or query as a parameter and have both columns transform into a 2D ArrayList. When I sort the one...
5
by: Netmonster | last post by:
Hello, Does any one have an example of how to create an ArrayList of objects? i.e. ArrayList of ArrayLists or an ArrayList of Hashtables? Thanks in advance KC
11
by: paradox | last post by:
Basically I have an ArrayList of strings; I need a fast way of searching through and comparing the elements of the ArrayList and the return an ArrayList of items that have 3 Duplicates. For...
3
by: Fred | last post by:
I'm trying to build a hashtable and a arraylist as object value I'm not able to retrieve stored object from the hashtable. Hashtable mp = new Hashtable(); // THE HASHTABLE ArrayList...
1
by: hung tran | last post by:
Hi, Here is the code to eliminate duplicate rows, but what if I want to keep them and elimiante all other the unique ? public void RemoveDuplicateRows(DataTable dTable,string colName) {...
9
by: raylopez99 | last post by:
Hello all— I’m trying to get the below to work and cannot get the format right. It’s from this example: http://msdn.microsoft.com/en-us/library/8627sbea(VS.71).aspx What it is: I’m trying...
2
by: almurph | last post by:
H ieveryone, Can you help me please? I am trying to sort a hashtable but get the error: "Cannot implicity convert type void to System.Collections.ArrayList" I am doing the following: ...
1
by: almurph | last post by:
H ieveryone, Can you help me please? I am trying to sort a hashtable but get the error: "Cannot implicity convert type void to System.Collections.ArrayList" I am doing the following: ...
7
by: JakeMathews | last post by:
Hi everyone, First off this is my first post on this site so hello again. I'm a beginner at programming in C#/ASP so the code I'm about show you may be bad practice or even worse but I'm getting...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.