How does the GetHashCode() of an array object behave?
Does it combine the GetHashCode() of its elements, or does
it create a sync block for the object?
I want to use readonly arrays as dictionary keys, based on
their content, not their identity. Is this feasible using the
arrays directly, or do I need to wrap them in a struct that
handles GetHashCode and Equal? If so, is such a wrapper
present in the standard class library?
My app will have hundreds or maybe thousands of these
dictionaries, some of which will contain only a single
arraykey-value entry, while others will have thousands
of entries. The number of entries in a particular dictionary
cannot be predicted or estimated reliably in advance. The
values associated with the arrays will be WeakReference
objects, each pointing to an object with a finalizer that will
remove the entry from the containing dictionary.
Would the Dictionary<Some Class[], WeakReference>
generic collection be suitable for this, or do I need to
spice it up? 19 3833
Ole Nielsby <ol*********@sn ailmail.dk> wrote: How does the GetHashCode() of an array object behave?
Does it combine the GetHashCode() of its elements, or does it create a sync block for the object?
Neither, as far as I'm aware. I believe it doesn't override GetHashCode
at all, so the implementation in Object is used.
I want to use readonly arrays as dictionary keys, based on their content, not their identity. Is this feasible using the arrays directly, or do I need to wrap them in a struct that handles GetHashCode and Equal? If so, is such a wrapper present in the standard class library?
You'll need to wrap them, I believe. (Any reason for using a struct
rather than a class though?)
My app will have hundreds or maybe thousands of these dictionaries, some of which will contain only a single arraykey-value entry, while others will have thousands of entries. The number of entries in a particular dictionary cannot be predicted or estimated reliably in advance. The values associated with the arrays will be WeakReference objects, each pointing to an object with a finalizer that will remove the entry from the containing dictionary.
Would the Dictionary<Some Class[], WeakReference> generic collection be suitable for this, or do I need to spice it up?
I'd consider using a HybridDictionar y - it's likely to be much more
efficient for the smaller ones. Unfortunately, however, I don't believe
there's a generic version :(
--
Jon Skeet - <sk***@pobox.co m> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Jon Skeet [C# MVP] <sk***@pobox.co m> wrote: Ole Nielsby <ol*********@sn ailmail.dk> wrote: How does the GetHashCode() of an array object behave?
Does it combine the GetHashCode() of its elements, or does it create a sync block for the object? Neither, as far as I'm aware. I believe it doesn't override GetHashCode at all, so the implementation in Object is used.
I may have got this wrong but I think the Object::GetHash Code
implementation will create a sync block if there isn't already
one. I want to use readonly arrays as dictionary keys, based on their content, not their identity. Is this feasible using the arrays directly, or do I need to wrap them in a struct that handles GetHashCode and Equal? If so, is such a wrapper present in the standard class library?
You'll need to wrap them, I believe. (Any reason for using a struct rather than a class though?)
There will be large numbers of these arrays, and wrappers
would be unnecessary memory bloat. A generic collection
used with a struct will, if I understand generics right, be able
to store the array pointers as efficiently as if I used them
directly, while using the overrides defined by my struct when
hashing and comparing.
I'd consider using a HybridDictionar y - it's likely to be much more efficient for the smaller ones. Unfortunately, however, I don't believe there's a generic version :(
Well then, I guess I will google up an open source generic
HybridDictionar y-like thing - I'm pretty sure it exists somewhere,
I just hoped I could make do with the standard library.
Regards/Ole N.
Ole Nielsby <ol*********@sn ailmail.dk> wrote: Does it combine the GetHashCode() of its elements, or does it create a sync block for the object?
Neither, as far as I'm aware. I believe it doesn't override GetHashCode at all, so the implementation in Object is used.
I may have got this wrong but I think the Object::GetHash Code implementation will create a sync block if there isn't already one.
Sync blocks and GetHashCode are unrelated. The first is to do with
locking; the second is to do with hashing. What makes you think they're
related? You'll need to wrap them, I believe. (Any reason for using a struct rather than a class though?)
There will be large numbers of these arrays, and wrappers would be unnecessary memory bloat. A generic collection used with a struct will, if I understand generics right, be able to store the array pointers as efficiently as if I used them directly, while using the overrides defined by my struct when hashing and comparing.
Yes, if you use generics you should be okay. I'd consider using a HybridDictionar y - it's likely to be much more efficient for the smaller ones. Unfortunately, however, I don't believe there's a generic version :(
Well then, I guess I will google up an open source generic HybridDictionar y-like thing - I'm pretty sure it exists somewhere, I just hoped I could make do with the standard library.
It sounds like you do need generics.
--
Jon Skeet - <sk***@pobox.co m> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Ole Nielsby <ol*********@sn ailmail.dk> wrote: Neither, as far as I'm aware. I believe it doesn't override GetHashCode at all, so the implementation in Object is used.
I may have got this wrong but I think the Object::GetHash Code implementation will create a sync block if there isn't already one.
Having just googled this, it appears that under v1.0 and v1.1,
Object.GetHashC ode does indeed create a sync block. It's incidental,
however - and the blog linked below suggests that v2.0 may not behave
in the same way. http://blogs.msdn.com/brada/archive/.../30/50396.aspx
--
Jon Skeet - <sk***@pobox.co m> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Jon Skeet [C# MVP] <sk***@pobox.co m> wrote: Having just googled this, it appears that under v1.0 and v1.1, Object.GetHashC ode does indeed create a sync block. It's incidental, however - and the blog linked below suggests that v2.0 may not behave in the same way.
http://blogs.msdn.com/brada/archive/.../30/50396.aspx
I think what the blog suggests is, you cannot assume that different
objects with non-overridden GetHashCode() always return different
hash codes even though this may be the case under v1.x and some
MS developers were caught using it.
I'll bet a beer that Object.GetHashC ode still creates a sync block in v2.
There is really nowhere else to store the hash code, unless they'd take
the trouble of storing it in the sync lock slot in a form distinguishable
from a real sync lock, then, in case a sync block is created later, move
the hash code to a slot in the sync block. This somehow seems bad
style though, and I don't think the clr forgers will (or should) do
this to facilitate some exotic uses of Object.GetHashC ode() which is
supposed to be overridden anyway.
Regards/Ole N.
"Ole Nielsby" <ol*********@sn ailmail.dk> wrote: I want to use readonly arrays as dictionary keys, based on their content, not their identity. Is this feasible using the arrays directly, or do I need to wrap them in a struct that handles GetHashCode and Equal? If so, is such a wrapper present in the standard class library?
Incidentally, I was also looking to hash a list based on contents
recently. I found "lookup2" by Bob Jenkins, and translated it into C#.
Maybe this will be helpful. (or maybe there already exist better C#
hash implementations elsewhere that I hadn't found...)
/// <summary>
/// Produces a hash out of a list of hash-values.
/// This is a C# implementation of the "lookup2" hash of Bob Jenkins
(1996).
/// Licensing conditions for Lookup2: Bob Jenkins writes, "You may
use this
/// code any way you wish, private, educational, or commercial. It's
free."
/// </summary>
/// <param name="keys">An enumerable list of items; we will
GetHashCode()
/// on each of them</param>
/// <param name="previousH ash">If you call hash more than once, feed
the previous
/// hash result into the next call to hash. Otherwise, if you only
call it once,
/// previousHash can be an arbitrary value.</param>
/// <returns>A 32-bit signed integer hash code.</returns>
private static int lookup2_hash(Sy stem.Collection s.IEnumerable keys,
int previousHash)
{ System.Collecti ons.IEnumerator keyi = keys.GetEnumera tor();
uint a = 0x9e3779b9; uint b=a; uint c = (uint)previousH ash;
uint[] k=new uint[12]; uint length=0; uint sublen=12;
while (true)
{ sublen=0; while (keyi.MoveNext( ) && sublen<12)
{ k[sublen] = (uint)keyi.Curr ent.GetHashCode ();
sublen++; length++;
}
if (sublen==12)
{ a += k[0] +(k[1]<<8) +(k[ 2]<<16) +(k[3]<<24);
b += k[4] +(k[5]<<8) +(k[ 6]<<16) +(k[7]<<24);
c += k[8] +(k[9]<<8) +(k[10]<<16) +(k[11]<<24);
a-=b;a-=c;a^=(c>>13);b-=c;b-=a;b^=(a<<8);c-=a;c-=b;c^=(b>>13);a-=b;a-=c;a^=(c>>12);
b-=c;b-=a;b^=(a<<16);c-=a;c-=b;c^=(b>>5);a-=b;a-=c;a^=(c>>3);b-=c;b-=a;b^=(a<<10);
c-=a;c-=b;c^=(b>>15);
}
else
{ c += length;
if (sublen>=11) c+= (k[10]<<24);
if (sublen>=10) c+= (k[9]<<16);
if (sublen>=9 ) c+= (k[8]<<8);
// the first byte of c is reserved for the sublength
if (sublen>=8) b+= (k[7]<<24);
if (sublen>=7) b+= (k[6]<<16);
if (sublen>=6) b+= (k[5]<<8);
if (sublen>=5) b+= (k[4]);
if (sublen>=4) a+= (k[3]<<24);
if (sublen>=3) a+= (k[2]<<16);
if (sublen>=2) a+= (k[1]<<8);
if (sublen>=1) a+= (k[0]);
a-=b;a-=c;a^=(c>>13);b-=c;b-=a;b^=(a<<8);c-=a;c-=b;c^=(b>>13);a-=b;a-=c;a^=(c>>12);
b-=c;b-=a;b^=(a<<16);c-=a;c-=b;c^=(b>>5);a-=b;a-=c;a^=(c>>3);b-=c;b-=a;b^=(a<<10);
c-=a;c-=b;c^=(b>>15);
return (int)c;
}
}
}
--
Lucian
Ole Nielsby wrote: How does the GetHashCode() of an array object behave?
In .NET1 atleast it returns an identity of the array, so:
byte[] b1 = {};
byte[] b2 = {};
b1.GetHashCode( ) != b2.GetHashCode( )
I want to use readonly arrays as dictionary keys, based on their content, not their identity. Is this feasible using the arrays directly, or do I need to wrap them in a struct that handles GetHashCode and Equal? If so, is such a wrapper present in the standard class library?
I would suggest using an IDictionary with keys something like:
public class ComparableArray <T>:
IEquatable<Comp arableArray<T>> , // if you decide to use hashing
IComparable<Com parableArray<T> >
{
int? hash; // if you decide to do hashing
public readonly T[] Array;
public ComparableArray (T[] array, int? hash) {
this.Array = array;
this.hash = hash;
}
public ComparableArray (T[] array): this(array, null) {}
public static int CalculateHash(T[] array) {
// do your stuff with Knuth, or whatever. perhaps something like:
uint x = 0;
foreach ( T t in array ) {
x ^= (uint)t.GetHash Code(); // combine hashes
// rotate make ordering %32 significant, Knuth can do wonders here
x = (uint)(x >> 1) | (uint)((x & 0x1) << 31);
}
return (int)x;
}
public override GetHashCode() {
if ( hash == null )
hash = CalculateHash(A rray);
return (int)hash;
}
public bool Equals(Comparab leArray<T> other) {
return
Array.Length == other.Array.Len gth
&& GetHashCode() == other.GetHashCo de()
&& compareItemByIt em(Array, other.Array);
}
public int Compare(Compara bleArray<T> other) {
int diff;
T[] a1 = Array, a2 = other.Array;
diff = a2.Length - a1.Length;
IComparer<T> comp = Comparer.Defaul t<T>;
if ( diff != 0 )
return diff;
for ( int i = 0; i < a1.Length; ++i ) {
diff = a2[i] - a1[i];
if ( diff != 0 )
return diff;
}
return 0;
}
public static bool compareItemByIt em(T[] a1, T[] a2) {
if ( a1.Length != a2.Length )
throw new ArgumentExcepti on("Unequal lengths", "a1 and a2");
// you could just compare some small subset of actual valuse
// iff your hash is really good
for ( int i = 0; i < a1.Length; ++i )
if ( !a1[i].Equals(a2[i]) )
return false;
return true;
}
// Make interchanging T[] and ComparableArray <T> easy
public static explicit operator T[](ComparableArra y a)
{ returrn a.Array; }
public static implicit operator ComparableArray (byte[] a)
{ return new ComparableArray (a); }
// decorate with operators == != <, <=, ... as you like
}
And use IDictionary<Com parableArray<T> , V> as the interface for storage.
You can use any implementation of IDictinoary using ComparableArray <T>
as the key-type, since it supports both hashing and alphabetic sorting.
The above Key-type implementation will cache the hashing of keys, so you
won't have to rehash the keys in the dictionary when doing lookup. It's
not that much performance gained, but since you need a wrapper Key-type
for GetHashCode and Equals anyway...
If you choose your hash-function well you might just want to sample a
few items when comparing item-by-item.
The best data-structure for this kind of operation is know as a
dictionary (don't confuse that with the C# Dictionary :) and is related
to prefix-trees. Have a look at those if you are screaming for
performance (they are pretty cute too :)
My app will have hundreds or maybe thousands of these dictionaries, some of which will contain only a single arraykey-value entry, while others will have thousands of entries. The number of entries in a particular dictionary cannot be predicted or estimated reliably in advance. The values associated with the arrays will be WeakReference objects, each pointing to an object with a finalizer that will remove the entry from the containing dictionary.
Try just using Dictionary<Comp arableArray<T>, WeakReference> or
SortedDictionar y<ComparableArr ay<T>, WeakReference>, it will have pretty
OK performance. You can always complicate things by using the
proxy-pattern to dispatch to different implementations of IDictionary
depending on dictionary-size.
Would the Dictionary<Some Class[], WeakReference> generic collection be suitable for this, or do I need to spice it up?
No, SomeClass[].GetHashCode screws it up.
--
Helge Jensen
mailto:he****** ****@slog.dk
sip:he********* *@slog.dk
-=> Sebastian cover-music: http://ungdomshus.nu <=-
"Lucian Wischik" <lu***@wischik. com> wrote: I found "lookup2" by Bob Jenkins, and translated it into C#.
Thanks - this goes into my code snippet collection.
Helge Jensen <he**********@s log.dk> wrote: I would suggest using an IDictionary with keys something like:
public class ComparableArray <T>: [...]
The above Key-type implementation will cache the hashing of keys [...]
I'll leave that part out... the arrays will never be hashed more than
once or twice. (They will be hashed and unified on creation). And
I'll do a struct instead of a class if possible, since large amounts of
them will be created.
[...]
Try just using Dictionary<Comp arableArray<T>, WeakReference> or SortedDictionar y<ComparableArr ay<T>, WeakReference>, it will have pretty OK performance.
I'll stick with the Dictionary<> then, and change it only if
profiling identifies it as a hot thing.
(The app is an interpreter for a homebrew programming language
that I use for some rather complex tasks, the meanest of which
is as a parser generator system that can produce parsers for
ill-structured languages like VB6. The data model of my language
implies that constant trees must be unified, like intern'ed strings
but deep structures, rather like attribute based XML. This
'cascading unification' doesn't really fit well with .NET,it's much
faster when objects aren't compacted, allowing hash functions
to operate directly on already unified pointers. Anyway, since
I'm doung a .NET port, I might as well go all the way and use
managed objects, interoperabilit y being more important than
raw speed. It's a pity, though, that the framework doesn't have
some support for cascading unification, like a generalized Intern,
perhaps controlled by an interface, like serialization is controlled
by ISerializable.)
Anyway, thanks for clarifying the array hashing issue. This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics |
by: Peter |
last post by:
Hi
how can I compare two byte arrays in VB.NET
Thank
Peter
|
by: Iain |
last post by:
Hi
Hopefully I am missing something really simple with this question, but
here goes.
I have two Bitarrays that I would like to compare. At the moment, I am
XORing one with the other and checking to see if the result has any 1s
in it (if so, the arrays are different). This seems to be faster than
comparing each bit of the two original arrays one at a time. But I
still have to iterate over each element of the array, and I'd like to
|
by: Elijah Bailey |
last post by:
I have two char arrays of size k.
I want to know which one is bigger (exactly like for instance
I compare two ints/longs/etc.).
What is the fastest way to do this? k <= 10 usually for my application.
I tried bcmp / a loop for comparison, but it seems they are very slow
compared to comparing longs...Any ideas?
I tried splitting the array into longs and comparing, but then
I face the high endian/low endian problem on some machines.
|
by: agent349 |
last post by:
First off, I know arrays can't be compared directly (ie: if (arrary1
== array2)). However, I've been trying to compare two arrays using
pointers with no success. Basically, I want to take three sets of
character strings from the user. Then I want to run through each
element and compare the two strings. If they match I print they
match... I'm having a bit of trouble with the actual loop through each
array using the pointers and comparing...
|
by: Donald Grove |
last post by:
If I have two arrays, what is a good paradigm for comparing what is in
them, to determine what elements they share, or don't share?
Specifically, each array could potentially contain the integers 1 to
9, but probably not all 9 integers.
MyArray1 = 1,2,3,5,6,7,8
MyArray2 = 1,4,6,7
I would need the answers to be arrays too:
| |
by: Sheldon |
last post by:
Hi,
I have two arrays that are identical and contain 1s and zeros. Only the
ones are valid and I need to know where both arrays have ones in the
same position. I thought logical_and would work but this example proves
otherwise:
>>> a =
>>> b =
>>> Numeric.logical_and(a==6,b==6)
0
|
by: psmahesh |
last post by:
Hi folks,
I am comparing two arrays and removing matches from the second array
from the first array.
Can someone take a look at this code below and mention if this is okay
and perhaps if there is a better way to achieve it
for(i=0;i<arrayA.length;i++){
for(j=0;j<arrayB.length;j++){
if(arrayA==arrayB)
|
by: Jayender |
last post by:
Hi,
What is the difference between Hashing and Encryption ?
|
by: Andrew Robinson |
last post by:
I am working on a pretty simple e-commerce web site that will sell our
company gift cards online. Our company and merchant policy prohibits us from
storing credit card numbers in any way once we clear the transaction using
Pay Flow. To help protect against fraud, I would like to know when the same
card number is used to make more than one purchase in a given period of
time.
Would hashing card numbers and then storing and comparing hashes...
|
by: marktang |
last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look !
Part I. Meaning of...
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it.
First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
| |
by: Oralloy |
last post by:
Hello folks,
I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>".
The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed.
This is as boiled down as I can make it.
Here is my compilation command:
g++-12 -std=c++20 -Wnarrowing bit_field.cpp
Here is the code in...
|
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth.
The Art of Business Website Design
Your website is...
|
by: tracyyun |
last post by:
Dear forum friends,
With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
|
by: agi2029 |
last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own....
Now, this would greatly impact the work of software developers. The idea...
|
by: conductexam |
last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one.
At the time of converting from word file to html my equations which are in the word document file was convert into image.
Globals.ThisAddIn.Application.ActiveDocument.Select();...
|
by: 6302768590 |
last post by:
Hai team
i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
| |
by: bsmnconsultancy |
last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...
| |