Hi lads,
In implementing a hash table, I offer the option of storing
arbitrarily-typed objects using a pointer-to-void.
My hash function is as follows:
unsigned int hashfn(void * key, unsigned int tablesize)
{
return *(unsigned int *)(key) % tablesize;
}
The question is not whether this is a good hash function (in terms of
hash collisions and whatnot), but rather if taking the first
width-of-an-unsigned-int bits from the object pointed to by 'key', and
returning their value as if they were representing an unsigned int,
through the use of a cast, is well-defined.
In general, I can see that if whatever 'key' were pointing to occupied
less storage space than an int, the behaviour would likely be
undefined, but in my case I can safely assume the object pointed to by
'key' is equal or greater in size than an int.
Thanks,
Thomas 12 1527
T Koster wrote: Hi lads,
In implementing a hash table, I offer the option of storing arbitrarily-typed objects using a pointer-to-void.
My hash function is as follows:
unsigned int hashfn(void * key, unsigned int tablesize) { return *(unsigned int *)(key) % tablesize; }
The question is not whether this is a good hash function (in terms of hash collisions and whatnot), but rather if taking the first width-of-an-unsigned-int bits from the object pointed to by 'key', and returning their value as if they were representing an unsigned int, through the use of a cast, is well-defined.
No, it isn't. Unless key is aligned in a way compatible with
unsigned int, the behaviour is undefined.
T Koster wrote: Hi lads,
In implementing a hash table, I offer the option of storing arbitrarily-typed objects using a pointer-to-void.
My hash function is as follows:
unsigned int hashfn(void * key, unsigned int tablesize) { return *(unsigned int *)(key) % tablesize; }
The question is not whether this is a good hash function (in terms of hash collisions and whatnot), but rather if taking the first width-of-an-unsigned-int bits from the object pointed to by 'key', and returning their value as if they were representing an unsigned int, through the use of a cast, is well-defined.
In general, I can see that if whatever 'key' were pointing to occupied less storage space than an int, the behaviour would likely be undefined, but in my case I can safely assume the object pointed to by 'key' is equal or greater in size than an int.
The problem is that there may be padding bits in the representation
of an unsigned int which -- if set in the "right" way -- may lead to
a trap representation. Then everything probably will go downhill.
The only safe way to access the representation of an object is treating
the bytes as unsigned char (which does not have padding bits or trap
representations).
From the "value" of the bytes accessed by (unsigned char *), you
can then construct a value which lies in the range from 0 to UINT_MAX.
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
infobahn <in******@btinternet.com> writes: T Koster wrote: In implementing a hash table, I offer the option of storing arbitrarily-typed objects using a pointer-to-void.
My hash function is as follows:
unsigned int hashfn(void * key, unsigned int tablesize) { return *(unsigned int *)(key) % tablesize; }
The question is not whether this is a good hash function (in terms of hash collisions and whatnot), but rather if taking the first width-of-an-unsigned-int bits from the object pointed to by 'key', and returning their value as if they were representing an unsigned int, through the use of a cast, is well-defined.
No, it isn't. Unless key is aligned in a way compatible with unsigned int, the behaviour is undefined.
Sure, but if all the objects are allocated by malloc(), the alignment
is guaranteed.
--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Keith Thompson <ks***@mib.org> wrote: infobahn <in******@btinternet.com> writes: T Koster wrote: In implementing a hash table, I offer the option of storing arbitrarily-typed objects using a pointer-to-void.
My hash function is as follows:
unsigned int hashfn(void * key, unsigned int tablesize) { return *(unsigned int *)(key) % tablesize; }
The question is not whether this is a good hash function (in terms of hash collisions and whatnot), but rather if taking the first width-of-an-unsigned-int bits from the object pointed to by 'key', and returning their value as if they were representing an unsigned int, through the use of a cast, is well-defined.
No, it isn't. Unless key is aligned in a way compatible with unsigned int, the behaviour is undefined.
Sure, but if all the objects are allocated by malloc(), the alignment is guaranteed.
It still leaves the problem of trap representations, though. One can get
around that either by assuming that there are no trap representations in
unsigned integer types (not completely portable, but a good assumption
on modern desktop machines), or using a combination of unsigned chars
instead.
Richard rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:
[snip] It still leaves the problem of trap representations, though. One can get around that either by assuming that there are no trap representations in unsigned integer types (not completely portable, but a good assumption on modern desktop machines), or using a combination of unsigned chars instead.
You can probably write a test that checks whether unsigned int has any
padding bits (by checking whether UINT_MAX == 2**(sizeof unsigned int)-1).
If the test fails, the program can immediately abort with an error
message; if it passes, you can safely assume that there are no padding
bits.
--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Keith Thompson wrote: rl*@hoekstra-uitgeverij.nl (Richard Bos) writes: [snip]
It still leaves the problem of trap representations, though. One can get around that either by assuming that there are no trap representations in unsigned integer types (not completely portable, but a good assumption on modern desktop machines), or using a combination of unsigned chars instead.
You can probably write a test that checks whether unsigned int has any padding bits (by checking whether UINT_MAX == 2**(sizeof unsigned int)-1).
You meant 2**(sizeof(unsigned int)*CHAR_BIT)-1 :-)
If the test fails, the program can immediately abort with an error message; if it passes, you can safely assume that there are no padding bits.
--
E-Mail: Mine is a gmx dot de address.
Keith Thompson wrote: infobahn <in******@btinternet.com> writes: T Koster wrote: In implementing a hash table, I offer the option of storing arbitrarily-typed objects using a pointer-to-void.
My hash function is as follows:
unsigned int hashfn(void * key, unsigned int tablesize) { return *(unsigned int *)(key) % tablesize; }
The question is not whether this is a good hash function (in terms of hash collisions and whatnot), but rather if taking the first width-of-an-unsigned-int bits from the object pointed to by 'key', and returning their value as if they were representing an unsigned int, through the use of a cast, is well-defined.
No, it isn't. Unless key is aligned in a way compatible with unsigned int, the behaviour is undefined.
Sure, but if all the objects are allocated by malloc(), the alignment is guaranteed.
Sure, but I could see nothing in the OP's article that showed he
was passing in pointers returned by malloc (and co).
Michael Mair <Mi**********@invalid.invalid> writes: T Koster wrote: My hash function is as follows: unsigned int hashfn(void * key, unsigned int tablesize) { return *(unsigned int *)(key) % tablesize; }
The problem is that there may be padding bits in the representation of an unsigned int which -- if set in the "right" way -- may lead to a trap representation. Then everything probably will go downhill.
Even if padding bits are not used for a trap representation, they
may still interfere with the goal of obtaining a hash function,
because changing their values will change the hash value without
changing the integer value.
--
int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuv wxyz.\
\n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
);while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
);}return 0;}
Michael Mair <Mi**********@invalid.invalid> writes: Keith Thompson wrote:
[...] You can probably write a test that checks whether unsigned int has any padding bits (by checking whether UINT_MAX == 2**(sizeof unsigned int)-1).
You meant 2**(sizeof(unsigned int)*CHAR_BIT)-1 :-)
Yes, thank you. I had it right in my head; somehow it failed to
propagate to my fingers.
--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Ben Pfaff wrote: Michael Mair <Mi**********@invalid.invalid> writes:
T Koster wrote:
My hash function is as follows: unsigned int hashfn(void * key, unsigned int tablesize) { return *(unsigned int *)(key) % tablesize; }
The problem is that there may be padding bits in the representation of an unsigned int which -- if set in the "right" way -- may lead to a trap representation. Then everything probably will go downhill.
Even if padding bits are not used for a trap representation, they may still interfere with the goal of obtaining a hash function, because changing their values will change the hash value without changing the integer value.
Thanks for pointing this out.
It was so obvious that I need unsigned char that I forgot this
reason... :-)
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
On Tue, 08 Feb 2005 16:13:41 GMT, Keith Thompson <ks***@mib.org> wrote
in comp.lang.c: infobahn <in******@btinternet.com> writes: T Koster wrote: In implementing a hash table, I offer the option of storing arbitrarily-typed objects using a pointer-to-void.
My hash function is as follows:
unsigned int hashfn(void * key, unsigned int tablesize) { return *(unsigned int *)(key) % tablesize; }
The question is not whether this is a good hash function (in terms of hash collisions and whatnot), but rather if taking the first width-of-an-unsigned-int bits from the object pointed to by 'key', and returning their value as if they were representing an unsigned int, through the use of a cast, is well-defined.
No, it isn't. Unless key is aligned in a way compatible with unsigned int, the behaviour is undefined.
Sure, but if all the objects are allocated by malloc(), the alignment is guaranteed.
Alignment is not the issue, or at least not the only issue. You are
missing the issue of effective type.
========
6.5 p6
The effective type of an object for an access to its stored value is
the declared type of the object, if any. If a value is stored into an
object having no declared type through an lvalue having a type that is
not a character type, then the type of the lvalue becomes the
effective type of the object for that access and for subsequent
accesses that do not modify the stored value.
========
A footnote explains that allocated objects have no declared type.
Now we move along to:
========
6.5 p7
An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of
the object,
— a type that is the signed or unsigned type corresponding to the
effective type of the object,
— a type that is the signed or unsigned type corresponding to a
qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned
types among its members (including, recursively, a member of a
subaggregate or contained union), or
— a character type.
========
So if the declared type of the first sizeof(int) bytes of object, or
the effective type if it was dynamically allocated, is not compatible
with unsigned int, the behavior is undefined regardless of alignment
or trap representation.
--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
In article <42***************@news.individual.net>, rl*@hoekstra-uitgeverij.nl (Richard Bos) writes: T Koster wrote: > In implementing a hash table, I offer the option of storing > arbitrarily-typed objects using a pointer-to-void. > > My hash function is as follows: > > unsigned int hashfn(void * key, unsigned int tablesize) > { > return *(unsigned int *)(key) % tablesize; > } > > The question is not whether this is a good hash function (in terms of > hash collisions and whatnot), but rather if taking the first > width-of-an-unsigned-int bits from the object pointed to by 'key', and > returning their value as if they were representing an unsigned int, > through the use of a cast, is well-defined.
It still leaves the problem of trap representations, though. One can get around that either by assuming that there are no trap representations in unsigned integer types (not completely portable, but a good assumption on modern desktop machines), or using a combination of unsigned chars instead.
Or with memcpy, which is the moral equivalent of accessing via
unsigned char, though it may not be implemented that way:
unsigned int hashfn(void * key, unsigned int tablesize)
{
unsigned int keyval;
memcpy(&keyval, key, sizeof keyval);
return keyval % tablesize;
}
(Koster guaranteed that key points to an object of at least
sizeof(unsigned int) bytes, so accessing past the end of an object
with the memcpy isn't a problem.)
Obviously this has the same potential (though unlikely) problem with
trap representations, but avoids the "effective type" UB issue (as
does the combination-of-unsigned-chars solution).
--
Michael Wojcik mi************@microfocus.com
Today's Carnivore bait: Distracted by the Anthrax song, I let my bin,
laden with goods, crash into a bush. This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics
by: Vinodh Kumar |
last post by:
I see that casting changes the value of a pointer in case of multiple
inheritance.In single inheritance also it is the same know?Isn't it?
Vinodh Kumar P
|
by: Suzanne Vogel |
last post by:
** Isn't the 'static_cast' operator the same as traditional type
casting? ie, Aren't the following ways of getting b1, b2 the same?
// 'Derived' is derived from 'Base'
Derived* d = new...
|
by: coeng |
last post by:
I am having trouble comprehending the use of the four casting
operators in C++.
Does anyone know of a URL that provides a very detailed, yet very
simple, explanation of what they accomplish and...
|
by: JohnK |
last post by:
under the covers is type casting in VB.Net the same as C# ?
myObject = CType(..,..) in VB.Net vs
myObject = (SomeClass)aObject
|
by: Vic |
last post by:
Hi,
in VB.NET there is the following code:
Friend htProvidedProperties As New Hashtable()
Private Class TextboxValidatorProvidedProperties
Public DataType As DataTypeConstants
Public...
|
by: Brian Blais |
last post by:
Hello,
I saw on a couple of recent posts people saying that casting the return
value of malloc is bad, like:
d=(double *) malloc(50*sizeof(double));
why is this bad? I had always thought...
|
by: Brett |
last post by:
I find there is more casting required in C# than VB.NET. If Option
Strict/Explicit is turned on, will this basically create the same
environment as C# - uppercase, more casting required, must...
|
by: William S Fulton |
last post by:
I'm looking for the name of the following casting style in order to do
some reading around on it and why it is sometimes used.
unsigned long long ull = 0;
void * ptr = 0;
ull = *(unsigned...
|
by: cheesiong |
last post by:
Can anyone explain the following codes?
especially the cast of pointer type.
What does it really mean. Thanks.
float f1;
unsigned long l1;
f1=4.5;
l1=*(unsigned long*)&f1;
|
by: =?Utf-8?B?WWFua2VlIEltcGVyaWFsaXN0IERvZw==?= |
last post by:
I'm doing my c# more and more like i used to code c++, meaning i'm casting
more often than creating an instance of objects.
like :
protected void gvOrderDetailsRowDataBound(object sender,...
|
by: Charles Arthur |
last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
|
by: BarryA |
last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
|
by: Sonnysonu |
last post by:
This is the data of csv file
1 2 3
1 2 3
1 2 3
1 2 3
2 3
2 3
3
the lengths should be different i have to store the data by column-wise with in the specific length.
suppose the i have to...
|
by: marktang |
last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
|
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
|
by: tracyyun |
last post by:
Dear forum friends,
With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
|
by: agi2029 |
last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
|
by: isladogs |
last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM).
In this session, we are pleased to welcome a new...
| |