473,406 Members | 2,705 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,406 software developers and data experts.

Casting pointer-to-void

Hi lads,

In implementing a hash table, I offer the option of storing
arbitrarily-typed objects using a pointer-to-void.

My hash function is as follows:

unsigned int hashfn(void * key, unsigned int tablesize)
{
return *(unsigned int *)(key) % tablesize;
}

The question is not whether this is a good hash function (in terms of
hash collisions and whatnot), but rather if taking the first
width-of-an-unsigned-int bits from the object pointed to by 'key', and
returning their value as if they were representing an unsigned int,
through the use of a cast, is well-defined.

In general, I can see that if whatever 'key' were pointing to occupied
less storage space than an int, the behaviour would likely be
undefined, but in my case I can safely assume the object pointed to by
'key' is equal or greater in size than an int.

Thanks,
Thomas
Nov 14 '05 #1
12 1527
T Koster wrote:

Hi lads,

In implementing a hash table, I offer the option of storing
arbitrarily-typed objects using a pointer-to-void.

My hash function is as follows:

unsigned int hashfn(void * key, unsigned int tablesize)
{
return *(unsigned int *)(key) % tablesize;
}

The question is not whether this is a good hash function (in terms of
hash collisions and whatnot), but rather if taking the first
width-of-an-unsigned-int bits from the object pointed to by 'key', and
returning their value as if they were representing an unsigned int,
through the use of a cast, is well-defined.


No, it isn't. Unless key is aligned in a way compatible with
unsigned int, the behaviour is undefined.
Nov 14 '05 #2
T Koster wrote:
Hi lads,

In implementing a hash table, I offer the option of storing
arbitrarily-typed objects using a pointer-to-void.

My hash function is as follows:

unsigned int hashfn(void * key, unsigned int tablesize)
{
return *(unsigned int *)(key) % tablesize;
}

The question is not whether this is a good hash function (in terms of
hash collisions and whatnot), but rather if taking the first
width-of-an-unsigned-int bits from the object pointed to by 'key', and
returning their value as if they were representing an unsigned int,
through the use of a cast, is well-defined.

In general, I can see that if whatever 'key' were pointing to occupied
less storage space than an int, the behaviour would likely be undefined,
but in my case I can safely assume the object pointed to by 'key' is
equal or greater in size than an int.


The problem is that there may be padding bits in the representation
of an unsigned int which -- if set in the "right" way -- may lead to
a trap representation. Then everything probably will go downhill.
The only safe way to access the representation of an object is treating
the bytes as unsigned char (which does not have padding bits or trap
representations).
From the "value" of the bytes accessed by (unsigned char *), you
can then construct a value which lies in the range from 0 to UINT_MAX.
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Nov 14 '05 #3
infobahn <in******@btinternet.com> writes:
T Koster wrote:
In implementing a hash table, I offer the option of storing
arbitrarily-typed objects using a pointer-to-void.

My hash function is as follows:

unsigned int hashfn(void * key, unsigned int tablesize)
{
return *(unsigned int *)(key) % tablesize;
}

The question is not whether this is a good hash function (in terms of
hash collisions and whatnot), but rather if taking the first
width-of-an-unsigned-int bits from the object pointed to by 'key', and
returning their value as if they were representing an unsigned int,
through the use of a cast, is well-defined.


No, it isn't. Unless key is aligned in a way compatible with
unsigned int, the behaviour is undefined.


Sure, but if all the objects are allocated by malloc(), the alignment
is guaranteed.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #4
Keith Thompson <ks***@mib.org> wrote:
infobahn <in******@btinternet.com> writes:
T Koster wrote:
In implementing a hash table, I offer the option of storing
arbitrarily-typed objects using a pointer-to-void.

My hash function is as follows:

unsigned int hashfn(void * key, unsigned int tablesize)
{
return *(unsigned int *)(key) % tablesize;
}

The question is not whether this is a good hash function (in terms of
hash collisions and whatnot), but rather if taking the first
width-of-an-unsigned-int bits from the object pointed to by 'key', and
returning their value as if they were representing an unsigned int,
through the use of a cast, is well-defined.


No, it isn't. Unless key is aligned in a way compatible with
unsigned int, the behaviour is undefined.


Sure, but if all the objects are allocated by malloc(), the alignment
is guaranteed.


It still leaves the problem of trap representations, though. One can get
around that either by assuming that there are no trap representations in
unsigned integer types (not completely portable, but a good assumption
on modern desktop machines), or using a combination of unsigned chars
instead.

Richard
Nov 14 '05 #5
rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:
[snip]
It still leaves the problem of trap representations, though. One can get
around that either by assuming that there are no trap representations in
unsigned integer types (not completely portable, but a good assumption
on modern desktop machines), or using a combination of unsigned chars
instead.


You can probably write a test that checks whether unsigned int has any
padding bits (by checking whether UINT_MAX == 2**(sizeof unsigned int)-1).
If the test fails, the program can immediately abort with an error
message; if it passes, you can safely assume that there are no padding
bits.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #6


Keith Thompson wrote:
rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:
[snip]
It still leaves the problem of trap representations, though. One can get
around that either by assuming that there are no trap representations in
unsigned integer types (not completely portable, but a good assumption
on modern desktop machines), or using a combination of unsigned chars
instead.

You can probably write a test that checks whether unsigned int has any
padding bits (by checking whether UINT_MAX == 2**(sizeof unsigned int)-1).


You meant 2**(sizeof(unsigned int)*CHAR_BIT)-1 :-)
If the test fails, the program can immediately abort with an error
message; if it passes, you can safely assume that there are no padding
bits.

--
E-Mail: Mine is a gmx dot de address.

Nov 14 '05 #7
Keith Thompson wrote:

infobahn <in******@btinternet.com> writes:
T Koster wrote:
In implementing a hash table, I offer the option of storing
arbitrarily-typed objects using a pointer-to-void.

My hash function is as follows:

unsigned int hashfn(void * key, unsigned int tablesize)
{
return *(unsigned int *)(key) % tablesize;
}

The question is not whether this is a good hash function (in terms of
hash collisions and whatnot), but rather if taking the first
width-of-an-unsigned-int bits from the object pointed to by 'key', and
returning their value as if they were representing an unsigned int,
through the use of a cast, is well-defined.


No, it isn't. Unless key is aligned in a way compatible with
unsigned int, the behaviour is undefined.


Sure, but if all the objects are allocated by malloc(), the alignment
is guaranteed.


Sure, but I could see nothing in the OP's article that showed he
was passing in pointers returned by malloc (and co).
Nov 14 '05 #8
Michael Mair <Mi**********@invalid.invalid> writes:
T Koster wrote:
My hash function is as follows:
unsigned int hashfn(void * key, unsigned int tablesize)
{
return *(unsigned int *)(key) % tablesize;
}


The problem is that there may be padding bits in the representation
of an unsigned int which -- if set in the "right" way -- may lead to
a trap representation. Then everything probably will go downhill.


Even if padding bits are not used for a trap representation, they
may still interfere with the goal of obtaining a hash function,
because changing their values will change the hash value without
changing the integer value.
--
int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuv wxyz.\
\n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
);while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
);}return 0;}
Nov 14 '05 #9
Michael Mair <Mi**********@invalid.invalid> writes:
Keith Thompson wrote:

[...]
You can probably write a test that checks whether unsigned int has any
padding bits (by checking whether UINT_MAX == 2**(sizeof unsigned int)-1).


You meant 2**(sizeof(unsigned int)*CHAR_BIT)-1 :-)


Yes, thank you. I had it right in my head; somehow it failed to
propagate to my fingers.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #10
Ben Pfaff wrote:
Michael Mair <Mi**********@invalid.invalid> writes:

T Koster wrote:
My hash function is as follows:
unsigned int hashfn(void * key, unsigned int tablesize)
{
return *(unsigned int *)(key) % tablesize;
}


The problem is that there may be padding bits in the representation
of an unsigned int which -- if set in the "right" way -- may lead to
a trap representation. Then everything probably will go downhill.


Even if padding bits are not used for a trap representation, they
may still interfere with the goal of obtaining a hash function,
because changing their values will change the hash value without
changing the integer value.


Thanks for pointing this out.
It was so obvious that I need unsigned char that I forgot this
reason... :-)
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Nov 14 '05 #11
On Tue, 08 Feb 2005 16:13:41 GMT, Keith Thompson <ks***@mib.org> wrote
in comp.lang.c:
infobahn <in******@btinternet.com> writes:
T Koster wrote:
In implementing a hash table, I offer the option of storing
arbitrarily-typed objects using a pointer-to-void.

My hash function is as follows:

unsigned int hashfn(void * key, unsigned int tablesize)
{
return *(unsigned int *)(key) % tablesize;
}

The question is not whether this is a good hash function (in terms of
hash collisions and whatnot), but rather if taking the first
width-of-an-unsigned-int bits from the object pointed to by 'key', and
returning their value as if they were representing an unsigned int,
through the use of a cast, is well-defined.


No, it isn't. Unless key is aligned in a way compatible with
unsigned int, the behaviour is undefined.


Sure, but if all the objects are allocated by malloc(), the alignment
is guaranteed.


Alignment is not the issue, or at least not the only issue. You are
missing the issue of effective type.

========
6.5 p6

The effective type of an object for an access to its stored value is
the declared type of the object, if any. If a value is stored into an
object having no declared type through an lvalue having a type that is
not a character type, then the type of the lvalue becomes the
effective type of the object for that access and for subsequent
accesses that do not modify the stored value.
========

A footnote explains that allocated objects have no declared type.

Now we move along to:

========
6.5 p7

An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:

— a type compatible with the effective type of the object,

— a qualified version of a type compatible with the effective type of
the object,

— a type that is the signed or unsigned type corresponding to the
effective type of the object,

— a type that is the signed or unsigned type corresponding to a
qualified version of the effective type of the object,

— an aggregate or union type that includes one of the aforementioned
types among its members (including, recursively, a member of a
subaggregate or contained union), or

— a character type.
========

So if the declared type of the first sizeof(int) bytes of object, or
the effective type if it was dynamically allocated, is not compatible
with unsigned int, the behavior is undefined regardless of alignment
or trap representation.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Nov 14 '05 #12

In article <42***************@news.individual.net>, rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:
T Koster wrote:
> In implementing a hash table, I offer the option of storing
> arbitrarily-typed objects using a pointer-to-void.
>
> My hash function is as follows:
>
> unsigned int hashfn(void * key, unsigned int tablesize)
> {
> return *(unsigned int *)(key) % tablesize;
> }
>
> The question is not whether this is a good hash function (in terms of
> hash collisions and whatnot), but rather if taking the first
> width-of-an-unsigned-int bits from the object pointed to by 'key', and
> returning their value as if they were representing an unsigned int,
> through the use of a cast, is well-defined.


It still leaves the problem of trap representations, though. One can get
around that either by assuming that there are no trap representations in
unsigned integer types (not completely portable, but a good assumption
on modern desktop machines), or using a combination of unsigned chars
instead.


Or with memcpy, which is the moral equivalent of accessing via
unsigned char, though it may not be implemented that way:

unsigned int hashfn(void * key, unsigned int tablesize)
{
unsigned int keyval;
memcpy(&keyval, key, sizeof keyval);
return keyval % tablesize;
}

(Koster guaranteed that key points to an object of at least
sizeof(unsigned int) bytes, so accessing past the end of an object
with the memcpy isn't a problem.)

Obviously this has the same potential (though unlikely) problem with
trap representations, but avoids the "effective type" UB issue (as
does the combination-of-unsigned-chars solution).

--
Michael Wojcik mi************@microfocus.com

Today's Carnivore bait: Distracted by the Anthrax song, I let my bin,
laden with goods, crash into a bush.
Nov 14 '05 #13

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

5
by: Vinodh Kumar | last post by:
I see that casting changes the value of a pointer in case of multiple inheritance.In single inheritance also it is the same know?Isn't it? Vinodh Kumar P
5
by: Suzanne Vogel | last post by:
** Isn't the 'static_cast' operator the same as traditional type casting? ie, Aren't the following ways of getting b1, b2 the same? // 'Derived' is derived from 'Base' Derived* d = new...
4
by: coeng | last post by:
I am having trouble comprehending the use of the four casting operators in C++. Does anyone know of a URL that provides a very detailed, yet very simple, explanation of what they accomplish and...
1
by: JohnK | last post by:
under the covers is type casting in VB.Net the same as C# ? myObject = CType(..,..) in VB.Net vs myObject = (SomeClass)aObject
1
by: Vic | last post by:
Hi, in VB.NET there is the following code: Friend htProvidedProperties As New Hashtable() Private Class TextboxValidatorProvidedProperties Public DataType As DataTypeConstants Public...
231
by: Brian Blais | last post by:
Hello, I saw on a couple of recent posts people saying that casting the return value of malloc is bad, like: d=(double *) malloc(50*sizeof(double)); why is this bad? I had always thought...
6
by: Brett | last post by:
I find there is more casting required in C# than VB.NET. If Option Strict/Explicit is turned on, will this basically create the same environment as C# - uppercase, more casting required, must...
7
by: William S Fulton | last post by:
I'm looking for the name of the following casting style in order to do some reading around on it and why it is sometimes used. unsigned long long ull = 0; void * ptr = 0; ull = *(unsigned...
9
by: cheesiong | last post by:
Can anyone explain the following codes? especially the cast of pointer type. What does it really mean. Thanks. float f1; unsigned long l1; f1=4.5; l1=*(unsigned long*)&f1;
19
by: =?Utf-8?B?WWFua2VlIEltcGVyaWFsaXN0IERvZw==?= | last post by:
I'm doing my c# more and more like i used to code c++, meaning i'm casting more often than creating an instance of objects. like : protected void gvOrderDetailsRowDataBound(object sender,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.