473,396 Members | 1,785 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

Hash Map slower than Map.

Hi,
I am trying to write a hash function that works on a pair of integers.

so I have a pair<int,intof integers. Let the first element be called
"a" and second as "b".

I know before hand that:

0<=a<30
-2^a<=b<=2^a-1.

In more detail, I am using this to keep a record of lattice points seen
on a multi resolution hierarchical grid. A resolution level "l" has
2^{l*d} elements where d is the dimension of the space.

Also, there can be at most two different instantiations of the same
region in the grid for my work. To distinguish them, I add +1 and put
-ve sign in front of one copy.
So <a,b>, and <a,-(b+1)represent the same region in the grid.
As a result, there can be a key <4,3and another key <4,-3>
If all the b's were >0, then I could already think of a very simple
perfect hash function:
size_t value = 1<<a+b;

However, if b<0 is also possible, then, I can only think of

size_t value;
if(b>=0) value = 1<<a+b;
else value = 2^30 + 1<<a+-(b+1); // I know a<30 for sure.

Now I used this in my code, and did some profiling and I found that I am
spending almost 80 % of the time just looking up if a key exists in this
hash map. (there are no other major calculations in the program function
where this "find" operation is called).

So for example, in one particular execution of the program, the function
which calls this key lookup is called

158905316 times and takes 100 sec (~80%) of the time.

My question is: can I do much faster than this? I am not an expert, and
I am using g++ 4.2 on a pentium 4 2.4 Ghz with 512 MB RAM. But still
this should be much faster.

What is really surprising is that if I use maps instead, they are
running faster.

So for about, 83127320 calls, I take 16.06 sec (~47.26%) if I use a stl map.

thanks,
-amit.
Oct 8 '07 #1
4 1807
Amit Bhatia wrote:
size_t value;
if(b>=0) value = 1<<a+b;
else value = 2^30 + 1<<a+-(b+1); // I know a<30 for sure.
This would yield very poor keys.

Why not print out the values from your hash function. Your problem is
probably there.
Oct 8 '07 #2
Amit Bhatia wrote:
If all the b's were >0, then I could already think of a very simple
perfect hash function:

size_t value = 1<<a+b;
+ has higher precedence than << in C++, so the expression above means
1<<(a+b), which is probably not what you intended.

-- Ben
Oct 8 '07 #3
Ben Rudiak-Gould wrote:
Amit Bhatia wrote:
>If all the b's were >0, then I could already think of a very simple
perfect hash function:

size_t value = 1<<a+b;

+ has higher precedence than << in C++, so the expression above means
1<<(a+b), which is probably not what you intended.

-- Ben
Thanks a lot. It works a lot better now. :) For the same example, for
about 83805820 calls it takes about 12.85 seconds. There are two
distinct hash maps, and each has about ~1800 objects.

I am wondering is there a better function that is even faster than what
I am doing? When b<0, since I add 2^30, it kind of takes the value to
very huge numbers. But I am not sure what can be used. Most of the
examples I have seen online talk about mapping strings to keys.

thanks,
amit.

Oct 9 '07 #4
On Oct 9, 5:52 am, Amit Bhatia <abha...@nospam.nospam.comwrote:
Ben Rudiak-Gould wrote:
Amit Bhatia wrote:
If all the b's were >0, then I could already think of a very simple
perfect hash function:
size_t value = 1<<a+b;
+ has higher precedence than << in C++, so the expression above means
1<<(a+b), which is probably not what you intended.
Thanks a lot. It works a lot better now. :) For the same example, for
about 83805820 calls it takes about 12.85 seconds. There are two
distinct hash maps, and each has about ~1800 objects.
I am wondering is there a better function that is even faster than what
I am doing? When b<0, since I add 2^30, it kind of takes the value to
very huge numbers.
I'm not sure about what you're doing, but if you actually have
written 2^32, the value is the constant 34 in C++. Which may
not be what you wanted.
But I am not sure what can be used. Most of the examples I
have seen online talk about mapping strings to keys.
Most hashing functions (even for strings) actually hash
sequences of unsigned integral values. In your place, I'd just
convert the two values to unsigned, then adopt one of the
classical hash codes, say by multiplying the first by a prime
number, then adding the second (e.g.: "127 * (unsigned)( a ) +
(unsigned)( b )").

Most hash tables I know have functions which allow you to verify
the distribution, but a priori, something like the above should
give very good results, and is not very expensive to calculate.
(On machines with slow multiplication, the compiler will convert
the 127*x to a left shift and a subtraction.)

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Oct 9 '07 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

7
by: Matthias Käppler | last post by:
Hi, I need to store objects of a FileInfo class containing information about a specific file. I can uniquely identify these files by their serial number. I thought it would be a good idea to use...
2
by: Bryan Olson | last post by:
The current Python standard library provides two cryptographic hash functions: MD5 and SHA-1 . The authors of MD5 originally stated: It is conjectured that it is computationally infeasible to...
12
by: wxs | last post by:
Many times we have a bunch of enums we have from either different enums or the same enum that will have various numeric values assigned. Rarely will there be collisions in numbering between the...
2
by: Ravi | last post by:
Hi, I am working on a winform app. I need to use an object which can store some information(key/value pairs) and also can be acessed by multiple threads(read/write). From what I heard Hash table...
24
by: kdotsky | last post by:
Hello, I am using some very large dictionaries with keys that are long strings (urls). For a large dictionary these keys start to take up a significant amount of memory. I do not need access to...
12
by: Arash Partow | last post by:
Hi all, I've ported various hash functions to python if anyone is interested: def RSHash(key): a = 378551 b = 63689 hash = 0
21
by: Johan Tibell | last post by:
I would be grateful if someone had a minute or two to review my hash table implementation. It's not yet commented but hopefully it's short and idiomatic enough to be readable. Some of the code...
139
by: ravi | last post by:
Hi can anybody tell me that which ds will be best suited to implement a hash table in C/C++ thanx. in advanced
4
by: Evan Camilleri | last post by:
What is the fastest way to get the 'hash' (or CRC32 or whatever) of the contents of an object i don't care what's inside, I just want the 'hash' of its contents (not to mixed with...
23
by: raylopez99 | last post by:
A quick sanity check, and I think I am correct, but just to make sure: if you have a bunch of objects that are very much like one another you can uniquely track them simply by using an ArrayList...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.