473,545 Members | 2,196 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

python hash function

Ive been doing some investigation of python hashtables and the hash
function used in Python.

It turns out, that when running pystone.py, as much time is spent in
string_hash() as is spent in lookdict_string (). If you look at
python's hash function, shown below, you can get an idea why - a
multiply is used for every character in the string.

static long
string_hash1(re gister char *p, int size)
{
register int len;
register long x;

len = size;
x = *p << 7;
while (--len >= 0)
x = (100003*x) ^ *p++;
x ^= size;
return x;
}

Looking around the web I found the following hash function, which is
much faster, and seems to distribute better than the python hash
function.

static long
string_hash3(re gister char *p, register int size)
{
register long x;
register char c;

x = 0xd2d84a61;
while (c = *p++)
x ^= ( (x<<7) + c + (x>>5) );
return x;
}

I also came up with a hash function of my own, which seems to
distribute near-perfectly (in the sense of being comparable to
assigning, as hashes, random numbers to unique strings) and be faster
yet (at least, on my P4 box).

static long
string_hash6(re gister unsigned short *p, int size)
{
register short s;
register unsigned long x = 0;

if (size == 0) return 0;

len = (size+1) >> 1;
while (len--) {
x += (x>>14) + (*p++ * 0xd2d84a61);
}
x += (x>>14) + (size*0xd2d84a6 1);

return x;
}

Ive tested these functions by hashing a large set of strings generated
by instrumenting the python interpeter to emit a string every time a
string is added to a dictionary. These strings are hashed and thrown
into the buckets of various sized tables. I then calculate sigma
statistics (sum((observed-expected)^2)/(N-1)) for the number of items
in the buckets of those tables.

Im not sure what other tests to try out. Im hoping that someone on
c.l.py has some experience in testing hash functions, and can suggest
some additional tests and/or tweaks.
Jul 18 '05 #1
1 14163
go************* @bitfurnace.com (Damien Morton) writes:
Ive been doing some investigation of python hashtables and the hash
function used in Python.

[...]

Have you had a look at the python-dev archives? Bound to be stuff
there about this general area.
John
Jul 18 '05 #2

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

6
3003
by: Juho Saarikko | last post by:
The program attached to this message makes the Python interpreter segfault randomly. I have tried both Python 2.2 which came with Debian Stable, and self-compiled Python 2.3.3 (newest I could find on www.python.org, compiled with default options (./configure && make). I'm using the pyPgSQL plugin to connect to a PostGreSQL database, and have...
3
2816
by: fdsl ysnh | last post by:
--- python-list-request@python.orgдµÀ: > Send Python-list mailing list submissions to > python-list@python.org > > To subscribe or unsubscribe via the World Wide Web, > visit > http://mail.python.org/mailman/listinfo/python-list > or, via email, send a message with subject or body > 'help' to
12
6982
by: Arash Partow | last post by:
Hi all, I've ported various hash functions to python if anyone is interested: def RSHash(key): a = 378551 b = 63689 hash = 0
852
27958
by: Mark Tarver | last post by:
How do you compare Python to Lisp? What specific advantages do you think that one has over the other? Note I'm not a Python person and I have no axes to grind here. This is just a question for my general education. Mark
13
3003
by: Chris Carlen | last post by:
Hi: I have begun learning Python by experimenting with the code snippets here: http://hetland.org/writing/instant-python.html In the section on functions, Magnus Lie Hetland writes: -------------------------------------------------------------------- For those of you who understand it: When you pass a parameter to a
22
2070
by: sapsi | last post by:
Hello, I recently tried using the set function in Python and was surprised to find that a= ] doesn't work with 'set', throwing TyperError (unhashable exception). I found out that this is because lists can't be hashed. So,this implies 'a' cannot be a set in python which i think is quite
0
7656
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. ...
0
7805
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that...
1
7416
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For...
0
7752
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the...
0
5969
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then...
0
3441
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
1878
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
1
1013
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
0
701
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.