python hash function

Damien Morton

Ive been doing some investigation of python hashtables and the hash
function used in Python.

It turns out, that when running pystone.py, as much time is spent in
string_hash() as is spent in lookdict_string(). If you look at
python's hash function, shown below, you can get an idea why - a
multiply is used for every character in the string.

static long
string_hash1(register char *p, int size)
{
register int len;
register long x;

len = size;
x = *p << 7;
while (--len >= 0)
x = (100003*x) ^ *p++;
x ^= size;
return x;
}

Looking around the web I found the following hash function, which is
much faster, and seems to distribute better than the python hash
function.

static long
string_hash3(register char *p, register int size)
{
register long x;
register char c;

x = 0xd2d84a61;
while (c = *p++)
x ^= ( (x<<7) + c + (x>>5) );
return x;
}

I also came up with a hash function of my own, which seems to
distribute near-perfectly (in the sense of being comparable to
assigning, as hashes, random numbers to unique strings) and be faster
yet (at least, on my P4 box).

static long
string_hash6(register unsigned short *p, int size)
{
register short s;
register unsigned long x = 0;

if (size == 0) return 0;

len = (size+1) >> 1;
while (len--) {
x += (x>>14) + (*p++ * 0xd2d84a61);
}
x += (x>>14) + (size*0xd2d84a61);

return x;
}

Ive tested these functions by hashing a large set of strings generated
by instrumenting the python interpeter to emit a string every time a
string is added to a dictionary. These strings are hashed and thrown
into the buckets of various sized tables. I then calculate sigma
statistics (sum((observed-expected)^2)/(N-1)) for the number of items
in the buckets of those tables.

Im not sure what other tests to try out. Im hoping that someone on
c.l.py has some experience in testing hash functions, and can suggest
some additional tests and/or tweaks.

Jul 18 '05 #1

Subscribe Reply

14157

John J. Lee

go*************@bitfurnace.com (Damien Morton) writes:

Ive been doing some investigation of python hashtables and the hash
function used in Python.

[...]

Have you had a look at the python-dev archives? Bound to be stuff
there about this general area.
John

Jul 18 '05 #2

Similar topics

2999

This program makes Python segfault - no other does

by: Juho Saarikko | last post by:

The program attached to this message makes the Python interpreter segfault randomly. I have tried both Python 2.2 which came with Debian Stable, and self-compiled Python 2.3.3 (newest I could find...

Python

2808

Can you introduce some book about python?

by: fdsl ysnh | last post by:

--- python-list-request@python.orgÐ´µÀ: > Send Python-list mailing list submissions to > python-list@python.org > > To subscribe or unsubscribe via the World Wide Web, > visit >...

Python

6969

General Hash Functions In Python

by: Arash Partow | last post by:

Hi all, I've ported various hash functions to python if anyone is interested: def RSHash(key): a = 378551 b = 63689 hash = 0

Python

852

27933

merits of Lisp vs Python

by: Mark Tarver | last post by:

How do you compare Python to Lisp? What specific advantages do you think that one has over the other? Note I'm not a Python person and I have no axes to grind here. This is just a question for...

Python

2992

Understanding python functions - Instant Python tutorial

by: Chris Carlen | last post by:

Hi: I have begun learning Python by experimenting with the code snippets here: http://hetland.org/writing/instant-python.html In the section on functions, Magnus Lie Hetland writes: ...

Python

2062

Sets in Python

by: sapsi | last post by:

Hello, I recently tried using the set function in Python and was surprised to find that a= ] doesn't work with 'set', throwing TyperError (unhashable exception). I found out that this is...

Python

7121

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

General

7197

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

Online Marketing

7375

Discussion: How does Zigbee compare with other wireless protocols in smart home applications?

by: tracyyun | last post by:

Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

General

4899

Access Europe - Using VBA to create a class based on a table - Wed 1 May

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

Microsoft Access / VBA

3088

Trying to create a lan-to-lan vpn between two differents networks

by: TSSRALBI | last post by:

Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The...

Networking - Hardware / Configuration

3078

Windows Forms - .Net 8.0

by: adsilva | last post by:

A Windows Forms form does not have the event Unload, like VB6. What one acts like?

Visual Basic .NET

1411

transfer the data from one system to another through ip address

by: 6302768590 | last post by:

Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated ...

C# / C Sharp

650

How to add payments to a PHP MySQL app.

by: muto222 | last post by:

How can i add a mobile payment intergratation into php mysql website.

PHP

287

Comprehensive Guide to Website Development in Toronto: Expert Insights from BSMN Consultancy

by: bsmnconsultancy | last post by:

In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence...

General