By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
439,941 Members | 1,766 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 439,941 IT Pros & Developers. It's quick & easy.

my small hashlib - using pythons hash-functions

P: n/a
Hi.

I wrote a small hashlib for C. Because I'm new to hashes I looked at pythons implementation and
reused *some* of the code... or more the mathematical "hash-function", not really the code.

In particular I looked at pythons hash and lookup functions, so I came up with this (see the code
underneath).

So, can this code be considered as derived and do I have to put my code under the GPL? I'd like to
publish it under something less restrictive, like a BSD style license. But if GPL is the only way,
then GPL it is. :)

-panzi
#define PERTURB_SHIFT 5

static ENTRY_T * lookup(CONTAINER_T * table, hash_const_str_t key,
int (*loop_cond)(ENTRY_T * ptr, hash_const_str_t key)) {
register ENTRY_T * entries = table->entries;
register hash_size_t size = table->size;
register hash_size_t pos = hash(key) % size;
register hash_size_t perturb = pos;

register ENTRY_T * ptr = entries + pos;

while(loop_cond(ptr,key)) {
pos = ((pos << 2) + pos + perturb + 1) % size;
perturb >>= PERTURB_SHIFT;

ptr = entries + pos;

#ifdef HASH_COUNT_COLLISIONS
++ table->collisions;
#endif
}

return ptr;
}

hash_t hash(register hash_const_str_t str) {
/* python 2.5's hash function: */
register hash_size_t len;
register hash_t h;

assert(str);

len = strlen(str);
h = *str << 7;

while (*str)
h = (1000003 * h) ^ *str ++;
h ^= len;

return h;
}
Nov 25 '06 #1
Share this Question
Share on Google+
5 Replies


P: n/a
Mathias Panzenboeck <e0******@student.tuwien.ac.atwrites:
So, can this code be considered as derived and do I have to put my
code under the GPL? I'd like to publish it under something less
restrictive, like a BSD style license. But if GPL is the only way,
then GPL it is. :)
Python is not GPL'd but has its own license. You can make the derived
work GPL if you want, but it is not required. You'll have to check
its terms to see whether you can specifically use the BSD license.
Simplest might be to just stay with the Python license.
Nov 25 '06 #2

P: n/a
Paul Rubin wrote:
Mathias Panzenboeck <e0******@student.tuwien.ac.atwrites:
>So, can this code be considered as derived and do I have to put my
code under the GPL? I'd like to publish it under something less
restrictive, like a BSD style license. But if GPL is the only way,
then GPL it is. :)

Python is not GPL'd but has its own license. You can make the derived
work GPL if you want, but it is not required. You'll have to check
its terms to see whether you can specifically use the BSD license.
Simplest might be to just stay with the Python license.
Yes, right. It's PSF.

But the question is: *IS* this derived work? I mean, it's not copied code.
It's the same hashing-logic, which I learned by watching pythons code.

-panzi
Nov 25 '06 #3

P: n/a
Mathias Panzenboeck wrote:
But the question is: *IS* this derived work? I mean, it's not copied code.
It's the same hashing-logic, which I learned by watching pythons code.
given that it's only a few lines of code, and there's hardly any other
way to write those lines if you want to implement the same algorithm,
I'd say it falls under "fair use". crediting the Python developers in
the source code should be good enough.

</F>

Nov 25 '06 #4

P: n/a
Fredrik Lundh wrote:
Mathias Panzenboeck wrote:
>But the question is: *IS* this derived work? I mean, it's not copied
code.
It's the same hashing-logic, which I learned by watching pythons code.

given that it's only a few lines of code, and there's hardly any other
way to write those lines if you want to implement the same algorithm,
I'd say it falls under "fair use". crediting the Python developers in
the source code should be good enough.

</F>
ic. I'll write a mail to the python developers then.

The code in python looks like this (see code underneath), just if someone what's to know.
I only used the logic of the parts I understand. ;)

Objects/stringobject.c:

static long
string_hash(PyStringObject *a)
{
register Py_ssize_t len;
register unsigned char *p;
register long x;

if (a->ob_shash != -1)
return a->ob_shash;
len = a->ob_size;
p = (unsigned char *) a->ob_sval;
x = *p << 7;
while (--len >= 0)
x = (1000003*x) ^ *p++;
x ^= a->ob_size;
if (x == -1)
x = -2;
a->ob_shash = x;
return x;
}

Objects/dictobject.c:

static dictentry *
lookdict_string(dictobject *mp, PyObject *key, register long hash)
{
register size_t i;
register size_t perturb;
register dictentry *freeslot;
register size_t mask = (size_t)mp->ma_mask;
dictentry *ep0 = mp->ma_table;
register dictentry *ep;

/* Make sure this function doesn't have to handle non-string keys,
including subclasses of str; e.g., one reason to subclass
strings is to override __eq__, and for speed we don't cater to
that here. */
if (!PyString_CheckExact(key)) {
#ifdef SHOW_CONVERSION_COUNTS
++converted;
#endif
mp->ma_lookup = lookdict;
return lookdict(mp, key, hash);
}
i = hash & mask;
ep = &ep0[i];
if (ep->me_key == NULL || ep->me_key == key)
return ep;
if (ep->me_key == dummy)
freeslot = ep;
else {
if (ep->me_hash == hash && _PyString_Eq(ep->me_key, key))
return ep;
freeslot = NULL;
}

/* In the loop, me_key == dummy is by far (factor of 100s) the
least likely outcome, so test for that last. */
for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
i = (i << 2) + i + perturb + 1;
ep = &ep0[i & mask];
if (ep->me_key == NULL)
return freeslot == NULL ? ep : freeslot;
if (ep->me_key == key
|| (ep->me_hash == hash
&& ep->me_key != dummy
&& _PyString_Eq(ep->me_key, key)))
return ep;
if (ep->me_key == dummy && freeslot == NULL)
freeslot = ep;
}
}
Nov 25 '06 #5

P: n/a
Fredrik Lundh wrote:
Mathias Panzenboeck wrote:
>But the question is: *IS* this derived work? I mean, it's not copied code.
It's the same hashing-logic, which I learned by watching pythons code.

given that it's only a few lines of code, and there's hardly any other
way to write those lines if you want to implement the same algorithm,
I'd say it falls under "fair use". crediting the Python developers in
the source code should be good enough.
If you'll forgive my pedantry, that's not "fair use," but simply the use of
material that cannot be copyrighted by itself (in many jursidictions, at least)
because it is so short, obvious, non-creative, etc.

A good overview of fair use is here:

http://fairuse.stanford.edu/Copyrigh...er9/index.html

IANAL. TINLA.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Nov 25 '06 #6

This discussion thread is closed

Replies have been disabled for this discussion.