In article <11*********************@g49g2000cwa.googlegroups.com>,
<th************@gmail.com> wrote:
I'm looking for a fast and simple one to one hash function, suitable
for longer strings (up to 2048 in length). I'd like keys to be
relatively short, I doubt I'd be creating more than 256 keys..
I could use md5 but even that is more than I need, and I assume not the
fastest algorithm (?), so I'm reluctant to use it.
There are a lot of different hash functions. In order to be able to
recommend one, we would have to know:
- do you need a hash function that is *certain* not to produce
the same hash for values you wish to distinguish; or
- are you using a "bucket" hashing system; or
- are you just going to fill the next available bucket when you have
a hash collision; or
- are you going to use a series of rehash functions to attempt to
resolve hash collisions;
- are you truly looking for a hash function, or are you looking
for a digital signature?
- is this a case in which you can take your time hashing during the
data analysis phase, but then expect to retrieve the data over and
over again, so theoretical read performance is much more important than
time to insert into hash structures? If so, then have you considered
a "perfect hash"?
- would it be acceptable to put an arbitrary limit on the number of
keys if by doing so you could create a noticeably more efficient hash?
For example, would either of 251 or 257 be acceptable limits?
- if you find out later that you need more than 256 keys, must it be
easy to expand the maximum number of keys, or would it be acceptable for
noticeable "work" (by the computer) to be involved? Though in this
particular case it sounds like you won't be needing more than 512 KB for
your hash, in the general case someone might be hashing over a million
entries each of which might be a couple of hundred megabytes, and
so for them, it could be quite important that, as the number of keys
expands, the previously processed objects continue to hash into
exactly the same disk storage location that they did before.
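
To make a couple of those choices concrete, here is a minimal sketch of
just one combination: a fixed prime table size of 251, and "fill the next
available bucket" (linear probing) on a collision. The names TABLE_SIZE,
struct table, hash_string and table_insert are made up for illustration,
and the multiplier 31 is only a common choice, not a recommendation for
your data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 251u   /* prime just under 256; an assumed limit */

/* Hypothetical table: each slot points at a caller-owned string,
   or is NULL when the slot is empty. */
struct table {
    const char *slot[TABLE_SIZE];
};

/* Simple multiplicative string hash, reduced modulo the prime table
   size. Unsigned arithmetic wraps on overflow, which is well defined. */
static unsigned hash_string(const char *s)
{
    unsigned long h = 0;

    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return (unsigned)(h % TABLE_SIZE);
}

/* Return the slot index holding key, inserting it if it is absent.
   On a collision, step to the next slot (linear probing).
   Returns -1 if all TABLE_SIZE slots are occupied by other keys. */
static int table_insert(struct table *t, const char *key)
{
    unsigned start, i;

    assert(t != NULL && key != NULL);
    start = hash_string(key);
    for (i = 0; i < TABLE_SIZE; i++) {
        unsigned idx = (start + i) % TABLE_SIZE;

        if (t->slot[idx] == NULL) {
            t->slot[idx] = key;        /* empty slot: claim it */
            return (int)idx;
        }
        if (strcmp(t->slot[idx], key) == 0)
            return (int)idx;           /* key already present */
    }
    return -1;
}
```

The table must start out zeroed (for example `struct table t = {{0}};`).
The index that table_insert() returns fits in a single byte and is unique
per stored string, so it can serve as the short key, but only for as long
as you stay at or below 251 distinct strings; growing the table later
would change where existing strings land.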
--
"No one has the right to destroy another person's belief by
demanding empirical evidence." -- Ann Landers