Joseph Lee <jo*********@hotmail.com> wrote:
<snip>
I will try to summarize and simplify about the method implementation
1>a keyword is hashed using a hash method, the author suggests HMAC-SHA1
2>the ouput value determines the location of an array where the value in
that location is changed to flag an entry.(the author does not define
anything about how the array is created in programming, just that it acts
like a database of single bit value 0 and 1 to mark as flagged or not)
3>this is done to every keywords there is filling up the array
So I am having a problem where the output value is a 20 bytes(HMAC-SHA1).
I don't think that an array that i define can be of that size?? am i wrong
here?
So i need a way to either
reduce the size of the output value 20 bytes to a smaller value that i can
define array[value] and flag the value in the array
or a way to represent a huge array for a value up to 20 bytes location
spaces 2^20*8
Any other way of implementation is greatly appreciated.
Unless you're willing to use really large amounts of memory for this
array, I would suggest a different technique - the same technique used
in normal hashtables.
Break your sparse array into "buckets". The more entries you have, the
more buckets you have. Each bucket represents part of the array - each
bucket is usually the same size, and they're set at regular intervals.
Within each bucket, you have a straight list of entries. Within a
hashtable each entry would be the hash code, the key and the value.
When you are asked to look up a key (or a hashcode) you find the right
bucket (which is very quick, as it's basically just a division
operation) and then you walk down the list of entries in that bucket.
The more buckets you have, the faster it is to find an entry but the
more memory is taken.
I would suggest cutting the 20 byte hash down to 8 bytes to start with,
as you can then use a long to represent it in a simple way. You can
always keep the "full" hash in the list entries, if you want.
--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too