473,320 Members | 1,719 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,320 software developers and data experts.

hash function

9
i didn't understand what does it mean for a hash function to produce well-dispersed output, and why it is important?

thank u for ur help
Dec 20 '06 #1
3 3693
b1randon
171 Expert 100+
i didn't understand what does it mean for a hash function to produce well-dispersed output, and why it is important?

thank u for ur help
I am assuming you are talking about using a hash for security. Usually you take a hash of some data for encryption. It is important for your has output to be well-dispersed so that your encryption is not easily cracked. If your output is predictable someone can tell what kind of input was used when the hash was created.
Dec 20 '06 #2
Ganon11
3,652 Expert 2GB
When I've used hash, it has been for sorting and searching. A hash code takes the attributes of an object or struct and generates a semi-random number from it - each object of that type with those attributes will have that hash code, but the hash code itself might be nonsense.

If you were storing a list of objects and wanted to search for them on an exact basis (rather than searching by name, or amount, or any search based on one attribute), hash would be very useful. Many objects may have the same name, or amount, or the same of one attribute - thus, searching by name would produce multiple results. Hash code, though, produces a number unique to that object with those specific qualities. By searching by hash code, you ensure that whatever results you get will be exactly what you need.

For hash code to work, though, you must have a good method that will create unique hash for each object. Usually, this involves taking some integer representation of all object variables and multiplying them by something, or some other method of generating a number from the variables.

A bad hash method would be, for example, adding the character values of each string - since "abc" and "cab" would both produce the same hash (198). In this case, "abc", "cab", "acb", etc would all have the same hash code, and searching for all strings with hash value 198 would produce many results.
Dec 20 '06 #3
Banfa
9,065 Expert Mod 8TB
When you create a hash code and hash algorithm you would often expect to get multiple hits on the same code.

This is because you are representing an object of some sort as either an 8, 16 or 32 bit integer value (normally) and generally you would be expecting to represent many objects. Certainly enough that for a 8 or 16 bit hash code you would easily have more objects than unique codes.

You create you hash code from the data of the object.

You would then have a hash table which would be a table that easily allows you to look up a given code and jump straight to the objects that have that code then using specific object data to pick from the objects with that hash code the specific object you are interested in.

Well dispersed output is required in a good hash table because it provides more efficient lookup of the object.

Taking an example lets assume that we have 2000 objects and we are using a 8 bit hash. Well dispersed distribution would result in 7 to 8 objects with each hash code. Looking for a single object requires 1 hash table lookup plus a search of 7-8 objects.

On the other hand if we have 1 object each with hash values 0-250, no objects with hash values 251-254 and 1750 objects with a hash value of 255, i.e. very poor distribution then you do not get efficient lookup of objects.

Each lookup will require 1 hash table lookup, then 12.5% of the time you will have a hash value that leads directly to a single object but 87.5% of the time you will then have to search 1750 objects to find the required object. This is little better than just having a list of objects and looking up directly in the list.


So you want even distribution to provide an efficient lookup, even distribution is dependent on the hash algorithm, so this algorithm needs careful design.

Note also I mention having a 32 bit hash value but for a hash algorithm to improve efficiency then you need to be able to quickly lookup in a hash table the location of the objects with that hash value. A 32 bit hash table would require 2^32 entries. Few computers have this capacity.
Dec 21 '06 #4

Sign in to post your reply or Sign up for a free account.

Similar topics

3
by: Markus Dehmann | last post by:
I have a class "Data" and I store Data pointers in an STL set. But I have millions of inserts and many more lookups, and my profiler found that they cost a lot of runtime. Therefore, I want to...
2
by: Bryan Olson | last post by:
The current Python standard library provides two cryptographic hash functions: MD5 and SHA-1 . The authors of MD5 originally stated: It is conjectured that it is computationally infeasible to...
4
by: flipdog | last post by:
Hello all, I didn't know there is a thread on hash function started in this newsgroup so reposted my posting from another group. Hope I can have some feedbacks. I am new to hash table. I came...
6
by: barcaroller | last post by:
I'm looking for a hash function (in C) that will convert a string of arbitrary length (but less than 1024 chars) to a reasonably-unique 16-bit short integer. Can anyone point me to such a hash...
6
by: thecodemachine | last post by:
Hi, I'm looking for a fast and simple one to one hash function, suitable for longer strings (up to 2048 in length). I'd like keys to be relatively short, I doubt I'd be creating more than 256...
12
by: Arash Partow | last post by:
Hi all, I've ported various hash functions to python if anyone is interested: def RSHash(key): a = 378551 b = 63689 hash = 0
21
by: Johan Tibell | last post by:
I would be grateful if someone had a minute or two to review my hash table implementation. It's not yet commented but hopefully it's short and idiomatic enough to be readable. Some of the code...
21
by: Hallvard B Furuseth | last post by:
Is the code below valid? Generally a value must be accessed through the same type it was stored as, but there is an exception for data stored through a character type. I'm not sure if that...
11
by: Douglas Dude | last post by:
how much time does it take to lok for a value - key in a hash_map ? Thanks
139
by: ravi | last post by:
Hi can anybody tell me that which ds will be best suited to implement a hash table in C/C++ thanx. in advanced
0
by: Vimpel783 | last post by:
Hello! Guys, I found this code on the Internet, but I need to modify it a little. It works well, the problem is this: Data is sent from only one cell, in this case B5, but it is necessary that data...
0
by: jfyes | last post by:
As a hardware engineer, after seeing that CEIWEI recently released a new tool for Modbus RTU Over TCP/UDP filtering and monitoring, I actively went to its official website to take a look. It turned...
0
by: ArrayDB | last post by:
The error message I've encountered is; ERROR:root:Error generating model response: exception: access violation writing 0x0000000000005140, which seems to be indicative of an access violation...
1
by: PapaRatzi | last post by:
Hello, I am teaching myself MS Access forms design and Visual Basic. I've created a table to capture a list of Top 30 singles and forms to capture new entries. The final step is a form (unbound)...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
1
by: Defcon1945 | last post by:
I'm trying to learn Python using Pycharm but import shutil doesn't work
1
by: Shællîpôpï 09 | last post by:
If u are using a keypad phone, how do u turn on JavaScript, to access features like WhatsApp, Facebook, Instagram....
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.