473,394 Members | 1,916 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,394 software developers and data experts.

hash function

9
i didn't understand what does it mean for a hash function to produce well-dispersed output, and why it is important?

thank u for ur help
Dec 20 '06 #1
3 3697
b1randon
171 Expert 100+
i didn't understand what does it mean for a hash function to produce well-dispersed output, and why it is important?

thank u for ur help
I am assuming you are talking about using a hash for security. Usually you take a hash of some data for encryption. It is important for your has output to be well-dispersed so that your encryption is not easily cracked. If your output is predictable someone can tell what kind of input was used when the hash was created.
Dec 20 '06 #2
Ganon11
3,652 Expert 2GB
When I've used hash, it has been for sorting and searching. A hash code takes the attributes of an object or struct and generates a semi-random number from it - each object of that type with those attributes will have that hash code, but the hash code itself might be nonsense.

If you were storing a list of objects and wanted to search for them on an exact basis (rather than searching by name, or amount, or any search based on one attribute), hash would be very useful. Many objects may have the same name, or amount, or the same of one attribute - thus, searching by name would produce multiple results. Hash code, though, produces a number unique to that object with those specific qualities. By searching by hash code, you ensure that whatever results you get will be exactly what you need.

For hash code to work, though, you must have a good method that will create unique hash for each object. Usually, this involves taking some integer representation of all object variables and multiplying them by something, or some other method of generating a number from the variables.

A bad hash method would be, for example, adding the character values of each string - since "abc" and "cab" would both produce the same hash (198). In this case, "abc", "cab", "acb", etc would all have the same hash code, and searching for all strings with hash value 198 would produce many results.
Dec 20 '06 #3
Banfa
9,065 Expert Mod 8TB
When you create a hash code and hash algorithm you would often expect to get multiple hits on the same code.

This is because you are representing an object of some sort as either an 8, 16 or 32 bit integer value (normally) and generally you would be expecting to represent many objects. Certainly enough that for a 8 or 16 bit hash code you would easily have more objects than unique codes.

You create you hash code from the data of the object.

You would then have a hash table which would be a table that easily allows you to look up a given code and jump straight to the objects that have that code then using specific object data to pick from the objects with that hash code the specific object you are interested in.

Well dispersed output is required in a good hash table because it provides more efficient lookup of the object.

Taking an example lets assume that we have 2000 objects and we are using a 8 bit hash. Well dispersed distribution would result in 7 to 8 objects with each hash code. Looking for a single object requires 1 hash table lookup plus a search of 7-8 objects.

On the other hand if we have 1 object each with hash values 0-250, no objects with hash values 251-254 and 1750 objects with a hash value of 255, i.e. very poor distribution then you do not get efficient lookup of objects.

Each lookup will require 1 hash table lookup, then 12.5% of the time you will have a hash value that leads directly to a single object but 87.5% of the time you will then have to search 1750 objects to find the required object. This is little better than just having a list of objects and looking up directly in the list.


So you want even distribution to provide an efficient lookup, even distribution is dependent on the hash algorithm, so this algorithm needs careful design.

Note also I mention having a 32 bit hash value but for a hash algorithm to improve efficiency then you need to be able to quickly lookup in a hash table the location of the objects with that hash value. A 32 bit hash table would require 2^32 entries. Few computers have this capacity.
Dec 21 '06 #4

Sign in to post your reply or Sign up for a free account.

Similar topics

3
by: Markus Dehmann | last post by:
I have a class "Data" and I store Data pointers in an STL set. But I have millions of inserts and many more lookups, and my profiler found that they cost a lot of runtime. Therefore, I want to...
2
by: Bryan Olson | last post by:
The current Python standard library provides two cryptographic hash functions: MD5 and SHA-1 . The authors of MD5 originally stated: It is conjectured that it is computationally infeasible to...
4
by: flipdog | last post by:
Hello all, I didn't know there is a thread on hash function started in this newsgroup so reposted my posting from another group. Hope I can have some feedbacks. I am new to hash table. I came...
6
by: barcaroller | last post by:
I'm looking for a hash function (in C) that will convert a string of arbitrary length (but less than 1024 chars) to a reasonably-unique 16-bit short integer. Can anyone point me to such a hash...
6
by: thecodemachine | last post by:
Hi, I'm looking for a fast and simple one to one hash function, suitable for longer strings (up to 2048 in length). I'd like keys to be relatively short, I doubt I'd be creating more than 256...
12
by: Arash Partow | last post by:
Hi all, I've ported various hash functions to python if anyone is interested: def RSHash(key): a = 378551 b = 63689 hash = 0
21
by: Johan Tibell | last post by:
I would be grateful if someone had a minute or two to review my hash table implementation. It's not yet commented but hopefully it's short and idiomatic enough to be readable. Some of the code...
21
by: Hallvard B Furuseth | last post by:
Is the code below valid? Generally a value must be accessed through the same type it was stored as, but there is an exception for data stored through a character type. I'm not sure if that...
11
by: Douglas Dude | last post by:
how much time does it take to lok for a value - key in a hash_map ? Thanks
139
by: ravi | last post by:
Hi can anybody tell me that which ds will be best suited to implement a hash table in C/C++ thanx. in advanced
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.