hash function - C / C++

i didn't understand what does it mean for a hash function to produce well-dispersed output, and why it is important?

thank u for ur help

Dec 20 '06 #1

Subscribe Post Reply

3697

b1randon

171

Expert 100+

i didn't understand what does it mean for a hash function to produce well-dispersed output, and why it is important?

thank u for ur help

I am assuming you are talking about using a hash for security. Usually you take a hash of some data for encryption. It is important for your has output to be well-dispersed so that your encryption is not easily cracked. If your output is predictable someone can tell what kind of input was used when the hash was created.

Dec 20 '06 #2

Ganon11

3,652

Expert 2GB

When I've used hash, it has been for sorting and searching. A hash code takes the attributes of an object or struct and generates a semi-random number from it - each object of that type with those attributes will have that hash code, but the hash code itself might be nonsense.

If you were storing a list of objects and wanted to search for them on an exact basis (rather than searching by name, or amount, or any search based on one attribute), hash would be very useful. Many objects may have the same name, or amount, or the same of one attribute - thus, searching by name would produce multiple results. Hash code, though, produces a number unique to that object with those specific qualities. By searching by hash code, you ensure that whatever results you get will be exactly what you need.

For hash code to work, though, you must have a good method that will create unique hash for each object. Usually, this involves taking some integer representation of all object variables and multiplying them by something, or some other method of generating a number from the variables.

A bad hash method would be, for example, adding the character values of each string - since "abc" and "cab" would both produce the same hash (198). In this case, "abc", "cab", "acb", etc would all have the same hash code, and searching for all strings with hash value 198 would produce many results.

Dec 20 '06 #3

Banfa

9,065

Expert Mod 8TB

When you create a hash code and hash algorithm you would often expect to get multiple hits on the same code.

This is because you are representing an object of some sort as either an 8, 16 or 32 bit integer value (normally) and generally you would be expecting to represent many objects. Certainly enough that for a 8 or 16 bit hash code you would easily have more objects than unique codes.

You create you hash code from the data of the object.

You would then have a hash table which would be a table that easily allows you to look up a given code and jump straight to the objects that have that code then using specific object data to pick from the objects with that hash code the specific object you are interested in.

Well dispersed output is required in a good hash table because it provides more efficient lookup of the object.

Taking an example lets assume that we have 2000 objects and we are using a 8 bit hash. Well dispersed distribution would result in 7 to 8 objects with each hash code. Looking for a single object requires 1 hash table lookup plus a search of 7-8 objects.

On the other hand if we have 1 object each with hash values 0-250, no objects with hash values 251-254 and 1750 objects with a hash value of 255, i.e. very poor distribution then you do not get efficient lookup of objects.

Each lookup will require 1 hash table lookup, then 12.5% of the time you will have a hash value that leads directly to a single object but 87.5% of the time you will then have to search 1750 objects to find the required object. This is little better than just having a list of objects and looking up directly in the list.

So you want even distribution to provide an efficient lookup, even distribution is dependent on the hash algorithm, so this algorithm needs careful design.

Note also I mention having a 32 bit hash value but for a hash algorithm to improve efficiency then you need to be able to quickly lookup in a hash table the location of the objects with that hash value. A 32 bit hash table would require 2^32 entries. Few computers have this capacity.

Dec 21 '06 #4

Similar topics

hash_set: problem with hash function?

by: Markus Dehmann | last post by:

I have a class "Data" and I store Data pointers in an STL set. But I have millions of inserts and many more lookups, and my profiler found that they cost a lot of runtime. Therefore, I want to...

C / C++

Better crypto hash functions, long, with code

by: Bryan Olson | last post by:

The current Python standard library provides two cryptographic hash functions: MD5 and SHA-1 . The authors of MD5 originally stated: It is conjectured that it is computationally infeasible to...

Python

Hash Function

by: flipdog | last post by:

Hello all, I didn't know there is a thread on hash function started in this newsgroup so reposted my posting from another group. Hope I can have some feedbacks. I am new to hash table. I came...

C / C++

Hash function (str -> uint16)

by: barcaroller | last post by:

I'm looking for a hash function (in C) that will convert a string of arbitrary length (but less than 1024 chars) to a reasonably-unique 16-bit short integer. Can anyone point me to such a hash...

C / C++

Fast 1 -> 1 non md5 hash algorithm

by: thecodemachine | last post by:

Hi, I'm looking for a fast and simple one to one hash function, suitable for longer strings (up to 2048 in length). I'd like keys to be relatively short, I doubt I'd be creating more than 256...

C / C++

General Hash Functions In Python

by: Arash Partow | last post by:

Hi all, I've ported various hash functions to python if anyone is interested: def RSHash(key): a = 378551 b = 63689 hash = 0

Python

My hash table is in need of peer review

by: Johan Tibell | last post by:

I would be grateful if someone had a minute or two to review my hash table implementation. It's not yet commented but hopefully it's short and idiomatic enough to be readable. Some of the code...

C / C++

accessing char as int through union

by: Hallvard B Furuseth | last post by:

Is the code below valid? Generally a value must be accessed through the same type it was stored as, but there is an exception for data stored through a character type. I'm not sure if that...

C / C++

hash

by: Douglas Dude | last post by:

how much time does it take to lok for a value - key in a hash_map ? Thanks

C / C++

139

How to implement a Hash Table in C

by: ravi | last post by:

Hi can anybody tell me that which ds will be best suited to implement a hash table in C/C++ thanx. in advanced

C / C++

How to turn on java script in a villaon keypad mobile phone

by: Charles Arthur | last post by:

How do i turn on java script on a villaon, callus and itel keypad mobile phone

Java

Batch import of multiple excel files into the database

by: ryjfgjl | last post by:

If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...

Data Management

Looking to do Android software development, any suggestions? Is flutter better?

by: nemocccc | last post by:

hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

General

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

General

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

C / C++

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

Online Marketing

The easy way to turn off automatic updates for Windows 10/11

by: Hystou | last post by:

Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

Windows Server