Hash functions

The mapping of keys to indices of a hash table is called as hash function.It is the usually the composition of two maps:

  • Hash Code Map - Maps keys to integers when they are not integers.
  • Compression Map - Maps wide range of integers to size of hash table.If S is size of table then integers need to brought to range 0 to [S -1]

Hash Code Maps

Integer Cast - Reinterpret the bits of number as integer for numeric type with 32 bits or less.

Component sum - For numeric type with more than 32 bits,we can chunk into group of 32 bits and then interpret as integers.Component sum is not a good choice for  strings as it leads to more number of collisions.For e.g. if I want to convert 'cusp' to integer value firstly ASCII value will be fetched  and then   added up  together to obtain the sum and integer value will be obtained.But there are huge no of possibilities that many words of english  dictionary have the same sum.If sum is same,multiple keys are mapped to same location resulting in high number of collisions.One of the  technique's used to overcome this  problem is Polynomial accumulation

Polynomial Accumulation

For strings of english language,combine the ASCII values(ao, a1 ,a2,..an-1)  and view them as coefficients of a polynomial.

ao + a1x + a2x2 + .an-1xn-1

The integer value is computed with Horner's rule for certain fixed value of x.The choice of x=33,37,39 or 41 gives at most 6 collisions on vocabulary of 50,000 words,

ao + x(a1+ x(a2 + .x(an-2 +x an-1)))

Compression Maps

Use the remainder method - Use the modulo method for mapping high range of integers to size of table.If h is the hash function,k is the key and s is the size of table then use

h(k) = k mod s

Double Hashing

It uses two hash functions h1 and h2.h1(k) is the position in the table to check for key k and h2 determines the offset from first position we use when searching again for k.Offset is how much distance has to covered while mapping the key to slot.The offset for linear probing is always 1,as we jumped one location ahead to store the element in the table.


Consider a series of number 18,41,22,44,59,32,31,73 to be stored in hash table.

h1(k) = k mod 13(Fig 1)

h2(k) =  8 - (k mod 8)

Fig 1 : Animation of Double Hashing
Fig 1 : Animation of Double Hashing


Hashing with non-integer keys

We can find some technique/way to covert keys to integers.For e.g if we have a mobile number like  98134-93789,we can  remove hyphen store it like 9813493789 or if we have a  string then we can  add ASCII values and then use hash functions on integer values.


Features of Good Hash Function

  • It is faster to compute.
  • Distribution of keys is uniform throughout the table.
  • It minimizes the probability of collisions.(No matter how good hash function we use,there is some probability of collisions in the slots).
Rate this post


So empty here ... leave a comment!

Leave a Reply

Your email address will not be published. Required fields are marked *