# Hash functions

The mapping of **keys** to **indices** of a **hash table** is called as **hash function**.It is the usually the composition of two maps:

**Hash Code Map**- Maps**keys to integers**when they are not integers.**Compression Map**- Maps wide range of**integers**to**size**of hash table.If S is size of table then integers need to brought to range 0 to [S -1]

### Hash Code Maps

**Integer Cast** - Reinterpret the** bits** of number as **integer** for numeric type with **32 bits** or less.

**Component sum** - For **numeric** type with more than **32 bits**,we can chunk into group of **32 bits** and then interpret as **integers**.Component sum is not a good choice for** strings** as it leads to more number of **collisions**.For e.g. if I want to convert '**cusp**' to** integer** value firstly** ASCII** value will be fetched and then added up together to obtain the** sum** and **integer** value will be obtained.But there are huge no of possibilities that many words of english dictionary have the **same sum**.If **sum** is same,**multiple keys** are mapped to **same location** resulting in high number of **collisions**.One of the technique's used to overcome this problem is Polynomial accumulation

**Polynomial Accumulation**

For strings of english language,combine the **ASCII** values(**a _{o, }a_{1 },a_{2},..a**

_{n-1) }and view them as

**coefficients of a polynomial.**

a_{o }+ a_{1}x + a_{2}x^{2}+ .a_{n-1}x^{n-1}

The **integer value** is computed with **Horner's rule** for certain **fixed** value of x.The choice of x=33,37,39 or 41 gives at most 6 collisions on vocabulary of 50,000 words,

a_{o }+ x(a_{1}+ x(a_{2}+ .x(a_{n-2 }+x a_{n-1})))

### Compression Maps

**Use the remainder method** - Use the **modulo** method for mapping **high range** of integers to **size** of table.If h is the hash function,k is the key and s is the size of table then use

h(k) = k mod s

## Double Hashing

It uses **two hash functions** h1 and h2.h1(k) is the** position** in the table to check for key k and h2 determines the **offset** from first position we use when searching again for k.Offset is how much** distance** has to covered while **mappin**g the key to slot.The offset for** linear probing** is always** 1**,as we jumped one location ahead to store the element in the table.

**Example**

Consider a series of number 18,41,22,44,59,32,31,73 to be stored in hash table.

h1(k) = k mod 13(Fig 1)

h2(k) = 8 - (k mod 8)

## Hashing with non-integer keys

We can find some technique/way to covert keys to integers.For e.g if we have a mobile number like **98134-93789**,we can **remove hyphen** store it like **9813493789 **or if we have a ** string** then we can **add ASCII** values and then use **hash functions** on **integer values**.

## Features of Good Hash Function

- It is
**faster**to compute. **Distribution**of**keys**is**uniform**throughout the table.- It minimizes the
**probability of collisions***.(No matter how good hash function we use,there is some probability of collisions in the slots).*

## Comments

So empty here ... leave a comment!