The mapping of **keys** to **indices** of a **hash table** is called as **hash function**.It is the usually the composition of two maps:

**Hash Code Map**– Maps**keys to integers**when they are not integers.**Compression Map**– Maps wide range of**integers**to**size**of hash table.If S is size of table then integers need to brought to range 0 to [S -1]

**Integer Cast** – Reinterpret the** bits** of number as **integer** for numeric type with **32 bits** or less.

**Component sum** – For **numeric** type with more than **32 bits**,we can chunk into group of **32 bits** and then interpret as **integers**.Component sum is not a good choice for** strings** as it leads to more number of **collisions**.For e.g. if I want to convert ‘**cusp**‘ to** integer** value firstly** ASCII** value will be fetched and then added up together to obtain the** sum** and **integer** value will be obtained.But there are huge no of possibilities that many words of english dictionary have the **same sum**.If **sum** is same,**multiple keys** are mapped to **same location** resulting in high number of **collisions**.One of the technique’s used to overcome this problem is Polynomial accumulation

**Polynomial Accumulation**

For strings of english language,combine the **ASCII** values(**a _{o, }a_{1 },a_{2},..a**

a_{o }+ a_{1}x + a_{2}x^{2}+ .a_{n-1}x^{n-1}

The **integer value** is computed with **Horner’s rule** for certain **fixed** value of x.The choice of x=33,37,39 or 41 gives at most 6 collisions on vocabulary of 50,000 words,

a_{o }+ x(a_{1}+ x(a_{2}+ .x(a_{n-2 }+x a_{n-1})))

**Use the remainder method** – Use the **modulo** method for mapping **high range** of integers to **size** of table.If h is the hash function,k is the key and s is the size of table then use

h(k) = k mod s

It uses **two hash functions** h1 and h2.h1(k) is the** position** in the table to check for key k and h2 determines the **offset** from first position we use when searching again for k.Offset is how much** distance** has to covered while **mappin**g the key to slot.The offset for** linear probing** is always** 1**,as we jumped one location ahead to store the element in the table.

**Example**

Consider a series of number 18,41,22,44,59,32,31,73 to be stored in hash table.

h1(k) = k mod 13(Fig 1)

h2(k) = 8 – (k mod 8)

We can find some technique/way to covert keys to integers.For e.g if we have a mobile number like **98134-93789**,we can **remove hyphen** store it like **9813493789 **or if we have a ** string** then we can **add ASCII** values and then use **hash functions** on **integer values**.

- It is
**faster**to compute. **Distribution**of**keys**is**uniform**throughout the table.- It minimizes the
**probability of collisions***.(No matter how good hash function we use,there is some probability of collisions in the slots).*

**Hashing** is a method and useful **technique** to implement** dictionaries**.This method is used to perform searching,insertion and deletion at a faster rate.A function called** Hash Function** is used to** compute** and return **position** of the record instead of searching with comparisons.The data is stored in **array** called as **Hash table**.The Hash Function maps **keys** into **positions** in a hash table.The mapping of **keys** to** indices** of a hash table is known as **Hash Function**.The major requirement of hash function is to map **equal keys** to **equal indices**.

Given a key, the algorithm computes an *index *that suggests where the entry can be found:

index = f(key, array_size)

The value of index is determined by 2 steps

- hash = hash_func(key)
- index = hash % array_size

A hash function is purely a user choice and a good hash function has following features :

- It is quick to compute.
- Keys are distributed uniformly throughout the table.
- The load factor is defined by t= n/m where m is the slots/buckets and n is the elements in m of hash table t.

Say we want to perform certain operations on salaries of employes from employee id’s **51000** to **52000**.In order to access the data at a quicker rate we can use hashing(Fig 1 and Fig 2).

- Now keys(
**k**) are are id’s from**51000 to 52000**, i.e the**array_size**would be**1000**. - Say
**table[1000]**is the array. - We choose a Hash Function

hash = k - 51000 index = hash % array_size

When **two or more values** are **mapped** at **same location/bucket** by the **hash function**,then collision arises.

Say we have **100** students of subject **CS501** of batch **2014** and the roll no of first student is like **2014 CS501210**,second is like **2014 CS501211** and the list goes on till **2104 CS501310**.

- There are 100 students in class so we create a
**hash table**of size**100**. - Say we chose a hash function of
**last two digits**. **2014 CS501210**is stored at**10**location.- Where will
**2014 CS501310**go?Again at**same location**i.e at**10**? - A concept called
**collision**arises when two or more entries are mapped at same location.

In chaining,an a**rray of tables/link**s is set up **indexed** by the** keys** to the** list of items** with the** same key**.Say we have a hash function which computed **modulo 5** of any series and say ** hash table** has** 6 locations** in it.If **two or more keys** are mapped into same location we will keep on** adding to the linked list**.By example given below** 1** **key** is mapped to** location 0**,**2 keys** are mapped to **location 1**,**1 key** is mapped to **location 2**,**2 keys** to location **3**,**3keys** to** location 4** and **no keys** are mapped to **location 5**.The hash values **ending at 0** are stored at** 0 location**,**ending at 1** is stored at **first location** and the list goes on in the similar pattern..Thus ,the problem of collision is resolved by **chaining**.

It is a method to resolve **collision**.The item will be stored at next **available slot** in the table if there is clash,assuming that the table is already not full.It uses **less memory** than chaining as links are not required but is** slower** than chaining as one has to **move across** the table for long time.

Consider Fig 4.Say a list of numbers has to be stored in a hash table of **size 8** and hash function is **h(k) = k%8**.The no. in series are** 48,26,43,61,86,10,4 ,13**. **Mapping** is done and numbers are stored in the table.**10 and 26** have same remainder(**2**),so **10** goes to** next empty slot**.i.e. at **location 4**,Number **4** was to be stored at **location 4** but since it is occupied it will go next empty slot.i.e. at **7** and lastly **13 and 61** have same remainder (**5**) but **61** is already occupied so **13** will go to **location 1**.

Graphs can be represented by

- Adjacency Matrix by Sequential Representation
- Linked Representation by using an adjacency list that stores neighbors of a node using linked list.

- It represents which
**nodes**are**adjacent**to each other i.e. which nodes have edge connecting between them. **Rows**and**Columns**are labeled by**graph vertices**.- a is the adjacency matrix,i represents row, j represents column,v
_{ i }and v_{j}represents vertices. - a
_{ij }is**1**if v_{ i }and v_{j}are adjacent to each other and**0**if they are not. - Adjacency Matrix is also known as
**Bit matrix**or**Boolean Matrix**since it contains only**o’s**and**1’s**.

The graph is shown in Fig 1 and matrix is shown in Fig 2.

The graph is shown in Fig 3 and matrix is shown in Fig 4.

The graph is shown in Fig 5 and matrix is shown in Fig 6.

It is an another way to represent **graphs** in computer memory.It is a **linked representation** in which every node is linked to its own list that contains he names of all other nodes which are adjacent to itself.The graph is shown in Fig 7 and list is shown in Fig 8.

- It clearly shows the adjacent nodes of a particular node.
- Adding new nodes in the list is easy as compared to adding nodes in the adjacency matrix.

A graph is a collections of** vertices/nodes** and **edges(arcs)**.It is like a tree structure with a difference that it** does not** have any** parent-child** relationship between nodes.Instead,any complex relationships are possible between the nodes.A Graph **G**(Fig 1) is defined as an ordered set **(V,E)**,where **V(G)** is the set of vertices and **E(G)** represents the edges that connect the vertices together.

Fig 2 shows a graph with V(G) = {A,B,C,D,E,F}

E(G) = {(A,B),(A,C),(B,D),(C,D),(C,F),(D,E)}

There are 6 nodes and 6 edges in this Graph.

**Adjacent Nodes**– For every edge,**E = (U, V)**that connects nodes**U**and**V;**the nodes U and V are the end-points and are said to be the adjacent nodes.In Fig 2,**A and B**are**adjacent nodes.****Degree**– Degree of a node a is the total**number of edges**associated with the particular**node**.For e.g. ,degree of vertex A in Fig 2 is 2.**Path**– The**route/way**followed by**edges**from**one node**to**another node**.**Cycle**– A**closed simple path**with**length 3**or more is known as cycle.For e.g. P = {A,B,D,E,F,D} forms a cycle in Fig 3.**Multiple edges**–**Distinct edges**which connect the**same end points**/vertex are known as multiple edges.**E = (A,C)**and**E’ = (D,C)**are**multiple edges**in Fig 2.**Loop**– An edge that has**identical end points**is called as Loop.**P = {A,B,D,C,A}**forms a**loop**in**Fig 2**.**Size of Graph**– It is the**total**number of**edges**in the graph**.Size**of graph in Fig 2 is**6**.

In a **directed graph**,edges have an **orientation** and form an** ordered pair** (Fig 3). If there is an edge from A to B, then there is a path from A to B but not from B to A. The edge (A, B) is said to initiate from node A (also known as initial node) and terminate at node B (terminal node).Fig 3 is a directed graph with directed edges {(A,B),(B,D),(D,E),(E,F),(F,D)}

A graph is called as **undirected** if the edges **do not** have any** orientation**.The **edge** can be** traversed** from **one point** to **another** and vice versa.Traversal can be done from A to B and B to A in Fig 4 and same is applicable for other nodes also.

A **regular graph** is a graph where each **vertex **Â has the **same number** of **neighbors**, i.e., every vertex has the **same degree** . In Fig 5.degree of every vertex is 2.

A graph is said to be** connected** if there is **no isolated node** and there exists a **path** between any of the **two nodes** i.e there will be** at least** one edge **emerging** from any of the vertices in graph(Fig 6).

A graph G is said to be a **complete**, if there is **always** a **path **from **one node** to** another **i.e if all the nodes are fully connected(Fig 7).

A graph is said to be **weighted** if every edge has** some data/value ** associated with it.It is denoted by **w(e)** and is a** positive value** referring to the cost of traversing the edge.Weight of the graph as shown in Fig 8 , w(e) is** 29**.Cost of traversing edge A to B is 2,B to C is 5,C to D is 6,D to E is 9,E to A is 7.

A graph in which **vertex ** have **multiple edges ** and /or ** loops** is called a multi-graph(Fig 9).

Traversal means to **move across/visit **each and every **node** in the tree.There are three techniques to traverse a Binary Tree :

- Preorder
- Inorder
- Postorder

In preorder traversal,firstly the **root** is traversed then **left subtree** and in the last** right subtree**(Fig 1)**.**

Consider the following binary tree(Fig 2) :

**Root** is traversed in the **beginning** so **10** is** visited** at the **firs**t.After this **left sub-tree** is traversed. (Fig 3). In the** left subtree** **22** is the root of **44** and **25**,so** 22** is visited after 10 followed by left child i.e.**44**.**25** is the **right child** and is the root node of **17** and **88**.**44** is followed by **25** then **17** and **88**.After visiting left sub-tree traversal of right subtree is done.**13** is visited and followed by left child i.e.**46**.

In In-order traversal,**left subtree** is traversed in the **beginning ** followed by **root ** which is further followed by **right sub-tree**(Fig 4).

Consider the following binary tree(Fig 5) :

In post-order traversal left subtree is traversed first followed by right subtree and further followed by root.

Consider the following binary tree(Fig 5) :

**Binary trees** in memory can be represented by

- Array
- Linked Lists

Binary Trees can be represented using **1-D array** in memory(Fig 1).The rule to store** binary tree** in array are :

- The values of binary tree is stored in an array called as
**tree**. - The root of the tree is stored at
**first location**i.e. at**tree[0](0 index)**. - The
**left child**of tree is stored at**2i +1**and right child is stored at^{ }**2i+2**location. -
The
**maximum size**of the array tree is given as**2**, where^{d+1}-1**d**is the**depth**of the**tree**. -
An
**empty tree**or**sub-tree**is specified using**NULL**. If**tree[0] = NULL**, then it means that**tree is empty.**

In linked list representation,every** node** of the tree has **three parts** : the **data**/root element,the** pointer** to the** left node** and the **pointer** to the **right node**(Fig 2).In C,tree is built using structures by the following way :

struct node { struct node* left; int data; struct node* right; };

**Binary Trees** are the** data structure** defined by collection of elements called **nodes**.They are one of the efficient data structures for searching and insertion operations.The **topmost** node of binary tree is known as** ROOT** node pointed by ROOT pointer.Root node has ** left child/successor** and **right child/successor** pointed by left and right pointers(Fig 1).

The left child further may or may not have successors and the right child may/may not have further successors.If nodes are split into further nodes then a tree is formed known as **subtree** of root.The left node of root(divided into further nodes) are collectively called as **left subtree** and right node(divided into further nodes) are called as **right subtree**(Fig 2).Every node in a binary tree has **0,1 or at most 2 successors**.The node which **does not have** any successor is called as **leaf node** or **terminal node**.

**Sibling**- If**T**is a**tree**and**N**is any**node**that has**left**child/**successor S1**and**right successor S2**,then**N**is the**parent**of**S**1 and**S2**.**S1**and**S2**are called as**siblings**(Fig 3).Every node other than the root node has a parent.Thus all**nodes**which are at**same level**and have**same parent**are known as**siblings**.In Fig 3,**2 and 3**are**siblings**and**1**is the**parent**of**2 and 3.**

**Level Number**: It is the**level**of the**node**.The**root node**is at**level 0**and its**successors**are at**one level higher**than previous i.e at**level****1**.The further successors of left and right child are at again one level higher i.e.at 2.Thus level number of all child nodes are**+1 level**of their**parent**level.

**Degree**– It is equal to**number of children**that a**node**has.It can be**0,1 or 2**in case of binary tree.The degree of**leaf nodes**is**0**.The degree of nodes in Fig 4 is as given below:

**In-Degree**– It is the number of**edges arriving**at that**node**.The in-degree of**root node**is**0**as no edge arrives at root node.**Out-degree**– It is the number of**edges**that**exits**from that**node**.**Leaf node**– It is the**terminal node**having**no children**.**Directed edge**-**Line**drawn from**node**to any of the**successor**is known as directed edge.A binary tree of**N**nodes has exactly**N-1**edges.**Path**– A sequence of**consecutive edges**is known as Path.**Ancestor nodes**– All the**nodes**along the**path**from the**root node**to that**node**.**Descendant nodes**– All the nodes along the path from that**node**to**leaf node**.**Depth**– It is the**length**of path from**root node**to node**N**.A binary tree of height has at least h nodes and at most 2^{h }-1 nodes . For e.g In Fig 4 height of tree is 4 and it can have 2^{4 }-1 =15 nodes at the max.

n = 2^{h} -1

n + 1 =2^{h}

log(n+1) = log(2^{h)}

** log _{2}(n+1) = h**

** n** is the** number** of nodes and **h** is the **height** of tree. The **height** of binary tree is at least** n** and at most **log _{2}(n+1)**.

**Queues** are the **data structure** that stores data in an **ordered way**.The ** first element entered ** is the ** first** one to be ** removed**.It is a **FIFO(First In First Out)** data structure.The elements are added at one end called **REAR** end and removed from other end called as ** FRONT** end.

In Fig 2.FRONT =0 and REAR =4

**Queues** can be represented by **linear arrays**.It has **FRONT** and **REAR** variables.**FRONT** points to the position from where deletion occurs and **REAR** points to the position where insertion occurs.

Before insertion,check whether the queue has the space to accommodate the new element .If the queue is full is and still insertion in tried then “**OVERFLOW**” condition arises and element does not get into queue.It is checked by condition if (**REAR = MAX -1**) where** MAX** is the** maximum** number of elements that a queue can hold.If yes,then **OVERFLOW** condition arises.

**Program 1**

#include <stdio.h> int main( ) { int queue[10]; int max,num,i,item; int front,rear; //printf("Enter the size of queue"); scanf("%d",&max); //printf("Enter the no of elements in queue"); scanf("%d",&num); //printf("Enter the elements of queue"); printf("Queue before insertion : "); //Fig3 for(i=0;i<num;i++) { scanf("%d",&queue[i]); printf(" %d\t",queue[i]); } printf("\n"); front = 0; rear = num -1 ; printf("Front end : %d\n",front); printf("Rear end : %d\n",rear); if (rear == max-1) { printf("Overflow"); } else { //printf("Enter the item you want to insert \n"); scanf("%d",&item); rear = rear + 1; front = 0; queue[rear] = item; printf("Queue after insertion :");//Fig 4 for(i=0;i<=rear;i++) { printf(" %d\t",queue[i]); } printf("\n"); printf("Front end : %d\n",front); printf("Rear end : %d\n",rear); } return 0; }

Say we want to** insert 15** in Queue of Fig 3.

**REAR** would get incremented by 1(as insertions are done at REAR end) and **15** is stored at value of** 5th index**(Fig 4).

Before deleting an element check for “**UNDERFLOW**” condition.It occurs when we try to delete the element from an already empty queue.

If **FRONT =-1 and REAR = -1** ,it means that there is no element in the queue and it is empty since index starts from 0.

If we want to delete element from the queue,then value of **FRONT** is incremented(Fig 6).Deletions are done at FRONT end only.

**Program 2**

#include <stdio.h> int main( ) { int queue[10]; int max,num,i; int front=0,rear; //printf("Enter the size of queue"); scanf("%d",&max); //printf("Enter the no of elements in queue"); scanf("%d",&num); //printf("Enter the elements of queue"); printf("Queue before deletion : "); for(i=0;i<num;i++) { scanf("%d",&queue[i]); printf(" %d\t",queue[i]); } printf("\n"); front = 0; rear = num -1 ; printf("Front end : %d\n",front); printf("Rear end : %d\n",rear); if (front == -1 && rear == -1) { printf("Underflow"); } else { front = front +1; } printf("Queue after deletion :"); for(i=front;i<=rear;i++) { printf(" %d\t",queue[i]); } printf("\n"); printf("Front end : %d\n",front); printf("Rear end : %d\n",rear); return 0; }

Say we want to **delete 25** from the queue(Fig 5)

The value of FRONT in **incremented t0 1** and **25** is deleted from the queue(Fig 6)

Stacks are the** linear data structure** which stores **data** in an **ordered** way.The **first element** inserted is at the **bottom** and **last elemen**t is at the **top**.It can be implemented using an **array or a linked list**.It is a **LIFO(Last In First Out)** data structure.This means that last element inserted will be first element to be removed incase of deletion operation(Fig 1,2).

**Examples of Stack**

- Stack of plates.The last plate added is the first one to be removed.
- A stack of files(Fig 1) .
- A girl wearing bangles.The last bangle on the arm is the first one to be removed.
- Luggage placed in truck.

In **stack**,the **first element** is at** bottom**(0 position) and **last element** at **TOP** position.The topmost element of stack is represented by** TOP** variable.**MAX** variable is used to store the** maximum** no of** elements** that the** stack can** **hold**(Fig 3).

- if
**TOP =MAX-1**,then stack is**full**. - if
**TOP = NULL**,then stack is**empty**.

Each function in a stack runs in O(1) time.

- Push – Refer Push Operation in Stacks
- Pop – Refer Pop Operation in Stacks
- Peep – return topmost value

This operation first checks whether the** stack is empty** or has some elements(if TOP == NULL) and returns the **value** of **topmost element** **withou**t **deleting** it from the stack.In Fig 10, PEEP operation will **return 100** to user.

- Converting a decimal number into binary.
- Towers of Hanoi
- Evaluation of expression and syntax parsing.

**Counting** refers to **calculating number of node**s present in linked list.

**Program**

#include<stdio.h> struct node { int item; struct node *next }; int main() { struct node *start,*list,*temp; int i; start = (struct node *)malloc(sizeof(struct node)); list = start; start->next = NULL; for(i=1;i<5;i++) { list->item = i; list->next = (struct node *)malloc(sizeof(struct node)); list = list->next; } list->next = NULL; temp = start; int length =0; while(temp!=NULL)//Fig 2 { if (temp->next == NULL) { break; } else { length++; temp=temp->next; } } printf("Length of Linked List : %d \n",length); printf("Elements of linked lists are :\n"); while(start != NULL) { if (start->next == NULL) { break; } printf("%d\n",start->item); start = start->next; } return 0; }

**Illustration of Program**

- Refer Linked Lists for
**creation and display**of linked list. **temp = start**which means**temp =140**(Fig 1)