# Data Structures

## Hash functions

The mapping of keys to indices of a hash table is called as hash function.It is the usually the composition of two maps:

• Hash Code Map – Maps keys to integers when they are not integers.
• Compression Map – Maps wide range of integers to size of hash table.If S is size of table then integers need to brought to range 0 to [S -1]

### Hash Code Maps

Integer Cast – Reinterpret the bits of number as integer for numeric type with 32 bits or less.

Component sum – For numeric type with more than 32 bits,we can chunk into group of 32 bits and then interpret as integers.Component sum is not a good choice for  strings as it leads to more number of collisions.For e.g. if I want to convert ‘cusp‘ to integer value firstly ASCII value will be fetched  and then   added up  together to obtain the sum and integer value will be obtained.But there are huge no of possibilities that many words of english  dictionary have the same sum.If sum is same,multiple keys are mapped to same location resulting in high number of collisions.One of the  technique’s used to overcome this  problem is Polynomial accumulation

Polynomial Accumulation

For strings of english language,combine the ASCII values(ao, a1 ,a2,..an-1)  and view them as coefficients of a polynomial.

`ao + a1x + a2x2 + .an-1xn-1`

The integer value is computed with Horner’s rule for certain fixed value of x.The choice of x=33,37,39 or 41 gives at most 6 collisions on vocabulary of 50,000 words,

`ao + x(a1+ x(a2 + .x(an-2 +x an-1)))`

### Compression Maps

Use the remainder method – Use the modulo method for mapping high range of integers to size of table.If h is the hash function,k is the key and s is the size of table then use

`h(k) = k mod s`

## Double Hashing

It uses two hash functions h1 and h2.h1(k) is the position in the table to check for key k and h2 determines the offset from first position we use when searching again for k.Offset is how much distance has to covered while mapping the key to slot.The offset for linear probing is always 1,as we jumped one location ahead to store the element in the table.

Example

Consider a series of number 18,41,22,44,59,32,31,73 to be stored in hash table.

h1(k) = k mod 13(Fig 1)

h2(k) =  8 – (k mod 8)

Fig 1 : Animation of Double Hashing

## Hashing with non-integer keys

We can find some technique/way to covert keys to integers.For e.g if we have a mobile number like  98134-93789,we can  remove hyphen store it like 9813493789 or if we have a  string then we can  add ASCII values and then use hash functions on integer values.

## Features of Good Hash Function

• It is faster to compute.
• Distribution of keys is uniform throughout the table.
• It minimizes the probability of collisions.(No matter how good hash function we use,there is some probability of collisions in the slots).

Short URL: http://tinyit.cc/80f90e
May 17, 2014 |
Tags:

## Hashing

Hashing is a method and useful technique to implement dictionaries.This method is used to perform searching,insertion and deletion at a faster rate.A function called Hash Function is used to compute and return position of the record instead of searching with comparisons.The data is stored in array called as Hash table.The Hash Function maps keys into positions in a hash table.The mapping of keys to indices of a hash table is known as Hash Function.The major requirement of hash function is to map equal keys to equal indices.

Given a key, the algorithm computes an index that suggests where the entry can be found:

`index = f(key, array_size)`

The value of index is determined by 2 steps

1. hash = hash_func(key)
2. index = hash % array_size

A hash function is purely a user choice and a good hash function has following features :

• It is quick to compute.
• Keys are distributed uniformly throughout the table.
• The load factor is defined by  t= n/m where m is the slots/buckets and n is the elements in m of hash table t.

## Example

Say we want to perform certain operations on  salaries of employes from employee id’s  51000 to 52000.In order to access the data at a quicker rate we can use hashing(Fig 1 and Fig 2).

• Now keys(k) are are id’s from 51000 to 52000, i.e the array_size would be 1000.
• Say table[1000] is the array.
• We choose a Hash Function
```hash = k - 51000

index = hash % array_size```

Fig 1 : Computing hash function

Fig 2 : Hash Table of Fig 1

## Collision

When two or more values are mapped at same location/bucket by the hash function,then collision arises.

### Example 2

Say we have 100 students of subject CS501 of batch 2014 and  the roll no of first student is like 2014 CS501210,second is like 2014 CS501211 and the list goes on till 2104 CS501310.

• There are 100 students in class so we create a hash table of size 100.
• Say we chose a hash function of last two digits.
• 2014 CS501210 is stored at 10 location.
• Where will 2014 CS501310  go?Again at same location i.e at 10 ?
• A concept called collision arises when two or more entries are mapped at same location.

## Collision Resolution

### Chaining

In chaining,an array of tables/links is set up indexed by the keys to the list of items with the same key.Say we have a hash function which computed modulo 5 of any series and say  hash table has 6 locations in it.If two or more keys are mapped into same location we will keep on adding to the linked list.By example given below 1 key is mapped to location 0,2 keys are mapped to location 1,1 key is mapped to location 2,2 keys to location 3,3keys to location 4 and  no keys are mapped to location 5.The hash values ending at 0 are stored at 0 location,ending at 1 is stored at first location and the list goes on in the similar pattern..Thus ,the problem of collision is resolved by chaining.

Fig 3 : Chaining method

## Linear probing

It is a method to resolve collision.The item will be stored at next available slot in the table if there is clash,assuming that the table is already not full.It uses less memory than chaining as links are not required but is slower than chaining as one has to move across the table for long time.

Consider Fig 4.Say a list of numbers has to be stored in a hash table of size 8 and hash function is h(k) = k%8.The no. in series are 48,26,43,61,86,10,4 ,13. Mapping is done and numbers are stored in the table.10 and 26 have same remainder(2),so 10 goes to next empty slot.i.e. at location 4,Number 4 was to be stored at location 4 but since it is occupied it will go next empty slot.i.e. at 7 and lastly 13 and 61 have same remainder (5) but 61 is already occupied so 13 will go to location 1.

Fig 4 : Linear Probing

Short URL: http://tinyit.cc/087d20
May 16, 2014 |
Tags:

## Representation of Graphs

Graphs can be represented by

• Adjacency Matrix by Sequential Representation
• Linked Representation by using an adjacency list that stores neighbors of a node using linked list.

• It represents which nodes are adjacent to each other i.e. which nodes have edge connecting between them.
• Rows and Columns are labeled by graph vertices.
• a is the adjacency matrix,i represents row, j represents column,v i and vj represents vertices.
• aij is 1 if  v i and vj are adjacent to each other and 0 if they are not.
• Adjacency Matrix is also known as Bit matrix or Boolean Matrix since it contains only o’s and 1’s.

### Undirected Graph

The graph is shown in Fig 1 and matrix is shown in Fig 2.

Fig 1 : An Undirected Graph

Fig 2 : Adjacency Matrix of Fig 1

### Directed Graph

The graph is shown in Fig 3 and matrix is shown in Fig 4.

Fig 3 : An Undirected Graph

Fig 4 : Adjacency matrix of Fig 3

### Weighted Graph

The graph is shown in Fig 5 and matrix is shown in Fig 6.

Fig 5 : A Weighted Graph

Fig 6 : Adjacency matrix of Fig 6

It is an another way to represent graphs in computer memory.It is a linked representation in which every node is linked to its own list that contains he names of all other nodes which are adjacent to itself.The graph is shown in Fig 7 and list is shown in Fig 8.

Fig 7 : A Graph

Fig 8 : Adjacency list of Fig 7

• It clearly shows the adjacent nodes of a particular node.
• Adding new nodes in the list is easy as compared to adding nodes in the adjacency matrix.

Short URL: http://tinyit.cc/9edc88
May 14, 2014 |
Tags:

## Graphs

A graph is a collections of vertices/nodes and edges(arcs).It is like a tree structure with a difference that it does not have any parent-child relationship between nodes.Instead,any complex relationships are possible between the nodes.A Graph G(Fig 1) is defined as an ordered set (V,E),where V(G) is the set of vertices and E(G) represents the edges that connect the vertices together.

Fig 2 shows a graph with V(G) = {A,B,C,D,E,F}

E(G) = {(A,B),(A,C),(B,D),(C,D),(C,F),(D,E)}

There are 6 nodes and 6 edges in this Graph.

Fig 1 : A Graph

Fig 2 : A Graph

## Terms associated with Graph

• Adjacent Nodes –  For every edge, E = (U, V) that connects nodes U and V; the nodes U and V are the end-points and are said to be the adjacent nodes.In Fig 2,A and B are adjacent nodes.
• Degree – Degree of a node a is the total number of edges associated with the particular node.For e.g. ,degree of vertex A  in Fig 2 is 2.
• Path – The route/way followed by edges from one node to another node.
• Cycle – A closed simple path with length 3 or more is known as cycle.For e.g. P = {A,B,D,E,F,D} forms a cycle in Fig 3.
• Multiple edgesDistinct edges which connect the same end points/vertex are known as multiple edges.E = (A,C) and E’ = (D,C) are multiple edges in Fig 2.
• Loop – An edge that has identical end points is called as Loop.P = {A,B,D,C,A} forms a loop in Fig 2.
• Size of Graph – It is the total number of edges in the graph.Size of graph in Fig 2 is 6.

## Types of Graphs

### Directed Graph

Fig 3 : A Directed Graph

In a directed graph,edges have an orientation and  form an ordered pair (Fig 3). If there is an edge from A to B, then there is a path from A to B but not from B to A. The edge (A, B) is said to initiate from node A (also known as initial node) and terminate at node B (terminal node).Fig 3 is a directed graph with directed edges  {(A,B),(B,D),(D,E),(E,F),(F,D)}

### Undirected Graph

Fig 4 : Undirected Graph

A graph is called as undirected if the edges do not have any orientation.The edge can be traversed from one point to another and vice versa.Traversal can be done from A to B and B to A in Fig 4 and same is applicable for other nodes also.

### Regular Graph

Fig 5 : A Regular Graph

A regular graph is a graph where each vertex Â has the same number of neighbors, i.e., every vertex has the same degree . In Fig 5.degree of every vertex is 2.

### Connected Graph

Fig 6 : A Connected Graph

A graph is said to be connected if there is no isolated node and there exists a path between any of the two nodes i.e there will be at least one edge emerging from any of the vertices  in  graph(Fig 6).

### Complete Graph

Fig 7 : A Complete Graph

A graph G is said to be a complete, if there is always a  path from one node to another i.e if all the nodes are fully connected(Fig 7).

### Weighted Graph

Fig 8 ; A Weighted Graph

A graph is said to be weighted if every edge has some data/value  associated with it.It is denoted by w(e) and is a positive value referring to  the cost of traversing the edge.Weight of the graph as shown in Fig 8 , w(e) is 29.Cost of traversing edge A to B is 2,B to C is 5,C to D is 6,D to E is 9,E to A is 7.

### Multi-Graph

Fig 9 : Multi-Graph

A graph in which vertex  have  multiple edges  and /or  loops is called a multi-graph(Fig 9).

Short URL: http://tinyit.cc/448888
May 8, 2014 |
Tags:

## Traversal in Binary Trees

Traversal means to move across/visit each and every node in the tree.There are three techniques to traverse a Binary Tree :

• Preorder
• Inorder
• Postorder

### Preorder traversal

In preorder traversal,firstly  the root is traversed then left subtree and in the last right subtree(Fig 1).

Fig 1 : Technique of Pre-Order Traversal

### Example

Consider the following binary tree(Fig 2) :

Fig 2 : Pre-Order Traversal

Fig 3 : Traversal of nodes

Root is traversed in the beginning so 10 is visited at the first.After this left sub-tree is traversed. (Fig 3).  In the left subtree 22 is the root of 44 and 25,so 22 is visited after 10 followed by left child i.e.44.25 is the right child and  is the root node of 17 and 88.44 is followed by 25 then 17 and 88.After visiting left sub-tree traversal of right subtree is done.13 is visited and followed by left child i.e.46.

### Inorder Traversal

In In-order traversal,left subtree is traversed in the  beginning  followed by  root  which is further followed by  right sub-tree(Fig 4).

Fig 4 : Technique of In-order traversal

### Example

Consider the following binary tree(Fig 5) :

Fig 5 : In-Order Traversal

Fig 6 : Traversal of nodes

### Post Order Traversal

In post-order traversal left subtree is traversed first followed by right subtree and further followed by root.

Fig 7 : Technique of Post-Order traversal

Consider the following binary tree(Fig 5) :

Fig 8 : Post-order Traversal

Fig 9 : Post-Order Traversal

Short URL: http://tinyit.cc/4e9e08
May 7, 2014 |
Tags:

## Representation of Binary Trees in Memory

Binary trees in memory can be represented by

• Array

### Representation through Arrays

Binary Trees can be represented using 1-D array in memory(Fig 1).The rule to store binary tree in array are :

1. The values of binary tree is stored in an array called as tree.
2. The root of the tree is stored at first location i.e. at  tree[0](0 index).
3. The left child of tree is stored at  2i +1 and right child is stored at  2i+2 location.
4. The maximum size of the array tree is given as  2d+1-1, where d is the depth of the tree.
5. An empty tree or sub-tree is specified using NULL. If tree[0] = NULL, then it means that tree is empty.

Fig 1 : Array Representation of Binary Tree

In linked list representation,every node of the tree has three parts : the data/root element,the pointer to the left node and the pointer to the right node(Fig 2).In C,tree is built using structures by the following way :

```struct node
{
struct node* left;
int data;
struct node* right;
};```

Fig 2 : Linked List Representation of Binary Tree

Short URL: http://tinyit.cc/bfc8
May 6, 2014 |
Tags:

## Binary Trees

Binary Trees are the data structure defined by collection of elements called nodes.They are one of the efficient data structures for searching and insertion operations.The topmost node of binary tree is known as ROOT node  pointed by ROOT pointer.Root node has  left child/successor and  right child/successor pointed by left and right pointers(Fig 1).

Fig 1 : A Binary Tree

The left child further may or may not have successors and the right child may/may not have further successors.If nodes are split into further nodes then a tree is formed known as subtree of root.The left node of root(divided into further nodes) are collectively called as left subtree and right node(divided into further nodes) are called as right subtree(Fig 2).Every node in a binary tree has 0,1 or at most 2 successors.The node which does not have any successor is called as leaf node or terminal node.

Fig 2 : A Binary Tree

### Terms associated with Binary Tree

• Sibling -  If T is a tree and N is any node that has left child/successor S1 and right successor S2,then N is the parent of S1 and S2.S1 and S2 are called as siblings(Fig 3).Every node other than the root node has a parent.Thus all nodes  which are at same level and have same parent are known as siblings.In Fig 3, 2 and 3 are siblings and 1 is the parent of 2 and 3.

Fig 3 : Tree

• Level Number : It is the level of the node.The root node is at level 0 and its successors are at one level higher than previous i.e at level 1.The further successors of left and right child are at again one level higher i.e.at 2.Thus level number of all child nodes are +1 level of their parent level.

Fig 4 : Levels in binary tree

• Degree – It is equal to number of children that a node has.It can be 0,1 or 2 in case of binary tree.The degree of leaf nodes is 0.The degree of nodes in Fig 4 is as given below:

Fig 5 : Degree of nodes

•  In-Degree – It is the number of edges arriving at that node.The in-degree of root node is 0 as no edge arrives at root node.
• Out-degree – It is the number of edges that exits from that node.
• Leaf node – It is the terminal node having no children.
• Directed edge - Line drawn from node to any of the successor is known as directed edge.A binary tree of N nodes has exactly N-1 edges.
• Path –  A sequence of consecutive edges is known as Path.
• Ancestor nodes – All the nodes along the path from the root node to that node.
• Descendant nodes – All the nodes along the path from  that node to leaf node.
• Depth – It is the length of path from root node to node N.A binary tree of height has at least h nodes and at most 2h -1 nodes  .   For  e.g In Fig 4 height of tree is 4 and it can have  24 -1 =15 nodes at the max.

n = 2h -1
n + 1 =2h
log(n+1) = log(2h)
log2(n+1) = h

n is the number of nodes and h is the height of tree. The height of binary tree is at least n and at most log2(n+1).

Short URL: http://tinyit.cc/4ef084
May 2, 2014 |
Tags:

## Queues

Fig 3 : A Queue

Fig 2 : A Queue

Fig 1 : A Queue

Queues are the  data structure that stores data  in an ordered way.The  first element entered  is the  first one to be  removed.It is a  FIFO(First In First Out) data structure.The elements are added at one end called REAR end and removed from other end called as  FRONT end.

In Fig 2.FRONT =0 and REAR =4

Fig 2 : A Queue(FIFO)

## Array Representation of Queue

Queues can be represented by linear arrays.It has FRONT and REAR variables.FRONT points to the position from where deletion occurs and REAR points to the position where insertion occurs.

### Insertion in Queue

Before insertion,check whether the queue has the space to accommodate the new element .If the queue is full is and still insertion in tried then “OVERFLOW” condition arises and element does not get into queue.It is checked by condition if (REAR = MAX -1) where MAX is the maximum number of elements that a queue can hold.If yes,then OVERFLOW condition arises.

Program 1

```#include <stdio.h>

int main( )
{
int queue[10];
int max,num,i,item;
int front,rear;
//printf("Enter the size of queue");
scanf("%d",&max);
//printf("Enter the no of elements in queue");
scanf("%d",&num);
//printf("Enter the elements of queue");
printf("Queue before insertion : "); //Fig3
for(i=0;i<num;i++)
{
scanf("%d",&queue[i]);
printf(" %d\t",queue[i]);
}
printf("\n");
front = 0;
rear = num -1 ;
printf("Front end : %d\n",front);
printf("Rear end : %d\n",rear);
if (rear == max-1)
{
printf("Overflow");
}
else
{
//printf("Enter the item you want to insert \n");
scanf("%d",&item);
rear = rear + 1;
front = 0;
queue[rear] = item;
printf("Queue after insertion :");//Fig 4
for(i=0;i<=rear;i++)
{
printf(" %d\t",queue[i]);
}
printf("\n");
printf("Front end : %d\n",front);
printf("Rear end : %d\n",rear);
}
return 0;
}```

Say we want to insert 15 in Queue of  Fig 3.

Fig 3 : FRONT = 0
REAR = 4

REAR would get incremented by 1(as insertions are done at REAR end) and 15 is stored at value of  5th index(Fig 4).

Fig 4 : FRONT = 0
REAR =5

### Deletion in Queue

Before deleting an element check for “UNDERFLOW” condition.It occurs when we try to delete the element from an already empty queue.

If FRONT =-1 and REAR = -1 ,it means that there is no element in the queue and it is empty since index starts from 0.

If we want to delete element from the queue,then value of FRONT is incremented(Fig 6).Deletions are done at FRONT end only.

Program 2

```#include <stdio.h>
int main( )
{
int queue[10];
int max,num,i;
int front=0,rear;
//printf("Enter the size of queue");
scanf("%d",&max);
//printf("Enter the no of elements in queue");
scanf("%d",&num);
//printf("Enter the elements of queue");
printf("Queue before deletion : ");
for(i=0;i<num;i++)
{
scanf("%d",&queue[i]);
printf(" %d\t",queue[i]);
}
printf("\n");
front = 0;
rear = num -1 ;
printf("Front end : %d\n",front);
printf("Rear end : %d\n",rear);
if (front == -1 && rear == -1)
{
printf("Underflow");
}
else
{
front = front +1;
}
printf("Queue after deletion :");
for(i=front;i<=rear;i++)
{
printf(" %d\t",queue[i]);
}
printf("\n");
printf("Front end : %d\n",front);
printf("Rear end : %d\n",rear);
return 0;
}```

Say we want to delete 25 from the queue(Fig 5)

Fig 5 : FRONT = 0
REAR = 4

The value of FRONT in incremented t0 1 and 25 is deleted from the queue(Fig 6)

REAR =5

Short URL: http://tinyit.cc/044009
April 30, 2014 |
Tags:

## Stacks

Stacks are the linear data structure which stores data in an ordered way.The first element inserted is at the bottom and last element is at the top.It can be implemented using an array or a linked list.It is a LIFO(Last In First Out) data structure.This means that last element inserted will be first element to be removed incase of deletion operation(Fig 1,2).

Examples of Stack

• Stack of plates.The last plate added is the first one to be removed.
• A stack of files(Fig 1) .
• A girl wearing bangles.The last bangle on the arm is the first one to be removed.
• Luggage placed in truck.

Fig 1 : A Stack of Files

Fig 2 : A Stack

### Array Representation of Stack

In stack,the first element is at bottom(0 position) and last element at TOP position.The topmost element of stack is represented by TOP variable.MAX variable is used to store the maximum no of elements that the stack can hold(Fig 3).

• if TOP =MAX-1,then stack is full.
• if TOP = NULL,then stack is empty.

Fig 3 : A Stack

### Operations on Stack

Each function in a stack runs in O(1) time.

Fig 4 : Push & Pop Operation in Stack

### PEEP Operation

This operation  first checks whether the stack is empty or has some elements(if TOP == NULL) and returns the value of topmost element without deleting it from the stack.In Fig 10, PEEP operation will return 100 to user.

Fig 10 : A Stack

### Applications of Stack

• Converting a decimal number into binary.
• Towers of Hanoi
• Evaluation  of expression and syntax parsing.

Short URL: http://tinyit.cc/2c8820
April 10, 2014 |
Tags:

## Count number of nodes in Linked List

Counting refers to calculating number of nodes present in linked list.

Counting

Program

```#include<stdio.h>
struct node
{
int item;
struct node *next
};
int main()
{
struct node *start,*list,*temp;
int i;

start = (struct node *)malloc(sizeof(struct node));
list = start;
start->next = NULL;
for(i=1;i<5;i++)
{
list->item = i;
list->next = (struct node *)malloc(sizeof(struct node));
list = list->next;
}
list->next = NULL;
temp = start;
int length =0;
while(temp!=NULL)//Fig 2
{

if (temp->next == NULL)
{
break;
}
else
{
length++;
temp=temp->next;
}
}

printf("Length of Linked List : %d \n",length);
printf("Elements of linked lists are :\n");
while(start != NULL)
{
if (start->next == NULL)
{
break;
}
printf("%d\n",start->item);
start = start->next;
}
return 0;
}```

Illustration of Program