From 1ca0d3d62c1359e573144883169a4ceeb2920c53 Mon Sep 17 00:00:00 2001 From: lomna Date: Tue, 1 Aug 2023 21:23:02 +0530 Subject: [PATCH] Just complete TODO's now --- imgs/strongly-connected-component.svg | 14 + main.html | 611 +++++++++++++++++--------- main.org | 119 ++++- main.tsk | 4 +- 4 files changed, 538 insertions(+), 210 deletions(-) create mode 100644 imgs/strongly-connected-component.svg diff --git a/imgs/strongly-connected-component.svg b/imgs/strongly-connected-component.svg new file mode 100644 index 0000000..cfb7453 --- /dev/null +++ b/imgs/strongly-connected-component.svg @@ -0,0 +1,14 @@ + + + + + + + + + + + + + abcde + fgh \ No newline at end of file diff --git a/main.html b/main.html index b568c1e..38fc352 100644 --- a/main.html +++ b/main.html @@ -3,7 +3,7 @@ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> - + Data Structures @@ -223,108 +223,114 @@

Table of Contents

-
-

1. Stack

+
+

1. Stack

A stack is a data structure which only allows insertion and deletion from one end of the array. The insertion is always on the extreme end of the array. The deletion can only be done on the element which was most recently added. @@ -347,18 +353,21 @@ To create a stack, we will keep track of the index which is the top of th

-
-

1.1. Operation on stack

+
+

1.1. Operation on stack

A stack has two operations

+
    +
  1. Push
  2. +
  3. Pop
  4. +
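      +
      +A minimal array-backed sketch of these two operations (the struct and names here are illustrative, not from the original text):
      +
      +#define STACK_CAP 100
      +struct stack {
      +  int data[STACK_CAP];
      +  int top;                      /* index of the most recently pushed element, -1 when empty */
      +};
      +
      +void push(struct stack *s, int value){
      +  if(s->top + 1 < STACK_CAP)    /* insertion only on the extreme end */
      +    s->data[++s->top] = value;
      +}
      +
      +int pop(struct stack *s){
      +  return s->data[s->top--];     /* caller must first check s->top != -1 */
      +}
      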
- -
-

2. Direct Address Table

+
+

2. Direct Address Table

      Direct Address Tables are useful when we know that the key is within a small range. Then, we can allocate an array such that each possible key gets an index, and just add the values according to the keys. @@ -435,8 +444,8 @@ This also assumes that keys are integers
      
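      A minimal sketch of such a table (illustrative names; assumes integer keys in the range [0, M)):

      #define M 100                 /* keys are known to lie in [0, M) */
      int table[M];                 /* one slot per possible key */

      void insert(int key, int value){ table[key] = value; }  /* the key itself is the index */
      int  search(int key){ return table[key]; }              /* O(1) lookup */
      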

-
-

3. Hash Table

+
+

3. Hash Table

      When the set of possible keys is large, it is impractical to allocate a table big enough for all keys. In order to fit all possible keys into a small table, rather than directly using keys as the index for our array, we will first calculate a hash for each key using a hash function. Since we are relying on hashes for this addressing in the table, we call it a hash table. @@ -452,8 +461,8 @@ So the main purpose of the hash function is to reduce the range of array indices
      

-
-

3.1. Collision

+
+

3.1. Collision

Because we are reducing the range of indices, the hash function may hash two keys to the same slot. This is called a collision. @@ -472,8 +481,8 @@ There are two ways we will look at to resolve collision.

-
-

3.1.1. Chaining

+
+

3.1.1. Chaining

      In chaining, rather than storing values in the table slots, we keep a linked list at each slot which stores (key, value) pairs. @@ -552,8 +561,8 @@ Insertion can be done in \(\theta (1)\) time if we assume that key being inserte
      

-
-

3.1.2. Performance of chaining hash table

+
+

3.1.2. Performance of chaining hash table

      The load factor is defined as the number of elements per slot and is calculated as @@ -569,8 +578,8 @@ If we also assume that hash funtion takes constant time, then in the average cas
      

-
-

3.1.3. Open Addressing

+
+

3.1.3. Open Addressing

      In open addressing, all the (key, value) entries are stored in the table itself. Because of this, the load factor \(\left( \alpha \right)\) can never exceed 1. @@ -585,7 +594,7 @@ It is necessary to keep probe sequence fixed for any given key, so that we can s
      

    -
  1. Linear probing
    +
  2. Linear probing

    For a given ordinary hash function \(h(k)\), the linear probing uses the hash function @@ -601,7 +610,7 @@ Linear probing is easy to implement, but it suffers from primary clusterin

  3. -
  4. Quadratic probing
    +
  5. Quadratic probing

    For given auxiliary hash function \(h(k)\), the quadratic probing uses @@ -622,7 +631,7 @@ If \(quadratic\_h(k_1, 0) = quadratic\_h(k_2,0)\), then that implies that all \(

  6. -
  7. Double Hashing
    +
  8. Double Hashing

      Double hashing is one of the best available methods for open addressing.
      
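      A standard formulation of the probe sequence (a sketch; h1 and h2 here stand for assumed auxiliary hash functions, and m is the table size):

      /* i-th probe for key k: the step size comes from a second hash instead of a constant */
      size_t double_hash(size_t k, size_t i, size_t m){
        size_t h1 = k % m;               /* primary position */
        size_t h2 = 1 + (k % (m - 1));   /* step size, never zero */
        return (h1 + i * h2) % m;
      }
      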
    @@ -657,8 +666,8 @@ The number of probes on averge in a successful search is at most \(\frac{1}{\alp

-
-

3.2. Hash Functions

+
+

3.2. Hash Functions

      A good hash function will approximately satisfy simple uniform hashing, which means that any element is equally likely to be hashed to any slot. @@ -680,8 +689,8 @@ We will look at a few ways to make a hash function.
      

-
-

3.2.1. The division method

+
+

3.2.1. The division method

      In the division method, we map a key \(k\) into one of the \(m\) slots by taking the remainder of k divided by m. @@ -697,8 +706,8 @@ But there are some cases where \(m\) is chosen to be something else.
      
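      In C, this could look like the sketch below (the usual advice is to pick m prime and not too close to a power of 2):

      size_t division_hash(size_t k, size_t m){
        return k % m;   /* remainder of k divided by m */
      }
      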

-
-

3.2.2. The multiplication method

+
+

3.2.2. The multiplication method

      In the multiplication method, we first multiply the key \(k\) with a constant \(A\) which is in range \(0 < A < 1\). Then we get the fractional part of \(kA\). Then we multiply the fractional part by \(m\) and floor it to get the hash. @@ -741,8 +750,8 @@ In C language,
      

-
-

3.2.3. Mid square method

+
+

3.2.3. Mid square method

In this method, we square the keys and then we choose some digits from the middle. @@ -755,8 +764,8 @@ With huge numbers, we need to take care of overflow conditions in this method.
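      For example, if \(k = 4567\), then \(k^2 = 20857489\), and choosing the two middle digits gives \(h(k) = 57\).
      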

-
-

3.2.4. Folding method

+
+

3.2.4. Folding method

      While this method can be used on integers, it is usually used where the key is segmented, for example in arrays or when the key is a string. @@ -781,16 +790,17 @@ h(k) = 32
      

-
-

3.3. Universal Hashing

+
+

3.3. Universal Hashing

-TODO: Basics of universal hashing. +TODO : Basics of universal hashing.

-
-

3.4. Perfect Hashing

+ +
+

3.4. Perfect Hashing

NOTE: This doesn't seem to be in B.Tech syllabus, but it seems cool. @@ -800,8 +810,8 @@ TODO: Basics of universal hashing.

-
-

4. Representing rooted trees using nodes

+
+

4. Representing rooted trees using nodes

      We can represent trees using nodes. A node only stores a single element of the tree. What a node is will depend on the language being used. @@ -837,8 +847,8 @@ In languages with oop, we create node class which will store refrences to other
      

-
-

4.1. Fixed number of children

+
+

4.1. Fixed number of children

      When we know how many children any given node can have, i.e., when the number of children is bounded, we can just use references or pointers to the nodes directly. @@ -857,8 +867,8 @@ For example, if we know we are making a binary tree, then we can just store refr
      

-
-

4.2. Unbounded number of children

+
+

4.2. Unbounded number of children

      When we don't know how many children a given node will have, i.e., a node can have any number of children, we can't just use plain references. We could create an array of references to nodes, but some nodes will only have one or two children and some may have none. This will lead to a lot of wasted memory. @@ -906,8 +916,8 @@ can be represented using refrences and pointers as :
      

-
-

5. Binary Search Trees

+
+

5. Binary Search Trees

      A tree where any node can have at most two child nodes is called a binary tree. @@ -931,15 +941,15 @@ In C, we can make a binary tree as
      

-
-

5.1. Quering a BST

+
+

5.1. Quering a BST

      Some common ways in which we usually query a BST are searching for a node, the minimum & maximum nodes, and successor & predecessor nodes. We will also look at how we can get the parent node for a given node; if we already store a parent pointer, then that algorithm is unnecessary.
      

-
-

5.1.1. Searching for node

+
+

5.1.1. Searching for node

      We can search for a node very effectively with the help of the binary search tree property. The search will return the node if it is found, else it will return NULL. @@ -981,8 +991,8 @@ We can also search iteratively rather than recursively.
      

-
-

5.1.2. Minimum and maximum

+
+

5.1.2. Minimum and maximum

Finding the minimum and maximum is simple in a Binary Search Tree. The minimum element will be the leftmost node and maximum will be the rightmost node. We can get the minimum and maximum nodes by using these algorithms. @@ -1014,8 +1024,8 @@ Finding the minimum and maximum is simple in a Binary Search Tree. The minimum e

-
-

5.1.3. Find Parent Node

+
+

5.1.3. Find Parent Node

This algorithm will return the parent node. It uses a trailing node to get the parent. If the root node is given, then it will return NULL. This algorithm makes the assumption that the node is in the tree. @@ -1041,8 +1051,8 @@ This algorithm will return the parent node. It uses a trailing node to get the p

-
-

5.1.4. Is ancestor

+
+

5.1.4. Is ancestor

      This algorithm will take two nodes, ancestor and descendant. Then it will check if the ancestor node is really an ancestor of the descendant node. @@ -1068,8 +1078,8 @@ This algorithm will take two nodes, ancestor and descendant. Then it will check
      

-
-

5.1.5. Successor and predecessor

+
+

5.1.5. Successor and predecessor

      We often need to find the successor or predecessor of an element in a Binary Search Tree. The search for predecessor and successor is divided into two cases. @@ -1077,7 +1087,7 @@ We often need to find the successor or predecessor of an element in a Binary Sea
      

    -
  1. For Successor
    +
  2. For Successor
    // get successor of x
    @@ -1104,7 +1114,7 @@ We often need to find the successor or predecessor of an element in a Binary Sea
     

  3. -
  4. For Predecessor
    +
  5. For Predecessor
    struct binary_tree *
    @@ -1133,15 +1143,15 @@ We often need to find the successor or predecessor of an element in a Binary Sea
     
-
-

5.2. Inserting and Deleting nodes

+
+

5.2. Inserting and Deleting nodes

      When inserting and deleting nodes in a BST, we need to make sure that the Binary Search Tree property continues to hold. Inserting a node is easier in a binary search tree than deleting one.
      

-
-

5.2.1. Insertion

+
+

5.2.1. Insertion

      Insertion is simple in a binary search tree. We search for the node we want to insert in the tree and insert it at the first NULL spot we find. @@ -1201,8 +1211,8 @@ The algorithm for iterative insertion is
      

-
-

5.2.2. Deletion

+
+

5.2.2. Deletion

Deletion in Binary Search Trees is tricky because we need to delete nodes in a way that the property of the Binary Search Tree holds after the deletion of the node. So we first have to remove the node from the tree before we can free it. @@ -1212,7 +1222,7 @@ TODO : Write four cases of node deletion here

    -
  1. Implementation in code
    +
  2. Implementation in code

    We also use a helper function called Replace Child for deletion of node. This function will simply take parent node, old child node and new child node and replace old child with new child. @@ -1300,8 +1310,8 @@ Now we can make a delete node function which will remove the node, reattach the

-
-

5.3. Performance of BST

+
+

5.3. Performance of BST

      The performance of the search operation depends on the height of the tree. If the tree has \(n\) elements, the height of a binary tree can be between \(\lfloor 1 + log_2(n) \rfloor\) (fully balanced) and \(n\) (fully degenerate). @@ -1320,16 +1330,16 @@ A balanced binary search tree in worst case for any operation will take \(\theta
      
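      For example, a balanced tree with \(n = 10^6\) nodes has height about \(20\), while a fully degenerate (linked-list-like) tree has height \(10^6\).
      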

-
-

5.4. Traversing a Binary Tree

+
+

5.4. Traversing a Binary Tree

      There are three ways to traverse a binary tree: inorder tree walk, preorder tree walk and postorder tree walk. All three algorithms take \(\theta (n)\) time to traverse the \(n\) nodes.
      

-
-

5.4.1. Inorder tree walk

+
+

5.4.1. Inorder tree walk

      This algorithm is named so because it first traverses the left sub-tree recursively, then visits the node, and then traverses the right sub-tree recursively. @@ -1356,8 +1366,8 @@ This algorithm is named so because it first traverses the left sub-tree recursiv
      

-
-

5.4.2. Preorder tree walk

+
+

5.4.2. Preorder tree walk

      This algorithm is called the preorder algorithm because it first traverses the current node, then recursively traverses the left sub-tree and then recursively traverses the right sub-tree. @@ -1382,8 +1392,8 @@ This algorithm is called preorder algorithm because it will first traverse the c
      

-
-

5.4.3. Postorder tree walk

+
+

5.4.3. Postorder tree walk

      In this algorithm, we first traverse the left sub-tree recursively, then the right sub-tree recursively, and finally the node. @@ -1410,8 +1420,8 @@ In this algorithm, we first traverse the left sub-tree recursively, then the rig
      

-
-

6. Binary Heap

+
+

6. Binary Heap

      A heap is a data structure represented as a complete tree which follows the heap property. All levels in a heap tree are completely filled except possibly the last one, which is filled from left to right. @@ -1424,8 +1434,8 @@ The heap data structure is used to implement priority queues. In many cas
      

-
-

6.1. Heap Property

+
+

6.1. Heap Property

Heaps are of two types @@ -1444,8 +1454,8 @@ The heap property is different for min-heaps and max-heaps.

-
-

6.2. Shape of Heap

+
+

6.2. Shape of Heap

      Also referred to as the shape property of the heap. @@ -1454,8 +1464,8 @@ A heap is represented as a complete tree. A complete tree is one where all the l
      

-
-

6.3. Array implementation

+
+

6.3. Array implementation

      We can implement a binary heap using arrays. The root of the tree is the first element of the array. The next two elements are the elements of the second level of the tree and children of the root node. Similarly, the next four elements are the elements of the third level of the tree, and so on. @@ -1482,15 +1492,15 @@ In C, we can create a heap struct for easier implementation of algorithms
      

-
-

6.4. Operations on heaps

+
+

6.4. Operations on heaps

      Both insertion and deletion in a heap must be done in a way which conforms to the heap property as well as the shape property of the heap. Before we can look at insertion and deletion, we need a way to find the parent and children for a given index. We will also first see the up-heapify and down-heapify functions.
      

-
-

6.4.1. Parent and child indices

+
+

6.4.1. Parent and child indices

In a binary heap, we can find parent and children for any given index using simple formulas. @@ -1510,8 +1520,8 @@ In a binary heap, we can find parent and children for any given index using simp

-
-

6.4.2. Down-heapify

+
+

6.4.2. Down-heapify

      Down-heapify is a function which can re-heapify an array if no element of the heap violates the heap property other than the element at index and its two children. @@ -1549,8 +1559,8 @@ Since we shift element downwards, this operation is often called down-heap
      

-
-

6.4.3. Up-heapify

+
+

6.4.3. Up-heapify

      Up-heapify is a function which can re-heapify an array if no element of the heap violates the heap property other than the element at index and its parent. @@ -1582,13 +1592,13 @@ This function runs in \(\theta (log_2n)\) time. The algorithm for this works as Since we shift element upwards, this operation is often called up-heap operation. It is also known as trickle-up, swim-up, heapify-up, or cascade-up
      

      -TODO : Maybe up-heapfiy funtion should be made cleaner rather than trying to mirror down-heapify funtion.
      +TODO : Maybe the up-heapify function should be made cleaner rather than trying to mirror the down-heapify function.
      

-
-

6.4.4. Insertion

+
+

6.4.4. Insertion

      Insertion takes \(\theta (log_2n)\) time in a binary heap. To insert an element in the heap, we add it to the end of the heap and then apply the up-heapify operation on that element @@ -1613,8 +1623,8 @@ The code shows example of insertion in a max-heap.
      

-
-

6.4.5. Deletion or Extraction

+
+

6.4.5. Deletion or Extraction

Like insertion, extraction also takes \(\theta (log_2n)\) time. Extraction from heap will extract the root element of the heap. We can use the down-heapify function in order to re-heapify after extracting the root node. @@ -1643,8 +1653,8 @@ The code shows example of extraction in max-heap.

-
-

6.4.6. Insert then extract

+
+

6.4.6. Insert then extract

      Inserting an element and then extracting from the heap can be done more efficiently than simply calling these functions separately as defined previously. If we call both functions we defined above, we have to do an up-heap operation followed by a down-heap. Instead, there is a way to do just a single down-heap. @@ -1681,16 +1691,16 @@ In python, this is implemented by the name of heap replace.
      

-
-

6.4.7. Searching

+
+

6.4.7. Searching

      Searching for an arbitrary element takes linear time in a heap. We use linear search to look for the element in the array.
      

-
-

6.4.8. Deleting arbitray element

+
+

6.4.8. Deleting arbitray element

For a max-heap, deleting an arbitrary element is done as follows @@ -1702,8 +1712,8 @@ For a max-heap, deleting an arbitrary element is done as follows
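      One common way to implement this is sketched below (assumes the heap struct mentioned earlier in this section; the field and function names here are illustrative, not from the original text):

      /* delete the element at position i of a max-heap */
      void heap_delete(struct heap *h, size_t i){
        h->arr[i] = h->arr[h->len - 1];  /* overwrite with the last element */
        h->len = h->len - 1;             /* shrink the heap */
        up_heapify(h, i);                /* the moved element may be too big for this spot... */
        down_heapify(h, i);              /* ...or too small; at most one of the two does any work */
      }
      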

-
-

6.4.9. Decrease and increase keys

+
+

6.4.9. Decrease and increase keys

      TODO : I don't know if it is necessary to do this operation. It looks simple to implement. @@ -1711,8 +1721,8 @@
      

-
-

6.5. Building a heap from array

+
+

6.5. Building a heap from array

We can convert a normal array into a heap using the down-heapify operation in linear time \(\left( \theta (n) \right)\) @@ -1736,15 +1746,15 @@ If we are using a one indexed language, then range of for loop is

-
-

7. Graphs

+
+

7. Graphs

A graph is a data structure which consists of nodes/vertices, and edges. We sometimes write it as \(G=(V,E)\), where \(V\) is the set of vertices and \(E\) is the set of edges. When we are working on runtime of algorithms related to graphs, we represent runtime in two input sizes. \(|V|\) which we simply write as \(V\) is the number of vertices and similarly \(E\) is the number of edges.

-
-

7.1. Representing graphs

+
+

7.1. Representing graphs

We need a way to represent graphs in computers and to search a graph. Searching a graph means to systematically follow edges of graphs in order to reach vertices. @@ -1752,10 +1762,13 @@ We need a way to represent graphs in computers and to search a graph. Searching
      The two common ways of representing graphs are adjacency lists and adjacency matrices. Either can represent both directed and undirected graphs.
      

-
-
-

7.1.1. Adjacency List

+

+TODO : add images to show how it is represented +

+
+
+

7.1.1. Adjacency List

Every node in the graph is represented by a linked list. The list contains the nodes to which the list node is connected by an edge. @@ -1779,8 +1792,8 @@ The adjacency list representation is very robust and can represent various types

-
-

7.1.2. Adjacency Matrix

+
+

7.1.2. Adjacency Matrix

      We use a single matrix to represent the graph. The size of the matrix is \(\left( |V| \times |V| \right)\). When we make the matrix, all its elements are zero, i.e., the matrix is zero initialized. @@ -1814,8 +1827,8 @@ We can store weighted graphs in adjacency matrix by storing the weights along wi
      

-
-

7.2. Vertex and edge attributes

+
+

7.2. Vertex and edge attributes

      Many times we have to store attributes with either vertices or edges, or sometimes both. How this is done differs by language. In notation, we will write it using a dot (.) @@ -1827,8 +1840,8 @@ Similarly, the attribute x of edge (u , v) will be denoted as (u , v).x
      

-
-

7.3. Density of graph

+
+

7.3. Density of graph

Knowing the density of a graph can help us choose the way in which we represent our graph. @@ -1851,8 +1864,8 @@ Therefore, maximum density for a graph is 1. The minimum density for a graph is Knowing this, we can say graph with low density is a sparse graph and graph with high density is a dense graph.

-
-

7.3.1. Which representation to use

+
+

7.3.1. Which representation to use

      For a quick approximation, when the graph is undirected and \(2|E|\) is close to \(|V|^2\), we say that the graph is dense; else we say it is sparse. @@ -1868,8 +1881,8 @@ Another criteria is how algorithm will use the graph. If we want to traverse to
      

-
-

7.4. Searching Graphs

+
+

7.4. Searching Graphs

Graph search (or graph traversal) algorithms are used to explore a graph to find nodes and edges. Vertices not connected by edges are not explored by such algorithms. These algorithms start at a source vertex and traverse as much of the connected graph as possible. @@ -1878,8 +1891,8 @@ Graph search (or graph traversal) algorithms are used to explore a graph to find Searching graphs algorithm can also be used on trees, because trees are also graphs.

-
-

7.4.1. Breadth first search

+
+

7.4.1. Breadth first search

BFS is one of the simplest algorithms for searching a graph and is used as an archetype for many other graph algorithms. This algorithm works well with the adjacency list representation. @@ -1922,8 +1935,8 @@ For an input graph \(G=(V,E)\), every node is enqued only once and hence, dequeu

-
-

7.4.2. Breadth-first trees for shortest path

+
+

7.4.2. Breadth-first trees for shortest path

For a simple graph, we may want to get the shortest path between two nodes. This can be done by making a Breadth-first tree. @@ -1980,30 +1993,36 @@ This will print shortest path from end node to start node.

-
-

7.4.3. Depth first search

+
+

7.4.3. Depth first search

      Unlike BFS, depth first search is more biased towards the farthest nodes of a graph. It follows a single path till it reaches the end of the path. After that, it backtracks to the last open path and follows that one. This process is repeated till all nodes are covered.
      -
      -
      
-
+

+ +

Implementation of DFS is very similar to BFS with two differences. Rather than using a queue, we use a stack. In BFS, the explored nodes are added to the queue, but in DFS we will add unexplored nodes to the stack.

+

      +Also, in DFS, nodes are accessed twice: first when they are discovered, and then when they are backtracked to and considered finished.
      

DFS(graph_type graph, node_type start){
   stack_type stack;
   stack.push(start);
   while(stack.len != 0){
     node_type v = stack.pop();
-    if(v.explored == false){
-      v.explored = true;
+    if(v.discovered == false){
+      v.discovered = true;
 
            node_list adjacency_list = graph.adj_list(v); // fix: explore neighbours of the popped node v, not of start
      
       while(adjacency_list != NULL){
         stack.push(adjacency_list.node);
         adjacency_list = adjacency_list.next;
       }
+
      +      v.finished = true; // simplification: strictly, a node finishes only after all its descendants are explored
      
     }
   }
 }
@@ -2042,23 +2061,215 @@ For an input graph \(G=(V,E)\), the time complexity for Depth first search is \(
 
-
-

7.4.4. Properties of DFS

+
+

7.4.4. Properties of DFS

-DFS is very useful to understand the structure of a graph. To understand the +DFS is very useful to understand the structure of a graph. To study the structure of a graph using DFS, we will get two attributes of each node using DFS. We suppose that each step in traversal takes a unit of time. +

+
    +
  • Discovery time : The time when we first discovered the node. We will set this at the time we push node to stack. We will denote it as node.d
  • +
  • Finishing time : The time when we explored the node. We will set this when we pop the node and explore it. We will denote it as node.f
  • +
+

      +So our function will become
      

+
+
// call start node with time = NULL
+DFS(graph_type graph, node_type node, size_t *time){
+  node.discovered = true;
+  // if time is NULL, initialize it
+  if(time == NULL){
+    size_t initial_time = 0;
+    time = &initial_time;
+  }
+
+  (*time) = (*time) + 1;
+  node.d = (*time);
+
+  node_list adjacency_list = graph.adj_list(node);
+  while(adjacency_list != NULL){
+    node_type u = adjacency_list.node;
+    if(u.discovered == false)
+      DFS(graph, u, time);
+    adjacency_list = adjacency_list.next;
+  }
+
+  (*time) = (*time) + 1;
+  node.f = (*time);
+}
+
+
+ +

      +This algorithm will give all nodes the (node.d) and (node.f) attributes. Similar to BFS, we can create a tree from DFS. Knowing these attributes can tell us properties of this DFS tree.
      

+
+ +
    +
  1. Parenthesis theorem
    +
    +

      +The parenthesis theorem is used to find the relationship between two nodes in the Depth First Search Tree.
      
    +For any two given nodes \(x\) and \(y\). +

    +
      +
    • If range \([x.d, x.f]\) is completely within \([y.d, y.f]\), then \(x\) is a descendant of \(y\).
    • +
      • If ranges \([x.d, x.f]\) and \([y.d, y.f]\) are completely disjoint, then neither is a descendant nor an ancestor of the other.
    43. 
    • +
    +

      +So if node \(y\) is a proper descendant of node \(x\) in the depth-first tree, then
      +\[ \text{x is ancestor of y} : x.d < y.d < y.f < x.f \]
      +(A small code sketch of this interval test appears after this list.)
      

    +
    +
  2. +
  3. White path theorem
    +
    +

      +If \(y\) is a descendant of \(x\) in graph G, then at time \(t = x.d\), the path from \(x\) to \(y\) was undiscovered.
      

    + +

      +That is, all the nodes in the path from \(x\) to \(y\) were undiscovered. Undiscovered nodes are shown by white vertices in visual representations of DFS, therefore this theorem was named the white path theorem.
      

    +
    +
  4. +
  5. Classification of edges
    +
    +

    +We can arrange the connected nodes of a graph into the form of a Depth-first tree. When the graph is arranged in this way, the edges can be classified into four types +

    +
      +
    1. Tree edge : The edges of graph which become the edges of the depth-first tree.
    2. +
      3. Back edge : The edges of graph which point from a descendant node to an ancestor node of the depth-first tree. They are called back edges because they point backwards to the root of the tree, opposite to all tree edges.
    44. 
    4. +
      5. Forward edge : The edges of graph which point from an ancestor node to a descendant node.
    45. 
    6. +
      7. Cross edge : An edge of graph which connects two nodes such that neither is an ancestor of the other, i.e., it goes between two different subtrees of the depth-first tree.
    46. 
    8. +
    +

    +The back edge, forward edge and cross edge are not a part of the depth-first tree but a part of the original graph. +

    +
      +
    • In an undirected graph G, every edge is either a tree edge or a back edge.
    • +
    +
    +
  6. +
+
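      +
      +Using the interval view from the parenthesis theorem above, ancestry can be tested directly from the two timestamps. A small sketch with the same pseudotypes as the DFS code (illustrative, not from the original text):
      +
      +/* true when x is a proper ancestor of y: [y.d, y.f] nests inside [x.d, x.f] */
      +bool is_ancestor(node_type x, node_type y){
      +  return x.d < y.d && y.f < x.f;
      +}
      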
+
+

7.4.5. Depth-first and Breadth-first Forests

+
+

      +In directed graphs, the depth-first and breadth-first algorithms can't traverse to nodes which are not connected by a directed edge. This can leave parts of the graph not mapped by a single tree.
      

+ +

      +These trees can help us better understand the graph and get properties of its nodes, so we can't leave them out when converting a graph to trees.
      +
      +To solve this, we have a collection of trees for the graph. This collection of trees will cover all the nodes of the graph and is called a forest. The forest of graph \(G\) is represented by \(G_{\pi}\).
      

+ +

      +Thus, when using DFS or BFS on a graph, we store this collection of trees, i.e., the forest, so that we can get properties of all the nodes.
      

+ +
    +
      • NOTE : When making a depth-first forest, we don't reset the time when going from one tree to another. So if the finishing time for the root of a tree is \(t\), the discovery time of the root node of the next tree will be \((t+1)\).
    47. 
  • +
+
+
+
+

7.4.6. Topological sort using DFS

+
+

      +Topological sorting can only be done on directed acyclic graphs. A topological sort is a linear ordering of the nodes of a directed acyclic graph (dag). It orders the nodes such that all the edges point right (from earlier to later in the ordering).
      

+ +

+Topological sorting is used on precedence graphs to tell which node will have higher precedence. +

+ +

      +To topologically sort, we first call DFS to calculate the finishing time for all the nodes in the graph and form a depth-first forest. Then, we can just sort the finishing times of the nodes in descending order, as sketched below.
      
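      +
      +A sketch of this in the same style as the DFS code above (pseudotypes; assumes the recursive DFS from the previous section sets node.f):
      +
      +topological_sort(graph_type graph){
      +  stack_type order;                  /* nodes in order of finishing time */
      +  size_t time = 0;
      +  for(each node n in graph){         /* covers the whole depth-first forest */
      +    if(n.discovered == false)
      +      DFS(graph, n, &time);          /* assume DFS also pushes n onto order when it sets n.f */
      +  }
      +  while(order.len != 0)
      +    print(order.pop());              /* decreasing finishing time = topological order */
      +}
      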

+ +

+TODO : Add image to show process of topological sorting +

+ +
    +
  • A directed graph \(G\) is acyclic if and only if the depth-first forest has no back edges.
  • +
+
+
+
+
+

7.5. Strongly connected components

+
+

+If we can traverse from a node \(x\) to node \(y\) in a directed graph, we show it as \(x \rightsquigarrow y\). +

+ +
    +
      • A pair of nodes \(x\) and \(y\) is called strongly connected if \(x \rightsquigarrow y\) and \(y \rightsquigarrow x\)
    48. 
  • +
  • A graph is said to be strongly connected if all pairs of nodes are strongly connected in the graph.
  • +
      • If a graph is not strongly connected, we can divide the graph into subgraphs made from neighbouring nodes which are strongly connected. These subgraphs are called strongly connected components.
    49. 
  • +
+ +

      +For example, the dotted regions are the strongly connected components (SCC) of the graph.
      

+ + +
+

strongly-connected-component.svg +

+
+ +
+

7.5.1. Finding strongly connected components

+
+

+We can find the strongly connected components of a graph \(G\) using DFS. The algorithm is called Kosaraju's algorithm. +

+ +

+For this algorithm, we also need the transpose of graph \(G\). The transpose of graph \(G\) is denoted by \(G^T\) and is the graph with the direction of all the edges flipped. So all edges from \(x\) to \(y\) in \(G\), will go from \(y\) to \(x\) in \(G^T\). +

+ +

      +The algorithm uses the property that the transpose of a graph has the same SCCs as the original graph.
      

+ +

+The algorithm works as follows +

+
    +
      • Step 1 : Perform DFS on the graph to compute the finishing time of all vertices. When a node finishes, push it to a stack.
    50. 
  • +
  • Step 2 : Find the transpose of the input graph. The transpose of graph is graph with same vertices, but the edges are flipped.
  • +
      • Step 3 : Pop a node from the stack and apply DFS on it in the transposed graph. All nodes traversed by this DFS will be part of one SCC. After the first SCC is found, keep popping nodes from the stack till we get an undiscovered node, then apply DFS on that node to get the next SCC. Repeat this process till the stack is empty.
    51. 
  • +
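      +
      +A compact sketch of these three steps, using the same pseudotypes as the DFS code (transpose and the helper calls here are assumed, not defined in the original text):
      +
      +kosaraju(graph_type graph){
      +  stack_type finished;
      +  dfs_forest(graph, &finished);       /* Step 1: push each node when it finishes */
      +  graph_type gt = transpose(graph);   /* Step 2: flip every edge */
      +  while(finished.len != 0){           /* Step 3: scan by decreasing finishing time */
      +    node_type v = finished.pop();
      +    if(v.discovered == false)         /* discovered flags reset for this second pass */
      +      dfs_print_component(gt, v);     /* every node reached here is one SCC */
      +  }
      +}
      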
+

      +For example, consider the graph
      

+
    +
  • Step 1 : we start DFS at node \(1\), push nodes to a stack when they are finished
  • +
  • Step 2 : Find transpose of the graph
  • +
      • Step 3 : pop nodes from the stack till we find one which is undiscovered, then apply DFS to it. In our example, the first node is \(1\)
    52. 
  • +
+ +

+TODO : Add images for this +

-
-

7.4.5. Topological sort using DFS

Author: Anmol Nawani

-

Created: 2023-07-30 Sun 18:20

+

Created: 2023-08-01 Tue 21:22

Validate

diff --git a/main.org b/main.org index 0403aa7..1c5ec66 100644 --- a/main.org +++ b/main.org @@ -12,7 +12,8 @@ To create a stack, we will keep track of the index which is the *top* of the arr ** Operation on stack A stack has two operations - +1. Push +2. Pop * Direct Address Table Direct Address Tables are useful when we know that key is within a small range. Then, we can allocate an array such that each possible key gets an index and just add the values according to the keys. \\ @@ -306,7 +307,8 @@ h(k) = 532 mod 100 h(k) = 32 ** Universal Hashing -TODO: Basics of universal hashing. +TODO : Basics of universal hashing. + ** Perfect Hashing *NOTE*: This doesn't seem to be in B.Tech syllabus, but it seems cool. \\ @@ -852,7 +854,7 @@ This function runs in $\theta (log_2n)$ time. The algorithm for this works as fo Since we shift element upwards, this operation is often called /up-heap/ operation. It is also known as /trickle-up, swim-up, heapify-up, or cascade-up/ \\ \\ -*TODO* : Maybe up-heapfiy funtion should be made cleaner rather than trying to mirror down-heapify funtion. +TODO : Maybe up-heapfiy funtion should be made cleaner rather than trying to mirror down-heapify funtion. *** Insertion Insertion takes $\theta (log_2n)$ time in a binary heap. To insert and element in heap, we will add it to the end of the heap and then apply up-heapify operation of the elment @@ -953,6 +955,7 @@ We need a way to represent graphs in computers and to search a graph. Searching \\ The two common ways of representing graphs are either using adjacency lists and adjacency matrix. Either can represent both directed and undirected graphs. +TODO : add images to show how it is represented *** Adjacency List Every node in the graph is represented by a linked list. The list contains the nodes to which the list node is connected by an edge. \\ @@ -1112,24 +1115,26 @@ Therefore, we can get the shortest path now as follows This will print shortest path from end node to start node. *** Depth first search Unlike BFS, depth first search is more biased towards the farthest nodes of a graph. It follows a single path till it reaches the end of a path. After that, it back tracks to the last open path and follows that one. This process is repeated till all nodes are covered. -\\ -\\ + Implementation of DFS is very similar to BFS with two differences. Rather than using a queue, we use a *stack*. In BFS, the explored nodes are added to the queue, but in DFS we will add unexplored nodes to the stack. +Also in DFS, nodes are accessed two times, first when they are discovered and then when they are backtracked to and are considered finished. #+BEGIN_SRC c DFS(graph_type graph, node_type start){ stack_type stack; stack.push(start); while(stack.len != 0){ node_type v = stack.pop(); - if(v.explored == false){ - v.explored = true; + if(v.discovered == false){ + v.discovered = true; node_list adjacency_list = graph.adj_list(start); while(adjacency_list != NULL){ stack.push(adjacency_list.node); adjacency_list = adjacency_list.next; } + + v.finished = true; } } } @@ -1157,5 +1162,103 @@ For an input graph $G=(V,E)$, the time complexity for Depth first search is $\th \[ \text{Time complexity of DFS : } \theta (V + E) \] *** Properties of DFS -DFS is very useful to understand the structure of a graph. To understand the +DFS is very useful to */understand the structure of a graph/*. To study the structure of a graph using DFS, we will get two attributes of each node using DFS. 
      We suppose that each step in traversal takes a unit of time.
      ++ *Discovery time* : The time when we first discovered the node. We will set this when we first reach the node. We will denote it as node.d
      ++ *Finishing time* : The time when we finished exploring the node. We will set this when we backtrack from the node. We will denote it as node.f
      +So our function will become
      +#+BEGIN_SRC c
      +  // call start node with time = NULL
      +  DFS(graph_type graph, node_type node, size_t *time){
      +    node.discovered = true;
      +    // if time is NULL, initialize it
      +    if(time == NULL){
      +      size_t initial_time = 0;
      +      time = &initial_time;
      +    }
      +
      +    (*time) = (*time) + 1;
      +    node.d = (*time);
      +
      +    node_list adjacency_list = graph.adj_list(node);
      +    while(adjacency_list != NULL){
      +      node_type u = adjacency_list.node;
      +      if(u.discovered == false)
      +        DFS(graph, u, time);
      +      adjacency_list = adjacency_list.next;
      +    }
      +
      +    (*time) = (*time) + 1;
      +    node.f = (*time);
      +  }
      +#+END_SRC
      +
      +This algorithm will give all nodes the (node.d) and (node.f) attributes. *Similar to BFS, we can create a tree from DFS.* Knowing these attributes can tell us properties of this DFS tree.
      +
      +**** *Parenthesis theorem*
      +The parenthesis theorem is used to find the relationship between two nodes in the *Depth First Search Tree*.
      +\\
      +For any two given nodes $x$ and $y$:
      ++ If range $[x.d, x.f]$ is completely within $[y.d, y.f]$, then $x$ is a descendant of $y$.
      ++ If ranges $[x.d, x.f]$ and $[y.d, y.f]$ are completely disjoint, then neither is a descendant nor an ancestor of the other.
      +So if node $y$ is a proper descendant of node $x$ in the depth first tree, then
      +\[ \text{x is ancestor of y} : x.d < y.d < y.f < x.f \]
      +**** *White path theorem*
      +If $y$ is a descendant of $x$ in graph G, then at time $t = x.d$, the path from $x$ to $y$ was undiscovered.
      +
      +That is, all the nodes in the path from $x$ to $y$ were undiscovered. Undiscovered nodes are shown by white vertices in visual representations of DFS, therefore this theorem was named the white path theorem.
      +**** *Classification of edges*
      +We can arrange the connected nodes of a graph into the form of a Depth-first tree. When the graph is arranged in this way, the edges can be classified into four types
      +1. Tree edge : The edges of graph which become the edges of the depth-first tree.
      +2. Back edge : The edges of graph which point from a descendant node to an ancestor node of the depth-first tree. They are called back edges because they point backwards to the root of the tree, opposite to all tree edges.
      +3. Forward edge : The edges of graph which point from an ancestor node to a descendant node.
      +4. Cross edge : An edge of graph which connects two nodes such that neither is an ancestor of the other, i.e., it goes between two different subtrees of the depth-first tree.
      +The back edge, forward edge and cross edge are not a part of the depth-first tree but a part of the original graph.
      ++ In an *undirected graph* G, every edge is either a *tree edge or a back edge*.
      +*** Depth-first and Breadth-first Forests
      +In directed graphs, the depth-first and breadth-first algorithms *can't traverse to nodes which are not connected by a directed edge*. This can leave parts of the graph not mapped by a single tree.
      +
      +These trees can help us better understand the graph and get properties of its nodes, so we can't leave them out when converting a graph to trees.
      +\\
      +To solve this, we have a /*collection of trees for the graph*/. This collection of trees will cover all the nodes of the graph and is called a *forest*. The forest of graph $G$ is represented by $G_{\pi}$.
      
      +
      +Thus, when using DFS or BFS on a graph, we store this collection of trees, i.e., the forest, so that we can get properties of all the nodes.
      +
      ++ *NOTE* : When making a depth-first forest, we *don't reset the time* when going from one tree to another. So if the finishing time for the root of a tree is $t$, the discovery time of the root node of the next tree will be $(t+1)$.
       *** Topological sort using DFS
      +Topological sorting can only be done on *directed acyclic graphs*. A topological sort is a linear ordering of the nodes of a directed acyclic graph (dag). It orders the nodes such that all *the edges point right*.
      +
      +Topological sorting is used on *precedence graphs* to tell which node will have higher precedence.
      +
      +To topologically sort, we first call DFS to calculate the finishing time for all the nodes in the graph and form a depth-first forest. Then, we can just sort the finishing times of the nodes in descending order.
      +
      +TODO : Add image to show process of topological sorting
      +
      ++ A directed graph $G$ is *acyclic if and only if* the depth-first forest has *no back edges*.
      +** Strongly connected components
      +If we can traverse from a node $x$ to node $y$ in a directed graph, we show it as $x \rightsquigarrow y$.
      +
      ++ A pair of nodes $x$ and $y$ is called *strongly connected* if $x \rightsquigarrow y$ and $y \rightsquigarrow x$
      ++ A graph is said to be strongly connected if all pairs of nodes are strongly connected in the graph.
      ++ If a graph is not strongly connected, we can divide the graph into subgraphs made from neighbouring nodes which are strongly connected. These subgraphs are called *strongly connected components*.
      +
      +For example, the dotted regions are the strongly connected components (SCC) of the graph.
      +
      +[[./imgs/strongly-connected-component.svg]]
      +
      +*** Finding strongly connected components
      +We can find the strongly connected components of a graph $G$ using DFS. The algorithm is called Kosaraju's algorithm.
      +
      +For this algorithm, we also need the transpose of graph $G$. The transpose of graph $G$ is denoted by $G^T$ and is the graph with the direction of all the edges flipped. So all edges from $x$ to $y$ in $G$ will go from $y$ to $x$ in $G^T$.
      +
      +The algorithm uses the property that the transpose of a graph has the same SCCs as the original graph.
      +
      +The algorithm works as follows
      ++ *Step 1* : Perform DFS on the graph to compute the finishing time of all vertices. When a node finishes, push it to a stack.
      ++ *Step 2* : Find the transpose of the input graph. The transpose is a graph with the same vertices, but the edges are flipped.
      ++ *Step 3* : Pop a node from the stack and apply DFS on it in the transposed graph. All nodes traversed by this DFS will be part of one SCC. After the first SCC is found, keep popping nodes from the stack till we get an undiscovered node, then apply DFS on that node to get the next SCC. Repeat this process till the stack is empty.
      +For example, consider the graph
      ++ Step 1 : we start DFS at node $1$, push nodes to a stack when they are finished
      ++ Step 2 : Find the transpose of the graph
      ++ Step 3 : pop nodes from the stack till we find one which is undiscovered, then apply DFS to it. In our example, the first node is $1$
      +
      +TODO : Add images for this
      diff --git a/main.tsk b/main.tsk
      index d48896f..3d6f3de 100644
      --- a/main.tsk
      +++ b/main.tsk
      @@ -1,7 +1,7 @@
      -*Export to HTML
      +* Export to HTML
       #do emacs --script src/export.el
      -*Remove intermediate
      +* Remove intermediate
       #do rm main.html~
      