Final commit for now

TODO :
1. Add images to the notes for topics mentioned in main.org
2. Notes on stack, queue and linked list (or maybe these structures are
too elementary and don't need notes)
main
lomna 1 year ago
parent 1ca0d3d62c
commit f6f51d0ff9

@@ -1,3 +0,0 @@
# basic_data_structures
Basic data structures

@@ -0,0 +1,4 @@
* Basic data structures
Notes for basic data structures taught in B.Tech.
Currently does not include stacks, queues and linked lists.

File diff suppressed because it is too large

@@ -307,12 +307,48 @@ h(k) = 532 mod 100
h(k) = 32
** Universal Hashing
Suppose a malicious adversary who knows our hash function chooses the keys that are to be hashed. He can choose keys that all hash to the same slot, degrading the performance of our hash table to $\theta (n)$.
Fixed hash functions are vulnerable to such attacks. To prevent this from happening, we create a class of functions from which one function is chosen randomly, in a way that is independent of the keys, i.e., any function can be chosen for any key. This is called *universal hashing*.
The randomization of the chosen hash function almost guarantees that we won't get the worst case behaviour. The hash function is /*not changed every time we do an insert or delete operation.*/ Changing the hash function after each operation would not allow us to look up elements in optimal time. We only change to another hash function when we do rehashing.
*** Rehashing
When we need to increase the size of the hash table or change the hash function, we have to do rehashing.
Rehashing is the process of taking all the entries in a hash table, reapplying the hash function (possibly a new hash function), and adding the entries to a new hash table, whose size is usually greater than that of the previous hash table.
Rehashing is usually done when the load factor increases to the point that it affects performance.
\\ \\
In universal hashing, we will change the hash function each time we rehash the hash table.
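Below is a minimal sketch of the re-insertion step in C, using a toy table of non-negative integers with linear probing. The table layout, the probing scheme, and all names here are illustrative assumptions, not the notes' own implementation; capacity checks are omitted.
#+BEGIN_SRC c
#include <stdio.h>
#include <stdlib.h>

#define EMPTY -1   /* sentinel for an unused slot (keys assumed non-negative) */

/* A toy hash table of non-negative ints using linear probing (illustrative only). */
typedef struct {
    int *slots;
    int  m;      /* number of slots */
    int  n;      /* number of stored keys */
} table;

static int hash(int key, int m) { return key % m; }

static table *table_new(int m) {
    table *t = malloc(sizeof(table));
    t->slots = malloc(m * sizeof(int));
    t->m = m;
    t->n = 0;
    for (int i = 0; i < m; i++) t->slots[i] = EMPTY;
    return t;
}

static void insert(table *t, int key) {
    int i = hash(key, t->m);
    while (t->slots[i] != EMPTY)      /* linear probing; table assumed not full */
        i = (i + 1) % t->m;
    t->slots[i] = key;
    t->n++;
}

/* Rehashing: build a bigger table and re-insert every stored key, re-applying the
   hash function for the new slot count (with universal hashing we would also draw
   new random hash parameters here). */
static table *rehash(table *old) {
    table *bigger = table_new(2 * old->m);
    for (int i = 0; i < old->m; i++)
        if (old->slots[i] != EMPTY)
            insert(bigger, old->slots[i]);
    free(old->slots);
    free(old);
    return bigger;
}

int main(void) {
    table *t = table_new(4);
    int keys[] = {5, 12, 7};
    for (int i = 0; i < 3; i++) insert(t, keys[i]);
    /* load factor is now 3/4 = 0.75; suppose that is our rehash threshold */
    t = rehash(t);
    printf("slots after rehash: %d, keys stored: %d\n", t->m, t->n);
    return 0;
}
#+END_SRC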
*** Universal family
For universal hashing, the set of hash functions used is called a *universal family*.
A set of hash functions $H$ is called a universal family if, for every pair of distinct keys $(x,y)$, *the number of functions in the set for which $h(x) = h(y)$ is at most $(|H| \div m)$*.
In other words, *the probability of collision between any two distinct keys $(x,y)$ is at most $(1/m)$* if the hash function is chosen at random from the universal family.
Here, $m$ is the number of slots in the hash table.
\\
Sometimes, a universal family may simply be called a universal set (or class) of hash functions.
*** Performance of universal hashing
For any hash function $h$ chosen from the universal family, the probability of collision between two distinct keys is at most $(1/m)$.
\\
Using this, we can show that when using chaining, the expected (or average) length of each list in the hash table is at most $(1 + \alpha)$.
\\
Here, $\alpha$ is the load factor of the hash table.
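For example, suppose the table has $m = 100$ slots and currently stores $n = 150$ keys (the numbers are only an illustration). Then
\[ \alpha = \frac{n}{m} = \frac{150}{100} = 1.5 \]
so the expected list length, and hence the expected cost of a search, is about $1 + \alpha = 2.5$.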
*** Example of a universal set of hash functions
Suppose we have a set of keys $\{ 0,1,2,...,k \}$. We choose a prime number $p > k$.
Then we can define a hash function
\[ h_{ab}(k) = \left( (ak + b)\ mod\ p \right) \ mod\ m \]
And the universal family is
\[ H = \{ h_{ab} : a \in \{ 1,2,...,(p-1) \} \ and \ b \in \{ 0,1,...,(p-1) \} \} \]
This class of hash functions maps the set $\{ 0,1,2,...,(p-1) \}$ to the set $\{ 0,1,2,...,(m-1) \}$.
\\
Here, $m$ is the number of slots in the hash table.
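A minimal C sketch of drawing one $h_{ab}$ at random from this family. The concrete values of $p$ and $m$, and the helper names, are assumptions chosen only for illustration.
#+BEGIN_SRC c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define P 101   /* prime p, larger than every key we will hash */
#define M 10    /* number of slots m */

static long a, b;   /* parameters picked once, independently of the keys */

/* Pick a in {1,...,p-1} and b in {0,...,p-1} at random. */
static void choose_hash(void) {
    a = 1 + rand() % (P - 1);
    b = rand() % P;
}

/* h_ab(k) = ((a*k + b) mod p) mod m */
static int hash(long k) {
    return (int)(((a * k + b) % P) % M);
}

int main(void) {
    srand((unsigned)time(NULL));
    choose_hash();                      /* the random draw happens before any key is seen */
    long keys[] = {3, 50, 77, 100};
    for (int i = 0; i < 4; i++)
        printf("h(%ld) = %d\n", keys[i], hash(keys[i]));
    return 0;
}
#+END_SRC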
** Perfect Hashing
TODO : Doing this or nah
NOTE : This doesn't seem to be in B.Tech syllabus, but it seems cool.
\\
* Representing rooted trees using nodes
We can represent trees using nodes. A node only stores a single element of the tree. What exactly a node is will depend on the language being used.
\\
@@ -604,8 +640,16 @@ The algorithm for iterative insertion is
Deletion in Binary Search Trees is tricky because we need to delete nodes in a way that the property of the Binary Search Tree holds after the deletion of the node. So we first have to remove the node from the tree before we can free it.
\\
\\
There are *four different cases* which can occur when we try to delete a node. Each case is handled differently, and the cases depend on how many children the node we want to delete has.
\\
Suppose the node is $X$.
1. Node $X$ has no children, i.e. it is a leaf node. In this case, we can simply delete the node and replace it with NULL.
2. Node $X$ has one child. In this case, the child of node $X$ takes its place and we can delete node $X$.
3. Node $X$ has both left and right children, and the right child of $X$ is the successor of $X$. In this case, we set the left child of the successor to the left child of $X$, then replace $X$ with its right child (a sketch of finding the successor follows after this list).
4. Node $X$ has both left and right children, and the right child is not the successor of $X$. In this case, we first replace the successor node with its own right child. Then, we set the left and right children of the successor node to the left and right children of $X$ respectively. Finally, we replace $X$ with the successor node.
TODO : add images here for four cases.
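Cases 3 and 4 rely on the successor of $X$, which (when $X$ has a right child) is the leftmost node of the right subtree. A small sketch, assuming a node struct with =left= and =right= pointers; the struct and field names are assumptions, not the notes' own definitions.
#+BEGIN_SRC c
#include <stddef.h>

/* Illustrative node type; the notes' own struct may differ. */
struct node {
    int value;
    struct node *left;
    struct node *right;
};

/* Successor of x when x has a right subtree: keep walking left from x->right. */
struct node *successor_in_right_subtree(struct node *x) {
    struct node *s = x->right;
    while (s != NULL && s->left != NULL)
        s = s->left;
    return s;
}
#+END_SRC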
**** *Implementation in code*
We also use a helper function called Replace Child for deletion of a node. This function simply takes a parent node, an old child node and a new child node, and replaces the old child with the new child.
#+BEGIN_SRC c
@@ -852,10 +896,6 @@ This function runs in $\theta (log_2n)$ time. The algorithm for this works as fo
#+END_SRC
Since we shift the element upwards, this operation is often called an /up-heap/ operation. It is also known as /trickle-up, swim-up, heapify-up, or cascade-up/.
\\
\\
TODO : Maybe up-heapfiy funtion should be made cleaner rather than trying to mirror down-heapify funtion.
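A minimal iterative up-heapify sketch for a max-heap stored in a 0-indexed array. Whether the notes use a max-heap or a min-heap, and all names below, are assumptions for illustration.
#+BEGIN_SRC c
#include <stdio.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Move heap[i] up towards the root until the max-heap property
   (parent >= child) holds again. Parent of index i is (i - 1) / 2. */
static void up_heapify(int heap[], int i) {
    while (i > 0) {
        int parent = (i - 1) / 2;
        if (heap[parent] >= heap[i])
            break;                    /* heap property already satisfied */
        swap(&heap[parent], &heap[i]);
        i = parent;
    }
}

int main(void) {
    /* 50 was just placed at index 4 and violates the heap property. */
    int heap[] = {40, 30, 20, 10, 50};
    up_heapify(heap, 4);
    for (int i = 0; i < 5; i++) printf("%d ", heap[i]);
    printf("\n");                      /* prints: 50 40 20 10 30 */
    return 0;
}
#+END_SRC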
*** Insertion
Insertion takes $\theta (log_2n)$ time in a binary heap. To insert an element into the heap, we add it to the end of the heap and then apply the up-heapify operation on the element.
\\
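A short sketch of insertion built on an up-heapify like the one sketched above (again assuming a max-heap in a 0-indexed array; capacity handling is omitted and all names are illustrative).
#+BEGIN_SRC c
#include <stdio.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Same idea as the up-heapify sketch above. */
static void up_heapify(int heap[], int i) {
    while (i > 0 && heap[(i - 1) / 2] < heap[i]) {
        swap(&heap[(i - 1) / 2], &heap[i]);
        i = (i - 1) / 2;
    }
}

/* Insert: place the new element at the end, then let it swim up. */
static void heap_insert(int heap[], int *size, int value) {
    heap[*size] = value;
    up_heapify(heap, *size);
    (*size)++;
}

int main(void) {
    int heap[16] = {40, 30, 20, 10};
    int size = 4;
    heap_insert(heap, &size, 50);
    for (int i = 0; i < size; i++) printf("%d ", heap[i]);
    printf("\n");                      /* prints: 50 40 20 10 30 */
    return 0;
}
#+END_SRC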
