The input size tells us the size of the input given to the algorithm. Based on the size of the input, the time/storage usage of the algorithm changes. <b>Example</b>: an array with a larger input size (more elements) will take more time to sort.
</p>
<ul class="org-ul">
<li>Best Case : The lowest time/storage usage for the given input size.</li>
<li>Worst Case : The highest time/storage usage for the given input size.</li>
<li>Average Case : The average time/storage usage for the given input size.</li>
Since algorithms are finite, the time and space they take are <b>bounded</b>, i.e., they have boundaries: a minimum and a maximum amount of time/space taken. These bounds are the upper bound and the lower bound.
</p>
<ul class="org-ul">
<li>Upper Bound : The maximum amount of space/time taken by the algorithm is the upper bound. It is shown as a function of worst cases of time/storage usage over all the possible input sizes.</li>
<li>Lower Bound : The minimum amount of space/time taken by the algorithm is the lower bound. It is shown as a function of best cases of time/storage usage over all the possible input sizes.</li>
<li>The Big Oh notation is used to define the upper bound of an algorithm.</li>
<li>Given a non-negative function f(n) and another non-negative function g(n), we say that \(f(n) = O(g(n))\) if there exists a positive number \(n_0\) and a positive constant \(c\), such that \[ f(n) \le c.g(n) \ \ \forall n \ge n_0 \]</li>
<li>So if the growth rate of g(n) is greater than or equal to the growth rate of f(n), then \(f(n) = O(g(n))\).</li>
<li>The Big-Omega notation is used to show the lower bound of the algorithm.</li>
<li>Given non-negative functions f(n) and g(n), we say that \(f(n) = \Omega (g(n))\) if there exists a positive integer \(n_0\) and a positive constant \(c\), such that \[ f(n) \ge c.g(n) \ \ \forall n \ge n_0 \]</li>
<li>So the growth rate of \(g(n)\) should be less than or equal to the growth rate of \(f(n)\).</li>
</ul>
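<p>
As a quick worked example of these definitions, take \(f(n) = 3n + 2\) and \(g(n) = n\). Since
\[ 3n + 2 \le 4n \ \ \forall n \ge 2 \]
the Big-Oh definition holds with \(c = 4\) and \(n_0 = 2\), so \(f(n) = O(n)\). Similarly, \(3n + 2 \ge 3.n\) for all \(n \ge 1\), so \(f(n) = \Omega (n)\).
</p>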
<p>
<b>Note</b> : If \(f(n) = O(g(n))\) then \(g(n) = \Omega (f(n))\)
<li>It is used to provide the asymptotic <b>equal bound</b> (tight bound).</li>
<li>\(f(n) = \theta (g(n))\) if there exist a positive integer \(n_0\) and positive constants \(c_1\) and \(c_2\) such that \[ c_1 . g(n) \le f(n) \le c_2 . g(n) \ \ \forall n \ge n_0 \]</li>
<li>So the growth rate of \(f(n)\) and \(g(n)\) should be equal.</li>
</ul>
<p>
<b>Note</b> : So if \(f(n) = O(g(n))\) and \(f(n) = \Omega (g(n))\), then \(f(n) = \theta (g(n))\)
<li>The little o notation defines the strict upper bound of an algorithm.</li>
<li>We say that \(f(n) = o(g(n))\) if for every positive constant \(c\) there exists a positive integer \(n_0\) such that \[ f(n) < c.g(n) \ \ \forall n \ge n_0 \]</li>
<li>Notice how the condition is \(<\), rather than the \(\le\) used in Big-Oh. So the growth rate of \(g(n)\) is strictly greater than that of \(f(n)\).</li>
<li>The little omega notation defines the strict lower bound of an algorithm.</li>
<li>We say that \(f(n) = \omega (g(n))\) if for every positive constant \(c\) there exists a positive integer \(n_0\) such that \[ f(n) > c.g(n) \ \ \forall n \ge n_0 \]</li>
<li>Notice how the condition is \(>\), rather than the \(\ge\) used in Big-Omega. So the growth rate of \(g(n)\) is strictly less than that of \(f(n)\).</li>
Using logarithms can be useful to compare exponential functions. When comparing functions \(f(n)\) and \(g(n)\),
</p>
<ul class="org-ul">
<li>If growth of \(\log(f(n))\) is greater than growth of \(\log(g(n))\), then growth of \(f(n)\) is greater than growth of \(g(n)\)</li>
<li>If growth of \(\log(f(n))\) is less than growth of \(\log(g(n))\), then growth of \(f(n)\) is less than growth of \(g(n)\)</li>
<li>When using logs to compare growth, comparing the constants after applying the log is also required. For example, if the functions are \(2^n\) and \(3^n\), then their logs are \(n.log(2)\) and \(n.log(3)\). Since \(log(2) < log(3)\), the growth rate of \(3^n\) will be higher.</li>
<li>On equal growth after applying log, we can't decide which function grows faster.</li>
<li><b>Sum</b> : For a sum of two or more functions, the big-oh is that of the function with the highest growth rate. \[ O(f_1 + f_2 + ... + f_i) = O(max\ growth\ rate(f_1, f_2, .... , f_i )) \]</li>
<li><b>Constants</b> : For a constant \(c\), \[ O(c.g(n)) = O(g(n)) \] This is because constants don't affect the growth rate.</li>
<li><b>Reflexive</b> : \(f(n) = O(f(n))\) and \(f(n) = \Omega (f(n))\) and \(f(n) = \theta (f(n))\)</li>
<li><b>Symmetric</b> : If \(f(n) = \theta (g(n))\) then \(g(n) = \theta (f(n))\)</li>
<li><b>Transitive</b> : If \(f(n) = O(g(n))\) and \(g(n) = O(h(n))\) then \(f(n) = O(h(n))\)</li>
<li><b>Transpose</b> : If \(f(n) = O(g(n))\) then we can also conclude that \(g(n) = \Omega (f(n))\) so we say Big-Oh is transpose of Big-Omega and vice-versa.</li>
<li><b>Antisymmetric</b> : If \(f(n) = O(g(n))\) and \(g(n) = O(f(n))\) then we conclude that \(f(n) = \theta (g(n))\)</li>
<li><b>Asymmetric</b> : If \(f(n) = \omega (g(n))\) then we can conclude that \(g(n) \ne \omega (f(n))\)</li>
A sequential set of instructions is a simple block of instructions with no branches, iterations or recursion. A sequential set of instructions has <b>time complexity of O(1)</b>, i.e., it has <b>constant time complexity</b>.
Iterative instructions are a set of instructions in a loop. They can have different complexities based on how many iterations occur, which depends on the input size.
</p>
<ul class="org-ul">
<li>For a fixed number of iterations (number of iterations known at compile time, i.e., independent of the input size), the time complexity is constant, O(1). Example: for(int i = 0; i < 100; i++) { … } will always have 100 iterations, so constant time complexity.</li>
<li>For n iterations (n being the input size), the time complexity is O(n). Example: a loop for(int i = 0; i < n; i++){ … } will have n iterations where n is the input size, so the complexity is O(n). The loop for(int i = 0; i < n/2; i++){…} also has time complexity O(n), because the loop does n/2 iterations and the constant 1/2 is dropped in big-oh notation.</li>
<li>For a loop like for(int i = 1; i <= n; i = i*2){…} the value of i is updated as i *= 2, so the number of iterations will be \(log_2 (n)\). Therefore, the time complexity is \(O(log_2 (n))\).</li>
<li>For a loop like for(int i = n; i > 1; i = i/2){…} the value of i is updated as i /= 2, so the number of iterations will be \(log_2 (n)\). Therefore, the time complexity is \(O(log_2 (n))\).</li>
</ul>
<p>
<b><span class="underline">Nested Loops</span></b>
<br/>
</p>
<ul class="org-ul">
<li>If the <b>inner loop iterator doesn't depend on the outer loop</b>, the complexity of the inner loop is multiplied by the number of times the outer loop runs to get the time complexity. For example, suppose we have a loop as follows.</li>
Here, the outer loop will run <b>n</b> times and the inner loop will run <b>log(n)</b> times. Therefore, the total number of times the statements in the inner loop run is n.log(n).
Thus the time complexity is <b>O(n.log(n))</b>.
</p>
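<p>
Such a pair of loops might look like the following sketch (a hypothetical listing, since the original is not shown; nested_count simply counts how often the innermost statement runs):
</p>

```c
/* Hypothetical shape of the loops described above: the outer loop runs
   n times, the inner loop runs about log2(n) times, so the statement in
   the inner loop executes on the order of n*log(n) times in total. */
long nested_count(long n) {
    long count = 0;
    for (long i = 0; i < n; i++)            /* outer loop: n iterations */
        for (long j = 1; j < n; j = j * 2)  /* inner loop: log2(n) iterations */
            count++;                        /* statement being counted */
    return count;
}
```

For n = 8 the inner loop iterates 3 times (j = 1, 2, 4), so the statement runs 8 × 3 = 24 times.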
<ul class="org-ul">
<li>If the <b>inner loop and outer loop are related</b>, then the complexities have to be computed using sums. Example: suppose we have the loop</li>
We first have to create a way to describe the time complexity of recursive functions in the form of an equation:
\[ T(n) = ( \text{Time taken by the recursive calls} ) + ( \text{Time taken per call, i.e., the time taken except for the recursive calls in the function} ) \]
</p>
<ul class="org-ul">
<li>Example, suppose we have a recursive function</li>
<span style="color: #a626a4;">if</span>(n == 0 || n == 1)
<span style="color: #a626a4;">return</span> 1;
<span style="color: #a626a4;">else</span>
<span style="color: #a626a4;">return</span> n * fact(n-1);
}
</pre>
</div>
<p>
In this example, the recursive call is fact(n-1); therefore the time complexity of the recursive call is T(n-1), and the time complexity of the function except for the recursive call is constant (let's assume <b>c</b>). So the time complexity is
\[ T(n) = T(n-1) + c \]
\[ T(1) = T(0) = C\ \text{where C is constant time} \]
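Expanding the recurrence by repeated substitution gives
\[ T(n) = T(n-1) + c = T(n-2) + 2c = \dots = T(0) + n.c = C + n.c \]
so the time complexity of fact is \(\theta (n)\).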
Here, the recursive calls are func(n-1) and func(n-2); therefore the time complexities of the recursive calls are T(n-1) and T(n-2). The time complexity of the function except for the recursive calls is constant (let's assume <b>c</b>), so the time complexity is
\[ T(n) = T(n-1) + T(n-2) + c \]
\[ T(1) = T(0) = C\ \text{where C is constant time} \]
Here, the recursive calls are func(n-1) and func(n-2); therefore the time complexities of the recursive calls are T(n-1) and T(n-2). The time complexity of the function except for the recursive calls is <b>θ (n)</b> because of the for loop, so the time complexity is
</p>
<p>
\[ T(n) = T(n-1) + T(n-2) + n \]
\[ T(1) = T(0) = C\ \text{where C is constant time} \]
The tree method is used when there are multiple recursive calls in our recurrence relation. Example,
\[ T(n) = T(n/5) + T(4n/5) + f(n) \]
Here, one call is T(n/5) and another is T(4n/5), so we can't apply the master's theorem. Instead, we create a tree of recursive calls which is used to calculate the time complexity.
The first node, i.e., the root node, is T(n), and the tree is formed by the child nodes being the calls made by the parent nodes. Example: let's consider the recurrence relation
\[ T(n) = T(n/5) + T(4n/5) + f(n) \]
</p>
<pre class="example">
+-----T(n/5)
T(n)--+
+-----T(4n/5)
</pre>
<p>
Since T(n) calls T(n/5) and T(4n/5), the graph for that is shown as drawn above. Now using the recurrence relation, we can say that T(n/5) will call T(n/5<sup>2</sup>) and T(4n/5<sup>2</sup>). Also, T(4n/5) will call T(4n/5<sup>2</sup>) and T(4<sup>2</sup> n/ 5<sup>2</sup>).
</p>
<pre class="example">
+--T(n/5^2)
+-----T(n/5)--+
+ +--T(4n/5^2)
T(n)--+
+ +--T(4n/5^2)
+-----T(4n/5)-+
+--T(4^2 n/5^2)
</pre>
<p>
Suppose we draw this graph for an unknown number of levels.
</p>
<pre class="example">
+--T(n/5^2)- - - - - - - etc.
+-----T(n/5)--+
+ +--T(4n/5^2) - - - - - - - - - etc.
T(n)--+
+ +--T(4n/5^2) - - - - - - - - - etc.
+-----T(4n/5)-+
+--T(4^2 n/5^2)- - - - - - etc.
</pre>
<p>
We will now replace the T()'s with the <b>cost of the call</b>. The cost of the call is <b>f(n)</b>, i.e., the time taken other than that caused by the recursive calls.
</p>
<pre class="example">
+--f(n/5^2)- - - - - - - etc.
+-----f(n/5)--+
+ +--f(4n/5^2) - - - - - - - - - etc.
f(n)--+
+ +--f(4n/5^2) - - - - - - - - - etc.
+-----f(4n/5)-+
+--f(4^2 n/5^2)- - - - - - etc.
</pre>
<p>
In our example, <b>let's assume f(n) = n</b>, therefore,
</p>
<pre class="example">
+-- n/5^2 - - - - - - - etc.
+----- n/5 --+
+ +-- 4n/5^2 - - - - - - - - - etc.
n --+
+ +-- 4n/5^2 - - - - - - - - -etc.
+----- 4n/5 -+
+-- 4^2 n/5^2 - - - - - - etc.
</pre>
<p>
Now we can get cost of each level.
</p>
<pre class="example">
+-- n/5^2 - - - - - - - etc.
+----- n/5 --+
+ +-- 4n/5^2 - - - - - - - - - etc.
n --+
+ +-- 4n/5^2 - - - - - - - - -etc.
+----- 4n/5 --+
+-- 4^2 n/5^2 - - - - - - etc.
Sum : n n/5 n/25
+4n/5 +4n/25
+4n/25
+16n/25
..... ..... ......
n n n
</pre>
<p>
Since the sum on each level is n, we can say that the total time taken is
\[ T(n) = \Sigma \ (cost\ of\ level_i) \]
</p>
<p>
Now we need to find the longest branch in the tree. If we follow the pattern of expanding the tree in a sequence as shown, then the longest branch is <b>always on one of the extreme ends of the tree</b>. So for our example, if the tree has <b>(k+1)</b> levels, then our branch is either (n/5<sup>k</sup>) or (4<sup>k</sup> n/5<sup>k</sup>). Suppose the terminating condition is \(T(a) = C\). Then we will calculate the value of k by equating the longest branch as,
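In this example, the branch (4<sup>k</sup> n/5<sup>k</sup>) shrinks the slowest, so it is the longest one, and
\[ \frac{4^k n}{5^k} = a \implies \left( \frac{5}{4} \right)^k = \frac{n}{a} \implies k = log_{5/4} \left( \frac{n}{a} \right) \]
which gives the number of levels in the longest branch.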
The tree method, as mentioned, is mainly used when we have multiple recursive calls with different factors. But when using the big-oh notation (O), we can avoid the tree method in favour of the master's theorem by converting the recursive call with the smaller factor to the larger one. This works since big-oh calculates the worst case. Let's take our previous example
\[ T(n) = T(n/5) + T(4n/5) + f(n) \]
Since T(n) is an increasing function, we can say that
\[ T(n/5) \le T(4n/5) \]
So we can replace the smaller one and approximate our equation to,
\[ T(n) = T(4n/5) + T(4n/5) + f(n) \]
\[ T(n) = 2.T(4n/5) + f(n) \]
</p>
<p>
Now, our recurrence relation is in a form where we can apply the master's theorem.
The amount of memory used by the algorithm to execute and produce the result for a given input size is space complexity. Similar to time complexity, when comparing two algorithms space complexity is usually represented as the growth rate of memory used with respect to input size. The space complexity includes
</p>
<ul class="org-ul">
<li><b>Input space</b> : The amount of memory used by the inputs to the algorithm.</li>
<li><b>Auxiliary space</b> : The amount of memory used during the execution of the algorithm, excluding the input space.</li>
</ul>
<p>
<b>NOTE</b> : <i>Space complexity by definition includes both input space and auxiliary space, but when comparing algorithms the input space is often ignored. This is because two algorithms that solve the same problem will have the same input space for a given input size (Example: when comparing two sorting algorithms, the input space will be the same because both get a list as input). So from this point on, when referring to space complexity, we are actually talking about <b>Auxiliary Space Complexity</b>, which is space complexity considering only the auxiliary space</i>.
The space complexity when we disregard the input space is the auxiliary space complexity; we basically treat the algorithm as if its input space were zero. Auxiliary space complexity is more useful when comparing algorithms because algorithms working towards the same result have the same input space. Example: sorting algorithms all take the list as their input, so input space is not a metric we can use to compare them. So from here on, when we calculate space complexity, we are actually calculating auxiliary space complexity, and we sometimes just refer to it as space complexity.
There are two parameters that affect space complexity,
</p>
<ul class="org-ul">
<li><b>Data space</b> : The memory taken by the variables in the algorithm. So allocating new memory during runtime of the algorithm is what forms the data space. The space which was allocated for the input space is not considered a part of the data space.</li>
<li><b>Code Execution Space</b> : The memory taken by the instructions themselves is called code execution space. Unless we have recursion, the code execution space remains constant, since the instructions don't change during the runtime of the algorithm. When using recursion, each call adds a stack frame to memory, thus increasing the code execution space.</li>
<span style="color: #a0a1a7; font-weight: bold;">// </span><span style="color: #a0a1a7;">Work on data</span>
}
}
</pre>
</div>
<p>
Here, we create an array of size <b>n</b>, so the allocated space grows linearly with the input size. So the space complexity is <b>\(\theta (n)\)</b>.
<span style="color: #a0a1a7; font-weight: bold;">// </span><span style="color: #a0a1a7;">Work on data</span>
}
}
}
</pre>
</div>
<p>
Here, we create a matrix of size <b>n*n</b>, so the allocated space grows as \(n^2\) with the input size. So the space complexity is <b>\(\theta (n^2)\)</b>.
</p>
<ul class="org-ul">
<li>If we use a node-based data structure like a linked list or a tree, then we can express the space complexity as the number of nodes used by the algorithm based on the input size (if all nodes are of equal size).</li>
<li>Space complexity of the hash map is considered <b>O(n)</b> where <b>n</b> is the number of entries in the hash map.</li>
When we use recursion, the function calls are stored on the stack. This means that the code execution space will increase. A single function call takes a fixed (constant) amount of space in memory. So to get the space complexity, <b>we need to know how many function calls occur in the longest branch of the function call tree</b>.
</p>
<ul class="org-ul">
<li><b>NOTE</b> : Space complexity <b>only depends on the longest branch</b> of the function calls tree.</li>
<li><i><b>The tree is made the same way we make it in the tree method for calculating time complexity of recursive algorithms</b></i></li>
</ul>
<p>
This is because at any given time, the stack will store only a single branch.
<span style="color: #a626a4;">if</span>(n == 1 || n == 0)
<span style="color: #a626a4;">return</span> 1;
<span style="color: #a626a4;">else</span>
<span style="color: #a626a4;">return</span> n * func(n - 1);
}
</pre>
</div>
<p>
To calculate space complexity we can use the tree method. But unlike when calculating time complexity, we will count the number of function calls using the tree.
We will do this by drawing tree of what function calls will look like for given input size <b>n</b>.
<br/>
The tree for <b>k+1</b> levels is,
</p>
<pre class="example">
func(n)--func(n-1)--func(n-2)--.....--func(n-k)
</pre>
<p>
This tree only has a single branch. To get the number of levels of a branch, we put the terminating condition at the extreme end of the tree. Here, the terminating condition is func(1), therefore we will put \(func(1) = func(n-k)\), i.e.,
\[ 1 = n - k \]
\[ k + 1 = n \]
</p>
<p>
So the number of levels is \(n\). Therefore, space complexity is <b>\(\theta (n)\)</b>
<li><i><b>As we know from the tree method, the two extreme branches of the tree will always be the longest ones.</b></i></li>
</ul>
<p>
Both the extreme branches have the same call which here is func(n/2<sup>k</sup>). To get the number of levels for a branch, we put the terminating condition at the extreme branches of the tree. Here, the terminating condition is func(2), therefore, we will put \(func(2) = func(n/2^k)\), i.e,
\[ 2 = \frac{n}{2^k} \]
\[ k + 1 = log_2n \]
Number of levels is \(log_2n\). Therefore, space complexity is <b>\(\theta (log_2n)\).</b>
Divide and conquer is a problem-solving strategy. In divide and conquer algorithms, we solve the problem by recursively applying three steps:
</p>
<ul class="org-ul">
<li><b>Divide</b> : The problem is divided into smaller problems that are instances of the same problem.</li>
<li><b>Conquer</b> : If the subproblems are large, divide and solve them recursively. If a subproblem is small enough, solve it in a straightforward manner.</li>
<li><b>Combine</b> : Combine the solutions of the subproblems into the solution for the original problem.</li>
<b>Recursive time complexity</b> : \(T(n) = T(n-1) + 1\)
</p>
<ul class="org-ul">
<li><b>Best Case</b> : The element to search is the first element of the array. So we need to do a single comparison. Therefore, the time complexity will be constant, <b>O(1)</b>.</li>
</ul>
<p>
<br/>
</p>
<ul class="org-ul">
<li><b>Worst Case</b> : The element to search is the last element of the array. So we need to do <b>n</b> comparisons for an array of size n. Therefore, the time complexity is <b>O(n)</b>.</li>
</ul>
<p>
<br/>
</p>
<ul class="org-ul">
<li><b>Average Case</b> : For calculating the average case, we need to consider the average number of comparisons done over all possible cases.</li>
The binary search algorithm works on an array which is sorted. In this algorithm we:
</p>
<ol class="org-ol">
<li>Check the middle element of the array, return the index if element found.</li>
<li>If element > array[mid], then our element is in the right part of the array, else it is in the left part of the array.</li>
<li>Get the mid element of the left/right sub-array</li>
<li>Repeat this process of dividing into subarrays and comparing the middle element till our required element is found.</li>
</ol>
<p>
The divide and conquer algorithm works as,
<br/>
Suppose the function is binarySearch(array, left, right, key), where left and right are the indices of the left and right ends of the subarray, and key is the element we have to search for.
</p>
<ul class="org-ul">
<li><b>Divide part</b> : calculate mid index as mid = left + (right - left) /2 or (left + right) / 2. If array[mid] == key, return the value of mid.</li>
<li><b>Conquer part</b> : if array[mid] > key, then key must not be in right half. So we search for key in left half, so we will recursively call binarySearch(array, left, mid - 1, key). Similarly, if array[mid] < key, then key must not be in left half. So we search for key in right half, so recursively call binarySearch(array, mid + 1, right, key).</li>
<li><b>Combine part</b> : Since the binarySearch function will either return -1 or the index of the key, there is no need to combine the solutions of the subproblems.</li>
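<p>
The three parts above can be put together as a sketch of the recursive routine (assuming an ascending-sorted array and using -1 to signal that the key is absent):
</p>

```c
/* Recursive binary search: returns the index of key in array[left..right],
   or -1 if key is not present. Assumes array is sorted in ascending order. */
int binarySearch(int array[], int left, int right, int key) {
    if (left > right)
        return -1;                        /* empty subarray: key absent */
    int mid = left + (right - left) / 2;  /* avoids int overflow */
    if (array[mid] == key)
        return mid;
    else if (array[mid] > key)
        return binarySearch(array, left, mid - 1, key);   /* left half */
    else
        return binarySearch(array, mid + 1, right, key);  /* right half */
}
```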
<span style="color: #a0a1a7; font-weight: bold;">// </span><span style="color: #a0a1a7;">or we can use mid = left + (right - left) / 2, this will avoid int overflow when array has more elements.</span>
<li><b>Best case</b> : The array is sorted in ascending order. The number of comparisons is \(n-1\). Time complexity is \(O(n)\).</li>
<li><b>Worst case</b> : The array is sorted in descending order. The number of comparisons is \(2.(n-1)\). Time complexity is \(O(n)\).</li>
<li><b>Average case</b> : The array can be arranged in n! ways, which makes calculating the number of comparisons in the average case hard and somewhat unnecessary, so it is skipped. Time complexity is \(O(n)\)</li>
Suppose the function is MinMax(array, left, right) which will return a tuple (min, max). We will divide the array in the middle, mid = (left + right) / 2. The left array will be array[left:mid] and the right array will be array[mid+1:right].
</p>
<ul class="org-ul">
<li><b>Divide part</b> : Divide the array into left array and right array. If array has only single element then both min and max are that single element, if array has two elements then compare the two and the bigger element is max and other is min.</li>
<li><b>Conquer part</b> : Recursively get the min and max of left and right array, leftMinMax = MinMax(array, left, mid) and rightMinMax = MinMax(array, mid + 1, right).</li>
<li><b>Combine part</b> : If leftMinMax[0] > rightMinMax[0], then min = rightMinMax[0], else min = leftMinMax[0]. Similarly, if leftMinMax[1] > rightMinMax[1], then max = leftMinMax[1], else max = rightMinMax[1].</li>
<span style="color: #a626a4;">if</span> left == right: <span style="color: #a0a1a7; font-weight: bold;"># </span><span style="color: #a0a1a7;">Single element in array</span>
<span style="color: #a0a1a7; font-weight: bold;"># </span><span style="color: #a0a1a7;">Combining result of the minimum from left and right subarrays</span>
<span style="color: #a0a1a7; font-weight: bold;"># </span><span style="color: #a0a1a7;">Combining result of the maximum from left and right subarrays</span>
We are dividing the problem into two parts of approximately equal size, and combining their results takes two comparisons. Let's consider that a comparison takes unit time. Then the time complexity is
\[ T(n) = T(n/2) + T(n/2) + 2 \]
\[ T(n) = 2.T(n/2) + 2 \]
The recurrence terminates with a single element in the array and zero comparisons, i.e., \(T(1) = 0\), or with two elements and a single comparison, \(T(2) = 1\).
<br/>
<i>Now we can use the <b>master's theorem</b> or <b>tree method</b> to solve for time complexity.</i>
\[ T(n) = \theta (n) \]
</p>
<ul class="org-ul">
<li>Space complexity</li>
</ul>
<p>
For space complexity, we need to find the longest branch of the recursion tree. Since both recursive calls are of the same size with factor (1/2), for <b>k+1</b> levels the function call will be func(n/2<sup>k</sup>), and the terminating condition is func(2)
\[ func(2) = func(n/2^k) \]
\[ 2 = \frac{n}{2^k} \]
\[ k + 1 = log_2n \]
Since longest branch has \(log_2n\) nodes, the space complexity is \(O(log_2n)\).
</p>
<ul class="org-ul">
<li>Number of comparisions</li>
</ul>
<p>
In every case, i.e., the average, best and worst cases, <b>the number of comparisons in this algorithm is the same</b>.
\[ \text{Total number of comparisions} = \frac{3n}{2} - 2 \]
If n is not a power of 2, we round the number of comparisons up.
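This total follows from the recurrence derived earlier:
\[ T(n) = 2.T(n/2) + 2, \ \ T(2) = 1, \ \ T(1) = 0 \]
For \(n\) a power of 2, unrolling the recurrence gives \(T(n) = \frac{3n}{2} - 2\); for example, \(T(4) = 2.T(2) + 2 = 4 = \frac{3(4)}{2} - 2\).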
In this algorithm we will compare pairs of numbers from the array. It works on the idea that only the larger number of the pair can be the maximum and only the smaller one can be the minimum. So after comparing the pair, we simply test the bigger of the two against the maximum and the smaller against the minimum. This brings the number of comparisons needed to check two numbers of the array from 4 (when we increment by 1) down to 3 (when we increment by 2).
<span style="color: #a0a1a7; font-weight: bold;"># </span><span style="color: #a0a1a7;">check possibility that array[i] is maximum and array[i+1] is minimum</span>
<span style="color: #a0a1a7; font-weight: bold;"># </span><span style="color: #a0a1a7;">check possibility that array[i+1] is maximum and array[i] is minimum</span>
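<p>
A sketch of this pairwise scheme in C (the function name pairMinMax and the odd-length handling are assumptions, since the original listing is only partially shown):
</p>

```c
/* Pair-comparison min/max: after the first pair (or first element when n
   is odd), every remaining pair costs 3 comparisons, giving 3n/2 - 2
   comparisons in total for even n. */
void pairMinMax(const int array[], int n, int *min, int *max) {
    int i;
    if (n % 2 == 0) {                 /* even: first pair initializes both */
        if (array[0] > array[1]) { *max = array[0]; *min = array[1]; }
        else                     { *max = array[1]; *min = array[0]; }
        i = 2;
    } else {                          /* odd: first element is min and max */
        *min = *max = array[0];
        i = 1;
    }
    for (; i + 1 < n; i += 2) {
        int big, small;
        if (array[i] > array[i + 1]) {    /* 1: order the pair */
            big = array[i];  small = array[i + 1];
        } else {
            big = array[i + 1];  small = array[i];
        }
        if (big > *max) *max = big;       /* 2: candidate maximum */
        if (small < *min) *min = small;   /* 3: candidate minimum */
    }
}
```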
<pre class="src src-C"><span style="color: #a0a1a7; font-weight: bold;">/* </span><span style="color: #a0a1a7;">This will calculate A X B and store it in C.</span><span style="color: #a0a1a7; font-weight: bold;"> */</span>
The addition of matrices of size (n X n) takes time \(\theta (n^2)\); therefore, the computation of C<sub>11</sub> will take time \(\theta \left( \left( \frac{n}{2} \right)^2 \right)\), which equals \(\theta \left( \frac{n^2}{4} \right)\). Therefore, the computation time of C<sub>11</sub>, C<sub>12</sub>, C<sub>21</sub> and C<sub>22</sub> combined will be \(\theta \left( 4 \frac{n^2}{4} \right)\), which equals \(\theta (n^2)\).
<br/>
There are 8 recursive calls in this function with MatrixMul(n/2), therefore, time complexity will be
Another, more efficient divide and conquer algorithm for matrix multiplication. This algorithm also only works on square matrices with n being a power of 2. It is based on the observation that for A X B = C, we can calculate C<sub>11</sub>, C<sub>12</sub>, C<sub>21</sub> and C<sub>22</sub> as,
</p>
<p>
\[ C_{11} = P_5 + P_4 - P_2 + P_6 \]
\[ C_{12} = P_1 + P_2 \]
\[ C_{21} = P_3 + P_4 \]
\[ C_{22} = P_1 + P_5 - P_3 - P_7 \]
Where,
\[ P_1 = A_{11} \times (B_{12} - B_{22}) \]
\[ P_2 = (A_{11} + A_{12}) \times B_{22} \]
\[ P_3 = (A_{21} + A_{22}) \times B_{11} \]
\[ P_4 = A_{22} \times (B_{21} - B_{11}) \]
\[ P_5 = (A_{11} + A_{22}) \times (B_{11} + B_{22}) \]
\[ P_6 = (A_{12} - A_{22}) \times (B_{21} + B_{22}) \]
\[ P_7 = (A_{11} - A_{21}) \times (B_{11} + B_{12}) \]
This reduces number of recursion calls from 8 to 7.
</p>
<pre class="example">
Strassen(A, B, n):
If n == 2 {
return A X B
}
Else{
Break A into four parts A_11, A_12, A_21, A_22, where A = [[ A_11, A_12],
[ A_21, A_22]]
Break B into four parts B_11, B_12, B_21, B_22, where B = [[ B_11, B_12],
[ B_21, B_22]]
P_1 = Strassen(A_11, B_12 - B_22, n/2)
P_2 = Strassen(A_11 + A_12, B_22, n/2)
P_3 = Strassen(A_21 + A_22, B_11, n/2)
P_4 = Strassen(A_22, B_21 - B_11, n/2)
P_5 = Strassen(A_11 + A_22, B_11 + B_22, n/2)
P_6 = Strassen(A_12 - A_22, B_21 + B_22, n/2)
P_7 = Strassen(A_11 - A_21, B_11 + B_12, n/2)
C_11 = P_5 + P_4 - P_2 + P_6
C_12 = P_1 + P_2
C_21 = P_3 + P_4
C_22 = P_1 + P_5 - P_3 - P_7
C = [[ C_11, C_12],
[ C_21, C_22]]
return C
}
</pre>
<p>
This algorithm uses 18 matrix addition operations. So our computation time for that is \(\theta \left(18\left( \frac{n}{2} \right)^2 \right)\) which is equal to \(\theta (4.5 n^2)\) which is equal to \(\theta (n^2)\).
<br/>
There are 7 recursive calls in this function which are Strassen(n/2), therefore, time complexity is
\[ T(n) = 7T(n/2) + \theta (n^2) \]
Using the master's theorem
\[ T(n) = \theta (n^{log_27}) \]
\[ T(n) = \theta (n^{2.807}) \]
</p>
<ul class="org-ul">
<li><i><b>NOTE</b> : The divide and conquer approach and Strassen's algorithm typically use n == 1 as their terminating condition, since for multiplying 1 X 1 matrices we only need to calculate the product of the single elements they contain; that product is thus the single element of our resultant 1 X 1 matrix.</i></li>
If the space complexity of a sorting algorithm is \(\theta (1)\), then it is called an in-place sorting algorithm; otherwise it is called an out-of-place sorting algorithm.
It is one of the simplest sorting algorithms and is easy to implement, so it is useful when the number of elements to sort is small. It is an in-place sorting algorithm. We compare pairs of adjacent elements from the array and swap them into the correct order. Suppose the input has n elements.
</p>
<ul class="org-ul">
<li>For the first pass of the array, we will do <b>n-1</b> comparisons between pairs: the 1st and 2nd element, then the 2nd and 3rd, then the 3rd and 4th, and so on till the comparison between the (n-1)th and nth element, swapping positions according to size. <i>A single pass will put a single element at the end of the list in its correct position.</i></li>
<li>For the second pass of the array, we will do <b>n-2</b> comparisons because the last element is already in its place after the first pass.</li>
<li>Similarly, we continue till we only do a single comparison.</li>
<span style="color: #a0a1a7; font-weight: bold;">/* </span><span style="color: #a0a1a7;">i is the number of comparisons in the pass</span><span style="color: #a0a1a7; font-weight: bold;"> */</span>
<span style="color: #a0a1a7; font-weight: bold;">/* </span><span style="color: #a0a1a7;">j is used to traverse the list</span><span style="color: #a0a1a7; font-weight: bold;"> */</span>
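<p>
A minimal C sketch of the passes described above (ascending sort):
</p>

```c
/* Bubble sort (ascending, in place): pass i does n-1-i adjacent
   comparisons, and each pass bubbles the largest remaining element
   into its final position at the end of the array. */
void bubbleSort(int array[], int n) {
    for (int i = 0; i < n - 1; i++) {          /* n-1 passes */
        for (int j = 0; j < n - 1 - i; j++) {  /* comparisons in this pass */
            if (array[j] > array[j + 1]) {     /* swap if out of order */
                int tmp = array[j];
                array[j] = array[j + 1];
                array[j + 1] = tmp;
            }
        }
    }
}
```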
<b><i>The minimum number of swaps can be calculated by checking how many swap operations are needed to get each element into its correct position.</i></b> This can be done by checking, for each element, the number of smaller elements towards its right. For a descending sort, check the number of larger elements towards the right of the given element. Example for an ascending sort,
<td class="org-left">Minimum number of swaps to get in correct position</td>
<td class="org-right">3</td>
<td class="org-right">1</td>
<td class="org-right">0</td>
<td class="org-right">0</td>
<td class="org-right">0</td>
</tr>
</tbody>
</table>
<p>
Therefore, minimum number of swaps is ( 3 + 1 + 0 + 0 + 0) , which is equal to 4 swaps.
</p>
<ul class="org-ul">
<li><b><i>Reducing the number of comparisons in implementation</i></b> : at the end of every pass, check the number of swaps. <b>If the number of swaps in a pass is zero, then the array is sorted.</b> This implementation does not give the minimum number of comparisons, but it reduces the number of comparisons compared to the default implementation. It reduces the time complexity to \(\theta (n)\) in the best case, since we then only pass through the array once.</li>
</ul>
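<p>
A sketch of this optimization in C (the swapped flag name is an assumption):
</p>

```c
#include <stdbool.h>

/* Bubble sort with an early-exit check: if a whole pass performs no
   swap, the array is already sorted, so we stop. Best case (already
   sorted input) is a single pass: theta(n). */
void bubbleSortEarlyExit(int array[], int n) {
    for (int i = 0; i < n - 1; i++) {
        bool swapped = false;                 /* did this pass swap anything? */
        for (int j = 0; j < n - 1 - i; j++) {
            if (array[j] > array[j + 1]) {
                int tmp = array[j];
                array[j] = array[j + 1];
                array[j + 1] = tmp;
                swapped = true;
            }
        }
        if (!swapped)       /* zero swaps in this pass: array is sorted */
            break;
    }
}
```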
<p>
Recursive time complexity : \(T(n) = T(n-1) + n - 1\)
It is an in-place sorting technique. In this algorithm, we get the minimum element from the array and swap it into the first position. Then we get the minimum from array[1:] and place it at index 1. Similarly, we get the minimum from array[2:] and place it at index 2. We continue till we get the minimum from array[len(array) - 2:] and place it at index [len(array) - 2].
<span style="color: #a0a1a7; font-weight: bold;">/* </span><span style="color: #a0a1a7;">Get the minimum index from the sub-array [i:]</span><span style="color: #a0a1a7; font-weight: bold;"> */</span>
<span style="color: #a0a1a7; font-weight: bold;">/* </span><span style="color: #a0a1a7;">Swap the min_index with its position at the start of the sub-array</span><span style="color: #a0a1a7; font-weight: bold;"> */</span>
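<p>
Putting the commented steps together, a sketch of the routine (names are assumptions):
</p>

```c
/* Selection sort (ascending, in place): repeatedly find the minimum
   of the sub-array [i:] and swap it into position i. */
void selectionSort(int array[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int min_index = i;
        /* get the minimum index from the sub-array [i:] */
        for (int j = i + 1; j < n; j++)
            if (array[j] < array[min_index])
                min_index = j;
        /* swap the minimum with the start of the sub-array */
        int tmp = array[i];
        array[i] = array[min_index];
        array[min_index] = tmp;
    }
}
```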
<li>In this algorithm, we first divide the array into two sections. Initially, the left section has a single element and the right section has all the other elements. Therefore, the left part is sorted and the right part is unsorted.</li>
<li>We call the leftmost element of the right section the key.</li>
<li>Now, we insert the key into its correct position in the left section.</li>
<li>As is commonly known, for an insertion operation we need to shift elements. So we shift elements in the left section.</li>
<span style="color: #a0a1a7; font-weight: bold;">/* </span><span style="color: #a0a1a7;">Key is the first element of the right section of array</span><span style="color: #a0a1a7; font-weight: bold;"> */</span>
<span style="color: #c18401;">int</span> <span style="color: #8b4513;">j</span> = i - 1;
<span style="color: #a0a1a7; font-weight: bold;">/* </span><span style="color: #a0a1a7;">Shift till we find the correct position of the key in the left section</span><span style="color: #a0a1a7; font-weight: bold;"> */</span>
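<p>
A sketch consistent with the fragments above:
</p>

```c
/* Insertion sort (ascending, in place): the left section array[0..i-1]
   is kept sorted; each step inserts the key (first element of the right
   section) into its correct position by shifting larger elements. */
void insertionSort(int array[], int n) {
    for (int i = 1; i < n; i++) {
        int key = array[i];   /* first element of the right section */
        int j = i - 1;
        /* shift till we find the correct position of the key */
        while (j >= 0 && array[j] > key) {
            array[j + 1] = array[j];
            j--;
        }
        array[j + 1] = key;
    }
}
```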
<b>Best Case</b> : The best case is when the input array is already sorted. In this case, we do <b>(n-1)</b> comparisons and no swaps. The time complexity will be \(\theta (n)\)
<br/>
<b>Worst Case</b> : The worst case is when the input array is in descending order when we need to sort in ascending order, and vice versa (basically the reverse of sorted). The number of comparisons is
<br/>
\[ [1 + 2 + 3 + .. + (n-1)] = \frac{n(n-1)}{2} \]
<br/>
The number of element shift operations is
<br/>
\[ [1 + 2 + 3 + .. + (n-1)] = \frac{n(n-1)}{2} \]
<br/>
Total time complexity becomes \(\theta \left( 2 \frac{n(n-1)}{2} \right)\), which is simplified to \(\theta (n^2)\).
</p>
<ul class="org-ul">
<li><b>NOTE</b> : Rather than using <b>linear search</b> to find the position of the key in the left (sorted) section, we can use <b>binary search</b> to reduce the number of comparisons.</li>
The inversion count of an array is a measure of how far the array is from being sorted.
<br/>
For an ascending sort, it is the number of element pairs such that array[i] > array[j] and i < j, or equivalently, array[i] < array[j] and i > j.
</p>
<ul class="org-ul">
<li>For <b>ascending sort</b>, we can simply count, for each element, the number of smaller elements to its left.</li>
It is a divide and conquer technique. It uses a partition algorithm which chooses an element from the array (the pivot), then places all smaller elements to its left and all larger elements to its right. Then we take these two parts of the array and recursively place all their elements in the correct positions. For ease, the element chosen by the partition algorithm is either the leftmost or the rightmost element.
<pre class="src src-C"><span style="color: #a0a1a7; font-weight: bold;">/* </span><span style="color: #a0a1a7;">Will return the index where the array is partitioned</span><span style="color: #a0a1a7; font-weight: bold;"> */</span>
<span style="color: #a0a1a7; font-weight: bold;">/* </span><span style="color: #a0a1a7;">This will point to the element greater than pivot</span><span style="color: #a0a1a7; font-weight: bold;"> */</span>
<spanstyle="color: #a626a4;">return</span> (i + 1);
}
</pre>
</div>
<ul class="org-ul">
<li>Time complexity</li>
</ul>
<p>
For an array of size <b>n</b>, the number of comparisons done by this algorithm is always <b>n - 1</b>. Therefore, the time complexity of this partition algorithm is \(\theta (n)\).
<li><b>Best Case</b> : The partition algorithm always divides the array into two equal parts. In this case, the recursive relation becomes
\[ T(n) = 2T(n/2) + \theta (n) \]
Where, \(\theta (n)\) is the time complexity for creating partition.
<br/>
Using the Master theorem,
\[ T(n) = \theta( n.log(n) ) \]</li>
<li><b>Worst Case</b> : The partition algorithm always creates the partition at one of the extreme positions of the array. This creates a single partition with <b>n-1</b> elements. Therefore, the quicksort algorithm has to be called on the remaining <b>n-1</b> elements of the array.
\[ T(n) = T(n-1) + \theta (n) \]
Again, \(\theta (n)\) is the time complexity for creating the partition. Solving this relation gives \(T(n) = \theta (n^2)\).</li>
<li><b>Average Case</b> : In quick sort, the average case is closer to the best case than to the worst case.</li>
</ul>
<p>
<br/>
To get the average case, we will <b>consider a recursive function for the number of comparisons</b> \(C(n)\).
<br/>
For the function \(C(n)\), there are \(n-1\) comparisons for the partition algorithm.
<br/>
Now, suppose that the index of the partition is <b>i</b>.
<br/>
This will create two recursive comparison terms \(C(i)\) and \(C(n-i-1)\).
<br/>
<b>i</b> can be any number between <b>0</b> and <b>n-1</b>, with each case being equally probable. So the average number of comparisons for \(C(n)\) will be
\[ C(n) = (n-1) + \frac{1}{n} \sum_{i=0}^{n-1} \left[ C(i) + C(n-i-1) \right] \]
<h3 id="orgc1f8ddb"><span class="section-number-3">8.5.</span> Merging two sorted arrays (2-Way Merge)</h3>
<div class="outline-text-3" id="text-8-5">
<p>
Suppose we have two arrays that are already sorted. The first array has <b>n</b> elements and the second array has <b>m</b> elements.
<br/>
The way to merge them is to compare elements between the two arrays in sequence. We first place a pointer at the start of each array. The elements pointed to are compared and the smaller one is added to our new array; then we move that array's pointer forward. These comparisons are repeated until we reach the end of one of the arrays. At this point, we can simply append all the elements of the remaining array.
<h3 id="org15245b4"><span class="section-number-3">8.6.</span> Merging k sorted arrays (k-way merge)</h3>
<div class="outline-text-3" id="text-8-6">
<p>
k-way merge algorithms take k different sorted arrays and merge them into a single sorted array. The algorithm is the same as two-way merge, except that we need to get the smallest element among the k pointers and then advance the corresponding pointer.
Merge sort is a pure divide and conquer algorithm. In this sorting algorithm, we merge the sorted sub-arrays till we get a final sorted array.<br/>
The algorithm will work as follows :
</p>
<ol class="org-ol">
<li>Divide the array of n elements into <b>n</b> subarrays, each having one element.</li>
<li>Repeatedly merge the subarrays to form merged subarrays of larger sizes until there is one list remaining.</li>
</ol>
<p>
For divide and conquer steps:
</p>
<ul class="org-ul">
<li><b>Divide</b> : Divide the array from the middle into two equal halves.</li>
<li><b>Conquer</b> : Call merge sort recursively on the two subarrays.</li>
<li><b>Combine</b> : Merge the two sorted subarrays.</li>
</ul>
<p>
The algorithm works as follows (this is pseudocode, not real C code)
</p>
<div class="org-src-container">
<pre class="src src-C"><span style="color: #a0a1a7; font-weight: bold;">// </span><span style="color: #a0a1a7;">A function that will merge two sorted arrays</span>
This algorithm is often used in languages which have great support for linked lists, for example Lisp and Haskell. For more traditional C-like languages, quicksort is often easier to implement.
<br/>
An implementation in C language is as follows.
</p>
<div class="org-src-container">
<pre class="src src-C"><span style="color: #a0a1a7; font-weight: bold;">// </span><span style="color: #a0a1a7;">buffer is memory of size equal to or bigger than size of array</span>
<span style="color: #a0a1a7; font-weight: bold;">// </span><span style="color: #a0a1a7;">buffer is used when merging the arrays</span>
<h4 id="orgf4976e0"><span class="section-number-4">8.7.1.</span> Time complexity</h4>
<div class="outline-text-4" id="text-8-7-1">
<p>
Unlike quick sort, <b>the recurrence relation is the same for merge sort in all cases.</b>
<br/>
Since the divide step splits the array into two equal halves, the input size is halved (i.e., <b>T(n/2)</b>).
<br/>
In the conquer step, there are two recursive calls, so <b>2.T(n/2)</b> is added to the time complexity.
<br/>
The cost of merging two arrays of size n/2 each is at least <b>n/2</b> and at most <b>n-1</b> comparisons. That is to say, the time complexity to merge two arrays of size n/2 each is always \(\theta (n)\). Thus, the final recurrence relation is
\[ T(n) = 2T(n/2) + \theta (n) \]
which, by the Master theorem, gives \(T(n) = \theta (n.log(n))\).
<h3 id="org1ee53da"><span class="section-number-3">8.8.</span> Stable and unstable sorting algorithms</h3>
<div class="outline-text-3" id="text-8-8">
<p>
We call sorting algorithms unstable or stable on the basis of whether they change order of equal values.
</p>
<ul class="org-ul">
<li><b>Stable sorting algorithm</b> : a sorting algorithm that preserves the order of the elements with equal values.</li>
<li><b>Unstable sorting algorithm</b> : a sorting algorithm that does not preserve the order of the elements with equal values.
<br/></li>
</ul>
<p>
This is of importance when we store data in pairs of keys and values and then sort the data using the keys, since we may want to preserve the order in which the entries were added.
Sorting algorithms which do not use comparisons to sort elements are called non-comparative sorting algorithms. These tend to be faster than comparative sorting algorithms.
<pre class="src src-c"><span style="color: #a0a1a7; font-weight: bold;">/* </span><span style="color: #a0a1a7;">The input array is sorted and result is stored in output array</span><span style="color: #a0a1a7; font-weight: bold;"> */</span>
<span style="color: #a0a1a7; font-weight: bold;">/* </span><span style="color: #a0a1a7;">max is the largest element of the array</span><span style="color: #a0a1a7; font-weight: bold;"> */</span>
<span style="color: #a0a1a7; font-weight: bold;">// </span><span style="color: #a0a1a7;">count array should have a size greater than or equal to (max + 1)</span>
<span style="color: #a0a1a7; font-weight: bold;">// </span><span style="color: #a0a1a7;">i from 0 to len(array) - 1</span>
<span style="color: #a0a1a7; font-weight: bold;">// </span><span style="color: #a0a1a7;">this loop stores number of elements equal to i in count array</span>
<span style="color: #a626a4;">for</span>(<span style="color: #c18401;">int</span> <span style="color: #8b4513;">i</span> = 0; i < len(array); i++)
count[input[i]] = count[input[i]] + 1;
<span style="color: #a0a1a7; font-weight: bold;">// </span><span style="color: #a0a1a7;">i from 1 to max</span>
<span style="color: #a0a1a7; font-weight: bold;">// </span><span style="color: #a0a1a7;">this loop stores number of elements less than or equal to i in count array</span>
<span style="color: #a626a4;">for</span>(<span style="color: #c18401;">int</span> <span style="color: #8b4513;">i</span> = 1; i <= max; i++)
count[i] = count[i] + count[i - 1];
<span style="color: #a0a1a7; font-weight: bold;">// </span><span style="color: #a0a1a7;">i from len(array) - 1 to 0</span>
<b>Time complexity</b> : Since there are only simple loops and arithmetic operations, we can get time complexity by considering the number of times loops are executed.
</p>
<p>
\[ \text{Number of times loops are executed} = n + (max - 1) + n \]
\[ \text{Where, } n = len(array) \text{ i.e, the input size} \]
</p>
<p>
Therefore,
\[ \text{Number of times loops are executed} = 2n + max - 1 \]
So the time complexity of counting sort is \(\theta (n + max)\).
In radix sort, we sort using the digits, from least significant digit (lsd) to most significant digit (msd). In other words, we sort digits from right to left. The algorithm used to sort digits <b>should be a stable sorting algorithm</b>.
For the following example, we will use the bubble sort since it is the easiest to implement. But, for best performance, <b>radix sort is paired with counting sort</b>.
</p>
<div class="org-src-container">
<pre class="src src-c"><span style="color: #a0a1a7; font-weight: bold;">// </span><span style="color: #a0a1a7;">d = 0, will return digit at unit's place</span>
<span style="color: #a0a1a7; font-weight: bold;">// </span><span style="color: #a0a1a7;">d = 1, will return digit at ten's place</span>
<span style="color: #a0a1a7; font-weight: bold;">// </span><span style="color: #a0a1a7;">and so on.</span>