Algorithm Analysis
Before beginning, let's analyse two different algorithms and see which takes longer.
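For concreteness, here is one possible pair of such algorithms (an illustrative sketch of my own; both compute the sum 0 + 1 + ... + (k - 1), but with different costs):

```java
class GrowthDemo {
    // Roughly 100 "steps" per iteration => T(k) ≈ 100k
    static long linear(int k) {
        long sum = 0;
        for (int i = 0; i < k; i++) {
            sum += i; // imagine ~100 constant-time operations hidden here
        }
        return sum;
    }

    // A needlessly nested version => T(k) ≈ k^2
    static long quadratic(int k) {
        long sum = 0;
        for (int i = 0; i < k; i++) {
            for (int j = 0; j < k; j++) {
                if (j == i) sum += i; // only one inner iteration does real work
            }
        }
        return sum;
    }
}
```

Both return the same answer; only the number of steps differs.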
So, which grows faster?
| T(k) = 100k | T(k) = k^2 |
|---|---|
| T(0) = 0 | T(0) = 0 |
| T(1) = 100 | T(1) = 1 |
| T(100) = 10,000 | T(100) = 10,000 |
| T(1000) = 100,000 | T(1000) = 1,000,000 |
As you can see, when k is small, the second algorithm seems better, but as k grows larger, the first algorithm is much faster.
In summary, always think big!
Big-O Notation
Upper Bound
We should be thinking -- how does an algorithm scale? For large inputs, what is the running time?
To set some context, we have the following notations:
- \(T(n)\): Runtime on inputs of size n
- \(O(f(n))\): An upper bound on how \(T(n)\) grows as \(n\) gets big
Note that Big-O notation has both a precise and an intuitive definition. Let's get the precise definition out of the way first to make our lives easier:
\(T(n) = O(f(n))\) if T grows no faster than f.
That is, \(T(n) = O(f(n))\) if there exist constants \(c > 0\) and \(n_0 > 0\) such that for all \(n > n_0\), \(T(n) < c f(n)\)
or to put it into words, when we say a function is Big-O of something else, there exist \(c\) and \(n_0\) such that when n gets big (i.e. > \(n_0\)), the runtime is bounded by \(c\) times the function. It won't be bigger than a constant factor times the function we care about.
Let's look at it in terms of graphs to make it easier to see:
For the graph in purple, ignoring small values, as long as \(n > n_0\), the blue function will be at most a constant \(c\) times the purple function, i.e. it is bounded above by some constant \(c\) times the purple function.

Here, we have another example where beyond \(n_0\), the blue function will always be less than a constant \(c\) times the red function.

Thus, both functions provide a Big-O style upper bound. Note that we do not care about small values of n. Notice that the red function is not a good approximation of our function \(T(n)\), but it is still a valid upper bound.
When talking about Big-O, we don't always mean a good approximation -- rather, we mean bounded above by some constant times a function.
Proof
\(T(n) = 4n^2 + 24n + 16\)
\(< 4n^2 + 24n^2 + 16n^2\) (for \(n > n_0 = 1\))
\(= 44n^2\)
\(= O(n^2)\) (with \(c = 44\))
This is not a very tight analysis of T(n) but it is sufficient to show that the function is Big-O of \(n^2\).
Luckily, there are some fairly good heuristic ways to understand Big-O notation, so we don't always need to do these proofs.
| T(n) | big-O |
|---|---|
| 1000\(n\) | \(T(n) = O(n)\) |
| 1000\(n\) | \(T(n) = O(n^2)\) |
| \(n^2\) | \(T(n) != O(n)\) |
| 13\(n^2 + n\) | \(T(n) = O(n^2)\) |
Note that \(T(n) = O(n^2)\) for \(T(n) = 1000n\) is not a tight bound; when asked for a Big-O bound, try to give the tightest bound available, i.e. \(O(n)\) in this case. We can also ignore constant factors, as well as the \(n\) term when an \(n^2\) term is present.
Lower bound
So, what about the opposite: what if the algorithm is inherently slow? In this case, our function in blue has a bound below it. Our blue function is big-Ω (big-Omega) of the red function below it, as it is always going to be bigger than a constant times the red function beyond \(n_0\).

Let's look at the formal definition.
\(T(n) = Ω(f(n))\) if T grows no slower than f.
That is, \(T(n) = Ω(f(n))\) if there exist constants \(c > 0\) and \(n_0 > 0\) such that for all \(n > n_0\), \(T(n) >= c f(n)\)
or to put it into words, when we say a function is Big-Ω of something else, there exists \(c\) and \(n_0\) such that when n gets big (i.e. > \(n_0\)), our function \(T(n)\) will always be greater than or equal to some constant \(c\) times \(f(n)\).
| T(n) | Asymptotic |
|---|---|
| 1000\(n\) | \(T(n) = Ω(1)\) |
| 1000\(n\) | \(T(n) = Ω(n)\) |
| \(n^2\) | \(T(n) = Ω(n)\) |
| 13\(n^2 + n\) | \(T(n) = Ω(n^2)\) |
So far, neither a big-O nor a big-Ω bound has to be tight. But what if we want a more precise characterisation?
Tighter bound
Now we have \(Θ(f(n))\). If something is \(Θ(f(n))\), the bound has to be tight, meaning it has to be close to the function in question.
\(T(n) = Θ(f(n))\) if T grows at the same rate as f.
That is, \(T(n) = Θ(f(n))\) if and only if \(T(n) = O(f(n))\) and \(T(n) = Ω(f(n))\)
or to put it into words, beyond a certain point, \(T(n)\) will always be sandwiched between \(c_1 f(n)\) and \(c_2 f(n)\), i.e. bounded both above and below by constant multiples of the same function.

| T(n) | big-Θ |
|---|---|
| 1000\(n\) | \(T(n) = Θ(n)\) |
| \(n\) | \(T(n) != Θ(1)\) |
| 13\(n^2 + n\) | \(T(n) = Θ(n^2)\) |
| 13\(n^3\) | \(T(n) != Θ(n^2)\) |
Growth times
In general, we can follow this graph for growth times for different functions.

Model of Computation
In the real world, there are many different models of computation. Here, we will focus on sequential computation: one operation at a time, with every operation taking the same amount of time (which, of course, is not how it is in reality).
Example
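A simple summation loop, costed line by line, might look like this (an illustrative sketch of my own; the per-line costs assume every basic operation takes 1 step):

```java
class LoopCost {
    static int sum(int[] arr, int k) {
        int total = 0;              // 1
        for (int i = 0;             // 1
             i < k;                 // k + 1 (true k times, false once)
             i++) {                 // k
            total = total + arr[i]; // 3k (read arr[i], add, store)
        }
        return total;               // 1
    }
}
```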
Total time: 1 + 1 + (k + 1) + k + 3k + 1 = 5k + 4 = O(k)
Misunderstanding
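Consider building a string by repeated concatenation (an illustrative sketch of my own):

```java
class ConcatCost {
    // Each += copies the whole string built so far, so iteration i
    // costs roughly i steps. Total: 1 + 2 + ... + (k - 1) = O(k^2).
    static String repeat(char c, int k) {
        String s = "";
        for (int i = 0; i < k; i++) {
            s = s + c; // NOT a constant-time operation
        }
        return s;
    }
}
```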
Note here that the cost of string concatenation depends on the length of the string being copied! It isn't 1, it isn't any constant, and the loop's total isn't \(O(k)\). The moral of the story: not all operations cost 1.
Rules
There are some rules we can follow to make our lives easier
Big-O
- If \(T(n)\) is a polynomial of degree \(k\) then \(T(n) = O(n^k)\)
- If \(T(n) = O(f(n))\) and \(S(n) = O(g(n))\) then \(T(n) + S(n) = O(f(n) + g(n))\)
- If \(T(n) = O(f(n))\) and \(S(n) = O(g(n))\) then \(T(n) * S(n) = O(f(n) * g(n))\)
Loops
- Loop: cost = (#iterations) * (max cost of 1 iteration)
- Nested loop: cost = inner * outer
- Sequential: cost = first loop + second
- If-else statements: cost = max(cost of first, cost of second) <= cost of first + cost of second
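These rules can be seen concretely in a few small sketches of my own (the costs in comments assume the unit-cost model above):

```java
class LoopRules {
    // Nested loop: n outer iterations x n inner iterations = O(n^2)
    static int nested(int n) {
        int count = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                count++;
        return count;
    }

    // Sequential loops: O(n) + O(n) = O(n), not O(n^2)
    static int sequential(int n) {
        int count = 0;
        for (int i = 0; i < n; i++) count++;
        for (int j = 0; j < n; j++) count++;
        return count;
    }

    // If-else: cost = max(cost of branches), here O(n) overall
    static int branch(int n, boolean fast) {
        int count = 0;
        if (fast) {
            count = 1;                           // O(1) branch
        } else {
            for (int i = 0; i < n; i++) count++; // O(n) branch
        }
        return count;
    }
}
```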
Recurrence
An example of if-else would be the famous Fibonacci function. Let's take a look at its cost.
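A sketch of the naive recursive version (assuming the usual base cases fib(0) = 0 and fib(1) = 1):

```java
class Fib {
    // Each call does O(1) work plus two recursive calls,
    // giving the recurrence T(n) = 1 + T(n - 1) + T(n - 2).
    static int fib(int n) {
        if (n <= 1) {
            return n;
        } else {
            return fib(n - 1) + fib(n - 2);
        }
    }
}
```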
\(T(n) = 1 + T(n - 1) + T(n - 2)\)
\(= O(2^n)\)
Binary Search
Writing a binary search algorithm seems easy enough... or is it? Let's take a look at an initial algorithm that has bugs. Can you spot what is wrong?
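One plausible first attempt might look like this (a reconstruction of my own; the exact bugs may differ from the original listing):

```java
class BinarySearchV1 {
    // Spec problem: it returns the element itself (or null), which tells
    // us nothing about WHERE the element is -- the returned value is just
    // the key we already had.
    static Integer search(int[] arr, int key) {
        int begin = 0, end = arr.length - 1;
        while (begin <= end) {
            int mid = (begin + end) / 2;
            if (arr[mid] == key) return arr[mid]; // returns the value, not the index
            else if (arr[mid] < key) begin = mid + 1;
            else end = mid - 1;
        }
        return null;
    }
}
```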
Let's try some fixes. Firstly, we consider the specifications -- what problem are we actually trying to solve? What should we return?
In the case of binary search, maybe we want to return the element if it is in the array and null if it is not. Something more useful would be returning the index of the element if it is in the array and -1 if not: now we not only learn that the element is in the array, we also learn where it is.
Let's fix the algorithm a little. Are there still any bugs now?
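A sketch of this improved version (my reconstruction), now returning an index, with the one remaining bug flagged in a comment:

```java
class BinarySearchV2 {
    // Better spec: return the index of key, or -1 if absent.
    static int search(int[] arr, int key) {
        int begin = 0, end = arr.length - 1;
        while (begin <= end) {
            int mid = (begin + end) / 2; // BUG: begin + end can overflow
            if (arr[mid] == key) return mid;
            else if (arr[mid] < key) begin = mid + 1;
            else end = mid - 1;
        }
        return -1;
    }
}
```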
The remaining bug: begin + end can overflow once begin + end > Integer.MAX_VALUE, making the computed midpoint negative.
Note that there will be no ArrayIndexOutOfBoundsException, because (begin + end) / 2 rounds down and so stays within the range.
Now, we arrive at a working binary search algorithm.
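One way to write the fixed version (my reconstruction, assuming the overflow-safe midpoint and a loop that narrows the range down to a single candidate index):

```java
class BinarySearch {
    // Returns the index of key in the sorted array arr, or -1 if absent.
    static int search(int[] arr, int key) {
        int begin = 0, end = arr.length - 1;
        while (begin < end) {
            // begin + (end - begin) / 2 cannot overflow, and rounding
            // down guarantees begin <= mid < end, so the range shrinks.
            int mid = begin + (end - begin) / 2;
            if (key <= arr[mid]) end = mid;
            else begin = mid + 1;
        }
        if (arr.length > 0 && arr[begin] == key) return begin; // found: arr[begin] == key
        return -1;
    }
}
```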
Precondition & Postcondition
When it comes to writing algorithms and code, we want to try avoiding these bugs as much as possible. To help us with this, we can think about what is true when we start (i.e. precondition) and what is true when we finish (i.e. postcondition)
| Precondition | Postcondition |
|---|---|
| Fact that is true when function begins | Fact that is true when function ends |
| Something important to work correctly | Something useful to show computation done correctly |
In the context of binary search, we can consider the following:
Functionality: If element is in array, return index of element, else return -1
Preconditions: Array is of size n and sorted
Postcondition:
If element is in array, arr[begin] == key and we return begin; otherwise we return -1
Invariants
Another way to think about algorithms is via what is always true (i.e. invariants). More formally, an invariant is a relationship between variables that is always true; more intuitively, it is a fact you know holds no matter what throughout execution.
Loop Invariants
- Relationship between variables that is true at the beginning/end of each iteration of a loop.
Revisiting binary search, we can come up with some invariants.
Loop invariant: arr[begin] <= key <= arr[end]
Interpretation: key is in the range of the array
Of course, there is the possibility that the key is not in the array at all: smaller than arr[begin] or larger than arr[end]. Thinking of invariants allows us to consider these other possibilities and refine the code we wrote.
When conducting binary search, it looks something like this

| iter | size |
|---|---|
| 1 | \(n\) |
| 2 | \(n/2\) |
| 3 | \(n/4\) |
| .. | .. |
| k | \(n/2^k\) |
Here, we found ourselves another invariant: in iteration \(k\), the remaining range has size \(n/2^k\). After \(log(n)\) iterations, we only have 1 element left and are done with the search. That is, we need at most \(log(n)\) iterations.
Derivation
\(n/2^k = 1\)
\(2^k = n\)
\(klog(2) = log(n)\)
\(k = log(n)\)
Here, we are assuming log base 2 is used.
Thus some key invariants we have for binary search are:
- \(arr[begin] <= key <= arr[end]\), showing correctness
- \((end - begin) <= n/2^k\) in iteration \(k\), giving us the bound on performance