CS3230 AY 2025/26 Semester 2

Assignment #2 Statistics and Comments (T04)

The average score in our class is 6.5 / 8. Below are the statistics for each question:

Question 1(a), (b):

  • It's better to specify an explicit value of $\varepsilon$ in your proof (and $c$ for the regularity condition).
    Here is a subtle error that may occur if you do not do so:
    • Consider this statement: $1 - \frac{1}{n} < c$ for some constant $c < 1$.
    • You may argue that $1 - \frac{1}{n} < 1$ and therefore the statement is true.
    • However, the statement is false, since $1 - \frac{1}{n}$ can be arbitrarily close to 1, so no single constant $c < 1$ works for all $n$.
  • Note that you should not be using a calculator in your proof.
    Given this, a smart way to choose $\varepsilon$ is to choose $2 - \log_3 6$ in 1(a) and $\log_2 9 - 3$ in 1(b).
    This way, you can directly state that $f(n) \in \Omega(n^{\log_3 6 + \varepsilon})$ by reflexivity, likewise for 1(b).
  • The regularity condition is one of the required conditions to check before applying the Master Theorem.
    Therefore, you should prove the regularity condition before stating that Case 3 is applicable.
  • When proving the regularity condition (or any other inequality), state an explicit value of $c$ first, then derive the inequality. Under no circumstances should you start from the inequality and solve for $c$: doing so assumes the inequality is true for some value of $c$, and your derivation afterwards only bounds the value of $c$ under that assumption.
    Therefore, deriving the value of $c$ should be part of your rough working, not what you submit.
  • Some mistakes in applying the regularity condition:
    • The regularity condition has to hold for all values of $n$, not when $n \to \infty$. We have to follow the theorem statement strictly.
    • The regularity condition uses $f(n)$, which is $223n^2 - 16$ in this case (not just $n^2$).
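As a sanity check (not a proof -- remember the condition must hold for all $n$), you can test a candidate $c$ numerically. The recurrence below is my reconstruction from the exponents above ($a = 6$, $b = 3$, matching $\log_3 6$, with $f(n) = 223n^2 - 16$); adjust it if your recurrence differs.

```python
# Numeric sanity check of the regularity condition a*f(n/b) <= c*f(n).
# Assumed recurrence (reconstructed, may differ from the assignment):
# T(n) = 6*T(n/3) + 223*n^2 - 16, so a = 6, b = 3, f(n) = 223*n**2 - 16.

def f(n):
    return 223 * n**2 - 16

a, b = 6, 3
c = 0.7  # candidate constant: 6*f(n/3) = (2/3)*223*n**2 - 96 <= 0.7*f(n)

for n in range(3, 30_000, 3):
    assert a * f(n / b) <= c * f(n), f"fails at n = {n}"
print("regularity condition holds with c = 0.7 on the tested range")
```

A finite check like this never replaces the algebraic argument; it only catches a bad choice of $c$ early.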

Question 1(c): Most students use substitution method (and that's the intended way).

  • For the lower bound, it should be very obvious that $T(n) \geq 2n$. You can write this directly and conclude that $T(n) \in \Omega(n)$.
  • For the upper bound, notice that you are writing a proof by induction that $T(n) \leq cn$ for all $n$. This means:
    • The induction step must hold for all values of $n$ you are inducting from. Note that we need properties like $\sqrt{n} \leq \frac{n}{4}$ for the induction step to go through (this requires $n \geq 16$), so we should start our induction step from a sufficiently large value of $n$, e.g. $n = 17$. Also, depending on how your induction step exactly works, $c$ has to be at least a certain value (e.g. $\geq 6$).
      Some of you did some asymptotic analysis or considered $n \to \infty$, which is incorrect (the induction chain will break earlier).
    • At the same time, the base cases must hold as well. If you induct from $n = 17$ onwards, then $n = 1, 2, \ldots, 16$ will all be base cases. This implies that you have to pick a value of $c$ such that these base cases all hold.
      We did not tell you exactly what $T(1), T(2), \ldots$ are, so you should just claim that you could pick a sufficiently large value of $c$ (which is also, for example, $\geq 6$) such that these base cases hold.
    • Note that either a broken base case or a broken induction step will fail your argument completely -- this is how induction works.
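For intuition (not part of the proof), you can evaluate the recurrence numerically and watch $T(n)/n$ stay bounded. I am assuming the recurrence has the shape $T(n) = T(\lfloor n/2 \rfloor) + T(\lfloor \sqrt{n} \rfloor) + 2n$ with constant base cases for $n \leq 16$; substitute the actual recurrence and base cases from the assignment.

```python
import math
from functools import lru_cache

# Hypothetical stand-in for the 1(c) recurrence; the assignment's exact
# recurrence and base cases may differ:
#   T(n) = T(n // 2) + T(isqrt(n)) + 2n,  with T(n) = 1 for n <= 16.

@lru_cache(maxsize=None)
def T(n):
    if n <= 16:
        return 1
    return T(n // 2) + T(math.isqrt(n)) + 2 * n

# T(n)/n appears bounded by a constant, consistent with T(n) = Theta(n).
for n in [10**3, 10**5, 10**7]:
    print(n, round(T(n) / n, 3))
```

This mirrors the proof: the $T(n/2)$ chain contributes a geometric series (at most $4n$ here), and the $\sqrt{n}$ branches only add lower-order terms.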

A note on recursion tree:

  • Recursion tree methods do not work in this particular question (at least I didn't come up with any valid argument). This is because the tree is "unbalanced" - if we trace through the $T(n/2)$ calls, the height of the tree is $\log_2 n$, but if we trace through the $T(\sqrt{n})$ calls, the height of the tree would be $\log \log n$ instead. It is also not straightforward to sum up the costs of each level/branch, given that there are many possibilities of how the calls can be made.
  • Here's an example (from the textbook) on applying the recursion tree method correctly on unbalanced trees:
    Problem: Solve $T(n) = T(n/3) + T(2n/3) + cn$.
    We draw the recursion tree and notice the cost at every level is $cn$.
    Therefore, the total cost will be between $cn \cdot \log_{3} n$ (the shortest root-to-leaf path, through the $n/3$ branch) and $cn \cdot \log_{3/2} n$ (the longest, through the $2n/3$ branch), i.e. the total cost is $\Theta(n \log n)$.

Q2 is well-done -- hopefully you are familiar with induction by now.

Q3 is a challenging one. Please watch this video recording of the solution explanation (from last AY). Common mistakes:

  • When asked to show the tightest upper bound, you should:
    • show that your function is an upper bound, i.e. for all values of $n$, the Euclidean algorithm runs in $O(\log n)$ time;
    • show that your bound is tight (i.e. it cannot be reduced further), i.e. for infinitely many values of $n$, the Euclidean algorithm runs in $\Omega(\log n)$ time. For this part, you have the liberty to choose selected values of $n$, $m$ which yield the worst-case performance of the Euclidean algorithm.
    All but one of you missed one of these two parts.
  • Many students show that $n \text{ mod } m < \frac{n}{2}$ and derive that the value of $m'$ in the next recursive call is at most half of $n$. But some of you stopped here and immediately concluded that at most $\log n$ recursive calls are involved. What is the problem here?
    The issue is that you are comparing two different variables. The value of $m'$ in the next recursive call is less than half of $n$, but the value of $n'$ in the next recursive call (which equals $m$) is NOT necessarily at most half of $n$. You cannot make any meaningful conclusions here.
    The key to showing that it is indeed $\Theta(\log n)$ is to show that the value of $n''$ in the call after next (which is the same as $m'$) is less than half of $n$. So every two recursive calls at least halve $n$, and at most $2 \log n$ recursive calls are needed before $n$ reaches $\Theta(1)$.
  • Some of you used the fact that "If $k$ recursive calls are needed, then $m \geq Fib_{k-1}$ and $n \geq Fib_k$" by induction. This only proves the upper bound.
    To see why, take the largest $k$ such that $n \geq Fib_k$, i.e. $Fib_k \leq n < Fib_{k+1}$. Then $GCD(m, n)$ takes at most $k$ recursive calls (if it took $k+1$ or more, we would need $n \geq Fib_{k+1}$, which does not hold). Since $k = \Theta(\log n)$, we know that $GCD(m, n)$ takes at most $c \log n$ time.
    However, this does not show that $GCD(m, n)$ takes at least $c \log n$ time.
  • Don't stop after writing that $GCD(Fib_{k-1}, Fib_k)$ takes $k-1$ recursive calls. Prove why by showing that $GCD(Fib_{k-1}, Fib_k)$ reduces to $GCD(Fib_{k-2}, Fib_{k-1})$.
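To see both claims in action, here is a small sketch (my own illustration, not the model solution) that counts the recursive calls of the Euclidean algorithm; the helper `gcd_calls` is hypothetical. Consecutive Fibonacci numbers drive the call count up linearly in $k$, i.e. logarithmically in $n = Fib_k$.

```python
def gcd_calls(m, n):
    """Return (gcd(m, n), number of recursive calls), assuming 0 <= m <= n."""
    if m == 0:
        return n, 0
    g, calls = gcd_calls(n % m, m)
    return g, calls + 1

# Consecutive Fibonacci numbers are the worst case: GCD(Fib_{k-1}, Fib_k)
# reduces to GCD(Fib_{k-2}, Fib_{k-1}), so the call count grows linearly
# in k, i.e. logarithmically in n = Fib_k.
fib = [1, 1]
while len(fib) < 30:
    fib.append(fib[-1] + fib[-2])

for k in [10, 20, 29]:
    _, calls = gcd_calls(fib[k - 1], fib[k])
    print(f"k = {k}: {calls} recursive calls")
```

Running the same counter on random inputs shows far fewer calls, which is why the $\Omega(\log n)$ part needs these specially chosen Fibonacci inputs.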

P.S. I started to see some low-effort submissions for Q2 and Q3. Please try your best for those questions, or else I might start considering giving a 0 for them. (Don't worry if you already tried your best though.)


Last updated: 02 February 2026