Skip to content

Selection Statement

Learning Objectives

At the end of this sub-unit, students should

  • understand how selection statement works.
  • know how to analyze selection statement.
  • know how to write selection statement.
  • know how to apply selection statement in problem solving.

Making a Choice

Let's look back at our solution for quadratic formula.

Quadratic Formula

1
2
3
4
5
6
# Assume a, b, and c are initialized
discriminant = (b * b) - (4 * a * c)
denominator = 2 * a

x1 = (-b + (discriminant ** 0.5)) / denominator
x2 = (-b - (discriminant ** 0.5)) / denominator

Unfortunatelu, there is a slight problem with the formula above. The expression (discriminant ** 0.5) may produce no real number solution1. This happens when the discriminant is negative. We want to produce two different behavior depending on the sign of the discriminant.

  1. If the discriminant is negative, print "No Real Solution".
  2. Otherwise, print both \(x_1\) and \(x_2\) on separate lines.

Here we have a problem, we cannot do both. But based on what we have learnt so far, the execution will proceed from top to bottom, visiting every line. We need another construct to make choices such that we may execute some lines and not others.

We would highly encourage you to familiarize yourself with the basic terminology section of the appendix.

If-Statement

The construct to do that is called the if-statement. The general syntax can be described as follows.

  • It must start with the if keyword.
    • The if keyword is followed by a conditional (i.e., boolean expression).
    • This line is terminated by a colon (i.e., :).
    • The next line is an inner block called the then branch.
  • It can have 0 or more elif keyword2 in the same indentation level as the if keyword.
    • The elif keyword is followed by a conditional (i.e., boolean expression).
    • This line is terminated by a colon (i.e., :).
    • The next line is an inner block called the elif branch.
  • It can have 0 or 1 else keyword in the same indentation level as the if keyword.
    • The else keyword is NOT followed by a conditional (i.e., boolean expression).
    • This line is terminated by a colon (i.e., :).
    • The next line is an inner block called the else branch.

As a general rule, after every colon (i.e., :), the next line should be more indented. The reverse is also true. Consider an indented line, there must be a line with fewer indentation that ends with a colon (i.e., :). To remember this, there is a notation we can use.

Syntax

1
2
3
4
5
6
if  expression  :
   block 
 elif  expression  :
   block  ❳✱
 else :
   block  

A little explanation on the notation.

  • Named surrounded by ❬❭ are not actual code but a placeholder for something else.
  • Elements surrounded ❲❳ are optional (i.e., they may or may not present).
    • Additionally, if they are then followed by , it means there can be 0 or more of them.
    • Alternatively --while not used here-- if they are then followed by , it means there can be 1 or more of them.

This match the explanation above.

This syntax and the explanation gives us some basic construction rule for selection statements which include the following.

1
2
3
4
5
6
if cond:
  stmt
# no elif
#
# no else
#

Variant

If Elif Else
1 0 0
1
2
3
4
5
6
if cond:
  stmt
elif cond:
  stmt
# no else
#

Variant

If Elif Else
1 1 0
1
2
3
4
5
6
if cond:
  stmt
# no elif
#
else:
  stmt

Variant

If Elif Else
1 0 1
1
2
3
4
5
6
if cond:
  stmt
elif cond:
  stmt
else:
  stmt

Variant

If Elif Else
1 1 1

The other variants are just varying the number of elif. Note that there cannot be 0 number of if keyword to actually qualify as a selection statement.

Invalid Code

The following codes are invalid and will give syntax error.

if :        # should have condition
  stmt
1
2
3
4
if cond:
  stmt
elif :      # should have condition
  stmt
if cond:
stmt        # should be indented
1
2
3
4
if cond:
  stmt
else cond:  # should have no condition
  stmt
1
2
3
4
5
6
if cond:
  stmt
else:       # should be after elif
  stmt
elif cond:
  stmt
1
2
3
4
elif cond:
  stmt
else:
  stmt
1
2
3
4
5
if cond:
  stmt
stmt
elif cond:
  stmt

Let us consider a simple if-statement and see how it is evaluated. The behavior is also called the semantics. In describing a new construct, we need to know the syntax and the semantics. Unfortunately, the mathematical description of the semantics is rather complicated. So, we will use an informal explanation for the semantics.

Semantics

When describing the semantics, we should not use the abstract syntax involving ❬❭ and ❲❳. Instead, we will use name as a placeholder for expressions, statements, and blocks.

1
2
3
4
5
6
if cond1:
  block1
elif cond2:
  block2
else:
  block3

The execution of if-statement above is as follows:

  1. Evaluate the boolean expression cond1 into a value \(V_1\).
  2. If \(V_1\) is True3, evaluate the block block1 and go to Step 6.
  3. Otherwise (i.e., value is False), evaluate the boolean expression cond2 into a value \(V_2\).
  4. If \(V_2\) is True3, evaluate the block block2 and go to Step 6.
  5. Otherwise (i.e., value is False), evaluate the block cond3 and go to Step 6.
  6. Evaluate the next statement (if any).

This behavior can be extended into multiple elif by duplicating Step 3 and Step 4. Note that we do the evaluation of the next condition only if the previous condition evaluates to False. On the other hand, if there is no elif, then we simply skip Step 3 and Step 4. Similarly, if there is no else, then we simply skip Step 5.

Often, thinking about the semantics (i.e., the behavior) using text can be confusing. The more common explanation tool is the flowchart. In general, to trace a flowchart, you simply follow an arrow. However, there may be some elements with multiple arrow coming out. In such cases, you need to understand which path to take based on some condition. This is because the flowchart is a representation of the code and the code can only execute one line at a time. So it cannot have both arrows being executed at the same time.

Flowchart

1
2
3
4
5
6
if cond1:
  block1
elif cond2:
  block2
else:
  block3

The flowchart corresponding to the code above is shown below.

If 01

The arrow with the checkmark corresponds to the path taken when the corresponding condition (e.g., cond1 or cond2) evaluates to True. The arrow with the crossmark corresponds to the path taken when the corresponding condition (e.g., cond1 or cond2) evaluates to False.

Concrete Example

Consider the problem of finding an absolute value of a floating point number in a variable x and print the value. If the value of x is negative, we simply print -x. Otherwise, we print x. Converting this into code, we get the following code.

Absolute Value

1
2
3
4
5
# Assume x is initialized
if x < 0:     # If the value of x is negative,
  print(-x)   #   we simply print -x.
else:         # Otherwise,
  print(x)    #   we print x.

The flowchart is shown below.

Abs

It is also interesting to see the execution step by step. We will illustrate that with our memory model. The path taken are highlighted. The memory model and the evaluation is given as comments.

1
2
3
4
5
6
7
8
# { x ↦ 5 }
if x < 0:    # ⇝  5 < 0  ⇝  False
  print(-x)
# { x ↦ 5 }
else:
  # { x ↦ 5 }
  print(x)   # ⇝  print(5)
  # { x ↦ 5 }
5

The lines executed can be visualized below.

If02

1
2
3
4
5
6
7
8
# { x ↦ -7 }
if x < 0:    # ⇝  -7 < 0  ⇝  True
  # { x ↦ -7 }
  print(-x)  # ⇝  print(-(-7))  ⇝  print(7)
  # { x ↦ -7 }
else:
  print(x)
# { x ↦ -7 }
7

The lines executed can be visualized below.

If03

Back to Quadratic

Now that we have a better understanding of selection statement, we can look back at the quadratic formula problem and improve on it. First, let us recap the relevant information.

  1. If the discriminant is negative, print "No Real Solution".
  2. Otherwise, print both \(x_1\) and \(x_2\) on separate lines.

Translating this, we get the following code.

Quadratic Formula

# Assume a, b, and c are initialized
discriminant = (b * b) - (4 * a * c)
if discriminant < 0:         # If the discriminant is negative,
  print("No Real Solution")  #   print "No Real Solution"
else:                        # Otherwise,
  denominator = 2 * a
  x1 = (-b + (discriminant ** 0.5)) / denominator
  x2 = (-b - (discriminant ** 0.5)) / denominator
  print(x1)                  #   print x1 and
  print(x2)                  #   x2 on separate lines

Properties

It is worth noting some interesting properties of if-statement. We will illustrate this with the following generalized if-statement.

1
2
3
4
5
6
7
8
if cond1:
  block1
elif cond2:
  block2
elif cond3:
  block3
else:
  else_block
  1. At most one of the blocks will be executed.
    • In other words, if block3 is executed, no other blocks will be executed.
    • Note that if there is no else keyword, then it is possible that none of the blocks are executed.
  2. cond2 will only be evaluated if cond1 evaluates to False.
    • Similarly, cond3 will only be evaluated if cond1 and cond2 evaluates to False.
    • In fact, if we order subsequence condition as cond_i, then it will only be evaluated if all cond_j where j < i evaluates to False.
  3. else_block is only executed if all conditions evaluate to False.

The second property may lead to an interesting behavior when multiple conditions may evaluate to True. Since the evaluation is still roughly top down, the top-most condition that evaluates to True will have its corresponding block executed. We say that this value satisfies the condition.

Afterwards, the execution jumps to the end of the if-statement. Subsequent blocks will not be executed even if the corresponding condition will actually evaluate to True if it ever reach.

Letter Grades

To illustrate the second property, considering the problem of assigning letter grades to marks according to the table below. Assume that all marks are integers between 0 and 100 (inclusive).

Mark \(M\) Grade
\(90 \leq M \leq 100\) A
\(80 \leq M \leq 89\) B
\(70 \leq M \leq 79\) C
\(60 \leq M \leq 69\) D
\(0 \leq M \leq 59\) F

There are several possible solutions and we will show two interesting solutions below.

Letter Grade

In this solution, we have conditions that are overlapping. In other words, a given value may satisfy multiple conditions.

# Assume mark is initialized
if mark >= 90:
  print("A")
elif mark >= 80:
  print("B")
elif mark >= 70:
  print("C")
elif mark >= 60:
  print("D")
else:
  print("F")

In this solution, we have conditions that are disjoint. In other words, a given value will only satisfy one condition.

# Assume mark is initialized
if mark >= 90 and mark <= 100:
  print("A")
elif mark >= 80 and mark <= 89:
  print("B")
elif mark >= 70 and mark <= 79:
  print("C")
elif mark >= 60 and mark <= 69:
  print("D")
elif mark >= 0 and mark <= 59:
  print("F")
Incorrect Letter Grade

Note that the order of the condition matters! The following is incorrect as the latter conditions (e.g., mark >= 90) is subsumed by the earlier conditions (e.g., mark >= 70).

# Assume mark is initialized
if mark >= 60:
  print("D")
elif mark >= 70:
  print("C")
elif mark >= 80:
  print("B")
elif mark >= 90:
  print("A")
else:
  print("F")

Analysis

Since there can be multiple correct solution, how can we analyze a particular solution involving if-statement? For one, we can test it with all marks from 0 to 100 and see if the answer are as expected. In this case, that will likely be the best way to test. Unfortunately, it may not work in general where the input can be unbounded.

We will show one possible way of analyzing if-statement that works for some numeric inputs. This is called a range analysis because we are only interested in the range of values a particular variable can take. Note that it is only an illustration. You may need a more imaginative technique to analyze your own code, but you may use this technique as a starting point.

We will indicate the range of values as follows.

  • [a, b] means that the values are between a and b (inclusive of both).
  • [a, b) means that the values are between a and b (inclusive of a, exclusive of b).
  • (a, b] means that the values are between a and b (exclusive of a, inclusive of b).
  • (a, b) means that the values are between a and b (exclusive of a, exclusive of b).

Our analysis will use similar notation as our memory model. However, the value will be based on the range notation above. First, note that the correctness of the disjoint version is more obvious. In particular, it closely resembles the condition on the table. So, we only need to justify the overlapping version.

# { x ↦ [0, 100] }
if mark >= 90:
  # { x ↦ [90, 100] }
  #   [90, 100] because of the condition mark >= 90
  print("A")
elif mark >= 80:
  # { x ↦ [0, 90) and [80, 100] }
  #   [0, 90) because it is NOT mark >= 90
  #   [80, 100] because of the condition mark >= 80
  # { x ↦ [80, 90) }
  # { x ↦ [80, 89] }  because it is integer
  print("B")
elif mark >= 70:
  # { x ↦ [0, 90) and [0, 80) and [70, 100] }
  #   [0, 90) because it is NOT mark >= 90
  #   [0, 80) because it is NOT mark >= 80
  #   [70, 100] because of the condition mark >= 70
  # { x ↦ [70, 80) }
  # { x ↦ [70, 79] }  because it is integer
  print("C")
elif mark >= 60:
  # { x ↦ [0, 90) and [0, 80) and [0, 70) and [60, 100] }
  #   [0, 90) because it is NOT mark >= 90
  #   [0, 80) because it is NOT mark >= 80
  #   [0, 70) because it is NOT mark >= 70
  #   [60, 100] because of the condition mark >= 60
  # { x ↦ [60, 70) }
  # { x ↦ [60, 69] }  because it is integer
  print("D")
else:
  # { x ↦ [0, 90) and [0, 80) and [0, 70) and [0, 60) }
  #   [0, 90) because it is NOT mark >= 90
  #   [0, 80) because it is NOT mark >= 80
  #   [0, 70) because it is NOT mark >= 70
  #   [0, 60) because it is NOT mark >= 60
  # { x ↦ [0, 60) }
  # { x ↦ [0, 59] }  because it is integer
  print("F")
# { x ↦ [0, 100] }

Try to understand the justification above slowly. This illustrates the kind of analysis you should try to make to check if your code is correct. Such an analysis can also be done while designing your solution using flowchart as shown below. At the very least, try to perform a simple analysis. This will help in finding problems with your code later on.

Grade


  1. Luckily, Python can actually produce complex number, but we will not go into that. 

  2. The elif keyword is a portmanteau of else and if. else + ifelse + ifelif

  3. In Python, \(V_i\) need not evaluate exactly to True (i.e., boolean value). Instead, it only requires truthy/falsy values. However, it is a bad practice.