
Synchronization

Race Condition

Concurrent execution can cause certain problems. When we have 2 or more processes that execute concurrently in an interleaved fashion and share a modifiable resource, synchronization problems can arise (the switch between the processes is caused by an OS preemption, not controlled by the process). Most commonly, this occurs when a process runs out of its time quantum.

The execution of a single sequential process is deterministic: repeated execution gives the same result. Concurrent processes, however, may be non-deterministic. The execution outcome depends on the order in which the shared resource is accessed and modified, and a process has no way of predicting when it will be preempted. This is known as a race condition.


Here we have processes p1 and p2 modifying x. Recall that most processors we deal with use load-store architectures. The processor does not modify variables in memory directly; instead the program has to generate 3 separate instructions, since operations can only take place on registers (except load/store).

  1. Load X -> Register1
  2. Add 1000 to Register1
  3. Store Register1 -> X

Illustration

  1. p1 loads x (345) into reg1
  2. p1 increments reg1 by 1000 to 1345
  3. p1 stores the value 1345 in reg1 to x
  4. p2 loads x (1345) into reg1'
  5. p2 increments reg1' by 1000 to 2345
  6. p2 stores the value 2345 in reg1' to x

Here, no race condition occurs and the value written is correct.

  1. p1 loads x (345) into reg1
  2. p1 increments reg1 by 1000 to 1345, then gets preempted
  3. p2 loads x (345) into reg1'
  4. p2 increments reg1' by 1000 to 1345, then gets preempted
  5. p1 stores the value 1345 in reg1 to x, then gets preempted
  6. p2 stores the value 1345 in reg1' to x

Here, we end up with the wrong x value: 1345 instead of 2345. p2 loaded x before p1's store landed, so one update of 1000 is lost.

The problem of race conditions is caused by unsynchronized access to a shared modifiable resource. To solve this, we can designate code segments with race conditions as critical sections. A critical section is a section of code that modifies a shared modifiable resource. For example, in a circular buffer, head = (head + 1) % buffSize modifies shared state, so the code that performs it is a critical section.

Critical Section

void fn()
{
    int x = 0; // local var, no sync problems
    x++;
    ... // enter cs
    enq(x); // critical section
    ... // exit cs
}
// pseudocode
enq(int x) {
    q[head] = x;
    head = (head + 1) % qSize;
}

When we have a critical section within our code, we call enter CS and exit CS around it. This synchronizes access to the shared resource: while one process is updating the shared variable, no other process should be allowed to enter the section that modifies it. When p1 from earlier is modifying x and p2 calls enter CS, p2 should block until p1 calls exit CS.

A correct CS implementation satisfies the following properties:

  • Mutual Exclusion
    • If process p1 is executing in the critical section, all other processes are prevented from entering the critical section
  • Progress
    • If no process is in the critical section, one waiting process should be granted access
  • Bounded Wait
    • After process pi requests to enter the critical section, there exists an upper bound on the number of times other processes can enter the section before pi
  • Independence
    • Processes not executing in the critical section should never block other processes (from entering the CS)

Incorrect synchronization can cause problems such as:

  • Deadlock
    • All processes are blocked, so no progress is made
    • Only occurs if the resource is shared
    • Mechanisms can be built to detect when a deadlock is about to occur and switch to another resource (hopefully not shared)
  • Livelock
    • Usually caused by deadlock avoidance: both processes switch at the same time and collide at the same resource again
    • Processes keep changing state to avoid deadlock and make no other progress
    • Typically processes are not blocked
    • To the OS, the processes are running but not doing anything
  • Starvation
    • Some processes are blocked forever

Implementations exist at the assembly level (mechanisms provided by the CPU), in high level languages (using normal programming constructs), and as high level abstractions (abstracted mechanisms that give additional useful features, commonly implemented with assembly-level mechanisms).

Assembly Level Implementation

Test and Set

This is an atomic machine instruction to aid synchronization. We specify a register and a memory location, e.g. TestAndSet register, memLocation: the contents of the memory location are written to the register and a 1 is written to the memory location. This is fully atomic, meaning it is impossible for the instruction to be interrupted before it writes 1 to memory (it is a single machine operation).

void EnterCS(int* Lock)
{
    while(TestAndSet(Lock) == 1); // if there is a 1, assume someone else entered CS
}

void ExitCS(int* Lock)
{
    *Lock = 0;
}

Although this works, it is not efficient. A process gets stuck in the while loop until it runs out of its time quantum, and this repeats until some process exits the CS. The e.g. 10 ms in which nothing is done could have been used to execute some 3 million instructions. This is known as busy waiting and is a wasteful use of processing power. Variants such as Compare and Exchange, Atomic Swap, and Load Link/Store Conditional exist.

High Level Language Implementation

Initially, this may seem like a good idea

while (Lock != 0);
Lock = 1;
// critical section
Lock = 0;
However, since Lock is itself a modifiable shared resource, this suffers from the same race condition: two processes can both see Lock == 0 and enter the critical section, failing the mutual exclusion criterion. Maybe we can try to fix this by disabling interrupts?

// disable interrupt
while (Lock != 0);
Lock = 1;
// critical section
Lock = 0;
// enable interrupt

Since context switching occurs when the time quantum runs out (via a timer interrupt), disabling interrupts means a context switch will not occur and the process can run to completion. This seems good until the processes are on different CPUs: disabling interrupts for p0 on CPU 0 does not affect p1 on CPU 1. The process could also cause problems such as a segmentation fault, after which interrupts never get re-enabled. Generally we will never disable interrupts.

Now let's explore a different try

// process 0
while (Turn != 0);
// critical section
Turn = 1;

// process 1
while (Turn != 1);
// critical section
Turn = 0;

This seems to work well, but if p0 does not enter the CS, it will never change Turn to 1 and p1 can never enter the CS. Thus it is possible for a process not in the CS to block another process from entering, violating the independence property. What if we give each process a separate variable?

Want[0] = 0;
Want[1] = 0;

// process 0
Want[0] = 1; // line a
while (Want[1]);
// critical section
Want[0] = 0;

// process 1
Want[1] = 1;
while (Want[0]);
// critical section
Want[1] = 0;

Again, it works until a context switch occurs right after line a executes. Now both processes have their Want flag set to 1 and neither can enter the CS. They are both waiting for each other, resulting in a deadlock. Let's try one more attempt by combining Turn and Want.

int Turn;
Want[0] = 0;
Want[1] = 0;

// process 0
Want[0] = 1; 
Turn = 1;
while (Want[1] && Turn == 1);
// critical section
Want[0] = 0;

// process 1
Want[1] = 1;
Turn = 0;
while (Want[0] && Turn == 0);
// critical section
Want[1] = 0;

This finally works, and is known as Peterson's Algorithm.

Peterson's Algorithm

Some disadvantages of this include busy waiting: the waiting process repeatedly tests the while-loop condition instead of going into a blocked state. It is also a low-level construct, where we manipulate individual variables to implement synchronization, which is error prone; a higher-level programming construct is more desirable. It is also not general and not easy to extend to more than 2 processes.

Synchronization Mechanism

Edsger W. Dijkstra proposed a mechanism called the semaphore, which has 2 main operations: proberen (P), or Wait(), and verhogen (V), or Signal().

Proberen means "to test" and is a decrement operation: it tries to decrement the semaphore and blocks if it cannot. Verhogen means "to increment" and increments the semaphore.

A semaphore is a generalized synchronization mechanism. Interestingly, only the behaviours of the operations are defined, not their implementations, so different implementations are possible. Semaphores give us a way to block a number of processes (known as sleeping processes) and to unblock/wake up 1 or more sleeping processes.

This is important because the main point of synchronization is that when a process runs, it often has to wait for something to happen. It is important that such a process gets blocked and does not proceed while it waits (whether it has not yet received a value it needs or another process is in the critical section).

If process p waits for q, then once q finishes running, it should unblock p, otherwise p cannot carry on. Similarly for critical sections: once a process exits, it should unblock other processes so they can enter the CS as well.

Semaphore

The 2 fundamental operations of semaphores are Wait() and Signal(). A semaphore S contains an integer value and is a protected integer variable: any operations on it are atomic, so Wait() and Signal() on S are always atomic. It can be initialized to any non-negative integer value.

Wait() examines the value inside the semaphore. If it is positive, it decrements the semaphore by 1; if it is 0, the process blocks. Signal() wakes one of the sleeping processes if any exist; otherwise it increments the semaphore. Note that semaphores generally do not go negative, so Signal() hands its increment to a sleeping process instead of incrementing S.

wait(int s) {
    if (s <= 0) {
        sleep(); // block: calling process joins the semaphore's queue
    } else {
        s--;
    }
}

signal(int s) {
    if (!q.empty()) {
        wakeup(deq()); // wake one sleeping process instead of incrementing
    } else {
        s++;
    }
}


A semaphore implementation thus has a protected integer variable and a queue of processes blocked on the semaphore. When we perform a wait, we examine the value inside the semaphore. If it is 0 and a process p calls wait, p blocks and ends up in the list of blocked processes. Later, when some process q executes a signal, if p is in the blocked queue it is woken up.

If the value initially is 1 and p calls wait (also known as P() or Down()), the value is decremented to 0 and p continues. Later, when p calls signal, if there are no processes blocking on the semaphore, the value is incremented to 1. The signal (also known as V() or Up()) never blocks.


Properties

Our initial semaphore value should always be >= 0, leading us to the following invariant:

\[S_{current} = S_{initial} + \#signal(S) - \#wait(S)\]

in other words, the current semaphore value is always equal to the initial value + number of signals (since it increments) - number of waits (since it decrements).

General and Binary

There are 2 types of semaphores: general and binary. A general semaphore can take any non-negative integer value (< MAX_INT) and is known as a counting semaphore, as one of its uses is to limit access to something (limit x processes to do something by initializing the value to x). A binary semaphore is restricted to either 0 or 1.

General semaphores are provided for convenience, but binary semaphores are sufficient (they can mimic general ones). Binary semaphores are also very useful for critical sections, where you want to limit entry to 1 process (mutual exclusion).

int s = 1; // binary semaphore

wait(s); // process p calls wait, s decrements to 0
// critical section
signal(s);

When s is 0 and q calls wait, q gets blocked until p exits the critical section and calls signal to unblock q. (When q eventually calls signal, s is incremented back to 1.) Here, s can only be 0 or 1, and such a semaphore is commonly known as a mutex (mutual exclusion).

Mutex

We will try to show that the critical section is correctly enforced with a proof that \(N_{cs}\) (number of processes in the critical section) is at most 1.

informal proof

  1. \(N_{cs}\) = number of processes in the critical section
  2. These processes have completed wait() but not signal()
  3. Thus, \(N_{cs} = \#wait(s) - \#signal(s)\)
  4. \(S_{initial} = 1\)
  5. \(S_{current} = 1 + \#signal(s) - \#wait(s)\) (from the earlier invariant)
  6. \(S_{current} = 1 - N_{cs}\) (substituting step 3)
  7. Since \(S_{current} >= 0\), we have \(1 - N_{cs} >= 0\)
  8. Hence \(N_{cs} <= 1\)

Another thing about semaphores is that they are deadlock-proof (with a caveat). A deadlock means that all processes are stuck at wait(s).

informal proof

  1. We found earlier that \(S_{current} = 1 - N_{cs}\)
  2. Suppose a deadlock occurs: all processes are stuck at wait(s)
  3. This means \(S_{current} = 0\) and \(N_{cs} = 0\)
  4. But substituting into \(S_{current} = 1 - N_{cs}\) gives \(S_{current} = 1\), a contradiction

Semaphores should also be starvation-free. If p1 is blocked at wait(s) and p2 is in the CS, then once p2 exits with signal(s), if there are no other processes sleeping, p1 wakes up. Otherwise, assuming the scheduling policy is fair, p1 will eventually wake up. Two assumptions are being made here: first, processes that exit the critical section always call signal; second, scheduling is fair, so every blocked process is eventually woken.

Incorrect use

Although wait and signal operations are deadlock-free, semaphores can still deadlock with incorrect usage. This occurs when there are 2 non-shareable resources used in a cyclic fashion.

int p = 1; // semaphore
int q = 1; // semaphore

// process 1
wait(p);
wait(q);
// critical section
signal(q);
signal(p);

// process 2
wait(q);
wait(p);
// critical section
signal(p);
signal(q);

Imagine the scenario where process 1 acquires p, decrementing it to 0. Control then passes to process 2, which acquires q, decrementing it to 0. Now process 2 attempts to acquire p (now 0) and blocks, handing control back to process 1. Process 1 attempts to acquire q (also 0) and blocks as well. Both processes are now blocked, and neither can reach the code that would unblock the other.

Notice that this occurs in a very specific circumstance: 2 processes try to acquire 2 or more resources, cannot acquire them at the same time, and acquire them in a flipped order. If they acquire the resources in the same order, there won't be a deadlock. Since this is a very specific circumstance, it may not happen every time.

Semaphores are very powerful: there are no known unsolvable synchronization problems (so far). Other high level abstractions provide extended features, such as the condition variable. Condition variables are like semaphores initialized to 0: multiple processes can wait on one (so they block), and it has the ability to broadcast, where one process comes by, signals the condition variable, and instead of unblocking 1 process, unblocks all of them.

POSIX

Semaphores are usually used in the context of threads in Linux/POSIX, and not often for processes (the semaphore has to be shared). To use them, #include <semaphore.h> and, when compiling, do gcc file.c -lrt, which links the real-time library containing the semaphore implementation (on modern glibc the sem_* functions live in the pthread/libc libraries, so -pthread typically suffices).

For basic usage, a semaphore can be initialized to a value, then wait and signal can be performed on it.

pthreads

pthreads also has mutexes and condition variables. pthread_mutex is a binary semaphore (equivalent to a semaphore initialized to 1) with functions pthread_mutex_lock() and pthread_mutex_unlock(). pthread_cond is a condition variable with functions pthread_cond_wait(), pthread_cond_signal() (wakes 1 process) and pthread_cond_broadcast() (wakes all waiting processes).

Condition variables have to be used together with a mutex (calling pthread_cond_wait() requires a mutex argument).