Synchronization
Race Condition
Concurrent execution can cause certain problems. When we have 2 or more processes that execute concurrently in an interleaved fashion and share a modifiable resource, synchronization problems can arise (the switch between processes is caused by OS preemption and is not controlled by the processes; most commonly, a process runs out of its time quantum).
The execution of a single sequential process is deterministic: repeated execution gives the same result. Concurrent processes, however, may be non-deterministic. The execution outcome depends on the order in which the shared resource is accessed and modified, and a process has no way of predicting when it will be preempted. This is known as a race condition.

Here we have processes p1 and p2 modifying x. Recall that most processors we deal with are load-store architectures: operations can only take place on registers (except load/store), so the program cannot modify the variable directly and instead has to generate 3 separate instructions:
- Load X -> Register1
- Add 1000 to Register1
- Store Register1 -> X
Illustration

- p1 loads x (345) into reg1
- p1 increments reg1 by 1000 to 1345
- p1 stores the value 1345 in reg1 to x
- p2 loads x (now 1345) into reg1'
- p2 increments reg1' by 1000 to 2345
- p2 stores the value 2345 in reg1' to x

Here, no race condition occurs and the value written is correct.

- p1 loads x (345) into reg1
- p1 increments reg1 by 1000 to 1345, then gets preempted
- p2 loads x (still 345) into reg1'
- p2 increments reg1' by 1000 to 1345, gets preempted
- p1 stores the value 1345 in reg1 to x, gets preempted
- p2 stores the value 1345 in reg1' to x

Here, we end up with the wrong x value: 1345 instead of 2345, because p2's store overwrites p1's and one increment is lost.
The problem of race conditions is caused by unsynchronized access to a shared modifiable resource. To solve this, we designate code segments with race conditions as critical sections. A critical section is a section of code that modifies a shared modifiable resource. For example, in a circular buffer, the code that executes head = (head + 1) % buffSize modifies shared state and is a critical section.
Critical Section
When we have a critical section in our code, we wrap it in calls to enter and exit the CS. This synchronizes access to the shared resource: while one process is updating the shared variable, no other process should be allowed to enter the section that modifies it. When p1 from the earlier example is modifying x and p2 calls enter CS, p2 should block until p1 calls exit CS.
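The lost listing here presumably showed the typical structure, roughly (a reconstruction; enter_cs/exit_cs are placeholder names):

```
// process Pi
while (true) {
    enter_cs()      // blocks while another process is inside
    // critical section
    exit_cs()       // lets a waiting process in
    // remainder (non-critical) code
}
```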
Some correct CS implementation properties include
- Mutual Exclusion
  - If process p1 is executing in the critical section, all other processes are prevented from entering it
- Progress
  - If no process is in the critical section, one of the waiting processes should be granted access
- Bounded Wait
  - After a process pi requests to enter the critical section, there is an upper bound on the number of times other processes can enter the section before pi is granted access
- Independence
  - A process not executing in the critical section should never block another process from entering the CS
Incorrect synchronization can cause problems like
- Deadlock
  - All processes get blocked, so there is no progress
  - Only occurs if a resource is shared
  - Mechanisms can be built to detect when a deadlock is about to occur and switch to another resource (hopefully not shared)
- Livelock
  - Usually caused by deadlock avoidance: both processes back off at the same time, then collide at the same resource again
  - Processes keep changing state to avoid deadlock and make no other progress
  - Typically the processes are not blocked
  - To the OS, the processes are running but not doing anything
- Starvation
  - Some processes are blocked forever
Some implementations we can use are assembly level (mechanisms provided by the CPU), high-level language (utilizes normal programming constructs) and high-level abstractions (abstracted mechanisms that provide additional useful features, commonly implemented on top of the assembly-level mechanisms).
Assembly Level Implementation
Test and Set
This is an atomic machine instruction that aids synchronization. We specify a register and a memory location, e.g. TestAndSet register, memLocation: the contents of the memory location are written to the register, and a 1 is written to the memory location. Being a single machine operation, it is fully atomic, which means it is impossible for the instruction to be interrupted between reading the location and writing the 1.
Although this works, it is not efficient. A process stuck in the while loop spins until its time quantum runs out, and this repeats until some process exits the CS; the e.g. 10 ms during which nothing useful is done could have been used to execute some 3 million instructions. This is known as busy waiting and is a wasteful use of processing power. Variants such as Compare and Exchange, Atomic Swap, and Load Link/Store Conditional exist.
High Level Language Implementation
Initially, this may seem like a good idea
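The lost listing here presumably disabled interrupts around the critical section, roughly (a reconstruction):

```
disable_interrupts()   // no timer interrupt -> no preemption
// critical section
enable_interrupts()
```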
Since a context switch occurs when the time quantum runs out (via a timer interrupt), disabling interrupts means the context switch will not occur and the process can run to completion. This seems good until the processes are on different CPUs: disabling interrupts for p0 on CPU 0 does not affect p1 on CPU 1. The process could also cause problems, e.g. crash with a segmentation fault inside the CS, so that interrupts never get re-enabled. Generally, we never let processes disable interrupts.
Now let's explore a different try
This seems to work well, but if p0 never enters the CS, it never sets Turn to 1, so p1 can never enter the CS. Thus a process not in the CS can block another process from entering, violating independence. What if we give each process a separate variable?
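The lost listing here presumably gave each process its own want flag, along these lines (a reconstruction, not the original code):

```
// process Pi (j is the other process's index)
want[i] = 1;            // (a) announce intent to enter
while (want[j] == 1)
    ;                   // busy-wait while the other also wants in
// critical section
want[i] = 0;            // exit, letting the other proceed
```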
Again, it works until a context switch occurs right after a process sets its want flag. Now both processes have their want set to 1 and neither can enter the CS; they are each waiting for the other, resulting in a deadlock. Let's make one more attempt by combining turn and want.
This finally works, and is known as Peterson's Algorithm.
Peterson's Algorithm
Some disadvantages of this include busy waiting: the waiting process repeatedly tests the while-loop condition instead of going into a blocked state. It is also a low-level construct: we manipulate individual variables, which is error-prone, so a higher-level programming construct is more desirable. Finally, it is not general: it is not easy to extend to more than 2 processes.
Synchronization Mechanism
Edsger W. Dijkstra proposed a mechanism called the semaphore, which has 2 main operations: proberen (P) or Wait(), and verhogen (V) or Signal().
Proberen means "to test" and is a decrement operation: it tries to decrement the semaphore and blocks if it cannot. Verhogen means "to increment" and increments the semaphore.
A semaphore is a generalized synchronization mechanism. Interestingly, only the behaviour of the operations is defined, not the implementation, so different implementations are possible. Semaphores give us a way to block a number of processes (known as sleeping processes) and to unblock/wake up 1 or more sleeping processes.
This is important because, in synchronization, a running process often has to wait for something to happen. It is important that such a process gets blocked, rather than proceeding, while it waits (whether it has not yet received a value it needs or another process is in the critical section).
If process p waits for q, then once q finishes running it should unblock p, else p cannot carry on. Similarly for critical sections: once a process exits, it should unblock the other processes so they can enter the CS as well.
Semaphore
The 2 fundamental operations of semaphores are Wait() and Signal(). A semaphore S contains an integer value and is a protected integer variable: any operation on it is atomic, so Wait() and Signal() on S are always atomic. It can be initialized to any non-negative integer value.
Wait() examines the value inside the semaphore. If it is positive, it decrements the semaphore by 1 and continues; if it is 0, the calling process blocks. Signal() either increments the semaphore or wakes 1 of the sleeping processes; the increment occurs only when there are no sleeping processes. Note that a semaphore never goes negative: when processes are sleeping, Signal() hands its "increment" to a sleeping process instead of increasing S.

As seen, the semaphore has a protected integer variable and a queue of processes blocked on the semaphore. When a process p performs a wait, it examines the value inside the semaphore. If the value is 0, p blocks and joins the list of blocked processes. Later, when some process q executes a signal, if p is in the blocked queue it is woken up.
If the value is initially 1 and p calls wait (also known as P() or Down()), the value is decremented to 0 and p continues. Later, when p calls signal (also known as V() or Up()), if no processes are blocked on the semaphore, the value is incremented back to 1. Signal never blocks.

Properties
Our initial semaphore value should always be >= 0, leading us to the following invariant:

\(S_{current} = S_{initial} + \#signal(s) - \#wait(s) \geq 0\)

In other words, the current semaphore value is always equal to the initial value plus the number of signals (since each increments) minus the number of waits (since each decrements).
General and Binary
There are 2 types of semaphores: general and binary. A general semaphore can take any non-negative integer value (< MAX_INT) and is also known as a counting semaphore, as one of its uses is to limit access to something (limit x processes to do something by initializing the value to x). A binary semaphore is restricted to the values 0 and 1.
General semaphores are provided for convenience, but binary semaphores are sufficient (they can mimic general ones). Binary semaphores are also very useful for critical sections, where you want to limit entry to 1 process (mutual exclusion).
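The lost listing here presumably wrapped the critical section in a binary semaphore, roughly (a reconstruction):

```
s = Semaphore(1)    // initialized to 1

wait(s)             // enter: decrement s to 0, or block if it is already 0
// critical section
signal(s)           // exit: wake a sleeper, or increment s back to 1
```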
When s is 0 and q calls wait, q blocks until p exits the critical section and calls signal to unblock q (when q eventually calls signal, s is incremented back to 1). Here, s can only be 0 or 1, and such a semaphore is commonly known as a mutex (mutual exclusion).
Mutex
We will try to show that the critical section is correctly enforced with a proof that \(N_{cs}\) (number of processes in the critical section) is at most 1.
informal proof
- Let \(N_{cs}\) = number of processes in the critical section
- These are the processes that completed wait() but not signal()
- Thus \(N_{cs} = \#wait(s) - \#signal(s)\)
- \(S_{initial} = 1\)
- \(S_{current} = 1 + \#signal(s) - \#wait(s)\) (from the invariant above)
- \(S_{current} = 1 - N_{cs}\) (substituting \(N_{cs} = \#wait(s) - \#signal(s)\))
- Since \(S_{current} \geq 0\), we get \(1 - N_{cs} \geq 0\)
- Hence \(N_{cs} \leq 1\)
Another thing about semaphores is that they are deadlock-proof (with a caveat). A deadlock here would mean that all processes are stuck at wait(s).
informal proof
- We found earlier that \(S_{current} = 1 - N_{cs}\)
- Suppose a deadlock occurs: all processes are stuck at wait(s)
- This means \(S_{current} = 0\) (wait blocks only at 0) and \(N_{cs} = 0\) (no process is in the CS)
- But this contradicts \(S_{current} = 1 - N_{cs}\), which would give \(S_{current} = 1\)
Semaphores should also be starvation-free. If p1 is blocked at wait(s) and p2 is in the CS, then once p2 exits with signal(s): if no other processes are sleeping, p1 wakes up; otherwise, assuming scheduling policies are fair, p1 will eventually wake up. Two assumptions are made here: firstly, processes that exit the critical section always call signal, and secondly, there is fair scheduling that eventually runs every blocked process.
Incorrect use
Although the wait and signal operations themselves are deadlock-free, semaphores can still deadlock through incorrect usage. This occurs when 2 non-shareable resources are used in a cyclic fashion.
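The lost listing presumably had the two processes acquire two semaphores (p and q, both initialized to 1) in opposite orders, roughly:

```
// p1:                 // p2:
wait(p)                wait(q)
wait(q)                wait(p)      // acquisition order is flipped
// use both resources  // use both resources
signal(q)              signal(p)
signal(p)              signal(q)
```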
Imagine the scenario where p1 acquires p and decrements it to 0. Control is then handed to p2, which acquires q and decrements it to 0. Now, p2 attempts to acquire p (now 0) and blocks, handing control back to p1. p1 attempts to acquire q (also 0) and blocks as well. Both processes are now blocked, and neither can reach the code that would unblock the other.

Notice that this occurs only in a very specific circumstance: 2 processes try to acquire 2 or more resources, cannot acquire them all at once, and acquire them in opposite orders. If they acquire the resources in the same order, there is no deadlock. Because the circumstance is so specific, the deadlock may not happen on every run.
Semaphores are very powerful: there are (so far) no known unsolvable synchronization problems. Other high-level abstractions provide extended features, such as the condition variable. A condition variable behaves like a semaphore initialized to 0: multiple processes can wait on it (and block). It adds the ability to broadcast: one process comes by, signals the condition variable, and instead of unblocking 1 process it unblocks all of them.
POSIX
In Linux/POSIX, semaphores are usually used in the context of threads and not often for processes (the semaphore has to be placed in shared memory to be shared across processes). To use them, #include <semaphore.h> and compile with gcc file.c -lrt, which links in the real-time library containing the semaphore implementation.
For basic usage, a semaphore is initialized to a value and then wait and signal are performed on it.
pthreads
pthreads also provide mutexes and condition variables. pthread_mutex is effectively a binary semaphore (equivalent to a semaphore initialized to 1) with operations pthread_mutex_lock() and
pthread_mutex_unlock(). pthread_cond is a condition variable with operations pthread_cond_wait(), pthread_cond_signal() (wakes 1 process) and pthread_cond_broadcast() (wakes all waiting processes).
Condition variables have to be used together with a mutex (calling pthread_cond_wait() requires a mutex argument).