Process Management

Process ID & State

Process Identification

Now we look at how the OS manages processes.

Processes are identified by a number known as the process ID (PID), which is unique to each process. However, there are some OS-dependent issues: when a process ends, is its PID reused? Is there a limit on the number of processes that can be created? Are any PIDs reserved for specific processes?

Unix

In Unix, PID 1 is reserved for a special process called init.
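As a quick illustration, a process can ask the OS for its own PID and for the PID of the process that created it. This is a minimal sketch using Python's standard `os` module; the printed values differ from run to run and from system to system.

```python
import os

# PID of the current process, assigned by the OS when it was created.
pid = os.getpid()
# PID of the parent process (the one that created this process).
parent_pid = os.getppid()

print(f"my PID: {pid}, parent PID: {parent_pid}")
```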

Process state

Apart from a PID, a process also has a process state. In a multitasking scenario (where your computer runs more than 1 thing at a time), an obvious observation is that a process can be running or not running, but it could also be ready to run without actually executing on the CPU. Thus, the process state indicates its execution status.

In the simplest model, a process is either running or not running. Remember that under our general assumption there is only 1 CPU with 1 core, so we can only do 1 thing at a time. When the OS is running, the CPU is fetching and executing the OS's instructions, so no user process is running. Another scenario is when there are multiple processes: when one runs, the others do not, since the CPU can only do 1 thing at a time.

Not running

Just because a process is not running does not mean it is incapable of running. It may have everything it needs (all its inputs etc.) but no CPU to run on. These processes are in the ready state.

Part of our OS is a scheduler, which picks a process from the pool of ready processes to run. When a process is picked to run, we perform a context switch, in which the hardware context (the register values a process is using) is saved.

Since there is only 1 set of registers in the CPU, it has to be shared between processes, and since every process has its own set of values, the register contents differ from process to process. This means that when we context switch, the current values have to be saved somewhere and the new process's values have to be loaded into the registers.
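The save-and-restore step can be sketched as a small simulation. This is only an illustration: the register names (`pc`, `sp`, `r0`), the PIDs and the `context_switch` helper are made up for the example, and a real OS does this in low-level code rather than with dictionaries.

```python
# The single set of CPU registers (names are assumed, for illustration only).
cpu_registers = {"pc": 0, "sp": 0, "r0": 0}

# Saved hardware context for each process, keyed by a hypothetical PID.
saved_context = {
    1: {"pc": 100, "sp": 5000, "r0": 7},
    2: {"pc": 200, "sp": 6000, "r0": 42},
}

def context_switch(old_pid, new_pid):
    """Save the running process's registers, then load the next one's."""
    if old_pid is not None:
        saved_context[old_pid] = dict(cpu_registers)  # save current values
    cpu_registers.update(saved_context[new_pid])      # load new values

context_switch(None, 1)      # process 1 starts running
cpu_registers["pc"] += 4     # process 1 executes an instruction
cpu_registers["r0"] = 8
context_switch(1, 2)         # process 1 is switched out, process 2 runs
context_switch(2, 1)         # later, process 1 resumes where it left off

print(cpu_registers)
```

Because process 1's values were saved before process 2 was loaded, process 1 sees exactly the register contents it had when it was switched out.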

simpleModel

Once the context switch is complete, the ready process is in the running state. This means that its instructions are now being actively picked up by the CPU and executed.

At some point, the process stops running, its register contents are saved (the register contents of another process get loaded and that process runs) and it switches back to the ready state. This set of states and transitions is known as a process model.

State Diagram

processModel

breakdown

New, Ready, Running, Blocked, Terminated

new

Executable/binary files contain a header (which gives us information about memory usage, global variables, etc.) and the compiled code for the program (the binary instructions you saw in CS2100). When loading a program, the OS organises the memory for the program (the header info is used to figure out how much memory is needed for text, stack, data, heap etc.) and allocates it.

This happens when we create a new process (new state). Essentially, the OS opens the executable file, looks at the header, works out and allocates the memory needed, and does a few other things.
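To make the "figure out and allocate memory" step concrete, here is a rough sketch that turns region sizes from a header into address ranges. The header fields, the sizes and the `layout_memory` helper are all hypothetical; real executable formats store far more information, and real layouts do not simply place regions back to back.

```python
def layout_memory(header, base=0):
    """Assign [start, end) address ranges to each region, back to back."""
    regions = {}
    start = base
    for name in ("text", "data", "heap", "stack"):
        size = header[name + "_size"]
        regions[name] = (start, start + size)
        start += size
    return regions

# Hypothetical header values, chosen only for the example.
header = {
    "text_size": 4096,
    "data_size": 1024,
    "heap_size": 8192,
    "stack_size": 2048,
}
regions = layout_memory(header, base=0x1000)

for name, (start, end) in regions.items():
    print(f"{name}: {start:#x} to {end:#x}")
```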

ready

At some point, the OS has initialized the process's memory, loaded the compiled code, global variables etc., and the process is now ready to run. But just because it is ready doesn't mean it should run right away. If the currently running process is more important, the new process waits in the ready state.

run1

When the scheduler decides the process should run, the OS performs a context switch. The register values of the currently running process are saved somewhere, and the register values of the new process are loaded.

Once the context of this process is restored, the OS hands control over to it and the process switches to the running state.

run2

At some point the OS decides the process has run for long enough, so it saves the process's context, restores the context of another process and switches this process back to the ready state.

block1

While running, a process may end up waiting for something (e.g. MS Word waiting for the user to press a key). While waiting, it does not make sense for it to continue running (the wait can be worth a few million instructions!), so the OS switches it into a blocked state.

block2

Once the event happens, e.g. a key is pressed, the process switches back to the ready state (not running), where the cycle continues.

terminate

Finally, when the process finishes, e.g. Ctrl + Q is hit, it moves to a terminated state. All memory and resources used by the process are released so they can be recycled.

In summary, the 5 states and their transitions are:

Create (nil -> new), process created
New: New process created, may still be under initialization, not ready yet
Admit (new -> ready), process ready to be scheduled for running
Ready: Process waiting to run
Switch (ready -> running), process selected to run
Switch (running -> ready), process gives up CPU voluntarily or is preempted by the scheduler
Running: Process being executed on the CPU
Event wait (running -> blocked), process requests an event/resource/service that is currently unavailable or in progress
Event occurs (blocked -> ready), process can continue
Blocked: Process waiting (sleeping) for an event, cannot execute until the event is available
Terminated: Process has finished execution, may need OS cleanup
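The model can be written down as a small state machine. The sketch below follows the walkthrough above, in which a blocked process returns to the ready state once its event occurs and a running process moves to terminated when it finishes. The names `State`, `TRANSITIONS` and `move` are made up for the example.

```python
from enum import Enum

class State(Enum):
    NEW = "new"
    READY = "ready"
    RUNNING = "running"
    BLOCKED = "blocked"
    TERMINATED = "terminated"

# Legal transitions of the 5-state model, keyed by (from, to).
TRANSITIONS = {
    (State.NEW, State.READY): "admit",
    (State.READY, State.RUNNING): "switch (scheduled)",
    (State.RUNNING, State.READY): "switch (released)",
    (State.RUNNING, State.BLOCKED): "event wait",
    (State.BLOCKED, State.READY): "event occurs",
    (State.RUNNING, State.TERMINATED): "exit",
}

def move(current, target):
    """Return the new state, or raise if the model forbids the transition."""
    if (current, target) not in TRANSITIONS:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target

# One possible life of a process: it blocks once, then finishes.
state = State.NEW
for nxt in (State.READY, State.RUNNING, State.BLOCKED,
            State.READY, State.RUNNING, State.TERMINATED):
    print(f"{state.value} -> {nxt.value}: {TRANSITIONS[(state, nxt)]}")
    state = move(state, nxt)
```

Note that there is no direct blocked -> running entry: a process whose event has occurred must go through the ready state and be picked by the scheduler again.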

Note that the state diagram is per process (each has its own state diagram). Given n processes on 1 CPU, there is at most 1 process running (it can be 0 when the OS is running). The OS is a program, not a process, and does not generally maintain a process state for itself.

The OS is the one that creates and manages processes but is itself not a process. (This isn't 100% true: there can be parts of the OS that are processes, like the shell, but the kernel itself is not a process.)

Conceptually, if we have only 1 CPU, there is at most 1 transition at a time. With n CPUs, we have at most n processes in a running state, and these n processes can change state simultaneously (parallel state transitions).

Remember that the state diagram is per-process, so each process can be in a different state, i.e. a different part of its own state diagram.

Queueing Model

So how does the OS implement the above state diagram? It makes use of queues!

queueModel

breakdown

Admit, Switch, Release, Wait, Exit

admit

When a process is admitted, the OS goes through the steps of reading the program file header, allocating the memory required, loading the process code etc. When done, the process is placed into the ready queue (the queue of processes ready to run), which can be a FIFO queue or a priority queue (prioritised on various criteria).
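As an illustration of the priority-queue option, here is a minimal ready queue built on Python's `heapq`. The convention that a lower number means higher priority, and the helper names `admit` and `pick_next`, are assumptions made for this sketch and not how any particular OS does it.

```python
import heapq
import itertools

ready_queue = []               # heap of (priority, arrival order, pid)
arrival = itertools.count()    # tie-breaker keeps equal priorities FIFO

def admit(pid, priority):
    """Add a process to the ready queue (lower number = higher priority)."""
    heapq.heappush(ready_queue, (priority, next(arrival), pid))

def pick_next():
    """Remove and return the PID of the highest-priority ready process."""
    _, _, pid = heapq.heappop(ready_queue)
    return pid

admit(10, priority=2)
admit(11, priority=1)
admit(12, priority=2)
order = [pick_next() for _ in range(3)]
print(order)
```

Process 11 is picked first because of its priority, while 10 and 12 keep their arrival order. With every priority set to the same value, this behaves like a plain FIFO queue.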

switch

From the ready queue, the OS scheduler picks 1 process to run and performs a context switch so that the process enters the running state.

release

At some point, the OS decides the process should stop, saves the context of the process and puts it back into the ready queue.

wait

Alternatively, the process may be waiting for something, in which case it goes into the blocked queue. Once it receives what it was waiting for, it is removed from the blocked queue and put into the ready queue.

exit

At the end, when the process quits, it exits; all its resources are released and reused for other processes.
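The five operations above can be sketched with two queues and a single "running" slot. This is a toy simulation for 1 CPU with a FIFO ready queue; the function names and PIDs are invented for the example, and the context switching itself is left out.

```python
from collections import deque

ready_queue = deque()    # processes waiting for the CPU (FIFO here)
blocked_queue = deque()  # processes waiting for an event
running = None           # at most 1 running process on a single CPU

def admit(pid):
    ready_queue.append(pid)

def switch():
    """Scheduler picks the process at the head of the ready queue."""
    global running
    running = ready_queue.popleft() if ready_queue else None

def release():
    """The running process is stopped and returns to the ready queue."""
    global running
    ready_queue.append(running)
    running = None

def wait():
    """The running process blocks on an event."""
    global running
    blocked_queue.append(running)
    running = None

def event_occurs(pid):
    """The awaited event arrives: move the process back to ready."""
    blocked_queue.remove(pid)
    ready_queue.append(pid)

def exit_process():
    """The running process quits; its resources would be released here."""
    global running
    running = None

admit(1)
admit(2)
switch()          # process 1 runs
wait()            # process 1 blocks, e.g. waiting for a key press
switch()          # process 2 runs
event_occurs(1)   # key pressed: process 1 becomes ready, not running
release()         # process 2 is stopped and rejoins the ready queue
switch()          # process 1 runs again
exit_process()    # process 1 quits

print(running, list(ready_queue), list(blocked_queue))
```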

Let's see what we know so far. When a program is under execution, we need more information: the memory context (text, which stores instructions; data, which holds global variables; stack, which holds call frames and local variables; and heap, which holds dynamically allocated variables), the hardware context (GPRs, stack pointer, PC etc.) and now the OS context (process ID, process state etc.).

This is the idea of the process state model. Now let's look at how this information is managed.

Process Table & Process Control Block

The fundamental data structure used to manage processes is the process control block (PCB, also called a process table entry), which contains the entire execution context of a process. The kernel maintains a PCB for every process. Conceptually, all PCBs are stored in 1 table (representing all processes).

However, we have a few interesting issues:

Scalability: how many processes can potentially be run (how many concurrent processes)?
Efficiency: the OS is a program, so all its data structures occupy memory. Since memory is limited, that leaves less space for running processes. The process table should therefore provide efficient access with minimum space wastage.

Process Table

In our process table, there is a PCB for each process currently loaded. Within each PCB, we have the hardware context (in red), the memory context (in green), which needs to be divided into multiple regions, and the OS context (in purple). Note that the PID is often not stored and is just an index into the process table.

processTable

In addition, the memory regions not only tell us where the text, stack, data and heap are; since memory has to be shared across many processes, they also tell us where the regions for different processes start and end.
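Putting the pieces together, a PCB and process table might look roughly like the sketch below. Everything here is illustrative: the field names, the fixed table size `MAX_PROCESSES`, and the first-free-slot policy (which reuses PIDs) are assumptions, and real kernels store much more per process.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class PCB:
    # Hardware context: saved register values (names are illustrative).
    registers: Dict[str, int] = field(default_factory=dict)
    # Memory context: (start, end) addresses of each region.
    regions: Dict[str, Tuple[int, int]] = field(default_factory=dict)
    # OS context: bookkeeping such as the process state.
    state: str = "new"

MAX_PROCESSES = 4  # assumed fixed table size, which caps concurrency

# The slot index doubles as the PID, so no PID field is stored in the PCB.
process_table: List[Optional[PCB]] = [None] * MAX_PROCESSES

def create_process() -> int:
    """Place a new PCB in the first free slot and return its index as PID."""
    for pid, entry in enumerate(process_table):
        if entry is None:
            process_table[pid] = PCB()
            return pid
    raise RuntimeError("process table full")

def terminate(pid: int) -> None:
    """Free the slot so that it (and the PID) can be reused."""
    process_table[pid] = None

a = create_process()
b = create_process()
terminate(a)
c = create_process()   # takes the slot that a freed
print(a, b, c)
```

A fixed-size table gives fast lookup by PID but limits how many processes can exist at once, which is the scalability and efficiency trade-off mentioned earlier.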