Inter-Process Communication

Why do we need inter-process communication? Cooperating processes cannot easily share information, because each process has its own independent memory space. For example, if a process has a global variable and calls fork(), the parent and the child each end up with a separate copy, so an increment made in one process is not visible in the other. This makes global variables useless for communicating between processes.
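The effect of fork() on a global variable can be checked with a short sketch. The names counter and parent_view_after_child_increment are made up for this illustration: the child increments its copy of the global, and the parent then reports the value it still sees.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int counter = 0; /* global: each process gets its own copy after fork() */

/* Forks a child that increments `counter`, waits for the child to finish,
 * and returns the value the parent still sees (or -1 if fork() fails). */
int parent_view_after_child_increment(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {          /* child: modifies only its own copy */
        counter++;
        _exit(0);
    }
    waitpid(pid, NULL, 0);   /* parent: wait until the child is done */
    return counter;          /* not affected by the child's increment */
}
```

Calling parent_view_after_child_increment() in a fresh process returns 0: the child's increment happened in a different address space.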

There are 2 common IPC mechanisms, shared memory and message passing, and 2 Unix-specific ones, pipes and signals.

Shared Memory

The general idea is that process p1 creates a shared memory region m and process p2 attaches m to its own memory space. This allows p1 and p2 to communicate through m. The shared memory region m behaves like a normal memory region: to read from or write to it, we simply dereference a pointer, e.g. *m. No system calls are involved (other than creating and attaching to m). Any write to the region can be seen by all other parties.

The same model is applicable to multiple processes sharing the same memory region.

sharedMem

The only places with OS calls are "Create M" and "Attach M". Reads from and writes to m are pure pointer operations. Thus the only time we incur the overhead of switching to and from kernel mode is when we create and attach. At the end, there is also a detach and a delete, which involve OS operations as well.

Advantages

This is efficient, as OS calls are needed only to create and attach the region (and to detach and delete it at the end). The rest of the time, communication uses only pointer operations. It is also easy to use.

Disadvantages

Unfortunately, shared memory provides no synchronization by itself. If processes A and B communicate through m, nothing makes B wait for A to write to m before reading, so B reads whatever is in m regardless of whether A has written to it. The processes have to synchronize explicitly by some other means. And although shared memory is easy to use, its implementation is slightly complicated, since it involves the virtual memory system. Without virtual memory support in the hardware, it is even harder!

POSIX

For POSIX shared memory in Unix, the basic steps are as follows:

Create/locate a shared memory region M
Attach M to process memory space
Read from/Write to M
- Values written are visible to all processes that share M
Detach M from memory space after use
- process no longer has access to shared memory
- can reattach later on
- shared memory still exists, still consuming resources
Destroy M
- Only one process needs to do this
- M is actually removed only once it is no longer attached to any process
- Removes the shared memory and deallocates its resources
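The steps above can be sketched with the System V style calls shmget(), shmat(), shmdt() and shmctl(), which are an assumption here since the notes do not name the exact API. The function name share_value_with_child is made up for this illustration. Waiting for the child with waitpid() stands in for real synchronization, which shared memory does not provide.

```c
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child writes `value` into a shared region; the parent reads it back.
 * Returns the value the parent sees, or -1 if a call fails. */
int share_value_with_child(int value)
{
    /* 1. Create a private shared memory region M, big enough for one int. */
    int shmid = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);
    if (shmid == -1)
        return -1;

    /* 2. Attach M; the child inherits the attachment across fork(). */
    int *m = shmat(shmid, NULL, 0);
    if (m == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    *m = 0;

    pid_t pid = fork();
    if (pid == 0) {
        *m = value;                  /* 3. plain pointer write */
        shmdt(m);                    /* 4. detach in the child */
        _exit(0);
    }

    int seen = -1;
    if (pid > 0) {
        waitpid(pid, NULL, 0);       /* crude sync: child has finished */
        seen = *m;                   /* 3. plain pointer read */
    }

    shmdt(m);                        /* 4. detach in the parent */
    shmctl(shmid, IPC_RMID, NULL);   /* 5. destroy M, done by one process */
    return seen;
}
```

For example, share_value_with_child(42) returns 42 when all calls succeed: the child's write is visible to the parent because both processes have the same region attached.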

Message Passing

The general idea here is that process p1 prepares a message m and uses an OS call, e.g. send(p2, m), to send it to process p2. p2 receives the message with, e.g., x = receive(p1). Message sending and receiving are usually provided as system calls.

The system calls work as follows. First, the message passing mechanism has to be initialized, at which point the OS allocates a region of kernel memory for the communication. When p1 performs a send, the data is copied into that kernel memory. When p2 performs a receive, the data is read from there and passed to p2. Since the buffer lives inside the kernel, send and receive are usually system calls, which tends to make them slow.

Some additional properties include naming (how to identify the other party in the communication) and synchronization (the behaviour of the sending/receiving operations).

messagePass

There are 2 naming schemes (2 different ways to specify who to talk to). The first is the direct naming scheme, where we explicitly name the process we want to send to or receive from, as seen earlier. The second is the mailbox scheme, where we create a mailbox, e.g. m = mailbox(). We then send via the mailbox with m.send(msg), and the other party receives from the same mailbox.

There are also 2 different synchronization behaviours, synchronous and asynchronous. Under synchronous message passing, when p1 does a send, the send blocks until p2 does a receive. Under asynchronous message passing, p1 continues right after sending (it doesn't block). An asynchronous receive by p2 likewise returns immediately, whether or not a message is available.
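As a rough sketch of the mailbox scheme, a System V message queue can play the role of the mailbox. This is an assumed stand-in, not necessarily the mechanism the notes have in mind, and the mailbox_* names are made up for this illustration. Here send is asynchronous (it returns once the kernel has buffered the message), and receive can be either blocking or non-blocking.

```c
#include <errno.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

struct message {
    long mtype;        /* required by msgsnd/msgrcv; must be > 0 */
    char text[64];
};

/* Creates a fresh mailbox and returns its identifier (-1 on failure). */
int mailbox_create(void)
{
    return msgget(IPC_PRIVATE, IPC_CREAT | 0600);
}

/* Send: the message is copied into the kernel-side queue. */
int mailbox_send(int mailbox, const char *text)
{
    struct message msg = { .mtype = 1 };   /* text is zero-filled */
    strncpy(msg.text, text, sizeof(msg.text) - 1);
    return msgsnd(mailbox, &msg, sizeof(msg.text), 0);
}

/* Receive: with blocking == 0 the call returns -1 with errno == ENOMSG
 * immediately if the mailbox is empty, instead of waiting. */
int mailbox_receive(int mailbox, char *out, size_t outsz, int blocking)
{
    struct message msg;
    int flags = blocking ? 0 : IPC_NOWAIT;

    if (outsz == 0)
        return -1;
    if (msgrcv(mailbox, &msg, sizeof(msg.text), 0, flags) == -1)
        return -1;
    strncpy(out, msg.text, outsz - 1);
    out[outsz - 1] = '\0';
    return 0;
}

/* Removes the mailbox and frees its kernel resources. */
void mailbox_destroy(int mailbox)
{
    msgctl(mailbox, IPC_RMID, NULL);
}
```

A non-blocking mailbox_receive() on an empty mailbox fails at once with ENOMSG, while a blocking one would wait for a sender. Every send and receive here is a system call, which is the cost discussed above.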

Advantages

Message passing is portable: it can usually be packaged as a user library and implemented on any OS. Synchronization is easier too, since blocking sends and receives make the processes wait for each other.

Disadvantages

If every send and receive is implemented as a kernel call, message passing is inefficient. It can also be harder to use, as we need explicit send/receive operations rather than plain pointer operations.

Unix Pipes

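Unix pipes are a communication and synchronization mechanism that originated in Unix (and is available in Linux and other Unix variants). A pipe is a communication channel. Every Unix process is given 3 standard files: stdin (default input, generally connected to the keyboard), stdout (default output, sent to the screen) and stderr (for error messages, also sent to the screen by default). stdout can be redirected (written) to a file instead of being printed to the screen. Note that stderr is a separate channel: redirecting stdout alone leaves error messages on the screen, and stderr has to be redirected separately (e.g. with 2> in the shell).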

With pipes, we connect the stdout of one process to the stdin of another process; this connection is called a Unix pipe. It is not set up by default. More generally, we make use of pipes as a communication channel between processes.

pipe

On the Unix command line, the symbol | denotes a pipe. It links the output channel of one process to the input channel of another, which is known as piping.

ls

When you type the command ls, its output is usually printed to the screen. When ls is connected to less with a pipe, i.e. ls -l | less, whatever would have been printed to the screen is read by less instead, which displays 1 screenful at a time; hitting the spacebar shows the next screenful.

Unix pipes are one of the earliest IPC mechanisms. The general idea is that we create a communication channel with 2 ends: a writing end, where we put data in, and a reading end, where we take data out.

unixPipe

IPC Mechanism

unixIPC

Process P writes abcd to the shared pipe and process Q reads abcd from the pipe. By default, pipes are character-oriented (read/write char by char). This is a form of producer-consumer relationship. We will study the behaviour, semantics and variants of the pipe.

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define READ_END 0
#define WRITE_END 1

int main()
{
    int pipeFd[2]; // arr of 2 int
                   // pipeFd[READ_END]  read from pipe
                   // pipeFd[WRITE_END] write to pipe
    pid_t pid;
    ssize_t len;
    char buf[100];
    char *str = "Hello there!";

    // creates the pipe and fills in its 2 file descriptors,
    // a way to identify an open file
    if (pipe(pipeFd) == -1) {
        perror("pipe");
        return 1;
    }

    if ((pid = fork()) > 0) {    /* parent (write) */
        close(pipeFd[READ_END]); // close reading end

        // system call(writing end, string, no. of bytes + \0)
        write(pipeFd[WRITE_END], str, strlen(str) + 1);
        close(pipeFd[WRITE_END]); // close write end
    } else if (pid == 0) {        /* child (read) */
        // since child is a copy of the parent,
        // it has its own copy of the pipe descriptors
        close(pipeFd[WRITE_END]);

        // system call(reading end, buffer to read into, size of buffer)
        len = read(pipeFd[READ_END], buf, sizeof(buf));
        if (len > 0) // pid is 0 in the child
            printf("Proc %d read: %s\n", (int)pid, buf);
        close(pipeFd[READ_END]);
    } else {                      /* fork failed */
        perror("fork");
        return 1;
    }
    return 0;
}

writeRead

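Notice that you always close the end you are not using. This is what lets the writer and the reader be notified correctly: a writer receives the SIGPIPE signal only when every read end of the pipe has been closed, and a reader sees EOF (read() returns 0) only when every write end has been closed. If unused ends are left open, the writer and the reader will not get these notifications; for example, a reader may block forever waiting for data.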
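A minimal sketch of the reader's side, using a single process for simplicity: once every write end of a pipe is closed, read() first drains what is still buffered and then returns 0 (end of file). The function name read_after_writer_closed is made up for this illustration.

```c
#include <sys/types.h>
#include <unistd.h>

/* Writes "abcd" into a pipe, closes the write end, then reads twice.
 * *first receives the byte count of the first read. The return value is
 * the result of the second read: 0 (end of file), since no write end is
 * open any more. Returns -1 if a call fails. */
ssize_t read_after_writer_closed(ssize_t *first)
{
    int fd[2];
    char buf[16];

    if (pipe(fd) == -1)
        return -1;
    if (write(fd[1], "abcd", 4) != 4) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);                                  /* last write end closed */

    *first = read(fd[0], buf, sizeof buf);         /* drains buffered bytes */
    ssize_t second = read(fd[0], buf, sizeof buf); /* nothing left: EOF */
    close(fd[0]);
    return second;
}
```

Had the write end been left open, the second read() would block instead of returning 0, since the kernel could not rule out more data arriving.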

With pipes, we can also redirect the standard communication channels (stdin, stdout, stderr) to one end of a pipe, so that a program which reads from stdin or writes to stdout transparently reads from or writes to the pipe. To do this, we make use of dup() or dup2() (used more often). dup2(oldfd, newfd) makes a copy of the file descriptor oldfd and assigns it to a known descriptor number newfd. With dup(), the caller does not choose the number: it returns the lowest-numbered unused file descriptor.
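The redirection can be sketched as follows. The function name capture_child_stdout is made up for this illustration: the child points its stdout at the write end of a pipe with dup2(), and the parent reads back whatever the child wrote to stdout.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs a child whose standard output is redirected into a pipe and copies
 * what the child wrote into `out` (NUL-terminated).
 * Returns the number of bytes captured, or -1 on failure. */
ssize_t capture_child_stdout(const char *msg, char *out, size_t outsz)
{
    int fd[2];

    if (outsz == 0 || pipe(fd) == -1)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {                          /* child */
        close(fd[0]);                        /* not reading */
        if (dup2(fd[1], STDOUT_FILENO) == -1)
            _exit(1);                        /* fd 1 should now be the pipe */
        close(fd[1]);                        /* the duplicate is enough */
        if (write(STDOUT_FILENO, msg, strlen(msg)) < 0)
            _exit(1);
        _exit(0);
    }

    close(fd[1]);                            /* parent: not writing */
    size_t total = 0;
    ssize_t n;
    while (total < outsz - 1 &&
           (n = read(fd[0], out + total, outsz - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

This is roughly what a shell arranges for the left-hand side of a | pipeline before running the command. The child itself only ever writes to descriptor 1 and is unaware that it is a pipe.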

Unix Signal

This is somewhat unique to Unix. It is a form of IPC that is asynchronous. A signal is a notification sent to a process/thread, usually by the OS, to inform the process that something has happened. The recipient has to handle the signal, either with a default handler or with a user-supplied handler (only possible for some signals).

SIGINT is the signal that is sent to a process when someone presses Ctrl-C (Ctrl-Z sends a different signal, SIGTSTP, which suspends the process). The default handler for SIGINT terminates the process. With a user-supplied handler, you can catch the signal and let the process continue, preventing someone from interrupting the program.
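A small sketch of catching the interrupt signal. The names on_sigint, got_sigint and catch_sigint_once are made up for this illustration, and raise() stands in for the user pressing the interrupt key. The handler only sets a flag, and the process carries on afterwards.

```c
#include <signal.h>

static volatile sig_atomic_t got_sigint = 0;

/* Handler: only sets a flag, which is safe to do inside a signal handler. */
static void on_sigint(int signo)
{
    (void)signo;
    got_sigint = 1;
}

/* Installs the handler, sends SIGINT to the current process, and reports
 * whether the handler ran (1) or not (0). Returns -1 if the handler
 * could not be installed. */
int catch_sigint_once(void)
{
    if (signal(SIGINT, on_sigint) == SIG_ERR)
        return -1;

    got_sigint = 0;
    raise(SIGINT);              /* delivered to this process */
    signal(SIGINT, SIG_DFL);    /* restore the default action */
    return got_sigint;          /* still running: the handler took over */
}
```

With the default action in place, the same raise(SIGINT) would have terminated the process. Here the function returns normally with the flag set.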

Unix signals

Some signals in Unix include kill, stop, continue, memory error, arithmetic error etc.

In the example below, we want to intercept the segmentation fault signal. The default handler just terminates the process, and "Segmentation fault" is printed to the screen, which can be annoying.

Custom Signal Handler
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>

// intercept the segmentation fault signal
void myOwnHandler( int signo ) // int signo used to pass signal number
{
    if (signo == SIGSEGV) { // match
        // note: printf is not async-signal-safe; tolerated in this
        // small demo because the process exits right after
        printf("Memory access blows up!\n");
        exit(1); // returning instead would re-run the faulting
                 // instruction and raise SIGSEGV again
    }
}

int main(){
    int *ip = NULL; // points to address 0: (void*) 0
    // signal fn(signal to capture, your own handler)
    if (signal(SIGSEGV, myOwnHandler) == SIG_ERR) // checks if assignment of handler is successful
        printf("Failed to register handler\n");

    *ip = 123; // attempts to write 123 to address 0, which generally
               // does not belong to the process: segmentation fault

    return 0;
}