[Lecture Notes by Prof Yuen and possibly others 2000-2001]

Single Machine - Multiple Machines
----------------------------------

You are all familiar with seeing hundreds of users sitting at their terminals working on the same machine, and also with working on a PC with several open windows all actively doing something. So the concept of a single machine pretending to be multiple machines is not new to you, but how is this achieved? What combination of hardware and software techniques is involved to realize the concept cheaply, reliably and on a large scale?

The first idea is time slicing: a processor can execute one program for a short period of time, then switch to another, then another. Human users take seconds, sometimes minutes, to type in a line of text or move/click a mouse, and would willingly wait for a similar amount of time for a simple response, and a longer period for more complex work to be done. If a processor spends 10 milliseconds on each user, it can interact with 100 users once per second, providing each with some quite lengthy processing, since 10 milliseconds is usually enough to execute thousands of instructions on behalf of a user.

But some users' input requires a simple response that takes less than 10 ms to produce, after which their programs have nothing to do until they get some more input; others require much longer processing. So the computer must provide for two kinds of program suspension: a voluntary one ("I am waiting for input; please execute some other program and wake me up when new input arrives"), and a forced one: at the end of 10 ms or some other pre-set limit, a user program that does not stop voluntarily will be forced to suspend, resuming only after other user programs have had their chance to execute. The forced suspension requires a hardware component called the interval timer, which signals the processor when a pre-set time limit expires, causing the processor to halt its current execution and jump to a system program that chooses a new user program for execution according to some program-swapping rules.

Since we have multiple user programs in the same machine taking turns to execute, it is necessary to ensure the clear separation of the code and data belonging to different programs. This is achieved by establishing memory partitions, usually by assigning distinct virtual addresses to different programs, so that any access to a memory location not belonging to a program is easily detected. In particular, input areas for keystrokes coming in from different terminals, and output areas for text and graphics going out to different screens, must be clearly demarcated. While the processor is running one user program, putting its output into its memory area for the terminal screen, other users may be typing in input which must be put into their correct memory locations, so that when their programs get the chance to execute, they can pick up their input data from where they expect it.

What we have just described is called multiprogramming. Multi-windowing, though it is done on small machines for single users, is actually more complex and came much later, because you also have to make a single terminal pretend to be multiple terminals. Using mouse clicks and moves, which are recorded in the input area, the user defines a set of windows occupying particular parts of the screen, and this is recorded in the output area of the terminal.
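Before going further with windowing, it may help to see the forced-suspension idea in code. The sketch below is only illustrative and makes several assumptions not in the notes: it uses a POSIX interval timer (setitimer with SIGALRM) to stand in for the hardware timer, simulates each "user program" with a busy loop, and applies the simplest possible swapping rule, namely taking the next program in round-robin order.

    /*
     * A minimal sketch (not from the notes) of forced suspension driven by
     * an interval timer, assuming a POSIX system: setitimer and SIGALRM
     * stand in for the hardware timer.  Each "user program" is simulated
     * by a busy loop that runs until the timer signal sets a flag; the
     * main loop then swaps to the next program in round-robin order.
     */
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/time.h>

    #define NPROGRAMS 3

    static volatile sig_atomic_t time_slice_expired = 0;

    /* Called when the pre-set time limit expires. */
    static void on_timer(int signo)
    {
        (void)signo;
        time_slice_expired = 1;
    }

    /* Simulated user program: works until forced to suspend. */
    static void run_user_program(int id)
    {
        unsigned long work = 0;
        while (!time_slice_expired)
            work++;                  /* stands in for useful instructions */
        printf("program %d suspended after %lu loop iterations\n", id, work);
    }

    int main(void)
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sigemptyset(&sa.sa_mask);
        sa.sa_handler = on_timer;
        sigaction(SIGALRM, &sa, NULL);

        /* Pre-set limit: a repeating 10 ms time slice. */
        struct itimerval slice;
        slice.it_interval.tv_sec  = 0;
        slice.it_interval.tv_usec = 10000;
        slice.it_value = slice.it_interval;
        setitimer(ITIMER_REAL, &slice, NULL);

        /* Program-swapping rule: simple round-robin, nine slices in all. */
        for (int n = 0; n < 9; n++) {
            time_slice_expired = 0;
            run_user_program(n % NPROGRAMS);
        }
        return 0;
    }

On a real system the forced suspension would save the whole processor state of the suspended program and restore that of the next one; here the handler merely sets a flag that the simulated program checks.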
Returning to windowing: by other mouse operations, the user moves the cursor to a particular window (information in the input area modifying information in the output area), indicating which program he wants to interact with at this moment. The system will therefore choose that particular program to execute, sending input from the terminal to that program's input area, and sending the program's output to the part of the terminal's output area used for that particular window.

We have so far discussed multiple programs that are competitive: they compete to execute on the processor, for allocations of memory, and for the use of input/output facilities, including even a single terminal. We can also have cooperative multiple programs, which solve different parts of the same problem concurrently and put their results together; this is called multi-tasking. When such a program is run on a machine with multiple processors, we can get the final results earlier, which is why certain time-critical problems like weather forecasting and real-time control (sometimes known as embedded systems) must use parallel machines. The Deep Blue system that defeated Garry Kasparov in chess was one such parallel machine, since chess rules require the moves to be decided within certain time limits. A machine like this, with hundreds, sometimes thousands, of processors, is called an MPP, or Massively Parallel Processor.

In other words, multiple machines can also pretend to be a single machine, but this can take many forms. Some parallel machines have processors that read/write a common memory, while others have individual memories and send messages to each other in order to share information. Some systems have all the processors execute in step, some run independently but share a common program (though they can run different parts of it), and yet others can accommodate a different program per processor. It is even possible for a single machine to pretend to be multiple machines by time slicing, and then run a multi-tasking program that splits a single problem into concurrent parts. While you do not get results any faster (more likely slower, because of overheads), the program will have been written in a more modular fashion, in smaller and more manageable pieces.

In the cluster arrangement, a set of workstations together define a large virtual memory space, instead of each machine defining a separate space. A multi-tasking program is written with each component occupying part of the space, and two components may have overlapping parts in order to work together. When a component is sent to a particular machine to execute, its use of particular addresses causes the wanted information to be brought from the disk to that machine's memory, with multiple copies of shared data on machines that are sharing the information. The need to ensure that all sharers see the same information raises the so-called "memory consistency" problem. Within a cluster, work can migrate from machine to machine: the content of the memory moves to the machine that actually processes it. If a program crashes on one machine, it can be restarted on another machine with information saved from the previous run; work started on one machine can be stopped and moved to another machine that is less busy. In this way clustering aims to make multi-tasking easier to do as well as more reliable.
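To make the earlier idea of cooperative multi-tasking concrete, here is a minimal sketch, assuming POSIX threads (the notes do not name any particular facility): a single problem, summing a large array, is split into parts that are processed concurrently, and the partial results are then put together.

    /*
     * A minimal sketch (not from the notes) of cooperative multi-tasking,
     * assuming POSIX threads: the single problem "sum a large array" is
     * split into NTASKS parts that run concurrently, and the partial
     * results are then put together.
     */
    #include <pthread.h>
    #include <stdio.h>

    #define N      1000000
    #define NTASKS 4

    static double data[N];

    struct task {
        int    first, last;   /* the slice of the problem for this task */
        double partial_sum;   /* this task's contribution to the result */
    };

    /* Each cooperating task sums its own slice of the array. */
    static void *sum_part(void *arg)
    {
        struct task *t = arg;
        t->partial_sum = 0.0;
        for (int i = t->first; i < t->last; i++)
            t->partial_sum += data[i];
        return NULL;
    }

    int main(void)
    {
        for (int i = 0; i < N; i++)
            data[i] = 1.0;                    /* so the answer should be N */

        pthread_t   threads[NTASKS];
        struct task tasks[NTASKS];

        /* Split the problem and start one thread per part. */
        for (int k = 0; k < NTASKS; k++) {
            tasks[k].first = k * (N / NTASKS);
            tasks[k].last  = (k == NTASKS - 1) ? N : (k + 1) * (N / NTASKS);
            pthread_create(&threads[k], NULL, sum_part, &tasks[k]);
        }

        /* Put the results together once every task has finished. */
        double total = 0.0;
        for (int k = 0; k < NTASKS; k++) {
            pthread_join(threads[k], NULL);
            total += tasks[k].partial_sum;
        }
        printf("total = %.0f (expected %d)\n", total, N);
        return 0;
    }

Compiled with a command such as cc -pthread sum.c, the parts can genuinely run in parallel on a machine with several processors; on a single time-sliced processor the same program still works, only without the speed-up, as noted above.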