|
Parallelism
A traditional scalar processor executes only one instruction
at a time, so the next instruction cannot begin until the
previous one has completed. This design is inherently
inefficient: at best, the processor completes one instruction
per clock cycle (scalar performance). In practice, the processor
often stalls on instructions that take more than one clock
cycle to complete, so performance is usually subscalar (fewer
than one instruction executed per CPU clock cycle).
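
The gap between scalar and subscalar performance can be made
concrete with a little arithmetic. The sketch below assumes a
processor that finishes each instruction before starting the
next; the latency values are hypothetical and chosen only for
illustration, not taken from any real processor.

```python
# Hypothetical per-instruction latencies in clock cycles
# (illustrative values, not measurements of real hardware).
latencies = [1, 1, 3, 1, 4, 1, 2, 1]

def ipc_sequential(latencies):
    """Instructions per cycle when instructions run strictly one after another."""
    total_cycles = sum(latencies)
    return len(latencies) / total_cycles

# 8 instructions take 14 cycles, so fewer than one completes per cycle.
print(round(ipc_sequential(latencies), 2))  # 0.57
```

Only when every instruction takes exactly one cycle does this
model reach an IPC of 1.0, the scalar limit.
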
To reach scalar performance and go beyond it, a variety of
design techniques, including parallelism, have been introduced
into processors. Parallel computing (or parallelism) is the
simultaneous execution of some combination of multiple
instances of programmed instructions and data on multiple
processors in order to obtain results faster. The idea rests
on the fact that solving a problem can usually be divided
into smaller tasks, which may be carried out simultaneously
with some coordination.
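
The divide-and-coordinate idea can be sketched in software.
The example below is a minimal illustration, assuming a simple
summation problem; the helper names `split` and `parallel_sum`
are hypothetical. It uses Python's standard thread pool to show
the structure of the decomposition; whether the tasks actually
overlap in time, and whether that yields a speedup, depends on
the interpreter and the hardware.

```python
from concurrent.futures import ThreadPoolExecutor

def split(data, parts):
    """Divide data into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks

def parallel_sum(data, workers=4):
    """Sum each chunk as an independent task, then combine the partial sums."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))  # independent smaller tasks
    return sum(partials)  # coordination step: merge the partial results

print(parallel_sum(list(range(1, 101))))  # 5050
```

The final `sum(partials)` is the coordination the text refers
to: the smaller tasks are independent, but their results must
still be merged.
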
Two forms of parallelism matter most in processor design:
instruction-level parallelism and thread-level parallelism.
Instruction-level parallelism (ILP) seeks to increase the
number of operations in a program that can be performed at
the same time (i.e. to increase the utilization of on-die
execution resources), while thread-level parallelism (TLP)
aims to increase the number of threads that a processor can
execute simultaneously.
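
How much ILP a program exposes depends on the dependencies
between its instructions. The toy model below is a sketch
under strong assumptions: every instruction takes one cycle,
the processor can issue up to `width` ready instructions per
cycle, and an instruction is ready once all the earlier
instructions it depends on have executed. The function name
`schedule_cycles` and the dependency list are hypothetical.

```python
def schedule_cycles(deps, width):
    """Cycles needed to execute unit-latency instructions in a toy model.

    deps[i] lists the indices of earlier instructions that instruction i
    must wait for; at most `width` ready instructions issue per cycle.
    """
    done = set()                      # instructions that have executed
    remaining = list(range(len(deps)))
    cycles = 0
    while remaining:
        cycles += 1
        # Ready: every producer finished in an earlier cycle.
        ready = [i for i in remaining if all(d in done for d in deps[i])]
        issued = ready[:width]
        done.update(issued)
        remaining = [i for i in remaining if i not in issued]
    return cycles

# Hypothetical six-instruction program: 2 needs 0 and 1, 4 needs 3,
# and 5 needs 2 and 4.
deps = [[], [], [0, 1], [], [3], [2, 4]]

for width in (1, 2, 3):
    cycles = schedule_cycles(deps, width)
    print(width, cycles, len(deps) / cycles)
```

Under these assumptions the program takes 6 cycles at width 1,
4 cycles at width 2, and 3 cycles at width 3. Widening the
processor further would not help here, because the longest
dependency chain is three instructions long.
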
|