Parallelism

By design, a traditional processor executes only one instruction at a time, so the next instruction can begin only after the previous one has completed. This is inherently inefficient: at best, such a processor completes one instruction per clock cycle (scalar performance). In practice, the processor often stalls on instructions that take more than one clock cycle to complete, so most of the time performance is subscalar, meaning that on average fewer than one instruction is executed per CPU clock cycle.
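
The gap between scalar and subscalar performance can be shown with a little arithmetic. The sketch below is only an illustration: the function name and the cycle counts are hypothetical, and it assumes instructions run strictly one after another with no overlap.

```python
def instructions_per_cycle(cycle_costs):
    """Average instructions completed per clock cycle.

    cycle_costs lists how many cycles each instruction takes; instructions
    are assumed to run strictly one after another (no overlap).
    """
    total_cycles = sum(cycle_costs)
    return len(cycle_costs) / total_cycles


# Hypothetical instruction mixes (cycle counts are made up for illustration).
ideal = [1, 1, 1, 1]   # every instruction finishes in one cycle
mixed = [1, 3, 1, 5]   # some instructions take several cycles

ipc_ideal = instructions_per_cycle(ideal)   # 4 instructions / 4 cycles
ipc_mixed = instructions_per_cycle(mixed)   # 4 instructions / 10 cycles

print(f"ideal: {ipc_ideal:.2f} instructions per cycle (scalar)")
print(f"mixed: {ipc_mixed:.2f} instructions per cycle (subscalar)")
```

With the first mix the processor reaches exactly one instruction per cycle; with the second, the multi-cycle instructions pull the average below one.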


To reach scalar performance and go beyond it, processor designers have introduced a variety of design techniques, parallelism among them. Parallel computing (or parallelism) is the simultaneous execution of multiple instances of programmed instructions and data on multiple processors in order to obtain results faster; the aim is to make execution less linear and more parallel. The idea rests on the fact that solving a problem can usually be divided into smaller tasks, which can be carried out simultaneously with some coordination.
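
The divide, execute, and coordinate pattern described above can be sketched in software. The example below is a minimal illustration, not a hardware model: the helper names (`split_into_chunks`, `parallel_sum`) and the worker count are assumptions made for this sketch. It uses Python's standard `concurrent.futures.ThreadPoolExecutor`; note that in CPython the global interpreter lock limits the actual speedup for CPU-bound work, so the code shows the structure of the idea rather than a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, n_chunks):
    """Divide the problem: cut the list into at most n_chunks pieces."""
    size = (len(values) + n_chunks - 1) // n_chunks  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def parallel_sum(values, n_workers=4):
    """Sum a list by handing each chunk to a separate worker thread."""
    if not values:
        return 0
    chunks = split_into_chunks(values, n_workers)
    # Carry out the smaller tasks concurrently: one partial sum per chunk.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(sum, chunks))
    # Coordinate: combine the partial results into the final answer.
    return sum(partial_sums)


data = list(range(1, 101))
total = parallel_sum(data)
print(total == sum(data))
```

The final combining step is the "coordination" part: the smaller tasks are independent, but their results still have to be gathered into one answer.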


Two forms of parallelism in processors deserve particular mention: instruction-level parallelism and thread-level parallelism. Instruction-level parallelism (ILP) seeks to increase the number of operations in a computer program that can be performed at the same time (i.e. to increase the utilization of on-die execution resources), while thread-level parallelism (TLP) aims to increase the number of threads that a processor can execute simultaneously.
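
To make the idea behind ILP concrete, the sketch below works out which instructions of a tiny program could run in the same cycle. It is a simplified, hypothetical model rather than a description of any real processor: it assumes every instruction takes one cycle, that there are unlimited execution units, and that an instruction only has to wait for the instructions that produce its inputs. The function name and the three-instruction program are made up for illustration.

```python
def schedule_cycles(instructions):
    """Return the earliest cycle in which each instruction could execute.

    instructions is a list of (dest, sources) pairs in program order.
    Simplifying assumptions: one-cycle latency, unlimited execution units,
    and only read-after-write dependencies are modeled.
    """
    ready_at = {}  # register name -> cycle in which its value is produced
    cycles = []
    for dest, sources in instructions:
        # An instruction runs one cycle after its latest input is produced;
        # registers never written by the program are ready from the start.
        cycle = 1 + max((ready_at.get(src, 0) for src in sources), default=0)
        cycles.append(cycle)
        ready_at[dest] = cycle
    return cycles


program = [
    ("a", ["x", "y"]),  # a = x + y
    ("b", ["z", "w"]),  # b = z * w   (independent of a)
    ("c", ["a", "b"]),  # c = a - b   (needs both a and b)
]

cycles = schedule_cycles(program)
print(cycles)
```

Under these assumptions the first two instructions are independent and can share a cycle, while the third must wait for both, so the program needs two cycles instead of the three that strictly sequential execution would take.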


 
