CISC 879 : Advanced Parallel Programming Vaibhav Naidu Dept. of Computer & Information Sciences University of Delaware Importance of Single-core in Multicore.


1 CISC 879 : Advanced Parallel Programming Vaibhav Naidu Dept. of Computer & Information Sciences University of Delaware Importance of Single-core in Multicore Era Toshinori Sato, Hideki Mori, Rikiya Yano, Takanori Hayashida - Fukuoka University, Japan Published: Thirty-fifth Australasian Computer Science Conference

2 CISC 879 : Advanced Parallel Programming Outline Introduction Motivation Searching for best Multicore Single-core Performance improvement Results Conclusion

3 CISC 879 : Advanced Parallel Programming Introduction Pollack’s rule: Processor performance is proportional to the square root of the processor’s area. Amdahl’s law: The speedup from using multiple processors in parallel is limited by the time needed for the sequential fraction of the program.
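
The two rules on this slide can be written compactly as below. This is only a restatement for reference; the symbols A (core area in units of a baseline core), f (parallelizable fraction of the program) and N (number of processors) are notation assumed here, not taken from the slides.

```latex
\[
  \mathrm{perf}(A) \propto \sqrt{A}
  \qquad \text{(Pollack's rule)}
\]
\[
  S(N) = \frac{1}{(1 - f) + \dfrac{f}{N}},
  \qquad
  \lim_{N \to \infty} S(N) = \frac{1}{1 - f}
  \qquad \text{(Amdahl's law)}
\]
```

For example, under these definitions a core with 4 times the area gives roughly 2 times the performance, and a program with f = 0.9 cannot be sped up by more than 10 times no matter how many processors are added.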

4 CISC 879 : Advanced Parallel Programming Motivation Increasing number of transistors for increasing the number of cores on a chip might not be the best choice. What would be the best configuration of a multicore processor? How do we improve the performance of the single-core?

5 CISC 879 : Advanced Parallel Programming Searching for the best Multicore As the number of transistors on a chip increases, the flexibility in choosing a processor configuration also increases. With this flexibility, it is unclear which configuration is best, how many cores it should have, and so on.

6 CISC 879 : Advanced Parallel Programming Searching for the best Multicore Processor Topologies:

7 CISC 879 : Advanced Parallel Programming Searching for the best Multicore Processor Topologies: 1. Single-core: To improve performance in the future, one option is to increase the size of the core, so that all transistors on the chip are used by a single core. 2. Many-core: The core microarchitecture is fixed and multiple copies of the core are integrated on the chip.

8 CISC 879 : Advanced Parallel Programming Searching for the best Multicore 3. Heterogeneous Multicore: Only one core becomes large and the other cores remain small. 4. Scalable Multicore: A collection of small cores that can logically fuse together to compose a high-performance large core. 5. Dynamically Configurable: The processor cores can combine to form a larger core.

9 CISC 879 : Advanced Parallel Programming Single-core vs Many-core Single-core: As the core becomes larger, the performance gained per unit of area shows diminishing returns (Pollack’s rule). Many-core: If only a small fraction of the code is parallelizable, the achievable speedup is limited (Amdahl’s law).
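
The trade-off on this slide can be explored with a small back-of-the-envelope model. The sketch below is not the paper's evaluation: it assumes that a single core covering the whole chip follows Pollack's rule exactly, that a many-core chip holds one baseline core per unit of area, and that Amdahl's law applies with the serial part running on one baseline core. The function names are hypothetical.

```python
import math


def single_core_perf(area: float) -> float:
    """One core using the whole chip; Pollack's rule gives sqrt(area)."""
    return math.sqrt(area)


def many_core_perf(area: int, parallel_fraction: float) -> float:
    """`area` baseline cores (assumed one per unit area), Amdahl's law.

    The serial fraction runs on a single baseline core of performance 1.
    """
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / area)


if __name__ == "__main__":
    # Compare both organizations for a few parallel fractions and chip areas.
    for f in (0.5, 0.9, 0.99):
        for area in (4, 16, 64):
            print(
                f"f={f:<4} area={area:>2}  "
                f"single={single_core_perf(area):5.2f}  "
                f"many={many_core_perf(area, f):5.2f}"
            )
```

In this toy model the single large core wins when little of the program is parallel, and the many-core chip wins when almost all of it is.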

10 CISC 879 : Advanced Parallel Programming Single-core vs Many-core [Figure] X-axis: chip area as a multiple of the baseline processor’s area. Y-axis: performance improvement rate.

11 CISC 879 : Advanced Parallel Programming Single-core vs Heterogeneous Multicore Heterogeneous Multicore: Heterogeneous multicores are widely studied for improving energy efficiency. Parallelized portions are executed by multiple small cores, and hard-to-parallelize portions are executed by one big, strong core. Interestingly, the performance is equivalent regardless of the big core’s size.
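
The division of work described on this slide can be sketched as a simple analytical model. The assumptions are mine rather than the paper's: the big core's performance follows Pollack's rule, the remaining area is filled with baseline cores of performance 1, and the big core sits idle during parallel phases. The sketch does not attempt to reproduce the plotted curves, and the function name is hypothetical.

```python
import math


def heterogeneous_perf(area: int, big_core_area: int, parallel_fraction: float) -> float:
    """Speedup over one baseline core for one big core plus small cores.

    Assumed model: serial code runs on the big core at sqrt(big_core_area);
    parallel code runs only on the (area - big_core_area) small cores.
    """
    if not 0 < big_core_area < area:
        raise ValueError("big core must leave room for at least one small core")
    big_perf = math.sqrt(big_core_area)
    small_cores = area - big_core_area
    serial_time = (1.0 - parallel_fraction) / big_perf
    parallel_time = parallel_fraction / small_cores
    return 1.0 / (serial_time + parallel_time)


if __name__ == "__main__":
    # Chip area of 16 baseline cores, 90% parallel code, varying big-core size.
    for big in (1, 2, 4, 8):
        print(f"big core area={big}: speedup={heterogeneous_perf(16, big, 0.9):.2f}")
```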

12 CISC 879 : Advanced Parallel Programming Single-core vs Heterogeneous Multicore [Figure] X-axis: chip area as a multiple of the baseline processor’s area. Y-axis: performance improvement rate.

13 CISC 879 : Advanced Parallel Programming Heterogeneous Multicore vs Scalable Homogeneous Scalable Homogeneous: These processors have a smaller number of larger cores. Using 3 large cores is sometimes preferable to using 6 small cores.
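
The "3 large cores versus 6 small cores" remark can be illustrated with a rough calculation. The sketch below assumes a fixed chip area of 6 baseline cores, Pollack's rule for each core's performance, and Amdahl's law for the whole program; these modelling choices and the function name are assumptions, not the paper's method.

```python
import math


def homogeneous_perf(area: float, core_area: float, parallel_fraction: float) -> float:
    """Speedup over one baseline core for area/core_area identical cores.

    Each core is assumed to deliver sqrt(core_area) (Pollack's rule); the
    serial part uses one core and the parallel part uses all of them.
    """
    cores = area / core_area
    core_perf = math.sqrt(core_area)
    serial_time = (1.0 - parallel_fraction) / core_perf
    parallel_time = parallel_fraction / (core_perf * cores)
    return 1.0 / (serial_time + parallel_time)


if __name__ == "__main__":
    for f in (0.5, 0.9, 0.99):
        six_small = homogeneous_perf(6, 1, f)
        three_large = homogeneous_perf(6, 2, f)
        print(f"f={f}: 6 small={six_small:.2f}  3 large={three_large:.2f}")
```

Under these assumptions the three larger cores come out ahead when half the program is serial, while the six small cores come out ahead when 99% of it is parallel, which is consistent with the slide's "sometimes".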

14 CISC 879 : Advanced Parallel Programming Heterogeneous Multicore vs Scalable Homogeneous [Figure] X-axis: chip area as a multiple of the baseline processor’s area. Y-axis: performance improvement rate.

15 CISC 879 : Advanced Parallel Programming Heterogeneous vs Dynamically Configurable Dynamically Configurable: These processors dynamically configure their cores and the size of each core.

16 CISC 879 : Advanced Parallel Programming Heterogeneous vs Dynamically Configurable [Figure] X-axis: chip area as a multiple of the baseline processor’s area. Y-axis: performance improvement rate.

17 CISC 879 : Advanced Parallel Programming Heterogeneous vs Dynamically Configurable Dynamic reconfiguration suffers a penalty of approximately 25% (0.8 DC-n and 0.8 DC-8). As the number of cores increases, it becomes difficult to combine all cores because of the increasing complexity of the interconnects. The red dashed line represents current technology. 0.8 DC-8 is the most practical dynamically configurable processor, and its performance is not as good as that of the heterogeneous multicore.
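
The effect of the reconfiguration penalty and of a limit on how many cores can be combined can be sketched as follows. This is an illustrative model under several assumptions of mine: the label 0.8 is read as "a fused core delivers 0.8 of the performance Pollack's rule would give a monolithic core of the same area", DC-n is read as "all n cores can fuse", and DC-8 as "at most 8 cores can fuse". The function name and parameters are hypothetical.

```python
import math


def dynamically_configurable_perf(
    cores: int,
    parallel_fraction: float,
    max_fused: int,
    penalty: float = 0.8,
) -> float:
    """Speedup over one baseline core for a chip of `cores` fusable cores.

    Serial phases run on a core fused from min(cores, max_fused) baseline
    cores, scaled by `penalty`; fusion is skipped if it would not help.
    Parallel phases run on all cores separately.
    """
    fused = min(cores, max_fused)
    serial_perf = max(1.0, penalty * math.sqrt(fused))
    serial_time = (1.0 - parallel_fraction) / serial_perf
    parallel_time = parallel_fraction / cores
    return 1.0 / (serial_time + parallel_time)


if __name__ == "__main__":
    f = 0.9
    for n in (8, 16, 64):
        dc_n = dynamically_configurable_perf(n, f, max_fused=n)
        dc_8 = dynamically_configurable_perf(n, f, max_fused=8)
        print(f"cores={n:>2}: DC-n={dc_n:.2f}  DC-8={dc_8:.2f}")
```

With the fusion limit fixed at 8, the serial phases stop getting faster as the chip grows, so the gap between the two variants widens with core count in this model.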

18 CISC 879 : Advanced Parallel Programming Single-Core performance improvement Increasing the clock frequency has been the easiest way to improve performance, but it requires a higher supply voltage, which causes serious power and temperature problems. What is needed is a technique that increases the clock frequency without increasing the supply voltage.

19 CISC 879 : Advanced Parallel Programming Cool Turbo Boost Intel’s Turbo Boost Technology increases the supply voltage and thus the clock frequency. Cool Turbo Boost does not require an increase in supply voltage. When hardware size and complexity are reduced, there is an opportunity to increase the clock frequency (e.g., Intel Atom).

20 CISC 879 : Advanced Parallel Programming Cool Turbo Boost Datapath: A collection of functional units (such as arithmetic logic units or multipliers) that perform data processing operations, together with registers and buses. When the datapath becomes small, its computing performance is degraded. If the performance loss is not compensated by the clock frequency boost, overall processor performance drops.

21 CISC 879 : Advanced Parallel Programming Cool Turbo Boost Instruction-level parallelism (ILP): The number of operations in a computer program that can be performed simultaneously. When ILP is small, a small datapath is enough; otherwise, the datapath should not be reduced. Hence, the datapath is dynamically configured according to the ILP in each program phase.

22 CISC 879 : Advanced Parallel Programming Cool Turbo Boost Multiple Clustered-Core Processor (MCCP): Configures its datapath according to the ILP and thread-level parallelism (TLP) in the program. The authors configure MCCP so that its clock frequency is increased when its datapath is configured to be small.
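
The idea of narrowing the datapath only when the clock boost makes up for it can be sketched with a toy throughput model. This is not MCCP's actual policy: the model assumes throughput is the smaller of the available ILP and the datapath width, multiplied by the clock, and the widths (4 and 2) and all names are hypothetical values chosen for illustration.

```python
def throughput(ilp: float, width: int, clock: float) -> float:
    """Toy model: instructions per unit time = min(ILP, datapath width) * clock."""
    return min(ilp, width) * clock


def choose_datapath(
    ilp: float,
    boost: float,
    wide_width: int = 4,
    narrow_width: int = 2,
) -> str:
    """Pick the configuration with the higher modelled throughput for a phase.

    The wide datapath runs at the base clock (1.0); the narrow datapath runs
    at `boost` times the base clock. Ties go to the wide datapath.
    """
    wide = throughput(ilp, wide_width, 1.0)
    narrow = throughput(ilp, narrow_width, boost)
    return "narrow" if narrow > wide else "wide"


if __name__ == "__main__":
    # One entry per program phase: the ILP assumed to be available in it.
    for phase_ilp in (1.0, 1.5, 2.5, 3.5):
        print(f"ILP={phase_ilp}: {choose_datapath(phase_ilp, boost=1.4)}")
```

In this model a low-ILP phase always benefits from the boosted narrow datapath, while a high-ILP phase keeps the wide datapath because the boost cannot recover the lost issue width.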

23 CISC 879 : Advanced Parallel Programming Results Six programs from SPECint2000 are used, each executed for 2 billion instructions. [Figures: Narrow Datapath Results; Cool Turbo Boosting Results] X-axis: boosting ratio. Y-axis: normalized single-core performance.

24 CISC 879 : Advanced Parallel Programming Results The average performance loss of the narrow datapath is 36.1%, while that of Cool Turbo Boost is only 4.2%. When the boosting ratio reaches 1.4 and 1.6, performance improves by 5.0% and 8.7% on average, respectively. For the parser group (gzip, vpr and parser), Cool Turbo Boost does not perform well, whereas for the vortex group (gcc and vortex), performance is better regardless of the boosting ratio.

25 CISC 879 : Advanced Parallel Programming Conclusion The paper investigates the best multicore configuration for the near future; the winner is the heterogeneous multicore. It shows that single-core performance is the key to improving the performance of heterogeneous multicores in the near future. The average performance improvement from Cool Turbo Boost is only 5%, so further study is needed in this area.

26 CISC 879 : Advanced Parallel Programming Questions?

27 CISC 879 : Advanced Parallel Programming Thank you

