Published by Ann Owens. Modified over 9 years ago.
CoNA: Dynamic Application Mapping for Congestion Reduction in Many-Core Systems
  2012 IEEE 30th International Conference on Computer Design (ICCD)
  M. Fattah, M. Ramirez, M. Daneshtalab, P. Liljeberg, J. Plosila
Outline
  Introduction
  Mapping Problem and Evaluation Metrics
  Contiguous Neighborhood Allocation Mapping
  Experimental Setup
  Results and Analysis
  Conclusion
Introduction
  An efficient algorithm for the run-time application mapping problem
  Three novel contributions
    First node selection
    First task selection
    Mapping the remaining tasks onto the nearest neighborhood
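Mapping Problem and Evaluation Metrics
  Applications
    $A_p = TG(T, E)$, with $t_i \in T$ and $e_{i,j} \in E$
  Communication platform
    $AG(\tilde{N}, L)$
    $\tilde{n}_{i,j} = \{(r_{i,j}, pe_{i,j}) \mid \tilde{n}_{i,j} \in \tilde{N},\ 0 \le i < M,\ 0 \le j < N\}$
  Manhattan distance
    $MD(\tilde{n}_{i,j}, \tilde{n}_{m,n}) = |i - m| + |j - n|$
  Mapping function
    $map: T \to \tilde{N}$, s.t. $map(t_i) = \tilde{n}_{m,n}$; $\forall t_i \in T,\ \exists \tilde{n}_{m,n} \in \tilde{N}$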
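As a minimal sketch of the definitions on this slide, the Manhattan distance and a validity check for a mapping can be written as below. The names `manhattan_distance` and `is_valid_mapping` are hypothetical, nodes are represented by their `(i, j)` mesh coordinates, and the rule that no two tasks share a node is an assumption that the slide does not state.

```python
from typing import Dict, Iterable, Tuple

Node = Tuple[int, int]  # (i, j) coordinates of a node in an M x N mesh


def manhattan_distance(a: Node, b: Node) -> int:
    """MD between two nodes: |i - m| + |j - n|."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def is_valid_mapping(mapping: Dict[str, Node], tasks: Iterable[str],
                     rows: int, cols: int) -> bool:
    """Check that every task is mapped to a node inside the mesh.

    Assumption (not from the slides): each node hosts at most one task.
    """
    if set(mapping) != set(tasks):
        return False
    nodes = list(mapping.values())
    in_range = all(0 <= i < rows and 0 <= j < cols for i, j in nodes)
    one_task_per_node = len(set(nodes)) == len(nodes)
    return in_range and one_task_per_node
```

For example, `manhattan_distance((0, 0), (2, 3))` evaluates to 5.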
Evaluation Metrics
  Packet latency
  Average Manhattan Distance
  Average Weighted Manhattan Distance
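The formulas for these metrics did not survive in the slide text, so the following is only a sketch under assumed definitions: the average Manhattan distance is taken as the mean distance over an application's edges, and the weighted variant weights each edge's distance by its communication volume. The function names and the `(source, destination, weight)` edge format are hypothetical.

```python
from typing import Dict, List, Tuple

Node = Tuple[int, int]
Edge = Tuple[str, str, int]  # (source task, destination task, weight)


def md(a: Node, b: Node) -> int:
    """Manhattan distance between two mesh nodes."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def average_md(edges: List[Edge], mapping: Dict[str, Node]) -> float:
    """Assumed definition: mean Manhattan distance over all edges."""
    dists = [md(mapping[s], mapping[d]) for s, d, _ in edges]
    return sum(dists) / len(dists)


def average_weighted_md(edges: List[Edge], mapping: Dict[str, Node]) -> float:
    """Assumed definition: sum(weight * distance) / sum(weight)."""
    total_weight = sum(w for _, _, w in edges)
    weighted = sum(w * md(mapping[s], mapping[d]) for s, d, w in edges)
    return weighted / total_weight
```

With edges `[("t0", "t1", 4), ("t1", "t2", 12)]` mapped to `(0, 0)`, `(0, 1)` and `(2, 1)`, the edge distances are 1 and 2, so the average is 1.5 and the weighted average is 28 / 16 = 1.75.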
Evaluation Metrics (cont.)
  Mapped Region Dispersion
  Internal Congestion Ratio (ICR)
    The number of an application's edges that use the same channel, relative to its total number of edges
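The slide gives no formula for either metric, so this sketch rests on stated assumptions. Mapped region dispersion is taken as the average pairwise Manhattan distance between the nodes allocated to one application. ICR is taken as the fraction of edges whose route shares at least one directed channel with another edge of the same application. Dimension-ordered routing (column index first, then row index) is also assumed, since the slides do not name the routing algorithm. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple

Node = Tuple[int, int]
Channel = Tuple[Node, Node]  # directed link between adjacent nodes


def md(a: Node, b: Node) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def mapped_region_dispersion(nodes: List[Node]) -> float:
    """Assumed definition: mean pairwise distance of allocated nodes."""
    pairs = list(combinations(nodes, 2))
    return sum(md(a, b) for a, b in pairs) / len(pairs)


def route(src: Node, dst: Node) -> List[Channel]:
    """Assumed dimension-ordered route: fix j first, then i."""
    channels = []
    i, j = src
    while j != dst[1]:
        nj = j + (1 if dst[1] > j else -1)
        channels.append(((i, j), (i, nj)))
        j = nj
    while i != dst[0]:
        ni = i + (1 if dst[0] > i else -1)
        channels.append(((i, j), (ni, j)))
        i = ni
    return channels


def internal_congestion_ratio(edges: List[Tuple[str, str]],
                              mapping: Dict[str, Node]) -> float:
    """Fraction of edges that share a channel with another edge."""
    routes = [set(route(mapping[s], mapping[d])) for s, d in edges]
    usage = Counter(ch for r in routes for ch in r)
    shared = sum(1 for r in routes if any(usage[ch] > 1 for ch in r))
    return shared / len(edges)
```

Under these assumptions, two edges that leave the same node in the same direction count as congested, while edges travelling in opposite directions over the same link do not, because channels are treated as directed.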
Contiguous Neighborhood Allocation Mapping (CoNA)
  Three steps
    First node selection
    Choosing the first task of the application
    Contiguous neighborhood allocation
CoNA (cont.)
  First node selection
    Among the nodes with the largest number of available neighbors, select the one nearest to the central manager
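The selection rule on this slide can be sketched as follows. This is an illustration rather than the authors' implementation: it assumes that "neighbors" means the four directly connected mesh nodes, that distance to the central manager is the Manhattan distance, and that remaining ties are broken by coordinate order. The function names are hypothetical.

```python
from typing import Set, Tuple

Node = Tuple[int, int]


def md(a: Node, b: Node) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def free_neighbor_count(node: Node, free: Set[Node]) -> int:
    """Count the free nodes among the four mesh neighbors (assumption)."""
    i, j = node
    neighbors = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return sum(1 for n in neighbors if n in free)


def select_first_node(free: Set[Node], manager: Node) -> Node:
    """Pick the free node with the most free neighbors; among those,
    take the one nearest to the central manager."""
    best = max(free_neighbor_count(n, free) for n in free)
    candidates = [n for n in free if free_neighbor_count(n, free) == best]
    # Tie-break by coordinates so the result is deterministic (assumption).
    return min(candidates, key=lambda n: (md(n, manager), n))
```

On an empty 4 x 4 mesh with the manager at `(0, 0)`, four nodes have four free neighbors and `(1, 1)` is the nearest of them. Once `(1, 1)` is occupied, only `(2, 2)` still has four free neighbors, so it is selected next.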
CoNA (cont.)
  Choosing the first task of the application
    Select the task with the largest number of edges
    This is the task with the most intensive communication
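A minimal sketch of this step, assuming that both incoming and outgoing edges count toward a task's edge total and that ties go to the lexicographically smallest task name (neither detail is given on the slide). The function name and the edge list are illustrative.

```python
from collections import Counter
from typing import List, Tuple


def select_first_task(edges: List[Tuple[str, str]]) -> str:
    """Return the task that appears in the largest number of edges."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    # Highest degree first; ties broken by task name (assumption).
    return min(degree, key=lambda t: (-degree[t], t))


# Illustrative six-task graph in which t4 has three edges.
example_edges = [("t4", "t1"), ("t4", "t2"), ("t4", "t5"),
                 ("t1", "t0"), ("t2", "t3")]
first_task = select_first_task(example_edges)  # "t4"
```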
CoNA (cont.)
  Contiguous neighborhood allocation
    The task graph is traversed in breadth-first order, and each task is paired with its predecessor. The resulting list of (task, predecessor) pairs is: $\{(t_1, t_4), (t_2, t_4), (t_5, t_4), (t_0, t_1), (t_3, t_2)\}$
    Select the candidate that fits in the smallest square together with the first node
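The traversal and allocation can be sketched as below. The breadth-first pairing follows the slide directly. The node choice is an interpretation: for each (task, predecessor) pair the sketch takes the free node closest to the predecessor's node and, among equally close nodes, the one lying in the smallest square centered on the first node. The adjacency lists, function names, and tie-breaking order are assumptions chosen so that the traversal reproduces the pair list on the slide.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Node = Tuple[int, int]


def md(a: Node, b: Node) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def square_radius(node: Node, center: Node) -> int:
    """Half-width of the smallest square centered on `center` holding `node`."""
    return max(abs(node[0] - center[0]), abs(node[1] - center[1]))


def bfs_pairs(adjacency: Dict[str, List[str]],
              first_task: str) -> List[Tuple[str, str]]:
    """Breadth-first traversal yielding (task, predecessor) pairs."""
    visited = {first_task}
    queue = deque([first_task])
    pairs = []
    while queue:
        pred = queue.popleft()
        for task in adjacency[pred]:
            if task not in visited:
                visited.add(task)
                pairs.append((task, pred))
                queue.append(task)
    return pairs


def allocate(pairs: List[Tuple[str, str]], first_task: str,
             first_node: Node, free: Set[Node]) -> Dict[str, Node]:
    """Map each task near its predecessor (assumes enough free nodes)."""
    free = set(free) - {first_node}
    mapping = {first_task: first_node}
    for task, pred in pairs:
        p = mapping[pred]
        node = min(free, key=lambda n: (md(n, p),
                                        square_radius(n, first_node), n))
        mapping[task] = node
        free.remove(node)
    return mapping


adjacency = {"t4": ["t1", "t2", "t5"], "t1": ["t4", "t0"],
             "t2": ["t4", "t3"], "t5": ["t4"], "t0": ["t1"], "t3": ["t2"]}
pairs = bfs_pairs(adjacency, "t4")
mesh = {(i, j) for i in range(3) for j in range(3)}
mapping = allocate(pairs, "t4", (1, 1), mesh)
```

On this empty 3 x 3 mesh every task ends up one hop from its predecessor. On a partially occupied mesh the same rule falls back to the nearest remaining free node.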
Experimental Setup
  NoC platform
  Plasma processor
  Local memory
  DMA controller
  Tra-NI interface
  Central manager (CM)
  The maximum number of applications that can be injected into the system per second is denoted $\lambda_{full}$
Experimental Setup (cont.)
  Simulation
    To extract packet latency
  FPGA
    To investigate CoNA time complexity
    Xilinx ML605
Experimental Setup (cont.)
  Application set
    Task graphs are randomly generated (set1) using the task graph generator
    Number of nodes: 4 to 11
    Weight of edges: 4 to 16 flits
    The edge weights of all applications are multiplied by 16 (set16)
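To illustrate how set16 relates to set1, the sketch below scales every edge weight of a graph by 16. The `random_task_graph` helper is only a hypothetical stand-in that respects the ranges on the slide (4 to 11 tasks, 4 to 16 flits per edge). It is not the task graph generator used in the experiments, and its tree-shaped topology is an arbitrary choice.

```python
import random
from typing import List, Tuple

Edge = Tuple[str, str, int]  # (source task, destination task, weight in flits)


def random_task_graph(rng: random.Random) -> Tuple[int, List[Edge]]:
    """Stand-in generator: a random tree within the slide's ranges."""
    n_tasks = rng.randint(4, 11)          # inclusive bounds
    edges = []
    for k in range(1, n_tasks):
        pred = rng.randrange(k)           # connect to an earlier task
        edges.append((f"t{pred}", f"t{k}", rng.randint(4, 16)))
    return n_tasks, edges


def scale_weights(edges: List[Edge], factor: int = 16) -> List[Edge]:
    """Derive a set16-style graph by multiplying every edge weight."""
    return [(s, d, w * factor) for s, d, w in edges]


rng = random.Random(0)                    # fixed seed for repeatability
n_tasks, set1_edges = random_task_graph(rng)
set16_edges = scale_weights(set1_edges)
```

Scaling leaves the topology unchanged, so the two sets differ only in communication volume.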
Results and Analysis
  Packet latency evaluation
  Time complexity evaluation
Packet latency evaluation
  [figures only]
Time complexity evaluation
  [figures only]
Conclusion
  An efficient run-time task allocation algorithm is proposed
  It reduces internal and external congestion
  Three novel contributions
Thank you!