Computation and data migration in an embedded many-core SoC
January
Matthieu BRIEDA, Anca MOLNOS, Julien MOTTIN
Background
Figure: Simulated heat map of the Sthorm platform before (left) and after (right) activity migration.
Source: Becher, M., Bensalem, S., & Pacull, F. (2014, February). Icy-Core Framework for Simulating Thermal Effects of Task Migration Algorithms on Multi- and Many-Core Architectures. In ICONS 2014, The Ninth International Conference on Systems (pp ).
Work context and objective
Implement computation and data migration to enable thermal mitigation.
Figure: a HOST with Global Memory and a many-core accelerator made of four clusters (Cluster 0 to Cluster 3), each containing processing elements (PE) and Local Memory. Legend: processing element, memory, NoC router.
Problems
1. Task migration (between iterations)
Remote data access: performance loss
Figure: many-core accelerator with Cluster 0 to Cluster 3. Before migration, task T accesses its data locally; after migration to another cluster, the data stays behind and every access becomes a remote data access. Legend: task (T), data.
Problems
2. Data migration
Pointer invalidation: application error
Task T code:
    int *p = malloc(4);
    *p = 3;
    …
    int a = *p;
Figure: the address space is divided among Cluster 0 to Cluster 3 (address mapping). Migrating the data to another cluster changes its address, so the pointer held by task T no longer refers to it.
Solution overview
Host:
- Application
- Decision policy (e.g., temperature mitigation, …)
Many-core accelerator:
- Framework
- Allocators, communication API, HAL
- OS
Interfaces:
- Application building interface
- Task and data mapping interface
Contributions:
1. Application management
2. Migration protocol
3. Memory translation mechanism
Legend: goal, interface, blocks.
Application building interface
Task:
- Task ID
- Init(): executed once
- Fire(): executed iteratively
- End(): executed once
- List of data IDs
Data:
- Data ID
- Status: shared/private
- Size
Application:
- Application control task: start/stop app
- Inter-task shared data
- Inter-iteration shared data
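The task and data descriptions listed above could be written down in C roughly as follows. This is a minimal sketch, not the framework's actual interface: every type and function name here (data_desc_t, task_desc_t, app_desc_t, task_run, app_bytes_with_status, the demo_* callbacks) is hypothetical, and only the fields themselves come from the slide.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; the slide only lists the fields. */
typedef enum { DATA_PRIVATE, DATA_SHARED } data_status_t;

typedef struct {
    int           data_id;  /* Data ID */
    data_status_t status;   /* shared or private */
    size_t        size;     /* size in bytes */
} data_desc_t;

typedef struct {
    int        task_id;         /* Task ID */
    void     (*init)(void);     /* executed once */
    void     (*fire)(void);     /* executed at every iteration */
    void     (*end)(void);      /* executed once */
    const int *data_ids;        /* list of data IDs used by the task */
    size_t     n_data;
} task_desc_t;

typedef struct {
    const task_desc_t *tasks;
    size_t             n_tasks;
    const data_desc_t *data;
    size_t             n_data;
} app_desc_t;

/* Total bytes of all data with the given status, e.g. to size an allocation. */
size_t app_bytes_with_status(const app_desc_t *app, data_status_t status)
{
    size_t total = 0;
    for (size_t i = 0; i < app->n_data; i++)
        if (app->data[i].status == status)
            total += app->data[i].size;
    return total;
}

/* Task life cycle from the slide: init once, fire iteratively, end once. */
void task_run(const task_desc_t *t, unsigned iterations)
{
    assert(t && t->init && t->fire && t->end);
    t->init();
    for (unsigned i = 0; i < iterations; i++)
        t->fire();
    t->end();
}

/* Example callbacks that only count how often they are called. */
int init_calls, fire_calls, end_calls;
void demo_init(void) { init_calls++; }
void demo_fire(void) { fire_calls++; }
void demo_end(void)  { end_calls++; }
```

A caller would fill a task_desc_t with the three callbacks and the data IDs, then hand it to task_run with the desired number of iterations. Keeping the description separate from any address is what later allows the framework to place or move the data without touching user code.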
Run-time app initialization
1. Application management
Inputs:
- App description (application control task)
- PE and memory attribution (decision policy)
Initialization steps performed by the framework:
- Shared memory allocation
- Private memory allocation
- Task creation and start
Result: unmapped tasks and data become mapped tasks and data, recorded in local tables.
Legend: data flow, control flow.
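Migration protocol
2. Migration protocol
Steps:
1. Allocate new memory
2. Pause task
3. Data copy
4. Free old memory
5. Resume task
Figure: timeline of a migration from Cluster 1 to Cluster 3, showing the source PE, destination PE, source controller and destination controller. The task runs fire() on the source PE, is paused during the migration, then resumes with fire() on the destination PE. Legend: framework function, user function, trigger.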
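The five steps of the protocol can be sketched in plain C as below. This is an illustration under loud assumptions: malloc and free stand in for the per-cluster allocators, a cluster is just an integer tag, and the names data_entry_t, task_t and migrate_task are hypothetical. The real framework splits this work between a source and a destination controller, which the sketch collapses into one function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bookkeeping for one piece of task data. */
typedef struct {
    void  *addr;     /* current location (assumed non-NULL) */
    size_t size;     /* size in bytes */
    int    cluster;  /* cluster whose memory holds the data */
} data_entry_t;

typedef struct {
    bool          paused;
    int           cluster;  /* cluster whose PE runs the task */
    data_entry_t *data;
    size_t        n_data;
} task_t;

/* Returns 0 on success, -1 if allocation fails (the task is left untouched). */
int migrate_task(task_t *t, int dst_cluster)
{
    assert(t != NULL);
    if (t->n_data == 0) {            /* nothing to copy, only move the task */
        t->cluster = dst_cluster;
        return 0;
    }

    /* 1. Allocate new memory; the task is not paused yet. */
    void **fresh = malloc(t->n_data * sizeof *fresh);
    if (fresh == NULL)
        return -1;
    for (size_t i = 0; i < t->n_data; i++) {
        fresh[i] = malloc(t->data[i].size ? t->data[i].size : 1);
        if (fresh[i] == NULL) {      /* roll back earlier allocations */
            while (i > 0)
                free(fresh[--i]);
            free(fresh);
            return -1;
        }
    }

    /* 2. Pause task (on the slide, between two fire() calls). */
    t->paused = true;

    /* 3. Data copy. */
    for (size_t i = 0; i < t->n_data; i++)
        memcpy(fresh[i], t->data[i].addr, t->data[i].size);

    /* 4. Free old memory and record the new location. */
    for (size_t i = 0; i < t->n_data; i++) {
        free(t->data[i].addr);
        t->data[i].addr = fresh[i];
        t->data[i].cluster = dst_cluster;
    }
    free(fresh);

    /* 5. Resume task on the destination. */
    t->cluster = dst_cluster;
    t->paused = false;
    return 0;
}
```

Allocating before pausing follows the step order on the slide and keeps the paused window limited to steps 2 to 5; a failed allocation then costs the task nothing.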
Translation mechanism
3. Memory translation mechanism
Local table: (Data ID, Task ID) => address, queried through task_get_addess.
User code:
    void fire(void) {
        /* Resolve the address again at every iteration. */
        int *pointer = task_get_addess(dataID);
        if (iteration == 1) *pointer = 3;
        if (iteration == 2) { int a = *pointer; }
    }
Framework:
- Provides address virtualization in software
- Updates the translation at data migration
User:
- Accesses data based on IDs
- Never allocates memory directly
=> Solves the pointer invalidation problem
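A local translation table of this kind might look like the following sketch. The names xlat_table_t, xlat_set and xlat_get are hypothetical; xlat_get plays the role that task_get_addess plays on the slide, with the task ID passed explicitly, and the fixed-size linear table is an assumption made for brevity.

```c
#include <assert.h>
#include <stddef.h>

#define XLAT_MAX 8  /* arbitrary capacity chosen for the sketch */

/* One entry maps (task ID, data ID) to the current address of the data. */
typedef struct { int task_id; int data_id; void *addr; } xlat_entry_t;
typedef struct { xlat_entry_t entry[XLAT_MAX]; size_t count; } xlat_table_t;

/* Insert a mapping, or update it if it exists (done at data migration).
 * Returns 0 on success, -1 if the table is full. */
int xlat_set(xlat_table_t *tab, int task_id, int data_id, void *addr)
{
    assert(tab != NULL);
    for (size_t i = 0; i < tab->count; i++) {
        if (tab->entry[i].task_id == task_id && tab->entry[i].data_id == data_id) {
            tab->entry[i].addr = addr;
            return 0;
        }
    }
    if (tab->count == XLAT_MAX)
        return -1;
    tab->entry[tab->count++] = (xlat_entry_t){ task_id, data_id, addr };
    return 0;
}

/* Look up the current address; NULL if the pair is unknown. */
void *xlat_get(const xlat_table_t *tab, int task_id, int data_id)
{
    assert(tab != NULL);
    for (size_t i = 0; i < tab->count; i++)
        if (tab->entry[i].task_id == task_id && tab->entry[i].data_id == data_id)
            return tab->entry[i].addr;
    return NULL;
}

/* Same logic as fire() on the slide: the pointer is never kept across
 * iterations, so a migration between two calls is invisible to the task.
 * Returns the value read at iteration 2, 0 otherwise, -1 on lookup failure. */
int fire(const xlat_table_t *tab, int task_id, int data_id, int iteration)
{
    int *pointer = xlat_get(tab, task_id, data_id);
    if (pointer == NULL)
        return -1;
    if (iteration == 1)
        *pointer = 3;
    if (iteration == 2)
        return *pointer;
    return 0;
}
```

To try it, register a buffer with xlat_set, call fire for iteration 1, copy the value to a second buffer and call xlat_set again with the new address, then call fire for iteration 2: the read goes to the new location even though the user code never changed.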
Experimental Setup
Experimental Results
Step                     Duration (cycles)   Dependency
1. Allocate new memory   …                   # of data
2. Pause task            3166                Constant
3. Data copy             2996                Data size
4. Free old memory       9968                # of data
5. Resume task           1536                # of data
Sum                      36327
Figure: migration timeline (source PE, destination PE, source controller, destination controller) marking two intervals: the total migration duration and the shorter duration during which the task is frozen (paused until resumed). Legend: framework function, user function, trigger.
Conclusion
Demonstration of a proof-of-concept task and data migration on a many-core SoC, aimed at enabling thermal mitigation at a reasonable cost.
Questions