Many-SC Programming Model
Jaejin Lee
Center for Manycore Programming
Seoul National University
Schedule for the first year (13/11/11 ~ 14/11/10)
  1. Developing a low-level communication library
  2. Developing a threading library
  3. Developing a software SVM
  4. Verification and performance analysis of the low-level communication library and the software SVM on Chundoong
  Milestones: interim report, final report
The Structure of the Many-SC
  [Figure: the Many-SC is made up of clusters, each containing cores]
The Structure of Typical Cluster Systems
  [Figure: nodes, each containing cores, connected by an interconnection network]
Developing a Low-level Communication Library
  - Many-SC cache coherence protocol
      - Works between cores in the same cluster
      - Does not work between cores in different clusters
      - In this respect, the Many-SC is very similar to a typical cluster system
  - Plan
      - Develop a low-level communication library for typical cluster systems
      - Apply it to the Many-SC
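
Because coherence stops at the cluster boundary, data shared between clusters has to be copied explicitly, much as between nodes of a conventional cluster. The following is a minimal sketch of blocking, MPI-style send and receive over a one-slot mailbox. It is not the library's API: `mailbox_t`, `demo_send`, `demo_recv`, and `demo_roundtrip` are hypothetical names, two POSIX threads stand in for cores in different clusters, and a `memcpy` stands in for the real transport.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MSG_MAX 256

/* Hypothetical one-slot mailbox standing in for the link between clusters. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t changed;   /* signalled whenever 'full' changes */
    int full;
    size_t len;
    char data[MSG_MAX];
} mailbox_t;

int mailbox_init(mailbox_t *mb) {
    mb->full = 0;
    mb->len = 0;
    if (pthread_mutex_init(&mb->lock, NULL) != 0) return -1;
    if (pthread_cond_init(&mb->changed, NULL) != 0) {
        pthread_mutex_destroy(&mb->lock);
        return -1;
    }
    return 0;
}

void mailbox_destroy(mailbox_t *mb) {
    pthread_cond_destroy(&mb->changed);
    pthread_mutex_destroy(&mb->lock);
}

/* Blocking send: waits for an empty slot, then copies len bytes in.
 * Returns 0 on success, -1 if the message does not fit. */
int demo_send(mailbox_t *mb, const void *buf, size_t len) {
    if (len > MSG_MAX) return -1;
    pthread_mutex_lock(&mb->lock);
    while (mb->full) pthread_cond_wait(&mb->changed, &mb->lock);
    memcpy(mb->data, buf, len);
    mb->len = len;
    mb->full = 1;
    pthread_cond_broadcast(&mb->changed);
    pthread_mutex_unlock(&mb->lock);
    return 0;
}

/* Blocking receive: waits for a message, copies at most cap bytes out,
 * and returns the number of bytes copied. */
size_t demo_recv(mailbox_t *mb, void *buf, size_t cap) {
    pthread_mutex_lock(&mb->lock);
    while (!mb->full) pthread_cond_wait(&mb->changed, &mb->lock);
    size_t n = mb->len < cap ? mb->len : cap;
    memcpy(buf, mb->data, n);
    mb->full = 0;
    pthread_cond_broadcast(&mb->changed);
    pthread_mutex_unlock(&mb->lock);
    return n;
}

typedef struct { mailbox_t *in, *out; } link_t;

/* The "remote" side: receives an int, doubles it, sends it back. */
static void *peer(void *arg) {
    link_t *l = arg;
    int v = 0;
    demo_recv(l->in, &v, sizeof v);
    v *= 2;
    demo_send(l->out, &v, sizeof v);
    return NULL;
}

/* Sends value to the peer and returns its reply, or -1 on setup failure. */
int demo_roundtrip(int value) {
    mailbox_t to_peer, from_peer;
    link_t l = { &to_peer, &from_peer };
    pthread_t t;
    int reply = -1;
    if (mailbox_init(&to_peer) != 0) return -1;
    if (mailbox_init(&from_peer) != 0) { mailbox_destroy(&to_peer); return -1; }
    if (pthread_create(&t, NULL, peer, &l) == 0) {
        demo_send(&to_peer, &value, sizeof value);
        demo_recv(&from_peer, &reply, sizeof reply);
        pthread_join(t, NULL);
    }
    mailbox_destroy(&from_peer);
    mailbox_destroy(&to_peer);
    return reply;
}
```

Neither side reads the other's variables directly; every value crosses the boundary through an explicit copy, which is the property that lets the same code structure carry over from cluster nodes to Many-SC clusters.
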
Low-level Communication Library
  - Provides high-level APIs similar to MPI
  - The API functions are optimized for SnuCL
  - Uses RDMA internally
  - Some API functions
SnuCL
  - An OpenCL framework for heterogeneous clusters
  - Platform layer + runtime + kernel compiler
  - Freely available, open-source software
  - Supports OpenCL 1.2
  - Passed most of the OpenCL conformance tests
  - Supports x86 CPUs, ARM CPUs, AMD GPUs, NVIDIA GPUs, and Intel Xeon Phi coprocessors (from July 2013)
  - With SnuCL, an OpenCL application written for a single operating system instance runs on a heterogeneous cluster without any modification
The Structure of SnuCL
Replacing Communication Library
  Before (top to bottom): OpenCL Applications / SnuCL / MPI / Hardware
  After (top to bottom): OpenCL Applications / SnuCL / Low-level communication library / Hardware
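
One way to picture swapping the layer under SnuCL is an interface that the upper layer programs against, with the transport chosen behind it. The sketch below is an illustration of that layering idea only and does not reflect SnuCL's actual internals: `comm_backend_t`, `runtime_t`, `runtime_copy`, and both backends are hypothetical, and each backend is just a `memcpy` standing in for a real transport.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical interface the upper layer depends on. */
typedef struct {
    const char *name;
    /* Moves len bytes from src to dst; returns the bytes transferred. */
    size_t (*transfer)(void *dst, const void *src, size_t len);
} comm_backend_t;

/* Stand-in for an MPI-based transport (a plain memcpy in this sketch). */
static size_t mpi_like_transfer(void *dst, const void *src, size_t len) {
    memcpy(dst, src, len);
    return len;
}

/* Stand-in for the low-level communication library (also a memcpy here). */
static size_t lowlevel_transfer(void *dst, const void *src, size_t len) {
    memcpy(dst, src, len);
    return len;
}

const comm_backend_t MPI_LIKE_BACKEND = { "mpi-like", mpi_like_transfer };
const comm_backend_t LOWLEVEL_BACKEND = { "low-level", lowlevel_transfer };

/* Upper layer: holds a backend pointer and never names a concrete one. */
typedef struct {
    const comm_backend_t *comm;
} runtime_t;

size_t runtime_copy(const runtime_t *rt, void *dst, const void *src, size_t len) {
    return rt->comm->transfer(dst, src, len);
}
```

Replacing the transport then amounts to pointing `comm` at a different backend; code above the interface, and the applications above that, stay unchanged.
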
Performance
Developing Thread Library
  - Provides a subset of the POSIX thread API functions

  Name / Description

  int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);
      Starts a new thread in the calling process
  void pthread_exit(void *status);
      Terminate calling thread
  int pthread_join(pthread_t thread, void **status);
      Join with terminated thread
  pthread_t pthread_self(void);
      Obtain ID of the calling thread
  int pthread_mutex_init(pthread_mutex_t *mutex, const pthread_mutexattr_t *attr);
      Initialize a mutex
  int pthread_mutex_destroy(pthread_mutex_t *mutex);
      Destroy a mutex
  int pthread_mutex_lock(pthread_mutex_t *mutex);
      Lock a mutex
  int pthread_mutex_unlock(pthread_mutex_t *mutex);
      Unlock a mutex
  int pthread_barrier_init(pthread_barrier_t *restrict barrier, const pthread_barrierattr_t *restrict attr, unsigned count);
      Initialize a barrier
  int pthread_barrier_destroy(pthread_barrier_t *barrier);
      Destroy a barrier
  int pthread_barrier_wait(pthread_barrier_t *barrier);
      Synchronize at a barrier
Current Status and Future Plan
  - Developing a low-level communication library
      - Done for cluster systems
      - Needs to be applied to the Many-SC
  - Developing a threading library
      - Defining the API functions is done
  - Developing a software SVM for cluster systems
      - Needs to be implemented
  - Verification and performance analysis on Chundoong
The End