CRL (C Region Library) Chao Huang, James Brodman, Hassan Jafri CS498LVK.

1 CRL (C Region Library) Chao Huang, James Brodman, Hassan Jafri CS498LVK

2 Introduction
CRL is an all-software distributed shared memory (DSM) system
–Provides a shared address space
–Built on PVM
“Region”: an arbitrarily sized, contiguous area of memory
–Consistent cached copy at local nodes

3 Functions
Environment
–crl_init
–crl_num_nodes, crl_self_addr
Basic region operations
–rid_t rgn_create(unsigned size)
–void rgn_destroy(rid_t rgn_id)
–rid_t rgn_rid(void *rgn)
–unsigned rgn_size(void *rgn)
–void rgn_flush(void *rgn)

4 Functions
Region mapping
–void *rgn_map(rid_t rgn_id)
–void rgn_unmap(void *rgn)
Region read and write
–void rgn_start_read(void *rgn)
–void rgn_end_read(void *rgn)
–void rgn_start_write(void *rgn)
–void rgn_end_write(void *rgn)
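The calls on slides 3 and 4 are meant to be used in a fixed order: create, map, start an operation, access, end the operation, unmap. The sketch below illustrates that order. It is not the real CRL code: the rgn_* bodies are single-process stand-ins written only so the example compiles without the library, `rid_t` being an `int` is an assumption, and `make_ramp` and `sum_region` are hypothetical helper names.

```c
#include <assert.h>
#include <stdlib.h>

/* Single-process stand-ins, NOT the real CRL implementation.
 * Signatures follow the slides; rid_t as int is an assumption. */
typedef int rid_t;
#define MAX_REGIONS 16
static void *region_mem[MAX_REGIONS];
static int region_count = 0;

rid_t rgn_create(unsigned size)
{
    assert(region_count < MAX_REGIONS);
    region_mem[region_count] = calloc(1, size);
    assert(region_mem[region_count] != NULL);
    return region_count++;
}

void *rgn_map(rid_t rgn_id)
{
    assert(rgn_id >= 0 && rgn_id < region_count);
    return region_mem[rgn_id];
}

/* With one process there is nothing to keep coherent, so these are no-ops. */
void rgn_unmap(void *rgn)       { (void) rgn; }
void rgn_start_read(void *rgn)  { (void) rgn; }
void rgn_end_read(void *rgn)    { (void) rgn; }
void rgn_start_write(void *rgn) { (void) rgn; }
void rgn_end_write(void *rgn)   { (void) rgn; }

/* Hypothetical helper: create a region of n doubles holding 0, 1, ..., n-1. */
rid_t make_ramp(int n)
{
    rid_t id = rgn_create((unsigned) (n * sizeof(double)));
    double *v = (double *) rgn_map(id);
    int i;

    rgn_start_write(v);            /* writes must be bracketed */
    for (i = 0; i < n; i++)
        v[i] = (double) i;
    rgn_end_write(v);
    rgn_unmap(v);
    return id;
}

/* Hypothetical helper: sum the n doubles stored in a region. */
double sum_region(rid_t id, int n)
{
    double *v = (double *) rgn_map(id);
    double sum = 0.0;
    int i;

    rgn_start_read(v);             /* reads must be bracketed too */
    for (i = 0; i < n; i++)
        sum += v[i];
    rgn_end_read(v);
    rgn_unmap(v);
    return sum;
}
```

The point of the sketch is the bracketing: every access to region data sits between a start and an end call, and the pointer returned by rgn_map is only used while the region is mapped.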

5 Functions
Global synchronization
–void rgn_barrier(void)
–void rgn_bcast_send(int len, void *buf)
–void rgn_bcast_recv(int len, void *buf)
–double rgn_reduce_dadd(double arg)
–double rgn_reduce_dmin(double arg)
–double rgn_reduce_dmax(double arg)
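A typical use of the reduction calls is to combine per-node partial results. The sketch below assumes that rgn_reduce_dadd hands every node the sum of all nodes' arguments (the slides give only the signature). It simulates that effect in one process rather than calling CRL: `partial_dot`, `sim_reduce_dadd`, and the `self`/`nodes` parameters are hypothetical names introduced for illustration.

```c
#include <assert.h>

/* Partial dot product over the block of indices owned by node `self`
 * out of `nodes` nodes. Both parameters are hypothetical; in a CRL
 * program they would come from the environment (slide 3). */
double partial_dot(const double *x, const double *y, int n,
                   int self, int nodes)
{
    int lo, hi, i;
    double sum = 0.0;

    assert(nodes > 0 && self >= 0 && self < nodes);
    lo = (int) ((long) n * self / nodes);        /* first owned index */
    hi = (int) ((long) n * (self + 1) / nodes);  /* one past the last */
    for (i = lo; i < hi; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Stand-in for the ASSUMED effect of rgn_reduce_dadd: the value each
 * node would receive is the sum of every node's contribution. */
double sim_reduce_dadd(const double *args, int nodes)
{
    double total = 0.0;
    int p;

    for (p = 0; p < nodes; p++)
        total += args[p];
    return total;
}
```

In an actual SPMD program each node would compute only its own partial result and pass it to the reduction; the array of contributions exists here only because a single process plays all the nodes.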

6 Example

/* Compute the dot product of two n-element vectors, each
 * of which is represented by an appropriately-sized region.
 *   x: region identifier for 1st vector
 *   y: address at which 2nd vector is already mapped
 */
double dotprod(rid_t x, double *y, int n)
{
    int i;
    double *z;
    double rslt;

    /* map 1st vector and initiate read operation */
    z = (double *) rgn_map(x);
    rgn_start_read(z);

    /* initiate read operation on 2nd vector */
    rgn_start_read(y);

    /* compute dot product */
    rslt = 0;
    for (i = 0; i < n; i++)
        rslt += z[i] * y[i];

    /* terminate read operations and unmap 1st vector */
    rgn_end_read(y);
    rgn_end_read(z);
    rgn_unmap(z);

    return rslt;
}

7 Discussions
All-software: latency of communication operations may be higher than in hardware-based systems
Region size can be chosen to correspond to user data structures (programmer’s responsibility)
Fixed-home, directory-based invalidate protocol
Message ordering: a 32-bit version number tags each region so that out-of-order messages can be handled
Unmapped region cache (URC): a region’s mapping can be cached after the region is unmapped
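One way to picture the 32-bit version tag is as a guard against protocol messages that arrive late or out of order. The sketch below is an illustration under that assumption, not CRL's actual protocol code: the struct layouts, `version_newer`, and `apply_update` are hypothetical names, and the wraparound-tolerant comparison is a design choice of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical local copy of a region: a version tag plus payload. */
typedef struct {
    uint32_t version;
    double   value;
} region_copy_t;

/* Hypothetical protocol message carrying a newer copy of the data. */
typedef struct {
    uint32_t version;
    double   value;
} update_msg_t;

/* True if a is newer than b, tolerating 32-bit wraparound: a counts as
 * newer when it is between 1 and 2^31 - 1 steps ahead of b. */
int version_newer(uint32_t a, uint32_t b)
{
    uint32_t ahead = a - b;   /* unsigned subtraction wraps modulo 2^32 */
    return ahead != 0 && ahead < 0x80000000u;
}

/* Apply the message only if it is newer than the local copy.
 * Returns 1 if applied, 0 if the message was stale and dropped. */
int apply_update(region_copy_t *copy, const update_msg_t *msg)
{
    assert(copy != 0 && msg != 0);
    if (!version_newer(msg->version, copy->version))
        return 0;
    copy->version = msg->version;
    copy->value   = msg->value;
    return 1;
}
```

With this check, a message tagged with version 6 that arrives after version 7 has already been applied is simply ignored, so delivery order no longer affects the final state of the copy.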

8 URC
Enables Lazy Release Consistency for CRL
rgn_start_op can be satisfied locally if the region is not invalidated before the next time it is mapped
Even if the data/region is invalidated, later accesses can be satisfied more quickly
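The idea of keeping a mapping around after unmap can be sketched as a small table keyed by region identifier. This is a minimal illustration, not CRL's data structure: the table size, the round-robin replacement, `rid_t` as `int`, and the names `urc_insert` and `urc_remove` are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

typedef int rid_t;        /* assumption: the slides do not define rid_t */
#define URC_SLOTS 4       /* hypothetical capacity */

typedef struct {
    rid_t id;
    void *addr;           /* where the region's local copy lives */
    int   used;
} urc_entry_t;

static urc_entry_t urc[URC_SLOTS];
static int urc_next = 0;  /* round-robin victim index */

/* Called on unmap: remember the mapping instead of discarding it.
 * The slot at urc_next is overwritten, evicting whatever was there. */
void urc_insert(rid_t id, void *addr)
{
    assert(addr != NULL);
    urc[urc_next].id   = id;
    urc[urc_next].addr = addr;
    urc[urc_next].used = 1;
    urc_next = (urc_next + 1) % URC_SLOTS;
}

/* Called on map: return the cached address and release the slot,
 * or return NULL on a miss (the caller then does a full map). */
void *urc_remove(rid_t id)
{
    int i;

    for (i = 0; i < URC_SLOTS; i++) {
        if (urc[i].used && urc[i].id == id) {
            urc[i].used = 0;
            return urc[i].addr;
        }
    }
    return NULL;
}
```

A hit means the local copy and its protocol state are still in place, so a later start operation has a chance of completing without any messages; a miss falls back to the ordinary mapping path.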

9 Software
Prototype implementation available
Platforms
–CM-5, Thinking Machines (message-passing multicomputer)
–Alewife (distributed-memory multiprocessor); provides native shared memory support
–TCP/Unix implementation for SunOS
Expect a Linux port soon

10 Machine Characteristics
             CM-5        Alewife
Latency      34 us       14 us
Throughput   8 MB/sec    18 MB/sec
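To get a feel for these figures, the sketch below plugs them into a simple linear cost model: transfer time = latency + size / throughput. The model is an assumption of this note, not something stated on the slides. It reads the microsecond figures as latency and the MB/sec figures as throughput, and takes 1 MB as 10^6 bytes, so that 1 MB/sec equals 1 byte per microsecond. `transfer_us` is a hypothetical helper name.

```c
#include <assert.h>

/* Estimated time in microseconds to move `bytes` bytes, under the
 * ASSUMED model: fixed latency plus size divided by throughput.
 * With 1 MB = 10^6 bytes, 1 MB/sec is exactly 1 byte per microsecond,
 * so bytes / mb_per_sec is already in microseconds. */
double transfer_us(double latency_us, double mb_per_sec, double bytes)
{
    assert(mb_per_sec > 0.0 && bytes >= 0.0);
    return latency_us + bytes / mb_per_sec;
}
```

For a 1024-byte region, the model gives 34 + 1024/8 = 162 us on the CM-5 and 14 + 1024/18, roughly 70.9 us, on Alewife. Under this model the fixed latency dominates for small regions, while throughput dominates for large ones.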

11 Basic Ops Latencies
                         CM-5 (us)   Alewife (us)   Alewife native (us)
Start read hit           2.5         2.3
End read hit             3           2.5
Start read miss 0 inv    55.1        29             1.9
Start write miss 1 inv   108.1       48.9           3.3
Start write miss 6 inv   129.9       96.7           35.4

12 Applications
32-way completion time of apps with CRL on Alewife is comparable to that of Alewife native shared memory
–How? Up to 5 remote headers supported by LimitLESS (Alewife’s software-based cache-coherence subsystem)

