1 Berkeley UPC
Kathy Yelick
Christian Bell, Dan Bonachea, Wei Chen, Jason Duell, Paul Hargrove, Parry Husbands, Costin Iancu, Rajesh Nishtala, Mike Welcome
LBNL and U.C. Berkeley
http://upc.lbl.gov

2 Berkeley UPC Compiler Status
Kathy Yelick, Berkeley UPC, http://upc.lbl.gov
Recent Berkeley UPC release (v2.2)
  Supports 1.2 language spec
  Supports collectives (tuning ongoing); memory model compliance
  Supports UPC I/O (naïve reference implementation)
Compiler work
  Optimization phase and improved performance in v2.2
  Work on automated communication overlap, upc_forall, …
Large effort in quality assurance and robustness
  Test suite: 600+ tests run nightly on 20+ platform configs
  >30,000 UPC compilations and >20,000 UPC test runs per night
  Test suite infrastructure extended to support any UPC compiler
    Now running nightly with GCC/UPC + UPCR
    Has also been used on HP-UPC, Cray UPC

3 Berkeley UPC Collaborations
GCC UPC on Berkeley UPC Runtime
  Use for cluster (GASNet) implementations
  Now works with pthread runtime
Source-level debugging with Totalview 7.x
  Joint project with Etnus
  General framework for source-to-source translators
Future work:
  Cray XT3 and other Rainier/Adams port
  Possible BlueGene/L port
  XT3 and BG/L both run on MPI conduit

4 Berkeley Applications & Benchmarks
Some new applications
  FT: 0.45 TFlops on 512 proc Itanium/Quadrics (Elan4)
  CG: 30 GFlops on 512 HP Alpha/Quadrics (Elan3)
  LU: >2 TFlops on 512 proc Itanium/Quadrics (Elan4)
  Barnes-Hut: fine-grained (based on Splash)
  CFG: uses to Chombo
More on LU
  Towards a sparse direct solver (SuperLU)
  Currently a full (top500-compliant) HPL implementation
  All UPC except for call to the BLAS

5 End of Berkeley Status

6 Data Movement and Synchronization

7 Motivation for Data Movement Synchronization
Some operations are (at best) hard/slow in UPC
Benchmarks highlight these
  FT: communication-limited, all-to-all: want to overlap
  MG: fill in ghost regions
    Remote writes are often faster than remote reads
    But need to synchronize: let the other proc know data is available
    – See Tarek and John Mellor-Crummey’s PPoPP05 paper
    – Signaling store in Split-C
    – Implementation issue: reordering
  LU: remotely enqueue a task
  GUPS and Histogram: remotely increment/XOR a value
    With or without atomicity
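
Sketch for slide 7 (not from the original deck): the "write the data, then let the other proc know it is available" pattern, and why reordering matters. This is plain C11 with pthreads rather than UPC, so two threads in one process stand in for two UPC threads and ordinary stores stand in for remote writes. The names `mailbox_t`, `demo_t`, `producer`, `consumer` and `run_signaling_store_demo` are hypothetical and are not part of Berkeley UPC, GASNet or Split-C. The release store on the flag plays the role of the signaling store: it may not become visible before the payload writes that precede it.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define PAYLOAD_WORDS 4

/* Hypothetical stand-in for a buffer that "lives on" the consumer. */
typedef struct {
    int payload[PAYLOAD_WORDS]; /* plain data written by the producer */
    atomic_int ready;           /* flag set by the signaling store    */
} mailbox_t;

typedef struct {
    mailbox_t box;
    int sum;                    /* filled in by the consumer */
} demo_t;

static void *producer(void *arg) {
    demo_t *d = arg;
    /* Stand-in for the remote write of the data (e.g. ghost cells). */
    for (int i = 0; i < PAYLOAD_WORDS; i++)
        d->box.payload[i] = 10 * (i + 1);
    /* Signaling store: release ordering forbids the payload writes
       above from being reordered after this flag update. */
    atomic_store_explicit(&d->box.ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg) {
    demo_t *d = arg;
    /* Wait for the signal; the acquire load pairs with the release store,
       so the payload is guaranteed visible once the flag reads 1. */
    while (atomic_load_explicit(&d->box.ready, memory_order_acquire) == 0)
        sched_yield();
    int sum = 0;
    for (int i = 0; i < PAYLOAD_WORDS; i++)
        sum += d->box.payload[i];
    d->sum = sum;
    return NULL;
}

/* Runs one producer/consumer exchange and returns the sum the consumer saw. */
int run_signaling_store_demo(void) {
    demo_t demo;
    atomic_init(&demo.box.ready, 0);
    demo.sum = -1;

    pthread_t prod, cons;
    int rc = pthread_create(&cons, NULL, consumer, &demo);
    assert(rc == 0);
    rc = pthread_create(&prod, NULL, producer, &demo);
    assert(rc == 0);
    (void)rc;

    pthread_join(prod, NULL);
    pthread_join(cons, NULL);
    return demo.sum;
}
```

Called from a `main`, `run_signaling_store_demo()` returns 100 (10 + 20 + 30 + 40). If the flag were written with a relaxed store and read with a relaxed load, the consumer could in principle see the flag before the payload, which is the reordering issue the slide mentions.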

8 Who Would Like to Talk?
Non-Blocking Memget/put (Dan)
Semaphores (Dan)
Semaphore example (Tarek)
Remote Atomics (Phil)
Floating functions (Jason)

