1
March 11, 2003 SS-SQ03-W: 1
Stanford Streaming Supercomputer (SSS)
Winter Quarter 2002-2003 Wrapup Meeting
Bill Dally, Computer Systems Laboratory, Stanford University
2
Year 2 Overview
Where we are today
  –First year goal was met: demonstrated feasibility on single node
  –Feedback from site visit team was very positive
  –Potential for a big impact on scientific computing
  –But still much to do!
Key FY03 goals
  –Get long-term software infrastructure in place
    –Select approach, implement baseline Brook to SSS compiler
  –Multi-node versions that scale
    –Language, compiler, simulator
  –Tackle hard problems: 3-D, irregular neighborhoods/sparse matrix solve
    –Language support, numerics support, evaluate on simulator
  –Refine architecture
    –Cluster organization, aspect ratio, register organization, memory organization
  –Industrial Partner
    –Start serious discussions, outreach to build support, close partner in 04
3
Some concerns
We’re doing a great job – but…
Losing a bit of focus and momentum
  –Tooling on the detail
  –Need to take a step back and reexamine the big picture
Need to raise our outside profile
  –Publish
    –Overview paper
    –Brook paper
  –Generate some more convincing evidence of advantages
    –Need a control for bandwidth measures
  –Update the web page
  –Visit the labs
4
Let's review our overall goal
Exploit capabilities of VLSI to realize cost-effective scientific computing.
5
Review – What is the SSS Project About?
Exploit streams to give 100x improvement in performance/cost for scientific applications vs. ‘cluster’ supercomputers
  –From 100 GFLOPS PCs to TFLOPS single-board computers to PFLOPS supercomputers
Use layered programming system to simplify development and tuning of applications
  –Stream languages
  –Streaming virtual machine
Demonstrated feasibility of streaming scientific computing in year 1
Refine architecture and programming system in year 2
  –Demonstrate realistic applications (3D, irregular)
  –Build usable compiler
  –Resolve architecture questions: aspect ratio, conditional execution, sparse clusters, reg organization, memory system, etc…
Build a prototype and demonstrate CITS applications in years 3-6
  –With industrial and government partners
  –Broaden our base of support
6
Industrial Partner Update
Candidates
  –Cray, IBM, Sun, HP, SGI, Intel
Initial discussion
  –Present SSS project and results to date
  –Discuss collaboration models
  –Identify next steps
Met with Cray, Sun, and SGI
  –They listened politely, but little traction
  –Need more convincing evidence
  –Need to address programming issue
    –Have to provide a path for legacy codes
7
Outreach
National Labs
  –Los Alamos
  –Livermore
  –Sandia
Other Government
  –NASA
  –DARPA
  –DoD (Charlie Holland)
  –AFOSR
User communities
8
Software Win 02 Goals
Brook
  –Define carefully the semantics of the operators
    –No progress
  –Work on “views of memory” abstraction
    –Proposed API – will write up for next SW meeting
  –Support for partitioning, shared memory, naming, fitting into stream abstraction
    –Adopting UPC – will write up for next SW meeting
  –Support for irregular neighborhoods
    –Failed to find an application
  –Multithreaded version (Christos)
    –Have simple model for multi-node – written up
  –(NEW) Preliminary Brooktran spec
  –Concrete Winter goals
    –[Ian/Frank] Review of the language
    –[Pat] Partitioning (UPC)
    –Multi-node/Multi-threaded version
    –Irregular support – w/ application
    –PPoPP paper
    –MD on BRT
9
Brook Spring 03 Goals
Refine semantics of operators
  –New version of spec
Implement views of memory API (UPC)
Find application for irregular structures
  –Dijkstra, incomplete LU
Dynamic structure
Start switching to new compiler
Brooktran spec/implementation
  –Implemented in Open64
Concern – have lost metacompiler support
10
Software Win 02 Goals
SVM
  –Spec has evolved
    –Consensus between MIT, Texas, Stanford, USC
  –Implement multinode version
    –No progress
  –SVM to simulator path
    –No progress
  –Multi-thread
11
SVM – Spring 03 Goals
Spec is complete – and supports SSS
Revise single-node simulator
Multi-node simulator (prelim)
12
Software Win 02 Goals (3 of 3)
Start regular meetings [Done]
Compiler
  –Decide on flow from Brook->SVM->SSS [Mattan]
    –Done
  –Select base compiler [Jayanth]
    –ORC, Gnu, SUIF, Tendra, others…
    –Done
  –“Spike” a simple program from Brook->SSS [Mattan/Jayanth ++]
    –Started – modified front end – operating on WHIRL
  –Brook to Nvidia
  –Optimizations [Spring]
Run time
  –Write a white paper
13
Compiler Spring 03 Goals
Complete feasibility study
Brook to C path
  –Parse Brook
  –Generate C
Optimizations
  –See Mattan’s document
Need to generate SVM code by mid summer
Parse Brooktran [Alan, Fatica, Jayanth]
Kernel scheduler MULADD [Das]
SVM to SSS [Francois – long term – need plan]
14
Application Win 02 Goals
StreamFLO [Fatica]
  –Base version is complete
  –Not running on simulator
  –Early start on 3D version – partitioning waiting on API def
StreamFEM [Barth]
  –Waiting on spec for partitioning
  –3D arithmetic kernels done
  –Tridiagonal in Brook
StreamMD [Eric/student]
  –Ported GROMACS to the NV30 – benchmarks
    –Performance dependent on number of registers
    –Doesn’t work with CG compiler
Model applications [Ron/Frank]
  –Started
    –Look at Sierra, purple benchmarks: ppm, sweep3D [delay]
15
Application Spring 03 Goals
StreamFLO [Fatica]
  –Parse Brooktran – F to WHIRL [Alan, Fatica]
  –Partitioned version – multi-node UPC
  –3D version
StreamFEM [Barth]
  –Simulate 3D
  –Sparse LUD
  –Partitioned version
StreamMD [Eric/student]
  –Hand-tune NV30 assembly code
  –GROMACS in Brook
Model applications [Ron/Frank]
  –C implementations of adaptive structures
    –Look at Sierra, purple benchmarks: ppm, sweep3D [delay]
16
Architecture Win 02 Goals
Single-Node Simulator [Jung-Ho, Knight]
  –64-bit support, MULADD, Scalar Processor
  –Not yet
Multi-Node Simulator [Jung-Ho, Abhishek]
  –Network model
  –Multi-node mechanisms
  –Not yet
Point Studies
  –Aspect ratio
    –SSE vs VLIW
    –Planning
  –Conditional execution [Mattan/Ujval]
    –Started
  –Sparse clusters
  –SRF organization [Nuwan]
    –Complete
  –Cache alternatives [Jung Ho]
  –Add and store study [Jung Ho]
    –Started
  –I/O
  –Iterative operations [Francois]
    –Planned
17
Architecture Spring 03 Goals
Multi-node simulator
Point Studies
  –Aspect ratio [TIM]
  –Conditional execution [Mattan/Ujval]
  –Sparse clusters [Delay]
  –SRF organization [Nuwan]
    –Refine
  –Cache alternatives [Jung Ho]
  –Add and store study [Jung Ho]
  –I/O [?]
  –Iterative operations [Francois]
64-bit [delay]
Scalar Processor [delay]
18
Special Win 02 Goals
Fix website [Pat]
  –Public and private websites
Name that computer
  –Mississippi
  –Axios
  –Submit names to Mattan
  –Bill, Pat, Bill to choose
Project Party [Mattan – Pat’s house]
19
Name Resolution
From now on, the SSS is called Merrimac
20
Winter Quarter Meeting Schedule
4/1    Fedkiw         Party
4/8    Alan, Fatica   Brooktran
4/15   Kapasi         Conditionals
4/22   Fatica         StreamFLO update
4/29   Review Prep
5/6    Review Prep
5/13   Tim, Tim       StreamFEM 3D
5/20   Ian, Pat       Brook Specification
5/27   Mattan         Bandwidth Comparison
6/3    Jayanth        Compiler
6/10   Bill           Wrapup
21
Papers
Arch
  –Indexable SRFs (Nuwan)
  –Streaming Supercomputer Overview (Tim K.)
  –Streaming on conventional CPUs (Mattan)
  –Conditionals (Ujval)
  –Remote Ops (Jung Ho)
  –Aspect Ratio (?)
  –Data parallel (SSE) vs. ILP (VLIW)
Software
  –Design of Brook (Ian)
  –Data parallel programming on graphics HW (Pat)
  –Brook to CG Compiler
Apps
  –Gromacs
  –StreamFEM (Tim²)
Overview (Bill and Pat)