FAST-OS
Arthur B. (Barney) Maccabe
Computer Science Department, The University of New Mexico
SOS 7, Durango, Colorado
March 4, 2003

FAST-OS
  - Most applications show poor scaling on very large machines
  - Scaling must be addressed at all levels
  - FAST-OS: consider OS and runtime
  - Emphasis on near-term systems
  [Diagram: layered stack of Application, Runtime, Operating System, Hardware]
  FAST-OS: Forum to Address Scalable Technology for runtime and Operating Systems

FAST-OS People
  - Oversight committee
      - Fred Johnson, DOE
      - Paul Messina, Cal Tech
      - Jose Munoz, DOE
      - Thomas Sterling, Cal Tech
      - Rick Stevens, ANL
  - Steering committee
      - Barney Maccabe, UNM
      - Ron Brightwell, SNL
      - Al Geist, ORNL
      - Terry Jones, LLNL
      - Rusty Lusk, ANL
      - Ron Minnich, LANL

Forum Activities
  - Web page
  - February 2002: WIMPS, Bodega Bay
      - ½ day FAST-OS workshop
      - Primary focus on organization
  - July: ½ days in Chicago
      - 30 people from labs, industry, and academia
      - Report focuses on issues
  - October 2002: ½ day workshop in Santa Fe
      - Commodity versus specialized approaches

Issues (1 of 4)
  - Fault tolerance is critical
      - System size and application run times imply that apps will encounter faults
      - All levels (including apps) must deal with faults
  - Programming models
      - Need to support a variety of models
      - Models for 100,000 processors
      - Breaking the legacy straitjacket
      - What comes after MPI?

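The claim that system size and run time together make faults unavoidable can be illustrated with a little arithmetic. The sketch below is not from the talk. It assumes node failures are independent and exponentially distributed, and the per-node mean time between failures (50,000 hours) and the 24-hour run length are hypothetical values chosen only for illustration. The 100,000-processor scale is the one named on the slide.

```python
import math


def system_mtbf_hours(node_mtbf_hours: float, nodes: int) -> float:
    """Mean time between failures for the whole machine.

    Assumes independent, exponentially distributed node failures,
    so the failure rates of the individual nodes simply add up.
    """
    return node_mtbf_hours / nodes


def prob_at_least_one_fault(node_mtbf_hours: float, nodes: int, run_hours: float) -> float:
    """Probability that a run of run_hours sees one or more node failures."""
    expected_faults = nodes * run_hours / node_mtbf_hours
    # 1 - exp(-x), computed via expm1 for accuracy when x is small.
    return -math.expm1(-expected_faults)


if __name__ == "__main__":
    NODE_MTBF_HOURS = 50_000.0  # hypothetical per-node MTBF (assumption)
    RUN_HOURS = 24.0            # hypothetical application run time (assumption)

    for nodes in (1_000, 10_000, 100_000):
        mtbf = system_mtbf_hours(NODE_MTBF_HOURS, nodes)
        p = prob_at_least_one_fault(NODE_MTBF_HOURS, nodes, RUN_HOURS)
        print(f"{nodes:>7} nodes: system MTBF {mtbf:8.2f} h, P(fault in run) = {p:.4f}")
```

Under these assumptions the machine-wide MTBF shrinks in proportion to node count, so at large scale a long-running application should expect to see a fault during nearly every run, which is why the slide asks every level, including applications, to deal with faults.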
Issues (2 of 4)
  - OS structure
      - Protection boundaries
      - Global/local OS split
      - How far can we take lightweight approaches?
      - Split between OS and runtime system?
  - APIs
      - App/Runtime; Runtime/OS
      - Tools: compilers, debuggers
  - Specific functions
      - Process management and scheduling
      - Security, QoS, invariants, etc.

Issues (3 of 4)
  - What is scalability?
      - Which runtime and OS services should scale?
      - Performance nearly independent of size?
      - Reliability nearly independent of size?
  - Future hardware
      - Do we need different OS/runtime for different systems?
      - Hardware support for OS/runtime
          - protection
          - reliable networks
          - collective operations

Issues (4 of 4)
  - Application requirements
      - Identify critical apps
      - Measure needs
  - Metrics
      - How do we measure success?
  - Programmatic
      - Roles for academics, vendors, and labs
      - Support testbeds (e.g., Chiba City)
      - Don't choose a winner too early in the process

Next Steps
  - Next general meeting will be May or June
  - Goal is to define the research agenda