System Software Environments Breakout Report June 27, 2002.

Slides:



Advertisements
Similar presentations
Threads, SMP, and Microkernels
Advertisements

1 Module 1 The Windows NT 4.0 Environment. 2  Overview The Microsoft Operating System Family Windows NT Architecture Overview Workgroups and Domains.
Threads, SMP, and Microkernels Chapter 4. Process Resource ownership - process is allocated a virtual address space to hold the process image Scheduling/execution-
Chapter 4 Threads, SMP, and Microkernels Patricia Roy Manatee Community College, Venice, FL ©2008, Prentice Hall Operating Systems: Internals and Design.
Computer Systems/Operating Systems - Class 8
A 100,000 Ways to Fa Al Geist Computer Science and Mathematics Division Oak Ridge National Laboratory July 9, 2002 Fast-OS Workshop Advanced Scientific.
CS533 Concepts of Operating Systems Class 20 Summary.
1 BGL Photo (system) BlueGene/L IBM Journal of Research and Development, Vol. 49, No. 2-3.
CS533 Concepts of Operating Systems Class 12 Micro-kernels (or Extensibility via Hardware-Based Protection)
35 Advanced Operating Systems Kernels and kernel services.
Disco Running Commodity Operating Systems on Scalable Multiprocessors.
1 Mathematical, Information and Computational Sciences FAST-OS Workshop July 9-10, 2002 Fred Johnson.
CS533 Concepts of Operating Systems Class 12 Micro-kernels (or Extensibility via Hardware-Based Protection)
Chiba City: A Testbed for Scalablity and Development FAST-OS Workshop July 10, 2002 Rémy Evard Mathematics.
NPACI: National Partnership for Advanced Computational Infrastructure August 17-21, 1998 NPACI Parallel Computing Institute 1 Cluster Archtectures and.
Slide 3-1 Copyright © 2004 Pearson Education, Inc. Operating Systems: A Modern Perspective, Chapter 3 Operating System Organization.
John David Eriksen Jamie Unger-Fink High-Performance, Dependable Multiprocessor.
Computer System Architectures Computer System Software
Yavor Todorov. Introduction How it works OS level checkpointing Application level checkpointing CPR for parallel programing CPR functionality References.
Priority Research Direction Key challenges Fault oblivious, Error tolerant software Hybrid and hierarchical based algorithms (eg linear algebra split across.
UNIX System Administration OS Kernal Copyright 2002, Dr. Ken Hoganson All rights reserved. OS Kernel Concept Kernel or MicroKernel Concept: An OS architecture-design.
Priority Research Direction (use one slide for each) Key challenges -Fault understanding (RAS), modeling, prediction -Fault isolation/confinement + local.
IMPROUVEMENT OF COMPUTER NETWORKS SECURITY BY USING FAULT TOLERANT CLUSTERS Prof. S ERB AUREL Ph. D. Prof. PATRICIU VICTOR-VALERIU Ph. D. Military Technical.
Computer Science Open Research Questions Adversary models –Define/Formalize adversary models Need to incorporate characteristics of new technologies and.
Chapter 4 Threads, SMP, and Microkernels Patricia Roy Manatee Community College, Venice, FL ©2008, Prentice Hall Operating Systems: Internals and Design.
Software Engineering General architecture. Architectural components:  Program organisation overview Major building blocks in a system Definition of each.
1 Intern Project Presentation Connor Richardson Big Data August 4, 2015.
Computer Science Section National Center for Atmospheric Research Department of Computer Science University of Colorado at Boulder Blue Gene Experience.
Programming Models & Runtime Systems Breakout Report MICS PI Meeting, June 27, 2002.
Mesos A Platform for Fine-Grained Resource Sharing in the Data Center Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony Joseph, Randy.
Presented by Reliability, Availability, and Serviceability (RAS) for High-Performance Computing Stephen L. Scott and Christian Engelmann Computer Science.
Heterogeneous Multikernel OS Yauhen Klimiankou BSUIR
Crystal Ball Panel ORNL Heterogeneous Distributed Computing Research Al Geist ORNL March 6, 2003 SOS 7.
Introduction. Readings r Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 m Note: All figures from this book.
Threads, SMP, and Microkernels Chapter 4. Process Resource ownership - process is allocated a virtual address space to hold the process image Scheduling/execution-
Headline in Arial Bold 30pt HPC User Forum, April 2008 John Hesterberg HPC OS Directions and Requirements.
Issues Autonomic operation (fault tolerance) Minimize interference to applications Hardware support for new operating systems Resource management (global.
Chapter 5 McGraw-Hill/Irwin Copyright © 2011 by The McGraw-Hill Companies, Inc. All rights reserved. Enterprise Architectures.
CS551 - Lecture 5 1 CS551 Lecture 5: Quality Attributes Yugi Lee FH #555 (816)
1 Threads, SMP, and Microkernels Chapter Multithreading Operating system supports multiple threads of execution within a single process MS-DOS.
Diskless Checkpointing on Super-scale Architectures Applied to the Fast Fourier Transform Christian Engelmann, Al Geist Oak Ridge National Laboratory Februrary,
Scalable Systems Software for Terascale Computer Centers Coordinator: Al Geist Participating Organizations ORNL ANL LBNL.
 Load balancing is the process of distributing a workload evenly throughout a group or cluster of computers to maximize throughput.  This means that.
The Role of Virtualization in Exascale Production Systems Jack Lange Assistant Professor University of Pittsburgh.
A. Frank - P. Weisberg Operating Systems Structure of Operating Systems.
Breakout Group: Debugging David E. Skinner and Wolfgang E. Nagel IESP Workshop 3, October, Tsukuba, Japan.
System-Directed Resilience for Exascale Platforms LDRD Proposal Ron Oldfield (PI)1423 Ron Brightwell1423 Jim Laros1422 Kevin Pedretti1423 Rolf.
Copyright © Clifford Neuman - UNIVERSITY OF SOUTHERN CALIFORNIA - INFORMATION SCIENCES INSTITUTE Advanced Operating Systems Lecture notes Dr.
By Chi-Chang Chen.  Cluster computing is a technique of linking two or more computers into a network (usually through a local area network) in order.
SensorWare: Distributed Services for Sensor Networks Rockwell Science Center and UCLA.
1 Distributed Processing Chapter 1 : Introduction.
Lawrence Livermore National Laboratory 1 Science & Technology Principal Directorate - Computation Directorate Scalable Fault Tolerance for Petascale Systems.
Tackling I/O Issues 1 David Race 16 March 2010.
March 4, 2003SOS-71 FAST-OS Arthur B. (Barney) Maccabe Computer Science Department The University of New Mexico SOS 7 Durango, Colorado March 4, 2003.
Primitive Concepts of Distributed Systems Chapter 1.
E Virtual Machines Lecture 1 What is Virtualization? Scott Devine VMware, Inc.
Threads, SMP, and Microkernels Chapter 4. Processes and Threads Operating systems use processes for two purposes - Resource allocation and resource ownership.
BLUE GENE Sunitha M. Jenarius. What is Blue Gene A massively parallel supercomputer using tens of thousands of embedded PowerPC processors supporting.
© 2010 VMware Inc. All rights reserved Why Virtualize? Beng-Hong Lim, VMware, Inc.
1.3 Operating system services An operating system provide services to programs and to the users of the program. It provides an environment for the execution.
Computer System Structures
Introduction to Distributed Platforms
Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
CS533 Concepts of Operating Systems
Scalability and Validation the “test bed”
Threads, SMP, and Microkernels
Lecture 4- Threads, SMP, and Microkernels
Process Migration Troy Cogburn and Gilbert Podell-Blume
Presentation transcript:

System Software Environments Breakout Report June 27, 2002

What is the scope of existing efforts? 1.Scalable Systems Software Center 2.FAST-OS “lightweight kernel” effort 3.Science Appliance effort 4.Systems Management research 5.Blue Gene

Anemic Areas of Existing Research What are the under funded critical research issues? 1.Security – protecting systems from compromise 2.Fault tolerance – being able to run through failure. Three steps: detection, notification, system recovery Linkage between projects and groups for FT chain. 3.Validation of result – how know that app got right ans. Given system runs through failure.

Potential Gaps in Research What are the gaps in MICS research portfolio related to peta-scale computing? 1.What happens after Linux? 2.OS that supports the programming model – if it is going to change from dist. mem. Message passing then… 3.Alternate to microkernel approach “Concrete” holds everything together but is really heavy. 4.Local address space vs more expansive address support 5.Experimental architecture research testbed Arose during PIM discussion Platforms for software development Not just the commodity clusters we work with today ES40 and PIM not addressed

Getting Help From Others What is needed most from the other four groups? 1.Runtime & Programming models – feedback to OS so it can reconfigure to optimize needs, PIM is entangled across Programming and OS. 2.Data & vizualization - Scalable I/O – leverage SDM ISIC and Program models Checkpoint/restart 100K nodes 3.Portability & Interoperability - ??? Discussion but no clear things we need to leverage 4.Performance – feedback to OS so it can correct bottlenecks while apps run.