Alexey Pakhunov /XCG, Microsoft Research/ March 30th, 2011.

Presentation transcript:


• Black Cloud OS:
  • A fork of Singularity OS
  • Our playground for experimenting with message passing in a non-cache-coherent environment
• This presentation covers only our development experiences on the SCC
• Submission of the paper is on its way

• A quote from the Singularity home page: “A research operating system prototype, extending programming languages, and developing new techniques and tools for specifying and verifying program behavior”
• Written in managed code
  • Some assembly and C++ in the boot loader and kernel
• IPC and inter-component communication are based on message passing

[Diagram: the SCC chip (tiles with routers "R", DDR3 memory controllers "MC", VRC, and the System Interface) attached over PCI-E to a Management Console (Linux) running sccTcpServer/mceGui, which is reached over TCP/IP from a Desktop PC (Windows) running RcLoader.Net, KdProxy, WinDbg, etc.]

• Configuration
  • Generates the system memory map
  • Configures the SCC registers
  • Uploads the boot loader and OS images
  • Supports manual editing of the SCC configuration
• Debugging
  • Allows inspecting the memory and configuration registers


• No serial port or console
  • Memory at 0xb8000 is the console buffer
• I/O redirection doesn’t work as expected
  • Executing an IN or OUT instruction effectively halts the core and sccTcpServer
• Serial KD transport is emulated
  • A couple of ring buffers on the SCC side
  • KdProxy.exe exposes a named pipe interface for the debugger
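The ring buffers used for the emulated serial transport might look roughly like the sketch below. This is a minimal single-producer, single-consumer byte ring in C, not the Black Cloud code: the names `kd_ring`, `kd_ring_write`, `kd_ring_read` and the capacity `KD_RING_SIZE` are hypothetical, and the cache flushing and memory ordering that a non-cache-coherent shared buffer would need are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KD_RING_SIZE 256u /* hypothetical capacity; must be a power of two */

/* One direction of the emulated serial link (hypothetical layout). */
typedef struct {
    volatile uint32_t head;      /* free-running write index, producer-owned */
    volatile uint32_t tail;      /* free-running read index, consumer-owned */
    uint8_t data[KD_RING_SIZE];
} kd_ring;

void kd_ring_init(kd_ring *r) {
    /* Masking with (size - 1) only works for power-of-two sizes. */
    assert((KD_RING_SIZE & (KD_RING_SIZE - 1u)) == 0u);
    r->head = 0;
    r->tail = 0;
}

/* Copies up to len bytes into the ring; returns how many were accepted. */
size_t kd_ring_write(kd_ring *r, const uint8_t *src, size_t len) {
    size_t n = 0;
    while (n < len && (r->head - r->tail) < KD_RING_SIZE) {
        r->data[r->head & (KD_RING_SIZE - 1u)] = src[n++];
        r->head = r->head + 1u; /* publish the byte after storing it */
    }
    return n;
}

/* Copies up to len bytes out of the ring; returns how many were read. */
size_t kd_ring_read(kd_ring *r, uint8_t *dst, size_t len) {
    size_t n = 0;
    while (n < len && r->head != r->tail) {
        dst[n++] = r->data[r->tail & (KD_RING_SIZE - 1u)];
        r->tail = r->tail + 1u;
    }
    return n;
}
```

With two such rings, one per direction, the core writes debugger packets into one and polls the other, while a host-side proxy does the reverse. Each index has exactly one writer, so no lock is needed in this simplified model.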

• No BIOS
  • The system memory map is patched directly in the boot loader
• No standard devices
  • Local APIC is used instead of the i8254 timer and PIC
  • No RTC
• No modern instructions supported
  • Context handling code was updated due to lack of MMX
    ▪ The 32-bit flavor of Singularity uses only x87 for floating point calculations
  • Bartok compiler was patched due to lack of CMOV instructions

• Turning on the MPB bypass bit causes a race that corrupts memory
  • Minus three days of debugging :-)
  • We couldn’t take advantage of fast MPB access
• Large pages cannot be used together with MPB
  • Singularity uses large pages to create the identity mapping spanning 4GB

• A telnet connection to each core
• The same serial transport emulation via KdProxy.exe was used

• A read-only OS image is shared among all cores
• Message passing code uses MPB-mapped buffers and CL1FLUSH-aware memcpy()
• Large shared memory storage is accessible via dynamically remapped LUTs
  • R/W access is possible with proper cache flushing and/or caching settings in PTEs
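The idea of reaching a large shared store through a dynamically remapped LUT can be modelled as a sliding window. The C sketch below is a simplified model under stated assumptions, not the SCC register layout: it assumes a 256-entry table indexed by the top 8 bits of a 32-bit core address, with each entry selecting a 16 MB segment of a wider system address space. The names `lut_model`, `lut_translate` and `lut_map_window` are hypothetical.

```c
#include <stdint.h>

#define LUT_ENTRIES  256u /* assumed: one entry per top-8-bit value */
#define SEGMENT_BITS 24u  /* assumed: 16 MB segments */
#define OFFSET_MASK  ((1u << SEGMENT_BITS) - 1u)

/* Simplified per-core lookup table: slot -> system segment number. */
typedef struct {
    uint16_t segment[LUT_ENTRIES];
} lut_model;

/* Translates a 32-bit core address into a wider system address. */
uint64_t lut_translate(const lut_model *lut, uint32_t core_addr) {
    uint32_t slot = core_addr >> SEGMENT_BITS;
    uint32_t offset = core_addr & OFFSET_MASK;
    return ((uint64_t)lut->segment[slot] << SEGMENT_BITS) | offset;
}

/*
 * Re-points one window slot at the segment containing store_addr and
 * returns the core address that now reaches that byte. Real hardware
 * would also need stale cache and TLB contents dealt with at this point.
 */
uint32_t lut_map_window(lut_model *lut, uint32_t window_slot,
                        uint64_t store_addr) {
    lut->segment[window_slot] = (uint16_t)(store_addr >> SEGMENT_BITS);
    return (window_slot << SEGMENT_BITS) | (uint32_t)(store_addr & OFFSET_MASK);
}
```

The point of the window is that a core with a 32-bit address space can still reach storage far larger than it can map at once, at the cost of a remap (and the associated flushing) whenever it crosses a segment boundary.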

• Core’s memory interface bandwidth is limited
  • One outstanding memory operation

[Table: Frequency (MHz) | Cache line write latency (cycles) | Maximum bandwidth (MB/s), with rows for cached and uncached access; surviving value: 124.2]
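With only one memory operation in flight, the bandwidth ceiling follows directly from the latency: each cache line must complete before the next one starts, so bandwidth = line size × frequency / latency. The C helper below is a sketch of that arithmetic; the function name is hypothetical and the inputs in the usage note are illustrative, not the measurements from the slide.

```c
/*
 * Upper bound on bandwidth when only one cache-line transfer can be
 * outstanding at a time.
 *
 *   core_mhz       core clock in MHz (10^6 cycles per second)
 *   latency_cycles cycles needed to complete one cache-line transfer
 *   line_bytes     cache line size in bytes
 *
 * Returns MB/s, with 1 MB taken as 10^6 bytes.
 */
double max_bandwidth_mb_s(double core_mhz, double latency_cycles,
                          double line_bytes) {
    /* transfers per second = core_mhz * 10^6 / latency_cycles */
    return line_bytes * core_mhz / latency_cycles;
}
```

For example, a hypothetical 800 MHz core with 32-byte lines and a 100-cycle write latency tops out at 32 × 800 / 100 = 256 MB/s, no matter how fast the memory controller behind it is.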

• Memory controller bandwidth is limited

• The SCC is an experimental platform tailored for message passing
  • Lack of cache coherency makes us think hard about message passing
  • The chip has enough cores to play with scalability
• Compare apples to apples
  • The cache and memory subsystems are significantly different
  • The SCC is super parallel, not super fast
