Sun Starfire: Extending the SMP Envelope Presented by Jen Miller 2/9/2004.

Presentation transcript:

2 Starfire Overview
- 24 to 64 processors
  - Maximum of 4 processors per board
- Based on a UMA, SMP snooping architecture
- Design focus on the interconnect

3 System
- 250 MHz processors with 4 MB external caches
- 16 x 16 data crossbar
- Active centerplane
- Point-to-point routing
- 4 GB memory separated into 4 banks
  - 4-way interleaved address bus
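To illustrate why a 16 x 16 crossbar offers more bandwidth than a single shared bus, here is a minimal sketch of crossbar scheduling. It assumes a simple greedy policy in which each source port and each destination port carries at most one transfer per cycle. The function name `schedule_crossbar` and the policy itself are hypothetical and are not taken from the Starfire design.

```python
def schedule_crossbar(requests, ports=16):
    """Greedily pack (src, dst) transfers into cycles.

    Hypothetical policy: within one cycle, every source port and every
    destination port may be used at most once. Returns a list of cycles,
    each a list of the (src, dst) pairs granted in that cycle.
    """
    pending = list(requests)
    cycles = []
    while pending:
        busy_src, busy_dst = set(), set()
        granted, deferred = [], []
        for src, dst in pending:
            if not (0 <= src < ports and 0 <= dst < ports):
                raise ValueError(f"port out of range: {(src, dst)}")
            if src in busy_src or dst in busy_dst:
                # Conflicts with a transfer already granted this cycle.
                deferred.append((src, dst))
            else:
                busy_src.add(src)
                busy_dst.add(dst)
                granted.append((src, dst))
        cycles.append(granted)
        pending = deferred
    return cycles


# Two transfers to different boards share a cycle; the third waits
# because its destination port is already in use.
print(schedule_crossbar([(0, 1), (2, 3), (4, 1)]))
```

On a single shared bus the same three transfers would need three cycles, since only one can be granted at a time.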

4 Interconnect
- Point-to-point routing: the centerplane transfers addresses and data between boards
  - Higher latency than a traditional bus
  - Better bandwidth, reliability, and availability
- 2-cycle address transactions
  - Address bus selected by the 2 low-order cache-block address bits
- Data transactions
  - Waiting packets are buffered and sent in 8 cycles buffer-to-buffer
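The bus-selection rule on this slide can be sketched as follows. The sketch assumes that the "2 low-order cache bits" are the two low-order bits of the cache-block number and that a cache block is 64 bytes. The block size and the name `address_bus_for` are assumptions made for illustration; the slide states neither.

```python
CACHE_BLOCK_BYTES = 64   # assumption: block size is not given on the slide
NUM_ADDRESS_BUSES = 4    # from the slide: 4-way interleaved address bus


def address_bus_for(physical_address):
    """Pick an address bus from the two low-order bits of the block number."""
    block_number = physical_address // CACHE_BLOCK_BYTES
    return block_number % NUM_ADDRESS_BUSES


# Consecutive cache blocks rotate across the four buses.
for addr in (0x000, 0x040, 0x080, 0x0C0, 0x100):
    print(hex(addr), "-> bus", address_bus_for(addr))
```

Interleaving on low-order block bits spreads sequential accesses evenly, so each bus snoops roughly a quarter of the address traffic.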

5 Dynamic System Domains
- Starfire can be dynamically subdivided into multiple computers
- Each domain is a separate shared-memory SMP system
  - Errors are confined to the domain
- Great for testing and development
- Starfire can replace multiple smaller systems
- Domains can be created for special functions
- Implemented in the centerplane and system boards via registers
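One way to picture register-based domains is a per-board bitmask naming the boards that share its domain. The sketch below is a hypothetical model, not the actual Starfire register layout: the 16-board count follows from 64 processors at 4 per board, while `build_domain_masks` and `can_see` are names invented for this example.

```python
NUM_BOARDS = 16  # 64 processors / 4 processors per board


def build_domain_masks(domains):
    """Return one bitmask per board; bit b is set if board b shares the domain.

    `domains` is a list of board-number lists. Boards left out of every
    domain get a mask of 0 (hypothetical convention: isolated).
    """
    masks = [0] * NUM_BOARDS
    assigned = set()
    for boards in domains:
        mask = 0
        for board in boards:
            if not 0 <= board < NUM_BOARDS:
                raise ValueError(f"no such board: {board}")
            if board in assigned:
                raise ValueError(f"board {board} is in two domains")
            assigned.add(board)
            mask |= 1 << board
        for board in boards:
            masks[board] = mask
    return masks


def can_see(masks, src, dst):
    """True if a transaction from board `src` is visible to board `dst`."""
    return bool((masks[src] >> dst) & 1)


masks = build_domain_masks([[0, 1, 2], [3, 4]])
print(can_see(masks, 0, 2))  # same domain
print(can_see(masks, 0, 3))  # different domains
```

Filtering on a mask like this is what would keep an error in one domain from reaching boards in another.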

6 Reliability
- ECC for data transfers and address packets
- Optional hardware redundancy
  - Auto-reboot crash recovery
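As a reminder of how ECC lets a receiver repair a corrupted transfer, here is a minimal sketch using the textbook Hamming(7,4) code, which corrects any single flipped bit. The slide does not say which code Starfire uses, so this is only an illustration of the general technique; the function names are hypothetical.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits as a 7-bit codeword (list of bits, positions 1..7)."""
    data = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                      # index 0 unused; positions 1..7
    code[3], code[5], code[6], code[7] = data
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions 4,5,6,7
    return code[1:]


def hamming74_decode(bits):
    """Return (nibble, syndrome); a nonzero syndrome is the corrected position."""
    code = [0] + list(bits)
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position        # XOR of the positions of all set bits
    if syndrome:
        code[syndrome] ^= 1             # flip the single bad bit back
    data = [code[3], code[5], code[6], code[7]]
    return sum(bit << i for i, bit in enumerate(data)), syndrome


word = hamming74_encode(0b1011)
word[4] ^= 1                            # corrupt position 5 in transit
print(hamming74_decode(word))
```

Production ECC typically protects wider words and also detects double-bit errors, but the syndrome idea is the same.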

7 Performance
- Bandwidth increased dramatically over prior generations
- Unix server flexibility with Dynamic System Domains
- Reliable, available, and serviceable
- “Can match or exceed performance of other parallel architectures for a lower system cost”