STARFIRE: Extending the SMP Envelope
Alan Charlesworth, Sun Microsystems
Presented by Mahmut Yilmaz, 01/30/2006
Introduction
Cache-coherent symmetric multiprocessor architecture (24-64 UltraSPARC II microprocessors running at 250 MHz, sharing at most 64 GB of memory)
Goals:
  Increase system memory bandwidth
  Reduce memory latency as much as possible
  Provide Unix server flexibility (Dynamic System Domains)
  Improve system reliability, availability, and serviceability
How? Multiple outstanding cache misses, split-transaction buses, separate address and data paths, wider data paths, higher system clocks…
Figure: Starfire cabinet¹
¹ http://www.filibeto.org/~aduritz/truetrue/e10000/starfire-interconnect.html
Improvements in Bus Capacity
Bus snooping rate increased from 2.5M/s (1991) to 167M/s (1997). How?
  Bus clock rate: 40 MHz → 100 MHz
  Circuit-switched protocol → packet-switched protocol (separates requests from replies)
  Separate address and data buses
  Interleaved multiple snoop buses (4 address buses)
  Cache block size: 32 bytes → 64 bytes
  Data bus width: 8 bytes → 16 bytes
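A back-of-the-envelope check of the figures on this slide can be sketched as follows. It assumes, as a simplification not stated on the slide, that every snoop corresponds to one full cache-block transfer, so the result is an upper bound on coherent data bandwidth rather than a measured value. The function and constant names are illustrative only.

```python
# Figures taken from the slide.
SNOOPS_1991 = 2.5e6      # snoops per second, 1991 bus
SNOOPS_1997 = 167e6      # snoops per second, 1997 interconnect
BLOCK_BYTES = 64         # cache block size in bytes
ADDRESS_BUSES = 4        # interleaved snoop (address) buses


def peak_coherent_bandwidth(snoops_per_second: float, block_bytes: int) -> float:
    """Upper bound in bytes/s, ASSUMING one full block moves per snoop."""
    return snoops_per_second * block_bytes


def snoop_rate_per_bus(snoops_per_second: float, buses: int) -> float:
    """Share of the aggregate snoop rate carried by each interleaved bus."""
    return snoops_per_second / buses


bandwidth = peak_coherent_bandwidth(SNOOPS_1997, BLOCK_BYTES)
print(f"Snoop-rate improvement: {SNOOPS_1997 / SNOOPS_1991:.1f}x")
print(f"Per-bus snoop rate:     {snoop_rate_per_bus(SNOOPS_1997, ADDRESS_BUSES) / 1e6:.2f} M/s")
print(f"Peak bandwidth bound:   {bandwidth / 1e9:.1f} GB/s")
```

Under that assumption, 167M snoops/s at 64 bytes per block bounds the coherent data bandwidth at roughly 10.7 GB/s.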
System
STARFIRE Interconnect
Ultra Port Architecture interconnect (write-back MOESI: X-Modified, S-Modified, X-Clean, S-Clean, Invalid)
Packet-switched data transactions with ECC (16 bytes of data + 2 bytes of ECC)
Two-level interconnect
  On-board: processors, SBus cards, memory; address and data ports
  Centerplane: transfers addresses and data between boards
2-clock-cycle address transaction
The 2 low-order cache-block address bits determine which address bus to use
8-clock-cycle board-to-board data transfer
Buses vs. point-to-point routers?
  Buses: faster
  Point-to-point routers: bandwidth, partitioning, reliability, availability, and serviceability
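The address-bus interleaving described on this slide can be illustrated with a minimal sketch. It assumes that "cache-block address" means the physical byte address with the 64-byte block offset removed, so the two selecting bits would be bits 6 and 7 of the byte address. That bit placement is an interpretation of the slide, not a confirmed hardware detail, and `address_bus_for` is a hypothetical helper name.

```python
CACHE_BLOCK_BYTES = 64   # cache block size from the slides
NUM_ADDRESS_BUSES = 4    # interleaved address buses from the slides


def address_bus_for(physical_address: int) -> int:
    """Return the index (0-3) of the address bus assumed to carry this address."""
    # Dropping the byte offset within the block leaves the cache-block address.
    block_address = physical_address // CACHE_BLOCK_BYTES
    # The two low-order bits of the block address select one of the four buses.
    return block_address % NUM_ADDRESS_BUSES


# Consecutive cache blocks rotate across the four buses.
for addr in (0x0000, 0x0040, 0x0080, 0x00C0, 0x0100):
    print(f"address {addr:#06x} -> address bus {address_bus_for(addr)}")
```

With this mapping, a sequential sweep through memory spreads its snoops evenly over all four buses, which is the point of interleaving.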
Interconnect Reliability
ECC on both the address and data buses (with ASIC support)
Failed components: the system attempts to recover without any service interruption
Redundant components: optional
Crash recovery: Automatic System Recovery (ASR), which requires redundant components
Dynamic System Domains
Starfire can be subdivided into domains made up of boards, using the System Service Processor:
  Each domain is a fully isolated SMP
  Useful for many applications: LAN consolidation; development, production, and test; software migration; special I/O or network functions …
Picture: http://www.filibeto.org/~aduritz/truetrue/e10000/starfire-interconnect.html
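The isolation property of domains can be modelled with a small sketch: every board belongs to at most one domain, so no two domains share hardware. This is a toy model only. The function name, the domain names, and the default of 16 boards are assumptions made for illustration, and none of it reflects the System Service Processor's actual interface.

```python
from typing import Dict, Set


def validate_domains(domains: Dict[str, Set[int]], num_boards: int = 16) -> Dict[int, str]:
    """Check that the domains are disjoint sets of valid board numbers.

    Returns a board -> domain map, or raises ValueError on a conflict.
    num_boards=16 is an ASSUMED board count, not a figure from the slides.
    """
    owner: Dict[int, str] = {}
    for name, boards in domains.items():
        for board in sorted(boards):
            if not 0 <= board < num_boards:
                raise ValueError(f"{name}: board {board} does not exist")
            if board in owner:
                raise ValueError(
                    f"board {board} is assigned to both {owner[board]} and {name}"
                )
            owner[board] = name
    return owner


# Hypothetical split: three separate SMPs carved out of one machine.
layout = {
    "production": {0, 1, 2, 3},
    "development": {4, 5},
    "test": {6},
}
print(validate_domains(layout))
```

Assigning the same board to two domains raises an error, mirroring the requirement that each domain is an isolated SMP.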
Price/Performance
Throughput is higher than that of other HPC computers of similar cost
Easy administration, reliable and serviceable
Figure: Starfire cost²
² Alan Charlesworth, Nicholas Aneshansley, Mark Haakmeester, Dan Drogichen, Gary Gilbert, Ricki Williams, Andrew Phelps, “The Starfire SMP Interconnect”
Questions
Is the price/performance comparison to other HPC computers fair?
What if we have a cluster of Starfire servers?
Uniform latencies?
What is the performance cost of adding redundancy?
Centerplane: single point of failure? Can we replace the centerplane?
Fair Comparison?
Graph: Alan Charlesworth, Nicholas Aneshansley, Mark Haakmeester, Dan Drogichen, Gary Gilbert, Ricki Williams, Andrew Phelps, “The Starfire SMP Interconnect”