PSC
BigBen Features

Compute Nodes
  – 2068 nodes running the Catamount (QK) microkernel
  – SeaStar interconnect in a 3-D torus configuration
  – No external connectivity (no TCP)
  – All inter-node communication is over Portals
  – Applications use MPI, which is based on Portals
Service & I/O (SIO) Nodes
  – 22 nodes running SUSE Linux
  – Also on the SeaStar interconnect
  – SIO nodes can have PCI-X hardware installed, defining unique roles for each
  – 2 SIO nodes are externally connected to ETF with 10GigE cards (currently)

Portals Direct I/O (PDIO) Details

Portals-to-TCP routing
  – PDIO daemons aggregate hundreds of Portals data streams into a configurable number of outgoing TCP streams
  – Heterogeneous Portals (both QK + Linux nodes)
Explicit Parallelism
  – Configurable # of Portals receivers (on SIO nodes)
      Distributed across multiple 10GigE-connected Service & I/O (SIO) nodes
  – Corresponding # of TCP streams (to the WAN)
      One per PDIO daemon
  – A parallel TCP receiver in the Goodhue booth
      Supports a variable/dynamic number of connections
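The aggregation described above (many Portals writers feeding a small, configurable number of daemons, each with one outgoing TCP stream) can be sketched as a simple assignment problem. This is only an illustration: the slides do not say how writers are mapped to daemons, so the round-robin policy and the name `assign_writers` are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def assign_writers(num_writers: int, num_daemons: int) -> Dict[int, List[int]]:
    """Map writer ranks to daemon indices round-robin.

    Round-robin is an assumed policy for illustration; the actual PDIO
    assignment scheme is not described in the slides.
    """
    if num_daemons < 1:
        raise ValueError("need at least one PDIO daemon")
    streams = defaultdict(list)
    for rank in range(num_writers):
        # Each daemon owns exactly one outgoing TCP stream, so the daemon
        # index also identifies the TCP stream the writer's data ends up on.
        streams[rank % num_daemons].append(rank)
    return dict(streams)


mapping = assign_writers(num_writers=10, num_daemons=4)
for daemon, ranks in sorted(mapping.items()):
    print(f"daemon {daemon} (one TCP stream) <- writers {ranks}")
```

With 10 writers and 4 daemons, daemon 0 receives writers 0, 4, and 8, and so on; changing `num_daemons` changes the number of TCP streams without touching the writers.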
Portals Direct I/O (PDIO) Details

Utilizing the ETF network
  – 10GigE end-to-end
  – Benchmarked >1Gbps in testing
Inherent flow-control feedback to application
  – Aggregation protocol allows TCP transmission or even remote file system performance to throttle the data streams coming out of the application (!)
Variable message sizes and file metadata supported
Multi-threaded ring buffer in the PDIO daemon
  – Allows the Portals receiver, TCP sender, and computation to proceed asynchronously
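The flow-control idea above can be illustrated with a bounded buffer shared by a receiver thread and a sender thread. This is a minimal sketch, not the PDIO daemon's code: Python's `queue.Queue` (a bounded FIFO) stands in for the ring buffer, and `receiver`, `sender`, `run`, and `RING_SLOTS` are hypothetical names. When the sender is slow, the buffer fills and the receiver blocks, which is how downstream speed can throttle the data source.

```python
import queue
import threading

RING_SLOTS = 4  # hypothetical ring length; the real value is user-tunable


def receiver(ring: queue.Queue, messages) -> None:
    """Stand-in for the Portals receiver thread."""
    for msg in messages:
        ring.put(msg)  # blocks while the buffer is full: backpressure
    ring.put(None)     # sentinel: no more data


def sender(ring: queue.Queue, sent: list) -> None:
    """Stand-in for the TCP sender thread."""
    while True:
        msg = ring.get()
        if msg is None:
            break
        sent.append(msg)  # a real daemon would write to a TCP socket here


def run(messages, slots: int = RING_SLOTS) -> list:
    ring = queue.Queue(maxsize=slots)
    sent = []
    threads = [
        threading.Thread(target=receiver, args=(ring, messages)),
        threading.Thread(target=sender, args=(ring, sent)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sent


result = run([f"block-{i}".encode() for i in range(16)])
print(len(result), "messages forwarded in order")
```

Because the two threads only meet at the buffer, receiving and sending proceed asynchronously until the buffer is either empty or full.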
Portals Direct I/O (PDIO) Config

User-configurable/tunable parameters:
  – Network targets
      Can be different for each job
  – Number of streams
      Can be tuned for optimal host/network utilization
  – TCP network buffer size
      Can be tuned for maximum throughput over the WAN
  – Ring buffer size/length
      Controls total memory utilization of PDIO daemons
  – Number of Portals writers
      Can be any subset of the running application’s processes
  – Remote filename(s)
      File metadata are propagated through the full chain, per write
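The tunable parameters listed above could be grouped as in the following sketch. The slides do not show PDIO's actual configuration format, so every field name, the example host, and the example values are hypothetical; the memory figure assumes one ring buffer per daemon with fixed-size slots.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PdioConfig:
    """Hypothetical grouping of the tunables; not PDIO's real config format."""
    targets: List[str]       # network targets, may differ per job
    num_streams: int         # number of TCP streams (one per daemon)
    tcp_buffer_bytes: int    # TCP network buffer size for the WAN
    ring_slots: int          # ring buffer length, in slots
    ring_slot_bytes: int     # size of each ring buffer slot
    writer_ranks: List[int]  # subset of application processes that write
    remote_filename: str     # propagated as file metadata with each write

    def ring_memory_per_daemon(self) -> int:
        # Assumes a single ring of fixed-size slots in each daemon.
        return self.ring_slots * self.ring_slot_bytes

    def validate(self, job_size: int) -> None:
        if not self.targets:
            raise ValueError("at least one network target is required")
        if self.num_streams < 1:
            raise ValueError("num_streams must be >= 1")
        if len(set(self.writer_ranks)) != len(self.writer_ranks):
            raise ValueError("writer ranks must be unique")
        if any(r < 0 or r >= job_size for r in self.writer_ranks):
            raise ValueError("writer ranks must be processes of the job")


cfg = PdioConfig(
    targets=["recv.example.org:5000"],    # placeholder host and port
    num_streams=4,
    tcp_buffer_bytes=4 * 1024 * 1024,
    ring_slots=32,
    ring_slot_bytes=1024 * 1024,
    writer_ranks=list(range(0, 512, 2)),  # every other process writes
    remote_filename="frame.dat",
)
cfg.validate(job_size=512)
print(cfg.ring_memory_per_daemon())  # 33554432 bytes (32 MiB)
```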
[Animated diagram; labels: PSC, Compute Nodes, I/O Nodes, pdiod, ETF network, iGRID, recv, render, Steering. Captions in build order:]
  – HPC resource and renderer waiting…
  – Launch PPM job, PDIO daemons, and iGRID recv’ers
  – Aggregate data via Portals
  – Route traffic to ETF net
  – Receive at iGRID
  – Render real-time data
  – Send steering data back to active job input
  – Dynamically update rendering input