Peta-Cache, Mar30, 2006 V1

Peta-Cache: Electronics Discussion II
Presentation
Ryan Herbst, Mike Huffer, Leonid Saphoznikov, Gunther Haller (650)

Flash Storage Option

Skim builder as in the option discussed earlier
Event server (cache box) as in the other option
Shown as two boxes for simplicity; could be in one box (there are pros and cons)
Issue is again interconnect speed
Up to 16 1-Terabyte Flash boxes for each event server
–Each PCI-E lane: 256 MByte/sec
–16 lanes give a total bandwidth of 4 GBytes/sec
Each Flash box holds only a fraction of the total event store
Flash has limited write cycles, so it can’t be rewritten frequently (need to enforce with some policy which is most important)
But don’t really want to “burn” the results of a skim into flash, since the goal is to make own lists (and flash can’t be reburned at will anyway)
Flexibility:
–One event server can have a subset of the list, and events go to the client
–Or, better, have the total of event servers act as “one” cache, with the event store managed so that parts of the list which are in other pizza boxes are kept in that cache as opposed to discarded
Question is again how to populate the Flash most effectively
Decompression in event server
Flash bad-block management in event server
Reed-Solomon EDAC in event server
Can consider without cache box: with 4000 clients going after the same block, the last one to get data gets it ~300 msec later

[Diagram labels: Tape; Disk Storage; Skim builder(s); Event Server; Flash Storage 1 to 16; PCI-E 1 to 16; Ethernet/PCI-E/etc; Client (1, 2, or 4 core); Up to 1,500 cores in ~800 units?; Optionally direct IO]
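
The interconnect figures on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration. The function names are invented here, and the block size passed to the delivery-time helper is a hypothetical value, since the slide gives the client count but not the block size. The helper also assumes that clients are served strictly one after another at the full link rate, which the slide does not state, so it is not an attempt to reproduce the ~300 msec figure.

```python
LANE_MBYTES_PER_SEC = 256   # per-lane PCI-E rate quoted on the slide
LANES = 16                  # lanes per event server quoted on the slide


def aggregate_mbytes_per_sec(lanes=LANES, per_lane=LANE_MBYTES_PER_SEC):
    """Total bandwidth if every lane runs at the quoted per-lane rate."""
    return lanes * per_lane


def last_client_delay_sec(clients, block_mbytes, link_mbytes_per_sec):
    """Time until the last client has its copy of the block.

    Assumption (not from the slide): the same block is sent to each
    client in turn, back to back, at the full link rate.
    """
    return clients * block_mbytes / link_mbytes_per_sec


total = aggregate_mbytes_per_sec()
print(total, "MByte/sec")   # 4096, i.e. the 4 GBytes/sec on the slide

# Hypothetical block size of 0.25 MByte; only the client count is from the slide.
print(last_client_delay_sec(4000, 0.25, total), "sec")
```

Changing the block size or the link rate in the last call shows how strongly the last-client delay depends on both, which is why the interconnect speed is flagged as the issue.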

Flash-Box, Event Box

Flash Memory Box
8-, 16-, or 32-Gbit NAND devices
For 1 Terabyte, need 250 32-Gbit devices
–All on board, or
–32-GByte memory cards (DIMM)
»Need > 30 DIMMs
Preliminary placement on a 19-inch rack PCB shows that we can fit 1 Terabyte on a single board
–PCI-E to PCI-X bridge (to get 64-bit addressing space)
–No smarts in here
Event (Pizza) Box
–8 F40 Xilinx (each has MHz PPC’s)
–16 GBytes of RLDRAM2
–8 PLX8508 PCI-E switch 5-ports
–2 PLX port switch (32 lanes)
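
The device counts on this slide follow from dividing the target capacity by the per-device capacity. The short sketch below reproduces that arithmetic. It assumes decimal units (1 Terabyte = 1000 GBytes), which is what the slide's figure of 250 devices implies, and the function name is hypothetical.

```python
import math

TARGET_GBYTES = 1000  # 1 Terabyte in decimal GBytes (assumption, matches the 250 count)


def devices_needed(capacity_gbytes, device_gbits):
    """Number of NAND devices needed, rounding up to whole devices."""
    device_gbytes = device_gbits / 8  # 8 bits per byte
    return math.ceil(capacity_gbytes / device_gbytes)


for gbits in (8, 16, 32):
    print(f"{gbits}-Gbit devices: {devices_needed(TARGET_GBYTES, gbits)}")

# 32-GByte DIMMs needed for the same capacity (the slide says "> 30")
print("32-GByte DIMMs:", math.ceil(TARGET_GBYTES / 32))
```

With 32-Gbit parts this gives the 250 devices quoted on the slide, and the DIMM option comes out at 32 cards, consistent with "> 30".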

Flash & Event Server Boxes

4-Gbit chips: $30; 8-Gbit chips: $60
4-GByte device quote: $110, min qty 1000 (is a stack of four 1-GB dies)
1 Petabyte: 1,000 boxes, total $27 million

Flash Box
  Bridge Chips (total of 16)                      $500
  Misc (box, board, loading, regulators, etc.)    $400
  1 TByte Flash (250 x (4-GByte ~ $110))          $27,000

Event Server
  Xilinx's (8 each)                               $500
  Local RLDRAM2 (16 GBytes)                       $3,200
  Misc (box, board, loading, regulators, etc.)    $400
  Misc Switches                                   $500
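
The cost figures can be rolled up per box as a rough check. The sketch below makes two assumptions that are not stated on the slide: the line items are assigned to the Flash Box or the Event Server according to the components listed for each box on the previous slide, and each dollar amount is treated as the total for that line rather than a per-unit price. The per-box totals it prints are therefore derived here, not quoted from the presentation.

```python
# Line items in dollars, as listed on the slide (grouping is an assumption).
flash_box = {
    "bridge chips (total of 16)": 500,
    "misc (box, board, loading, regulators)": 400,
    "1 TByte flash": 27_000,
}
event_server = {
    "Xilinx (8)": 500,
    "local RLDRAM2 (16 GBytes)": 3_200,
    "misc (box, board, loading, regulators)": 400,
    "misc switches": 500,
}


def box_total(items):
    """Sum of all line items for one box."""
    return sum(items.values())


print("Flash box total:   ", box_total(flash_box))     # 27900
print("Event server total:", box_total(event_server))  # 4600

# Flash alone: 250 devices at the quoted ~$110 each, versus the $27,000 on the slide.
print("250 x $110:", 250 * 110)                         # 27500

# 1 Petabyte = 1,000 flash boxes at $27,000 of flash each.
print("Flash for 1 PB:", 1000 * flash_box["1 TByte flash"])  # 27000000
```

The flash devices dominate the cost: 250 devices at $110 come to $27,500, which the slide rounds to $27,000 per box and $27 million for 1,000 boxes.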

Pizza box block diagram (needs some modification)

[Block diagram labels: (Out) PCI Express; x4; PPC 405; Xilinx XC4VFX40; RLDRAM II 1 Gbyte; PLX8508; PLX8532; x16; (In) PCI Express; x16; x4; PLX8532; PLX8508]

Event Processing Center

[Diagram labels: switch; file system; fabric; sea of cores; fabric; switch(s); HPSS; pizza box; out protocol conversion; Event processing node; disks; pizza box as skim builder; in protocol conversion]