gStore: GSI Mass Storage ITEE-Palaver GSI Horst Göringer, Matthias Feyerabend, Sergei Sedykh

Overview
1. adsmcli: ended after 10 years of operation
2. tsmcli: modern concept
3. gStore (gstore): unified user interface
4. rearrangement of storage
5. gStore projects
6. final remarks

adsmcli: initial system
Software:
1. ADSM: ADSTAR Distributed Storage Manager
   commercial product, handles ATL and tapes
2. GSI Software:
   – interface to users
   – API to ADSM

adsmcli: initial system
Hardware 1996:
– AIX server
– ATL: IBM
    tape drives IBM 3590: 14 MB/s, 10 GB/volume
    max 23 TByte
– few GB disk (write) cache ADSM
– 80 GB read cache (1998)

adsmcli: overview (figure)

adsmcli: early usage (figure)

adsmcli: the best year (figure)

adsmcli limitations
Restrictions:
– bottleneck: server
– no scalability:
    data capacity (cache)
    I/O bandwidth
– missing write cache
frozen since 2001
– only read cache upgrade 2003: 1.2 TB

tsmcli: concepts
Concepts:
separation of control and data flow:
– data flow: Data Mover (DM)
– control flow: TSM Server, Entry Server
many DMs => many parallel data streams
SAN: Storage Area Network
Cache Manager: read and write cache
direct DAQ connection to gStore
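
The separation of control and data flow can be illustrated with a minimal sketch. It is not the gStore implementation: the classes `EntryServer` and `DataMover`, the round-robin assignment, and the mover names are all assumptions made for illustration. The point it shows is that the control component only decides which data mover serves a file, while the file contents travel directly to the movers, so several transfers can run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class DataMover:
    """Hypothetical stand-in for a data mover with its own disk cache."""
    name: str
    cache: dict = field(default_factory=dict)

    def write(self, filename: str, payload: bytes) -> int:
        # Data flow: the payload goes straight into this mover's cache.
        self.cache[filename] = payload
        return len(payload)


class EntryServer:
    """Hypothetical control-flow component: it only hands out movers."""

    def __init__(self, movers):
        self.movers = list(movers)
        self.next_index = 0

    def assign(self, filename: str) -> DataMover:
        # Control flow: pick a mover round-robin; no file data passes here.
        mover = self.movers[self.next_index % len(self.movers)]
        self.next_index += 1
        return mover


def store_files(entry: EntryServer, files: dict) -> int:
    """Assign movers one by one, then run all transfers in parallel."""
    plan = [(entry.assign(name), name, data) for name, data in files.items()]
    with ThreadPoolExecutor(max_workers=len(entry.movers)) as pool:
        sizes = pool.map(lambda job: job[0].write(job[1], job[2]), plan)
        return sum(sizes)
```

With two movers and three files, the first and third file land on the first mover and the second file on the other one. Adding movers increases the number of parallel streams without changing the control component.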

tsmcli concept (figure)

tsmcli: storage view (figure)

tsmcli: usage
tsmcli in production since January 2003
– in parallel to adsmcli
– initially only for 'large' experiments
write cache: since February 2005
– for 'normal' clients: command tsmcli, RFIO API
– for DAQ clients (RFIO, write only)

tsmcli hardware 2007
server: Windows 2000 cluster
ATL: Sun StorageTek L700
– 9 tape drives LTO2:
– 35 MByte/s, 200 GByte/vol
– max 140 TByte
data mover:
– 10 Windows (gsidm0-9), 4 TB disk cache
– 5 Linux (slxdm01-5), 13 TB disk cache

tsmcli usage 2006 (figure)

tsmcli usage 2007 (figure)

gStore top load
top data transfer in 2006: Dec 31
overall: 9.6 TB in 24 h
– 111 MB/s on average
slxdm01: 2.9 TB in 24 h
– 33.6 MB/s on average
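
The average rates quoted above follow from dividing the daily volume by the 86,400 seconds in a day. The short check below assumes decimal units (1 TB = 10^6 MB), which reproduces the slide's figures; the function name is chosen here for illustration only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 s


def average_rate_mb_per_s(terabytes: float, seconds: int = SECONDS_PER_DAY) -> float:
    """Average transfer rate in MB/s, assuming decimal units (1 TB = 10**6 MB)."""
    return terabytes * 1_000_000 / seconds


print(round(average_rate_mb_per_s(9.6)))     # overall: 111
print(round(average_rate_mb_per_s(2.9), 1))  # slxdm01: 33.6
```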

common mass storage interface
coexistence of 2 mass storage systems: interim solution (ca. 4 years)
=> common new interface gstore:
– replacing adsmcli and tsmcli
– successfully in operation since May 23
(considerable) enhancement of tsmcli SW: access to 2 independent TSM servers and attached DMs/disk caches
– further scalability aspect!

storage status mid
ATLs:
1. IBM 3494 (3590 tapes):
   – 50 TB experiment data (adsmcli)
   – 15 TB backup data
   – nearly filled
2. Sun StorageTek L700 (LTO2 tapes):
   – 120 TB experiment data (tsmcli)
   – max 140 TB => nearly filled
3. Sun StorageTek L700 (LTO1 tapes):
   – 38 TB backup data
   – max 70 TB

requirements
1. substantially more data capacity
   – 4 new tape drives IBM 3592 for 3494 ATL
2. separate experiment and backup data:
   experiment data -> IBM 3494
   backup data -> LTO2 ATL
3. safe long term storage
   upgrade LTO1 ATL -> LTO3
   deploy in 'remote RZ' (RZ: Rechenzentrum, computing center)

gstore hardware
1. server: AIX
   – ATL: IBM
       tape drives IBM 3592: 100 MByte/s, 700 GByte/vol
       max 1.6 PB
   – data mover:
       5 Linux (slxdm01-5), 13 TB disk cache
       3 Linux (slxdm06-8), 17 TB disk cache
2. server: Windows

actions:
move all existing experiment data to IBM 3592 tapes in 3494 ATL
– 50 TB from 3590 media: finished (adsmcli data)
  => old 3590 hardware/media replaced
– 130 TB from LTO2 media: 40 TB done (tsmcli data)
write all new experiment data to 3494 ATL:
– since May 23

actions:
redirect all new backup data to LTO2 media
– new pair of Linux TSM servers
– in progress
move current backup data to LTO2 media
– mainly user archives
– from LTO1 and 3590 media
– still to be done

open projects
xrootd:
– in test environments gStore access for xrootd clients available
– still open: stability of xrdcp, functionality of POSIX ls

open projects
Grid SRM (Storage Resource Manager):
– several types of SRMs installed worldwide
– common: no general mass storage interface
– currently under investigation for connection with gStore: Berkeley SRM ('BeStMan')

open projects
2nd level DM:
– no SAN connection
– filled via LAN from 1st level DM
– inexpensive extension of gStore read cache: for data needed online for longer time scales (weeks/months)
– no NFS: use gstore query/retrieve
  e.g. xrootd: enable full file information for new /d file servers !?

user requests
gStore enhancements:
– staging large sets of files: equal distribution on all DMs (1st or 2nd level)
    stage -distr
    stage -distr -L2
– recursive access
    query/stage/retrieve -r path
– rename path/file
– files > 2 GB
– ...
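
The slide does not say how the requested "equal distribution on all DMs" would be computed. One plausible reading, sketched below purely as an assumption, is to balance by total size: handle files from largest to smallest and always place the next file on the data mover with the least data assigned so far. The function name `distribute` and the mover names in the usage note are hypothetical.

```python
import heapq


def distribute(files: dict[str, int], movers: list[str]) -> dict[str, list[str]]:
    """Greedy size balancing: largest file first, onto the least-loaded mover.

    `files` maps file name -> size; `movers` lists data mover names.
    This is an illustrative heuristic, not the behavior of any gStore command.
    """
    plan = {mover: [] for mover in movers}
    # Heap entries: (bytes assigned so far, tie-break index, mover name).
    heap = [(0, index, mover) for index, mover in enumerate(movers)]
    heapq.heapify(heap)
    for name, size in sorted(files.items(), key=lambda kv: (-kv[1], kv[0])):
        load, index, mover = heapq.heappop(heap)
        plan[mover].append(name)
        heapq.heappush(heap, (load + size, index, mover))
    return plan
```

For example, `distribute({"a": 5, "b": 4, "c": 3, "d": 2}, ["dm1", "dm2"])` places `a` and `d` on `dm1` and `b` and `c` on `dm2`, giving both movers a total of 7. A simple round-robin by file count would be an equally valid reading when file sizes are similar.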

Final Remarks I
currently ca. 180 TB of exp. data on tape (+50 TB raw data backup)
1.6 – 2 PB max tape capacity
I/O bandwidth
– > 1 GB/s cache clients – tape
Hades DAQ end 2008: 200 MB/s => more tape drives needed
35 TB disk cache (1st level)
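
The remark that more tape drives are needed can be made concrete with a rough lower bound: the required data rate divided by the nominal rate of one drive, rounded up. The sketch below uses the nominal drive rates quoted earlier in the talk (100 MByte/s for IBM 3592, 35 MByte/s for LTO2). It assumes every drive streams continuously at its nominal rate and serves only this one data stream, which is optimistic, so real requirements would be higher.

```python
import math


def drives_needed(required_mb_s: float, drive_mb_s: float) -> int:
    """Lower bound on drives needed to sustain a rate at nominal drive speed."""
    return math.ceil(required_mb_s / drive_mb_s)


print(drives_needed(200, 100))  # IBM 3592 at nominal 100 MB/s: 2
print(drives_needed(200, 35))   # LTO2 at nominal 35 MB/s: 6
```

Drives are also shared with read requests and data migration, which this estimate ignores.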

Final Remarks II
gStore fully scalable in data capacity and I/O bandwidth
– supports several TSM servers
gStore fully flexible in hardware (TSM)
in the past years:
– managed growth of more than an order of magnitude
– handled various hardware and platforms
gStore prepared for further growth (FAIR)
gStore adaptable for cooperation with external software packages