Improved Access to RDA from the MSS
OSD Executive Meeting, April 28, 2009
Current situation
  RDA, modes of access
    MSS to NCAR-connected machines
    SAN to the networked world (35 TB only)
    Personalized requests to DSS staff, to the world
      Ad hoc, one-off efforts, no automation, less-than-optimal metrics, requires strong initiative from users
  Quick drive-through
    Illustrate foundations for the new service
Registration
  To get RDA data a user MUST register
    Metadata and searching are freely open
    DB controlled
  Levels of registration control access
    Non-restricted data (first level)
    Restricted access requires additional registration, e.g. agreement to special usage conditions for JRA-25
Access offerings
MSS File Selection/Information Example: ECMWF, May 2008
Script to Read MSS Files
  Enable this process for users who don’t have NCAR MSS access (outside users)
RDA Dataset Request System (DSRQST)
  Written project description and requirements
  Highlight of requirements in three categories
    Users
    DSS Control
    Systems
User Requirements
  Prioritized service levels
    UCAR, U.S. Univ., U.S. Gov., etc.
  Only qualified users see the DSRQST interface
  All data restrictions strictly enforced
  Download options match RDA online
    Point and click
    ‘wget’ script
  Automatic notification when data is ready
  Password-protected download directories
  Unreasonable requests: pop-up window including fault description and options
    Most likely too much data
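The ‘wget’ script download option could be generated per request along the following lines. This is a minimal sketch only: the function name `build_wget_script`, the example URL, and the user name are hypothetical and not part of DSRQST. It relies on the standard wget options `-c` (resume a partial download), `--user`, and `--ask-password` to work with a password-protected download directory.

```python
import shlex


def build_wget_script(base_url, file_names, user):
    """Return a shell script that downloads every staged file of one request.

    base_url, file_names and user are illustrative inputs (assumptions),
    not actual DSRQST settings.
    """
    # Quote each URL so unusual file names cannot break the shell command.
    urls = [shlex.quote(f"{base_url.rstrip('/')}/{name}") for name in file_names]
    # One wget call for all files, so the password is requested only once.
    command = ["wget -c", f"--user={shlex.quote(user)}", "--ask-password", *urls]
    return "#!/bin/sh\n" + " \\\n  ".join(command) + "\n"


if __name__ == "__main__":
    print(build_wget_script("https://example.org/dsrqst/REQ1/",
                            ["file1.grb", "file2.grb"], "jdoe"))
```

A single wget invocation is used here so that the user is prompted for the password once rather than once per file.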
DSS Control Requirements
  System workflow settings
    1. Staff inspection of request, approval and start
    2. Fully automatic – staff still receives all workflow information
    Implementable at the dataset level
  Controlled initiation
    Service initiated by responsible staff
    Maximum download amount controlled at dataset level (~ default GB)
  ‘msrcp’ MSS files out to network attached systems
  Set priority process levels, 1-10
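The dataset-level controls above (workflow mode, maximum download amount, priority 1-10) can be illustrated with a small admission check. This is a sketch under stated assumptions: the `DatasetSettings` fields, the `admit_request` function, and the returned status strings are hypothetical names chosen for illustration, and the slides do not say which end of the 1-10 priority range is served first.

```python
from dataclasses import dataclass


@dataclass
class DatasetSettings:
    """Per-dataset controls (hypothetical field names)."""
    max_request_gb: float  # maximum download amount for one request
    auto_approve: bool     # False: staff inspects and approves first
    priority: int = 5      # process priority level, 1-10


def admit_request(settings, requested_gb):
    """Decide what happens to a new request for this dataset."""
    if not 1 <= settings.priority <= 10:
        raise ValueError("priority must be between 1 and 10")
    if requested_gb > settings.max_request_gb:
        # Corresponds to the 'unreasonable request' case: too much data.
        return "rejected: request exceeds dataset limit"
    # Either workflow mode keeps staff informed; only the start differs.
    return "queued" if settings.auto_approve else "awaiting staff approval"


if __name__ == "__main__":
    settings = DatasetSettings(max_request_gb=100.0, auto_approve=False)
    print(admit_request(settings, 20.0))
    print(admit_request(settings, 500.0))
```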
System Requirements
  Daemon controls number of processes
    Throttle impact on MSS
  Monitor temporary storage area
    Automatic scrubbing of staged data
      Default 5 days, more by request
    File transfer stops if temporary storage is near full
  All metrics logged as “one-off” requests
  Cross-check file availability from SAN
    Don’t transfer from MSS if already available online
    Corollary: share files between requests
  Restart capability – mid-stream recovery
  End-to-end control from DB settings
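Three of the rules above (reuse files already online, stop transfers when temporary storage is near full, scrub staged data after the default 5 days) can be sketched as pure functions. This is an illustration only: `plan_transfers`, `is_expired`, and the `reserve_bytes` threshold are assumed names and parameters, not the actual daemon logic.

```python
from datetime import datetime, timedelta


def plan_transfers(requested, online, free_bytes, reserve_bytes):
    """Split a request into files to reuse, fetch from the MSS, or defer.

    requested: dict of file name -> size in bytes
    online: set of file names already available on the SAN
    reserve_bytes: free space that must remain (assumed 'near full' threshold)
    """
    reuse, fetch, deferred = [], [], []
    stopped = False
    for name, size in requested.items():
        if name in online:
            reuse.append(name)        # already online: no MSS transfer
        elif not stopped and free_bytes - size >= reserve_bytes:
            fetch.append(name)
            free_bytes -= size
        else:
            stopped = True            # storage near full: stop transferring
            deferred.append(name)
    return reuse, fetch, deferred


def is_expired(staged_at, now, retention_days=5):
    """True if a staged file is older than its retention period."""
    return now - staged_at > timedelta(days=retention_days)


if __name__ == "__main__":
    files = {"a": 40, "b": 50, "c": 40, "d": 5}
    print(plan_transfers(files, {"a"}, free_bytes=100, reserve_bytes=20))
    print(is_expired(datetime(2009, 4, 1), datetime(2009, 4, 7)))
```

Deferred files stay in the request, so a later pass can pick them up once scrubbing has freed space; this is one possible reading of the restart requirement.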
Next steps
  Coding is underway, datasets for prototype are identified
    When – 1-2 months
  Implement data processing on files – data product preparation
    Possible – split request and use multiple processes
    Utilize CISL high-end computing
  Implement product write service onto the MSS for NCAR users
    Short retention period
End
  Questions
  Complex system
    DB – MySQL
    MSS read and write interactions
    HTML, PHP, JavaScript, Perl