By: Roman Olschanowsky An Introduction to the
Outline
SDSC and History of SRB
–Example Project
Introduction to SRB
–Discussion on SRB basics
–SRB Clients
Overview of a Data Grid
–Infrastructure
–Topology
Teragrid Demo
–How to use your TG SRB account
–How to access Digital Data Collections
Storage and Compute Resources
Archival Systems: 18 PB
DataStar (IBM Power4): 15.6 TF
TeraGrid Linux Cluster (IA64): 4.4 TF
Storage Area Network Disk: 1.4 PB
Blue Gene/L (Due 12/04): 2.8/5.7 TF
Sun F15K Disk Server
Networking
Visualization
Human infrastructure: Experienced multi-disciplinary staff support a broad spectrum of national science, engineering and technology projects
Sites Using the SRB
SDSC SRB Projects (60 million, 0.5 PB)
Digital Libraries
–UCB, Umich, UCSB, Stanford, CDL
–NSF NSDL - UCAR / DLESE
NASA Information Power Grid
Astronomy
–National Virtual Observatory
–2MASS Project (2 Micron All Sky Survey)
Particle Physics
–Particle Physics Data Grid (DOE)
–GriPhyN
–SLAC Synchrotron Data Repository
Medicine
–Digital Embryo (NLM)
Earth Systems Sciences
–ESIPS
–LTER
Persistent Archives
–NARA
–LOC
Neuro Science & Molecular Science
–TeleScience/NCMIR, BIRN
–SLAC, AfCS, …
Storage Resource Broker (SRB)
A distributed file system (Data Grid)
–Client-Server, Server-Server architecture
–Abstracts physical ...
SRB provides the ability to transparently share data across remote sites.
–Heterogeneous Resources
–Single sign on
–Single logical file hierarchy
What we are familiar with
What we are not familiar with, yet
How do the file systems differ?
Logical Abstraction
–Folders are NOT physical
–Files do NOT inherit physical location
–Everything is potentially distributed
Access Control
–Permissions are NOT rwxrwxrwx
–Permissions ARE set on an object-by-object basis
–Groups and permissions ARE more similar to NTFS
Domains
–Geographical / logical grouping of users
–Namespace scalability:
–Also doubles as groups
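The logical abstraction on this slide can be pictured with a small toy model. This is only an illustrative sketch: the `Catalog` class, its methods, the `guest` user, and the `/vault/...` physical paths are hypothetical and are not the SRB API or its real catalog layout. The logical path and resource names are borrowed from the File Replication slide.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    resource: str        # storage resource holding this copy
    physical_path: str   # hypothetical location on that resource


@dataclass
class LogicalObject:
    owner: str
    replicas: list = field(default_factory=list)
    # Permissions are stored per object, not as rwxrwxrwx bits.
    acl: dict = field(default_factory=dict)   # user name -> "read" or "write"


class Catalog:
    """Toy catalog: collections are only prefixes of logical paths."""

    def __init__(self):
        self.objects = {}

    def register(self, logical_path, owner, resource, physical_path):
        # The same logical name may point at copies on many resources.
        obj = self.objects.setdefault(logical_path, LogicalObject(owner=owner))
        obj.replicas.append(Replica(resource, physical_path))
        return obj

    def list_collection(self, collection):
        prefix = collection.rstrip("/") + "/"
        return sorted(p for p in self.objects if p.startswith(prefix))

    def can_read(self, logical_path, user):
        obj = self.objects[logical_path]
        return user == obj.owner or obj.acl.get(user) in ("read", "write")


catalog = Catalog()
doc = "/home/Demo/SRB-Tutorial/files-2/Doc.txt"
catalog.register(doc, "romanoly", "z-ucsd-ncmir-nas", "/vault/a/Doc.txt")
catalog.register(doc, "romanoly", "z-jhu-cis-nas", "/vault/b/Doc.txt")
catalog.objects[doc].acl["guest"] = "read"

print(catalog.list_collection("/home/Demo/SRB-Tutorial/files-2"))
print([r.resource for r in catalog.objects[doc].replicas])
```

The point of the model is that one logical name resolves to copies on two different resources, and the read permission is attached to that one object rather than to a directory.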
Interfaces to the Storage Resource Broker
inQ – Windows Client
Scommands – UNIX, DOS Command line Client
Jargon – Java API and GUI components
mySRB – Web Client
Matrix – WSDL, Data Grid Workflows
C, C++ – C and C++ API
Python – Python API
Perl – Perl API
Common Scommands (75 total) Sinit Senv Spwd Sls Scd Sget Sput Ssh Scp Smv (logical) Sphymove (physical) Srm Smkdir Srmdir Serror Schmod Sexit
mySRB
BIRN Portal (perl based)
NEEScentral Portal (php based)
Biomedical Informatics Research Network (BIRN)
Major collaboration with SDSC; several of the projects’ Co-Investigators and Co-PIs are at SDSC.
BIRN’s purpose is to give its consortium of neuroscience laboratories the ability to share, compute, and collaborate.
The Storage Resource Broker provides the ability to transparently share data across remote sites.
The BIRN SRB Data Grid
Doing this “Manually”
The BIRN Data Grid
The grid is in the details
File Replication
Sls
  /home/Demo/SRB-Tutorial/files-2:
  Doc.txt
Sls -l
  /home/Demo/SRB-Tutorial/files-2:
  romanoly 0 z-ucsd-ncmir-nas Doc.txt
  romanoly 1 z-jhu-cis-nas Doc.txt
  romanoly 2 z-stanford-lucas-nas Doc.txt
  romanoly 3 z-umn-cmrr-nas Doc.txt
  romanoly 4 z-uci-bic-nas Doc.txt
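To make the replica listing concrete, here is a minimal Python sketch that groups the `Sls -l` lines from this slide by file name. It assumes the abbreviated four-column form shown on the slide (owner, replica number, resource, name); real `Sls -l` output may carry more columns, and file names containing spaces would need different handling. The function name `replicas_by_file` is made up for illustration.

```python
from collections import defaultdict

# Listing copied from the slide (columns: owner, replica number, resource, name).
SLS_LONG = """\
/home/Demo/SRB-Tutorial/files-2:
romanoly 0 z-ucsd-ncmir-nas Doc.txt
romanoly 1 z-jhu-cis-nas Doc.txt
romanoly 2 z-stanford-lucas-nas Doc.txt
romanoly 3 z-umn-cmrr-nas Doc.txt
romanoly 4 z-uci-bic-nas Doc.txt
"""


def replicas_by_file(listing):
    """Map each file name to a sorted list of (replica number, resource)."""
    replicas = defaultdict(list)
    for line in listing.splitlines():
        parts = line.split()
        # Skip the collection header and anything that is not a 4-column entry.
        if len(parts) != 4 or not parts[1].isdigit():
            continue
        _owner, replica_num, resource, name = parts
        replicas[name].append((int(replica_num), resource))
    return {name: sorted(copies) for name, copies in replicas.items()}


result = replicas_by_file(SLS_LONG)
for name, copies in result.items():
    print(name, "has", len(copies), "replicas")
    for number, resource in copies:
        print(f"  replica {number} on {resource}")
```

The plain `Sls` shows one logical file; the long listing shows that the same `Doc.txt` exists as five physical copies, one per site.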
Teragrid SRB
–All Teragrid accounts are given an SDSC SRB Teragrid account
The ‘username’ is the same as your SDSC UNIX account name
Your SRB ‘domain’ is ‘teragrid’
You must register your DN string with SDSC’s grid-mapfile or request an SRB password to activate your SRB account. Instructions to do so are here:
–Your ~/.srb/.MdasEnv file OR env variables
  mdasCollectionHome '/home/.teragrid'
  mdasDomainName 'teragrid'
  srbUser ' '
  #AUTH_SCHEME 'ENCRYPT1'
  AUTH_SCHEME 'GSI_AUTH'
  srbHost 'srb.sdsc.edu'
  srbPort '7321'
  defaultResource 'sfs-tape-tgd'
  SERVER_DN '/C=US/O=NPACI/OU=SDSC/CN=Storage Resource Broker/USERID=srb'
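As a quick way to sanity-check a `.MdasEnv` file before running `Sinit`, the settings can be read into a dictionary. The sketch below is a convenience script, not part of SRB; it assumes each setting is a key followed by a single-quoted value, with `#` starting a comment, as on the slide. The function name `parse_mdas_env` is made up, and the sample text is copied from the slide as-is, including the blank `srbUser`.

```python
import shlex

# Sample copied from the slide; srbUser is left blank exactly as shown there.
MDAS_ENV = """\
mdasCollectionHome '/home/.teragrid'
mdasDomainName 'teragrid'
srbUser ' '
#AUTH_SCHEME 'ENCRYPT1'
AUTH_SCHEME 'GSI_AUTH'
srbHost 'srb.sdsc.edu'
srbPort '7321'
defaultResource 'sfs-tape-tgd'
SERVER_DN '/C=US/O=NPACI/OU=SDSC/CN=Storage Resource Broker/USERID=srb'
"""


def parse_mdas_env(text):
    """Return {key: value} for every non-comment line of the form: key 'value'."""
    settings = {}
    for line in text.splitlines():
        # shlex keeps the quoted value together and drops '#' comments.
        tokens = shlex.split(line, comments=True)
        if len(tokens) >= 2:
            settings[tokens[0]] = tokens[1]
    return settings


settings = parse_mdas_env(MDAS_ENV)
for key, value in settings.items():
    print(f"{key} = {value!r}")

# A blank user name is the most likely thing to have been left unfilled.
if not settings.get("srbUser", "").strip():
    print("srbUser is empty: fill in your SDSC UNIX account name")
```

Because the `ENCRYPT1` line is commented out, only `GSI_AUTH` ends up as the active `AUTH_SCHEME`.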
Scommand Features
Command line interface -> SCRIPTING
Available for all of the most popular UNIX flavors and DOS
Scommands are the most flexible and powerful of the clients
They are the fastest and most reliable
They are multithreaded for big gains in data flow
They are great for scripts, perl wrappers, batch jobs, etc.
Man pages are installed; view them via “man [Scommand]”
–man Sput
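As an illustration of the scripting point, here is a hedged Python sketch of a thin wrapper around a few Scommands. It assumes the Scommands are on the `PATH`, that `.MdasEnv` is already configured, and that `Sput` takes a local file followed by a target collection (check `man Sput` for the exact usage). The helper names and the file names `a.dat` and `b.dat` are made up. The example only prints the planned commands; calling `run_session` would actually execute them.

```python
import subprocess


def build_upload_session(local_files, collection):
    """Return the Scommand invocations for one upload session, as argument lists."""
    commands = [["Sinit"]]                            # establish the session
    for path in local_files:
        commands.append(["Sput", path, collection])   # upload one local file
    commands.append(["Sls", collection])              # list the collection afterwards
    commands.append(["Sexit"])                        # end the session
    return commands


def run_session(commands, runner=subprocess.run):
    """Run each command in order, stopping at the first one that fails."""
    for argv in commands:
        runner(argv, check=True)


plan = build_upload_session(["a.dat", "b.dat"], "/home/Demo/SRB-Tutorial/files-2")
for argv in plan:
    print(" ".join(argv))
```

Keeping the command list separate from the runner makes the plan easy to inspect or log in a batch job before anything touches the grid.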
Scommand Notes
Shelp
–Gives a list of commands with a brief summary
–“[Scommand]” gives usage info (usually), or try the -h flag
Sinit – establishes a session
Senv – displays connection information
Spwd – displays the current working directory
Sexit – ends the session
Some Public SRB Collections
Southern California Earthquake Center
  /home/public.teragrid/SCEC
Two Micron All Sky Survey
  /home/public.teragrid/2MASS
The Palomar Digital Sky Survey
  /home/public.teragrid/DPOSS
Watch me do an SRB demo
Thanks! SRB handles large data and provides the ability to share and collaborate on distributed heterogeneous resources. Questions? Teragrid SRB userguide: SRB website: SRB