The Data Management Requirements at SNS Shelly Ren & Steve Miller Scientific Computing Group, SNS-ORNL December 11, 2006.


1 The Data Management Requirements at SNS. Shelly Ren & Steve Miller, Scientific Computing Group, SNS-ORNL. December 11, 2006

2 SNS Neutron Scattering User Facility (Oak Ridge National Laboratory, U.S. Department of Energy; SDM 12/11/2006)

3 Neutron Scattering Science Areas
• Chemistry – microstructures
• Complex Fluids – fluid properties
• Crystalline Materials – molecular structure
• Disordered Materials – structure characterization
• Engineering – study of material stress/strain
• Magnetism & Superconductivity – material properties
• Polymers – studying "giant" molecules
• Structural Biology – proteins

4 SNS Instrument Commissioning Schedule

5 SNS Potential Data Volume
[Figure: two charts covering 2006 to 2011. Production Data Rate, in GB/day (axis 0 to 1,400), with series raw, old raw, raw+reduced, and reduced. Total Stored Data, in TB (axis 0 to 1,200), with series raw, old raw, and raw+reduced. Slide note: just instrument data here.]

6 Integrating Computation with Experimentation
[Diagram. Stages: Acquisition, Treatment, Analysis, with interactive feedback among acquisition, analysis, and simulation. Data products: raw, intermediate, scientific. Labeled components: instrument, sample and environment, controls, diagnostics, electronic notebook, decision support and intelligent control, visualization, instrument simulation, materials simulation, proposal, automation, database, documentation, publications, repository. Access: control portal, data portal, and analysis portal through a web browser, with access and authorization control. Key: data, software, hardware, metadata, portal, HPC support. Also labeled: High Performance Computing.]

7 Creating, Processing and Storing Data
Steps: event histogramming; detector-to-pixel mapping; instrument geometry; metadata extraction
Create NeXus file → Catalog and Store → Reduce Data
All subsystems functional to some degree
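
The slide names event histogramming and detector-to-pixel mapping without giving details. As a minimal sketch only, assuming each event is a (detector ID, time-of-flight) pair and the mapping is a plain lookup table (both are illustrative assumptions, not the SNS data format), the two steps might be combined like this:

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def histogram_events(
    events: Iterable[Tuple[int, float]],
    pixel_map: Dict[int, int],
    tof_min: float,
    tof_max: float,
    n_bins: int,
) -> Counter:
    """Count events per (pixel, time-of-flight bin).

    The event layout, the pixel_map lookup table, and the uniform binning
    are hypothetical choices made for illustration.
    """
    if n_bins <= 0 or tof_max <= tof_min:
        raise ValueError("need n_bins > 0 and tof_max > tof_min")
    width = (tof_max - tof_min) / n_bins
    counts: Counter = Counter()
    for detector_id, tof in events:
        pixel = pixel_map.get(detector_id)
        # Skip events from unmapped detectors or outside the time window.
        if pixel is None or not (tof_min <= tof < tof_max):
            continue
        # Clamp guards against floating-point rounding at the upper edge.
        bin_index = min(int((tof - tof_min) / width), n_bins - 1)
        counts[(pixel, bin_index)] += 1
    return counts
```

The sparse Counter keeps only the (pixel, bin) cells that actually received events; a dense array would be the more usual choice once the histogram is written to a file.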

8 Current SNS Data Hierarchy
• SNS data are stored on an NFS-mounted file system
  • Direct Attached Storage (DAS): storage resources grow incrementally based on need
  • A data server for DAS: terabytes of internal hard drive storage
• SNS metadata are stored in an Oracle database
  • ICAT metadata: Oracle DB
  • ICAT application server: JBoss
Data hierarchy:
/facility/instrument/proposalID/experimentId/runNumber
    /NeXus (NeXus files)
    /preNeXus (metadata files)
    /analysis
e.g.
/SNS/BSS/2006_1_2_SCI/1/100/NeXus/BSS_100.nxs
/SNS/BSS/2006_1_2_SCI/1/100/preNeXus/cvinfo.xml
/SNS/BSS/2006_1_2_SCI/1/100/preNeXus/cvbeam.xml
Diagram labels: live-catalog, icat-search, user-workspace, data browser, sns-checkin
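
The path template on this slide can be illustrated with a short sketch. The class and function names below are hypothetical, and the `<instrument>_<run>.nxs` file-name pattern is inferred from the single `BSS_100.nxs` example rather than stated on the slide:

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class RunLocation(NamedTuple):
    """One run in the /facility/instrument/proposalID/experimentId/runNumber tree."""

    facility: str
    instrument: str
    proposal_id: str
    experiment_id: str
    run_number: str

    def run_dir(self) -> PurePosixPath:
        return PurePosixPath(
            "/", self.facility, self.instrument,
            self.proposal_id, self.experiment_id, self.run_number,
        )

    def nexus_file(self) -> PurePosixPath:
        # Assumed naming pattern, inferred from the slide's one example.
        return self.run_dir() / "NeXus" / f"{self.instrument}_{self.run_number}.nxs"

    def prenexus_file(self, name: str) -> PurePosixPath:
        return self.run_dir() / "preNeXus" / name


def parse_run_path(path: str) -> RunLocation:
    """Recover the five hierarchy levels from an absolute path."""
    parts = PurePosixPath(path).parts
    # parts[0] is "/" for an absolute POSIX path.
    if len(parts) < 6 or parts[0] != "/":
        raise ValueError(f"not a run path: {path!r}")
    return RunLocation(*parts[1:6])


if __name__ == "__main__":
    loc = parse_run_path("/SNS/BSS/2006_1_2_SCI/1/100/NeXus/BSS_100.nxs")
    print(loc)
    print(loc.prenexus_file("cvinfo.xml"))
```

Keeping the five levels as named fields makes it straightforward to look up the same run in a metadata catalog, although how ICAT actually keys its records is not described here.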

9 SNS Data Access – Through Unix Shell
• Symbolic links in the user's home directory point to the directories of the proposals the user is a member of
• Symbolic links in the user's home directory also point to the public directory where public data reside
• Disk quota may be allocated for users to perform analysis and simulation
User workspace:
/facility/users/neutron_boy/workspace (write)
/proposalID (read only)
/public (read only)
/facility/users/public/proposalID
/proposalID
On the slide, gray names are symbolic links into the data hierarchy
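
The symbolic-link arrangement described on this slide could be set up along the following lines. This is a sketch under assumptions: the function name is hypothetical, the links are assumed to sit directly in the home directory next to a writable `workspace` directory, and read-only access is assumed to be enforced by permissions on the link targets, which the code does not touch.

```python
from pathlib import Path
from typing import Iterable, List


def link_user_proposals(
    home: Path, proposal_dirs: Iterable[Path], public_dir: Path
) -> List[Path]:
    """Create workspace/ plus one symlink per proposal and one to public data.

    Hypothetical helper; the layout is an interpretation of the slide.
    """
    (home / "workspace").mkdir(parents=True, exist_ok=True)

    targets = {p.name: p for p in proposal_dirs}
    targets["public"] = public_dir

    created = []
    for name, target in targets.items():
        link = home / name
        # Replace a stale link; a real file or directory with the same
        # name is left alone and makes symlink_to raise FileExistsError.
        if link.is_symlink():
            link.unlink()
        link.symlink_to(target, target_is_directory=True)
        created.append(link)
    return created
```

Running the helper again after a user joins another proposal simply adds the new link, because existing links are replaced rather than duplicated.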

10 SNS Data Access – Through Portal
[Screenshot labels: ISAW plot, metadata, NeXus files, NeXus tags, first SNS data]

11 Search Your Data via the Web
[Screenshot callouts: enter a text search string; select optional search fields; the search returns the files found; select files of interest to browse or download]

12 Monte Carlo Simulation via the Portal

13 SNS Data Management Requirements
• Archive, catalog, and maintain data produced by SNS instruments so users can access them from anywhere at any time without worrying about data storage
• Grant authorized access to SNS data and metadata for both shell and portal users (ensure data are private to the experiment team)
• Provide services to efficiently search, browse, and download SNS data and metadata
• Allow users to share datasets with their collaborators, or to access datasets that have been made public, in a scalable fashion
• Provide data management services to HFIR, LUJAN, IPNS, and other interested neutron facilities
• Extend dataset storage to spinning disk, HPSS, and other archival systems
• Manage distributed dataset storage and perform data transport for end users
• Federate data storage with partner neutron facilities such as ISIS so that users see all their experiment data by logging into one facility

14 SNS Long Term Data Management Needs
• Create a single file hierarchy for accessing data distributed across multiple storage systems and multiple facilities, extending even beyond neutron scattering facilities
• Support the management, collaboration, controlled sharing, replication, transfer, and preservation of distributed data
• Capture metadata for user-produced data
• Automate data transfer
• Improve data processing so that it is parallel and scalable
• Search large volumes of data for patterns so users can find particular structures within their data (data mining)
• Establish a unified user authentication service across neutron facilities
• Provide users with an easy-to-use portal service to search, browse, download, and upload data, and to search, annotate, and update metadata
• Integrate experiment with simulation, and launch simulation jobs that need programmatic access to the distributed data resources

15 Summary
• As more instruments complete commissioning and begin producing new science, we face the emerging challenge of managing scientific data that can grow to petabyte scale within a few years
• As a user facility, SNS will have a steady stream of users running experiments and generating raw and analysis data files, so we will need not only disk cache but also a long-term storage system such as HPSS
• We are committed to letting end users search and retrieve SNS data and metadata from anywhere, at any time, in a timely fashion
• We plan to grow our data management resources and collaborate with the community
• We are looking for opportunities to work with and leverage resources beyond our facility
• We are eager to reach out to, learn from, and collaborate with data management experts in all domain areas
• We wish to understand and use new software applications to manage distributed data storage and to transport, search, and retrieve data more effectively and efficiently

