1
UMIACS PAWN, LPE, and GRASP data grids
Mike Smorul
2
Overall Principles (PAWN)
Distributed, secure ingestion
Use of web/grid technologies – platform independent
Minimal client-side requirements
Ease of integration with data grid systems
Designed to satisfy data integrity requirements of scientific collections and digital preservation
3
Distributed Ingestion (PAWN)
4
Each Producer registers and arranges files locally prior to transport.
Multiple distributed archival receiving stations.
X.509-based authentication between sites.
Independent Certificate Authorities at each Producer.
Data Grid is geographically distributed with multiple entrance points.
5
Ingestion Workflow (PAWN)
1. Negotiate Submission Agreement.
2. Workflow initialization and Submission Information Package (SIP) creation.
3. Transfer of SIPs to Data Grid site.
4. Validation of SIP transfer.
5. Organization of data into collections and transfer into Data Grid.
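Steps 2 and 4 of the workflow above can be illustrated with a minimal sketch. The slides do not describe PAWN's actual SIP format or validation method, so everything here is an assumption: the `SIP` class, the use of SHA-256 digests as a manifest, and the `validate_transfer` helper are hypothetical names chosen for illustration only.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SIP:
    """Hypothetical Submission Information Package: file names mapped to content."""
    producer: str
    files: Dict[str, bytes] = field(default_factory=dict)

    def manifest(self) -> Dict[str, str]:
        # One digest per file, computed at the producer before transfer (step 2).
        return {name: hashlib.sha256(data).hexdigest()
                for name, data in self.files.items()}


def validate_transfer(manifest: Dict[str, str],
                      received: Dict[str, bytes]) -> List[str]:
    """Return the sorted names of files that are missing or altered (step 4)."""
    problems = []
    for name, expected in manifest.items():
        data = received.get(name)
        if data is None or hashlib.sha256(data).hexdigest() != expected:
            problems.append(name)
    return sorted(problems)
```

An empty list from `validate_transfer` would mean every file named in the manifest arrived with a matching digest; any other result lists the files the receiving station should request again.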
6
Component Overview (PAWN)
7
Communication & Security (PAWN)
Web service communication between all components using SOAP.
All remote calls are authenticated and encrypted.
Components are scalable by using standard web server load balancing technology.
Secure communication with Data Grid using GSI.
8
Lightweight Preservation Environment (LPE)
The Lightweight Preservation Environment is an archival system based on a modular design, primarily consisting of existing components.
The current implementation relies mostly on Globus technologies.
Primarily, we’ve focused on wrapping logic around those components.
9
Globus Components We Use (LPE)
Grid Security Infrastructure (GSI) for authentication
Open Grid Services Infrastructure (OGSI) for middleware
Metadata Catalog Service (MCS) for descriptive metadata
Replica Location Service (RLS) for locating files
GridFTP for file access
10
Developed Components (LPE)
Data Manager (DM): Organizes data and queries between the user and the other components
Policy Manager (PM): Ensures that a minimum number of copies exist for any given file
Transformation Manager (TM): Executes specific transformations on a named file on a given storage node and returns the results
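The Policy Manager's role (keeping a minimum number of copies of each file) can be sketched as a simple planning function. This is not the LPE implementation: the `plan_replication` name, the in-memory replica map, and the node labels are assumptions made for illustration, standing in for whatever the PM would actually query from the replica catalog.

```python
from typing import Dict, List, Set


def plan_replication(replicas: Dict[str, Set[str]],
                     nodes: List[str],
                     min_copies: int) -> Dict[str, List[str]]:
    """Return {logical file name: target nodes} needed to reach min_copies.

    replicas maps each logical file name to the set of storage nodes that
    currently hold a copy (hypothetical stand-in for a replica catalog).
    """
    plan = {}
    for name, holders in sorted(replicas.items()):
        missing = min_copies - len(holders)
        if missing <= 0:
            continue  # policy already satisfied for this file
        # Candidate targets are nodes that do not yet hold the file.
        candidates = [n for n in nodes if n not in holders]
        # If too few nodes exist, the plan lists as many as are available.
        plan[name] = candidates[:missing]
    return plan
```

A caller would pass the resulting plan to whatever component performs the copies, then re-run the check to confirm the policy holds.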
11
Interfaces (LPE)
We have developed several interfaces to the LPE for demonstration purposes:
Command Line
Perl CGI
Servlet
12
LPE Organization
All components use GSI
13
SRB Grid Testbed (original)
Modified the SRB to hold spatial data
Contributed Informix port of the SRB (v3.2)
Linked three ESIP sites
Tested replication between GMU and UMD
Remote registration at UNH
14
Grid Retrieval and Search Platform (GRASP)
Based on concepts from the Earth Science Data Interface (ESDI), developed at the UMIACS GLCF.
Provides a graphical interface into data grid holdings.
Access to entire GLCF holdings through the Storage Resource Broker (SRB).
15
GRASP Architecture
16
GRASP uses a data grid as an abstract storage repository.
Metadata in the grid is mined from the grid itself or from external sources and published into a browsable form.
Data grids may allow for platform-independent metadata, but may not be optimal for access.
17
GRASP Screenshot
18
Grid Holdings
Registered GLCF holdings
Over 338,000 registered files
4.4 TB total size
Granular permissions on registered holdings
No need to move all data into the grid: pre-existing holdings are registered in place.
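Registering holdings in place means the catalog records where a file already lives instead of copying it into grid-managed storage. The sketch below illustrates that idea only; it does not use the SRB API, and the `InPlaceCatalog` class and its methods are hypothetical.

```python
import os
from typing import Dict, Tuple


class InPlaceCatalog:
    """Hypothetical catalog mapping logical names to existing physical files."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[str, int]] = {}

    def register(self, logical_name: str, physical_path: str) -> None:
        # Record location and size only; the file itself is never moved or copied.
        size = os.path.getsize(physical_path)  # raises OSError if the file is missing
        self._entries[logical_name] = (os.path.abspath(physical_path), size)

    def resolve(self, logical_name: str) -> str:
        # Look up where the registered file actually lives.
        return self._entries[logical_name][0]

    def total_bytes(self) -> int:
        # Aggregate size of all registered holdings.
        return sum(size for _, size in self._entries.values())
```

Because only a path and a size are stored, registering a large existing collection costs one catalog entry per file rather than a full data transfer.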