MCAT: A Metadata Catalog
San Diego Supercomputer Center
Part of the Storage Resource Broker (SRB)



Overview
• What is metadata
• MCAT architecture
• List of (many!) MCAT attributes
• MAPS

Elements of data-intensive computing environments
• Resources
  – Hardware: computing platforms, networks, storage systems
  – Software: DBs, file systems, operating systems, schedulers, applications
• Methods
  – Access methods, APIs, data access and conversion
• Data objects
  – Data sets and collections of data sets
• Users and groups
  – Who is allowed to create/update/access resources, methods and data sets

Elements of MCAT
• MAPS initialization (Metadata Attribute Presentation Structure)
• Schema initialization
• MAPS to schema converter to dynamic query generator
• DB2 and Oracle query systems
• Answer extractor
• Convert back to MAPS format

Metadata Stored in MCAT
• Metadata = information about data objects
  – Describes properties and attributes of objects
Examples
1. Identifier (internal, not seen by user)
2. Name
3. Types and formats
4. Size

MCAT attributes (cont.)
5. Comments
6. Liveness (i.e., current state): deleted, exists, locked, or under construction
7. Replica-number
   • SRB supports cloning of data
   • An object may have many clones
   • SRB controls replica selection
8. Creation time-stamp
9. Creation-owner

MCAT attributes (cont.)
10. Collection name
   • Every data object must be associated with a collection
   • A collection contains data objects and other collections (i.e., sub-collections)
   • Objects may only belong to one collection
11. Physical resource where object is located
12. Location inside the resource (e.g., a directory on a file system)
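
The collection rules above (every object lives in a collection, collections nest, and an object belongs to only one collection) can be sketched as a small data structure. This is an illustrative sketch only, not MCAT's schema: the class names `Collection` and `DataObject`, the `move_to` and `path` helpers, and the sample names are all hypothetical.

```python
class Collection:
    """Hypothetical model of a collection: holds objects and sub-collections."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.objects = {}          # object name -> DataObject
        self.subcollections = {}   # collection name -> Collection

    def add_subcollection(self, name):
        child = Collection(name, parent=self)
        self.subcollections[name] = child
        return child

    def path(self):
        # Build a slash-separated path by walking up to the root.
        if self.parent is None:
            return "/" + self.name
        return self.parent.path() + "/" + self.name


class DataObject:
    """Hypothetical model of a data object tied to exactly one collection."""

    def __init__(self, name, collection):
        # An object cannot be created without a collection.
        self.name = name
        self.collection = None
        self.move_to(collection)

    def move_to(self, collection):
        # Detach from the old collection first, so the object never
        # belongs to two collections at once.
        if self.collection is not None:
            del self.collection.objects[self.name]
        collection.objects[self.name] = self
        self.collection = collection


root = Collection("home")
project = root.add_subcollection("project")
obj = DataObject("run1.dat", root)
obj.move_to(project)
```

Storing the owning collection on the object itself, rather than only listing objects inside collections, is one way to make the "one collection per object" rule hard to violate.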

MCAT attributes (cont.)
13. Access control list (ACL)
   • Entry is:
   • Each user is given one permission per data object
   • Each permissionID has an associated list of actions that are permitted
     – Read
     – Write
     – Control
     – grantTicket
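
A minimal sketch of the ACL idea, under stated assumptions: the slide does not show the entry layout or how actions are grouped into permissions, so the `(user, object)` key, the permission IDs `reader`, `writer`, and `owner`, and their action sets are all hypothetical. Only the four action names come from the slide.

```python
# Hypothetical permission IDs. The slide names the four actions but does
# not say how they are grouped into permissions.
PERMISSION_ACTIONS = {
    "reader": {"read"},
    "writer": {"read", "write"},
    "owner": {"read", "write", "control", "grantTicket"},
}


class AccessControlList:
    def __init__(self):
        # (user, object name) -> permission ID. Keying on the pair enforces
        # "one permission per user per data object".
        self._entries = {}

    def grant(self, user, object_name, permission_id):
        if permission_id not in PERMISSION_ACTIONS:
            raise ValueError(f"unknown permission ID: {permission_id}")
        # A second grant for the same pair replaces the first.
        self._entries[(user, object_name)] = permission_id

    def is_allowed(self, user, object_name, action):
        permission_id = self._entries.get((user, object_name))
        if permission_id is None:
            return False
        return action in PERMISSION_ACTIONS[permission_id]


acl = AccessControlList()
acl.grant("alice", "run1.dat", "writer")
acl.grant("alice", "run1.dat", "reader")  # replaces the earlier grant
```

After the second grant, `acl.is_allowed("alice", "run1.dat", "write")` is false, because the single entry for that user and object now maps to the read-only permission.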

MCAT attributes (cont.)
14. Audit record
   • Each action on a data object can be audited
   • Action success or failure noted in audit trail

MCAT attributes (cont.)
15. Ticket
   • Provides holder with an action permit on the data object
   • Currently only read
   • Ticket-giver can impose restrictions: who can use it, when, how many times it can be used
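
The three ticket restrictions (who, when, how many times) can be sketched as checks made at redemption time. This is an assumption-laden illustration, not SRB's ticket implementation: the `Ticket` class, its field names, and the use of an expiry time for the "when" restriction are all hypothetical.

```python
from datetime import datetime


class Ticket:
    """Hypothetical ticket granting read access to one data object."""

    def __init__(self, object_name, allowed_users=None, not_after=None,
                 max_uses=None):
        self.object_name = object_name
        self.action = "read"                # the slide: currently only read
        self.allowed_users = allowed_users  # None means any holder
        self.not_after = not_after          # None means no expiry
        self.max_uses = max_uses            # None means unlimited uses
        self.uses = 0

    def redeem(self, user, action, now):
        # Each restriction is checked in turn; any failure rejects the use.
        if action != self.action:
            return False
        if self.allowed_users is not None and user not in self.allowed_users:
            return False
        if self.not_after is not None and now > self.not_after:
            return False
        if self.max_uses is not None and self.uses >= self.max_uses:
            return False
        self.uses += 1
        return True


ticket = Ticket("run1.dat", allowed_users={"alice"},
                not_after=datetime(2000, 1, 1), max_uses=2)
when = datetime(1999, 6, 1)
# Two uses succeed; the third exceeds max_uses and is rejected.
results = [ticket.redeem("alice", "read", when) for _ in range(3)]
```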

Attributes not yet supported
• Partitioning of data objects
• Versioning
• Lineage (of data objects and methods)
• Derivatives
• Locking
• Public and private keys on data objects or collections
• Summaries or aggregations
• Measurements

Resource-related metadata
1. Name
2. Type
3. Access address
4. Default location template (URL??)
5. Replica-number: copies of the same resource; any of the copies are equivalent
6. Comments

Replicated resource concept
• Logical resource
• Formed as set of physical, possibly heterogeneous resources
• Create a data object on a replicated resource:
  – object automatically replicated in each one of the component resources
• Provides fault tolerance
• Other logical resources: striped resources (round-robin), write-once resources, read-only resources
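
A rough sketch of the replicated-resource idea: a logical resource fans each write out to all of its physical components, and a read can fall back to another copy if one is missing. The class names, the in-memory `store` dictionaries, and the resource names are hypothetical stand-ins for real storage systems.

```python
class PhysicalResource:
    """Hypothetical stand-in for one storage system (disk, tape, ...)."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # object name -> bytes

    def put(self, object_name, data):
        self.store[object_name] = data

    def get(self, object_name):
        return self.store[object_name]  # raises KeyError if missing


class ReplicatedResource:
    """Logical resource formed from a set of physical resources."""

    def __init__(self, name, components):
        self.name = name
        self.components = list(components)

    def put(self, object_name, data):
        # Creating an object writes a replica to every component.
        for resource in self.components:
            resource.put(object_name, data)

    def get(self, object_name):
        # Any replica is equivalent, so return the first one found.
        for resource in self.components:
            try:
                return resource.get(object_name)
            except KeyError:
                continue
        raise KeyError(object_name)


disk = PhysicalResource("unix-disk")
tape = PhysicalResource("tape-archive")
logical = ReplicatedResource("safe-store", [disk, tape])
logical.put("run1.dat", b"payload")

del disk.store["run1.dat"]           # simulate losing one copy
recovered = logical.get("run1.dat")  # served from the surviving replica
```

The read path is where the fault tolerance shows up: the caller addresses only the logical resource and never needs to know which physical copy answered.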

User-related attributes
1. Name
2. Type (privileged, normal, projects)
3. Address Phone
6. Pass phrase
7. Domain: e.g., ucsd, sdsc, caltech
8. User-groups: provides group ID and access control

Data Models and Data Exchange
• Data models: standards for structuring information (e.g., Dublin Core)
• Data exchange formats: standard means to communicate metadata (e.g., XML)
• MCAT uses its own data model and exchange format: MAPS
  – MAPS = metadata attribute presentation structure
• Working on mappings to other formats

MAPS
• MAPS query format derived from SQL
  – Large metadata catalogs require database systems
  – Metadata are normally given as attribute-value pairs whose search can easily be translated into SQL-like queries
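
To illustrate how an attribute-value search might translate into an SQL query, here is a minimal sketch. It does not reproduce the MAPS format or MCAT's query generator: the `build_query` function, the `data_objects` table, and the column names are hypothetical, and Python's built-in `sqlite3` is swapped in for the DB2 and Oracle back ends named in the slides.

```python
import sqlite3

# Hypothetical attribute names; a real catalog would derive these
# from its schema.
KNOWN_ATTRIBUTES = {"name", "data_type", "size", "collection", "owner"}


def build_query(conditions, table="data_objects"):
    """Turn {attribute: value} pairs into a parameterized SELECT."""
    clauses, params = [], []
    for attribute, value in conditions.items():
        # Column names cannot be bound as parameters, so check them
        # against a known list before placing them in the SQL text.
        if attribute not in KNOWN_ATTRIBUTES:
            raise ValueError(f"unknown attribute: {attribute}")
        clauses.append(f"{attribute} = ?")
        params.append(value)
    where = " AND ".join(clauses) if clauses else "1 = 1"
    return f"SELECT name FROM {table} WHERE {where}", params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_objects "
    "(name TEXT, data_type TEXT, size INTEGER, collection TEXT, owner TEXT)"
)
conn.executemany(
    "INSERT INTO data_objects VALUES (?, ?, ?, ?, ?)",
    [
        ("run1.dat", "ascii", 1024, "/home/project", "alice"),
        ("run2.img", "tiff", 4096, "/home/project", "alice"),
        ("notes.txt", "ascii", 200, "/home/other", "bob"),
    ],
)

sql, params = build_query({"collection": "/home/project", "data_type": "ascii"})
rows = conn.execute(sql, params).fetchall()
```

Each attribute-value pair becomes one equality test joined with `AND`, which is the simplest reading of "easily translated into SQL-like queries"; range or pattern searches would need a richer condition format.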