All Hands Meeting 2004: BIRN Coordinating Center Status Report. Mark Ellisman, Philip Papadopoulos.


What is BIRN?
[Slide graphic: logos including LONI, Harvard, and NCRR (NIH); remaining figure text ("150,000") not recoverable.]

Biomedical Informatics & Research
- Biocomplexity
- Discovery- and systems-based research approaches complement hypothesis-based research
- Integrative, multidisciplinary team approach adapted for complex queries, versus a focused approach for hypothesis-driven research
- The team approach is more dependent on advanced technologies and instrumentation, which generate large data sets
- Information management is at the core of biomedical research for the 21st century and beyond

Overview of the BIRN-CC Roadmap
- Deliver and maintain a robust and scalable PRODUCTION grid for the collaborative sharing, analysis, and interrogation of biomedical data
- Provide system integration to bring user applications into BIRN
- Provide a consistent and scalable software delivery mechanism
- Facilitate the use of advancing information technologies ("cyberinfrastructure" and the "Grid") by biomedical scientists
- Be the biomedical applications driver framing requirements for the rapidly evolving grid infrastructure

"Enforce the AEIOUs: Accessibility, Extensibility, Interoperability, Openness, Usability, Scalability"

Integrated Cyberinfrastructure System Meeting the Needs of Multiple Communities (source: Dr. Deborah Crawford, Chair, NSF Cyberinfrastructure Working Group)
[Slide diagram: a layered stack in which applications (environmental science, high energy physics, biomedical informatics, geoscience) rest on domain-specific cybertools (software), which rest on shared cybertools (grid services and middleware; development tools and libraries) and distributed resources (computation, communication, storage, etc.), all supporting discovery and innovation and education and training. Hardware forms the base.]

BIRN Core Software Infrastructure
- BIRN builds on evolving community standards for middleware
- Adds new capabilities required by projects
- Performs system integration of domain-specific tools, building a distributed infrastructure
- Utilizes commodity hardware and stable networks for baseline connectivity

[Slide diagram, built up over three slides: your specific tools and user applications, plus friendly work-facilitating portals (authentication, authorization, auditing, workflows, visualization, analysis), layered over shared tools for multiple science domains (grid services and middleware; development tools and libraries) and distributed computing, instrument, and data resources. Peer domain projects shown alongside biomedical informatics (BIRN): high energy physics (GriPhyN), geosciences (GEON), bays and rivers (Moore Foundation), earthquake engineering (NEES), and ocean observing (LOOKING).]

BIRN is Pioneering: We are Making Unique and Fundamental Contributions to Establish Working Grids
- BIRN is setting an example for other grid project deployments (e.g., use of Rocks and automated distribution mechanisms): GEON and others
- BIRN is a driver application for other major grid initiatives:
  - Common security APIs being used within BIRN, Telescience, and GEON
  - OptIPuter (research into next-generation networking): BIRN is the bioscience driver
  - Drives requirements to the Global Grid Forum and Internet2 development efforts

Grid Infrastructure in Action
- The Grid is already having an impact, with many projects in many subjects: life sciences, medicine, environment, engineering, materials, chemistry, physics
- BIRN embodies the most innovative use of data, metadata, and portals
- BIRN is cited as a successful model of grid computing

The Grid is becoming the backbone for collaborative science and data sharing

BIRN Infrastructure Provides…
- High-performance connectivity between distributed resources (computation and data storage): JHU utilizing TeraGrid resources while pulling data from the SRB
- Secure access to large volumes of distributed data
- Distributed high-performance computing resources: BIRN just received an NSF Large Resource Allocation Committee award (450,000 service units, i.e., processor hours)
- Frameworks (standards, APIs, services) for the integration and interoperation of tools, users, data, and computing resources: improved high-level "wrapper" tools and a common authentication protocol

Access to Grid Resources
- Intuitive user interfaces to access grid-based computational analyses
- Transparent access to distributed data found within the BIRN Data Grid
Case study: JHU LDDMM grid computing launched from the BIRN Portal, a semi-automatic shape analysis study utilizing compute-intensive analyses (Large Deformation Diffeomorphic Metric Mapping)

BIRN has the Advantage of having Developed an "End-to-End" Infrastructure in the context of distributed biomedical research projects
- Consists of all the components required to effectively share and collaboratively explore data:
  - The BIRN Rack (BIRN site infrastructure)
  - The BIRN Virtual Data Grid
  - The BIRN Mediation Infrastructure
  - The BIRN Portal
- The system integration, development, deployment, and management of this infrastructure is the main focus of activities within the BIRN Coordinating Center

Improving the BIRN Environment
- Continually improve the BIRN software infrastructure (performance, robustness, end-to-end integration, and interoperability)
- Standardize the software delivery process by providing twice-yearly scheduled software releases (April and October):
  - Develop internal processes for alpha, beta, and production releases
  - Instantiate robust development, staging, and production environments
  - Improve documentation and tutorials for all components
- Provide automated deployment mechanisms

BIRN Portal
- Updated BIRN Portal with new and improved features currently in production
- Worked with test beds to improve the usability and performance of the BIRN Portal:
  - Improved performance
  - Updated Portal API for more robust operation
  - Implemented guest pages and accounts
  - Enhanced security and integration with the BIRN authentication infrastructure
  - Updated look and feel for improved usability
  - Providing online documentation and tutorials

The problem with portals … is that you rely on the integrity of the gatekeeper.

Benefits of a Data Grid
- Uniform interface for connecting to heterogeneous distributed data resources: allows any "grid enabled" tool to interact with data no matter where, or on what system, it is located
- Allows for the seamless creation and management of distributed data sets: distributed data appear as a single managed collection to both users and tools
- Access is managed using grid authentication through the BIRN Portal
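To make the "uniform interface" idea concrete, here is a minimal Java sketch; the `GridDataResource` interface, the `LocalFileResource` class, and all names in it are hypothetical illustrations, not BIRN or SRB APIs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical sketch of the data-grid idea: tools program against one
// interface regardless of where, or on what system, the data actually lives.
interface GridDataResource {
    InputStream open(String logicalName) throws IOException;        // read by logical name
    void store(String logicalName, byte[] data) throws IOException; // write by logical name
}

// A local file system is one possible backing resource; an SRB-backed
// implementation of the same interface would serve the same client code.
class LocalFileResource implements GridDataResource {
    private final String root;
    LocalFileResource(String root) { this.root = root; }
    public InputStream open(String logicalName) throws IOException {
        return Files.newInputStream(Paths.get(root, logicalName));
    }
    public void store(String logicalName, byte[] data) throws IOException {
        Files.write(Paths.get(root, logicalName), data);
    }
}

public class GridClient {
    public static void main(String[] args) throws IOException {
        // To a "grid enabled" tool, the collection looks the same either way.
        GridDataResource resource = new LocalFileResource("/tmp");
        resource.store("scan001.hdr", new byte[] {1, 2, 3});
        try (InputStream in = resource.open("scan001.hdr")) {
            System.out.println("first byte: " + in.read());
        }
    }
}
```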

Security: Access and Audit
Intuitive interfaces to core infrastructure (e.g., the BIRN Virtual Data Grid) and services (e.g., full auditing on BIRN data or image viewing)

Google is not a portal…
"Carrot juice cures piles": a result? (From Ken Peach, Rutherford Labs, UK)

If you dig deep enough you may get what you want (but perhaps not exactly what you need)
[Slide: search results for "carrot juice cures piles" (1,680 hits), including "Drink a juice of turnip leaves, spinach, water cress and carrots (equal quantity)"]

Example of Data Mediation within BIRN
Find all joint projects between UCSD and Duke with relevance to Lewy Body Disease
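If the mediator exposes such a question as a single integrated view, a client could issue it through standard JDBC, as in the hedged sketch below; the connection URL, credentials, view name (`joint_projects_view`), and column names are all hypothetical, not the BIRN mediator's actual interface.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class MediatedQueryExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint: the mediator presents distributed source
        // databases as if they were one database with integrated views.
        String url = "jdbc:postgresql://mediator.example.org/birn";
        try (Connection conn = DriverManager.getConnection(url, "demo", "secret");
             PreparedStatement stmt = conn.prepareStatement(
                 "SELECT project_id, title FROM joint_projects_view " + // invented view name
                 "WHERE site_a = ? AND site_b = ? AND disease = ?")) {
            stmt.setString(1, "UCSD");
            stmt.setString(2, "Duke");
            stmt.setString(3, "Lewy Body Disease");
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("project_id") + ": "
                            + rs.getString("title"));
                }
            }
        }
    }
}
```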

Benefits of Data Mediation
- Provide means to locate, access, and interrogate data contained in distributed databases
- Can add new resources without modifying existing data resources
- Promote flexible views on top of the data
- Semantically and spatially integrate multi-scale and multi-modal data

BIRN Data Mediation
- Version 2.0 of the BIRN mediator is currently in alpha testing (as a core component of the BIRN 2.0 release)
- Improvements in the new release:
  - Enhanced query performance
  - Updated registration, query, and view-building tools
  - Support for PostgreSQL databases
  - Integration with the BIRN authentication infrastructure
- BIRN-CC is exploring additional data mediation approaches with collaborators:
  - Yale: Query Integrator System (QIS)
  - GEON: IBM Information Integrator

From Vision to Reality
- "It's all in the software": "It's not a bug, it's a feature"; "That will be in the next version"; "When is the next version?"
- "I just want to open a file"
- "I need to monitor and control who accesses my data"
- "How do I locate data of interest to me?"
- "I need a boatload of computing; how do I find it?"
- "Why the heck isn't this easier?"

BIRN Grid Testbed Sites: new sites and collaborative projects are being added

We Began with Standard Hardware
- This jumpstarted BIRN for functionality
- The software footprint is managed from the BIRN Coordinating Center: integration of domain tools, middleware, OS, updates, and more
- BIRN expansion and upgrades of existing sites must have a more generic (and less expensive) hardware footprint

BIRN CC Software Concerns & Operations
- Deploy/manage/update common services:
  - Portal/website
  - Security infrastructure
  - Metadata catalog (SRB MCAT)
  - Mediator registry
  - Source code repository
  - Java application servers
- Deploy/manage/update site racks:
  - Enterprise Linux
  - Databases and data grid clients
  - Mediated data resources
  - BIRN applications (e.g., LONI, 3D Slicer, FreeSurfer, …)

"It's All in the Software"
- Critical issues: What is the BIRN software stack? When is it updated? What services are supported?
- Integrated releases of all BIRN software, with defining components drawing input and software from all of BIRN:
  - Candidate software: 3 months prior to release
  - Alpha phase (functionality freeze): 2 months prior to release
- Defined schedule:
  - April/October releases, with a 1-month beta cycle
  - Pre-alpha is defined now; part of this meeting should be to prioritize components for the April '05 release

More on Software Releases
- A defined release cycle is intended to:
  - Provide software stability for users and developers
  - Allow everyone to plan for when system changes will occur
- As a whole, BIRN will need to prioritize what goes into a release; there are limited people and testing resources
- Transferring software is not a trivial task; packaging uncovers system assumptions
- We use Rocks to define "appliances": 100% automated configuration of endpoints and services
- BIRN tools need to be transferable to other NIH projects

I Just Want to Open a File…
- BIRN has been built upon data collections:
  - Data is copied in and out of the data grid
  - Metadata allows transparent location and querying
  - Requires scripts or changes to code
- Distributed file system layer:
  - Experimenting with AFS
  - Feasibility and performance of developing an SRB file system (SRBFS) are not clear

[Slide diagram: a BIRN application workflow drawing on mediated data (mediator, under development for v2.0), collection data and databases (Oracle/PostgreSQL and SRB), a distributed file system, and the local/NFS file system of a standard OS.]
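The two access patterns contrasted here can be sketched in Java as follows; `DataGridClient` and both of its methods are hypothetical placeholders for an SRB-style client, not an actual BIRN API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class FileAccessPatterns {

    // Hypothetical stand-in for an SRB-style collection client; the interface
    // and method names are invented for illustration.
    interface DataGridClient {
        void copyOut(String logicalName, Path localTarget) throws IOException; // staged copy
        InputStream openStream(String logicalName) throws IOException;         // direct open
    }

    // Pattern 1: today's collection model -- stage the file out of the data
    // grid, then work on a purely local copy (requires scripts/changed code).
    static void stagedAccess(DataGridClient grid) throws IOException {
        Path local = Paths.get("/tmp/scan001.img");
        grid.copyOut("/birn/collections/scan001.img", local);
        byte[] data = Files.readAllBytes(local); // the tool only ever sees a local path
        System.out.println("staged bytes: " + data.length);
    }

    // Pattern 2: the distributed-file-system layer the slide asks for --
    // open/read/close work as expected, with no explicit staging step.
    static void directAccess(DataGridClient grid) throws IOException {
        try (InputStream in = grid.openStream("/birn/collections/scan001.img")) {
            System.out.println("first byte: " + in.read());
        }
    }
}
```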

I Need to Monitor Access to My Files
- Authentication (identification): GSI certificates, managed transparently by the Portal via username/password
  - We have developed a Java class to encapsulate GSI functionality and ease the development of GSI-aware software
- Access control is already built into data collection management (authorization)
  - As we introduce other data modalities, we need to develop a useful vocabulary and translate it to specific software systems (e.g., SRB, Oracle/PostgreSQL table security, AFS, GridFTP, …)
- There is a dearth of community tools to build upon here; BIRN can help drive the community
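A minimal sketch of such a wrapper, assuming only the standard Java GSS-API (`org.ietf.jgss`), which the Globus Java CoG Kit can back with a GSI mechanism; this is illustrative, not BIRN's actual class.

```java
import org.ietf.jgss.GSSCredential;
import org.ietf.jgss.GSSException;
import org.ietf.jgss.GSSManager;

// Sketch in the spirit of the slide: hide credential handling behind one
// small class so application code never touches the GSS-API directly.
public class GridCredentialHelper {

    private final GSSCredential credential;

    public GridCredentialHelper() throws GSSException {
        // Uses whatever GSS mechanism the runtime provides; with the Globus
        // Java CoG Kit installed, this can be a GSI proxy credential. Throws
        // a GSSException if no default credential is available.
        GSSManager manager = GSSManager.getInstance();
        this.credential = manager.createCredential(GSSCredential.INITIATE_ONLY);
    }

    /** Seconds until the credential (e.g., a GSI proxy) expires. */
    public int remainingLifetime() throws GSSException {
        return credential.getRemainingLifetime();
    }

    /** The authenticated identity, e.g., a certificate distinguished name. */
    public String principalName() throws GSSException {
        return credential.getName().toString();
    }
}
```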

I Need to Locate Data of Interest to Me
- Two ways now:
  - Metadata attached to collection-managed data, e.g., "retrieve all DAT-KO MRI images"
  - Data mediator: gives the illusion of a single database and exposes new relationships among separate databases
- What about distributed file systems?
  - You get pathnames only
  - You can vi (or emacs) a file; that is, read/write/open/close work as expected
  - It is reasonable to look at a DFS as a stepping stone: very useful as community working directories, where metadata is less important but access control is critical
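The attribute-based lookup ("retrieve all DAT-KO MRI images") can be illustrated with a small in-memory model of an MCAT-style catalog; the attribute names (`strain`, `modality`) and entries are assumptions for illustration, and the sketch needs Java 16+ for records.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MetadataQueryExample {

    // A collection-managed object: a logical name plus attribute/value
    // metadata, loosely modeled on what an MCAT-style catalog records.
    record Entry(String logicalName, Map<String, String> attrs) {}

    // Select by attributes rather than by pathname.
    static List<String> query(List<Entry> catalog, String strain, String modality) {
        return catalog.stream()
                .filter(e -> strain.equals(e.attrs().get("strain"))
                        && modality.equals(e.attrs().get("modality")))
                .map(Entry::logicalName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Entry> catalog = List.of(
                new Entry("/birn/mouse/scan001.img",
                        Map.of("strain", "DAT-KO", "modality", "MRI")),
                new Entry("/birn/mouse/scan002.img",
                        Map.of("strain", "WT", "modality", "MRI")));
        // Prints [/birn/mouse/scan001.img]
        System.out.println(query(catalog, "DAT-KO", "MRI"));
    }
}
```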

I Need a Boatload of Computing…
- JHU has been experimenting with using TeraGrid and loading data into the BIRN Data Grid; their storage resources are at 90+% capacity
- Condor is deployed on the racks, but we need to look at use cases and utility
- Automated data management (move my data to the computing) is still clumsy at best
- Pathfinder applications help to more crisply define the software stack

Why the Heck Isn't This Easier?
- It really hasn't been done before, across a significant number of dimensions:
  - Application usage
  - Security requirements
  - Scale of data and of distributed systems
- Software is evolving to be more robust
- Cyberinfrastructure architecture has converged on a services-based implementation: Grid Services are moving to Web Services (within the year)
- We've needed a software rallying point; a regular release schedule should help provide the pacing that we need

BIRN Core Software Infrastructure (recap)
- BIRN builds on evolving community standards for middleware
- Adds new capabilities required by projects
- Performs system integration of domain-specific tools, building a distributed infrastructure
- Utilizes commodity hardware and stable networks for baseline connectivity

All Hands Meeting 2004

The Forest of Stones, Xi'an: ~2,000 years old and still readable without technology

Evolution of the Computational Infrastructure (source: Dr. Deborah Crawford, Chair, NSF Cyberinfrastructure Working Group (CIWG))
[Slide timeline of NSF computational infrastructure investments: prior computing investments and NSF networking, through the Supercomputer Centers (SDSC, NCSA, PSC, CTC), PACI (NPACI and the Alliance), and Terascale (TCS, DTF, ETF), to Cyberinfrastructure. Milestones noted along the way include the Mosaic web browser, the coining of the term "grid" (~metacomputing), and Telescience access to remote resources.]