ARCHER Overview, October 2008

Presentation transcript:

ARCHER Overview October 2008

2 e-Research Challenges
- Acquiring data from instruments
- Storing and managing large quantities of data
- Processing large quantities of data
- Sharing research resources and work spaces between institutions
- Publishing large datasets and related research artifacts
- Searching and discovering

3 Research process
- Researcher grows crystal
- Crystal exposed to X-rays & diffraction pattern detected
- Detector generates raw data
- Data stored to SRB
- Monitor telemetry during file generation
- Analysis begins during data generation
- Analysis performed on the grid
- Workflow automates analysis
- Analysis accessed by collaborators
- Iterative analysis saved to SRB
- Results published in PDB & other repositories
- Metadata associated with raw data

4 ARCHER - Australian ResearCh Enabling enviRonment
Building generic research data management infrastructure:
- ARCHER Research Repository
- Distributed Integrated Multi-Sensor & Instrument Middleware (concurrent data capture and analysis)
- Scientific Dataset Manager (Web)
- Scientific Dataset Manager (Desktop Client)
- Metadata Editing Tool
- Analysis Workflow Automation Tool
- Collaborative and Adaptable Research Portal Development Tool
Work on Shibboleth enhancements and security requirements with the AAF
Completed some customised deployments
Developed by Monash University, James Cook University, and University of Queensland
Funded by DIISR/DEST, through the SII (Systemic Infrastructure Initiative)
ARCHER will be completed by September 2008

5 ARCHER: Building generic tools for a secure, seamless, and collaborative e-Research space
[Diagram labels: Acquire; Publish; Publication Repositories; Instruments; Manual; Research Repositories; Computational Grids]
- Dataset Acquisition
- Dataset Management (Web)
- Dataset Management (Desktop)
- Collaborative Workspaces
- Workflow Automation
- Metadata Management

6 ARCHER: Data-centric Model
[Diagram labels: Federation; IdP; Research Repository (SRB & iCat); Repository Web Access (xdms, plone); Collaboration Environment (plone); Automated Instrument Data Deposition; Service Provider; Repository Desktop Access (Hermes); IdP; Shib Protected; PKI; Workflow/Analysis Automation]

7 An Example ARCHER Deployment

8 ARCHER Research Repository
A place for Researchers to store their research data
- Easily accessible
  - Federated access - aligns with the AAF
  - Research data can be accessed by web, desktop, or standard file access protocols (e.g. GridFTP and SRB)
- Capable of managing large datasets
  - Built on SRB
- Rich metadata
  - Core metadata based on CCLRC’s* Scientific Metadata Model
  - Flexible metadata available for samples, datasets, and datafiles
- Secure
* Now the Science and Technology Facilities Council

9 Simplified CCLRC Scientific Metadata Model
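
The repository slide mentions flexible metadata at the sample, dataset, and datafile levels. A minimal sketch of such a hierarchy is given here for illustration only. The class names, the nesting of datasets under samples, the free-form key/value `metadata` dictionaries, and all example values are assumptions, not the actual CCLRC model, which defines more entities than shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Datafile:
    name: str
    size_bytes: int
    # Flexible metadata modelled as free-form key/value pairs (assumption).
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Dataset:
    name: str
    datafiles: List[Datafile] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)

    def total_size(self) -> int:
        """Sum the sizes of all datafiles in this dataset."""
        return sum(f.size_bytes for f in self.datafiles)


@dataclass
class Sample:
    name: str
    datasets: List[Dataset] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)


# Hypothetical example values, for illustration only.
sample = Sample("crystal-01", metadata={"space_group": "P1"})
dataset = Dataset("run-001", metadata={"detector": "example-detector"})
dataset.datafiles.append(Datafile("frame_0001.osc", 1024))
dataset.datafiles.append(Datafile("frame_0002.osc", 2048))
sample.datasets.append(dataset)
```

Attaching a metadata dictionary at every level mirrors the "flexible metadata" point on the slide; a real deployment would presumably constrain those keys with a schema.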

10 Distributed Integrated Multi-Sensor & Instrument Middleware (DIMSIM)
Concurrent data capture & analysis
- Allows multiple sensors to be easily integrated
- Enables instruments to be more easily accessible over a network
- Automatically deposits instrument datasets into a designated research repository
- Easily accessible telemetry
- Enables concurrent analysis
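
The automatic deposit behaviour can be illustrated with a small polling sketch. This is not DIMSIM's implementation: the function name `deposit_new_files`, the polling approach, and the `deposit` callback (standing in for an upload to a research repository) are all assumptions made for the example.

```python
import os
import tempfile
from typing import Callable, List, Set


def deposit_new_files(watch_dir: str, seen: Set[str],
                      deposit: Callable[[str], None]) -> List[str]:
    """Deposit every regular file in watch_dir not seen before; return the new names."""
    new_names = sorted(
        name for name in os.listdir(watch_dir)
        if name not in seen and os.path.isfile(os.path.join(watch_dir, name))
    )
    for name in new_names:
        deposit(os.path.join(watch_dir, name))  # hypothetical repository upload
        seen.add(name)
    return new_names


# A temporary directory stands in for the instrument's output folder.
deposited: List[str] = []
seen: Set[str] = set()
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "image_001.osc"), "w").close()
    first = deposit_new_files(d, seen, deposited.append)
    open(os.path.join(d, "image_002.osc"), "w").close()
    second = deposit_new_files(d, seen, deposited.append)
```

Because each call handles only the files that arrived since the previous one, analysis could in principle start on early frames while the instrument is still writing later ones.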

11 DIMSIM for Crystallographers
[Diagram labels: Diffractometer; OSC Images; Sensors (lab environment etc.); Images; SRB; MCAT; Disk/Tape Storage (multiple locations); Useful stuff!]

12 DIMSIM – Example Telemetry

13 XDMS: Scientific Dataset Manager (Web)
A web tool for Researchers to manage and curate their research data
- Formalised research data management
  - Directory structure follows CCLRC’s Scientific Metadata Model
  - Suitable for dataset collection/analysis/publication
  - Create/Read/Update/Delete support
- Powerful search capabilities
- Automatic metadata extraction from research datafiles
- Rich metadata editing capabilities (via MDE)
- Secure and accessible
  - Federated access
  - Aligns with the AAF (Australian Access Federation)
  - Protected by Shibboleth
- Utilises Handles (persistent identifiers) for external links
- Dataset export to Fedora

14 XDMS

15 Metadata Editing Tool (MDE)
Schema driven metadata editing for e-Research
The key innovation of MDE is that it is a schema-driven editor. MDE uses the schema to build a Web 2.0 form layout for the metadata. The layout includes the following:
- Form elements for displaying the existing metadata elements, with type-specific input controls for entering the values. These include such things as number and date validation, and pull-downs for controlled lists.
- Element descriptions available as hover-text.
- Controls for creating and deleting elements based on what the schema allows and requires.
When the user decides to save the metadata record, it undergoes complete validation against the schema. The validation process checks that:
- the elements in the record are all defined in the schema and present in the correct number,
- the values of the elements satisfy any type restrictions defined by the schemas; e.g. elements defined as integers should consist of digits with an optional leading sign,
- schema-specific constraints on the record and individual elements are all satisfied.
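
The first two validation checks described above can be sketched as follows. This is an illustrative example, not MDE's code: the dictionary-based `SCHEMA` format, the element names, and the `validate` function are hypothetical, and the third check (schema-specific constraints) is left out.

```python
import re
from typing import Dict, List

# Hypothetical schema: element name -> value type and allowed occurrence range.
SCHEMA = {
    "title": {"type": "string", "min": 1, "max": 1},
    "exposure_count": {"type": "integer", "min": 0, "max": 1},
    "keyword": {"type": "string", "min": 0, "max": 3},
}

# Integers: digits with an optional leading sign, as described on the slide.
INTEGER = re.compile(r"[+-]?\d+")


def validate(record: Dict[str, List[str]], schema=SCHEMA) -> List[str]:
    """Return a list of error messages; an empty list means the record is valid."""
    errors = []
    # Check 1a: every element in the record must be defined in the schema.
    for name in record:
        if name not in schema:
            errors.append(f"{name}: not defined in schema")
    for name, rule in schema.items():
        values = record.get(name, [])
        # Check 1b: each element must occur the permitted number of times.
        if not rule["min"] <= len(values) <= rule["max"]:
            errors.append(
                f"{name}: expected {rule['min']}..{rule['max']} occurrences, "
                f"got {len(values)}"
            )
        # Check 2: values must satisfy the element's type restriction.
        if rule["type"] == "integer":
            for value in values:
                if not INTEGER.fullmatch(value):
                    errors.append(f"{name}: {value!r} is not an integer")
    return errors
```

For example, `validate({"title": ["Run 1"], "exposure_count": ["-12"]})` returns an empty list, while a record with an undefined element, a missing `title`, or a non-integer `exposure_count` yields one message per problem.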

16 Metadata Editing Tool

17 Hermes: Scientific Dataset Manager (Desktop Client)
A desktop tool for Researchers to transfer/manage their research data
- Doesn’t have the timeout issues that web apps experience with large data transfers
- Platform-independent (written in Java)
- Federated access
  - Aligns with the AAF (Australian Access Federation)
  - Protected by Shibboleth and PKI technologies
- Dock-able file browser
- Supports many different types of file systems (gftp, srb, cifs etc.)
- Freedom to access the storage system of choice
- Supports plugins, which interface to the institution’s metadata repository
- Addition of customised views of metadata repositories
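
One way to picture support for several file systems through plugins is a registry keyed by URI scheme. The sketch below is an assumption-laden illustration in Python (Hermes itself is written in Java): the `register` decorator, the `list_location` function, and the placeholder backends are hypothetical and make no real SRB or GridFTP calls.

```python
from typing import Callable, Dict, List
from urllib.parse import urlparse

# Registry mapping a URI scheme to a function that lists a location.
_BACKENDS: Dict[str, Callable[[str], List[str]]] = {}


def register(scheme: str):
    """Decorator that registers a listing function for one URI scheme."""
    def decorator(func):
        _BACKENDS[scheme] = func
        return func
    return decorator


@register("srb")
def list_srb(path: str) -> List[str]:
    return [f"srb listing of {path}"]  # placeholder, no real SRB call


@register("gftp")
def list_gridftp(path: str) -> List[str]:
    return [f"gridftp listing of {path}"]  # placeholder, no real GridFTP call


def list_location(uri: str) -> List[str]:
    """Dispatch to the backend registered for the URI's scheme."""
    parts = urlparse(uri)
    try:
        backend = _BACKENDS[parts.scheme]
    except KeyError:
        raise ValueError(f"no plugin registered for scheme {parts.scheme!r}")
    return backend(parts.path)
```

With this layout, adding support for another storage system means registering one more function; the browsing code that calls `list_location` does not change.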

18 Hermes

19 Hydrant: Analysis Workflow Automation Tool
Streamlining Analysis
- Web based portal which sits on top of the core Kepler engine
- Easy for researchers to reproduce or modify an analysis
  - Analysis is described by a workflow
  - Workflow is in XML form and can be presented on the web visually
  - Workflow can be executed on a workflow engine from the web
  - Researchers can easily modify aspects of workflow from the web
  - Researchers can share their workflows
- Secure and accessible
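
Because the workflow is held as XML, modifying an aspect of an analysis amounts to editing a document and re-serialising it. The sketch below illustrates the idea with Python's standard `xml.etree.ElementTree`. The XML layout, the step and parameter names, and the `set_param` helper are hypothetical; this is not the actual Kepler or Hydrant workflow format.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow document; not the actual Kepler/Hydrant format.
WORKFLOW_XML = """
<workflow name="integrate-frames">
  <step id="index" tool="indexer">
    <param name="resolution_limit">2.0</param>
  </step>
  <step id="integrate" tool="integrator">
    <param name="batch_size">50</param>
  </step>
</workflow>
"""


def set_param(root, step_id, name, value):
    """Change one parameter of one step; raise KeyError if it is absent."""
    for step in root.findall("step"):
        if step.get("id") == step_id:
            for param in step.findall("param"):
                if param.get("name") == name:
                    param.text = value
                    return
    raise KeyError(f"{step_id}/{name}")


root = ET.fromstring(WORKFLOW_XML)
set_param(root, "index", "resolution_limit", "1.8")

# The step order can drive a visual presentation; the XML can be shared or re-run.
step_order = [step.get("id") for step in root.findall("step")]
modified_xml = ET.tostring(root, encoding="unicode")
```

Keeping the whole analysis in one document is what makes it straightforward to reproduce: sharing `modified_xml` shares both the steps and the exact parameter values used.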

20 Hydrant

21 ARCHER Enhanced Plone: Collaborative and Adaptable Research Portal Development Tool
Bringing Researchers together
- Simplifies research portal development
  - Easy to author and manage own web content
- Enables sharing, management, and discussions of documents
- Built on Plone
  - Open source Content Management System (CMS)
- Powerful search capabilities
- Secure and accessible
  - Federated access - aligns with the AAF (Australian Access Federation)
  - Protected by Shibboleth
- Access to the ARCHER Research Repository

22 Plone – an Example Portal

23 ARCHER Expected Tool Usage
[Diagram labels: High-end Users; Low-end Users; Level of user of e-Research Infrastructure]
- ARCHER Enhanced Plone (Collaborative/Adaptable Research Portal Dev Tool)
- Own Tools
- ARCHER Research Repository
- Hermes (desktop client research data manager and file transfer agent)
- XDMS (web based research data manager and curator)
- DIMSIM (Distributed Integrated Multi-sensor and Instrument Middleware)
- Hydrant (Workflow Automation Tool)

24 e-Research Repository Space

25

26 Discipline Specific Federated Search

27 Create Project Package Data Upload data

28 ARCHER Deployment: National Breast Cancer Foundation

29 ARCHER Deployment: ARCS Data Fabric

30 Coming ARCHER Deployment: Protein Crystallography

31 What researchers can expect from ARCHER
- A place to collect, store and manage experimental data
- Software tools focused on management of data & information
- Standardised and secure method of storing, accessing, and analysing research results
- Easier collaboration and sharing of research datasets & information

32 Future of ARCHER
- Currently testing the tools for release by late September
- Expecting that the partners will continue to develop the tools they created
- New enhanced versions already being worked on
- Looking at how these tools might be used within ANDS (Australian National Data Service) & ARCS (Australian Research Collaborative Service)

33 For more information…
Contact: Anthony Beitz, ARCHER Portal & Dataset Product Manager
Ph:
See: ARCHER Website: Demos: