DART Developing Toolkits for e-Research Dr Jeff McDonell, DART Project Director July 2006.


2 What is DART?
Dataset Acquisition / Accessibility / Annotation e-Research Technology
Acknowledgements: DART is a proof-of-concept project funded by the Department of Education, Science and Training (DEST) to support collaborative research in Australia through the program Managed Environment for Research Repository Infrastructure (MERRI).

3 Why DART?
The Australian government is funding several e-Research projects, like DART, to:
- enable publicly funded research to be publicly available
- support research collaboration
  - collaboration means sharing
  - to be shared, data and information need to be stored, retained, defined and secured
  - many collaborators can then access it
- establish common, standardised software / middleware applications that are adaptable to many research capabilities
- help develop world-class Australian research

4 What is DART trying to achieve?
- To develop software tools to handle the data and information management requirements of the complete research lifecycle
- To collect and manage large datasets associated with instruments such as sensor networks, synchrotrons, telescopes, etc.
- To support collaborative research and annotation needs
- To deal with intellectual property, privacy and security issues
- To create customised portals for research demonstrators
- To handle research publication, discovery and access
Or, to put it another way...

5 DART - version 1

6 DART - version 2

7 DART - version 3

8 What should researchers expect from DART?
- A reliable place to store data, not just the lab server, a home PC or a DVD somewhere!
- Useful tools for researchers to use in their everyday research
- Software tools focused solely on the management of data and information
- Potentially a customised portal applicable to their research field
- A standard and secure method of storing, accessing, analysing and annotating research results
- An easier way to collaborate, share information and publish results

9 DART design criteria
- Identify best-of-breed solutions in each area
- Use Open Standards and choose Open Source software if possible
- Leverage existing work and expertise - don't reinvent the wheel!
- Use real-world research environments to test proof-of-concept
- Identify common frameworks for:
  - security (with help from the MAMS and e-Security Framework projects)
  - network transport
  - integration across all 27 DART work packages

10 DART demonstrators
The three research areas chosen as DART demonstrators are:
- X-Ray Crystallography
- Climatology
- Digital History
The use of demonstrators is designed to show the value of an end-to-end lifecycle approach and to test proof-of-concept outcomes from DART.
Monash, Queensland and James Cook University researchers are involved in all three demonstrators.

11 How to build demonstrators?
The DART demonstrator tasks are to:
- Engage with suitable researchers at each partner university for each of the three selected research areas
- Define the research activities applicable to each area
- Embed Information Management specialists into research teams
- Construct a custom-designed prototype DART portal, incorporating software applications specific to each research discipline
- Progressively refine the model as the DART project adds new features and services

12 DART logistics
DEST funding of A$3.235 million involving:
- 3 partners:
  - Monash University (host) in Melbourne - MU
  - University of Queensland in Brisbane - UQ
  - James Cook University in Townsville - JCU
- 5 technical areas of focus within the DART work packages (WPs)
- 7 Chief Investigators
- 18-month project, expected to be completed by mid 2007
- 27 separate DART work packages
- 40+ project team members!

13 DART chief investigators Andrew Treloar Asad Khan David Abramson Ann Monotti (Project Architect) Jane Hunter Xiaofang Zhou Ian Atkinson

14

15 DART work packages
The 27 work packages (WPs) cover five technical areas:
- Data Collection and Monitoring
- Storage and Interoperability
- Content and Rights
- Annotation and Assessment
- Discovery and Access

16 Data Collection and Monitoring
Developing front-end research processes
1. Connect instruments and sensors effectively to the network - JCU
2. Connect instruments to repositories with Storage Resource Broker (SRB) via the Common Instrument Middleware Architecture (CIMA) - JCU
3. Ensure data is of sufficient quality to warrant curation - JCU
4. Online remote access to working instruments and sensors - UQ
5. Improve intelligence of the storage framework - JCU

18 Storage and Interoperability (part 1)
Developing middleware tools
1. Facilitate distributed data management with Fedora - UQ
2. Improve interoperability between SRB and Fedora - MU
3. Support richer metadata to enhance discovery - UQ
5. Support data replication systems, such as SRB, Globus and GFarm - MU
6. Allow simulation data to be retrieved or dynamically regenerated - MU

19 Storage and Interoperability (part 2)
Developing secure data transfer and storage
4. Secure service for transferring data from instruments and sensors to repositories via the Grid - MU
7. Develop pre-processing system for secondary storage - MU
8. Pilot long-distance, high-speed and secure data transfers between repositories - MU
9. Scope and pilot storage infrastructure requirements - MU

20 Manage Data
1. User requests data acquisition
2. Acquiring CD or DVD static data
3. Acquiring dynamic instrument or sensor data
4. Acquiring static or dynamic SAN data
5. Storing raw data in Primary Storage
6. Pre-processing of the raw data
7. Storing pre-processed data in Secondary Storage
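The storage steps on this slide (5 to 7) can be illustrated with a minimal Python sketch. It is not DART code: the `DataStore` class, the `acquire` function and the source-type labels are hypothetical names chosen for illustration, and an in-memory dictionary stands in for the real primary and secondary storage systems.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical names throughout; the slides do not describe DART's actual components.


@dataclass
class DataStore:
    """In-memory stand-in for a primary or secondary datastore."""
    name: str
    items: Dict[str, bytes] = field(default_factory=dict)

    def put(self, key: str, payload: bytes) -> None:
        self.items[key] = payload


def acquire(source: str, key: str, raw: bytes,
            primary: DataStore, secondary: DataStore,
            preprocess: Callable[[bytes], bytes]) -> str:
    """Run steps 5-7 for data already acquired in steps 1-4."""
    # Steps 2-4 distinguish static media, instrument/sensor and SAN sources.
    if source not in {"static_media", "instrument", "san"}:
        raise ValueError(f"unknown source type: {source}")
    primary.put(key, raw)              # step 5: store raw data in primary storage
    processed = preprocess(raw)        # step 6: pre-process the raw data
    secondary.put(key, processed)      # step 7: store the result in secondary storage
    return key


primary = DataStore("primary")
secondary = DataStore("secondary")
# bytes.strip stands in for a real pre-processing routine.
acquire("instrument", "run-001", b"  frame data  ", primary, secondary, bytes.strip)
```

Keeping the raw copy in primary storage before any processing means the pre-processing step can be re-run later without returning to the instrument.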

21 Potential use of GridSphere as the front-end DART portal

22 Turning Data into Information
Protein crystallography raw data
3D atomic structure of protein after processing

23 Secure Data Replication via the Network to DART Partners
Data collected at JCU, MU and UQ is stored in Primary Datastores.
Processed data is securely transferred to Secondary Datastores.
[Diagram labels: Monash University, James Cook University, University of Queensland, QCIF / QPSF Grid, Primary, Secondary, Sensor Network, AARNet / GrangeNet]

24 Content and Rights
Collecting data sources into institutional repositories
1. Move data from personal repositories into trusted alternatives - MU
2. Reduce barriers to content acquisition by rights assignment for non-Science researchers (Creative Commons) - UQ
3. As above, for Science researchers (Science Commons) - UQ
4. Improve management practices in research communities - MU
5. Assist researchers to deposit datasets and other digital objects into institutional repositories - MU
6. Clarify legal issues around intellectual property (IP), information security and privacy - MU

25 Annotation and Assessment
Including collaboration tools for research
1. Allow researchers to annotate each other's work - UQ
2. Improve annotation and deposit rates by allowing end-user control - UQ
3. Help annotation services contribute to the life and productivity of research communities - UQ
4. Foster wiki-based collaborative work practices in research teams - JCU

27 Discovery and Access
Searching, browsing and discovering resources
1. Improve repository deposit rates, sharing and reuse by user access - MU
2. Improve repository deposit rates, sharing and reuse by improving discoverability - MU
3. Reduce effort for creating metadata schemas and improve interoperability - UQ

28 Discovery examples

29 DART Demonstrators
X-Ray Crystallography
- Focussing on diffractometers and protein crystallography
- Using CIMA instrument interfaces
- James Whisstock (MU), Jenny Martin (UQ) and Ian Atkinson (JCU) are the major researchers involved
Climatology
- Focussing on ocean and atmospheric data, e.g. merging of data around Heron Island (in the Great Barrier Reef) to predict weather
- Amanda Lynch (MU) and Stuart Kininmonth (AIMS) are early adopters

30 DART 3rd Demonstrator
Digital History
- Three key projects with a Humanities and Social Sciences focus:
  - Gugu Badhun Digital History project - JCU
  - Women on Farms project - MU
  - Western Cape Community Agreement project - UQ
- Dealing with video storage and management, annotations, survey data, authorisation and security, community involvement, etc.

31 DART Deliverables

32 What will DART deliver?
By mid 2007, DART aims to provide:
- Practical and workable software tools for researchers to use for their daily data and information management requirements
- Working proof-of-concept demonstrations, including customised research portals
- Strong feedback from researchers in all the demonstrators
- Reports recommending best practice in several areas
- Assessment of the value of the DART integrated lifecycle approach
- A clear understanding of how to turn proof-of-concept into robust production-ready systems
Note: DART WILL NOT be delivering production services!

33 How far has DART progressed?
Fast startup:
- Started Dec 2005, DART now has 40+ staff and researchers on board
Collaborative project:
- 27 WPs in 3 partner universities: Monash, Queensland, James Cook
Effectively managed:
- 7 Chief Investigators, strong Project Office and Board of Management
Grounded in research practice:
- Building three demonstrators with research teams from the 3 partners
Common standards used to develop generic software tools:
- Fedora, GridSphere, SRB, Kepler, XACML, Shibboleth, Annotea, CIMA, Plone

34 Key DART Achievements
- Strong progress in data capture and instrument integration
- Investigating storage and replication of very large datasets (up to petabytes) across diverse networks
- Information Management staff have been placed in key research teams to address their data and information management requirements
- Developing annotation software for 3-D models, video and audio
- IP and privacy are being reviewed by a Law Faculty
- Investigating Creative Commons and Science Commons licensing
- Working to utilise Shibboleth, PKI and Grid security standards
- Developing search tools, metadata schema registry, wiki tools, etc.

35 DARTs in use by the ARCHER?
ARCHER is a new DEST-funded project for 2007 that will take the proof-of-concept outcomes of DART and turn them into production-ready ARCHER software tools.
These tools will be developed into modular middleware web services, customised through dedicated task forces to suit the needs of the:
- nine NCRIS priority research capabilities, plus
- two specialised task forces for the Humanities and Social Sciences

36 Useful DART tools for ARCHER (part 1)
Compute/storage:
- Interface to instruments and sensors, CIMA video and Kepler workflows
- Interface to distributed computing (HPC / Grid)
- Interface between the hardware and DART software tools
- Secure access to large-scale data storage and repositories (SRB, Fedora)
Data quality:
- Pre-analysis of data to automatically detect faulty or degraded data
- Seamless replication of data for backup and disaster recovery
- Support for multiple data replication systems (SRB / GFarm / Globus)
- Efficient, fault-tolerant transfer of large datasets between systems
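To show what verified, fault-tolerant replication could look like, here is a small Python sketch. It does not use the SRB, GFarm or Globus interfaces: a local file copy stands in for the network transfer, and the function names `sha256_of` and `replicate`, the file names and the retry count are all assumptions made for the example.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(src: Path, dst: Path, max_attempts: int = 3) -> str:
    """Copy src to dst, retrying until the replica's checksum matches."""
    expected = sha256_of(src)
    for _ in range(max_attempts):
        shutil.copyfile(src, dst)      # stand-in for a real network transfer
        if sha256_of(dst) == expected:
            return expected
    raise IOError(f"replica of {src} failed verification after {max_attempts} attempts")


with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / "primary.dat"      # hypothetical primary-store file
    replica = Path(workdir) / "secondary.dat"   # hypothetical secondary-store file
    source.write_bytes(b"diffraction frames")
    checksum = replicate(source, replica)
    replica_matches = replica.read_bytes() == source.read_bytes()
```

Comparing checksums after each copy is one common way to catch a corrupted or truncated replica before it is relied on for backup or disaster recovery.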

37 Useful DART tools for ARCHER (part 2)
Software tools:
- Manage metadata, including defining, storing, searching, etc.
- Manage authentication and authorisation, plus data security
- Deal with Science/Creative Commons licensing, IP and privacy issues
- Provide secure annotations for documents, datasets, video, audio, etc.
Usability:
- GridSphere portal to tie all the software tools together
- Migrate data from personal to institutional storage
- Support for legacy applications
- Collaborate using research-centric wiki / weblog communication tools

38 Acknowledgements Without the hard work of all these people, DART would just not happen! UQ JCU MU

39 DART Contacts
- Web: dart.edu.au
Questions?