ARCHER and DART Key projects developing e-Research software tools Jeff McDonell, DART Project Director November 2006.



2 What is DART?
Dataset Acquisition / Accessibility / Annotation e-Research Technology
Acknowledgements: DART is a DEST (Department of Education, Science and Training) funded project to support collaborative research in Australia through the MERRI program (Managed Environment for Research Repository Infrastructure).

3 Why DART?
The Australian government is funding several e-Research projects, such as DART, to:
 enable publicly funded research to be publicly available
 enhance collaboration in research: sharing requires data and information to be stored, retained, defined and secured, so collaborators can access it
 establish common standardised software / middleware applications that are adaptable to many research capabilities

4 What is DART trying to achieve?
 DART is a ‘proof-of-concept’ project to develop software for data and information management requirements for the research lifecycle
 To collect and manage large datasets associated with instruments, such as sensor networks, synchrotrons, telescopes, etc.
 To support collaborative research and annotation needs
 To deal with intellectual property, privacy and security issues
 To create customised portals for research demonstrators
 To handle research publication, discovery and access
Or, to put it another way…

5 To create useful information

6 DART design
 Identify best-of-breed solutions in each area
 Use Open Standards and Open Source software wherever possible
 Leverage existing work and expertise, rather than reinventing the wheel
 Use real-world research environments to test proof-of-concept
 Identify a common security architecture, with help from the MERRI funded MAMS and e-Security Framework projects
 Integrate all 27 DART work packages utilising three demonstrators

7 DART logistics
DEST funding of A$3.235 million for:
 3 partners:
   - Monash University (host) in Melbourne (MU)
   - University of Queensland in Brisbane (UQ)
   - James Cook University in Townsville (JCU)
 7 Chief Investigators
 18-month project, expected to be completed by mid 2007
 27 separate DART work packages
 40+ project team members
 Uses common open source standards: SRB, Fedora, Kepler, GridSphere, Shibboleth, XACML, Annotea, CIMA, Plone, etc.

8 DART work packages
The 27 work packages cover five technical areas:
 Data Collection and Monitoring
 Storage and Interoperability
 Content and Rights
 Annotation and Assessment
 Discovery and Access
The following slides illustrate a few of these work packages.

9 JAINIS – JCU and Indiana Instrument Services


11 Turning Data into Information Protein crystallography raw data 3D atomic structure of protein after processing


13 Demonstrators
These are designed to show the value of an end-to-end lifecycle approach and to test proof-of-concept DART outcomes.
How we build demonstrators:
 Engage with researchers at each of the three partner universities in three selected research areas
 Embed Information Management specialists into the research teams
 Design a customised DART portal, incorporating research applications commonly used in each research discipline
 Refine the DART portal as new features are developed and adapted

14 Three DART Demonstrators
X-Ray Crystallography
 Using JAINIS / CIMA instrument interfaces with diffractometers
 James Whisstock (MU), Jenny Martin (UQ) and Ian Atkinson (JCU)
Climate Research
 Merging ocean and atmospheric data around Heron Island in the Great Barrier Reef, for backcasting and forecasting weather
Digital History (for Humanities and Social Sciences)
 Video storage/management, annotations, authorisation/security
 Gugu Badhun Digital History (JCU)
 Women on Farms (MU)
 Western Cape Community Agreement (UQ)

15 Designing user-focused portals

16 Proposed format of DART menu tabs Rough mockup of a DART portal

17 Key DART Achievements so far…
 Strong progress in data capture and instrument integration
 Storage/replication of very large datasets across diverse networks
 Information Management specialists joined key research teams, to address data and information management requirements
 Developing annotation software for 3-D models, video and audio
 IP and privacy being reviewed by a Law Faculty
 Investigating Creative Commons & Science Commons licensing
 Working to utilise Shibboleth, PKI and Grid security standards
 Developing search tools, metadata schema registry, wiki tools, etc.

18 ARCHER – Australian ResearCH EnviRonment
ARCHER is a new 2007 DEST funded project, currently being planned, to convert DART proof-of-concept outcomes into production-ready ARCHER software tools.
 ARCHER will create several development teams, utilising DART project expertise, to build research-specific tools to handle data and information management requirements
 ARCHER will apply them to specific research capabilities, tailor-making portal-based solutions adaptable to each research capability
 The research areas will include NCRIS 5.1 (Bioinformatics), 5.2 (Atlas of Living Australia) and 5.8 (Biosecurity), plus Humanities & Social Sciences

19 What should researchers expect from ARCHER?
 A reliable method of storing data
 Useful ‘middleware’ or software tools for everyday research, focused on the management of data and information only
 A customised portal adapted to their research needs
 A secure and standardised way of storing, accessing, analysing and annotating their research
 An easier way to collaborate, share information and publish results

20 The proposed ARCHER project structure
Project Management (PM) Team: tightly defined PM team to oversee all aspects of ARCHER
Development Areas: nine (9) technical teams (see the following slides)
Application Teams: we expect 5-6 teams focused on specific research capability needs

21 ARCHER development strategy
 R&D only to the minimum extent required
 Build on the best components from DART that fit the ARCHER mission
 Source appropriate components from elsewhere
 Focus on integration, ruggedisation and packaging

22 ARCHER Development Area #1: Integration Framework
Integration will need to occur at the front end for the user and at the back end for software components:
IF1: JSR-168 based portal environment, utilising GridSphere (ARCHER will also monitor the OGCE activity in the US)
IF2: Workflow support, using one workflow engine, such as BPEL or Kepler, but ARCHER may need to support two workflow engines (e.g. possibly Taverna)

23 ARCHER Development Area #2: Security Architecture
Aiming for single sign-on and sign-off for all tools:
SA1: Support Shibboleth, PKI and Grid security, as required
SA2: As appropriate, Shibbolize ARCHER components and incorporate XACML for encoding access constraints
SA3: Participate in the development of the AUeduPerson schema (an extension to the MACE eduPerson schema)
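
The idea behind SA2, encoding access constraints as rules over user attributes, can be illustrated with a small Python sketch. This is not XACML (real XACML policies are XML documents evaluated by a policy engine) and it is not taken from ARCHER; the rule layout, the attribute names and the `is_permitted` function are all hypothetical, chosen only to show attribute-based, deny-by-default checking.

```python
# Hypothetical access rules: each rule names a resource, the actions it
# permits, and the subject attributes that must all match.
RULES = [
    {
        "resource": "dataset-42",
        "actions": {"read"},
        "subject": {"affiliation": "staff", "project": "crystallography"},
    },
    {
        "resource": "dataset-42",
        "actions": {"read", "annotate"},
        "subject": {"role": "chief-investigator"},
    },
]


def is_permitted(rules, subject, resource, action):
    """Deny by default; permit only if some rule matches completely."""
    for rule in rules:
        if (
            rule["resource"] == resource
            and action in rule["actions"]
            and all(subject.get(k) == v for k, v in rule["subject"].items())
        ):
            return True
    return False


# Example subjects, as attributes might arrive from an identity provider.
staff_member = {"affiliation": "staff", "project": "crystallography"}
investigator = {"role": "chief-investigator"}
visitor = {"affiliation": "guest"}
```

With these rules, `is_permitted(RULES, staff_member, "dataset-42", "read")` is true, while the same user asking to `"annotate"` is refused, because no rule grants that action to those attributes.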

24 ARCHER Development Area #3: Data Collection
Covering small (sensors), medium (instruments) and large (major national facilities) data collection requirements:
DC1: ARCHER will support all three, for specific NCRIS capability areas
DC2: Integrate with workflows, to analyse data at the source and also generate and capture as much metadata as possible
DC3: ARCHER will develop or co-opt capability-specific ontologies to describe experimental components
DC4: Monitor instruments remotely as required, as well as observe samples and data being acquired in real time
DC5: ARCHER will support self-deposition or assisted deposition of existing datasets

25 ARCHER Development Area #4: Metadata
Capture as much metadata as possible for the purposes of data discovery, data re-use and repeatability of results, in a way that impacts as little as possible on the researcher:
MD1: Automatically capture as much metadata as possible and as close to the source as possible
MD2: Provide the user with the ability to correct and augment automatically generated metadata
MD3: Validate captured metadata against a schema (in a registry?)
MD4: Support different schemas in use for different capability areas
MD5: Build on the DART schema registry and extend it for ontologies
MD6: Capture information needed for object preservation
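
A minimal Python sketch of how MD1 to MD4 might fit together is shown here: metadata is captured automatically, the user can correct or extend it, and the result is validated against a schema looked up in a registry. Everything in it is an assumption made for illustration (the field names, the `SCHEMA_REGISTRY` dictionary and the three functions); the slides do not describe the DART schema registry at this level of detail.

```python
from datetime import datetime, timezone

# Hypothetical registry: capability area -> schema.
# Each schema maps a field name to (expected type, required?).
SCHEMA_REGISTRY = {
    "crystallography": {
        "instrument_id": (str, True),
        "captured_at": (str, True),
        "exposure_seconds": (float, True),
        "operator": (str, False),
    },
}


def capture_metadata(instrument_id, exposure_seconds):
    """MD1: record what the software already knows, without asking the user."""
    return {
        "instrument_id": instrument_id,
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "exposure_seconds": float(exposure_seconds),
    }


def apply_corrections(metadata, corrections):
    """MD2: user-supplied values override or augment the captured ones."""
    merged = dict(metadata)  # leave the captured record untouched
    merged.update(corrections)
    return merged


def validate(metadata, schema):
    """MD3: return a list of problems; an empty list means valid."""
    problems = []
    for field, (expected_type, required) in schema.items():
        if field not in metadata:
            if required:
                problems.append(f"missing required field: {field}")
            continue
        if not isinstance(metadata[field], expected_type):
            problems.append(f"{field}: expected {expected_type.__name__}")
    for field in metadata:
        if field not in schema:
            problems.append(f"unknown field: {field}")
    return problems
```

A typical flow would be `record = capture_metadata("diffractometer-1", 2)`, then `apply_corrections(record, {"operator": "A. Researcher"})`, then `validate(..., SCHEMA_REGISTRY["crystallography"])`. Supporting another capability area (MD4) would mean adding another entry to the registry rather than changing the code.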

26 ARCHER Development Area #5: Object Storage
This is a critical aspect of ARCHER; however, a number of NCRIS research capabilities already have preferred models for object storage:
OS1: Provide and support SRB for NCRIS areas that do not have a preference for object storage technologies
OS2: Implement and optimise a metadata service based on SRB
OS3: Develop YourSRB / PGL to support offline storage for fieldwork as well as Peer-to-Peer (P2P) data management

27 ARCHER Development Area #6: Data Analysis
Data analysis will be very research capability specific. There is a need to access existing visualisation and formatting codes or APIs. Portals provide an access framework and, where source code is available, a generic approach to ‘portalising’ this code and embedding it into workflows may be possible, to link the code to data repositories:
DA1: Provide support for integration of third party data analysis packages

28 ARCHER Development Area #7: Collaboration Tools
There are a large number of existing candidate tools to use or extend, such as calendar, collaboration workspaces, Instant Messaging (IM), etc. ARCHER may need to provide some of these tools because of interoperability issues between some tools (such as IM), due to a lack of open standards or vendor desire to lock in users:
CT1: Focus on e-Research tools that are not likely to be offered by institutions housing NCRIS personnel
CT2: Continue work on collaborative workspaces (wikis / Plone)
CT3: Continue work on collaborative annotation tools, possibly including further development or adoption of existing toolkits

29 ARCHER Development Area #8: Data Publishing
The ARCHER project will extend as far as making data publicly available, possibly through a journal/society based repository or an institutional repository. It could also be research capability or community specific:
DP1: Provide a single interface for users that allows them to select the data repository to publish to, including the ability to add or edit the metadata that will accompany the digital object. Allow users to select a license (either from a repository list or a user-defined list), as well as specify access control restrictions
DP2: Provide services to assist people in exporting their data in a standardised format and submitting it to journal web sites

30 ARCHER Development Area #9: Search Functionality
This will need to be research capability specific, although some functions will be common:
SF1: Provide search across a single repository instance based on metadata
SF2: Provide P2P search facility
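
As a rough illustration of SF1, the Python sketch below searches one repository instance by metadata. The in-memory `REPOSITORY` list, its sample records and the `search` function are hypothetical stand-ins; a production repository would query an index or a metadata service rather than a Python list.

```python
# Hypothetical stand-in for a single repository instance:
# each record is an object identifier plus a flat metadata dictionary.
REPOSITORY = [
    {"id": "obj-1", "metadata": {"discipline": "crystallography",
                                 "title": "Lysozyme diffraction run",
                                 "year": 2006}},
    {"id": "obj-2", "metadata": {"discipline": "climate",
                                 "title": "Heron Island sea temperature",
                                 "year": 2006}},
    {"id": "obj-3", "metadata": {"discipline": "crystallography",
                                 "title": "Serpin structure refinement",
                                 "year": 2007}},
]


def matches(value, wanted):
    """Strings match on case-insensitive substring; other values on equality."""
    if isinstance(value, str) and isinstance(wanted, str):
        return wanted.lower() in value.lower()
    return value == wanted


def search(repository, **criteria):
    """Return the ids of records whose metadata satisfies every criterion."""
    return [
        record["id"]
        for record in repository
        if all(
            field in record["metadata"]
            and matches(record["metadata"][field], wanted)
            for field, wanted in criteria.items()
        )
    ]
```

For example, `search(REPOSITORY, discipline="crystallography", year=2007)` narrows the three sample records to `obj-3`. A P2P search (SF2) could, in principle, run the same query against several such instances and merge the results, though the slides do not say how that would be done.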

31 ARCHER Application Teams
These teams will adapt the ARCHER software tools developed by the nine specialised teams described above, to meet the data and information management requirements of a number of specific research capability areas. After input from DEST and NCRIS, it is expected that the following teams will be established:
 NCRIS 5.1 (Bioinformatics)
 NCRIS 5.2 (Atlas of Living Australia)
 NCRIS 5.8 (Biosecurity)
 Humanities & Social Sciences
 Other NCRIS team(s), as required

32 ARCHER Future Service and Support
Stage 1: DART, proof-of-concept, 2006
Stage 2: ARCHER, production-ready, 2007
Stage 3: ??, service delivery, 2008
Funding for 2008 and beyond is required for the third stage of the DART / ARCHER projects, in order to deliver long-term sustainable software services.

33 Contacts Web: archer.edu.au and dart.edu.au Project  Phone Chair, Board of  Phone

34 DART chief investigators Andrew Treloar Asad Khan David Abramson Ann Monotti (Project Architect) Jane Hunter Xiaofang Zhou Ian Atkinson

35 Acknowledgements Some of the people in the DART project… UQ JCU MU