ARCHER and DART Key projects developing e-Research software tools Jeff McDonell, DART Project Director November 2006.


1 ARCHER and DART: Key projects developing e-Research software tools
Jeff McDonell, DART Project Director, November 2006

2 What is DART?
Dataset Acquisition / Accessibility / Annotation e-Research Technology
Acknowledgements: DART is a DEST-funded project (Department of Education, Science and Training) to support collaborative research in Australia through the MERRI program (Managed Environment for Research Repository Infrastructure)

3 Why DART?
The Australian government is funding several e-Research projects, such as DART, to:
• enable publicly funded research to be publicly available
• enhance collaboration in research: sharing requires data and information to be stored, retained, defined and secured, so collaborators can access it
• establish common standardised software / middleware applications that are adaptable to many research capabilities

4 What is DART trying to achieve?
• DART is a ‘proof-of-concept’ project to develop software for the data and information management requirements of the research lifecycle
• To collect and manage large datasets associated with instruments such as sensor networks, synchrotrons, telescopes, etc.
• To support collaborative research and annotation needs
• To deal with intellectual property, privacy and security issues
• To create customised portals for research demonstrators
• To handle research publication, discovery and access
or, to put it another way…

5 To create useful information

6 DART design
• Identify best-of-breed solutions in each area
• Use Open Standards and Open Source software wherever possible
• Leverage existing work and expertise, not reinventing the wheel
• Use real-world research environments to test proof-of-concept
• Identify a common security architecture, with help from the MERRI-funded MAMS and e-Security Framework projects
• Integrate all 27 DART work packages utilising three demonstrators

7 DART logistics
DEST funding of A$3.235 million for:
• 3 partners:
  - Monash University (host) in Melbourne (MU)
  - University of Queensland in Brisbane (UQ)
  - James Cook University in Townsville (JCU)
• 7 Chief Investigators
• 18-month project, expected to be completed by mid-2007
• 27 separate DART work packages
• 40+ project team members
• Uses common open source standards: SRB, Fedora, Kepler, GridSphere, Shibboleth, XACML, Annotea, CIMA, Plone, etc.

8 DART work packages
The 27 work packages cover five technical areas:
• Data Collection and Monitoring
• Storage and Interoperability
• Content and Rights
• Annotation and Assessment
• Discovery and Access
The following slides illustrate a few of these work packages.

9 JAINIS – JCU and Indiana Instrument Services

11 Turning Data into Information
[Figure: protein crystallography raw data; 3D atomic structure of protein after processing]

13 Demonstrators
These are designed to show the value of an end-to-end lifecycle approach and to test proof-of-concept DART outcomes.
How we build demonstrators:
• Engage with researchers at each of the three partner universities in three selected research areas
• Embed Information Management specialists into the research teams
• Design a customised DART portal, incorporating research applications commonly used in each research discipline
• Refine the DART portal as new features are developed and adapted

14 Three DART Demonstrators
X-Ray Crystallography
• Using JAINIS / CIMA instrument interfaces with diffractometers
• James Whisstock (MU), Jenny Martin (UQ) and Ian Atkinson (JCU)
Climate Research
• Merging ocean and atmospheric data around Heron Island in the Great Barrier Reef, for backcasting and forecasting weather
Digital History (for Humanities and Social Sciences)
• Video storage/management, annotations, authorisation/security
• Gugu Badhun Digital History (JCU)
• Women on Farms (MU)
• Western Cape Community Agreement (UQ)

15 Designing user-focused portals

16 Proposed format of DART menu tabs
[Figure: rough mockup of a DART portal]

17 Key DART Achievements so far…
• Strong progress in data capture and instrument integration
• Storage/replication of very large datasets across diverse networks
• Information Management specialists joined key research teams, to address data and information management requirements
• Developing annotation software for 3-D models, video and audio
• IP and privacy being reviewed by a Law Faculty
• Investigating Creative Commons & Science Commons licensing
• Working to utilise Shibboleth, PKI and Grid security standards
• Developing search tools, metadata schema registry, wiki tools, etc.

18 ARCHER – Australian ResearCH EnviRonment
ARCHER is a new DEST-funded project for 2007, currently being planned, to convert DART proof-of-concept outcomes into production-ready ARCHER software tools.
• ARCHER will create several development teams, utilising DART project expertise, to build research-specific tools to handle data and information management requirements
• ARCHER will apply these tools to specific research capabilities, tailor-making portal-based solutions adaptable to each research capability
• The research areas will include NCRIS 5.1 (Bioinformatics), 5.2 (Atlas of Living Australia) and 5.8 (Biosecurity), plus Humanities & Social Sciences

19 What should researchers expect from ARCHER?
• A reliable method of storing data
• Useful ‘middleware’ or software tools for everyday research, focused on the management of data and information only
• A customised portal adapted to their research needs
• A secure and standardised way of storing, accessing, analysing and annotating their research
• An easier way to collaborate, share information and publish results

20 The proposed ARCHER project structure
• Project Management (PM) Team: tightly defined PM team to oversee all aspects of ARCHER
• Development Areas: nine (9) technical teams (see the following slides)
• Application Teams: we expect 5-6 teams focused on specific research capability needs

21 ARCHER development strategy
• R&D only to the minimum extent required
• Build on the best components from DART that fit the ARCHER mission
• Source appropriate components from elsewhere
• Focus on integration, ruggedisation and packaging

22 ARCHER Development Area #1: Integration Framework
Integration will need to occur at the front end for the user and at the back end for software components:
IF1: JSR-168 based portal environment, utilising GridSphere (ARCHER will also monitor the OGCE activity in the US)
IF2: Workflow support – using one workflow engine, such as BPEL or Kepler, although ARCHER may need to support a second workflow engine (possibly Taverna)

23 ARCHER Development Area #2: Security Architecture
Aiming for single sign-on and sign-off for all tools:
SA1: Support Shibboleth, PKI and Grid security, as required
SA2: As appropriate, Shibbolize ARCHER components and incorporate XACML for encoding access constraints
SA3: Participate in the development of the AUeduPerson schema (an extension to the MACE eduPerson schema)
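
To make SA2 concrete, here is a minimal sketch of the kind of attribute-based decision that an access-constraint policy encodes. It is not XACML syntax and does not use any Shibboleth or XACML library. The `Rule` class, the `decide` function and the attribute names (`affiliation`, `project`) are hypothetical, chosen only for illustration. The sketch assumes a "first applicable rule wins, deny by default" evaluation order, which is one of the rule-combining strategies XACML defines.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """One hypothetical access rule: which subject attributes permit which action."""
    required_attrs: dict   # subject attributes that must all match
    action: str            # e.g. "read", "annotate"
    resource_type: str     # e.g. "dataset"
    effect: str            # "Permit" or "Deny"


def decide(rules, subject_attrs, action, resource_type):
    """Return the effect of the first applicable rule; deny if none applies."""
    for rule in rules:
        applies = (
            rule.action == action
            and rule.resource_type == resource_type
            and all(subject_attrs.get(k) == v
                    for k, v in rule.required_attrs.items())
        )
        if applies:
            return rule.effect
    return "Deny"


# Illustrative policy: staff on the project may read; any project member may annotate.
RULES = [
    Rule({"affiliation": "staff", "project": "crystallography"},
         "read", "dataset", "Permit"),
    Rule({"project": "crystallography"}, "annotate", "dataset", "Permit"),
]
```

In a deployed system the subject attributes would be supplied by the identity provider at sign-on rather than passed in by hand, and the rules would live in a policy document rather than in code.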

24 ARCHER Development Area #3: Data Collection
Covering small (sensors), medium (instruments) and large (major national facilities) data collection requirements:
DC1: ARCHER will support all three, for specific NCRIS capability areas
DC2: Integrate with workflows, to analyse data at the source and also generate and capture as much metadata as possible
DC3: ARCHER will develop or co-opt capability-specific ontologies to describe experimental components
DC4: Monitor instruments remotely as required, as well as observe samples and data being acquired in real time
DC5: ARCHER will support self-deposition or assisted deposition of existing datasets

25 ARCHER Development Area #4: Metadata
Capture as much metadata as possible for the purposes of data discovery, data re-use and repeatability of results, in a way that impacts as little as possible on the researcher:
MD1: Automatically capture as much metadata as possible and as close to the source as possible
MD2: Provide the user with the ability to correct and augment automatically generated metadata
MD3: Validate captured metadata against a schema (in a registry?)
MD4: Support different schemas in use for different capability areas
MD5: Build on the DART schema registry and extend it for ontologies
MD6: Capture information needed for object preservation
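
The MD1 to MD3 steps (capture automatically, let the user augment, validate against a schema) can be sketched as follows. This is an illustration using only the Python standard library, not the DART schema registry or any ARCHER code. The `SCHEMA` contents, the field names and the functions `capture_metadata`, `augment` and `validate` are all assumptions made for the example.

```python
import hashlib
import os
from datetime import datetime, timezone

# Hypothetical schema: field name -> (required?, expected Python type)
SCHEMA = {
    "filename": (True, str),
    "size_bytes": (True, int),
    "sha256": (True, str),
    "modified_utc": (True, str),
    "instrument": (True, str),
    "operator": (False, str),
}


def capture_metadata(path):
    """MD1: derive what can be derived from the file itself, with no user input."""
    stat = os.stat(path)
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "filename": os.path.basename(path),
        "size_bytes": stat.st_size,
        "sha256": digest,
        "modified_utc": modified.isoformat(),
    }


def augment(auto, user_supplied):
    """MD2: user-supplied values override or extend the automatic ones."""
    merged = dict(auto)
    merged.update(user_supplied)
    return merged


def validate(record, schema=SCHEMA):
    """MD3: return a list of problems; an empty list means the record is valid."""
    errors = []
    for name, (required, expected_type) in schema.items():
        if name not in record:
            if required:
                errors.append(f"missing required field: {name}")
        elif not isinstance(record[name], expected_type):
            errors.append(f"{name}: expected {expected_type.__name__}")
    for name in record:
        if name not in schema:
            errors.append(f"unknown field: {name}")
    return errors
```

Supporting MD4 would then amount to passing a different `schema` argument per capability area. The checksum doubles as a simple fixity value of the kind MD6 would need, although real preservation metadata covers far more.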

26 ARCHER Development Area #5: Object Storage
This is a critical aspect of ARCHER; however, a number of NCRIS research capabilities already have preferred models for object storage:
OS1: Provide and support SRB for NCRIS areas that do not have a preference for object storage technologies
OS2: Implement and optimise a metadata service based on SRB
OS3: Develop YourSRB / PGL to support offline storage for fieldwork as well as Peer-to-Peer (P2P) data management

27 ARCHER Development Area #6: Data Analysis
Data analysis will be very research capability specific. There is a need to access existing visualisation and formatting codes or APIs. Portals provide an access framework and, where access to source code is available, a generic approach to ‘portalising’ this code and embedding it into workflows may be possible, to link the code to data repositories:
DA1: Provide support for integration of third-party data analysis packages

28 ARCHER Development Area #7: Collaboration Tools
There are a large number of existing candidate tools to use or extend, such as email, calendar, collaboration workspaces, Instant Messaging (IM), etc. ARCHER may need to provide some of these tools itself, because some of them (such as IM) do not interoperate, owing to a lack of open standards or to vendors' desire to lock in users:
CT1: Focus on e-Research tools that are not likely to be offered by institutions housing NCRIS personnel
CT2: Continue work on collaborative workspaces (wikis / Plone)
CT3: Continue work on collaborative annotation tools, possibly including further development or adoption of existing toolkits

29 ARCHER Development Area #8: Data Publishing
The scope of the ARCHER project extends as far as making data publicly available, possibly through a journal/society-based repository or an institutional repository. The repository could also be research capability or community specific:
DP1: Provide a single interface for users that allows them to select the data repository to publish to, including the ability to add or edit the metadata that will accompany the digital object. Allow users to select a license (either from a repository list or a user-defined list), as well as specify access control restrictions
DP2: Provide services to assist people exporting their data in a standardised format and be able to submit to journal web sites
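
As a rough illustration of DP1, the sketch below models a single publish request carrying the chosen repository, license, accompanying metadata and access restrictions, plus a check that the choices are consistent. Everything here is hypothetical: the `REPOSITORIES` registry, the license labels, the `PublishRequest` fields and the rule that a title is mandatory are assumptions for the example, not part of any ARCHER design.

```python
from dataclasses import dataclass, field

# Hypothetical registry: repository name -> licenses that repository offers
REPOSITORIES = {
    "institutional": ["CC-BY", "CC-BY-NC"],
    "journal": ["CC-BY"],
}


@dataclass
class PublishRequest:
    dataset_id: str
    repository: str
    license: str
    metadata: dict = field(default_factory=dict)
    allowed_groups: list = field(default_factory=list)  # empty list = public


def check_request(req, repositories=REPOSITORIES, user_licenses=()):
    """Return a list of problems; an empty list means the request can proceed."""
    problems = []
    if req.repository not in repositories:
        problems.append(f"unknown repository: {req.repository}")
    else:
        # A license may come from the repository list or a user-defined list.
        permitted = list(repositories[req.repository]) + list(user_licenses)
        if req.license not in permitted:
            problems.append(
                f"license {req.license} not accepted by {req.repository}")
    if not req.metadata.get("title"):
        problems.append("metadata must include a title")
    return problems
```

Keeping the request as one object means the same form can target any repository, with repository-specific rules applied only at the checking step.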

30 ARCHER Development Area #9: Search Functionality
This will need to be research capability specific, although some functions will be common:
SF1: Provide search across a single repository instance based on metadata
SF2: Provide P2P search facility
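
A minimal sketch of SF1, assuming the repository's metadata records can be treated as a list of dictionaries held in memory. The `search` function, the `REPOSITORY` sample data and the matching rule (case-insensitive substring for text, equality for everything else) are illustrative assumptions. A real repository would use an index or database query instead.

```python
def search(records, **criteria):
    """Return the records whose metadata satisfies every criterion."""
    def matches(record):
        for key, wanted in criteria.items():
            value = record.get(key)
            if isinstance(value, str) and isinstance(wanted, str):
                # Text fields: case-insensitive substring match
                if wanted.lower() not in value.lower():
                    return False
            elif value != wanted:
                # Other fields (numbers, missing values): exact equality
                return False
        return True

    return [r for r in records if matches(r)]


# Hypothetical metadata records for one repository instance
REPOSITORY = [
    {"id": "ds-001", "discipline": "crystallography",
     "instrument": "diffractometer", "year": 2006},
    {"id": "ds-002", "discipline": "climate",
     "instrument": "sensor network", "year": 2006},
    {"id": "ds-003", "discipline": "crystallography",
     "instrument": "synchrotron", "year": 2007},
]
```

For example, `search(REPOSITORY, discipline="crystal")` selects the two crystallography records, and adding `year=2006` narrows the result to the first one.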

31 ARCHER Application Teams
These teams will adapt the ARCHER software tools developed by the nine specialised teams described above, to meet the data and information management requirements of a number of specific research capability areas. After input from DEST and NCRIS, it is expected that the following teams will be established:
• NCRIS 5.1 (Bioinformatics)
• NCRIS 5.2 (Atlas of Living Australia)
• NCRIS 5.8 (Biosecurity)
• Humanities & Social Sciences
• Other NCRIS team(s), as required

32 ARCHER Future Service and Support
Stage 1    DART      Proof-of-concept    2006
Stage 2    ARCHER    Production-ready    2007
Stage 3    ??        Service delivery    2008
Funding for 2008 and beyond is required for the third stage of the DART / ARCHER projects, in order to deliver long-term sustainable software services.

33 Contacts
Web: archer.edu.au and dart.edu.au
Project Architect: Andrew.Treloar@its.monash.edu.au, Phone +613 9902 0572
Chair, Board of Management: AhChung.Tsoi@adm.monash.edu.au, Phone +613 9905 9918

34 DART chief investigators
• Andrew Treloar (Project Architect)
• Asad Khan
• David Abramson
• Ann Monotti
• Jane Hunter
• Xiaofang Zhou
• Ian Atkinson

35 Acknowledgements
Some of the people in the DART project…
[Photos: UQ, JCU, MU]

