Published by Jemimah Thornton. Modified over 9 years ago.
1
Metadata in the iPlant Collaborative Cyberinfrastructure
Birds of a Feather meeting at PAG XXII, Jan. 14, 2014
2
From the iPlant Data Strategy: “The vision for iPlant CI data capabilities is to provide flexible, adaptive and scalable data infrastructure that enables users and communities to implement best practices for data management.”
3
How to enable best practices for data management in iPlant:
1. A way to add and edit metadata
2. Metadata templates for common file types
3. Search and browse iDS based on metadata and file content
4. Support for unstructured and structured (relational) data within the iDS
5. Interoperability with key external data sources
6. Benefits/features that are aligned with the use of popular file types
7. An iPlant Data Commons for public data
4
KEY ELEMENTS OF THE iPLANT DATA STRATEGY
5
1. CI to enable users to add and edit metadata using simple and flexible interfaces, including customizable metadata components.
– a web-based user interface accessible via the DE
– upload metadata as a CSV file
– access to all metadata entities via iPlant APIs
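As an illustration of the CSV upload route, here is a minimal sketch of reading a metadata file into attribute/value pairs per path. The column names (`path`, `attribute`, `value`) and the function name are assumptions made for this example; the slides do not specify the CSV layout that the DE expects.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV layout: one attribute/value pair per row, keyed by file path.
SAMPLE_CSV = """path,attribute,value
/iplant/home/user/file.txt,attribute_1,value_a
/iplant/home/user/file.txt,attribute_2,value_b
"""


def parse_metadata_csv(text):
    """Group attribute/value rows by the file path they describe."""
    metadata = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        metadata[row["path"]][row["attribute"].strip()] = row["value"].strip()
    return dict(metadata)


parsed = parse_metadata_csv(SAMPLE_CSV)
```

Grouping by path keeps every attribute for a file together, which matches the one-dialog-per-file view in the DE mock-ups.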
6
Current DE metadata interface
7
Metadata: /iplant/home/user/file.txt
Attribute / Value: attribute_1 / value_a; attribute_2 / value_b; attribute_3 / value_c; attribute_4 / value_d
Buttons: Add, Delete, Templates, Browse Templates, OK, Cancel
8
2. Project data management templates and best practices for organizing, handling and managing data for diverse use cases, including:
– groups or consortia working on large-scale genome and transcriptome sequencing projects or species range maps
– single-PI laboratories focused on specific analyses, such as RNA-Seq experiments or phenotype data sets
9
Metadata: /iplant/home/user/file.txt
Attribute / Value: attribute_1 / value_a; attribute_2 / value_b; attribute_3 / value_c; attribute_4 / value_d
Buttons: Add, Delete, Templates, Browse Templates, OK, Cancel
10
Metadata: /iplant/home/user/file.txt
Attribute / Value: attribute_1 / value_a; attribute_2 / value_b; attribute_3 / value_c; attribute_4 / value_d
Buttons: Add, Delete, Browse Templates, OK, Cancel
Browse Templates dialog: Select a template
Templates listed: Metagenomic Sequence (MIMS); Eukaryotic Genome Sequence (MIGS); Genome Sequence in iDS; Item 1
Dialog controls: Attributes Preview, Insert, Cancel
11
Metadata: /iplant/home/user/file.txt
Attribute / Value: attribute_1 / value_a; attribute_2 / value_b; attribute_3 / value_c; attribute_4 / value_d
Buttons: Add, Delete, Browse Templates, OK, Cancel
Browse Templates dialog
Templates listed: Metagenomic Sequence (MIMS); Eukaryotic Genome Sequence (MIGS); Genome Sequence in iDS; Item 1; Item 3; Item 5
Attributes Preview: project; specimen identifier; collection date; geographic location nam…; geographic location longi…; geographic location latit…; genus; species; infraspecific name
Dialog controls: Insert, Cancel
12
Metadata: /iplant/home/user/file.txt
Attribute / Value: attribute_1 / value_a; attribute_2 / value_b; attribute_3 / value_c; attribute_4 / value_d
Buttons: Add, Delete, Browse Templates, OK, Cancel
Browse Templates dialog
Templates listed: Metagenomic Sequence (MIMS); Eukaryotic Genome Sequence (MIGS); Genome Sequence in iDS; Item 1; Item 3; Item 5
Attributes Preview: project; specimen identifier; collection date; geographic location nam…; geographic location longi…; geographic location latit…; genus; species; infraspecific name
Dialog controls: Insert, Cancel
13
Metadata: /iplant/home/user/file.txt
Template: Metagenomic Sequence Metadata (accordion item)
Attribute / Value: project* / jackson; specimen identifier / 54769; collection date* / 2008-01-23T19:23; sequencing method* / (empty)
Buttons: Add, Delete, Browse Templates, OK, Cancel
14
Metadata: /iplant/home/user/file.txt
Template: Metagenomic Sequence Metadata (accordion item)
Attribute / Value: project* / jackson; specimen identifier / 54769; collection date* / 2008-01-23T19:23; sequencing method* / (empty)
Buttons: Add, Delete, Browse Templates, OK, Cancel
Tooltip: All of these are ISO 8601 compliant time stamps: 2008-01-23T19:23:10+00:00…
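A date field like the one in this mock-up could be checked with a small helper. The sketch below uses Python's standard `datetime.fromisoformat`, which accepts a common subset of ISO 8601 rather than the full standard; the function name `is_iso8601` is hypothetical and this is not the DE's actual validation logic.

```python
from datetime import datetime


def is_iso8601(text):
    """Return True if the standard library can parse text as an ISO 8601 date/time.

    Note: datetime.fromisoformat covers a subset of ISO 8601, so a False
    result does not prove the string violates the standard.
    """
    try:
        datetime.fromisoformat(text)
        return True
    except ValueError:
        return False


# A date only, a date with hours and minutes, and a full time stamp with offset.
examples = ["2008-01-23", "2008-01-23T19:23", "2008-01-23T19:23:10+00:00"]
results = {s: is_iso8601(s) for s in examples}
```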
15
Metadata: /iplant/home/user/file.txt
Template: Metagenomic Sequence Metadata (accordion item)
Attribute / Value: project* / jackson; specimen identifier / 54769; collection date* / 2008-01-23T19:23; sequencing method* / (empty)
Buttons: Add, Delete, Browse Templates, Cancel, OK
16
Metadata: /iplant/home/user/file.txt
Template: Metagenomic Sequence Metadata (accordion item)
Attribute / Value: project* / jackson; specimen identifier / 54769; collection date* / 2008-01-23T19:23; sequencing method* / (empty)
Buttons: Add, Delete, Browse Templates, OK, Cancel
Validation message: This field is required.
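The required-field check shown in this mock-up can be sketched as follows. The template structure (attribute name mapped to a required flag) and the function name are assumptions for illustration, with the starred attributes from the mock-up treated as required.

```python
# Hypothetical template definition: attribute name -> whether it is required.
TEMPLATE = {
    "project": True,
    "specimen identifier": False,
    "collection date": True,
    "sequencing method": True,
}


def missing_required(template, values):
    """Return the required attributes that are absent or blank in values."""
    return [
        name
        for name, required in template.items()
        if required and not str(values.get(name, "")).strip()
    ]


# Form state from the mock-up: sequencing method has not been filled in yet.
form = {
    "project": "jackson",
    "specimen identifier": "54769",
    "collection date": "2008-01-23T19:23",
    "sequencing method": "",
}
errors = missing_required(TEMPLATE, form)
```

A dialog built this way would keep the form open and show "This field is required." next to each attribute named in `errors`.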
17
Metadata: /iplant/home/user/file.txt
Template: Metagenomic Sequence Metadata (accordion item)
Attribute / Value: project* / jackson; specimen identifier / 54769; collection date* / 2008-01-23T19:23; sequencing method* / DOI#
Buttons: Add, Delete, Browse Templates, Cancel, OK
18
3. CI to support searching and browsing based on metadata attributes and suitable file content.
– provenance/system metadata and scientific metadata
– across both private data and public data
– ontology-enhanced searches
19
Search capabilities
Search API: users will be able to search by
– file or folder name
– any metadata attribute or value
– date created
– date last modified
– creator
– file size
– file type
– tool that created the file
– analysis that created a file or folder
– constraints (and, or, xor)
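To make the and/or/xor constraints concrete, the sketch below combines search criteria as plain predicate functions. The `Entry` class and every helper name here are hypothetical; the slides do not describe the actual Search API, so this only illustrates how such constraints compose.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Entry:
    """Hypothetical record for a file in the data store."""
    name: str
    size: int
    file_type: str
    metadata: Dict[str, str] = field(default_factory=dict)


# Basic criteria: each returns a predicate taking an Entry.
def file_type_is(file_type):
    return lambda e: e.file_type == file_type

def larger_than(size):
    return lambda e: e.size > size

def has_metadata(attribute, value):
    return lambda e: e.metadata.get(attribute) == value


# Constraints that combine two predicates.
def and_(a, b):
    return lambda e: a(e) and b(e)

def or_(a, b):
    return lambda e: a(e) or b(e)

def xor_(a, b):
    return lambda e: a(e) != b(e)


def search(entries, predicate):
    """Return the names of all entries matching the predicate."""
    return [e.name for e in entries if predicate(e)]


entries = [
    Entry("reads.fastq", 2_000_000, "fastq", {"project": "jackson"}),
    Entry("notes.txt", 512, "txt", {"project": "jackson"}),
    Entry("genome.fasta", 5_000_000, "fasta", {"project": "other"}),
]

large_jackson = search(entries, and_(has_metadata("project", "jackson"), larger_than(1000)))
text_or_fasta = search(entries, or_(file_type_is("txt"), file_type_is("fasta")))
exactly_one = search(entries, xor_(larger_than(1000), has_metadata("project", "jackson")))
```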
20
Search capabilities Users will be able to make "smart folders", that is, folders for all the files that match a set of search criteria.
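One way to picture a smart folder is as a saved search that is re-evaluated each time it is opened. The sketch below assumes a hypothetical `SmartFolder` class and an in-memory catalog of file records; it is not how the iDS implements the feature.

```python
class SmartFolder:
    """A named, saved set of search criteria (hypothetical design)."""

    def __init__(self, name, criteria):
        self.name = name
        self.criteria = criteria  # callable taking a file record, returning bool

    def contents(self, catalog):
        """Evaluate the saved criteria against the current catalog."""
        return [f["path"] for f in catalog if self.criteria(f)]


catalog = [
    {"path": "/iplant/home/user/a.fastq", "file_type": "fastq"},
    {"path": "/iplant/home/user/b.txt", "file_type": "txt"},
]

fastq_folder = SmartFolder("All FASTQ files", lambda f: f["file_type"] == "fastq")
before = fastq_folder.contents(catalog)

# A new matching file shows up in the folder without any manual filing.
catalog.append({"path": "/iplant/home/user/c.fastq", "file_type": "fastq"})
after = fastq_folder.contents(catalog)
```

Because the folder stores criteria rather than a list of files, its contents follow the data as files are added or their metadata changes.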
21
4. Support for unstructured, semi-structured, and structured (relational) data within the iDS.
– Document-based and NoSQL approaches to support unstructured and semi-structured data
– Support for large matrix-based data sets (e.g., in GBS, GWAS, etc.)
– A way for users to search and access data in iPlant-hosted projects that include MySQL and PostgreSQL databases
22
5. Interoperability with key external data sources, including, but not limited to:
– Ability to use external data in analyses run through iPlant, e.g., import from BioMart
– Access to databases like CoGe, PO, MaizeGDB
– Ability to push/publish/link data housed in iDS to canonical public repositories like NCBI, Data Dryad
– Ability to engage semantic services and semantic pipelines based on metadata and ontological reasoning systems
23
6. Benefits/features that are aligned with the use of popular file types.
– provide suitable utilities, tools, integration, and documentation on best data management practices for projects using these formats
24
Demo: http://mirrors.iplantcollaborative.org/browse/iplant/home/shared/iplant_public_test
25
7. An iPlant Data Commons that provides stable access to objects in the iDS, including:
– The option to make data public and permanent (un-editable).
– Issuing multiple permanent identifiers (unique IDs) as needed (e.g., DOI, NOID, ARK) while packaging the content in standards-compliant formats.