1
OIA Key Attributes + DRAFT
May 18, 2011
Comment: focus on medical molecular imaging; pathology in a later effort
2
Contributors
- Michael Ackerman, NLM
- Rick Avila, Kitware
- Andy Buckler, Buckler Biomedical
- Terry Yoo, NLM
- David Clunie, Core Lab Partners
- James Luo, NIBIB
- Tony Reeves, Cornell University
- Daniel Rubin, Stanford University
- Brandon Whitcher, Mango Solutions
- Alden Dima
3
Key Attributes
- Contribution Support
  - Quality of the data curation process
  - Speed to post datasets
  - Support for imaging data types and metadata
- User Support
  - Robust querying and ease of performing a download
  - Advanced computing services
- General
  - Long-term integrity and support
4
Contribution Support: Quality of the data curation process
- De-identification support
  - Validate/verify that de-identification was successful
    - Example: BIRN DUP application that de-identifies
  - Support for de-identification standards
    - DICOM Suppl. 142
- Metadata preparation tools (clear definition needed if used in final document; split: experiment description; clinical data non-image data)
  - Tools for efficient capture and organization of metadata
  - Utilization of common nomenclature
    - Example: OSA ISP metadata tool; ontologies: BRIDG, imaging biomarker ontology, UMLS? …
  - NLM: Numerous ontologies are being developed; this must be considered carefully.
  - For the recommendation document: appropriate domain-specific (clinical) metadata; those would come from relevant sources specific to the domain; always provide a specific example illustrating the recommendation
  - Distinguish between bulk data and individual data upload (e.g. eCRF)
- Revision control (depending on use case, e.g. data sharing)
  - Apply revision control concepts to data elements
  - Examples: Commercial institutions do this routinely. EHRs. NBIA may have this capability (Eliot Siegel).
- Capturing provenance
  - Capturing important information on the acquisition process is needed
  - Examples: Perhaps "data papers" will help
  - There are also goal-specific requirements
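The "validate/verify that de-identification was successful" item can be illustrated with a minimal sketch. It is not the BIRN tool and not a DICOM Suppl. 142 profile: the header is modeled as a plain dictionary, the short field list is an assumption, and `find_residual_identifiers` is a hypothetical helper. It only checks that fields expected to be blank are blank; fields that are replaced by a pseudonym would need a different test.

```python
# Assumed, deliberately short list of header fields that should be empty or
# absent after de-identification. A real profile covers many more attributes.
IDENTIFYING_FIELDS = (
    "PatientName",
    "PatientBirthDate",
    "PatientAddress",
    "PatientTelephoneNumbers",
)


def find_residual_identifiers(header, fields=IDENTIFYING_FIELDS):
    """Return the names of identifying fields that still carry a value."""
    residual = []
    for name in fields:
        value = header.get(name)
        # Missing fields and whitespace-only values count as removed.
        if value is not None and str(value).strip() != "":
            residual.append(name)
    return residual


clean = {"PatientName": "", "PatientBirthDate": " ", "Modality": "PT"}
leaky = {"PatientName": "DOE^JANE", "PatientBirthDate": "", "Modality": "PT"}

print(find_residual_identifiers(clean))
print(find_residual_identifiers(leaky))
```

A check of this kind can run automatically on every upload, which matches the preference for automated over manual verification.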
5
Contribution Support: Speed to post datasets
- Avoid limits on data upload size and speed
- Protocols to load the data
  - FTP, DICOM, SOAP-based interfaces, WebDAV
- Re-de-identify using automated methods (needs example and explanation); usage-based de-identification
  - Retain certain fields for potential future purposes
- Automated methods to check that the data complies with expectations
  - Example: PET SUV, need to know patient weight and height
  - Goal is to try to obtain high-quality data, but we would not throw away data if not conforming
  - Some expectations we may know in advance, others not
- Organization
  - David: Try to be agnostic on data organization (? Rick); data model flexibility and context (?); HealthVault example
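The PET SUV example above (weight and height must be known, but non-conforming data is not thrown away) might be sketched as follows. The field names, the plausibility rule, and both functions are assumptions made for illustration, not part of any existing archive.

```python
# Fields assumed necessary for SUV-style analysis of a PET dataset. The names
# are illustrative and not taken from any particular archive schema.
REQUIRED_FOR_SUV = ("patient_weight_kg", "patient_height_m")


def check_expectations(metadata, required=REQUIRED_FOR_SUV):
    """Return a list of warnings; an empty list means all expectations are met."""
    warnings = []
    for field in required:
        value = metadata.get(field)
        if value is None:
            warnings.append(f"missing {field}")
        elif not isinstance(value, (int, float)) or value <= 0:
            warnings.append(f"implausible {field}: {value!r}")
    return warnings


def accept_upload(metadata):
    """Accept every dataset, but attach warnings instead of rejecting it."""
    return {"accepted": True, "warnings": check_expectations(metadata)}


complete = {"modality": "PT", "patient_weight_kg": 72.5, "patient_height_m": 1.78}
partial = {"modality": "PT", "patient_weight_kg": 0}

print(accept_upload(complete))
print(accept_upload(partial))
```

Returning warnings rather than raising an error keeps the upload path fast while still recording which expectations a dataset fails to meet.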
6
Data Upload Attributes Continued
- DICOM conformance checks
  - Automated methods are preferred (UID replacement, integrity check, etc.); limit manual interaction steps (NBIA)
  - ADNI is performing automated QA (specific use cases)
- Metadata expectations
  - Utilize a standard information model
    - Example: Use AVT to store results of computation/analysis
      - Manual annotations
      - Computational algorithms
      - Summary statistics
  - Utilize emerging data models (e.g. AIM)
- Definition: ontologies vs. information models
  - Ontology: standard terminology (e.g. RadLex, SNOMED, …)
  - Information model: the syntax for making statements (DICOM Structured Reporting; NBIA has a proprietary XML format)
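Automated UID replacement, mentioned above, could look roughly like this. The sketch assumes UUID-derived UIDs under the `2.25` root, which DICOM permits for UIDs created without a registered organizational root; the `UidRemapper` class and the sample UIDs are hypothetical and do not describe how NBIA or any other archive performs the step.

```python
import uuid


class UidRemapper:
    """Replace original UIDs with new ones, consistently within one session."""

    def __init__(self):
        self._mapping = {}

    def replace(self, original_uid):
        # Reuse the earlier replacement so references between objects
        # (study, series, instance) still line up after remapping.
        if original_uid not in self._mapping:
            # "2.25." followed by a UUID written as a decimal integer stays
            # within the 64-character limit for DICOM UIDs.
            self._mapping[original_uid] = "2.25." + str(uuid.uuid4().int)
        return self._mapping[original_uid]


remapper = UidRemapper()
study_uid = "1.2.3.4.5"        # made-up UID for illustration
series_uid = "1.2.3.4.5.1"     # made-up UID for illustration

print(remapper.replace(study_uid))
print(remapper.replace(series_uid))
print(remapper.replace(study_uid) == remapper.replace(study_uid))
```

Because the mapping lives in memory, it is consistent only within one run; a persistent or keyed mapping would be needed if later uploads must map to the same replacement UIDs.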
7
User Support
- Query capabilities
  - More generic than web page queries
  - More sophisticated query methods will drive database design
  - Outside applications can access and perform queries and get a response using a service model
  - Flexibility to support a range of use cases
  - Support both plain-text search and structured queries
  - One day support content-based retrieval
  - If we were to support data papers, there would be additional content and terms we could use
  - Revision control and review on datasets
  - Query on computation/analysis results
  - Have multiple indices for the same dataset
8
User Support Continued
- Download support
  - Shopping carts are good for certain situations
  - Use of standard protocols (e.g. rsync, FTP)
  - Web services
  - Support for portable hard drive
  - Annotated manifest (images, text descriptions summarizing the metadata)
- Computational support (nice to have)
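One possible reading of the "annotated manifest" item is a machine-readable listing shipped with each download. The sketch below assumes a JSON layout with `file`, `bytes`, and `description` keys; that layout, the `build_manifest` function, and the sample file names are illustrative choices, not a format the committee specified.

```python
import json
import tempfile
from pathlib import Path


def build_manifest(root, descriptions=None):
    """List every file under root with its size and an optional description."""
    descriptions = descriptions or {}
    root = Path(root)
    entries = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        relative = path.relative_to(root).as_posix()
        entries.append({
            "file": relative,
            "bytes": path.stat().st_size,
            # Free-text summary of the metadata, empty if none was supplied.
            "description": descriptions.get(relative, ""),
        })
    return entries


# Small demonstration on a temporary directory with made-up content.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "series1").mkdir()
    Path(folder, "series1", "slice001.dcm").write_bytes(b"\x00" * 16)
    Path(folder, "README.txt").write_text("example dataset", encoding="utf-8")
    notes = {"series1/slice001.dcm": "PET, first slice of series 1"}
    print(json.dumps(build_manifest(folder, notes), indent=2))
```

The same listing could travel with a shopping-cart download, an rsync tree, or a portable hard drive, so the recipient can confirm that nothing is missing.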
9
User Support
- Contributor agreement
- User license
  - Least restrictive
  - Standard license
10
General
- Long-term integrity
  - Support backup/mirror solutions
  - Handle.net-type solution
  - Local copies of format specification for non-standard data
  - Not just images, but metadata as well
  - Cryptographic hashes as a unique identifier, checksums
    - Hash is helpful for data analysis and retrieval
- Support
  - Check de-identification (service?)
  - Need to encourage and we prefer automated systems
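The idea of using a cryptographic hash both as a unique identifier and as a checksum can be sketched with the Python standard library. SHA-256 is an assumed choice here (the slides do not name an algorithm), and `content_id` and `verify` are hypothetical helper names.

```python
import hashlib
import tempfile
from pathlib import Path


def content_id(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_id):
    """Check a stored or mirrored copy against its recorded identifier."""
    return content_id(path) == expected_id


# Demonstration on a temporary file with made-up content.
with tempfile.TemporaryDirectory() as folder:
    original = Path(folder, "image.dat")
    original.write_bytes(b"example pixel data")
    recorded = content_id(original)
    print(recorded)
    print(verify(original, recorded))

    original.write_bytes(b"example pixel dat0")  # simulate corruption
    print(verify(original, recorded))
```

Because the digest depends only on content, identical files stored at different mirrors share one identifier, and any change to the data or its metadata file shows up as a mismatch.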
11
Encouragement & Credit
- Making data available as a requirement of:
  - Funding
  - Publication
- Data papers
  - Very early yet
  - Infrastructure should support this if adopted by the community
12
Recognition
- Major problem: the medical imaging field started off not sharing
- Open data papers
- Open data awards
- Requirement for tenure?
- Allow/encourage users/readers to contribute to a body of knowledge, similar to papers
- Other ideas?
13
Next Step for OIA
- Present recommendations to the RSNA radiology informatics committee
- Prepare manuscript on OIA Committee recommendations
- NIH
  - Movement to make funded data available (e.g. datamedcentral)
- Publishers
  - OSA
  - BioMed Central
  - Elsevier: Media