The Tissue Microarray Data Exchange Specification Presented for: Cambridge Healthtech Institute Microarrays in Medicine Boston, MA April 26, 2004 Jules.

The Tissue Microarray Data Exchange Specification
Presented for: Cambridge Healthtech Institute, Microarrays in Medicine, Boston, MA, April 26, 2004
Jules J. Berman, Ph.D., M.D., Program Director for Pathology Informatics, Cancer Diagnosis Program, National Cancer Institute, National Institutes of Health, Rockville, MD
This presentation is a U.S. government-sponsored work in the public domain.

In brief: The TMA Specification is an open access document that can be used without any restriction. Its development was sponsored by the NCI and by the Association for Pathology Informatics. All the documents and software that you might need to obtain, understand and implement the specification are available in two recently published open access manuscripts.

Basics of the specification: Jules J Berman, Mary Edgerton and Bruce Friedman. The tissue microarray data exchange specification: a community-based, open source tool for sharing tissue microarray data. BMC Med Inform Decis Mak May 23;3:5
Real-world implementation example: Jules J Berman, Milton Datta, Andre Kajdacsy-Balla, Jonathan Melamed, Jan Orenstein, Kevin Dobbin, Ashok Patel, Rajiv Dhir, Michael J Becich. The tissue microarray data exchange specification: implementation by the Cooperative Prostate Cancer Tissue Resource. BMC Bioinformatics 2004 Feb 27, 5:19

Why is it important to have a data exchange specification for TMAs? The greatest value of TMAs is the ability to link TMA data with data from other TMAs and from other databases that inform on the data contained in the TMA database. That value is essentially untapped because there has been no way to publish, exchange, merge and link TMA datasets in a manner that everyone can use and understand. The data exchange specification provides a common intermediate structure for TMA data that can be used to exchange data between different TMA databases.

Analogous situation: WordPerfect (different versions), Word (different versions), AbiWord, PostScript, PDF. One vendor's software often cannot open files prepared in another vendor's software. But any good word processor should be able to export a file as an RTF file (simple ASCII with markup for formatting), and should be able to import an RTF file and convert it to its own preferred proprietary format.

We wanted to make a flexible specification for TMAs that would permit researchers with proprietary systems to port their TMA data into a file that could be easily disassembled and re-assembled into other formats.
The basic properties of the file:
1. Self-describing
2. Made from commonly understood data structures
3. Extremely simple (most of our stakeholders are not sophisticated bioinformaticians, computer scientists, or metadata experts)
4. Infinitely scalable (can be endlessly combined with other data sources)

The first draft of the specification was developed through open workshops held at meetings sponsored by the Association for Pathology Informatics and the National Cancer Institute.

May 30, Ann Arbor, Michigan. Chair of speaker session: Mark A. Rubin. Speakers: David Rimm, Steve Bova, Matt Van de Rijn, Jules Berman.
Oct. 6, Pittsburgh, PA, co-sponsored by the National Cancer Institute. Chair: Mary Edgerton. Speakers: Olli Kallioniemi, Chris Chute, Richard Lieberman, Paul Spellman. Chair of Data Exchange Workshop: Mary Edgerton.
May 22, Ann Arbor, Michigan, co-sponsored by the National Cancer Institute. Chair of speaker session: Mark A. Rubin. Speakers: James Bacus, Angelo de Marzo, Peggy Porter, David Rimm and Guido Sauter. Chair of Data Exchange Workshop: Mary Edgerton.
October 4, held in conjunction with Advancing Pathology Informatics, Imaging and the Internet, Pittsburgh, PA. Chair of speaker session: Mary Edgerton. Speakers: Steve Hewitt, Ulysses Balis. Chair of Data Exchange Workshop: Mary Edgerton.

The specification is XML. XML allows heterogeneous systems to communicate and exchange their data. It achieves this through metadata (data about data). An XML file can be an ideal document that completely describes itself, including all data and all metadata.

Four required sections: 1) Header, containing the specification Dublin Core identifiers; 2) Block, describing the paraffin-embedded array of tissues; 3) Slide, describing the glass slides produced from the Block; and 4) Core, containing all data related to the individual tissue samples contained in the array.
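The four-section layout can be sketched in a few lines of Python. This is only an illustration: the root element name `histo` appears in the sample file, but the lowercase section tags, their flat arrangement under the root, the `title` element, and the helper names `build_tma_skeleton` and `missing_sections` are all assumptions made for this sketch, and namespaces are left out.

```python
import xml.etree.ElementTree as ET

# Section names follow the slide text; the exact tag spellings are assumptions.
REQUIRED_SECTIONS = ("header", "block", "slide", "core")


def build_tma_skeleton(title, cores):
    """Return a <histo> element holding a header, a block, a slide and one core per record."""
    root = ET.Element("histo")
    header = ET.SubElement(root, "header")
    # Hypothetical stand-in for a Dublin Core title identifier.
    ET.SubElement(header, "title").text = title
    ET.SubElement(root, "block")
    ET.SubElement(root, "slide")
    for record in cores:
        core = ET.SubElement(root, "core")
        for tag, value in record.items():
            ET.SubElement(core, tag).text = value
    return root


def missing_sections(root):
    """List the required sections that are absent from the document root."""
    present = {child.tag for child in root}
    return [name for name in REQUIRED_SECTIONS if name not in present]


doc = build_tma_skeleton(
    "Example prostate TMA",
    [{"race": "Caucasian", "location": "row 9, column 18"}],
)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
print(missing_sections(doc))
```

A file with no tissue samples would be reported as missing its Core section, which is the kind of check a validator would make before looking at individual data elements.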

Eighty Common Data Elements (CDEs), conforming to the ISO specification for data elements, constitute the XML tags used in the TMA data exchange specification. Only a handful of these are required in TMA files. A set of six simple semantic rules describes the complete data exchange specification. Anyone using the data exchange specification can validate their TMA files using a software implementation written in Perl and distributed as a supplemental file with the publication.
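To give a feel for what tag-level validation involves, here is a rough Python sketch. It is not the distributed Perl validator and does not implement the six semantic rules, which are not listed on these slides. It only checks that a file is well-formed XML, that every tag comes from an allowed list, and that the required tags are present. The tag sets below are hypothetical placeholders for the real 80 CDEs, and namespaces are ignored.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the 80 CDEs; the real list comes with the specification.
ALLOWED_TAGS = {"histo", "header", "block", "slide", "core", "title", "diagnosis", "location"}
# Assumed required tags, taken from the four required sections.
REQUIRED_TAGS = {"header", "block", "slide", "core"}


def validate_tma(xml_text):
    """Return a list of problems; an empty list means these checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]
    problems = []
    seen = {element.tag for element in root.iter()}
    for tag in sorted(seen - ALLOWED_TAGS):
        problems.append(f"unknown element: {tag}")
    for tag in sorted(REQUIRED_TAGS - seen):
        problems.append(f"missing required element: {tag}")
    return problems


good = (
    "<histo><header><title>TMA 1</title></header><block/><slide/>"
    "<core><diagnosis>adenocarcinoma</diagnosis></core></histo>"
)
bad = "<histo><header/><block/><core><colour>red</colour></core></histo>"

print(validate_tma(good))
print(validate_tma(bad))
```

Returning a list of problems rather than stopping at the first one lets a user fix a whole file in one pass.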

<histo xmlns=" xmlns:cpctr=" xmlns:dc=" Cooperative Prostate Cancer Tissue Resource (CPCTR) Prostate Cancer Microarray 1-2 CPCTR Prostate tissue microarray CPCTR TMA XML datafile for Microarray 1- 2 CPCTR Prostate Cancer Tissue Microarray

G61 Caucasian Yes adenocarcinoma NOS aka acinar Bladder pT3b pN0 pMX Alive Unknown row 9, column 18|row 10, column 4

Implementing the specification
We provide:
1. The specification (XML data structure and 80 common data elements)
2. A Perl script validator
3. A paper that describes a real-world implementation (porting TMA data from an Excel spreadsheet)
You provide:
1. Whatever database you like for storing your TMA data
2. A script (Java, Perl, Python, whatever) that can port your data into the TMA specification
3. A script that can port TMA files in the data exchange specification into whatever database you prefer
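The two scripts on the "You provide" list might look something like the following Python sketch, which turns spreadsheet rows (exported as CSV) into core elements and reads them back into plain records ready for a database insert. The function names `csv_to_tma` and `tma_to_rows`, the column names, and the use of column headers directly as XML tags are assumptions for illustration. A real port would map each column to the matching common data element and would also write the Header, Block and Slide sections, which this sketch leaves out.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_tma(csv_text):
    """Turn each spreadsheet row into one <core> element under a <histo> root."""
    root = ET.Element("histo")
    for row in csv.DictReader(io.StringIO(csv_text)):
        core = ET.SubElement(root, "core")
        for column, value in row.items():
            # Assumes every column header is already a valid XML tag name.
            ET.SubElement(core, column).text = value
    return ET.tostring(root, encoding="unicode")


def tma_to_rows(xml_text):
    """Read the <core> elements back into dictionaries for loading into a database."""
    root = ET.fromstring(xml_text)
    return [{field.tag: field.text for field in core} for core in root.findall("core")]


spreadsheet = 'diagnosis,stage,location\nadenocarcinoma,pT3b,"row 9, column 18"\n'
xml_text = csv_to_tma(spreadsheet)
rows = tma_to_rows(xml_text)
print(xml_text)
print(rows)
```

Because the XML file sits between the two functions, neither side needs to know anything about the other's storage format, which is the point of a common intermediate structure.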

Future?