Informatics Laboratory Digital Imaging Project APIII 2006 Vancouver, British Columbia Session F2, Friday, August 18, 2006, 10:30 A.M. to noon Jules J. Berman, Ph.D., M.D. Co-chair, Laboratory Digital Imaging Project President, Association for Pathology Informatics
Purpose of LDIP image specification 1. To permit image users to annotate a pathology image with relevant technical, pathologic, and clinical information and to convey this information with the image file. 2. To provide a file that is self-describing, containing well-defined metadata for all data values, and that uses a standard, generic syntax that is easy to understand and implement. 3. To produce image files that can be integrated with other image standards and other types of data expressed in the same syntax.
Specific goals of LDIP 1. Develop an RDF schema for LDIP that employs well-defined metadata from existing standards (HL7, DICOM, OME, CytometryML, MISFISHIE, GO). 2. Keep it simple (should not require more than 5 minutes to learn if you know anything about RDF; 8 minutes if you don't) 3. Publish easily emulated examples of the LDIP schema being used with HL7, DICOM, jpeg, OME, surgical pathology reports, etc. 4. Follow our progress at:
Problems with existing standards (including DICOM) 1. Too complex, hard to understand and implement. 2. Not generic (won't merge with datasets using other standards). 3. Made with non-standard (often obsolete) methodologies (not XML/RDF, not even ASCII). 4. Lack the metadata (data descriptors) needed by pathologists.
Arguments for HL7 and DICOM are coercive 1. The U.S. Government is fully backing HL7 and DICOM 2. The major vendors are backing HL7 and DICOM Shouldn't arguments be based on the scientific or technical merits of the standards and come from non-conflicted experts with no loyalty to the standards?
Why governments avoid creating biomedical standards Private entities that use a standard may be in the best position to create the best possible standard. Private entities are more likely to adopt a new standard if they had a part in developing it. Governments know that many standards are never adopted by the public and do not want to waste their resources on a standard that will be ignored. Governments may be reluctant to face criticism for standards that may adversely affect certain segments of their populations.
Most importantly, the U.S. Government is prohibited by law from intruding into the standards development process. Specified by law: The National Technology Transfer and Advancement Act of 1995 (NTTAA), Public Law This Act directs Federal agencies to use standards developed by private standards development organizations, instead of government agencies, whenever feasible.
Industry is not permitted to create coercive standards: When industry creates a standard, it should be remembered that every design element can potentially benefit some entities and harm others. This is why the standards process can be contentious. The U.S. RICO laws are invoked as a potential concern for standards developers. RICO is the Racketeer Influenced and Corrupt Organizations Act, U.S. Code Title 18, Part 1, Chapter 96.
From RICO: (a) Whoever in any way or degree obstructs, delays, or affects commerce or the movement of any article or commodity in commerce, by robbery or extortion or attempts or conspires so to do, or commits or threatens physical violence to any person or property in furtherance of a plan or purpose to do anything in violation of this section shall be fined under this title or imprisoned not more than twenty years, or both. (b) As used in this section- (1) The term "robbery" means the unlawful taking or obtaining of personal property from the person or in the presence of another, against his will, by means of actual or threatened force, or violence, or fear of injury, immediate or future, to his person or property, or property in his custody or possession, or the person or property of a relative or member of his family or of anyone in his company at the time of the taking or obtaining. (2) The term "extortion" means the obtaining of property from another, with his consent, induced by wrongful use of actual or threatened force, violence, or fear, or under color of official right.
Government and Standards Organizations often bet on the wrong horse. From Wikipedia: ISO/OSI “The model was defined by the International Organization for Standardization in the ISO standard Of course, by that time, TCP/IP had been in use for years. TCP/IP was fundamental to ARPANET and the other networks that evolved into the Internet..... Only a subset of the whole OSI model is used today. It is widely believed that much of the specification is too complicated and that its full functionality has taken too long to implement, although there are many people who strongly support the OSI model. On the other hand, many feel that the best thing about the whole ISO networking effort is that it failed before it could do too much damage.”
Government and Standards Organizations often bet on the wrong horse. I3C: Dozens of industry and government leaders united to develop health care interoperability. Industry: Sun Microsystems' Informatics Advisory Council, IBM, Apple, Oracle. Federal Government: National Cancer Institute, National Human Genome Research Institute. Now defunct: the impression is that the group conceded the effort to the W3C, which has a generic approach embodied in the semantic web.
LDIP is a way of specifying an image and is not a standard. LDIP simply uses generic W3C standards to create a simple way of expressing image information. You can learn basic RDF in a few minutes. Methods used here can (and should) be extended to other biomedical domains.
LDIP uses RDF, an existing, simple, generic syntax recommended by the W3C
RDF files are collections of statements expressed as data triples:
“Jules Berman” “blood glucose level” “85”
“Mary Smith” “eye color” “brown”
“Samuel Rice” “eye color” “blue”
“Jules Berman” “eye color” “brown”
When you bind a key/value pair to a specified object, you're moving from the realm of data structure into the realm of data meaning.
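As a rough illustration of the triple idea, the four statements on this slide can be held as plain Python tuples. This is only a sketch: the `describe` helper is a hypothetical name, and real RDF would use URIs rather than bare strings for subjects and predicates.

```python
from collections import defaultdict

# Each statement is a (subject, predicate, object) triple, copied from the slide.
triples = [
    ("Jules Berman", "blood glucose level", "85"),
    ("Mary Smith", "eye color", "brown"),
    ("Samuel Rice", "eye color", "blue"),
    ("Jules Berman", "eye color", "brown"),
]


def describe(subject, statements):
    """Collect every predicate/value pair bound to one subject (hypothetical helper)."""
    found = defaultdict(list)
    for s, p, o in statements:
        if s == subject:
            found[p].append(o)
    return dict(found)


jules = describe("Jules Berman", triples)
print(jules)
```

Because every key/value pair carries its subject with it, the statements can be listed in any order without losing what they are about.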
Medical file:
“Jules Berman” “blood glucose level” “85”
“Mary Smith” “eye color” “brown”
“Samuel Rice” “eye color” “blue”
“Jules Berman” “eye color” “brown”
Hat file:
“Sally Frann” “hat size” “8”
“Jules Berman” “hat size” “9”
“Fred Garfield” “hat size” “9”
“Fred Garfield” “hat_type” “bowler”
Merged Jules Berman database:
“Jules Berman” “blood glucose level” “85”
“Jules Berman” “eye color” “brown”
“Jules Berman” “hat size” “9”
RDF permits data to be merged between different files
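The merge on this slide can be sketched in a few lines of Python, assuming each file is simply a list of triples. The helper names `merge` and `about` are hypothetical, and matching subjects by name only works here because the example uses the same string in both files; real RDF relies on unique identifiers for that.

```python
medical_file = [
    ("Jules Berman", "blood glucose level", "85"),
    ("Mary Smith", "eye color", "brown"),
    ("Samuel Rice", "eye color", "blue"),
    ("Jules Berman", "eye color", "brown"),
]

hat_file = [
    ("Sally Frann", "hat size", "8"),
    ("Jules Berman", "hat size", "9"),
    ("Fred Garfield", "hat size", "9"),
    ("Fred Garfield", "hat_type", "bowler"),
]


def merge(*files):
    """Concatenate triple files, dropping exact duplicates and keeping first-seen order."""
    seen = set()
    merged = []
    for statements in files:
        for triple in statements:
            if triple not in seen:
                seen.add(triple)
                merged.append(triple)
    return merged


def about(subject, statements):
    """Return only the triples whose subject matches."""
    return [t for t in statements if t[0] == subject]


merged_jules = about("Jules Berman", merge(medical_file, hat_file))
for triple in merged_jules:
    print(triple)
```

No schema mapping is needed for the merge itself, since each triple is already a complete statement.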
"The image is a squamous cell carcinoma of the floor of the mouth. It was taken by Jules Berman, on February 2, The microscope was an Olympus model The lens objective was 40x. The camera was a Sony model 342. The image dimensions are 524 by 429 pixels. The microscope and camera were not calibrated. The specimen is Baltimore Hospital Center S , specimen 2, block 3. The specimen was logged in 8/15/01 and processed using the standard protocol for H&E that was in place for that day. The patient is Sam Someone, medical identifier 4357. The tissue was received in formalin. The specimen shows a moderately differentiated, invasive squamous carcinoma. The patient has a 30 year history of oral tobacco use. The image is kept in jpeg (Joint Photographic Experts Group) file format and named y49w3p2.jpg and kept in the pathology subdirectory of the hospital's server. Its URL is The image file has an md_5 hash value of gjsj The image has no watermark. Copyright is held by Baltimore Hospital Center, and all rights are reserved."
:] [.] rdf:.]
[:Baltimore_Hospital_Center rdf:type "Hospital".]
[:Baltimore_Hospital_Center_4357 rdf:type "Unique_medical_identifier".]
[:Baltimore_Hospital_Center_4357 :patient_name "Sam_Someone".]
[:Baltimore_Hospital_Center_4357 :surgical_pathology_specimen "S3456_2001".]
[:S_3456_2001 rdf:type "Surgical_pathology_specimen".]
[:S_3456_2001 :image.]
[:S_3456_2001 :log_in_date " ".]
[:S_3456_2001 :clinical_history "30_years_oral_tobacco_use".]
[ rdf:type "Medical_image".]
[ :surgical_pathology_accession_number "S ".]
[ :specimen "2".]
[ :block "3".]
[ :format "jpeg".]
[ :width "524_pixels".]
[ :height "429_pixels".]
[ :hash_value " gjsj350489".]
[ :hash_type "md_5".]
[ :watermark "none".]
[ :camera "Sony".]
[ :camera_model "342".]
[ :capture_date " ".]
[ :diagnosis "squamous_cell_carcinoma".]
[ :topography "floor_of_mouth".]
[ :has "Intellectual_property_restriction".]
[ :copyright "all_rights_reserved".]
[ :copyright_holder "Baltimore_Hospital_Center".]
[ :microscope "Olympus".]
[ :microscope_model "3453".]
[ :microscope_objective_power "40X".]
[ :photographer_name "Jules_Berman".]
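To suggest how little machinery is needed to read statements of this shape, here is a minimal sketch that pulls triples out of the bracketed notation with a regular expression. It is an assumption-laden toy, not an N3 parser: it only handles statements with a subject, a predicate, and a quoted literal, and the names `STATEMENT`, `parse_statements`, and `properties_of` are hypothetical. The sample uses only statements from the slide that survive intact.

```python
import re

# Matches: [subject predicate "literal".]  (assumed shape; quoted objects only)
STATEMENT = re.compile(r'\[\s*(\S+)\s+(\S+)\s+"([^"]*)"\s*\.\s*\]')

sample = '''
[:Baltimore_Hospital_Center rdf:type "Hospital".]
[:Baltimore_Hospital_Center_4357 rdf:type "Unique_medical_identifier".]
[:Baltimore_Hospital_Center_4357 :patient_name "Sam_Someone".]
[:S_3456_2001 rdf:type "Surgical_pathology_specimen".]
[:S_3456_2001 :clinical_history "30_years_oral_tobacco_use".]
'''


def parse_statements(text):
    """Return a list of (subject, predicate, object) tuples found in the text."""
    return [match.groups() for match in STATEMENT.finditer(text)]


def properties_of(subject, statements):
    """Map predicate to value for one subject (a later value overwrites an earlier one)."""
    return {p: o for s, p, o in statements if s == subject}


statements = parse_statements(sample)
specimen = properties_of(":S_3456_2001", statements)
print(specimen)
```

A production system would use an RDF library instead, which also resolves the namespace prefixes to full URIs.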
Proper triples
“Jules Berman” “blood glucose level” “85”
(a specified object) (well-defined metadata) (datatyping)
1. Unique identifiers for unique objects: URIs, LSIDs, other identification systems
2. Class identifiers for class objects. Examples: image class, person class, report class, event class
3. Formal Common Data Elements protected in namespaces. Examples: chem:blood_glucose, ldip:imaging_device
4. Datatyping using xsd for data types. Examples: integer, string literal, one of an enumeration list
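The datatyping point can be illustrated with a small validator that checks a triple's value against the type registered for its property. This is a sketch only: `chem:blood_glucose` comes from the slide, while `ldip:eye_color`, the enumeration values, the example subject URI, and the helper names are all assumptions made for illustration, and the checks merely imitate xsd integer and enumeration types.

```python
import re


def is_integer(value):
    """Imitates an xsd integer check on a string value."""
    return re.fullmatch(r"[+-]?\d+", value) is not None


def one_of(*allowed):
    """Imitates an xsd enumeration: the value must be one of the allowed strings."""
    return lambda value: value in allowed


# Hypothetical registry of namespaced properties and their datatypes.
DATATYPES = {
    "chem:blood_glucose": is_integer,
    "ldip:eye_color": one_of("brown", "blue", "green", "hazel"),
}


def check(triple):
    """True if the property is registered and the value satisfies its datatype."""
    _subject, prop, value = triple
    validator = DATATYPES.get(prop)
    return validator is not None and validator(value)


patient = "urn:example:patient:4357"  # hypothetical unique identifier
print(check((patient, "chem:blood_glucose", "85")))
print(check((patient, "chem:blood_glucose", "high")))
```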
CDEs in RDF are either classes or properties. The LDIP model for CDEs is designed to support automatic transformation into an RDF schema.
The format for classes is:
Class Label (in standard XML tag format, uppercase 1st letter):
Registration Authority: Association for Pathology Informatics
Cardinality (default is "/[0-9]+/"):
Comment (must include detailed definition):
subClassOf:
Contributor (your consistent first-name last-name):
Date of your contribution (/[\d]{2}\-[\d]{2}\-[\d]{4}/):
The format for properties is:
Property Label (in standard XML tag format, lowercase 1st letter):
Registration Authority: Association for Pathology Informatics
Cardinality (default is "/[0-9]+/"):
Datatype (can be "literal", a list or a regex; default is "literal"):
Comment (must include detailed definition):
Domain (comma-delimited if multiple):
Range (default is "literal"):
Contributor (your consistent first-name last-name):
Date of your contribution (/[\d]{2}\-[\d]{2}\-[\d]{4}/):
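A plain-text property record in this format can be read and checked with very little code. The sketch below assumes one "Field: value" pair per line and applies three of the stated rules (lowercase first letter, required comment, date pattern). The sample record is invented for illustration: only the `imaging_device` label echoes a name used elsewhere in these slides, and the contributor, date, domain, and helper names are hypothetical.

```python
import re

# Date pattern from the slide, written without the redundant brackets.
DATE = re.compile(r"\d{2}-\d{2}-\d{4}")

SAMPLE = """
Property Label: imaging_device
Registration Authority: Association for Pathology Informatics
Cardinality: /[0-9]+/
Datatype: literal
Comment: The device used to capture the image, such as a camera.
Domain: Medical_image
Range: literal
Contributor: Jane Doe
Date of your contribution: 08-18-2006
"""


def parse_cde(text):
    """Split 'Field: value' lines into a dict, splitting on the first colon only."""
    record = {}
    for line in text.strip().splitlines():
        field, _, value = line.partition(":")
        record[field.strip()] = value.strip()
    return record


def problems(record):
    """Return a list of rule violations; an empty list means the record passes."""
    issues = []
    if not record.get("Property Label", "")[:1].islower():
        issues.append("label must start with a lowercase letter")
    if not record.get("Comment"):
        issues.append("comment with a definition is required")
    if not DATE.fullmatch(record.get("Date of your contribution", "")):
        issues.append("date must match dd-dd-dddd")
    return issues


record = parse_cde(SAMPLE)
print(problems(record))
```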
Example: Common Data Element for “Instrument”
Class Label: Instrument
versionInfo (required): 0.1
Registration Authority (required): Association for Pathology Informatics
Language: en
Cardinality (required): /[0-9]+/
Datatype: Literal
comment: All the instruments used in preparing, viewing, and imaging a specimen. Includes: microscope, camera.
subClassOf: Class
Contributor: Bill Moore
Date_of_contribution:
The plain-text list of CDEs can be automatically converted into RDF schema and xsd user-defined datatypes.
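One way such a conversion might look is sketched below for the “Instrument” class, using only Python's standard `xml.etree.ElementTree`. This is not the LDIP project's converter. The function name is hypothetical, only four of the fields are mapped, and treating "subClassOf: Class" as a reference to `rdfs:Class` is an assumption.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
XML = "http://www.w3.org/XML/1998/namespace"
ET.register_namespace("rdf", RDF)
ET.register_namespace("rdfs", RDFS)

# Fields taken from the "Instrument" slide; unmapped fields are omitted.
instrument = {
    "Class Label": "Instrument",
    "Language": "en",
    "comment": ("All the instruments used in preparing, viewing, and imaging "
                "a specimen. Includes: microscope, camera."),
    "subClassOf": "Class",
}


def cde_class_to_rdfs(cde):
    """Render one class CDE as an rdfs:Class element (hypothetical mapping)."""
    cls = ET.Element(f"{{{RDFS}}}Class", {f"{{{RDF}}}ID": cde["Class Label"]})
    label = ET.SubElement(cls, f"{{{RDFS}}}label", {f"{{{XML}}}lang": cde["Language"]})
    label.text = cde["Class Label"]
    comment = ET.SubElement(cls, f"{{{RDFS}}}comment")
    comment.text = cde["comment"]
    parent = cde["subClassOf"]
    # Assumption: "Class" means rdfs:Class; any other name is a class in this schema.
    resource = RDFS + "Class" if parent == "Class" else "#" + parent
    ET.SubElement(cls, f"{{{RDFS}}}subClassOf", {f"{{{RDF}}}resource": resource})
    return ET.tostring(cls, encoding="unicode")


schema_xml = cde_class_to_rdfs(instrument)
print(schema_xml)
```

A full converter would also wrap the classes in an `rdf:RDF` root element and emit the property CDEs with their domains and ranges.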
Summary 1. Not to make a new standard for pathology images or to compete with existing standards. 2. To develop a way for people to specify information about their images using standard, simple and generic annotation methods. 3. To provide a standard syntax for conveying this information with their image binaries. The resultant files can be used for telepathology, consultation, posted to the web, submitted as supplemental material with publications, etc. 4. To integrate terminology from existing healthcare standards (HL7, DICOM, OME, others). 5. To provide an infrastructure that can be used by developers as a data exchange specification that will support interoperability between applications and standards.