Considerations on barriers to data sharing
Elaine Collier, MD
National Center for Research Resources
National Institutes of Health
Considerations on Barriers
  Existence of Data
  Data or Information Characteristics
  Technical
  Policy and Cultural Impacts
  Competing Interests
  Privacy and Confidentiality
  Cost
Existence of Data
  Does the data or information exist?
  Human or machine discoverable?
  Requirements for discovery?
  Characteristics of data discoverable?
  Usability of data?
  Access requirements for data?
  Existence should be easily discoverable but…
Data Characteristics
  Semantics/Meaning of Data
  Format or syntactic interoperability of data
  Completeness – raw, aggregate, derived, selected
  History of Data – provenance, curation
  Links to other data?
  Public or Private
  Access controls
Semantics/Data quality
  Meaning
  Content
  Context
  Temporal aspects
  Granularity
  Provenance or history
  Language
  Durability
  Annotation
  Properties
  Framework
  Community needs or requirements
  Shared Standards
Technical
  Semantic and syntactic interoperability
  Data formats and communication protocols
  Availability – persistent?
  Versioning?
  Requirements for re-use, or re-purposing?
  Cleaning, derivation, documentation?
  Repository or publisher of data
  Linked to other data?
  Preservation
  Impact of technology changes
Policy and Cultural
  Fear of misuse of data
  Confidentiality or privacy concern
  Legal considerations
  Intellectual Property
  Resources for sharing data
  Resources for accessing data
    – Approval process?
    – Costs – technical, policy agreement
Competing interests
  Researcher or collector
  Institution or company
  National competition or cooperation
  Public versus private
  Fame or Money?
  Public Good versus Private Good?
  Compete on use not data itself
Privacy and Security
  Impact of sharing
    – Individual
    – Institution
    – Population
    – Country
    – World
  Cultural
  Raw, aggregate, identifiable, links
  Clinical care and clinical research data
  Audience
Cost
  Whose cost?
  Cost versus value?
  Open or closed market?
  Cultural and policy issues related to cost
  Technical costs
  Impact on usability and availability of data