Data Sharing Spoke Northeast Big Data Innovation Hub


Data Sharing Spoke, Northeast Big Data Innovation Hub
Carsten Binnig (Brown), Jane Greenberg (Drexel), Tim Kraska (Brown), Sam Madden (MIT)

Why Is Data Sharing Important?
Different reasons:
  Combining data
  Sharing with experts
  …
Promise: Better insights into “Big Data”

Combining Data: Collaborative Cancer Cloud
Secure genomic data sharing across our three institutions
Indeed, Dana-Farber is one of three partners in the Intel CCC pilot; the other two are Oregon Health & Science University (OHSU) and the Ontario Institute for Cancer Research (OICR). All three of these partners are working together on a variety of projects (including, notably, research to identify previously unknown cancer-causing mutations) and coding new tools to aid their efforts, using the CCC. The partners seem well-matched for each other collaboratively. "We are working with Intel, OICR, and OHSU to develop a one-year pilot project to demonstrate secure genomic data sharing across our three institutions," explained Cerami. "Each of our institutions currently perform some type of genomic sequencing on patients, and the goal of the pilot project is to pool genomic data across all three centers and make it available for joint computation."

Sharing Data with Researchers: Financial Data
Challenge: Consumer credit risk analysis and forecasting
Approach: Machine learning
Data: 1% sample, 10 TB
[Figure: Risk Measure vs. CScore for 2008Q4, comparing the FICO score with the machine-learning score for accounts that are current, 30 days late, 60 days late, and 90+ days late]
The best-known and most widely used credit score model in the United States, the FICO score is calculated statistically, with information from a consumer's credit files.
Source: Andrew W. Lo et al. (MIT Sloan School)
Machine learning detects potential defaults more accurately than FICO scores!

BUT Data Sharing is HARD!

Why Not Open Data?

Barriers To Data Sharing
Must go beyond “Creative Commons”:
  Incentives: why would someone go to all the effort to share their valuable data?
  Concerns over sensitive information (e.g., PII)
  Regulations governing use of data in different domains
Not just “throwing it over the wall”!
  Owners do not want to lose control over their data
  Can I get my data back?
  Data has to be updated, may require training, has to be redacted, etc.

Sharing Data Today
No data sharing without a legal agreement
Lawyers are involved to create an individual agreement, which often prevents sharing!

Data Sharing Spoke: Goals
  Data-sharing licensing framework / generator
  Data-sharing platform (enforce licenses)
  Metadata (search licenses & data)
Principle: Solve the 80% case!

Goal: Licensing Framework
Standard terms that researchers, lawyers, and compliance teams can all conform to:
  Controlled access
  Tracking of access
  Usage rights (e.g., publication, copying)
  Duration of use
  Warranties of correctness/completeness/availability
  Other requirements and regulations
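The standard terms listed above could be captured in a machine-readable record that a license generator fills in. The following is only a minimal sketch of that idea: the `DataLicense` class, its field names, and `render_summary` are hypothetical and are not taken from the Spoke's framework.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DataLicense:
    """Hypothetical container for the standard terms of a sharing agreement."""
    licensee: str
    allowed_users: frozenset      # controlled access
    track_access: bool            # tracking of access
    may_publish: bool             # usage rights: publication
    may_copy: bool                # usage rights: copying
    valid_from: date              # duration of use (start)
    valid_until: date             # duration of use (end, inclusive)
    warranties: tuple = ()        # correctness / completeness / availability
    regulations: tuple = ()       # other requirements, e.g. "HIPAA"

    def is_active(self, on: date) -> bool:
        """Return True if the license covers the given day."""
        return self.valid_from <= on <= self.valid_until


def render_summary(lic: DataLicense) -> str:
    """Produce a plain-text summary of the terms for human review."""
    lines = [
        f"Licensee: {lic.licensee}",
        f"Controlled access: {', '.join(sorted(lic.allowed_users))}",
        f"Access tracked: {'yes' if lic.track_access else 'no'}",
        f"Publication allowed: {'yes' if lic.may_publish else 'no'}",
        f"Copying allowed: {'yes' if lic.may_copy else 'no'}",
        f"Duration of use: {lic.valid_from.isoformat()} to {lic.valid_until.isoformat()}",
        f"Warranties: {', '.join(lic.warranties) or 'none'}",
        f"Regulations: {', '.join(lic.regulations) or 'none'}",
    ]
    return "\n".join(lines)
```

A structured record like this is what would let a platform check terms automatically instead of relying on a reader of the legal text.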

Licenses: First Results
Data-Sharing Workshop 2016 (Metadata Research Center @ Drexel): approx. 60 participants from industry + academia
Hear from the trenches:
  What works? What doesn’t?
  What are the biggest barriers? (What are the non-barriers?)
Brainstorm solutions:
  Would standardized licenses, use cases/best practices help?
  Would better technologies help?
Forge a path forward, together
Agenda and Report: http://cci.drexel.edu/mrc/news/2016-11-bigdatahubworkshop/

Licenses: First Results
Collected sharing agreements from academic institutions
Compiled a list of standard terms for:
  General (time period, use of data, ...)
  Privacy & Protection (PII, security, training)
  Access (Who? How?)
  Responsibility (indemnity clause, ownership, rights)
  Compliance (background checks, right to audit, ...)
  Data Handling (allowed methods of data transfer, ...)

Goal: Hosted Data-Sharing Platform
[Diagram labels: data owner, data, ShareDB, data user; ShareDB is annotated with "Suitably aggregated, de-identified, and fingerprinted data", "Training", and "Access log"]
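One way to read "suitably aggregated" is that the platform releases only group-level statistics and withholds groups that are too small to hide an individual. The sketch below illustrates that reading only; the function name, the `min_group_size` threshold, and the row layout are assumptions, not a description of ShareDB.

```python
from collections import defaultdict


def aggregate_with_suppression(rows, group_key, value_key, min_group_size=5):
    """Return per-group count and mean, dropping groups below min_group_size.

    rows is a list of dicts; group_key and value_key name the columns to use.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])

    result = {}
    for key, values in groups.items():
        # Small groups are suppressed because their statistics could
        # reveal information about individual records.
        if len(values) >= min_group_size:
            result[key] = {"count": len(values), "mean": sum(values) / len(values)}
    return result
```

The threshold is a policy choice that would come from the sharing agreement, not from the code.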

Is this possible: Technology ⨝ Sharing Agreements
Technology:
  Access control & rights management
  Expiration
  Logging & auditing
  Provenance/fingerprinting
  De-identification
  “Noising”
  Aggregation
Agreement clauses:
  Controlled access (who & where)
  Tracking of access
  Usage rights (e.g., publication, copying)
  Duration of use
  Warranties of correctness/completeness/availability
  Other requirements and regulations
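To make the join between technology and agreement clauses concrete, the sketch below pairs three clauses with mechanisms: controlled access with an allow-list, duration of use with an expiration check, and tracking of access with a log. The `SharedDataset` class and its method names are hypothetical; a real platform would need authentication and tamper-resistant logging, which this sketch leaves out.

```python
from datetime import datetime, timezone


class SharedDataset:
    """Hypothetical wrapper that enforces three agreement clauses on reads."""

    def __init__(self, records, allowed_users, expires_at):
        self._records = list(records)
        self._allowed_users = set(allowed_users)   # controlled access
        self._expires_at = expires_at              # duration of use
        self.access_log = []                       # tracking of access

    def read(self, user, now=None):
        """Return a copy of the records if the agreement permits it."""
        now = now or datetime.now(timezone.utc)
        if user not in self._allowed_users:
            outcome = "denied: user not covered by agreement"
        elif now >= self._expires_at:
            outcome = "denied: agreement expired"
        else:
            outcome = "granted"

        # Every attempt is logged, including the denied ones.
        self.access_log.append((now.isoformat(), user, outcome))
        if outcome != "granted":
            raise PermissionError(outcome)
        return list(self._records)
```

Clauses such as usage rights or warranties have no equally direct mechanism here, which is the point of the question on the slide.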

Platform: First Results
De-identification is a major obstacle for data sharing (e.g., HIPAA, FERPA, …)
  HIPAA: Health Insurance Portability and Accountability Act
  FERPA: Family Educational Rights and Privacy Act
Goal: Automatic de-identification
  Detect sensitive columns (rule catalog, user-defined, machine learning, …)
  Automatically de-identify
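The rule-catalog route to detecting sensitive columns can be sketched with a few regular expressions. Everything below is an illustrative assumption: the `RULE_CATALOG` patterns, the 80% match threshold, and the salted-hash replacement are not taken from the platform, and salted hashing is pseudonymization, which by itself does not establish HIPAA or FERPA compliance.

```python
import hashlib
import re

# Hypothetical rule catalog: rule name -> pattern that a whole value must match.
RULE_CATALOG = {
    "ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}$"),
}


def detect_sensitive_columns(table, threshold=0.8):
    """Map column name -> first rule matched by at least `threshold` of its values.

    table is a dict of column name -> list of string values.
    """
    sensitive = {}
    for column, values in table.items():
        non_empty = [v for v in values if v]
        if not non_empty:
            continue
        for rule_name, pattern in RULE_CATALOG.items():
            hits = sum(1 for v in non_empty if pattern.match(v))
            if hits / len(non_empty) >= threshold:
                sensitive[column] = rule_name
                break
    return sensitive


def deidentify(table, sensitive_columns, salt):
    """Return a copy of the table with sensitive values replaced by salted hashes."""
    result = {}
    for column, values in table.items():
        if column in sensitive_columns:
            result[column] = [
                hashlib.sha256((salt + v).encode("utf-8")).hexdigest()[:12]
                for v in values
            ]
        else:
            result[column] = list(values)
    return result
```

Columns that no rule catches, such as free-text names, are where the user-defined and machine-learning detectors mentioned above would have to take over.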

HIPAA: Interactive De-identification
[Diagram labels: data owner, data, ShareDB, de-identified data, data user]

Next Steps
  Next Data Sharing Spoke Workshop (Fall 2017)
  Collect more agreements and create license framework 0.1
  Extend tooling support (watermarking, etc.)
  Metadata support

Questions?