Published by Prudence Wood. Modified over 9 years ago.
1
The View from Computation and Algorithms
Andrew Olney
University of Memphis
2
This Session
Una-May O’Reilly
  – MOOCs: Research collaboration, data privacy, and the role of technology
Shuangbao Wang
  – The illusion of privacy in an age of cyberinsecurity
Solon Barocas
  – Big data and unexpected threats to privacy
3
My Background
Research
  – Language, Education, AI
Data
  – Video, Speech, Motion, Posture, Text, EEG, Eyetracking, Learning, Decisions/Judgments
Admin
4
MOOCdb (Una-May)
Open-ended standard data description
Enable cross-course analysis
5
Video (Shuangbao)
Automated video content analysis (inVideo)
  – Audio: keywords/language patterns
  – Video: reference pictures/knowledge
inVideo could be applied to provide rich data on videos, turn them into more effective learning tools, and improve MOOCs
6
Privacy Threats (Solon)
Benefits
  – Scientific knowledge
  – Decision making
  – Self knowledge
Privacy protections must be sufficient to enable benefits
Problems
  – Anonymity is an oxymoron
      An identifier is an identifier
      De-anonymization
      Inference
  – Informed consent cannot be guaranteed
  – Tyranny of the minority – the Target case
Risk assessment
7
Focus Questions
Threats/harms
  – De-anonymization
  – Public perception/discouragement
Potential value
  – Scientific knowledge
  – Decision making
  – Self knowledge
What IRB should do
8
Deanonymization
Encryption?
Self-identification
  – AOL’s 4417749
Cross-comparison
  – Netflix (external)
  – Target (internal)
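
Cross-comparison can be illustrated with a toy linkage attack. The sketch below is not taken from the talk: every record, field name, and function name is invented for illustration. It assumes an "anonymized" release that keeps a few ordinary attributes and an external data set that lists the same attributes next to names.

```python
# Toy linkage (cross-comparison) attack. All records below are invented.

# "Anonymized" release: names removed, but ordinary attributes kept.
released = [
    {"zip": "38111", "birth_year": 1980, "gender": "F", "grade": 91},
    {"zip": "38111", "birth_year": 1975, "gender": "M", "grade": 78},
    {"zip": "38152", "birth_year": 1980, "gender": "F", "grade": 85},
]

# External data set that carries names alongside the same attributes.
public = [
    {"name": "Alice", "zip": "38111", "birth_year": 1980, "gender": "F"},
    {"name": "Bob", "zip": "38111", "birth_year": 1975, "gender": "M"},
    {"name": "Carol", "zip": "38104", "birth_year": 1990, "gender": "F"},
]

# Hypothetical choice of attributes shared by both data sets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def key(record):
    """Project a record onto the shared attributes."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(released_rows, public_rows):
    """Return (name, released_row) pairs whose key matches exactly one person."""
    names_by_key = {}
    for row in public_rows:
        names_by_key.setdefault(key(row), []).append(row["name"])
    matches = []
    for row in released_rows:
        candidates = names_by_key.get(key(row), [])
        if len(candidates) == 1:  # unambiguous match, so the row is re-identified
            matches.append((candidates[0], row))
    return matches


reidentified = link(released, public)
for name, row in reidentified:
    print(name, "->", row["grade"])
```

In this toy data two of the three released rows link to a single named person, even though the release itself contains no names.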
9
Identifiability
How much “encryption” is enough?
  – Time vs. set size
Is it possible to guarantee?
  – Relative to data type
  – Relative to cross-comparison
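
One possible reading of "time vs. set size" is that replacing an identifier with a hash only protects it if the set of possible identifiers is too large to enumerate. The sketch below works under that assumption and is not the speaker's method: the unsalted SHA-256 scheme, the four-digit ID space, and the function names are all hypothetical.

```python
import hashlib


def pseudonymize(user_id: int) -> str:
    """Replace an ID with its unsalted SHA-256 digest (assumed scheme)."""
    return hashlib.sha256(str(user_id).encode("utf-8")).hexdigest()


def recover(token: str, id_space: range):
    """Try every ID in the space until one hashes to the token."""
    for candidate in id_space:
        if pseudonymize(candidate) == token:
            return candidate
    return None


# Hypothetical four-digit ID space: at most 10,000 hashes to reverse a token.
token = pseudonymize(4417)
recovered = recover(token, range(10_000))
print(recovered)
```

The attacker's work here grows linearly with the size of the ID space, so a small or predictable space offers little protection regardless of the hash function used.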
10
Identifiable data types
Important characteristics
  – Stationary
  – Distinctive
Face
Vocal tracts
Movement
Word choice
12
Cross-comparison
13
Threats
Deanonymization very real
  – Low dimensionality data set with “vanilla” indicators
“Real World” data makes it worse
  – More chance of cross-comparison
  – But this is where the interesting questions are
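
The point about low-dimensional data with "vanilla" indicators can be made concrete by counting how many records become unique as plain attributes are combined. This is a minimal sketch with invented rows and a hypothetical helper name, not an analysis from the talk.

```python
from collections import Counter

# Invented rows: (zip code, birth year, gender).
records = [
    ("38111", 1980, "F"),
    ("38111", 1980, "F"),
    ("38111", 1975, "M"),
    ("38152", 1980, "F"),
    ("38152", 1962, "M"),
]


def unique_fraction(rows, columns):
    """Fraction of rows whose values in the given columns appear only once."""
    keys = [tuple(row[i] for i in columns) for row in rows]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(rows)


print(unique_fraction(records, [2]))        # gender alone
print(unique_fraction(records, [0]))        # zip code alone
print(unique_fraction(records, [0, 1, 2]))  # all three together
```

In this toy data no row is unique on gender or zip code alone, but three of the five rows are unique once all three attributes are combined.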
14
What should IRB do?
Risk analysis – centered
  – Worst case scenario considered for privacy/confidentiality breach
How will data be shared
  – Is public ‘anonymized’ warranted?
  – Restricted-use
16
Questions? http://andrewmolney.name