Julia Lane, and many many coauthors. BIG DATA DEFINITION “Big Data” is an imprecise description of a rich and complicated set of characteristics, practices,

Presentation on theme: "Julia Lane, and many many coauthors. BIG DATA DEFINITION “Big Data” is an imprecise description of a rich and complicated set of characteristics, practices,"— Presentation transcript:

1 Julia Lane, and many many coauthors

2 BIG DATA DEFINITION “Big Data” is an imprecise description of a rich and complicated set of characteristics, practices, techniques, ethics, and outcomes all associated with data. (AAPOR) No canonical definition By characteristics: Volume Velocity Variety (and Variability and Veracity) By source: found vs. made By use: professionals vs. citizen science By reach: datafication By paradigm: Fourth paradigm Source: Julia Lane

3 IMPLICATIONS FOR MEASUREMENT New business model Federal agencies no longer major players New analytical model Outliers Fine-grained analysis New units of analysis New sets of skills Computer scientists Citizen scientists => Different cost structure

4 EXAMPLE Source: Ian Foster, University of Chicago

5 Source: Jason Owen Smith and UMETRICS data

6 ACCESS FOR RESEARCH

7 VALUE IN OTHER FIELDS

8

9 DATA HAVE VALUE

10 SO WE NEED TO GET THINGS RIGHT

11 VALUE IN OTHER FIELDS

12 CORE QUESTIONS What is the legal framework? What is the practical framework? What is the statistical framework?

13 LEGAL FRAMEWORK Current legal structure inadequate “The recording, aggregation, and organization of information into a form that can be used for data mining, here dubbed ‘datafication’, has distinct privacy implications that often go unrecognized by current law” (Strandburg) Assessment of harm from privacy inadequate Privacy and big data are incompatible Anonymity not possible Informed consent not possible Source: Julia Lane

14 BAROCAS AND NISSENBAUM

15 INFORMED CONSENT (NISSENBAUM)

16 STATISTICAL FRAMEWORK Importance of valid inference The role of statisticians/access Inadequate current statistical disclosure limitation Diminished role of federal statistical agencies Limitations of surveys New analytical framework: Mathematically rigorous theory of privacy Measurement of privacy loss Differential privacy
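
The slide names differential privacy as a way to measure privacy loss but gives no mechanism. As an illustration only, and not something taken from the presentation, the following is a minimal sketch of the standard Laplace mechanism applied to a counting query. The function names `laplace_noise` and `private_count`, the record layout, and the choice of epsilon are all assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy `predicate`.

    Hypothetical helper: adding or removing one record changes a count by
    at most 1, so the sensitivity is 1 and Laplace noise with scale
    1/epsilon gives epsilon-differential privacy for this single query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Made-up records, used only to exercise the function.
    people = [{"age": a} for a in range(100)]
    noisy = private_count(people, lambda r: r["age"] >= 50, epsilon=0.5)
    print(f"noisy count of people aged 50+: {noisy:.2f}")
```

A smaller epsilon means more noise and a smaller privacy loss per query. In this sketch epsilon is the quantity that makes the "measurement of privacy loss" concrete, and answering several queries about the same data adds their epsilons together.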

17 PRACTICAL FRAMEWORK

18 SOME SUGGESTIONS

19 AND A REMINDER OF WHY

20

