Published by Allen Cannon. Modified over 6 years ago.
1
Metadata Capital: Simulating the Predictive Value of Self-Generated Health Information (SGHI)
IEEE Big Data 2014
Jane Greenberg, Adrian Ogletree (CCI/Drexel University, Metadata Research Center)
Angela P. Murillo, Thomas P. Caruso, Herbie Huang (University of North Carolina at Chapel Hill)
2
Metadata Capital
- Metadata duplication is inefficient and tedious
- Metadata as an asset, a product
- Reuse of good-quality metadata increases the value of the initial investment
- Goals: discover and advance the application of methods for quantifying the cost and value of metadata over time; raise dialog …
- M.C. is incremental: successive growth rates

- An economic concept (Weber, 1905; Smith, 1776)
- Business and operations (net gains or losses)
- Finances, goods and services, and public needs
- Intellectual capital (Marr, 2005)
- Social capital
- A tangible result; value can increase, or…

Modified capital-sigma notation
$$R + \sum_{i=1}^{n} a_i = R + a_1 + a_2 + a_3 + \dots + a_n$$
4
Modified Capital-sigma notation
$$R + \sum_{i=1}^{n} a_i = R + a_1 + a_2 + a_3 + \dots + a_n$$
- R = value of the metadata record
- i = usage number (index of each reuse)
- a_i = incremental increase in value at usage i
- n = maximum number of reuses
[Figure: cost/value plotted against reuse]
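The modified capital-sigma formula can be read as a running total: start from the record's value R and add one increment per reuse. The sketch below is only an illustration of that arithmetic. The function names and the numbers (a record worth 10 units, 24 reuses each adding 2 units) are hypothetical and do not come from the presentation.

```python
from typing import List, Sequence


def metadata_capital(record_value: float, increments: Sequence[float]) -> float:
    """Return R + a_1 + a_2 + ... + a_n for the given reuse increments."""
    return record_value + sum(increments)


def running_capital(record_value: float, increments: Sequence[float]) -> List[float]:
    """Return the capital after 0, 1, ..., n reuses (n + 1 values)."""
    total = record_value
    history = [total]
    for gain in increments:
        total += gain
        history.append(total)
    return history


# Hypothetical inputs: R = 10 units, 24 reuses each worth 2 units.
record_value = 10.0
reuse_gains = [2.0] * 24

total_capital = metadata_capital(record_value, reuse_gains)
print(total_capital)  # 10 + 24 * 2 = 58.0
```

Keeping the increments as a list rather than a single constant leaves room for unequal reuse values, which the one-dimensional formula permits but does not require.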
5
Modified Capital-sigma notation
[Figure: cost/value plotted against reuse of metadata, showing robust metadata reuse from $a_1$ to $a_{24}$. DFC April 2013 NSF Review.]
6
Cycles… Successive growth rates
$$\sum_{i=1}^{n} i^{c} = \Theta(n^{c+1})$$
What about a successive growth rate tied to a concept? A concept can be:
- in: moving from vernacular to canonical
- falling by the wayside: less popular
- out: deprecated
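The successive-growth identity on this slide says that the sum of the first n values of i^c grows in proportion to n^(c+1). A quick numerical check, sketched below, makes that concrete: the ratio of the sum to n^(c+1) settles toward the constant 1/(c+1) as n grows. The function names and the choice c = 2 are assumptions made for illustration only.

```python
def power_sum(n: int, c: int) -> int:
    """Return 1**c + 2**c + ... + n**c."""
    return sum(i ** c for i in range(1, n + 1))


def growth_ratio(n: int, c: int) -> float:
    """Return the power sum divided by n**(c + 1)."""
    return power_sum(n, c) / n ** (c + 1)


# Illustrative choice of exponent; any non-negative integer c works.
c = 2
for n in (10, 100, 1000):
    print(n, growth_ratio(n, c))
```

For c = 2 the printed ratios approach 1/3, which is what the Θ bound predicts: the sum and n^(c+1) differ only by a constant factor in the limit.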
7
SGHI
9
The Metadata Capital Initiative ~ MetaDataCAPT’L ~
Explore methods for quantifying metadata cost and value over time. Metadata capital targets metadata as an asset containing contextual knowledge about data content.
Environments:
- Ontology development, in collaboration with the National Institute of Environmental Health Sciences (NIEHS).
- Self-generated health information (SGHI) monitoring daily activity, in collaboration with the Research Triangle Institute (RTI).
10
Advance nascent work on “metadata capital” for data science
- Discover and advance the application of methods for quantifying the cost and value of metadata over time; raise dialog
- Actively engage with the NCDS community
- Connect NCDS metadata efforts with the Research Data Alliance
11
Self Generated Health Information (SGHI)
“SGHI or Self-Generated Health Information is information generated by mobile health (mHealth apps), wearables and smartwatches. Quantified Self efforts tend to provide examples of the use of this information.” (T. Caruso)
12
Data facets
Validic API: “simple, standardized connection between healthcare companies and mobile health...”
Vendors/brands: BodyMedia, DailyMil, FatSecret, Fitbit, Fitbug, Fleetly, Garmin, Glooko, iHealth, Jawbone Up, ManageBGL, MapMyFitness, Moveable, MovesApp, MyGlucoHealth, Nike+, Omron, RunKeeper, Strava, VitaDock, Withings
Category: Fitness, Routine, Nutrition, Sleep, Weight, Diabetes, Biometrics, Tobacco Cessation
13
Metadata
Most popular fields: timestamp, type, start_time, distance, duration, calories, utc_offset
Total fields referenced (Fitbit), toward SGHIx:
- X (available): 39
- P (pending): 3
- NA (not available): 42
(Caruso & Ogletree)
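The three field counts on this slide can be turned into coverage shares with a few lines of arithmetic. The sketch below assumes the three categories are mutually exclusive and together make up every field referenced. That reading is an assumption, as are the dictionary keys and variable names.

```python
# Counts as reported on the slide for the Fitbit field mapping.
# Assumption: the three categories do not overlap.
field_status = {"available": 39, "pending": 3, "not_available": 42}

total_fields = sum(field_status.values())
coverage = {status: count / total_fields for status, count in field_status.items()}

for status, share in coverage.items():
    print(f"{status}: {share:.1%}")
```

Under that assumption there are 84 referenced fields, of which half are not available and a little under half are available.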
14
Conclusion…other Valuation Approaches
- Market capitalization per user (Facebook): $40 – $300
- Revenues per record per user (Facebook, Experian): $4 – $7 per year
- Market prices of personal data:
  - $0.50 for street address
  - $2.00 for date of birth
  - $8 for social security number
  - $3 for driver’s license number
  - $35 for military record
SOURCE: OECD. Exploring the Economics of Personal Data: A Survey of Methodologies for Measuring Monetary Value. OECD Digital Economy Papers. Organisation for Economic Co-operation and Development Publishing, 2013.
16
In the Fitbit data scenario, if a patient’s exercise data and environmental quality data can be combined with asthma condition data, we get a better prediction of how asthma evolves.
[Figure: prediction error in the model when including the Fitbit data]
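The claim on this slide, that adding activity data lowers prediction error, can be illustrated with a toy simulation. The sketch below is not the authors' model. It generates synthetic data in which a symptom score depends on an air-quality index and on daily steps, fits ordinary least squares with and without the step data, and compares the in-sample mean squared error. All variable names, coefficients, and distributions are invented for illustration.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def dev_sum(xs, ys):
    """Sum of products of deviations from the respective means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))


def fit_one(x, y):
    """Least squares for y = b0 + b1 * x."""
    b1 = dev_sum(x, y) / dev_sum(x, x)
    return mean(y) - b1 * mean(x), b1


def fit_two(x1, x2, y):
    """Least squares for y = b0 + b1 * x1 + b2 * x2 (closed form)."""
    s11, s22, s12 = dev_sum(x1, x1), dev_sum(x2, x2), dev_sum(x1, x2)
    s1y, s2y = dev_sum(x1, y), dev_sum(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return mean(y) - b1 * mean(x1) - b2 * mean(x2), b1, b2


def mse(predicted, actual):
    return mean([(p - a) ** 2 for p, a in zip(predicted, actual)])


# Synthetic data only: none of these values come from the study.
rng = random.Random(0)
n = 200
air_quality = [rng.gauss(50, 10) for _ in range(n)]  # hypothetical index
daily_steps = [rng.gauss(8, 2) for _ in range(n)]    # thousands of steps
symptom = [0.05 * a - 0.4 * s + rng.gauss(0, 0.5)
           for a, s in zip(air_quality, daily_steps)]

# Model without activity data: air quality only.
b0, b1 = fit_one(air_quality, symptom)
mse_without = mse([b0 + b1 * a for a in air_quality], symptom)

# Model with activity data: air quality plus daily steps.
c0, c1, c2 = fit_two(air_quality, daily_steps, symptom)
mse_with = mse([c0 + c1 * a + c2 * s
                for a, s in zip(air_quality, daily_steps)], symptom)

print(f"MSE without activity data: {mse_without:.3f}")
print(f"MSE with activity data:    {mse_with:.3f}")
```

Because the second model contains the first as a special case, its in-sample error cannot be higher. The size of the gap here reflects the invented coefficients, so a held-out test set would be needed before drawing any conclusion from real data.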
17
Behavioral changes vs. asthma condition evolution.
18
Limitations
- Modified capital-sigma is only one-dimensional; not all metadata properties/concepts are equal.
- The cost/value relationship is not 1:1.
- “Metadata is only as good as your data” is not always true.
- Successive growth rates may be the way to go.
19
Concluding remarks
- Interest… traction
- Limitations: bad data, cost/value, more metadata
- We should care about cost
- Metadata capital can contextualize the discussion and provide a foundation
- Generic formula for further research
- Proof
20
The Team / acknowledgments
- Tom Caruso, Health Information Liaison Research Associate, UNC-SILS/RTI: Self-generated Health Information (SGHI)
- Rebecca Boyles, Data Scientist, NIEHS: Common Core Vocabulary
- Jane Greenberg, SILS/UNC, MRC
- Herbie Huang, Ph.D. student, Economics Dep., UNC
- Austin Mathews, BSIS student, SILS/UNC, MRC
- Angela Murillo, Ph.D. student, SILS/UNC, MRC
- Adrian Ogletree, MSIS student, SILS/UNC, MRC
- Erik Scott, Senior Software Dev., RENCI (Renaissance Computing Institute)