A TEDMED Data Reveal: Big and Little Dr. Brand Niemann Director and Senior Data Scientist Semantic Community AOL Government.

Presentation transcript:

A TEDMED Data Reveal: Big and Little
Dr. Brand Niemann
Director and Senior Data Scientist, Semantic Community
AOL Government Blogger
April 22,

Background
I did a story about TEDMED 2012 for the 2012 Health Datapalooza III and was invited to attend TEDMED 2013 as a journalist! Session 2, “How Can Big Data Become Real Wisdom?”, and Session 6, “Going Farther while Staying Closer”, were the most interesting and motivating to me. See next slide. I heard about Big and Little Data and saw an opportunity to help TEDMED with a taxonomy that is a semantic index to a knowledge base for improved search, and with examples of big and little data science.
And the best data source for my work was Professor Christopher Murray’s (IHME/GBD) presentation and demonstration on “What does a $100 million public health data revolution look like?”, funded by the Bill and Melinda Gates Foundation to prioritize global health research and help. It made me think of Monica Rogati’s point at Strata 2012: more data beats clever algorithms, but better data beats more data. Yes, and working on IHME/GBD (Global Burden of Disease) visualizations like:
But I want to volunteer to help TEDMED 2013 and 2014 as a data scientist/data journalist, and I saw on their Web site: “If you are a talented designer and/or illustrator with experience in bringing presentations to life, you could help with our speaker presentation materials.” I attended the First Great Challenges Day, participated in the Inventing Wellness Programs Breakout Session, and learned the importance of scientists storifying with “and, but, and therefore”.
Therefore my story is a TEDMED Data Reveal: Big (IHME/GBD) and Little (TEDMED Web Site) with “and, but, and therefore.”

My TEDMED 2013 Highlights
SESSION 2: How Can Big Data Become Real Wisdom?
– Jay Walker: Introduction. Need a macro-scope to gather, network, store, and access data and to go from data to wisdom by finding patterns in the data.
– Larry Smarr: Can you coordinate the dance of your body's 100 trillion microorganisms? How to quantify self movement with medical detail in real time by an astrophysicist turned computer scientist.
SESSION 6: Going Farther while Staying Closer
– Christopher Murray: What does a $100 million public health data revolution look like? Talk and live demo of Global Burden of Disease Treemap, Map, Time Plot, Age Plot, and Stacked Bar Chart by Age and Sex.

TEDMED My Note: I decided to make this a Searchable Knowledge Base.

TEDMED Knowledge Base Google Chrome: Find

TEDMED Speakers My Note: I decided to make this a little data set for faceted search.

TEDMED Speakers Spreadsheet My Note: The facets are Year, Keywords, and Tags.
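A minimal sketch of what faceted search over a small spreadsheet like this could look like, assuming each row has been loaded as a dictionary with `year`, `keywords`, and `tags` fields. The field names and the placeholder rows are hypothetical and do not reproduce the actual spreadsheet.

```python
from collections import Counter

# Hypothetical rows standing in for the speakers spreadsheet.
# Names, keywords, and tags are invented placeholders, not real TEDMED data.
SPEAKERS = [
    {"name": "Speaker A", "year": 2012, "keywords": ["big data"], "tags": ["technology"]},
    {"name": "Speaker B", "year": 2013, "keywords": ["big data", "global health"], "tags": ["data"]},
    {"name": "Speaker C", "year": 2013, "keywords": ["microbiome"], "tags": ["data", "self-tracking"]},
]


def _values(row, facet):
    """Return a facet's values as a list, whether stored as a scalar or a list."""
    value = row[facet]
    return list(value) if isinstance(value, (list, tuple)) else [value]


def facet_counts(rows, facet):
    """Count how many rows carry each value of one facet."""
    counts = Counter()
    for row in rows:
        counts.update(set(_values(row, facet)))
    return counts


def facet_filter(rows, **selected):
    """Keep only the rows that match every selected facet value."""
    return [
        row for row in rows
        if all(wanted in _values(row, facet) for facet, wanted in selected.items())
    ]
```

Calling `facet_counts(SPEAKERS, "year")` gives the counts shown next to each facet value, and `facet_filter(SPEAKERS, year=2013, tags="data")` narrows the list as facets are selected.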

Institutions hosting TEDMEDLive My Note: I decided to make this a little data set for mapping, but it was difficult to get the geo-referenced data set.

TEDMEDLive 2013 Institutions Spreadsheet My Note: Simple Geo-referencing of Institutions.

TEDMED 2013: Spotfire

TEDMED : Spotfire

Institute for Health Metrics and Evaluation (IHME) My Note: I heard this talk and decided to work with this big data. My Note: There are three Web sites

Press Release The Global Burden of Disease (GBD) is a first-of-its-kind study of health around the world. The GBD findings present a new way to look at health, allowing countries to track progress against diseases ranging from malaria to cancer to diabetes, identify risks including smoking and poor diet, see how people in 187 countries are faring in terms of health and gauge emerging health challenges. The GBD is a collaboration of nearly 500 researchers in 50 countries, and is led by IHME, part of the University of Washington. Some of the countries included in the GBD, such as the UK and Indonesia, already have started to produce their own policy recommendations as a result of the study. Australia and China are also planning to produce studies that use GBD to drill down and develop local-level health data. IHME is working with three localities in the US to produce GBD-type data at the community level as well. Efforts are underway to provide continuous updates to the GBD and expand the range of health issues included in the study. The GBD measures health issues around the world through more than 1 billion pieces of data that can also be explored through interactive visualization tools online.

GHDx Catalog of Demographic and Health Data by IHME My Note: Download Data

Global Burden of Disease Study 2010 Data Downloads My Note: I downloaded 17 files totaling 1.13 GB. Two Codebook files were damaged and I repaired them.

GBD Compare My Note: Treemap and Map.

GBD Cause Patterns My Note: Stacked Bar Chart.

GBD Cause Patterns: Reports

IHME-GBD Causes of Death: Spotfire

IHME-GBD Life Expectancy: Spotfire

IHME-GBD Mortality: Spotfire

IHME-GBD Risk Factors: Spotfire

IHME-GBD Breast and Cervical Cancer: Spotfire. Navigation and Metadata; Data Set; World Map; Bar Chart. My Note: Data Visualizations are Linked.

Data Ecosystem: Spotfire

IHME-GBD Life Expectancy by Country: Spotfire. Navigation and Metadata; Code Book; Filters; Details-on-Demand. My Note: The Visualizations Are Linked to One Another. Data Set. My Note: 19 files totaling 1.13 GB of data in a Spotfire file of only 0.5 GB! Life Expectancy by Region; Life Expectancy (LE) Versus Health Adjusted Life Expectancy (HALE).

Conclusions and Recommendations
My story is a TEDMED Data Reveal: Big (IHME/GBD) and Little (TEDMED) with “and, but, and therefore.”
– I have done as Jay Walker suggested: We need a macro-scope to gather, network, store, and access data and to go from data to wisdom by finding patterns in the data. But to do that, TEDMED needs a taxonomy that is a semantic index to a knowledge base for improved search, and help with examples of big and little data science.
– I found the best big data source for my work was Professor Christopher Murray’s IHME/GBD, funded by the Bill and Melinda Gates Foundation to prioritize global health research and help. But I found I could improve the access to, and simplify the visualizations of, the IHME/GBD data.
– Therefore, I did both of the above and volunteered to help TEDMED 2013 and 2014 as a data scientist/data journalist.

Data Visualizations

GBD Data Visualizations Spreadsheet My Note: See All 13 Tabs.

GBD Data Visualizations Inventory My Note: Downloaded 36 files totaling 19 MB and selected a few for visualizations.

Diabetes Prevalence by County (US) Maps My Note: I used this in my 2013 Health Datapalooza IV Submission and the Sanofi US 2013 Data Design Diabetes Innovation Challenge – Prove It!

Research Articles My Note: Research Article.

Research Articles My Note: Included this in the Knowledge Base.

Datasets My Note: Downloaded this dataset.

Diabetes prevalence rates by age, sex, and county, 2008 (21KB* xls) *My Note: Actual size is 556KB. My Note: Needed to be separated into county and state.
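One way such a separation could be done is sketched below. It assumes, hypothetically, that every row carries a FIPS code in a `fips` field, with state rows using the 2-digit state code and county rows the 5-digit county code; the actual layout of the downloaded file may differ, and the sample rows are invented.

```python
def split_state_and_county(rows):
    """Split rows into (state_rows, county_rows) by the length of their FIPS code."""
    states, counties = [], []
    for row in rows:
        code = str(row["fips"]).strip()
        # State FIPS codes have at most 2 digits; county codes have 5 digits,
        # or 4 when a spreadsheet has dropped the leading zero.
        if len(code) <= 2:
            states.append(row)
        else:
            counties.append(row)
    return states, counties


# Hypothetical mixed rows; names and prevalence values are made up for illustration.
ROWS = [
    {"fips": 1, "name": "State X", "prevalence": 10.0},
    {"fips": 1001, "name": "County X1", "prevalence": 11.5},
    {"fips": "01003", "name": "County X2", "prevalence": 9.8},
]
states, counties = split_state_and_county(ROWS)
```

Keeping the two levels in separate tables avoids state totals being ranked or mapped alongside individual counties.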

Metadata My Note: Another Excel file name, but same file.

IHME Diabetes County 2009: Spotfire. Navigation and Metadata; Data Set; Map; Top 10 Counties With High Prevalence of Diabetes; Higher Female Than Male Diabetes Prevalence.
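The two derived views named on this slide, the top counties by prevalence and the counties where female prevalence exceeds male prevalence, can be sketched as simple sort and filter steps. The field names `male`, `female`, and `total` and the sample values are assumptions for illustration only, not figures from the IHME file.

```python
def top_counties(rows, n=10):
    """Return the n counties with the highest overall prevalence, highest first."""
    return sorted(rows, key=lambda r: r["total"], reverse=True)[:n]


def higher_female_than_male(rows):
    """Return the counties where female prevalence exceeds male prevalence."""
    return [r for r in rows if r["female"] > r["male"]]


# Hypothetical county rows (percent prevalence); all values are invented.
COUNTIES = [
    {"county": "County A", "male": 9.0, "female": 10.5, "total": 9.8},
    {"county": "County B", "male": 12.0, "female": 11.0, "total": 11.5},
    {"county": "County C", "male": 7.5, "female": 8.0, "total": 7.8},
]
```

With real data, `top_counties(rows)` would feed the top-10 chart and `higher_female_than_male(rows)` the female-versus-male view.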

IHME-GBD Mortality by Country: Spotfire

IHME-GBD Disability Factors by Health State: Spotfire

IHME-GBD Risk Factors by Region: Spotfire

IHME-GBD Cause of Death by Region: Spotfire