X-Informatics Introduction: What is Big Data, Data Analytics and X-Informatics? January 7 2013 Geoffrey Fox


X-Informatics Introduction: What is Big Data, Data Analytics and X-Informatics? January 7, 2013. Geoffrey Fox, Associate Dean for Research and Graduate Studies, School of Informatics and Computing, Indiana University Bloomington

Some Trends
The Data Deluge is a clear trend from Commercial (Amazon, e-commerce), Community (Facebook, Search) and Scientific applications
Lightweight clients, from smartphones and tablets to sensors
Multicore reawakening parallel computing
Exascale initiatives will continue the drive to the high end with a simulation orientation
Clouds with cheaper, greener, easier to use IT for (some) applications
New jobs associated with new curricula
– Clouds as a distributed system (classic CS courses)
– Data Analytics (important theme in academia and industry)
– Social Media

Some Terms
Data: the raw bits and bytes produced by instruments, the web, social media
Information: the cleaned up data without deep processing applied to it
Knowledge/wisdom/decisions come from sophisticated analysis of Information
Data Analytics is the process of converting data to Information and Knowledge and then decisions or policy
Data Science describes the whole process
X-Informatics is the use of Data Science to produce wisdom in field X

The Course in One Sentence
Study Clouds running Data Analytics processing Big Data to solve problems in X-Informatics

X-Informatics already Defined
Biomedical, Medical, Bio, Chem(istry), Health, Pathology Informatics
Life Style Informatics (from IT for Facebook to IT for life or Health – that’s better Life Style Informatics)
Astro(nomy), Energy, Radar Informatics – Physics Informatics ought to exist but doesn’t
Social Informatics in our school
Business, Wealth, Financial, Marketing Informatics
Security (also in School), Crisis, Intelligence Informatics
Policy Informatics (many X-Informatics impact policies)

Jobs

Jobs v. Countries

McKinsey Institute on Big Data Jobs
There will be a shortage of talent necessary for organizations to take advantage of big data. By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.
This course is aimed at the 1.5 million jobs. Computer Science covers the 140,000 to 190,000.

Tom Davenport Harvard Business School Nov 2012

Data Deluge General Structure

Some Data sizes
~ Web pages at ~300 kilobytes each = 10 Petabytes
YouTube: 48 hours of video uploaded per minute; in 2 months in 2010, uploaded more than total NBC ABC CBS
– ~2.5 petabytes per year uploaded?
LHC 15 petabytes per year
Radiology 69 petabytes per year
Square Kilometer Array Telescope will be 100 terabits/second
Earth Observation becoming ~4 petabytes per year
Earthquake Science – few terabytes total today
PolarGrid – 100’s terabytes/year
Exascale simulation data dumps – terabytes/second
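The sizes above can be cross-checked with a little arithmetic. The sketch below assumes decimal units (1 petabyte = 10^15 bytes) and a 365-day year; the function names are illustrative and do not come from the slides. It derives the page count implied by the web figure, the yearly volume implied by the Square Kilometer Array rate, and the average bytes per uploaded hour implied by the YouTube estimate.

```python
# Decimal (SI) units are an assumption; the slides do not say which convention they use.
KB = 10**3
TB = 10**12
PB = 10**15
EB = 10**18
SECONDS_PER_YEAR = 365 * 24 * 3600
MINUTES_PER_YEAR = 365 * 24 * 60


def implied_web_pages(total_bytes=10 * PB, page_bytes=300 * KB):
    """Number of pages implied by a total size and an average page size."""
    return total_bytes / page_bytes


def ska_bytes_per_year(terabits_per_second=100):
    """Yearly volume if the quoted rate were sustained for a whole year."""
    bytes_per_second = terabits_per_second * TB / 8  # 8 bits per byte
    return bytes_per_second * SECONDS_PER_YEAR


def youtube_bytes_per_video_hour(hours_per_minute=48, bytes_per_year=2.5 * PB):
    """Average size of one uploaded hour of video implied by the two figures."""
    video_hours_per_year = hours_per_minute * MINUTES_PER_YEAR
    return bytes_per_year / video_hours_per_year


print(f"implied web pages: {implied_web_pages():.2e}")
print(f"SKA volume: {ska_bytes_per_year() / EB:.1f} exabytes/year")
print(f"YouTube: {youtube_bytes_per_video_hour() / 10**6:.0f} megabytes per video hour")
```

Under these assumptions, 10 petabytes at 300 kilobytes per page corresponds to roughly 3.3 x 10^10 pages, and a sustained 100 terabits/second comes to roughly 394 exabytes per year.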

Tom Davenport Harvard Business School Nov 2012

Anjul Bhambhri, VP of Big Data, IBM

Tom Davenport Harvard Business School Nov 2012

The data deluge: The Economist, Feb 2010
According to one estimate, mankind created 150 exabytes (billion gigabytes) of data in 2005. This year (2010), it will create 1,200 exabytes. Merely keeping up with this flood, and storing the bits that might be useful, is difficult enough. Analysing it, to spot patterns and extract useful information, is harder still. Even so, the data deluge is already starting to transform business, government, science and everyday life.
berkeley1.pdf Jeff Hammerbacher
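The two figures in the quotation imply a compound growth rate. The sketch below assumes the 150 exabyte figure refers to 2005 and the 1,200 exabyte figure to 2010, and that growth was steady in between; both the steady-growth model and the function names are illustrative assumptions.

```python
import math


def annual_growth_factor(start, end, years):
    """Constant yearly multiplier that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years)


def doubling_time_years(factor):
    """Years needed to double at a constant yearly growth factor."""
    return math.log(2) / math.log(factor)


# Assumed endpoints: 150 exabytes in 2005, 1,200 exabytes in 2010.
factor = annual_growth_factor(150, 1200, 2010 - 2005)
print(f"growth per year: x{factor:.2f}")
print(f"doubling time: {doubling_time_years(factor):.2f} years")
```

An eightfold rise over five years corresponds to three doublings, so the doubling time under this model is 5/3 of a year.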

Data Science Process

DIKW Process
Data becomes Information becomes Knowledge becomes Wisdom or Decisions
– Community acceptance of results or approach important here
– Volume of bits & bytes decreases as we proceed down DIKW pipeline
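As a toy illustration of how volume shrinks along the DIKW pipeline, the sketch below turns raw readings into information, knowledge and a decision. The readings, the validity range and the threshold are all invented for illustration and are not taken from the slides.

```python
import statistics

# Hypothetical raw data: strings as they might arrive from an instrument,
# including blanks, junk and a sentinel value.
RAW_READINGS = ["21.5", "21.7", "", "bad", "22.1", "-999", "21.9", "22.4"]


def to_information(raw, low=-50.0, high=60.0):
    """Clean the data: keep only values that parse and fall in a plausible range."""
    cleaned = []
    for item in raw:
        try:
            value = float(item)
        except ValueError:
            continue  # drop unparseable entries
        if low <= value <= high:
            cleaned.append(value)
    return cleaned


def to_knowledge(information):
    """Analyse the information: reduce it to a few summary statistics."""
    return {"mean": statistics.mean(information), "max": max(information)}


def to_decision(knowledge, threshold=22.0):
    """Decide: act only if the mean exceeds an (assumed) threshold."""
    return knowledge["mean"] > threshold


information = to_information(RAW_READINGS)
knowledge = to_knowledge(information)
decision = to_decision(knowledge)
print(len(RAW_READINGS), "raw items ->", len(information), "values ->",
      len(knowledge), "statistics -> 1 decision:", decision)
```

Each stage holds fewer items than the one before it, which is the point of the second sub-bullet: eight raw items become five values, two statistics and a single yes/no decision.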

Data Deluge is also Information/Knowledge/Wisdom/Decision Deluge?

Example of Google Maps/Navigation
Data comes from traditional maps (US Geological Survey), satellites (overlays) and street cams
Information is presented by the basic Google Maps web page
Knowledge is a particular optimized route
Decisions (wisdom) come from deciding to drive a particular route
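The "knowledge" step in this example, finding an optimized route, can be sketched with Dijkstra's shortest-path algorithm. This is not a description of how Google Maps is implemented; the road network, place names and travel times below are hypothetical.

```python
import heapq

# Hypothetical road network: travel time in minutes between places.
ROADS = {
    "Home": {"A": 4, "B": 2},
    "A": {"Office": 5},
    "B": {"A": 1, "Office": 8},
    "Office": {},
}


def shortest_route(graph, start, goal):
    """Dijkstra's algorithm: return (total_minutes, path) or None if unreachable."""
    queue = [(0, start, [start])]  # priority queue ordered by accumulated cost
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue  # a cheaper path to this node was already expanded
        visited.add(node)
        for neighbor, minutes in graph[node].items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + minutes, neighbor, path + [neighbor]))
    return None


print(shortest_route(ROADS, "Home", "Office"))
```

In DIKW terms, `ROADS` plays the role of the data and information, the returned route is the knowledge, and choosing to drive it is the decision.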

Data Deluge Internet

Oracle

Anjul Bhambhri, VP of Big Data, IBM

Data Deluge Business

Anjul Bhambhri, VP of Big Data, IBM


CDR = Call Data Record Anjul Bhambhri, VP of Big Data, IBM

Internet of Things Ruh VP Software GE


Ruh VP Software GE MM = Million