4 Education Initiatives: Data Science, Informatics, Computational Science and Intelligent Systems Engineering; What succeeds? National Academies Workshop.

Presentation transcript:

4 Education Initiatives: Data Science, Informatics, Computational Science and Intelligent Systems Engineering; What succeeds?
National Academies Workshop on COLLABORATIVE GRADUATE TRAINING INITIATIVES IN HIGH-PERFORMANCE COMPUTING FOR THE SOLID EARTH SCIENCES
http://dels.nas.edu/GLOBAL/BESR/COSG
April 11, 2016
Geoffrey Fox
gcf@indiana.edu
http://www.infomall.org
Department of Intelligent Systems Engineering
School of Informatics and Computing
Digital Science Center
Indiana University Bloomington
12/1/2018

Variants of Applied Computer Science
Computational Science or Scientific Computing
  Large international effort starting ~25 years ago
  I tried 3 times (Caltech, Syracuse, FSU) and I failed; some success by others including IUB (few students)
  Curriculum including HPC well understood
  Mainly graduate
  Are there enough jobs?
Informatics
  ~3 times Computer Science at IUB at undergraduate level
  Undergraduates get good jobs, at somewhat lower salary than Computer Science
  Information technology as in Health informatics (data not simulation)
Data Science has exploded in interest
  IUB introduced Masters in Data Science in January 2015
  Recognized name for students and employers, and plenty of jobs
  Curriculum varies in emphasis across universities but not difficult to design
HPC and Big Data in “Intelligent Systems” Engineering at IUB
  Modest 2-4 course HPC effort in Nanoengineering, Bioengineering, Computer Engineering; Big Data universal

School of Informatics and Computing (2014-2015)
Tenure-track Faculty: 82
Students
  Undergraduate: 1,472
  Master’s: 754
  Ph.D.: 301
Female Undergraduates: 22%
Female Graduate Students: 48%
Data Science Masters: 36, 105 and 290 enrolled for 13-14, 14-15 and 15-16

[Chart, SOIC@IUB: Masters CS + HCI + Sec + Bio versus Masters Data Science]

Some Lessons on HPC and Big Data
Equivalence HPC == Computational Science == Scientific Computing
Equivalence Big Data == Data Science
Big Data at least 10x larger in jobs and 100x larger in student interest than HPC
  Data science web pages more popular than computer science at IUB
  Data science risen from 0 to 42% of SOIC@IUB graduate applications in 2 years
Big Data and HPC both demand Computer Science – Application Domain collaboration
Industry leads data science and moves much faster than academia
President’s National Strategic Computing Initiative calls for Big Data – Exascale Convergence
  Includes Supercomputer Cloud hardware/software integration
  (I think) clear how to do this but (unwisely?) largely ignored in HPC plans
  HPC on a doomed unsustainable path?
“HPC-ABDS” High Performance Computing Enhanced Apache Big Data Stack offers sustainable software (via Apache), rich industry software model and performance of HPC
  e.g. Apache workflow better than HPC variants?
Natural to integrate data and computational science education (not common?)

Computational Science
Computational science has important similarities to data science but with a simulation rather than data analysis flavor.
Although a great deal of effort went into it, with meetings and several academic curricula/programs, it didn’t take off
  In my experience not a lot of students were interested, and
  the academic job opportunities were not great
Data science has more jobs; maybe it will do better?
Can we usefully link these concepts?
PS both use parallel computing!
In days gone by, I did research in particle physics phenomenology, which in retrospect was an early form of data science using models extensively

Data Science Definition from NIST Public Working Group
Data Science is the extraction of actionable knowledge directly from data through a process of discovery, hypothesis, and analytical hypothesis analysis.
A Data Scientist is a practitioner who has sufficient knowledge of the overlapping regimes of expertise in business needs, domain knowledge, analytical skills and programming expertise to manage the end-to-end scientific method process through each stage in the big data lifecycle.
  Replace by applied math and modelling for computational science?
Big Data refers to digital data volume, velocity and/or variety whose management requires scalability across coupled horizontal resources
See Big Data Definitions in http://bigdatawg.nist.gov/V1_output_docs.php
11/30/2015

McKinsey Institute on Big Data Jobs
http://www.mckinsey.com/mgi/publications/big_data/index.asp
There will be a shortage of talent necessary for organizations to take advantage of big data. By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.
IU Data Science Decision Maker Path aimed at the 1.5 million jobs. Technical Path covers the 140,000 to 190,000

Job Trends
Big Data much larger than data science (charts from Jan 6 2015)
19 May 2015 Jobs
  3475 for “data science”
  2277 for “data scientist”
  19488 for “big data”
7 Dec 2015 Jobs
  5014 for “data science”
  2830 for “data scientist”
  22388 for “big data”
http://www.indeed.com/jobtrends?q=%22Data+science%22%2C+%22data+scientist%22%2C+%22big+data%22%2C&l=
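The job counts on this slide support a quick arithmetic check of the "Big Data much larger than data science" claim. The following is a minimal sketch, not part of the original talk: the dictionary layout and the function names `ratio_to_big_data` and `growth_percent` are assumptions made for illustration, and only the six counts come from the slide.

```python
# Job counts transcribed from the slide (Indeed job trends queries).
# The data layout and helper names are illustrative assumptions.
job_counts = {
    "2015-05-19": {"data science": 3475, "data scientist": 2277, "big data": 19488},
    "2015-12-07": {"data science": 5014, "data scientist": 2830, "big data": 22388},
}


def ratio_to_big_data(counts, term):
    """How many 'big data' postings exist per posting for `term`."""
    return counts["big data"] / counts[term]


def growth_percent(earlier, later, term):
    """Percentage change in postings for `term` between two snapshots."""
    return 100.0 * (later[term] - earlier[term]) / earlier[term]


may = job_counts["2015-05-19"]
dec = job_counts["2015-12-07"]

for term in ("data science", "data scientist", "big data"):
    print(
        f"{term}: growth {growth_percent(may, dec, term):.1f}%, "
        f"big data / {term} = {ratio_to_big_data(dec, term):.1f} (Dec 2015)"
    )
```

On these two snapshots the "data science" query grows faster in relative terms than "big data", while "big data" remains several times larger in absolute postings.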

It’s Jobs!
2012-2016: HPC ~constant; Big Data grows factor 10
[Chart series: Big Data, HPC]
Deep Learning 0.01%
Bioengineering 0.01%
Internet of Things 0.03%
Informatics 0.12%
Computer Engineering 0.26%
Hadoop 0.28%
Simulation 0.29%
Information Technology 1.58%
Computer Science 2.4%
Engineer 4.14%
Design 8.71%

02/16/2016

Big Data and (Exascale) Simulation Convergence II

School of Informatics and Computing

Background of the School
The School of Informatics was established in 2000 as the first of its kind in the United States.
Computer Science was established in 1971 and became part of the school in 2005.
Library and Information Science was established in 1951 and became part of the school in 2013.
Now named the School of Informatics and Computing.
Data Science added January 2014
Engineering added Fall 2016

Undergraduate Informatics
Applied CS on the IT (data) side
961 Undergrad (2.7 times number in CS)
95 Masters
110 PhD

Undergraduate Computer Science
356 Undergraduate
311 Masters
161 PhD

SOIC Data Science Program
Cross Disciplinary Faculty: 31 in School of Informatics and Computing, a few in statistics and expanding across campus
Affordable online and traditional residential curricula or mix thereof
Masters, Certificate, PhD Minor in place; Full PhD being proposed
http://www.soic.indiana.edu/graduate/degrees/data-science/index.html
Note data science is mentioned in faculty advertisements but, unlike other parts of the School, there are no dedicated faculty
Data science is around 10% of the School, measured by fraction of enrolled students summed over graduate and undergraduate levels

3 Types of Data Science Students
Professionals wanting skills to improve job, or “required” by employer to keep up with technology advances
Traditional sources of IT Masters
Students in non-IT fields wanting to do “domain specific data science”

Basic Masters Course Requirements
One course from each of three technology areas
  I. Data analysis and statistics
  II. Data lifecycle (includes “handling of research data”)
  III. Data management and infrastructure
One course from (big data) application course cluster
Other courses chosen from list maintained by Data Science Program curriculum committee (or outside this with permission of advisor/Curriculum Committee)
Capstone project optional
All students assigned an advisor who approves course choice.
Due to variation in preparation, courses are labeled
  Decision Maker
  Technical
These correspond to the two categories in the McKinsey report; note Decision Maker had an order of magnitude more job openings expected