EDISON Data Science Framework: Building the Data Science Profession

Slides:



Advertisements
Similar presentations
Existing tools for cooperation – WG 2 1 Regional Policy Dialogue Capacity building seminars WORKING GROUP MEETINGS HIGH LEVEL SEMINAR SERIES 4 working.
Advertisements

Research Councils ICT Conference Welcome Malcolm Atkinson Director 17 th May 2004.
Co-funded by the European Union under FP7-ICT Co-ordinated by aparsen.eu #APARSEN Welcome to the Conference !! Juan Bicarregui Chair, APA Executive.
1.Data categorization 2.Information 3.Knowledge 4.Wisdom 5.Social understanding Which of the following requires a firm to expend resources to organize.
Master of Arts in Data Science
Computers & Employment By Andrew Attard and Stephen Calleja.
Integrating Digital Curation in a Digital Library curriculum: the International Master DILL case study Anna Maria Tammaro University of Parma Florence,
How Can Engineering Take Data Sciences from Ideas to Action
SCSC 311 Information Systems: hardware and software.
SEEK Welcome Malcolm Atkinson Director 12 th May 2004.
The Role of Academic Libraries in the Digital Data Universe Break-Out Session: New Partnership Models Bob Hanisch and Brian Schottlaender Co-Leaders ARL.
8/20/2013NIST Big Data WG / Roadmap Subgroup1 Architecture Storage Architecture Processing Architecture Resource Managers Architecture Infrastructure Architecture.
Big Data: Industry Needs Data Scientists Data Analysts Data Infrastructure Engineers Developers (all kinds) 2-3:30, August 10, 2015 Room 261 RSC.
Toward a common data and command representation for quantum chemistry Malcolm Atkinson Director 5 th April 2004.
Services to establish Data Science Profession: EDISON Data Science Framework and activities Yuri Demchenko, EDISON University of Amsterdam ELG2 Meeting.
EDISON Data Science Competence Framework (CF-DS) and Data Science Body of Knowledge (DS-BoK) Yuri Demchenko, EDISON University of Amsterdam IG-ETRD Meeting.
In Spain we have basically three profiles in computer science vocational training:  MIDDLE-GRADE DEGREES ◦ Microcomputer systems and networks technician.
Chang, Wen-Hsi Division Director National Archives Administration, 2011/3/18/16:15-17: TELDAP International Conference.
EGI and Data Scientists: Demand Sy Holsinger EGI.eu Senior Strategy and Policy Officer EGI Community Forum November 2015, Bari EDISON – Education.
Sierra Pacific Community College District 7300 College Avenue Sacramento, CA
EGI… …is a Federation of over 300 computing and data centres spread across 56 countries in Europe and worldwide …delivers advanced computing.
Computer Information Technology
Witold Staniszkis Empowering the Knowledge Worker End-User Software Engineering in Knowledge Management Witold Staniszkis
Joint Framework Concept
Using core competencies in curriculum design
Chapter 1 Computer Technology: Your Need to Know
Steve Brewer University of Southampton
CESSDA SaW Training on Trust, Identifying Demand & Networking
Jennie Larkin, PhD Senior Advisor
Microsoft Certification Paths
Steve Brewer EDISON, University of Southampton
Achievements in 2016 Data Integration Linked Open Metadata
MSc(IT) Program Overview
EOSC MODEL Pasquale Pagano CNR - ISTI
INTAROS WP5 Data integration and management
Integrated Management System and Certification
Research Data Alliance - Research Data Sharing without barriers Terena Networking Conference 22 May 2014.
Statistical education in times of Big Data
Betsy Wilson Environmental Update October 29, 2007
The IT Environment Section 3 ICA11v1.0
Summit 2017 Breakout Group 2: Data Management (DM)
The Enterprise Relevant Scope of DM
The National Initiative for Cybersecurity Education (NICE)  AFCEA International Cyber Education, Research, and Training Symposium January 17, 2018 Bill.
Oracle University Training and Certification Solutions
School of Information Management Nanjing University China
Assist. Prof. Magy Mohamed Kandil
European TRAINING FOUNDATION
EOSCpilot Skills Landscape & Framework
EOSC Governance Development Forum
OMIS 665, Big Data Analytics
 Deep Analytical Talent  Data Savvy Professionals  Technology and Data Enablers.
Introduction To software engineering
Ray Hentzschel Standardising International SE Certification (ISO/IEC 24773) on the INCOSE SE Competency Framework.
Microsoft Certification Paths
Common Solutions to Common Problems
LOSD Publication Deirdre Lee
Briefing to ARL Membership
Brian Matthews STFC EOSCpilot Brian Matthews STFC
Statistical education in times of Big Data
Bird of Feather Session
Data(trans)forming Roberto Barcellan European Commission NTTS2019
TECHNOLOGY, ENGINEERING AND DATA CONTINUING AND PROFESSIONAL EDUCATION
Data Governance & Management Skills and Experience
Implementing the “Vision” within the ESS
WP6 – EOSC integration J-F. Perrin (ILL) 15th Jan 2019
What will engineering design practice be like in 2040
EOSC-hub Contribution to the EOSC WGs
Organisation av Integrations- & Informationsarbete
OU BATTLECARD: Oracle Linux Training and Certification
Open Science Conference Ljubljani 22 May 2019
Presentation transcript:

EDISON Data Science Framework: Building the Data Science Profession EDISON Project Team http://edison-project.eu presented at SKG 2016 Conference Beijing, China, September 15-17, 2016 by Marian Bubak EDISON – Education for Data Intensive Science to Open New science frontiers Grant 675419 (INFRASUPP-4-2015: CSA)

EDISON Data Science Framework (EDSF): Creating the Foundation for Data Science Profession Foundation & Concepts Services Biz Model CF-DS DS-BoK MC-DS Taxonomy and Vocabulary EDISON Online Educational Environt Edu&Train Marketpltz and Directory Roadmap & Sustainability Community Portal (CP) Professional certification Data Science career & prof development DS Prof Family Data Science Framework EDISON Framework components CF-DS – Data Science Competence Framework DS-BoK – Data Science Body of Knowledge MC-DS – Data Science Model Curriculum DSP - Data Science Professions family and professional competence profiles EOEE - EDISON Online Education Environment Other components and services EOEE - EDISON Online Education Environment Education and Training Marketplace and Resources Directory Data Science professional certification and training Community Portal (CP) Champions Conf 2016 EDISON Data Science Framework

Data Scientist definition by NIST Definitions by NIST Big Data WG (NIST SP1500 - 2015) A Data Scientist is a practitioner who has sufficient knowledge in the overlapping regimes of expertise in business needs, domain knowledge, analytical skills, and programming and systems engineering expertise to manage the end-to-end scientific method process through each stage in the big data lifecycle. Data Lifecycle in Big Data and Data Science Data science is the empirical synthesis of actionable knowledge and technologies required to handle data from raw data through the complete data lifecycle process. [ref] Legacy: NIST BDWG definition of Data Science Champions Conf 2016 EDISON Data Science Framework

Identified Data Science Competence Groups [ref] Legacy: NIST BDWG definition of Data Science Commonly accepted Data Science competences/skills groups include Data Analytics or Business Analytics or Machine Learning Engineering or Programming Subject/Scientific Domain Knowledge EDISON identified 2 additional competence groups demanded by organisations Data Management, Curation, Preservation Scientific or Research Methods and/vs Business Processes/Operations Other skills commonly recognized aka “soft skills” or “social intelligence” Inter-personal skills or team work, cooperativeness All groups need to be represented in Data Science curriculum and training programmes Challenging task for Data Science education and training: multi-skilled vs team based Another aspect of integrating Data Scientist into organisation structure General Data Science (or Big Data) literacy for all involved roles and management Common agreed and understandable way of communication and information/data presentation Role of Data Scientist: Provide such literacy advice and guiding to organisation Champions Conf 2016 EDISON Data Science Framework

Data Science Competence Groups - Research Data Science Competence includes 5 areas/groups Data Analytics Data Science Engineering Domain Expertise Data Management Scientific Methods (or Business Process Management) Scientific Methods Design Experiment Collect Data Analyse Data Identify Patterns Hypothesise Explanation Test Hypothesis Business Operations Operations Strategy Plan Design & Deploy Monitor & Control Improve & Re-design Champions Conf 2016 EDISON Data Science Framework

Data Science Competences Groups – Business Data Science Competence includes 5 areas/groups Data Analytics Data Science Engineering Domain Expertise Data Management Scientific Methods (or Business Process Management) Scientific Methods Design Experiment Collect Data Analyse Data Identify Patterns Hypothesise Explanation Test Hypothesis Business Process Operations/Stages Design Model/Plan Deploy & Execute Monitor & Control Optimise & Re-design Champions Conf 2016 EDISON Data Science Framework

Identified Data Science Skills/Experience Groups Group 1: Skills/experience related to competences Data Analytics and Machine Learning Data Management/Curation (including both general data management and scientific data management) Data Science Engineering (hardware and software) skills Scientific/Research Methods or Business Process Management Application/subject domain related (research or business) Mathematics and Statistics Group 2: Big Data (Data Science) tools and platforms Big Data Analytics platforms Mathematics & Statistics applications & tools Databases (SQL and NoSQL) Data Management and Curation platform Data and applications visualisation Cloud based platforms and tools Group 3: Programming and programming languages and IDE General and specialized development platforms for data analysis and statistics Group 4: Soft skills or Social Intelligence Personal, inter-personal communication, team work, professional network Champions Conf 2016 EDISON Data Science Framework

Data Science Professions Family Managers: Chief Data Officer (CDO), Data Science (group/dept) manager, Data Science infrastructure manager, Research Infrastructure manager Professionals: Data Scientist, Data Science Researcher, Data Science Architect, Data Science (applications) programmer/engineer, Data Analyst, Business Analyst, etc. Professional (database): Large scale (cloud) database designers and administrators, scientific database designers and administrators Professional and clerical (data handling/management): Data Stewards, Digital Data Curator, Digital Librarians, Data Archivists Technicians and associate professionals: Big Data facilities operators, scientific database/infrastructure operators Icons used: Credit to [ref] https://www.datacamp.com/community/tutorials/data-science-industry-infographic Champions Conf 2016 EDISON Data Science Framework

Data Science Body of Knowledge (DS-BoK) DS-BoK Knowledge Area Groups (KAG) KAG1-DSA: Data Analytics group including Machine Learning, statistical methods, and Business Analytics KAG2-DSE: Data Science Engineering group including Software and infrastructure engineering KAG3-DSDM: Data Management group including data curation, preservation and data infrastructure KAG4-DSRM: Scientific/Research Methods group KAG5-DSBP: Business process management group Data Science domain knowledge to be defined by related expert groups Champions Conf 2016 EDISON Data Science Framework

EDISON Data Science Framework KAG3-DSDM: Data Management group: data curation, preservation and data infrastructure DM-BoK version 2 “Guide for performing data management” – 11 Knowledge Areas (1) Data Governance (2) Data Architecture (3) Data Modelling and Design (4) Data Storage and Operations (5) Data Security (6) Data Integration and Interoperability (7) Documents and Content (8) Reference and Master Data (9) Data Warehousing and Business Intelligence (10) Metadata (11) Data Quality Other Knowledge Areas motivated by RDA, European Open Data initiatives, European Open Data Cloud (12) PID, metadata, data registries (13) Data Management Plan (14) Open Science, Open Data, Open Access, ORCID (15) Responsible data use Highlighted in red: Considered Research Data Management literacy (minimum required knowledge) Champions Conf 2016 EDISON Data Science Framework

EDISON Data Science Framework Discussion Questions Observations Suggestions Survey Data Science Competences [1]: Invitation to participate https://www.surveymonkey.com/r/EDISON_project_-_Defining_Data_science_profession Community discussion documents: Request for comments Data Science Competence Framework http://edison-project.eu/data-science-competence-framework-cf-ds Data Science Body of Knowledge http://edison-project.eu/data-science-body-knowledge-ds-bok Data Science Model Curriculum http://edison-project.eu/data-science-model-curriculum-mc-ds Data Science Professional Profiles http://edison-project.eu/data-science-professional-profiles Champions Conf 2016 EDISON Data Science Framework