Discussion: Cloud Computing for an AI First Future

Discussion: Cloud Computing for an AI First Future
13th Cloud Control Workshop, June 13-15, 2018, Skåvsjöholm in the Stockholm Archipelago
Geoffrey Fox, June 14, 2018
Department of Intelligent Systems Engineering
gcf@indiana.edu, http://www.dsc.soic.indiana.edu/, http://spidal.org/
Work with Judy Qiu, Supun Kamburugamuva, Shantenu Jha, Kannan Govindarajan, Pulasthi Wickramasinghe, Gurhan Gunduz, Ahmet Uyar
1/3/2019

Cloud Computing for an AI First Future
- Artificial Intelligence is a dominant disruptive technology affecting all our activities, including business, education, research, and society. Further, several companies have proposed "AI First" strategies.
- The AI disruption is typically associated with big data coming from the edge, repositories, or sophisticated scientific instruments such as telescopes, light sources, and gene sequencers.
- AI First requires mammoth computing resources: clouds, supercomputers, hyperscale systems, and their distributed integration.
- AI First clouds are related to High Performance Computing (HPC) -- Cloud or Big Data integration/convergence.
- This panel could examine the driving applications and their implications for hardware and software infrastructure, including the types of hardware suitable for AI First clouds.

Summit Supercomputer (Oak Ridge)
- Architecture: 4,608 compute servers, each containing two 22-core IBM Power9 processors and six NVIDIA Tesla V100 graphics processing unit accelerators
- Power: 15 MW; Storage: 250 PB; Interconnect: InfiniBand, with NVLink within nodes
- Note the AI First theme in the "smartest" adjective applied to Summit
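As a rough sanity check on the scale quoted above, the GPU partition's peak can be estimated from the node counts on the slide. The per-V100 figure of roughly 7.8 TFLOPS double precision is an assumption, not stated on the slide:

```python
# Back-of-envelope peak-FLOPS estimate for Summit's GPU partition.
# The per-V100 value (~7.8 TFLOPS FP64) is an assumed figure.
nodes = 4608
gpus_per_node = 6
v100_fp64_tflops = 7.8  # assumed peak double precision per Tesla V100

peak_pflops = nodes * gpus_per_node * v100_fp64_tflops / 1000
print(f"GPU peak ≈ {peak_pflops:.0f} PFLOPS")  # on the order of 200 PFLOPS
```

This lands in the right ballpark for Summit's widely reported ~200 PFLOPS class performance, which is what made the "smartest supercomputer" billing plausible.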

Big Data and Simulation Difficulty in Parallelism
[Figure: applications placed on two axes -- horizontal: size of disk I/O; vertical: size of synchronization constraints, running from Loosely Coupled to Tightly Coupled.]
- Platforms, from loosely to tightly coupled: Commodity Clouds; HPC Clouds (accelerators, high performance interconnect); HPC Clouds/Supercomputers (memory access also critical); Exascale Supercomputers.
- Loosely coupled problems: Pleasingly Parallel (often independent events); MapReduce as in scalable databases (the current major Big Data category); parameter sweep simulations.
- Tightly coupled problems: Global Machine Learning, e.g. parallel clustering, deep learning, LDA; Graph Analytics, e.g. subgraph mining; Unstructured Adaptive Sparse (linear algebra at core, often not sparse); Structured Adaptive Sparse; largest scale simulations.
- These are just two problem characteristics; there is also the data/compute distribution seen in grid/edge computing.
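The two ends of the synchronization axis above can be sketched in a few lines: a pleasingly parallel job communicates only at the final gather, while a "global machine learning" style computation synchronizes every iteration. The partitioning and the toy scalar model here are illustrative, not from the slide:

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

# Eight data partitions, as they might be distributed across workers.
data = [[float(i + j) for j in range(4)] for i in range(8)]

# Pleasingly parallel: each partition is processed independently;
# the only communication is the final gather of partial results.
with ThreadPoolExecutor() as pool:
    partial_sums = list(pool.map(sum, data))
total = sum(partial_sums)

# Global machine learning: iteratively fit a shared scalar "model"
# (it converges to the mean of the partition means); every iteration
# ends with an allreduce-style synchronization across partitions.
model, lr = 0.0, 0.5
for _ in range(20):
    grads = [model - mean(part) for part in data]  # local gradient per partition
    model -= lr * mean(grads)                      # global synchronization step
```

The second loop is why the tightly coupled category demands high performance communication: the synchronization cost is paid once per iteration, not once per job.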

Discussion at Workshop I
The discussion was animated, with however agreement that:
- The current big data focus is to the left of the previous slide, with for example Flink and Spark having workflow and SQL special thrusts.
- Over the next five years, we can expect a greater focus of the big data community on the right side, with deep learning, LDA, and graph analytics having features quite similar to classic HPC problems that require high performance communication.
- There was also agreement that an "AI First" cloud was a plausible vision.
- However, there was significant disagreement as to how to realize an "AI First" cloud and whether HPC was relevant. It was felt that clouds would always choose a "sweet spot" too far from HPC choices for the HPC community to be able to use them.

Discussion at Workshop II
- The slow rate of change in the HPC community, with many legacy applications using technologies like Fortran, was noted. Would Cray survive in a "converged world"?
- The storage difference between HDFS style in clouds and Lustre in HPC was noted. However, S3 (AWS) is nearer the Lustre architecture (with storage separate from compute) than HDFS.
- The interconnect difference between Ethernet (clouds) and InfiniBand (HPC) was noted. The HPC community could surely adapt to the chosen public cloud communication technology, as this does not affect the programming model -- just details of implementation.
- If one wants to run graph analytics on public clouds, one will need excellent communication, as the irregular structure and low compute/communication ratio of big data graphs probably make them harder to implement efficiently than big simulation problems.
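The point about the low compute/communication ratio of graphs can be made concrete with a back-of-envelope arithmetic-intensity comparison. All per-edge and per-point figures below are illustrative assumptions, not measurements:

```python
# Arithmetic intensity (flops per byte communicated) -- illustrative only.

# Graph traversal: a few operations per edge, but each edge may fetch
# a remote 8-byte value holding the neighbor's state.
graph_flops_per_edge = 2       # assumed: compare + accumulate
graph_bytes_per_edge = 8       # assumed: one remote double per neighbor
graph_intensity = graph_flops_per_edge / graph_bytes_per_edge

# Dense simulation (e.g. a 3D stencil): interior points reuse local data,
# so only the surface of each subdomain crosses the network, and the
# halo traffic amortizes over the subdomain volume.
stencil_flops_per_point = 8    # assumed: 7-point stencil update
halo_bytes_per_point = 0.1     # assumed: surface-to-volume amortization
stencil_intensity = stencil_flops_per_point / halo_bytes_per_point

print(f"graph ≈ {graph_intensity} flop/byte, stencil ≈ {stencil_intensity} flop/byte")
```

Even with these rough numbers the gap is orders of magnitude, which is why graph analytics stresses the interconnect far more than a big simulation of similar aggregate size.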

Discussion at Workshop III
- Another point of view was that the movement to the right of slide 4 offers a natural chance for convergence of HPC and public cloud hardware and software.
- The expected importance of accelerators in both big data and HPC was stressed. Maybe those accelerators will differ (TPU vs. GPU), but the same accelerators could drive both HPC and public clouds.
- Probably one would always use specialized machines (supercomputers) for the really large jobs, with say 100,000 cores or more, as it is not easy to get co-location at that level on public clouds.
- Note the NASA evaluation of public clouds for workloads on NASA supercomputers: https://www.nas.nasa.gov/assets/pdf/papers/NAS_Technical_Report_NAS-2018-01.pdf
- Note the BDEC meeting on the following slide. Ask Geoffrey Fox for more information.

Big Data and Extreme-scale Computing (BDEC) http://www.exascale.org
- BDEC "Pathways to Convergence" report: http://www.exascale.org/bdec/sites/www.exascale.org.bdec/files/whitepapers/bdec2017pathways.pdf
- Next meeting: October 3-5, 2018, Bloomington, Indiana, USA. October 3 is an evening reception; the meeting focus is "Defining application requirements for a data intensive computing continuum".
- Later meetings: February 19-21, Kobe, Japan (national infrastructure visions); Q2 2019, Europe (exploring alternative platform architectures); Q4 2019, USA (vendor/provider perspectives); Q2 2020, Europe (focus TBD); Q3-4 2020, final meeting, Asia (write report).