Discussion: Cloud Computing for an AI First Future

Discussion: Cloud Computing for an AI First Future
13th Cloud Control Workshop, June 13-15, 2018, Skåvsjöholm in the Stockholm Archipelago
Geoffrey Fox, June 14, 2018
Department of Intelligent Systems Engineering
gcf@indiana.edu, http://www.dsc.soic.indiana.edu/, http://spidal.org/
Work with Judy Qiu, Supun Kamburugamuva, Shantenu Jha, Kannan Govindarajan, Pulasthi Wickramasinghe, Gurhan Gunduz, Ahmet Uyar
1/3/2019

Cloud Computing for an AI First Future
- Artificial Intelligence is a dominant disruptive technology affecting all our activities, including business, education, research, and society. Further, several companies have proposed "AI First" strategies.
- The AI disruption is typically associated with big data coming from the edge, repositories, or sophisticated scientific instruments such as telescopes, light sources, and gene sequencers.
- AI First requires mammoth computing resources: clouds, supercomputers, hyperscale systems, and their distributed integration.
- AI First clouds are related to High Performance Computing (HPC) -- Cloud or Big Data integration/convergence.
- This panel could examine the driving applications and their implications for hardware and software infrastructure, including the types of hardware suitable for AI First clouds.

Summit Supercomputer (Oak Ridge)
- Architecture: 4,608 compute servers, each containing two 22-core IBM Power9 processors and six NVIDIA Tesla V100 graphics processing unit accelerators
- Power: 15 MW; Storage: 250 PB; Interconnect: InfiniBand, with NVLink within nodes
- Note the AI First theme in the "smartest" adjective applied to Summit
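As a rough sanity check on the scale quoted above, the GPU partition's peak can be estimated from the node counts on the slide. The per-V100 figure of roughly 7.8 TFLOPS double precision is an assumption, not stated on the slide:

```python
# Back-of-envelope peak-FLOPS estimate for Summit's GPU partition.
# The per-V100 value (~7.8 TFLOPS FP64) is an assumed figure.
nodes = 4608
gpus_per_node = 6
v100_fp64_tflops = 7.8  # assumed peak double precision per Tesla V100

peak_pflops = nodes * gpus_per_node * v100_fp64_tflops / 1000
print(f"GPU peak ≈ {peak_pflops:.0f} PFLOPS")  # on the order of 200 PFLOPS
```

This lands in the right ballpark for Summit's widely reported ~200 PFLOPS class performance, which is what made the "smartest supercomputer" billing plausible.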

Big Data and Simulation Difficulty in Parallelism
[Figure: applications placed on two axes -- horizontal: size of disk I/O; vertical: size of synchronization constraints, running from Loosely Coupled to Tightly Coupled.]
- Platforms, from loosely to tightly coupled: Commodity Clouds; HPC Clouds (accelerators, high performance interconnect); HPC Clouds/Supercomputers (memory access also critical); Exascale Supercomputers.
- Loosely coupled problems: Pleasingly Parallel (often independent events); MapReduce as in scalable databases (the current major Big Data category); parameter sweep simulations.
- Tightly coupled problems: Global Machine Learning, e.g. parallel clustering, deep learning, LDA; Graph Analytics, e.g. subgraph mining; Unstructured Adaptive Sparse (linear algebra at core, often not sparse); Structured Adaptive Sparse; largest scale simulations.
- These are just two problem characteristics; there is also the data/compute distribution seen in grid/edge computing.
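The two ends of the synchronization axis above can be sketched in a few lines: a pleasingly parallel job communicates only at the final gather, while a "global machine learning" style computation synchronizes every iteration. The partitioning and the toy scalar model here are illustrative, not from the slide:

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

# Eight data partitions, as they might be distributed across workers.
data = [[float(i + j) for j in range(4)] for i in range(8)]

# Pleasingly parallel: each partition is processed independently;
# the only communication is the final gather of partial results.
with ThreadPoolExecutor() as pool:
    partial_sums = list(pool.map(sum, data))
total = sum(partial_sums)

# Global machine learning: iteratively fit a shared scalar "model"
# (it converges to the mean of the partition means); every iteration
# ends with an allreduce-style synchronization across partitions.
model, lr = 0.0, 0.5
for _ in range(20):
    grads = [model - mean(part) for part in data]  # local gradient per partition
    model -= lr * mean(grads)                      # global synchronization step
```

The second loop is why the tightly coupled category demands high performance communication: the synchronization cost is paid once per iteration, not once per job.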

Discussion at Workshop I
The discussion was animated, with however agreement that:
- The current big data focus is to the left of the previous slide, with for example Flink and Spark having workflow and SQL special thrusts.
- Over the next five years, we can expect a greater focus of the big data community on the right side, with deep learning, LDA, and graph analytics having features quite similar to classic HPC problems that require high performance communication.
- There was also agreement that an "AI First" cloud was a plausible vision.
- However, there was significant disagreement as to how to realize an "AI First" cloud and whether HPC was relevant. It was felt that clouds would always choose a "sweet spot" too far from HPC choices for the HPC community to be able to use them.

Discussion at Workshop II
- The slow rate of change in the HPC community, with many legacy applications using technologies like Fortran, was noted. Would Cray survive in a "converged world"?
- The storage difference between HDFS style in clouds and Lustre in HPC was noted. However, S3 (AWS) is nearer the Lustre architecture (with storage separate from compute) than HDFS.
- The interconnect difference between Ethernet (clouds) and InfiniBand (HPC) was noted. The HPC community could surely adapt to the chosen public cloud communication technology, as this does not affect the programming model -- just details of implementation.
- If one wants to run graph analytics on public clouds, one will need excellent communication, as the irregular structure and low compute/communication ratio of big data graphs probably make them harder to implement efficiently than big simulation problems.
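The point about the low compute/communication ratio of graphs can be made concrete with a back-of-envelope arithmetic-intensity comparison. All per-edge and per-point figures below are illustrative assumptions, not measurements:

```python
# Arithmetic intensity (flops per byte communicated) -- illustrative only.

# Graph traversal: a few operations per edge, but each edge may fetch
# a remote 8-byte value holding the neighbor's state.
graph_flops_per_edge = 2       # assumed: compare + accumulate
graph_bytes_per_edge = 8       # assumed: one remote double per neighbor
graph_intensity = graph_flops_per_edge / graph_bytes_per_edge

# Dense simulation (e.g. a 3D stencil): interior points reuse local data,
# so only the surface of each subdomain crosses the network, and the
# halo traffic amortizes over the subdomain volume.
stencil_flops_per_point = 8    # assumed: 7-point stencil update
halo_bytes_per_point = 0.1     # assumed: surface-to-volume amortization
stencil_intensity = stencil_flops_per_point / halo_bytes_per_point

print(f"graph ≈ {graph_intensity} flop/byte, stencil ≈ {stencil_intensity} flop/byte")
```

Even with these rough numbers the gap is orders of magnitude, which is why graph analytics stresses the interconnect far more than a big simulation of similar aggregate size.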

Discussion at Workshop III
- Another point of view was that the movement to the right of slide 4 offers a natural chance for convergence of HPC and public cloud hardware and software.
- The expected importance of accelerators in both big data and HPC was stressed. Maybe those accelerators will differ (TPU vs. GPU), but the same accelerators could drive both HPC and public clouds.
- Probably one would always use specialized machines (supercomputers) for the really large jobs, with say 100,000 cores or more, as it is not easy to get co-location at that level on public clouds.
- Note the NASA evaluation of public clouds for workloads on NASA supercomputers: https://www.nas.nasa.gov/assets/pdf/papers/NAS_Technical_Report_NAS-2018-01.pdf
- Note the BDEC meeting on the following slide. Ask Geoffrey Fox for more information.

Big Data and Extreme-scale Computing (BDEC) http://www.exascale.org
- BDEC "Pathways to Convergence" report: http://www.exascale.org/bdec/sites/www.exascale.org.bdec/files/whitepapers/bdec2017pathways.pdf
- Next meeting: October 3-5, 2018, Bloomington, Indiana, USA. October 3 is an evening reception; the meeting focus is "Defining application requirements for a data intensive computing continuum".
- Later meetings: February 19-21, Kobe, Japan (national infrastructure visions); Q2 2019, Europe (exploring alternative platform architectures); Q4 2019, USA (vendor/provider perspectives); Q2 2020, Europe (focus TBD); Q3-4 2020, final meeting, Asia (write report).