FUTURE ICT CHALLENGES IN SCIENTIFIC COMPUTING
White Paper 2017
Maria Girone, CERN, on behalf of all the contributors to the white paper
CERN openlab Open Day, September 2017
Maria Girone, CERN openlab CTO
A public-private partnership between the research community and industry
CERN openlab V: Collaboration Phase V
THE LHC UPGRADE PROGRAMME
Identify challenge areas and R&D topics for the next phase of CERN openlab
Run 3: LHCb and ALICE upgrades
Run 4: ATLAS and CMS upgrades
Run 3 and Run 4 will present important ICT challenges
  Multi-exabyte-per-year data flows; CPU needs equivalent to 20 million of today’s cores
  About a factor of 10 above what technology evolution will bring
  Constrained budgets
CPU: x60 from 2016
Data:
  Raw: 50 PB (2016), 600 PB (2027)
  Derived (1 copy): 80 PB (2016), 900 PB (2027)
Source: ECFA 2016
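The data volumes quoted above imply roughly an order-of-magnitude growth between 2016 and 2027. The short sketch below works through that arithmetic. The function names are illustrative, and the per-year rate assumes smooth compound growth between the two quoted years, which is a simplification (real growth follows the run schedule).

```python
# Figures quoted on the slide, in petabytes (PB).
RAW_PB = {2016: 50, 2027: 600}
DERIVED_PB = {2016: 80, 2027: 900}


def growth_factor(volumes, start=2016, end=2027):
    """Ratio of the end-year volume to the start-year volume."""
    return volumes[end] / volumes[start]


def annual_growth_rate(volumes, start=2016, end=2027):
    """Compound annual growth rate implied by the two data points.

    Assumption: growth is treated as smooth and exponential, which
    the slide does not claim.
    """
    years = end - start
    return growth_factor(volumes, start, end) ** (1 / years) - 1


if __name__ == "__main__":
    for label, volumes in (("raw", RAW_PB), ("derived", DERIVED_PB)):
        print(f"{label}: x{growth_factor(volumes):.2f} overall, "
              f"{annual_growth_rate(volumes):.1%} per year")
```

Raw data grows by a factor of 12 and derived data by a factor of 11.25, both far smaller than the x60 quoted for CPU.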
EVENTS AT THE HL-LHC
Increased complexity due to much higher pile-up and higher trigger rates will bring several challenges to reconstruction algorithms
[Figure: event displays at average pile-up 20 and average pile-up 200]
FACING THE HL-LHC COMPUTING CHALLENGES
HL-LHC will pose many challenges to software and computing. Collaborating with industry is key
High trigger rates and complex events present challenges for experimental algorithms
  New industry tools like image recognition or machine learning for classification may have a big impact
The infrastructure needs to evolve to handle the much higher data rates
  New architectures, co-processors, FPGAs and GPUs are all candidates
Software performance will be key
  Modern coding, parallelization and vectorization
Courtesy of I. Bird
WHITE PAPER: PROCESS AND TOPICS
A series of workshops and discussions was held to capture the challenges faced by the LHC and to set out collaborative R&D projects
A solid starting point for a very challenging and constructive Phase VI
R&D topics
  DATA-CENTRE TECHNOLOGIES AND INFRASTRUCTURES: https://indico.cern.ch/event/604621/
  COMPUTING PERFORMANCE AND SOFTWARE: https://indico.cern.ch/event/622198/
  MACHINE LEARNING AND DATA ANALYTICS: https://indico.cern.ch/event/627852/
  APPLICATIONS IN OTHER DISCIPLINES
Set out to be a community process, from brainstorming to white paper and plans
We invite industry to pick up some of these challenges and work with us
DATA-CENTRE TECHNOLOGIES AND INFRASTRUCTURES R&D
DATA-CENTRE TECHNOLOGIES AND INFRASTRUCTURES R&D
CERN is evaluating different models for increasing computing and data storage capacity to rapidly adapt to diverse workflow types
  Alternative architectures and specialized capabilities
  Heterogeneity and elasticity are key concepts
Designing and operating distributed data infrastructures and computing centres poses several challenges, which become greater as scale increases
  Networking
  DC Architectures
  Storage
  Databases
  Clouds
Should CERN decide on a DC: ambitious goals, such as PUE < 1.1 and a facility eventually reaching 15 MW
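Power Usage Effectiveness (PUE) is the ratio of total facility power to the power delivered to IT equipment, so a PUE target translates directly into an overhead budget for cooling and power distribution. The sketch below illustrates this. It assumes the 15 MW figure refers to total facility power, which the slide does not specify, and the function names are illustrative.

```python
def it_power_mw(total_facility_mw, pue):
    """IT equipment power for a given total facility power and PUE.

    PUE = total facility power / IT equipment power, so it cannot be < 1.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return total_facility_mw / pue


def overhead_mw(total_facility_mw, pue):
    """Power spent on everything except IT equipment (cooling, distribution)."""
    return total_facility_mw - it_power_mw(total_facility_mw, pue)


if __name__ == "__main__":
    total = 15.0  # MW, assumed here to be total facility power
    pue = 1.1     # upper bound quoted on the slide
    print(f"IT power: {it_power_mw(total, pue):.2f} MW")
    print(f"Overhead: {overhead_mw(total, pue):.2f} MW")
```

Under that assumption, a PUE of 1.1 leaves about 13.6 MW for IT equipment and about 1.4 MW for all overheads.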
CHALLENGE AREAS (1)
Data Centre Technologies and Infrastructures R&D
Networking
  High-bandwidth links from detectors to the Data Centre would open opportunities for a consolidated CERN Data Centre with the experiments' online systems
  Affordable high-performance, low-latency interconnects blurring the boundaries between HTC and HPC
  Software-defined networking and separation of the data plane from the management plane
    Improvements in data management and data access
    Automation of network configuration
Data-Centre Architectures
  Investigating rack disaggregation and software-defined infrastructures enabling rapid allocation of the optimal storage and computing resources, including memory and bandwidth
  Investigating data access with hierarchical storage buffers, including volatile/non-volatile memory and high-bandwidth solid-state storage drives
CHALLENGE AREAS (2)
Data Centre Technologies and Infrastructures R&D
Data Storage
  Investigation of models for expansion of storage capacity
  Different media with multiple speeds and quality of service
  ‘Cold storage’ evolution for long-term archival
  Evaluating low-cost flash memory, coupled with increased storage bandwidth and connectivity
Database Technologies
  Traditional database applications will continue to play a key role in the HL-LHC
  High ingest rates of data, time-series workloads, high write and read throughputs
  Scale-out databases and cloud resources
  Evaluating new approaches, such as PaaS cloud-based architectures for services like databases and data analytics
  Evaluating alternatives to the traditional hierarchical mass-storage systems used so far by the LHC experiments (spinning disks and tape systems as active archival)
CHALLENGE AREAS (3)
Data Centre Technologies and Infrastructures R&D
Cloud infrastructures
  Orchestration and automation of compute provisioning
  Investigating how rack disaggregation and software-defined infrastructure could be used to provide optimized configurations
Scalable clouds and global scientific clouds
  Improving efficiency, better capacity planning, optimising the infrastructure for different usage patterns
  Economy of scale via coordinated cloud initiatives across different sciences
Computing Performance and Software R&D
COMPUTING PERFORMANCE AND SOFTWARE R&D
Modernising code plays, and will continue to play, a vital role in the upgrades of the experiments and the LHC
  Increase software performance by adopting modern coding techniques and tools
  Fully exploit the features offered by modern hardware architectures
  Many-core GPU platforms, acceleration co-processors and innovative hybrid combinations of CPUs and FPGAs
CHALLENGE AREAS
Computing Performance and Software R&D
Dedicated hardware and co-processing systems
  Evaluating the adoption of dedicated, specialized high-performance co-processors (GPUs, FPGAs and many-core co-processors) to increase performance for specific sections of applications
Code Modernisation
  Growing core count per CPU and the desire to take advantage of HPC centres are driving a push towards more parallel code
  Major progress on both simulation (e.g. GeantV) and reconstruction algorithms
  Evaluate the impact of low-latency and high-bandwidth NVRAM technology
Heterogeneous Platforms and Alternative Architectures
  Investigations to better utilize resources (power, virtualized resources, …)
  Optimizing code distribution using lightweight containers
Machine Learning and Data Analytics R&D
Courtesy of I. Fisk, Simons Foundation, Global Brain Project
MACHINE LEARNING AND DATA ANALYTICS R&D
ML is important in HEP for analysis, and expertise is well established
Industry has demonstrated the utility of machine learning techniques for monitoring, automation, anomaly detection in complex systems, visualization, …
Investigating how to benefit from industry-adopted machine learning and “Big Data” analytics techniques in many aspects of
  Data Acquisition
  Data Processing and Simulation
  Data Engineering
CHALLENGE AREAS (1)
Machine Learning and Data Analytics R&D
Data Acquisition
  Monitoring of accelerators and detectors
    Detecting anomalous behaviour in real time
    Monitoring data quality based on anomaly detection techniques, to predict failures by monitoring patterns
  Anomaly detection and the search for new physics
    Searches for new phenomena that are different from those commonly accepted by the trigger system, using semi-supervised/supervised algorithms
  Fast inference technology for trigger systems
    Investigating deep learning for real-time event classification at the level of the trigger
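As a minimal illustration of detecting anomalous behaviour in a monitoring stream, the sketch below flags readings that deviate strongly from a rolling window of preceding values. This is a deliberately simple statistical stand-in (a rolling z-score), not the semi-supervised or deep-learning methods named above. The function name, window size, threshold and sample readings are all assumptions made for the example.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices of values far from the mean of the preceding window.

    A value is flagged when it lies more than `threshold` sample standard
    deviations away from the mean of the `window` values before it.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: any change at all is flagged.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    # Hypothetical sensor readings with one obvious spike at index 6.
    readings = [20.1, 20.0, 19.9, 20.2, 20.0, 20.1, 35.0, 20.0, 19.9, 20.1]
    print(rolling_zscore_anomalies(readings))
```

One known weakness of this approach is that a flagged spike stays in the history window for the next few readings and inflates the standard deviation, so closely spaced anomalies can be missed. Production systems would typically use more robust statistics or learned models.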
CHALLENGE AREAS (2)
Machine Learning and Data Analytics R&D
Data Processing
  Simulation plays a key role in all of HEP. At the HL-LHC, the need for simulated data will increase enormously
  It is essential to significantly improve performance and optimize existing code
  Evaluating the possibility of replacing complex algorithms with deep-learning algorithms, including the use of computer vision techniques such as generative adversarial networks
  Jet identification and image-based event classification to distinguish physics objects based on topology, using computer vision techniques
CHALLENGE AREAS (3)
Machine Learning and Data Analytics R&D
Big Data
  Data reduction and refresh for resource-heavy analysis workflows: investigating the use of Spark and Hadoop for a more efficient use of resources through re-use of selection criteria and customized partial reconstruction
Optimisation of Computing Infrastructure
  By applying statistical and machine learning methods to large sets of metrics collected from system components
  CPU and batch, disk and archive storage, network topology and flows, and application throughput
Data engineering and solutions from industry
  Data analytics platforms such as Apache Hadoop, Spark, Kafka, HBase and Kudu now have large user communities
  Being adopted by HEP
  Streaming analysis being investigated
Applications in Other Disciplines R&D
Credit: Mark Ellisman, National Center for Microscopy and Imaging Research (NCMIR) (Attribution-NonCommercial 2.0 Generic)
APPLICATIONS IN OTHER DISCIPLINES R&D
Sharing the use of techniques developed in HEP with other sciences
Astrophysics
  Exascale data processing at future astrophysics infrastructures, such as SKA
Platforms for Open Collaborations
  A smart data-analysis platform
Life Sciences and Medical Applications
  Simulating biological systems in the cloud
  Large-scale analysis of genomic data and healthcare data
Smart Everything
  Environmental monitoring
  Traffic and mobility
CONCLUSIONS
The CERN openlab white paper was targeted specifically at solving the challenges facing the HL-LHC
  16 challenge areas and many use cases grouped into 4 different R&D topics
The continued goal of CERN openlab is collaboration with industry to explore ICT solutions for the benefit of science, industry and society
  Supporting the challenging HL-LHC programme
CERN openlab is entering Phase VI, which indicates the strength and longevity of the process and the value of a continued collaboration
The white paper is a solid starting point for a very challenging and constructive Phase VI
  Opportunities for mutually beneficial collaborations
ANY QUESTIONS?