4.1.5 System Management

Presentation transcript:

4.1.5 System Management
Background: What is in System Management
– Resource control and scheduling
  – Booting, reconfiguration, defining limits for resource capacity and quality, provisioning the resources, workflow management, application-driven resource allocation
– Security
  – Authentication and authorization
  – Integrity of the system: ensure users cannot interfere with other users
  – Detection of anomalous behavior
  – Detection of inappropriate use, and response
– Integration and test
  – Managing and maintaining the health of the system, continuous diagnostics, operational capabilities
– Logging, reporting, and analyzing information
  – Analyzing data
  – Types of data: static machine, dynamic machine, RAS events, session log
– External coordination of resources
  – From the system: common communication infrastructure, error reporting in a standardized way
  – Distributed computing

4.1.5 System Management
Technology drivers
– Changing relative cost of components (I/O bandwidth, I/O latency)
– Power considerations
– Increasing failure rate
– Exploding component count
– Vast volumes of data

4.1.5 System Management
– Security validation of diverse components
– Fine-grained authentication
– Dynamic provisioning of traditional resources
– Integrated non-traditional resources: bandwidth, power
System Complexity (# transistors * Lines of Code)
– Continual resource failure and dynamic reallocation
– Resource control and scheduling
– Security
– Integration and test
– Logging, reporting, analyzing information
– External coordination of resources
– Proactive failure detection
– Unified framework for event collection
– Hardware support for full system security
– Model and filter for event analysis
– Continual monitoring and test

4.1.5 System Management
Recommended research agenda
– Need to better characterize non-traditional resources: power, I/O bandwidth
– How to control communication: provision and control (different for HPC than for WAN routing)
– Figure out the real-time aspect and feedback for resource control
– Develop techniques for dynamic provisioning under constant failure of components
– Coordinated resource discovery and scheduling aligned with Exascale resource management
– Fine-grained authentication and authorization by function/resource
– Security verification for software built from diverse components
– “Defense in depth” within systems without performance impact
  – Security-focused OS
  – End-to-end data integrity
– Tradeoffs of security and openness (e.g. grids)
– Determine key elements for monitoring
– Continue mining failure data and determining patterns
– Continuous monitoring and testing without affecting system behavior
– Investigate good filters; provide stateful filters for predicting potential incorrect behavior
– Determine statistical and data models that accurately capture behavior
– Determine proactive diagnostics and testing
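To make the "stateful filter" item concrete, here is a minimal sketch in Python. It is not taken from the slides: the class name, the 60-second window, and the threshold of three warnings are all assumptions chosen for illustration. The filter keeps per-component state (recent warning timestamps) and flags a component once warnings cluster in time, which is one simple way to predict potentially incorrect behavior from a log stream.

```python
from collections import defaultdict, deque


class StatefulEventFilter:
    """Flags a component that logs too many warnings within a time window.

    Hypothetical illustration: the window and threshold are assumed values,
    and timestamps are assumed to arrive in non-decreasing order.
    """

    def __init__(self, window_seconds=60.0, threshold=3):
        self.window_seconds = window_seconds
        self.threshold = threshold
        # component name -> timestamps of its recent warnings
        self._recent = defaultdict(deque)

    def observe(self, component, timestamp):
        """Record one warning; return True if the component now looks suspect."""
        times = self._recent[component]
        times.append(timestamp)
        # Drop warnings that have aged out of the window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        return len(times) >= self.threshold


# Example stream of (component, timestamp) warnings; the names are made up.
events = [
    ("node17", 0.0),
    ("node17", 10.0),
    ("node42", 15.0),
    ("node17", 20.0),
    ("node17", 200.0),
]
event_filter = StatefulEventFilter()
flags = [event_filter.observe(name, t) for name, t in events]
```

In this stream only the fourth event is flagged: it is the third warning from node17 within 60 seconds. The later warning at t=200 is not flagged because the earlier ones have aged out of the window.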

4.1.5 System Management
Alternative R&D strategies
– Explore overlap with telecommunications technology for bandwidth scheduling
– Examine data mining techniques
– Leverage real-time techniques for time-sensitive operations
– Apply security monitoring methodologies to system logging and activity analysis
– Look at performance filtering and data analysis; many of the problems are the same

4.1.5 System Management
Crosscutting considerations
– Resiliency
– Power
  Gather data from all subsystems and unify
– Consistency of performance
– Usability
– Common communication and data format across subsystems
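As a rough illustration of a common data format across subsystems, the Python sketch below normalizes two differently shaped inputs into one event record. Everything specific here is an assumption made for the example: the `Event` fields, the dictionary layout of the power sample, the pipe-delimited RAS line, and the 400 W warning limit do not come from the slides.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class Event:
    """One unified record shared by all subsystems (hypothetical schema)."""
    timestamp: float
    subsystem: str
    component: str
    severity: str
    message: str


def from_power_monitor(raw, limit_watts=400):
    """Convert an assumed power sample such as {"t": 12.5, "node": "n003", "watts": 410}."""
    severity = "warning" if raw["watts"] > limit_watts else "info"
    return Event(raw["t"], "power", raw["node"], severity,
                 f"power draw {raw['watts']} W")


def from_ras_log(line):
    """Convert an assumed RAS line of the form 'time|component|SEVERITY|message'."""
    timestamp, component, severity, message = line.split("|", 3)
    return Event(float(timestamp), "ras", component, severity.lower(), message)


def to_json(event):
    """Serialize an event so any subsystem can exchange it in the same form."""
    return json.dumps(asdict(event), sort_keys=True)


power_event = from_power_monitor({"t": 12.5, "node": "n003", "watts": 410})
ras_event = from_ras_log("13.0|n003|ERROR|ECC uncorrectable")
unified = sorted([ras_event, power_event], key=lambda e: e.timestamp)
```

Once both sources share one record type, events from different subsystems can be merged, sorted by time, and correlated per component without source-specific parsing downstream.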

System Management
Priority Research Direction (resource control and scheduling)
Key challenges
– How to control communication: is additional hardware needed?
– Need to better characterize non-traditional resources: power, I/O bandwidth
– Figure out the real-time aspect and feedback
– Scale: ensure time does not grow with size
– Scale: unobtrusive to users of the system
– Power: monitor power of the machine
– Non-traditional resources must be managed
– Models must be developed
– Develop a framework for richer workflow
– Software will be functionally decomposed
– Software will be distributed to scale, be resilient, and avoid a single point of failure
– New control points will be in the software
– More responsive resource provisioning
– More flexibility, control, and productivity for applications
Summary of research direction
Potential impact on software component
Potential impact on usability, capability, and breadth of community

System Management
Priority Research Direction (open system security)
Key challenges
– Fine-grained authentication and authorization by function/resource
– Security verification for software built from diverse components
– “Defense in depth” within systems
– Security-focused OS
– End-to-end data integrity
– Tradeoffs of security and openness (e.g. grids)
– Scale and complexity
– Assume identity theft
– Security profiling and monitoring at scale (number of OSes, number of links, speed of links, etc.) without impact or restriction
– Use of commodity components increases vulnerability
– Detecting anomalous behavior by external profiling of “normal” behavior
– Better design
– Scalable security policy
– Finer-grained resource authentication
– Better monitoring
– Improved system uptime in the face of increased threat activities
– Better protection of data
– Improved access
Summary of research direction
Potential impact on software component
Potential impact on usability, capability, and breadth of community
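One way to picture authorization "by function/resource" is a policy that grants individual (function, resource) pairs rather than broad roles. The Python sketch below is only an illustration under that assumption. The `Policy` class, the user names, and the function and resource names are hypothetical, and authentication (establishing who the user is) is deliberately out of scope.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Default-deny policy keyed on (function, resource) pairs (hypothetical)."""
    # user -> set of (function, resource) pairs that user may invoke
    grants: dict = field(default_factory=dict)

    def allow(self, user, function, resource):
        """Grant one user the right to apply one function to one resource."""
        self.grants.setdefault(user, set()).add((function, resource))

    def is_authorized(self, user, function, resource):
        """Return True only if this exact pair was granted to this user."""
        return (function, resource) in self.grants.get(user, set())


policy = Policy()
policy.allow("alice", "reserve", "io_bandwidth")
policy.allow("operator", "reboot", "node17")

alice_can_reserve = policy.is_authorized("alice", "reserve", "io_bandwidth")
alice_can_reboot = policy.is_authorized("alice", "reboot", "node17")
```

Here `alice_can_reserve` is true and `alice_can_reboot` is false: a grant on one function and resource implies nothing about any other pair, so a compromised identity exposes only what was explicitly granted to it.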

System Management
Priority Research Direction (integration and test)
Key challenges
– Determine key elements for monitoring
– Continue mining failure data and determining patterns
– Determine proactive diagnostics and testing
– Tradeoff between more extensive diagnostics and their effect on system performance; run while the system is up if possible
– Lots of warnings and errors: how to determine which ones matter
– Be able to quickly switch between different versions of the system
– Software components need to agree on a common format
– Need a format for versioning
– Better MTBF
– More consistent performance
Summary of research direction
Potential impact on software component
Potential impact on usability, capability, and breadth of community

System Management
Priority Research Direction (logging, analyzing, and reporting)
Key challenges
– Investigate good filters
– Determine statistical models that accurately capture behavior
– Huge amount of data already, and increasing
– Determine common infrastructure across disparate components
– Provide a framework for component connection
– Ability to visualize condensed data
– Need software infrastructure to capture data
– Better MTBF
– Ability to better understand the system
– Ability to make better decisions about machine use
– Ability to design more reliable future machines by understanding their Achilles' heel
Summary of research direction
Potential impact on software component
Potential impact on usability, capability, and breadth of community

System Management
Priority Research Direction (External Coordination of Resources)
Key challenges
– Providing information about availability and use of system resources
– Coordinated aggregation of system resources (e.g. reservations)
– Better design
– Scalable security policy
– Finer-grained resource authentication
– Better monitoring
– Improved system uptime in the face of increased threat activities
– Better protection of data
– Improved access
Summary of research direction
Potential impact on software component
Potential impact on usability, capability, and breadth of community
– Coordinated resource discovery and scheduling aligned with Exascale RM
– Coordinated distributed security