Enterprise Best (Community) Practices Workshop

Enterprise Best (Community) Practices Workshop
OGF 21 - Seattle
Nick Werstiuk, werstiuk@platform.com
October 18, 2007

OGF IPR Policies Apply

“I acknowledge that participation in this meeting is subject to the OGF Intellectual Property Policy.”

Intellectual Property Notices Note Well: All statements related to the activities of the OGF and addressed to the OGF are subject to all provisions of Appendix B of GFD-C.1, which grants to the OGF and its participants certain licenses and rights in such statements. Such statements include verbal statements in OGF meetings, as well as written and electronic communications made at any time or place, which are addressed to:
- the OGF plenary session,
- any OGF working group or portion thereof,
- the OGF Board of Directors, the GFSG, or any member thereof on behalf of the OGF,
- the ADCOM, or any member thereof on behalf of the ADCOM,
- any OGF mailing list, including any group list, or any other list functioning under OGF auspices,
- the OGF Editor or the document authoring and review process

Statements made outside of a OGF meeting, mailing list or other function, that are clearly not intended to be input to an OGF activity, group or function, are not subject to these provisions.

Excerpt from Appendix B of GFD-C.1: “Where the OGF knows of rights, or claimed rights, the OGF secretariat shall attempt to obtain from the claimant of such rights, a written assurance that upon approval by the GFSG of the relevant OGF document(s), any party will be able to obtain the right to implement, use and distribute the technology or works when implementing, using or distributing technology based upon the specific specification(s) under openly specified, reasonable, non-discriminatory terms. The working group or research group proposing the use of the technology with respect to which the proprietary rights are claimed may assist the OGF secretariat in this effort. The results of this procedure shall not affect advancement of document, except that the GFSG may defer approval where a delay may facilitate the obtaining of such assurances. The results will, however, be recorded by the OGF Secretariat, and made available. The GFSG may also direct that a summary of the results be included in any GFD published containing the specification.”

OGF Intellectual Property Policies are adapted from the IETF Intellectual Property Policies that support the Internet Standards Process.

Session Objective
- Update on Activity with Best Practices
  - Where do things sit on the first best practice
- Call for Participation
  - Ask for input and feedback on the document
  - Ask for feedback on other best practice topics

Agenda
- Recap: how did we get here?
- Context/Definitions/Scope
- Community Practice for Enterprise Grid Deployment and Management
  - Problem #1: Deployment and Configuration Management
  - Problem #2: Maximizing Utilization and Efficiency
  - Problem #3: Monitoring, Troubleshooting and Reporting
- Conclusions and Next Steps

Recap
- Best Practice session at OGF 19
  - Community generated list of potential topics
  - Surveyed attendees to develop priorities
  - Defined 1st priority as “Best Practices to Deploy and Manage an Enterprise Grid”
- OGF 20 Session: feedback on what the practices should be, and whether we are looking at the right questions
- End User Research (ongoing)
- Documentation of the user information into the first document

Initial List of Best Practice Areas
- Application Suitability for Grid
- Application Porting to Grid
- Grid in Financial Services
- Increasing Infrastructure Utilization
- Grid Deployment and Management
- Grid Financial Justification
- Grid Service Level Management
- Vendor Selection Process
- What type of Grid to use?
- License Optimization and Management
- Use of External Grids integrated with Enterprise
- Data Management and Integration, Security, Clean Up
- Integration of Grid Security with Enterprise Environment
Priority callouts on the slide: #1, #2, #3

Recap OGF 20 Workshop Recommendations
- Use this best (or community) practice activity as a way to catalog and comment on the approaches being used today in Enterprise environments
- Focus on the approaches (tools) and the limitations of those approaches, not on which approach is the absolute ‘Best’

Context – Intended Audience
- Enterprises looking at deploying a grid infrastructure
  - Trying to understand some of the approaches and techniques currently being used in the community to deploy and manage grid infrastructure
- Existing Enterprise grid users
  - Looking for additional information and insight on how other organizations are tackling problems and issues around deployment and management of their grid infrastructure

Context - Scope
- Typical and current use cases and approaches for Grids within enterprise environments.
- 80/20 Rule Applies
  - The majority of these Enterprise grids are used for the execution of compute and data intensive workloads within the boundaries of a single enterprise
  - Have chosen to focus on these types of deployments.

Context – Enterprise Scope
Organization Type (End User, Ecosystem) by Organization Focus (High Performance, ‘Data Center’):
- End User, High Performance: Commercial users of grids for compute and data intensive processing
- Ecosystem, High Performance: Application and infrastructure ISVs used for compute and data intensive grids
- End User, ‘Data Center’: Commercial users of grids beyond high performance applications
- Ecosystem, ‘Data Center’: Data Center automation, management, virtualization, etc.

Context – Scope
Vertical Market Examples
- Industrial Manufacturing - MDA
- Electronics - EDA
- Financial Services - Investment Banking Risk, Pricing
- Life Sciences
- Oil and Gas
Horizontal Examples
- Analytics
- Business Intelligence

Example - Telco Billing Environment
SMP configuration (diagram labels)
- 3 x SMP, 32 CPU, with RDBMS app; can process 2-3 Target Markets concurrently
- SAN; >30 Target Markets; Billing; CRM; Database; Switches; Billing Data
Grid configuration (diagram labels)
- 200 x Linux servers, 2-CPU, 3.2GHz, 8GB RAM; All Markets concurrently
- 3 x SMP, 32 CPU, SMP app
- SAN; >30 Target Markets; Billing; CRM; Database; Switches; Billing Data
Challenges
- Current SLA 24-48
- Desired SLA 4 hours
- Additional SMP to meet SLA = $12M
Results
- Desired SLA Met
- Grid Resources = $1.2M
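
The figures quoted in this example can be checked with a few lines of arithmetic. The sketch below uses only the numbers on the slide and assumes the current SLA of 24-48 is expressed in hours, like the 4-hour target (the slide gives no unit). All variable names are illustrative.

```python
# Figures quoted on the Telco billing slide.
# Assumption: the "24-48" current SLA is in hours, like the 4-hour target.
current_sla_hours = (24, 48)
desired_sla_hours = 4
additional_smp_cost_usd = 12_000_000  # quoted cost of adding SMP capacity
grid_cost_usd = 1_200_000             # quoted cost of the grid resources

# Speed-up needed to move from the current SLA window to the desired one.
required_speedup = tuple(h / desired_sla_hours for h in current_sla_hours)

# How the grid option compares with scaling up the SMP servers.
cost_ratio = additional_smp_cost_usd / grid_cost_usd
savings_usd = additional_smp_cost_usd - grid_cost_usd

print(f"Required speed-up: {required_speedup[0]:.0f}x to {required_speedup[1]:.0f}x")
print(f"Grid cost is 1/{cost_ratio:.0f} of the SMP option, saving ${savings_usd:,}")
```

Under that assumption the grid had to deliver a 6x to 12x reduction in processing time, and did so at one tenth of the quoted SMP cost.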

Context – Out of Scope
- Enterprise or Data Center Grids that are used primarily for large scale transactional environments.
  - Emerging end user deployment types, not as common as the compute and data intensive use cases that are widely deployed across vertical industries.
- Data management and access issues in compute intensive grid environments.
  - The focus of this document is primarily on the deployment and management of the server infrastructure and the applications that run on that infrastructure, not the associated data management issues.

Alignment and Discussion
- Comments on the Focus Area?

Context – Drivers and Impact
- Business Agility/Flexibility
  - Apply resources flexibly to meet business needs
- Infrastructure Cost Challenges
  - Drive to scale out infrastructure to meet price/performance needs
- End Result = Management Complexity
  - Large scale-out architectures (100s to 1000s of servers) have introduced severe complexity and manageability challenges

Problem Area #1
Deployment and Configuration of the servers in the Grid:
- Initial deployment and ongoing configuration management of large numbers of servers is a significant challenge.
- The problem is magnified in dynamic environments, where the ‘personality’ of the resources needs to change to meet workload demands and business policies.
- Complexity of the environment grows in multiple dimensions based on key variables such as number of applications, type of application pattern, number of resources, and number of user groups.
Key Discussion Questions:
- What tools and approaches are organizations using to manage the provisioning and update of 1000s of servers at a time in their enterprise Grid?
- What tools and approaches are people using to deploy their applications and the required application and middleware configuration onto their Enterprise Grid?
- What are the current benefits and limitations of these different approaches? (needs work)

Additional Issues Faced
- Remote Install Requirements and scalability
- Heterogeneous (and Multiple Layers of) Resources
  - Hardware
  - Operating Systems
  - Middleware
  - Applications
- Configuration Management
- Updates and Patching
- Tool Costs
- Pre Integrated ‘Stacks’ vs Best of Breed Integrations

Typical Configurations
- Disk-full server nodes
  - The Operating System is installed onto a disk directly on the compute node. In this type of configuration a standard Ethernet network is used and a boot image is loaded over the network
- Disk-less server nodes
  - Network Attached Storage: Operating system is installed on network attached storage and server is booted over the network. (Boot over SAN, InfiniBand)
  - In Memory OS Deployment: Operating system is installed into memory in the compute node
- Desktop Compute nodes
  - Decentralized deployments

Typical Configurations
- Package Based
- Golden Image
- VM Appliance

Approaches and Tools Used
- Commercial Provisioning/Data Center Automation
  - Tools providing a comprehensive suite of operating system provisioning and application configuration management. Typically used in enterprises for server environments beyond the grid infrastructure
- Commercial Cluster Management
  - High performance computing cluster management tools.
- Open Source Cluster Management/Provisioning
  - A variety of options from the academic and research oriented HPC environments, built by the users in these institutions to deploy their clusters and grid.

Approaches and Tools Used
- Open Source Configuration Management Tools
  - A variety of tools have come from the research community grid deployments and are a potential option for enterprise users. These tools typically operate on the configuration of the nodes once an Operating System is installed on the server.
- Storage and Networking Based Deployment
  - Storage and networking vendors provide tools to enable the deployment and configuration of the grid nodes.

Approaches and Tools Used
- OS Specific Tools provided by the Operating system vendor
  - Operating system vendors provide the capability for their users to deploy the Operating system onto server hardware.
- Hardware Provider Specific Solutions
  - Server vendors typically provide their customers with a provisioning and hardware management tool to enable them to manage their specific brand of hardware.
- “Home grown” developed installation tools
  - Many organizations have built out their own provisioning tools to meet their specific enterprise needs.

Approaches and Tools Used
- Grid Workload Managers
  - Workload managers used in enterprise grid environments have some deployment management capabilities built into their offerings.
  - These enable file transfer and compute service installation to be managed via services within the grid.
  - Once the workload manager is installed, these remote execution and management facilities can be used to address many of the typical application deployment use cases on the grid.
- Virtual Machine Based Environments
  - An emerging approach in the Enterprise Grid community is to leverage VMs as the means to deploy and configure the operating system on the grid nodes.

Anything Missing?
- Are there other approaches being used?

Problem Area #2
Maximizing Grid Resource Utilization and Efficiency:
- Competing goals within the organization: the infrastructure owners typically want to maximize their resource utilization, while Application owners want the best Service Levels.
- As diverse workloads are deployed into the grid infrastructure, the problem of efficient resource utilization mapped against business priorities becomes magnified.
Key Discussion Question:
- What tools and approaches are people using to maximize the utilization of their enterprise Grid, while managing Service Levels for individual Application Users/Business Units?

Additional Issues Faced
- Unlimited (infinite) demand with limited Supply
  - Balance prioritization based on business policies with the ability to take maximum benefit of available resources
- Complexity is introduced in mixed application and workload environments, where a variety of workloads may be deployed onto the same infrastructure.
  - Short running batch jobs
  - Long running batch jobs
  - Sequential vs. Parallel Jobs
  - Workflows of Jobs
  - Compute Service Oriented
  - Interactive User Sessions

Typical Configurations
- Application Specific Resources
  - Pool of resources dedicated to a single application or business unit
- Shared Resources – Single Workload Type
  - Resources pooled across similar application pattern and business units
- Shared Resources – Multiple Workloads
- Dedicated and Opportunistic Resources
  - Dedicated Servers
  - Opportunistic Servers
  - Opportunistic Desktops

Approaches and Tools Used
- Demand Policy Dimensions
  - Priority
  - Response Time
- Supply Policy Dimensions
  - Ownership
  - Lending
  - Borrowing
  - Sharing
- Service Level is a combination and manipulation of both
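
To make the supply policy dimensions concrete, here is a minimal sketch of ownership, lending and borrowing between business units. It is not taken from any particular workload manager: the `Pool` class, the `allocate` function, the unit names and the slot counts are all hypothetical, and list order stands in for the demand-side priority.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """Slots owned by one business unit (hypothetical model)."""
    owner: str
    owned: int            # ownership: slots the unit owns
    demand: int           # slots its pending workload wants
    lends: bool = True    # lending: may its idle slots be used by others?
    borrows: bool = True  # borrowing: may it use other units' idle slots?


def allocate(pools):
    """Serve each owner from its own slots first, then share idle slots."""
    alloc = {p.owner: min(p.owned, p.demand) for p in pools}
    # Idle slots from units that allow lending form the shared supply.
    spare = sum(p.owned - alloc[p.owner] for p in pools if p.lends)
    # Borrowers are served in list order, which stands in for priority.
    for p in pools:
        if p.borrows and spare > 0:
            extra = min(p.demand - alloc[p.owner], spare)
            alloc[p.owner] += extra
            spare -= extra
    return alloc


pools = [
    Pool("risk", owned=100, demand=160),
    Pool("pricing", owned=80, demand=20),
    Pool("reporting", owned=40, demand=70, borrows=False),
]
print(allocate(pools))
```

In this example the risk unit borrows the 60 slots that pricing leaves idle, while reporting stays capped at its own 40 slots because its policy forbids borrowing.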

Approaches and Tools Used
- Application Specific Workload Managers
  - Job/batch environments
  - Service Oriented Application Environments
- Meta Schedulers
  - Layer on top of the clusters to distribute workload to different resources
- Resource Managers
  - Application independent resource allocation and control.
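
The meta scheduler idea, a layer above several clusters that decides where each piece of work goes, can be sketched as follows. This is an illustrative toy, not the behaviour of any named product: the cluster names, job names and the "most free slots" placement rule are assumptions.

```python
def pick_cluster(free_slots, slots_needed):
    """Return the cluster with the most free slots that can fit the job, or None."""
    candidates = [(free, name) for name, free in free_slots.items()
                  if free >= slots_needed]
    if not candidates:
        return None
    return max(candidates)[1]


def dispatch(clusters, jobs):
    """Place jobs one by one, tracking the remaining capacity of each cluster."""
    free = dict(clusters)  # copy so the caller's capacities are not modified
    placements = {}
    for job, need in jobs:
        target = pick_cluster(free, need)
        placements[job] = target  # None means the job stays queued
        if target is not None:
            free[target] -= need
    return placements


# Hypothetical clusters (free slots) and jobs (slots required).
clusters = {"london": 64, "newyork": 32, "tokyo": 16}
jobs = [("var-calc", 48), ("pricing", 24), ("report", 40)]
print(dispatch(clusters, jobs))
```

Here the third job finds no cluster with 40 free slots and is left unplaced; a real meta scheduler would typically queue it and also weigh priority, data locality and ownership policies.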

Anything Missing?
- Are there other approaches being used?

Problem Area #3
Monitoring, Troubleshooting and Reporting:
- Different End user, application owner and infrastructure management needs
- Coordinated across the layers: different groups (application and infrastructure) have their own views of what is happening to the applications and the underlying grid resources
Key Questions:
- What tools and approaches are people using to monitor the infrastructure and the applications?
- What tools and approaches are people using to integrate the different monitoring and configuration approaches to provide an integrated management infrastructure for their Enterprise Grid?

Additional Issues Faced
- Real Time Resource Monitoring for Workload dispatch
- Real Time Monitoring for Problem Diagnosis
- Application Monitoring
- Log collection for troubleshooting
- Remote administrator access
- Reporting on Application and Resources
- Analysis of Historical Data

Additional Issues Faced
- Correlation of information across different layers
- Unified Presentation
- Too Many Agents

Approaches and Tools Used
- Workload Managers
  - Provide monitoring information on the grid resources. Metrics on the physical resources are typically available in support of the workload allocation activities.
  - The workload managers also provide workload status and information on job activity, performance and …
- Cluster Monitoring Tools
  - Tools available (usually included in a cluster management package) that enable monitoring of the physical resources in the cluster/grid
- Reporting and Performance Analysis Tools
  - Leverage data collected from the workload manager and other sources into a data-mart/data warehouse enabling the creation of historical reports
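
As a small illustration of the reporting approach, the sketch below aggregates job records into a historical utilization figure per business group. The record fields (`group`, `slots`, `hours`) and the sample values are hypothetical; a real deployment would read accounting data exported by the workload manager into the data warehouse.

```python
from collections import defaultdict


def utilization_by_group(job_records, total_slot_hours):
    """Return each group's share of the available slot-hours in the period."""
    used = defaultdict(float)
    for rec in job_records:
        # Slot-hours consumed by one job: slots held times hours run.
        used[rec["group"]] += rec["slots"] * rec["hours"]
    return {group: hours / total_slot_hours for group, hours in used.items()}


# Hypothetical accounting records for one reporting period.
job_records = [
    {"group": "risk", "slots": 10, "hours": 2.0},
    {"group": "risk", "slots": 5, "hours": 4.0},
    {"group": "pricing", "slots": 20, "hours": 1.0},
]
report = utilization_by_group(job_records, total_slot_hours=100)
for group, share in report.items():
    print(f"{group}: {share:.0%} of capacity")
```

The same aggregation extends naturally to other dimensions the slides mention, such as per-application or per-resource reports.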

Approaches and Tools Used
- Hardware Monitoring and Management
  - Provided with the hardware by the server vendors
  - Intelligent Platform Management Interface (IPMI)
- Network Management and Monitoring Frameworks
  - Extensible Application and infrastructure monitoring
- Enterprise Management and Monitoring Tools
  - Leverage components or suites from the large Enterprise System Management vendors
- Industry Specific Tools
  - Business process specific tools that monitor the grid and infrastructure/application outside the grid

Anything Missing?
- Are there other approaches being used?

Conclusions
- Diversity of tools and approaches indicates interoperability problems
- Enterprises make choices between Stacks and Best of Breed Integration
- Initiatives emerging to address complexity via configuration recipes and verification of everything from the HW up to the Application
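
One way to picture a configuration recipe with verification from the hardware up to the application is the sketch below. It is only an illustration of the idea: the layer names, recipe keys and values are hypothetical and do not correspond to any specific initiative or tool.

```python
# Layers are checked bottom-up, from the hardware to the application.
LAYERS = ["hardware", "os", "middleware", "application"]


def verify(recipe, observed):
    """Return (layer, key, expected, actual) for every mismatch, lowest layer first."""
    problems = []
    for layer in LAYERS:
        for key, expected in recipe.get(layer, {}).items():
            actual = observed.get(layer, {}).get(key)  # None if not reported
            if actual != expected:
                problems.append((layer, key, expected, actual))
    return problems


# Hypothetical recipe for a grid node and the state reported by one node.
recipe = {
    "hardware": {"cpus": 2, "ram_gb": 8},
    "os": {"kernel": "2.6.18"},
    "middleware": {"workload_manager": "installed"},
    "application": {"billing_app": "3.1"},
}
observed = {
    "hardware": {"cpus": 2, "ram_gb": 4},
    "os": {"kernel": "2.6.18"},
    "middleware": {"workload_manager": "installed"},
    "application": {},
}
for problem in verify(recipe, observed):
    print(problem)
```

For this node the check reports a memory shortfall at the hardware layer and a missing application, which is the kind of end-to-end verification the slide refers to.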

Next Steps
- Write up the details on Monitoring approaches
- Feedback, Review and Comments
- Other Community Practice Areas
  - How can we scale this?
- Volunteers and Participation

Initial List of Best Practice Areas
- Application Suitability for Grid
- Application Porting to Grid
- Grid in Financial Services
- Increasing Infrastructure Utilization
- Grid Deployment and Management
- Grid Financial Justification
- Grid Service Level Management
- Vendor Selection Process
- What type of Grid to use?
- License Optimization and Management
- Use of External Grids integrated with Enterprise
- Data Management and Integration, Security, Clean Up
- Integration of Grid Security with Enterprise Environment
Priority callouts on the slide: #1, #2, #3
Still Valid / Anyone want to start one?