Published by Kory Burns. Modified over 6 years ago.
1
4/12/2018 1:04 PM © Microsoft Corporation. All rights reserved. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
2
Microsoft Build 2017, session P4168
Managing Secure, Scalable, Azure Service Fabric Clusters and Applications
Chacko Daniel, Principal Program Manager (@chackod)
3
Agenda
- Service Fabric Azure PowerShell Module
- Best practices for planning your cluster
- Best practices for securing your cluster
- Best practices for business continuity planning
- Monitoring and diagnostics
- Road map
4
Azure Service Fabric: Any OS, Any Cloud
Availability:
- Service Fabric Windows SDK: Available
- Service Fabric on Windows in Azure: Available
- Service Fabric for Windows Server: Available
- Service Fabric on Linux in Azure: Preview
- Service Fabric for Linux: Coming 2017
- Service Fabric in Azure Stack: GA coming 2017
Capabilities: Lifecycle Management, Always On Availability, Programming Models, Health & Monitoring, Dev & Ops Tooling, Auto scaling, Orchestration
Hosting environments: Dev Box, Azure, On-Premise Data centers, Other Clouds
5
Service Fabric RM PowerShell Module
- Ships as a part of Azure PowerShell
- In preview for now, so feedback is appreciated
- Automates many E2E scenarios:
  - Creating a secure cluster
  - Adding or removing nodes from a cluster
  - Adding or removing NodeTypes from a cluster
  - Adding new cluster and client certificates to a cluster
  - Switching the cluster upgrade mode from Automatic to Manual or vice versa
  - Updating Service Fabric settings
6
New-AzureRmServiceFabricCluster
- Ships with four parameter sets
- Supports the majority of "create new Service Fabric cluster" scenarios
- Example with the minimum set of parameters: create a secure cluster with a self-signed certificate, optionally downloading it locally:
    New-AzureRmServiceFabricCluster -ResourceGroupName $RGname -Location $clusterloc -ClusterSize $numNodes -VmPassword $pwd -CertificateSubjectName $subname -CertificatePassword $pwd -CertificateOutputFolder $pfxfolder
Let us see this in action.
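The minimum parameter set above can be assembled with splatting so that the values are reviewable before anything is deployed. This is only a sketch: the helper `New-ClusterParameterSet` and all sample values are hypothetical, and it assumes that `-VmPassword` and `-CertificatePassword` expect a SecureString. Note also that `$pwd` is the name of PowerShell's automatic current-location variable, so the sketch uses a different variable name for the password.

```powershell
# Hypothetical helper (not part of the Azure module): collects the arguments
# shown on the slide into one hashtable that can be splatted onto the cmdlet.
function New-ClusterParameterSet {
    param(
        [Parameter(Mandatory)] [string] $ResourceGroupName,
        [Parameter(Mandatory)] [string] $Location,
        [Parameter(Mandatory)] [int]    $ClusterSize,
        [Parameter(Mandatory)] [string] $PlainTextPassword,
        [Parameter(Mandatory)] [string] $CertificateSubjectName,
        [Parameter(Mandatory)] [string] $CertificateOutputFolder
    )

    # Assumption: both password parameters take a SecureString. The slide
    # reuses one password for the VMs and the certificate, so this does too.
    $secure = ConvertTo-SecureString -String $PlainTextPassword -AsPlainText -Force

    return @{
        ResourceGroupName       = $ResourceGroupName
        Location                = $Location
        ClusterSize             = $ClusterSize
        VmPassword              = $secure
        CertificateSubjectName  = $CertificateSubjectName
        CertificatePassword     = $secure
        CertificateOutputFolder = $CertificateOutputFolder
    }
}

# Placeholder values for illustration only.
$clusterParams = New-ClusterParameterSet -ResourceGroupName 'sf-demo-rg' `
    -Location 'westus' -ClusterSize 5 -PlainTextPassword 'Placeholder-Passw0rd!' `
    -CertificateSubjectName 'sfdemo.westus.cloudapp.azure.com' `
    -CertificateOutputFolder 'C:\certs'

# Uncomment once signed in to Azure with the Service Fabric RM module loaded:
# New-AzureRmServiceFabricCluster @clusterParams
```

Keeping the parameters in a hashtable also makes it easy to check them into source control alongside the rest of the deployment scripts.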
7
Create a secure cluster using PowerShell
Demo #1
8
Best practices: Cluster Security in Windows Server (Standalone)
- Always use a secure cluster: use AD
  - Cluster security: gMSA
  - Client access (Admin and Read-only): use AD
- Use automated deployments
  - Use scripts to generate, deploy, and roll over secrets
  - No human should have access to them without authentication
- Additionally, consider the following:
  - Create DMZs using Network Security Groups (NSGs) in your load balancer
  - Use jump servers to RDP into cluster VMs or to manage the cluster
Notes: Although we support the use of certs on standalone, we recommend that you use AD. For any production deployment, always use automated deployment. Use the tool of your choice, or PowerShell scripts.
9
Best practices: Cluster Security in Azure
- Always use a secure cluster
  - Cluster node-to-node security: use certificates
  - Client access (Admin and Read-only): use AAD
- Use automated deployments
  - Use scripts to generate, deploy, and roll over secrets
  - Keep the secrets in KV, use AD for all other client access
  - No human should have access to them without authentication
- Additionally, consider the following:
  - Create DMZs using Network Security Groups (NSGs)
  - Use jump servers to RDP into cluster VMs or to manage your cluster
Notes: In Azure, use certificates for client access only as a "break glass" scenario. For any production deployment, always use automated deployment. Use the tool of your choice, or PowerShell scripts.
10
Best practices: Cluster Security in Azure
[Diagram: a Service Fabric cluster in a VNET. Components shown: load balancers LB#1 to LB#3, network security groups NSG#1 to NSG#3, VM scale sets VMSS#1 to VMSS#3, a jump server, Key Vault, AAD, and Azure Storage (for diagnostics, for SF logs, and for VHDs, including managed disks).]
11
NSG ports that need to be opened
- ClientConnectionEndpoint (TCP): 19000
- HttpGatewayEndpoint (HTTP/TCP): 19080
- SMB support for Image Store: 445, 134
- ClusterConnectionEndpointPort (TCP): 1025
- LeaseDriverEndpointPort (TCP): 1026
- Ephemeral port range: as needed, min 256 ports
- App ports: as needed
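One way to use this list is as a checklist against the port ranges your NSG rules allow. The sketch below covers only the fixed TCP endpoints named above; SMB, the ephemeral range, and app ports vary per cluster and are left out. The helper `Get-MissingEndpoint` and the sample allowed ranges are hypothetical and do not read a real NSG.

```powershell
# Fixed TCP endpoints taken from the list above.
$requiredTcpPorts = [ordered]@{
    ClientConnectionEndpoint      = 19000
    HttpGatewayEndpoint           = 19080
    ClusterConnectionEndpointPort = 1025
    LeaseDriverEndpointPort       = 1026
}

# Hypothetical helper: returns the names of endpoints whose port is not
# covered by any allowed range. A range is either "port" or "low-high".
function Get-MissingEndpoint {
    param(
        [System.Collections.IDictionary] $RequiredPorts,
        [string[]] $AllowedRanges
    )
    foreach ($name in $RequiredPorts.Keys) {
        $port = $RequiredPorts[$name]
        $covered = $false
        foreach ($range in $AllowedRanges) {
            $bounds = $range -split '-'
            $low  = [int]$bounds[0]
            $high = [int]$bounds[-1]   # same as $low for a single port
            if ($port -ge $low -and $port -le $high) { $covered = $true; break }
        }
        if (-not $covered) { $name }
    }
}

# Example rule set that forgot the HTTP gateway port.
$missing = @(Get-MissingEndpoint -RequiredPorts $requiredTcpPorts -AllowedRanges '19000', '1025-1027')
$missing
```

A check like this can run as part of an automated deployment, before the template is applied.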
12
Review of a cluster with NSG enabled on Portal
Demo #2
ARM template used:
13
Service Fabric Cluster Planning
Key points:
- Capacity planning is not an easy exercise
- Capacity planning requires periodic reassessment
- Adding capacity on demand is not instantaneous
- Incurring downtime in the future to change capacity may not work
Read more on cluster capacity planning in this document
14
Service Fabric Cluster Planning
- Define what the cluster will be used for
  - Is this to be used for Test?
  - Is this a part of the CI/CD pipeline?
  - Is this for Production use?
- Determine the node types and sizes
  - For each application planned to be deployed, do a sizing exercise
  - Ports to be opened, etc.
- Are there any unique compliance or security requirements?
  - Compliance expectations from infrastructure and applications
15
Service Fabric Cluster Planning
- Where do you want this cluster hosted?
  - On Azure?
  - On-Premise, in your data center?
  - On some other cloud provider?
- Choose the # of Fault Domains (FDs)
  - This determines the headroom needed in case of unplanned failures
- Choose the # of Upgrade Domains (UDs)
  - This determines the headroom needed in case of planned failures
16
Choosing the # of Fault Domains
- The number of FDs determines the headroom needed in case of unplanned failures
  - Examples include a PDU failing or TOR maintenance that can take out all machines in a rack
- In terms of capacity, you need to leave enough headroom to accommodate the failure of at least one FD
  - This will result in SF moving/creating new replicas on the available machines in other FDs
[Diagram: replicas spread across FD1 to FD5, with a PDU burnout affecting one FD]
Notes: In Azure you do not get to choose the number of FDs. The VMSS instances are spread across 5 FDs.
17
Choosing the # of Upgrade Domains
- The number of Upgrade Domains determines the headroom needed in case of planned failures/downtimes
  - An example: when a Service Fabric upgrade is going on and a UD is down, you have to have room for additional replicas if need be
[Diagram: replicas on a grid of UD1 to UD10 against FD1 to FD5, with an SF upgrade in progress on one UD]
Notes: In Azure you do not get to choose the number of UDs. The VMSS instances are spread across 5 UDs.
18
Best practices: Capacity headroom
You should plan your capacity in such a way that your service can at least survive:
- A loss of one FD
- A UD being down because of an upgrade going on
- An additional random node/VM failing
[Diagram: grid of FD1 to FD5 against UD1 to UD10]
The link above points to: resource-manager-cluster-description#cluster-capacity
Read about specifying node capacity and buffered capacity here
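The three failures above can be turned into a rough headroom number. The following is a deliberately pessimistic sketch, not a Service Fabric calculation: it assumes nodes are spread evenly across domains and that the failed FD, the upgrading UD, and the random node do not overlap. The function name `Get-HeadroomEstimate` is hypothetical.

```powershell
# Hypothetical helper: estimates how many nodes remain if one FD fails,
# one UD is down for an upgrade, and one more node fails at random.
function Get-HeadroomEstimate {
    param(
        [int] $NodeCount,
        [int] $FaultDomains,
        [int] $UpgradeDomains
    )

    # Assumption: even spread, so the largest domain holds ceil(N / domains) nodes.
    $fdLoss = [int][math]::Ceiling([double]$NodeCount / $FaultDomains)
    $udLoss = [int][math]::Ceiling([double]$NodeCount / $UpgradeDomains)

    # Pessimistic: treat the three failures as hitting disjoint sets of nodes.
    $lost = [math]::Min($NodeCount, $fdLoss + $udLoss + 1)
    $surviving = $NodeCount - $lost

    [pscustomobject]@{
        NodesLost          = $lost
        NodesSurviving     = $surviving
        # Fraction of total capacity the workload may use and still fit afterwards.
        MaxSafeUtilization = [math]::Round($surviving / $NodeCount, 2)
    }
}

$estimate = Get-HeadroomEstimate -NodeCount 10 -FaultDomains 5 -UpgradeDomains 5
$estimate
```

For 10 nodes over 5 FDs and 5 UDs this gives 5 nodes lost and a maximum safe utilization of 0.5; because an FD and a UD usually share nodes, the real figure is typically less severe.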
19
Best practices: Cluster setup in Azure
- Use the ARM template to customize your cluster
  - Set up managed storage for VM VHDs
- Use the ARM template to drive changes to your Resource Group
  - Easy configuration management
  - Auditing
- Avoid using implicit commands to tweak your resources
- Treat your cluster configuration as code
  - Be thorough in checking the configurations you choose to deploy
Notes: Now let us shift our focus to the best practices for setting up clusters in Azure.
20
Options for Test Clusters in Azure
Use PowerShell or the Portal to set up a test cluster.
One-node cluster:
    New-AzureRmServiceFabricCluster -ResourceGroupName $RGname -Location $clusterloc -ClusterSize 1 -VmPassword $pwd -CertificateSubjectName $subname -CertificatePassword $pwd -OS UbuntuServer1604
Three-node cluster:
    New-AzureRmServiceFabricCluster -ResourceGroupName $RGname -Location $clusterloc -ClusterSize 3 -VmPassword $pwd -CertificateSubjectName $subname -CertificatePassword $pwd -OS WindowsServer2016DatacenterwithContainers
21
Deploy Test Clusters through Portal
Demo #3
22
Scaling an Azure cluster in or out
Safest option: use Azure PS to perform this operation.
Add nodes:
    Add-AzureRmServiceFabricNode -ResourceGroupName $RGname -Name $clusterName -NodeType $nodeType -Number $addNumNodes
Remove nodes:
    Remove-AzureRmServiceFabricNode -ResourceGroupName $RGname -Name $clusterName -NodeType $nodeType -Number $removeNumNodes
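When scaling is driven by a target node count rather than a delta, a small helper can decide which of the two cmdlets applies and with what number. This is a sketch only; `Get-ScaleAction` is a hypothetical function, and it does not call Azure or check any minimum node count for the node type.

```powershell
# Hypothetical helper: maps a current and desired node count to the cmdlet
# named on the slide and the value to pass as -Number.
function Get-ScaleAction {
    param(
        [int] $CurrentNodes,
        [int] $DesiredNodes
    )
    $delta = $DesiredNodes - $CurrentNodes
    if ($delta -gt 0) {
        [pscustomobject]@{ Cmdlet = 'Add-AzureRmServiceFabricNode'; Number = $delta }
    } elseif ($delta -lt 0) {
        [pscustomobject]@{ Cmdlet = 'Remove-AzureRmServiceFabricNode'; Number = -$delta }
    } else {
        # Already at the desired size: nothing to run.
        [pscustomobject]@{ Cmdlet = $null; Number = 0 }
    }
}

$scaleOut = Get-ScaleAction -CurrentNodes 5 -DesiredNodes 8
$scaleIn  = Get-ScaleAction -CurrentNodes 8 -DesiredNodes 5

# With the module loaded, the result could then be applied, for example:
# & $scaleOut.Cmdlet -ResourceGroupName $RGname -Name $clusterName -NodeType $nodeType -Number $scaleOut.Number
```

Separating the decision from the call keeps the scaling logic testable without touching a live cluster.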
23
Scaling an Azure cluster in or out
Safest option: use Azure PS to perform this operation.
Add a new NodeType:
    Add-AzureRmServiceFabricNodeType -ResourceGroupName $RGname -Name $clusterName -NodeType $nodeType ……
Remove a NodeType:
    Remove-AzureRmServiceFabricNodeType -ResourceGroupName $RGname -Name $clusterName -NodeType $nodeType …..
24
Scale out a cluster using the PowerShell Module
Demo #4
25
Business Continuity Planning
- Keep a written, updated Business Continuity Plan
- Define what your RPO and RTO are
  - RPO: the Recovery Point Objective determines the amount of data you can afford to lose in a disaster
  - RTO: the Recovery Time Objective is the maximum tolerable length of time that your service can be down after a disaster occurs
- Back up your application state to meet your RPO
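Meeting an RPO comes down to a simple comparison: the age of the most recent backup is the data you would lose if a disaster struck now, so it must not exceed the RPO. The sketch below illustrates that check; `Test-RpoMet` and the sample times are hypothetical, and how backups are actually taken is outside its scope.

```powershell
# Hypothetical helper: returns $true when the newest backup is recent enough
# that a disaster at $Now would lose no more data than the RPO allows.
function Test-RpoMet {
    param(
        [datetime] $LastBackup,
        [timespan] $Rpo,
        [datetime] $Now = (Get-Date)
    )
    # Everything written since the last backup is what would be lost.
    ($Now - $LastBackup) -le $Rpo
}

# Fixed reference time so the example is repeatable.
$now = Get-Date '2018-04-12T13:04:00'
$rpo = New-TimeSpan -Minutes 15

$recentOk = Test-RpoMet -LastBackup ($now.AddMinutes(-10)) -Rpo $rpo -Now $now
$tooOld   = Test-RpoMet -LastBackup ($now.AddMinutes(-20)) -Rpo $rpo -Now $now
```

In practice this means the backup interval has to be no longer than the RPO, with some margin for the time a backup takes to complete.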
26
Disasters and suggested mitigations
Types of disasters, with mitigations for two targets: "RPO and RTO = 0, write latency acceptable" and "RPO and RTO > 0".
- Data center outages
  - RPO and RTO = 0: cross-regional SF cluster
  - RPO and RTO > 0: stand up a new cluster, restore from backup
- Cluster down (very low probability for cross-regional clusters)
- Machine / node down: deploy across 5+ FDs and 5+ UDs, design for write quorum losses
- Other sources of data loss or "oops": restore from backup
Notes: This matrix represents suggested mitigations. The actual mitigation that you adopt depends on your application and business continuity plans.
27
Scenarios to monitor and why
- Cluster and node state
  - Is the cluster healthy? Are all the nodes up?
  - Why: detect and diagnose hardware and infrastructure issues
- Application and service state
  - Upgrade status, number of services and replicas
  - Why: detect software and app issues, reduce service downtime
- Resource usage
  - Do all the nodes need to be up? What is the average CPU usage?
  - Why: understand resource consumption and drive better business decisions
- Performance tracking
  - Is there any unexpected latency? Are the services responsive?
  - Why: optimize application, service, and infrastructure performance
- Custom application metrics
  - Is your app being used in the way that you expected? Is the solution effective?
  - Why: generate business insights and improvements
Notes: When it comes to monitoring, do not think only about your cluster, nodes, and application. Think about how you can use it to monitor resource usage, application performance, and the effectiveness of your application. You will need to add custom application metrics to determine whether your service is truly doing what it is supposed to do.
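Two of the questions above ("Are all the nodes up?" and "What is the average CPU usage?") can be answered from a simple node snapshot. The sketch below uses made-up data and a hypothetical `Get-ClusterSummary` function; in a real cluster the snapshot would come from whatever monitoring pipeline you have set up.

```powershell
# Hypothetical node snapshot for illustration only.
$nodes = @(
    [pscustomobject]@{ Name = 'node0'; Status = 'Up';   CpuPercent = 35 }
    [pscustomobject]@{ Name = 'node1'; Status = 'Up';   CpuPercent = 55 }
    [pscustomobject]@{ Name = 'node2'; Status = 'Down'; CpuPercent = 0 }
)

# Hypothetical helper: summarizes node state and CPU usage of the nodes that are up.
function Get-ClusterSummary {
    param([object[]] $NodeSnapshot)

    $up = @($NodeSnapshot | Where-Object { $_.Status -eq 'Up' })

    [pscustomobject]@{
        AllNodesUp          = ($up.Count -eq $NodeSnapshot.Count)
        DownNodes           = @($NodeSnapshot | Where-Object { $_.Status -ne 'Up' } | ForEach-Object { $_.Name })
        # Average over nodes that are up; $null when no node is up.
        AverageCpuOfUpNodes = if ($up.Count) { ($up | Measure-Object -Property CpuPercent -Average).Average } else { $null }
    }
}

$summary = Get-ClusterSummary -NodeSnapshot $nodes
$summary
```

A summary like this is a natural input for automated alerts, for example raising one whenever `AllNodesUp` is false.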
28
Monitoring and Diagnostics
Service Fabric out-of-the-box monitoring:
- Operational events (high-level cluster and node events)
- Health reports and load balancing decisions made by the system
- Reliable Services events
- Reliable Actors events
- System logs (used by us to provide support for your cluster)
Best practices:
- Always enable diagnostics
- Generate custom traces in your applications and services
- Set up automated monitoring alerts
- Watchdog Service (github.com/Azure-Samples/service-fabric-watchdog-service)
Read more on monitoring and diagnostics in this document
29
Setting up monitoring and diagnostics at cluster creation
Demo #5
30
The Road Ahead
- Azure CLI: same functionality as the Azure PS
- Application and services as ARM resources
- Patch Orchestration Application
- Standalone offerings of Linux clusters
31
Recap
- Use the Service Fabric Azure PowerShell Module
- Best practices for planning your cluster
- Best practices for securing your cluster
- Best practices for business continuity planning
- Monitoring and diagnostics
Notes:
- Make your E2E operational scenarios easier by using the Azure ServiceFabric RM module
- Adopt the best practices for planning, deploying, and securing your clusters
- Write down a business continuity plan; disasters happen, and it is best to be prepared for them
- Leverage all the out-of-the-box monitoring and diagnostics capabilities