Troubleshooting beyond what you understand

Presentation transcript:

Troubleshooting beyond what you understand
Or: How to figure out what’s broken so you can get some help from the real owner, because your stuff never breaks. Right?
Ryan McCauley
#492 – Phoenix 2016

Ryan McCauley
VB6/VB.NET developer for 10 years
Full-time DBA/T-SQL dev for 6 years
Currently employed by Cable ONE as Data and Reporting Manager
Microsoft Certified Professional (MCTS – SQL 2008 DBA)
Active on Experts-Exchange and StackOverflow
Twitter: @SQLRyan
Blog: www.trycatchfinally.net
Email: Ryan@KilaniMcCauley.com
SQL SATURDAY | #492 | PHOENIX 2016

It Was a Dark and Stormy Night
Also, applications are broken somewhere…
Talk about the rotating DNS issue
Connections to SQL Server are intermittent
Information comes in slowly

Agenda Today
Ground rules
Techniques
Major symptoms
Common confusion
Next steps

Ground Rules

Never say “randomly”; say “intermittent”.
“Intermittent” is something you don't yet understand, but it always has a cause.
When you say "random", you're saying you can't own it because it's not in your control.
Given the same inputs, the behavior of computers is always consistent.
It’s not just your components. Consider their interaction and what’s around them.
See everything as something you own and can influence – you’re not helpless.

Something always changed! Always! Just maybe not on purpose.
Don’t take anything for granted, both in this class and in troubleshooting.
Monitoring only has a single perspective.

Techniques

Figure out what it’s not.
If that’s true, what else would be true?
Make the problem as small as possible. You need to isolate it to prove it, especially to others.
Does it work at all? Where can you connect from?
Myers-Briggs: S (focus on resolving the examples) vs. N (every example needs to fit a pattern first).
Reproduce the problem in a second location with as much different as possible.

Is it consistent? Can you find somewhere it’s not broken?
How long does it take when it does run? Does it vary? Is it quick or slow?
Are the same sources always broken?
Example, the DAC FTP issue: 1 server takes 0.5 seconds, the other 7 take 12-14 seconds, even for a failed login.
Which components are shared vs. dedicated?
VMs complicate this because everything is shared and live migration is seamless.
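
To put numbers on "does it vary?", one option is to time the same operation several times from each source and compare the spread. The sketch below is a minimal illustration, not something from the talk: `time_attempts` and `summarize` are hypothetical helper names, and `action` is whatever callable reproduces the problem (a login, a query, a file transfer).

```python
import statistics
import time


def time_attempts(action, attempts=5):
    """Run `action` several times; record (succeeded, seconds) for each run."""
    results = []
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            action()
            ok = True
        except Exception:
            # A failure is still a data point: how long did it take to fail?
            ok = False
        results.append((ok, time.perf_counter() - start))
    return results


def summarize(results):
    """Reduce the timings to the figures worth comparing between sources."""
    durations = [elapsed for _, elapsed in results]
    return {
        "attempts": len(results),
        "failures": sum(1 for ok, _ in results if not ok),
        "min_s": min(durations),
        "median_s": statistics.median(durations),
        "max_s": max(durations),
    }
```

Running this from a working source and from a broken one gives two summaries to compare. A failure that takes a consistent 12 seconds is a different problem from one that fails immediately.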

Simplify everything!
Things your service depends on
How they get to your service
Your service
Customers

Major symptom – cheat sheet

Major Symptoms, part 1
Never works: firewall, or app not listening.
Intermittently not accessible: what’s changing? Load balancer/cluster?
Always slow but consistent: hardware config/resource; likely not load on shared components.

Major Symptoms, part 2
Intermittent slowness: hardware bottleneck or shared resource?
Unchanging or predictable: more likely configuration.
Shifting or unpredictable: more likely capacity somewhere.
VM as a shared component: harder to see the impact.
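
The cheat sheet can also be kept as a small lookup table next to your runbooks. This is only a sketch that transcribes the two slides above. The dictionary keys, the `first_suspects` name, and the fallback message are illustrative choices, not part of the talk.

```python
# Symptom -> first suspects, transcribed from the two cheat-sheet slides.
# The key wording and the fallback text are assumptions made for illustration.
FIRST_SUSPECTS = {
    "never works": "firewall, or app not listening",
    "intermittently not accessible": "something is changing: load balancer or cluster?",
    "always slow but consistent": "hardware config/resource; likely not load on shared components",
    "intermittent slowness, predictable": "more likely configuration",
    "intermittent slowness, unpredictable": "more likely capacity somewhere, possibly a shared VM host",
}


def first_suspects(symptom):
    """Return the first place to look for a symptom, ignoring case and padding."""
    key = symptom.strip().lower()
    return FIRST_SUSPECTS.get(key, "unknown symptom: make the problem smaller first")
```

For example, `first_suspects("Never works")` returns `"firewall, or app not listening"`.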

Common Confusion

Login failures vs. firewall timeouts.
Ever used TCPING? Know common ports!
Ping isn’t the same as making sure the path is open! Ping doesn’t use a TCP port at all.
Firewall rules: when are they evaluated?
If somebody says “Kerberos”, it’s probably not.
Talk about subnets/VLANs.
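
If TCPING isn't available, a few lines of Python can do the same job: attempt a real TCP connection to the port and see how it fails. This is a minimal sketch with a hypothetical helper name, `check_tcp_port`. It assumes that a refused connection means the host answered but nothing is listening, and that a timeout usually means something on the path (often a firewall) silently dropped the traffic.

```python
import socket


def check_tcp_port(host, port, timeout=3.0):
    """Attempt a TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            # Handshake completed: the path is open and something is listening.
            return "open"
    except ConnectionRefusedError:
        # The host replied with a reset: reachable, but no listener on this port.
        return "refused"
    except socket.timeout:
        # No reply at all within the timeout: often a firewall dropping packets.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. name resolution failure or no route to host.
        return f"error: {exc}"
```

Calling `check_tcp_port("sqlhost.example.com", 1433)` (a placeholder host name, with 1433 as the default SQL Server port) separates "the port is blocked" from "the port is open but the login failed", which ping alone cannot do.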

Slightly less dark and stormy…
Let’s approach our outage again
Resolve the DNS issue
If time, talk about either:
Firewall timeouts when we moved reporting servers (5 minutes)
Mis-aligned disks on clusters = consistently slow read times
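
The transcript does not spell out the details of the rotating DNS issue. One common shape for that kind of problem is a host name that resolves to several addresses, only some of which work, so connections succeed or fail depending on which address the client happens to get. Under that assumption, the sketch below tests every address behind a name individually instead of trusting a single lookup. `resolve_all` and `probe_each_address` are hypothetical helper names.

```python
import socket


def resolve_all(hostname, port):
    """Return every distinct address the name currently resolves to."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def probe_each_address(hostname, port, timeout=3.0):
    """Try a TCP connection to each resolved address and report per address."""
    results = {}
    for address in resolve_all(hostname, port):
        try:
            with socket.create_connection((address, port), timeout=timeout):
                results[address] = "open"
        except OSError as exc:
            # Record the failure type so good and bad addresses can be compared.
            results[address] = f"failed: {exc.__class__.__name__}"
    return results
```

If one address reports `open` and another reports a failure, the "intermittent" connection problem has become a consistent one tied to a specific address, which is much easier to hand to the real owner.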

Next Steps

Learn about what you don’t know: shadowing, training, ask!
Specialized knowledge is not required, but it can help.
If you don’t understand a concept, ask.
It’s not resolved until you understand why! Root cause analysis is critical.
Don’t let “root cause analysis” be “it’s not happening anymore” or “it resolved itself”. It’s not resolved until you know it’s not going to happen again!

Thanks for attending, and visit the sponsors!
