Bryan Heden Lead Solutions Provider

Introduction & Agenda
● What we do and what we needed
● Customized and configured
● IO issues...
● Offload all the things!
● What did we learn?
● What’s next?

Agile Networks: Who We Are
● Telecom Provider
● We Engineer and Operate The Agile Network, a general purpose backhaul network with Last-Mile Agility™

Who We Serve
● Public Sector … particularly Public Safety
● Oil/Gas
● Underserved Communities
● Enterprise

Customer Examples

What We Needed
● General insight into network health
● Ability to maintain SLAs with customers
● To react to network downtime as fast as possible (the Government doesn’t like to wait)
● To monitor traffic across the network

What We Did
● We chose Nagios XI
● Easy to use and understand interface
● No more text-based configurations to manage (Haha, just kidding!)
● Built on top of something we were already comfortable with

How'd That Go?
● It worked, but not exactly how we wanted it to
● “WHAT DO YOU MEAN IT DOESN'T AUTOMATICALLY TRANSLATE OIDS INTO HUMAN UNDERSTANDABLE ENGLISH?”
● “YOU MEAN TO TELL ME THAT OUR EQUIPMENT DOESN'T COME STANDARD WITH NAGIOS PLUGINS OR THAT NAGIOS DOESN'T PRODUCE ONE FOR EACH TYPE OF DEVICE WE USE?!”
● Etc.
● Ping worked just fine

If You Build It... The Network Engineers Will Use It
● We wrote our own configuration wizards for each different type of device (PTP, PTMP, Routers, Power, GPS)
● We made some maps
● Executives love maps!
● One map tracked health of devices/links between sites along with radar
● Another map tracked the operating frequencies of active devices

Finally, Some Pictures!
The NOC Overview Map provides our teams insight into the health of every node and their connections on our network.

More Pictures
Our Network Engineers can see from a central source what the health and operating frequencies are of our equipment.

And More Pictures
My custom-built configuration wizards keep our teams working on what they need to work on and allow me to be hands off with system additions.

Stress Testing in Production
● We reached maximum occupancy
● Our existing server setup wasn't meant for active checks for this many hosts and services
● We introduced ModGearman
● We offloaded MySQL
● Things got better, but we still had some problems...

IO is a Major Factor
● Lots of writes, not enough throughput
● There were suddenly more host and service checks than we could handle with our setup
● Running on a VM on an ESX Host with 2x10K drives in a RAID1
● Bandwidth was only graphing once every 10 to 20 minutes
● Upgraded the ESX Host drives to 6x15K RAID10
● Okay, okay! We upgraded some other stuff on the ESX Host, too
● This was the single most important decision we had made

But We Didn't Stop There!
● We offloaded MRTG
● Set up an NFS share for /var/lib/mrtg so that Nagios could read from it
● Set up an NFS share for /etc/mrtg so that Nagios could write to it, in order to add host configuration files
● Put both Virtual Machines on the same Host (17 Gb/sec network throughput)

MRTG
● MRTG had some issues of its own…
● We had to split the cron job into separate processes
● This keeps any single MRTG run from taking so long to complete its checks that it prevents the next run from starting
● (Remember the 5 to 20 minute graphing issue a few slides back?)
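The effect of one slow run can be put into numbers. The sketch below assumes cron fires every 5 minutes (the slides do not state the schedule) and that a scheduled run is skipped whenever the previous one is still in progress. The function name and the sample run times are hypothetical.

```python
import math


def effective_graph_interval(run_seconds: float, cron_interval: int = 300) -> int:
    """Return the seconds between completed runs when overlapping runs are skipped."""
    if run_seconds <= 0:
        raise ValueError("run_seconds must be positive")
    # A new run can only start on a cron tick after the previous run has finished,
    # so the real interval is the run time rounded up to a whole number of ticks.
    ticks_per_run = max(1, math.ceil(run_seconds / cron_interval))
    return ticks_per_run * cron_interval


if __name__ == "__main__":
    for seconds in (200, 700, 1100):  # hypothetical run times
        minutes = effective_graph_interval(seconds) // 60
        print(f"run of {seconds}s -> graphs update every {minutes} min")
```

Under these assumptions, any run that fits inside one cron interval keeps the graphs on the normal schedule, which is the point of splitting the work into smaller processes.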

Pictures of Text
Here is what MRTG’s cron file looks like after we’ve made our changes:

MRTG Process Splitting
● How we did it
● Split the configuration files into logical chunks by size and created separate cron entries for each
● /etc/mrtg/conf.d/ has multiple subdirectories (1/, 2/, 3/, etc.)
● Each corresponding process in cron loads the configuration files present in those directories (Include: /etc/mrtg/conf.d/X/*.cfg)
● We measure each process separately (run time, errors, standard output, logs)
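The splitting step above can be sketched roughly as follows. This is not the presenter's script: it assumes a greedy "largest file into the lightest chunk" strategy. The per-chunk top-level file /etc/mrtg/mrtg-N.cfg (which would hold the Include line), the lock-file path, the /usr/bin/mrtg location, and the 5-minute schedule are all hypothetical.

```python
import heapq
from pathlib import Path


def sizes_from_dir(directory: str) -> dict[str, int]:
    """Map each *.cfg file name in a directory to its size in bytes."""
    return {p.name: p.stat().st_size for p in Path(directory).glob("*.cfg")}


def split_by_size(sizes: dict[str, int], chunks: int) -> list[list[str]]:
    """Assign files to `chunks` groups so the total bytes per group stay close."""
    if chunks < 1:
        raise ValueError("chunks must be at least 1")
    groups: list[list[str]] = [[] for _ in range(chunks)]
    heap = [(0, i) for i in range(chunks)]  # (total bytes so far, group index)
    heapq.heapify(heap)
    # Place the largest files first, always into the currently lightest group.
    for name in sorted(sizes, key=lambda n: (-sizes[n], n)):
        total, index = heapq.heappop(heap)
        groups[index].append(name)
        heapq.heappush(heap, (total + sizes[name], index))
    return groups


def cron_entry(chunk: int, minutes: int = 5) -> str:
    """Build one /etc/cron.d style line for chunk N (paths are assumptions)."""
    return (
        f"*/{minutes} * * * * root env LANG=C /usr/bin/mrtg "
        f"/etc/mrtg/mrtg-{chunk}.cfg "  # hypothetical file containing the Include line
        f"--lock-file /var/lock/mrtg/mrtg_{chunk}.lock"  # one lock per process
    )


if __name__ == "__main__":
    example = {"a.cfg": 50, "b.cfg": 30, "c.cfg": 20, "d.cfg": 10}  # made-up sizes
    for number, files in enumerate(split_by_size(example, 2), start=1):
        print(f"conf.d/{number}/ <- {files}")
        print(cron_entry(number))
```

Giving each process its own lock file matters in this sketch: with a shared lock, the separate cron entries would block one another and the split would gain nothing.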

What Else?
● We did some other things, too…
● We installed and offloaded SmokePing
● We created a SmokePing Nagios XI component to increase visibility of our graphs in our NOC
● We built a portal to SmokePing for a particular client to log in and check device health
● We created a ModGearman Nagios XI component to manage our servers from a central location

SmokePing Component
This component keeps our gateway graphs up at all times so we can keep an eye on them, and then rotates graphs from other hosts in each zone so we can (hopefully) notice inconsistencies when they arise.

SmokePing Portal
We use a portal that parses the config file for SmokePing hosts, pings them, and shows current status. It also allows the portal user to ping those hosts.
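The parsing half of such a portal might look like the sketch below. It is not the portal's actual code: it assumes the usual SmokePing layout, where a "*** Targets ***" section holds "+" and "++" headings with "host = ..." lines beneath them. The function name and the sample zones and addresses are hypothetical, and the ping step is left out.

```python
import re

SAMPLE = """\
*** General ***
owner = NOC

*** Targets ***
probe = FPing
menu = Top
+ ZoneA
menu = Zone A
++ gateway
host = 192.0.2.1
++ radio1
host = 192.0.2.10
+ ZoneB
++ gateway
host = 198.51.100.1
"""


def parse_smokeping_hosts(config_text: str) -> dict[str, str]:
    """Map each target path (for example 'ZoneA/gateway') to its host value."""
    hosts: dict[str, str] = {}
    path: list[str] = []
    in_targets = False
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        section = re.fullmatch(r"\*\*\*\s*(\w+)\s*\*\*\*", line)
        if section:
            # Only the Targets section describes hosts to ping.
            in_targets = section.group(1) == "Targets"
            path = []
            continue
        if not in_targets:
            continue
        heading = re.fullmatch(r"(\++)\s*(\S+)", line)
        if heading:
            # The number of '+' signs gives the nesting depth of this target.
            depth = len(heading.group(1))
            path = path[: depth - 1] + [heading.group(2)]
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "host" and path:
            hosts["/".join(path)] = value.strip()
    return hosts


if __name__ == "__main__":
    for target, host in parse_smokeping_hosts(SAMPLE).items():
        print(f"{target}: {host}")
```

Keying the result by the full target path keeps hosts with the same name in different zones (two gateways here) from overwriting each other.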

ModGearman Component
I was tired of having to repeatedly log in to each ModGearman instance to tweak something when we were still getting everything set! So I wrote this to make my life a little bit easier.

Conclusion
● What did we learn?
● Nagios XI can be extended far beyond the default behavior
● Custom Configuration Wizards, Plugins and Components
● Custom MRTG installations and scans used in the Config Wizards
● IO will become an issue, and should be planned for
● How to build a process for creating customizations
● Offload what you can!

What’s Next?
Big Plans!
● Automating the MRTG Process Splitting
● Releasing a generic and well-documented Configuration Wizard Template
● Continuing to grow and expand our current installation

Questions
● Do you have any?