2002 Called; They want their rrdtool shell scripts back Dave Josephsen dave@dbg.com
A Brief History of Time-Series Data Visualization Architectures
A Tale of 3 Sysadmins
Jer, Per, and quitter (aka Dave) 2012
Jer: traditional needs for Fortune-500 Suitcorp
- >5,000 hosts
- >20,000 services
- 1 nine-story office building
- Plenty of budget
- Beefy hardware
- 1.5m/1000 hosts
Nagios + NG + Drraw (ho-hum)
Per: near-real-time data from lots of hosts
- Singularity.gov
- 80,000 hosts in 80 clusters
- No budget
- Mad scientists
- No measurable impact allowed
- 15-second polling interval (max)
- CPU, Mem, Disk, Net
- Needs to alert on performance thresholds
Enter Ganglia
That's all fine, but what about Nagios?
Awesome Nagios integration
- Easily send data from Nagios to Ganglia with gmetric
- Monitor server metrics stored in Ganglia from Nagios with a series of included Nagios plug-ins:
  - Check host heartbeat
  - Check a single metric on a specific host
  - Check multiple metrics on a specific host
  - Check multiple metrics on a set of hosts
  - Verify a single metric is the same on a set of hosts
- Display Ganglia graphs in Nagios via the Gweb URL interface
- Monitor Ganglia with Nagios (duh)
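As a rough sketch of the "send data from Nagios to Ganglia with gmetric" bullet: the helper below only builds a gmetric command line from a value a Nagios check already produced. The metric name `nagios_http_time` and the helper `gmetric_command` are made up for illustration, and it assumes gmetric is installed on a host with a working gmond config.

```python
def gmetric_command(name, value, metric_type="float", units=""):
    """Build (but do not run) a gmetric invocation for one metric.

    Uses gmetric's long options --name, --value, --type and --units.
    """
    cmd = [
        "gmetric",
        f"--name={name}",
        f"--value={value}",
        f"--type={metric_type}",
    ]
    if units:
        cmd.append(f"--units={units}")
    return cmd


if __name__ == "__main__":
    # Hypothetical metric: response time scraped from a Nagios HTTP check.
    print(" ".join(gmetric_command("nagios_http_time", 0.123, units="seconds")))
```

To actually publish the value, hand the list to `subprocess.run(cmd, check=True)` from whatever hook processes your Nagios performance data.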
Not just for mad scientists with supercomputers
Ganglia is a great fit if
- You want to offload performance data processing
- You're worried about scale
- You want a super-lightweight metric-gathering agent
- You need near-real-time data
- You want a really great rrdtool front end
  - Drag scaling, trend lines, Holt-Winters forecasting, time shifts
  - Lots more
Quitter.. er.. Dave: graph everything, always
- Massive Ginormic DevOps “paradise” (nightmare)
- Visualize datapoints on irregular intervals
  - Code promotions
  - Function calls
- LOTS of metrics (millions)
- Centralized time-series visualization for LOTS of very different data sources
  - Nagios
  - Application instrumentation
  - Sales... thingies
Enter Graphite
Life after RRDTool
- Carbon
  - Trivial, remote updates
  - Smart buffering/caching
  - Horizontal scalability
- Whisper
  - Automatic provisioning
  - Interval-agnosticism
  - Type agnosticism
- Graphite
  - Functions!
  - Typeglobs!
Graphic stolen from: http://www.aosabook.org/en/graphite.html
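To make "trivial, remote updates" concrete, here is a minimal sketch of Carbon's plaintext protocol: one `path value timestamp` line per datapoint, written to a TCP socket. The hostname `graphite.example.com` and the metric path are placeholders; port 2003 is the usual plaintext listener, but check your own carbon.conf.

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one datapoint as 'path value timestamp\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_to_carbon(lines, host="graphite.example.com", port=2003):
    """Send already-formatted lines to a Carbon plaintext listener.

    host is a placeholder; port 2003 is assumed to be the plaintext port.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))


if __name__ == "__main__":
    # An irregular event (a code promotion) recorded as a single datapoint.
    print(carbon_line("deploys.webapp.promoted", 1, 1325376000), end="")
```

Nothing has to be provisioned first: the point of the slide is that a new metric path simply shows up once the first line for it arrives.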
Not just for billion-dollar mega-giants
Graphite works great if
- You want to combine data from multiple monitoring systems
  - Nagios, Ganglia, Collectd, etc.
- You want to assimilate data from other groups or business units
  - Dev, Sales, etc.
- You want really flexible centralized visualization that scales
- You want to empower non-ops groups to explore their own data
Functions!
Say you have counter data:
  &target=router1.bytes&target=router2.bytes
OR, with a glob:
  &target=router[12].bytes
Rate is the derivative of the counter:
  &target=derivative(router1.bytes)
But actually, the raw counter data is kind of interesting if we visualize it correctly:
  &target=router1.bytes&target=secondYAxis(router2.bytes)
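For anyone wondering what "rate is the derivative of the counter" means in practice, here is a small sketch of the idea in plain Python: each output point is the difference between a sample and the one before it. This illustrates the concept and is not Graphite's own code; the sample values are invented.

```python
def derivative(samples):
    """Return the change between consecutive samples.

    The first point has no predecessor, so it is None; a gap (None) on
    either side of a pair also yields None.
    """
    if not samples:
        return []
    out = [None]
    for prev, cur in zip(samples, samples[1:]):
        if prev is None or cur is None:
            out.append(None)
        else:
            out.append(cur - prev)
    return out


if __name__ == "__main__":
    # Invented byte-counter readings with one missed poll.
    print(derivative([100, 150, 210, None, 400]))
```

Dividing each difference by the polling interval would turn it into a per-second rate.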
Moar functions!
  &target=user.registrations
  &target=summarize(user.registrations,"1h")
  &target=summarize(user.registrations,"1h")&target=threshold(400,"goal")
  &target=summarize(user.registrations,"1h")&target=timeShift(summarize(user.registrations,"1h"),"30d")&target=threshold(400,"goal")
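Those `&target=` strings get long and quote-heavy, so one way to keep them honest is to let a script assemble and escape the render URL. The sketch below assumes a Graphite web app at the placeholder address `http://graphite.example.com`; `render_url` and `targets_of` are illustrative helpers, not part of Graphite.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def render_url(base, targets, params=None):
    """Build a /render URL with one target= parameter per expression."""
    query = [("target", t) for t in targets]
    query += sorted((params or {}).items())
    return f"{base.rstrip('/')}/render?{urlencode(query)}"


def targets_of(url):
    """Decode the target expressions back out of a render URL."""
    return parse_qs(urlsplit(url).query).get("target", [])


if __name__ == "__main__":
    hourly = 'summarize(user.registrations,"1h")'
    url = render_url(
        "http://graphite.example.com",  # placeholder host
        [hourly, f'timeShift({hourly},"30d")', 'threshold(400,"goal")'],
        {"from": "-7d"},
    )
    print(url)
    print(targets_of(url))
```

Building the nested `timeShift(summarize(...))` expression from a variable also avoids the unbalanced-quote typos that creep in when these are typed by hand.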
Nagios World Conference 2011
OK BYE!
- http://ganglia.sourceforge.net
- https://launchpad.net/graphite
- http://www.aosabook.org/en/graphite.html
(and speaking of “buy”...)