Monitoring for large infrastructure
Radu-Andrei Busnatu, Razvan Dobre
Content
  - About Monitoring
  - Graphite
  - How we deployed Graphite
  - Performance
About Monitoring
Monitoring: observe how the system evolves
  - Graphite
  - OpenTSDB
  - InfluxDB
  - etc.
Alerting: notify when system metrics get out of order (covered in another talk)
  - Nagios
  - Icinga
  - Zabbix
Graphite
What does it do?
  - Stores numeric time-series data
    - In Whisper files (better than RRD)
    - As local files on disk
  - Renders graphs of the data on demand
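Graphs are rendered on demand through HTTP requests to Graphite web. The sketch below builds such a request URL, assuming the standard `/render` endpoint with its `target`, `from`, and `format` parameters. The host name and the helper `build_render_url` are hypothetical and only serve as an illustration.

```python
from urllib.parse import urlencode


def build_render_url(base_url, target, time_from="-1h", fmt="png"):
    """Build a render URL for one metric (hypothetical helper).

    base_url  -- address of the Graphite web instance (assumed)
    target    -- dotted metric path to draw
    time_from -- relative start of the time window
    fmt       -- output format, e.g. "png" for a graph or "json" for raw data
    """
    query = urlencode({"target": target, "from": time_from, "format": fmt})
    return f"{base_url.rstrip('/')}/render?{query}"


# Example with a placeholder host name.
url = build_render_url("http://graphite.example.com/", "local.random.diceroll")
print(url)
```

Requesting `format=json` instead of `png` returns the data points rather than an image, which is how external dashboards typically read from Graphite.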
Graphite - Components
Carbon: the Graphite backend daemons
Subcomponents
  - carbon-relay: metrics forwarding and sharding
  - carbon-aggregator: metrics aggregation
  - carbon-cache: metrics collection server
    - Listens on port 2003
    - Metric format: a.b.c.d.e value timestamp
    - How to send a metric:
        echo "local.random.diceroll 4 `date +%s`" | nc -q0 ${SERVER} ${PORT}
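The same metric can be sent from application code instead of `nc`. This is a minimal sketch, assuming the plaintext line format `path value timestamp` terminated by a newline, as in the `echo` example on this slide. The function names `format_metric` and `send_metric` are hypothetical.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Return one plaintext line: '<path> <value> <timestamp>\\n'."""
    if timestamp is None:
        timestamp = int(time.time())  # same role as `date +%s`
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(host, port, path, value, timestamp=None):
    """Open a TCP connection to carbon, send one metric line, and close."""
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Formatting only; no network connection is made here.
print(format_metric("local.random.diceroll", 4, 1700000000), end="")
```

A call such as `send_metric(server, 2003, "local.random.diceroll", 4)` would then deliver the line to the collection port mentioned on the slide.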
Graphite - Components
Graphite web
  - Django web app for viewing metrics and creating basic dashboards
  - Supports memcached for caching
  - Requires a database for authentication and for saving dashboards
Whisper
  - File-based time-series database
  - Better than RRD: supports backfilling
  - One file per metric
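"One file per metric" has a direct effect on disk layout and capacity planning. The sketch below shows how a dotted metric name could map to a file path and how large such a file would be. It assumes the usual Whisper layout (dots become directories, `.wsp` extension) and on-disk sizes of 16 bytes of metadata, 12 bytes per archive header, and 12 bytes per data point. The storage root and both helper names are assumptions for illustration.

```python
import posixpath

# Assumed Whisper on-disk sizes, in bytes.
METADATA_SIZE = 16      # file-level header
ARCHIVE_INFO_SIZE = 12  # one header entry per archive
POINT_SIZE = 12         # 4-byte timestamp + 8-byte double value


def metric_to_path(metric, root="/opt/graphite/storage/whisper"):
    """Map 'a.b.c' to '<root>/a/b/c.wsp' (root is an assumed default)."""
    return posixpath.join(root, *metric.split(".")) + ".wsp"


def whisper_file_size(archives):
    """Estimate file size for a list of (seconds_per_point, points) archives."""
    header = METADATA_SIZE + ARCHIVE_INFO_SIZE * len(archives)
    return header + sum(points * POINT_SIZE for _, points in archives)


print(metric_to_path("local.random.diceroll"))
# One day at 10 s resolution plus one week at 60 s resolution.
print(whisper_file_size([(10, 8640), (60, 10080)]))
```

Because the file is preallocated for its whole retention, the total disk usage scales with the number of metrics rather than with how often they are written.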
Graphite - Architecture
How we deployed Graphite – v0
Carbon relay uses RELAY_METHOD = consistent-hashing
Pros
  - It worked for a small number of metrics
Cons
  - Metrics from a single monitored node were scattered across all cache nodes
  - Hard to clean up
  - CPU-intensive at the relay level
[Diagram: Graphite Web, two Carbon Relays, four Carbon Caches]
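The scattering listed under the cons follows from how consistent hashing places keys. The sketch below is a generic hash ring, not carbon's actual implementation: the MD5-based hash, the replica count, the cache names, and the metric names are all assumptions chosen for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Generic consistent-hash ring with virtual replicas per node."""

    def __init__(self, nodes, replicas=100):
        # Each node is placed on the ring many times to even out the load.
        self._ring = sorted(
            (self._hash(f"{node}:{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._keys = [position for position, _ in self._ring]

    @staticmethod
    def _hash(key):
        # First 32 bits of the MD5 digest, as an integer ring position.
        return int(hashlib.md5(key.encode("utf-8")).hexdigest()[:8], 16)

    def get_node(self, metric):
        # The first ring entry clockwise from the metric's position owns it.
        idx = bisect.bisect(self._keys, self._hash(metric)) % len(self._ring)
        return self._ring[idx][1]


caches = ["cache-a", "cache-b", "cache-c", "cache-d"]  # hypothetical names
ring = HashRing(caches)

# 200 metrics that all belong to one monitored host.
host_metrics = [f"servers.web01.metric{i}" for i in range(200)]
placement = {metric: ring.get_node(metric) for metric in host_metrics}
print(sorted(set(placement.values())))
```

Since the full metric name is hashed, metrics that share a host prefix still land on different caches, which is why removing one host's data means touching every cache node.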
How we deployed Graphite – v1
Carbon relay uses RELAY_METHOD = rules
Pros
  - It worked for a larger number of metrics
  - Easier to clean up
  - Lower impact in case of failure
  - Data replication
Cons
  - Python applications do not cope well with millions of metrics
[Diagram: Graphite Web, two Carbon Relays, four Carbon Caches]
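Rule-based relaying can be pictured as a first-match lookup from metric name patterns to destinations. This is a minimal sketch of that idea, not the relay's own code: the patterns, the cache names, and the `route` helper are hypothetical. Listing two destinations per rule is one way the data replication mentioned above could be expressed.

```python
import re

# Ordered rules: (pattern, destinations). All names are placeholders.
RULES = [
    (re.compile(r"^servers\.web\d+\."), ["cache-a", "cache-b"]),
    (re.compile(r"^servers\.db\d+\."), ["cache-c", "cache-d"]),
]
# Fallback for metrics that match no rule.
DEFAULT_DESTINATIONS = ["cache-a"]


def route(metric):
    """Return the destinations of the first rule whose pattern matches."""
    for pattern, destinations in RULES:
        if pattern.search(metric):
            return destinations
    return DEFAULT_DESTINATIONS


for name in ("servers.web01.cpu.user", "servers.db12.disk.free", "apps.checkout.latency"):
    print(name, "->", route(name))
```

With rules like these, all metrics of one group live on a known pair of caches, so cleanup and failure impact stay confined to that pair.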
How we deployed Graphite – v2
Replaced carbon-cache with:
  - go-carbon for data collection
  - carbonserver for data querying
  - carbonzipper as the aggregator in front of the carbonserver instances
Replaced carbon-relay with carbon-c-relay
Graphite web connects to carbonzipper
Added Grafana for proper dashboarding
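The role of the query aggregator (carbon-zipper) can be illustrated with a small merge function. This is a sketch under the assumption that each backend returns the same time window as a list of values, with `None` for missing points, and that gaps in one reply are filled from another. The function name `merge_series` is hypothetical and this is not the aggregator's actual code.

```python
def merge_series(replies):
    """Merge replies for one metric from several backends.

    replies -- list of lists of values; None marks a missing point.
    The first non-missing value seen at each position wins.
    """
    if not replies:
        return []
    length = max(len(reply) for reply in replies)
    merged = [None] * length
    for reply in replies:
        for i, value in enumerate(reply):
            if merged[i] is None and value is not None:
                merged[i] = value
    return merged


# Two replicas, each missing a different point.
print(merge_series([[1.0, None, 3.0], [None, 2.0, 3.5]]))
```

With replicated data, a merge of this kind lets a query succeed even when one storage node has gaps or is temporarily unavailable.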
Performance