300
Use case
Max 100 ms for a round-trip! (<100 ms)
QPS!!!
Throughput only ~5000 QPS
~25% failed requests
GET POST
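To make the use-case numbers concrete, here is a minimal sketch of how a team might check a window of request samples against the 100 ms round-trip budget and compute the failure ratio and QPS. The `Sample` type, the `summarize` function and the field names are hypothetical and are not taken from the talk; only the 100 ms budget comes from the slide.

```python
from dataclasses import dataclass

# From the slide: at most 100 ms per round-trip.
LATENCY_BUDGET_MS = 100.0


@dataclass
class Sample:
    """One observed request (hypothetical shape, for illustration only)."""
    latency_ms: float
    ok: bool  # False if the request failed outright


def summarize(samples, window_s):
    """Return QPS, failed ratio and over-budget ratio for one time window."""
    total = len(samples)
    if total == 0 or window_s <= 0:
        raise ValueError("need at least one sample and a positive window")
    failed = sum(1 for s in samples if not s.ok)
    # Successful requests that still broke the latency budget.
    over_budget = sum(
        1 for s in samples if s.ok and s.latency_ms > LATENCY_BUDGET_MS
    )
    return {
        "qps": total / window_s,
        "failed_ratio": failed / total,
        "over_budget_ratio": over_budget / total,
    }


if __name__ == "__main__":
    window = [Sample(50, True), Sample(120, True), Sample(80, False), Sample(30, True)]
    print(summarize(window, window_s=2.0))
```

A summary like this could feed a dashboard or an alert rule; the thresholds to alert on (for example, a failed ratio near the ~25% from the slide) are left to the team that owns the application.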
What we monitor
Servers
Applications
Latencies
Money
Team performance
AWS
Network devices
DEV, PM and Metrics
The SCRUM team monitors its own applications.
The SCRUM team decides and implements what is monitored and how.
Product managers monitor the business side.
If a “night watch” is critical, OPS gets involved.
Why it worked
True value
Right tools
No man in the middle
Management support