Sevenmachines

Monitoring

Monitoring

Rolling instance cycling with elasticsearch

An easy way to cycle EC2 instances where we have an Elasticsearch cluster running. As an example target we have two instances, i-11111111 and i-22222222, both running Elasticsearch as a cluster with replicas set to 2, so that each has a replica of the other's primary indices. Add one to the auto-scaling group, increasing desired to 3. Wait for new instance i-333333 to join the cluster.
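The "wait for the new instance to join" step can be sketched as a small polling loop. This is only an illustration, not the post's own script: `wait_for_nodes` and `get_health` are hypothetical names, `get_health` stands in for whatever fetches the JSON from Elasticsearch's `_cluster/health` endpoint (which reports `status` and `number_of_nodes`), and also waiting for a green status before moving on is an assumption added here.

```python
import time
from typing import Callable, Dict


def wait_for_nodes(
    get_health: Callable[[], Dict],
    expected_nodes: int,
    timeout_s: float = 600.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll cluster health until the expected node count is reached.

    get_health is a hypothetical callable returning the parsed JSON of
    GET /_cluster/health. Returns True once the cluster reports at least
    expected_nodes nodes and a green status (the green check is an
    assumption, not something the post states), False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        health = get_health()
        joined = health.get("number_of_nodes", 0) >= expected_nodes
        green = health.get("status") == "green"
        if joined and green:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)
```

With the example above, you would raise desired to 3 and then call `wait_for_nodes(fetch_health, expected_nodes=3)` before touching either of the old instances, where `fetch_health` is your own function for querying the cluster.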

November 12, 2016

Monitoring

Beats Metrics Play

Currently I’m using the logging setup of Beaver shipping logs into an ELK stack, and metrics with collectd shipping metrics into a Graphite stack. Now that Elastic have Beats that do both logging and metrics, it's worth exploring further.

June 12, 2016

Monitoring

Continuous Load for Live Services

Just as you start off on a Monday morning, at 9:01am, there’s a page: that crucial, heavily used site is broken, and users are blocked from working and frustrated. What went wrong?

April 30, 2016

Monitoring

Triage for Incident Response

One of the main pressures around response to incidents is simply being overwhelmed with tasks. The outcome of so many demands and so much context-switching can easily be chaos, or poor-quality quick fixes. As with all real-time response, the key thing is to take a step back and triage the incoming requests as they arrive, prioritising those we need to deal with first and deferring those that we can tackle later.

April 30, 2016

Monitoring

Elasticsearch: A Struggling Master

Quick walkthrough of a problem on a 3-node Elasticsearch cluster, first noticed with the generic yellow/red cluster warning. The chain of events causing the problem looks like…

January 9, 2016

Monitoring

Introduction to Sensu

The slides from a quick review of Sensu. In short, Sensu is good! RabbitMQ is the only point of communication needed between clients and servers. Set up your client-customisable subscription checks on the server. Set up any weird custom checks on your clients. Please, please don’t alert on anything but the essentials. Really, the above ^^^

January 2, 2016

Monitoring

Introduction to Cloudwatch

Some slides from an investigation into migrating to Amazon's Cloudwatch. Quick summary: create metrics on Cloudwatch logging streams and alert on them, e.g. the number of 500s in a minute. You get basic free metrics from AWS, and custom metrics are pretty easy to set up. You have access to plenty of AWS-specific metrics and triggers. They are well integrated with other AWS stuff, so you can do more advanced Lambda processing. But is it enough to move away from your custom ELK/Graphite type stack?
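To make the "number of 500s in a minute" idea concrete, here is a rough local sketch of the counting and thresholding that a metric on a log stream plus an alarm would do. It does not call any AWS API; the function names, the `(timestamp, status)` event shape, and treating every 5xx status as a "500" are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List, Tuple


def count_5xx_per_minute(events: Iterable[Tuple[datetime, int]]) -> Counter:
    """Count server-error responses per minute.

    events is a hypothetical stream of (timestamp, HTTP status) pairs,
    e.g. parsed from access logs. Counting all 5xx codes, not just 500,
    is an assumption.
    """
    counts: Counter = Counter()
    for ts, status in events:
        if 500 <= status <= 599:
            # Truncate the timestamp to the start of its minute.
            counts[ts.replace(second=0, microsecond=0)] += 1
    return counts


def minutes_over_threshold(counts: Counter, threshold: int) -> List[datetime]:
    """Return the minutes whose error count exceeds the threshold, in order."""
    return sorted(minute for minute, n in counts.items() if n > threshold)
```

Each minute returned by `minutes_over_threshold` is the kind of datapoint you would want an alarm to fire on; picking the threshold is the part that needs real traffic data.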

January 2, 2016