Announcing Vigilance: An Extensible Dead Man's Switch System
Introducing Vigilance
Vigilance is a dead man’s switch system written in Haskell. The idea is that you register periodic tasks that you’ve configured elsewhere, such as:
- Backups
- Periodic billing
- Scripts that you run periodically
You tell vigilance, via a config file, the maximum time that may pass between runs of the task. You also tell it how to contact you if that deadline is missed (currently HTTP POST and email). The task can check in to vigilance with either the vigilance executable or via a REST API. If it doesn’t check in, you will be notified, and vigilance will hold off on further notifications until your task is back online and checking in again.
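The check-in and hold-off behavior described above can be modeled in a few lines. The following Python sketch is illustrative only and is not how vigilance itself is implemented (it is written in Haskell); the `Watch` class, its fields, and the `sweep` method are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Watch:
    """Hypothetical model of one watch; not vigilance's real data type."""
    name: str
    interval: float                  # max seconds allowed between check-ins
    notify: Callable[[str], None]    # called once when the deadline is missed
    last_checkin: float = 0.0
    notified: bool = False           # True while further alerts are held off

    def checkin(self, now: float) -> None:
        # A check-in resets the timer and re-arms notifications.
        self.last_checkin = now
        self.notified = False

    def sweep(self, now: float) -> None:
        # Fire at most one notification per missed-deadline episode.
        overdue = now - self.last_checkin > self.interval
        if overdue and not self.notified:
            self.notify(self.name)
            self.notified = True


if __name__ == "__main__":
    alerts: List[str] = []
    watch = Watch("backups", interval=36 * 3600, notify=alerts.append)
    watch.checkin(now=0)
    watch.sweep(now=35 * 3600)    # still within the interval: no alert
    watch.sweep(now=37 * 3600)    # deadline missed: one alert
    watch.sweep(now=48 * 3600)    # still down: held off, no second alert
    watch.checkin(now=50 * 3600)  # back online: re-armed
    print(alerts)
```

The `notified` flag is what keeps a task that stays down from producing a stream of alerts: only a fresh check-in clears it.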
I created vigilance after a couple of incidents in which I discovered that backups on my production systems or my development box had not been running. I wanted a way to set a failsafe so that I could at least be alerted when something like this happens.
Check it out on GitHub for the full documentation. Download vigilance via cabal install vigilance or from Hackage.
Example
Say I have daily backups that run on my server. I want to be notified if the server goes more than 36 hours without completing a backup.
First, I install vigilance from cabal:
cabal update && cabal install vigilance
This provides two executables, vigilance and vigilance-server. You can configure the build to not include the server on client boxes for a faster install with fewer dependencies. vigilance-server handles the state of your watches, notifications, etc. vigilance is what you will use to manage your watches and do check-ins.
Let’s write a config file to ~/.vigilance/server.conf (the default location):
# ~/.vigilance/server.conf
vigilance {
  port = 9999
  watches {
    backups {
      interval = [36, "hours"]
      notifications = [
        ["email", "me@example.com"],
        ["email", "joe@example.com"],
        ["http", "http://example.com/in-case-of-emergency"]
      ]
    }
  }
}
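To make the interval setting concrete, here is a small Python sketch of how a [36, "hours"] pair could be turned into a deadline check. It is not taken from vigilance’s source; the `UNIT_SECONDS` table, both function names, and the set of accepted unit names are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Assumed unit names; the units vigilance actually accepts may differ.
UNIT_SECONDS = {
    "seconds": 1,
    "minutes": 60,
    "hours": 60 * 60,
    "days": 24 * 60 * 60,
}


def interval_to_timedelta(interval):
    """Convert a [count, unit] pair such as [36, "hours"] to a timedelta."""
    count, unit = interval
    return timedelta(seconds=count * UNIT_SECONDS[unit])


def is_overdue(last_checkin, interval, now):
    """True when more than `interval` has elapsed since the last check-in."""
    return now - last_checkin > interval_to_timedelta(interval)


if __name__ == "__main__":
    last = datetime(2013, 1, 1, 0, 0)
    # 35 hours after the last check-in: still inside the 36-hour window.
    print(is_overdue(last, [36, "hours"], datetime(2013, 1, 2, 11, 0)))
    # 37 hours after the last check-in: the window has been missed.
    print(is_overdue(last, [36, "hours"], datetime(2013, 1, 2, 13, 0)))
```

With a daily backup, a 36-hour interval leaves roughly 12 hours of slack before the watch trips, so a slow run does not cause a false alarm.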
Vigilance tries its best to have reasonable defaults for its configuration. All state data and logs will be stored in a .vigilance directory in your user’s home directory. Fire up your server by running:
vigilance-server
Since the server is listening on a non-standard port, let’s write a client config at ~/.vigilance/client.conf:
vigilance {
  host = "localhost"
  port = 9999
}
Your crontab would look something like this:
@daily run_backups.sh && vigilance checkin backups
If run_backups.sh blows up, the check-in never happens. Once 36 hours pass without one, two emails will be fired off and an HTTP POST describing the failure will be sent to your callback URL.
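The "check in only on success" pattern that && provides in the crontab line can also be written as a small wrapper when a job needs more logic around it. This Python sketch assumes run_backups.sh and the vigilance executable are on the PATH; the `run_and_checkin` helper and its `runner` parameter are hypothetical and exist mainly to keep the example testable.

```python
import subprocess


def run_and_checkin(job, watch, runner=subprocess.call):
    """Run `job`; check in to `watch` only if the job exits with status 0.

    `runner` takes an argument list and returns an exit status, like
    subprocess.call. It is injectable so the logic can be exercised
    without running real commands.
    """
    status = runner(job)
    if status != 0:
        # No check-in: the watch will lapse and the server will notify.
        return status
    return runner(["vigilance", "checkin", watch])


# Example use from a cron-driven script (not executed here):
#     raise SystemExit(run_and_checkin(["run_backups.sh"], "backups"))
```

Returning the job's own exit status on failure keeps cron's usual error reporting intact, so the dead man's switch supplements existing alerts rather than replacing them.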
Adding or removing watches is as simple as editing the server.conf file and sending a HUP signal to the vigilance-server process:
kill -HUP pid_of_vigilance_server
If you have ideas for more notifiers you’d like to see added, please file an issue on the GitHub issue tracker.