Collectd is already grabbing statistics from my services and dropping them into InfluxDB so that I can get pretty graphs with Grafana. One thing I miss is a single page that shows the overall status of all those systems, also known as a status page.
I often see IT managers who either do not monitor their users’ services at all, or who set up a general-purpose application and consider the job done. Most of the time, they realise how much is missing when they have to report on (recurrent) issues affecting business-critical services. Monitoring should not be considered a cure but a forecasting tool. When planned and configured as such, it’ll help prevent predictable failures and drive capacity planning.