Table of Contents

Preface

Chapter 1: Introduction
    A brief history
    What is SRE?
    What is in the book?
    SRE as a framework for new projects
    Summary
    References

Chapter 2: Monitoring
    Why monitoring?
    Instrumenting an application
    What should we measure?
    A short introduction to SLIs, SLOs, and error budgets
    Service levels
    Error budgets
    Collecting and saving monitoring data
    Polling applications
    Nagios
    Prometheus
    Cacti
    Sensu
    Push applications
    StatsD
    Telegraf
    ELK
    Displaying monitoring information
    Arbitrary queries
    Graphs
    Dashboards
    Chatbots
    Managing and maintaining monitoring data
    Communicating about monitoring
    Do they even know there is monitoring?
    References and related reading
    Future reading
    Summary

Chapter 3: Incident Response
    What is an incident?
    What is incident response?
    Alerting
    When do you alert?
    How do you alert?
    Alerting services
    What is in an alert?
    Who do you alert?
    Being on call
    Communication
    Incident Command System (ICS)
    Where do you communicate?
    Recovering the system