DevOps measurement: Monitoring and observability
- Monitoring is tooling or a technical solution that allows teams to watch and understand the state of their systems. Monitoring is based on gathering predefined sets of metrics or logs.
- Observability is tooling or a technical solution that allows teams to actively debug their system. Observability is based on exploring properties and patterns not defined in advance.
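The difference between the two definitions can be illustrated with a minimal sketch. It assumes request events are stored as structured records; the field names (`route`, `status`, `region`, `build`) and the sample data are hypothetical and not taken from any particular tool. The first function computes a metric chosen in advance, as monitoring does. The second slices the same events by whichever field turns out to matter during debugging, which is closer to the observability idea.

```python
from collections import Counter

# Hypothetical request events; all field names and values are illustrative.
EVENTS = [
    {"route": "/checkout", "status": 500, "region": "eu", "build": "b42"},
    {"route": "/checkout", "status": 200, "region": "us", "build": "b41"},
    {"route": "/search", "status": 200, "region": "eu", "build": "b42"},
    {"route": "/checkout", "status": 500, "region": "eu", "build": "b42"},
    {"route": "/search", "status": 200, "region": "us", "build": "b41"},
]


def error_rate(events):
    """Monitoring-style metric: share of 5xx responses, defined up front."""
    if not events:
        return 0.0
    errors = sum(1 for e in events if e["status"] >= 500)
    return errors / len(events)


def error_rate_by(events, field):
    """Observability-style query: error rate grouped by any field,
    chosen at debug time rather than in advance."""
    totals, errors = Counter(), Counter()
    for e in events:
        key = e[field]
        totals[key] += 1
        if e["status"] >= 500:
            errors[key] += 1
    return {key: errors[key] / totals[key] for key in totals}
```

Here `error_rate(EVENTS)` reports that something is wrong overall, while `error_rate_by(EVENTS, "build")` or `error_rate_by(EVENTS, "route")` helps narrow down where, without anyone having defined a per-build or per-route metric beforehand.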
To do a good job with monitoring and observability, your teams should have the following:
- Reporting on the overall health of systems
  - Are my systems functioning?
  - Do my systems have sufficient resources available?
- Reporting on system state as experienced by customers
  - Do my customers know if my system is down, and do they have a bad experience?
- Monitoring for key business and systems metrics.
- Tooling to help you understand and debug your systems in production.
- Tooling to find information about things you did not previously know (that is, you can identify unknown unknowns).
- Access to tools and data that help trace, understand, and diagnose infrastructure problems in your production environment, including interactions between services.
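As a rough illustration of the first two reporting items, the sketch below turns one snapshot of readings into answers to the questions "Are my systems functioning?", "Do my systems have sufficient resources available?", and "Is the customer experience acceptable?". The `Snapshot` fields, the `health_report` function, and every threshold are assumptions made for this example, not values recommended by the source.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One point-in-time reading; all fields are hypothetical."""
    healthy_instances: int
    total_instances: int
    disk_used_fraction: float      # 0.0 to 1.0
    probe_success_fraction: float  # share of synthetic user requests that succeeded


def health_report(s, min_healthy=0.5, max_disk=0.9, min_probe=0.99):
    """Answer the health questions for one snapshot.

    The default thresholds are placeholders; real values depend on the system.
    """
    functioning = (
        s.total_instances > 0
        and s.healthy_instances / s.total_instances >= min_healthy
    )
    return {
        "functioning": functioning,
        "resources_ok": s.disk_used_fraction <= max_disk,
        "customer_experience_ok": s.probe_success_fraction >= min_probe,
    }


report = health_report(Snapshot(3, 4, 0.95, 0.999))
```

Keeping the customer-facing check separate from the internal ones reflects the distinction in the list above: a system can look healthy internally while customers still have a bad experience, and the reverse.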
How to implement monitoring and observability
Monitoring and observability solutions are designed to do the following:
- Provide leading indicators of an outage or service degradation.
- Detect outages, service degradations, bugs, and unauthorized activity.
- Help debug outages, service degradations, bugs, and unauthorized activity.
- Identify long-term trends for capacity planning and business purposes.
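The leading-indicator and capacity-planning goals above can be sketched with a simple trend projection. This is a minimal example that assumes usage grows roughly linearly and is sampled once per day; the function name `days_until_full` and the sample numbers are hypothetical, and real systems often need more robust forecasting.

```python
def days_until_full(daily_usage, capacity):
    """Fit a least-squares line to daily usage samples and estimate how many
    days remain until usage reaches capacity, counted from the last sample.

    Returns None when there are too few samples or usage is not growing.
    """
    n = len(daily_usage)
    if n < 2:
        return None
    days = range(n)
    mean_x = sum(days) / n
    mean_y = sum(daily_usage) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, daily_usage))
    variance = sum((x - mean_x) ** 2 for x in days)
    slope = covariance / variance  # growth per day
    if slope <= 0:
        return None
    return (capacity - daily_usage[-1]) / slope


# Hypothetical disk usage in GB over four days, on a 100 GB volume.
remaining = days_until_full([50, 52, 54, 56], capacity=100)
```

With these sample values the fitted growth is 2 GB per day, so `remaining` is 22 days. An alert on a shrinking estimate acts as a leading indicator, since it fires before the disk is actually full.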