

Here you will find curated guidance, how-tos, and links to other resources that help with the application of observability (o11y) to various use cases. This includes managed services such as Amazon Managed Service for Prometheus and Amazon Managed Grafana as well as agents, for example OpenTelemetry and Fluent Bit. The content is not restricted to AWS tools, though, and many open source projects are referenced here.

We want to address the needs of both developers and infrastructure folks equally, so many of the recipes "cast a wide net". We encourage you to explore and find the solutions that work best for what you are seeking to accomplish.


The content here is derived from customer engagements by our Solutions Architects and Professional Services teams, and from feedback from other customers. Everything you will find here has been implemented by our customers in their own environments.

We think about the o11y space as follows: we decompose it into six dimensions that you can combine to arrive at a specific solution:

Dimension           Examples
Destinations        Prometheus · Grafana · OpenSearch · CloudWatch · Jaeger
Agents              ADOT · Fluent Bit · CW agent · X-Ray agent
Languages           Java · Python · .NET · JavaScript · Go · Rust
Infra & databases   RDS · DynamoDB · MSK
Compute unit        Batch · ECS · EKS · AEB · Lambda · AppRunner
Compute engine      Fargate · EC2 · Lightsail

Example solution requirement

I need a logging solution for a Python app I'm running on EKS on Fargate with the goal to store the logs in an S3 bucket for further consumption

One stack that would fit this need is the following:

  1. Destination: An S3 bucket for further consumption of data
  2. Agent: Fluent Bit to emit log data from EKS
  3. Language: Python
  4. Infra & DB: N/A
  5. Compute unit: Kubernetes (EKS)
  6. Compute engine: Fargate

Not every dimension needs to be specified and sometimes it's hard to decide where to start. Try different paths and compare the pros and cons of certain recipes.
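As a minimal sketch of how the six dimensions combine, the example requirement above (a Python app on EKS on Fargate, logging to S3) can be written down as a small record in which unspecified dimensions are simply left empty. The `Stack` class, its field names, and the `specified` helper are hypothetical and used for illustration only; they are not part of any AWS tool or of this site.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Stack:
    """One o11y solution described along the six dimensions (hypothetical type)."""
    destination: Optional[str] = None
    agent: Optional[str] = None
    language: Optional[str] = None
    infra_db: Optional[str] = None
    compute_unit: Optional[str] = None
    compute_engine: Optional[str] = None

    def specified(self) -> dict:
        """Return only the dimensions that have been filled in."""
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }


# The example requirement: Python app on EKS on Fargate, logs stored in S3.
# Infra & DB is not applicable here, so it is left unspecified.
logging_stack = Stack(
    destination="S3",
    agent="Fluent Bit",
    language="Python",
    compute_unit="EKS",
    compute_engine="Fargate",
)

print(logging_stack.specified())
```

Leaving a field at `None` mirrors the point that not every dimension has to be chosen up front; two candidate stacks can then be compared dimension by dimension.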

To simplify navigation, we group the six dimensions into the following categories:

  • By Compute: covering compute engines and units
  • By Infra & Data: covering infrastructure and databases
  • By Language: covering languages
  • By Destination: covering telemetry and analytics
  • Tasks: covering anomaly detection, alerting, troubleshooting, and more

Learn more about dimensions …

How to use

You can use the top navigation menu to browse to a specific index page, starting with a rough selection, for example By Compute -> EKS -> Fargate -> Logs.

Alternatively, you can search the site by pressing / or the s key.


All recipes published on this site are available under the MIT-0 license, a modification of the usual MIT license that removes the requirement for attribution.

How to contribute

Start a discussion on what you plan to do and we will take it from there.

Learn more

The recipes on this site are a collection of good practices. In addition, there are a number of places where you can learn more about the status of the open source projects we use and about the managed services featured in the recipes, so check out: