Cloud-Agnostic AI Observability Platform - Architecture
Overview
This document describes the architecture of a cloud-agnostic AI observability platform built on AWS managed services. The platform provides unified monitoring, cost optimization, and operational insights for Large Language Model (LLM) workloads across multiple cloud providers.
Architecture Diagram

Architecture Components
1. LLM Providers Layer (Multi-Cloud)
The platform supports monitoring LLM invocations across multiple providers:
Model Flexibility
The models listed below are the ones used in this demo. Since the platform uses LiteLLM as the AI gateway, you can substitute any LLM supported by LiteLLM — simply update gateway/litellm-config.yaml with your preferred models. The observability pipeline works the same regardless of which models you choose.
AWS Bedrock
- Models: Claude 3 Haiku, Claude 3 Sonnet
- Integration: AWS SDK (boto3)
- Metrics: Token usage, latency, request counts
- Dimension:
CloudProvider=aws
Google Vertex AI
- Models: Gemini 1.5 Pro, Gemini 1.5 Flash
- Integration: Simulated (production would use Google Cloud SDK)
- Metrics: Token usage, latency, request counts
- Dimension:
CloudProvider=gcp