Platform/03 · Observe

Observability,
out of the box.

Logs, metrics, traces and deploy events - wired into every service the moment it ships. Plus first-class bridges to Datadog, New Relic, Grafana and the stack you already pay for.

0-config · logs & metrics
6+ · native bridges
< 5s · log-to-console
1 click · trace correlation
Why this layer exists

A deploy
without signal
is a guess.

You shipped. Did it work? Logs are in one tool, metrics in another, traces nowhere. The deploy marker never made it to the dashboard. Observe is the layer that closes the loop.

01

Instrumentation that ships itself

No Datadog agent DaemonSet to debug. No sidecar to forget. Qovery installs and maintains the collector on every cluster - OpenTelemetry out, nothing proprietary locked in.

02

Signals joined to deploys

Latency climbed at 14:02? Qovery already knows the 14:01 release is the cause. Every chart carries deploy markers. Every incident has a one-click revert. A sketch of that correlation step follows these three points.

03

Use what you already pay for

You have a Datadog contract. Or Grafana Cloud. Or Splunk. Qovery does not replace it - it fans out to it, preserving tags, costs and dashboards.
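To make "signals joined to deploys" concrete, here is a minimal, illustrative sketch of the correlation step: given a regression timestamp and a list of deploy events, pick the releases that landed just before it. Every name here (DeployEvent, findSuspectDeploys, the ten-minute lookback window) is a hypothetical stand-in for the pattern, not Qovery's implementation.

```typescript
// Illustrative only: join a regression timestamp to the deploys that preceded it.
// Types, window size and example data are assumptions, not Qovery's API.
interface DeployEvent {
  service: string;
  version: string;
  deployedAt: Date;
}

function findSuspectDeploys(
  regressionAt: Date,
  deploys: DeployEvent[],
  lookbackMs = 10 * 60 * 1000, // how far back to look for a candidate release
): DeployEvent[] {
  return deploys
    .filter(
      (d) =>
        d.deployedAt.getTime() <= regressionAt.getTime() &&
        regressionAt.getTime() - d.deployedAt.getTime() <= lookbackMs,
    )
    .sort((a, b) => b.deployedAt.getTime() - a.deployedAt.getTime()); // most recent first
}

// Example: latency climbs at 14:02; the 14:01 release is the top suspect.
const suspects = findSuspectDeploys(new Date("2024-05-21T14:02:00Z"), [
  { service: "checkout-svc", version: "v438", deployedAt: new Date("2024-05-21T14:01:00Z") },
  { service: "billing-svc", version: "v212", deployedAt: new Date("2024-05-21T12:40:00Z") },
]);
console.log(suspects[0]); // checkout-svc v438
```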

What Observe gives you

Signals,
without the YAML.

Six capabilities that turn a raw cluster into a service you can actually see.

01 · Logs, automatic

Every container's stdout/stderr ingested, indexed, searchable. Structured JSON detected. Retention per environment class.

02 · Metrics, automatic

Node, pod, container, HTTP, DB - Prometheus-compatible. Kept for 30 days in-platform, forwarded to your tool for long-term retention.

03 · Traces, one flag

OpenTelemetry auto-instrumentation for Node, Python, Go, JVM, .NET. Toggle on; traces appear. What that wiring amounts to is sketched just after this list.

04 · Deploy-event correlation

Every release, config change and infra mutation is a timestamped event on every chart.

05 · SLOs & alert routing

Define SLOs per service. Burn-rate alerts out of the box. Route to PagerDuty, Opsgenie, Slack, webhook.

06 · Unified dashboards

Per service, per environment, per tenant. Owned by the team that owns the service.
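For readers who want to see what "toggle on; traces appear" translates to, this is roughly what OpenTelemetry auto-instrumentation looks like in a Node service. A minimal sketch only: the OTLP endpoint and service name are placeholders, and Qovery's flag does this wiring for you rather than asking you to write it.

```typescript
// Sketch of OpenTelemetry auto-instrumentation for a Node service.
// Endpoint URL and service name are placeholders; with the platform flag this setup is handled for you.
import { NodeSDK } from "@opentelemetry/sdk-node";
import { getNodeAutoInstrumentations } from "@opentelemetry/auto-instrumentations-node";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";

const sdk = new NodeSDK({
  serviceName: "checkout-svc", // placeholder service name
  traceExporter: new OTLPTraceExporter({
    url: "http://localhost:4318/v1/traces", // placeholder collector endpoint
  }),
  instrumentations: [getNodeAutoInstrumentations()], // HTTP, DB and framework spans with no manual spans
});

sdk.start();
```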

Plays nice

Your tools,
still your tools.

Observe does not ask you to rip and replace. Forward any signal to the vendor you already have a contract with.

Datadog · NATIVE

Logs, metrics, traces forwarded with Qovery tags and deploy events.

New Relic · NATIVE

NRQL-compatible forwarding. Qovery services appear as entities automatically.

Grafana Cloud · NATIVE

Prometheus remote-write, Loki logs, Tempo traces.

Splunk

HEC ingestion. Log-only or full OTel pipeline - your choice.

OpenTelemetry · STANDARD

Native wire format. Point any OTel endpoint at Qovery.

Prometheus

Scrape your existing Prometheus or let Qovery scrape your workloads.

PagerDuty

Incident routing, auto-resolve on deploy fix, service-ownership mapping.

Slack / Webhook

Deploy notifications, SLO burn alerts, incident summaries. The burn-rate arithmetic behind those alerts is sketched just below.
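The SLO burn alerts mentioned above rest on a standard piece of arithmetic: burn rate is the observed error rate divided by the error budget implied by the SLO target. This sketch shows that calculation with made-up numbers; the 14x threshold is the widely used fast-burn convention, and none of it is Qovery's configuration.

```typescript
// Burn rate = observed error rate / error budget, where budget = 1 - SLO target.
// A burn rate of 1 consumes the budget exactly over the SLO window; a fast burn
// over a short window is the classic "page now" signal. Numbers are illustrative.
function burnRate(errorRate: number, sloTarget: number): number {
  const errorBudget = 1 - sloTarget;
  return errorRate / errorBudget;
}

const slo = 0.999;             // 99.9% availability target -> 0.1% error budget
const recentErrorRate = 0.014; // 1.4% of requests failing over the last hour

const rate = burnRate(recentErrorRate, slo); // 14x the budget
if (rate >= 14) {
  // Route to PagerDuty / Opsgenie / Slack / webhook - the fast-burn page.
  console.log(`Fast burn detected: ${rate.toFixed(1)}x error budget - paging on-call`);
}
```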

AI Agentic Layer

The agent
reads the graph.

A p95 spike, a cost cliff, a dependency timing out - the Agentic layer correlates signals with deploys, proposes a root cause, and offers the fix.

$ Checkout latency doubled at 14:02 UTC. What happened and what do you recommend?
Correlate 14:02 deploy events · 3 candidates · checkout-svc v438 matches window
Compare p95: payment-gateway client 120 ms → 340 ms (+183%)
Inspect diff · new retry policy added, exponential backoff disabled
Root cause: retries saturating upstream · propose revert to v437
Rollback plan ready · awaiting approval · ETA 2m 14s
DETECT

Detects regressions by itself

The agent watches every signal and flags the ones that matter.

EXPLAIN

Explains the graph

Ask why. The agent walks the dependency chain, cites the spans that moved.

ACT

Acts, safely

Suggested revert, suggested scale-up, suggested circuit breaker - ready to apply, one approval away.

"We had Datadog. We had Grafana. We had traces no one looked at. Qovery didn't replace any of it - it joined them to our deploys, so when latency moves we can tell why in seconds instead of hours."
Marc Dubreuil, VP Engineering · alan

Ship it.
Then see it.

Every service gets logs, metrics, traces and deploy markers on day one - forwarded to the tool your team already uses.