How We Built Qovery - Part 1

I am excited to launch a new series of engineering articles that dig into the details of how we built Qovery, a platform built for DevOps, SRE, Platform Engineers, and Developers since January 2020. Since day 1, the Qovery team has strived to make Qovery as open as possible and to fight the black box effect! In this series of 5 articles, I will explain as much as possible about how things work behind the scenes. I will start with a high-level view of the services and the architectural design we have chosen. Let's go!
September 26, 2025
Romaric Philogène
CEO & Co-founder

Some context

Today, development teams need more autonomy to deliver new features rapidly. However, providing access to infrastructure without sufficient control results in IT waste, including costly cloud security breaches and skyrocketing infrastructure costs. That is not acceptable for most organizations, especially in highly regulated industries. The deep integration of Qovery services (databases, load balancers, domains…) enables developers to manage their own non-production cloud environments while enabling DevOps engineering teams to accelerate infrastructure provisioning dramatically.

In a nutshell, Qovery is a cloud deployment platform focusing on the User Experience. As a Developer, you can test and release features by creating on-demand environments with a delightful experience. As a DevOps engineer, you can integrate Qovery into your existing workflow and keep full control of every piece to enforce best practices and security.

To provide such a consolidated experience between Developers and DevOps engineers, we had to create an abstraction layer on top of many infrastructure components such as Kubernetes, load balancers, VPCs, security groups, domains, etc. We were inspired by platforms such as Heroku for the Developer Experience, and VMware and Rancher for the Ops experience. This is how Qovery emerged over 3 years, and it now helps more than 30,000 developers and hundreds of Platform Engineering teams.

Read more about what Platform Engineering is.

It's all remote

If you look at this demo video, you will see that in less than 3 minutes, you can deploy an application from a GitHub repository to AWS. And if you look at this one, you will see that you can clone a complete environment and infrastructure in one click. All those actions happen remotely! Qovery is a remote system that executes actions on the linked cloud account. When using Qovery for the first time, you connect your cloud account, and Qovery takes care of the rest: creating the VPC, Security Groups, Kubernetes cluster, Prometheus, and all the other services required to run your applications well.

Overview of Qovery Architecture (simplistic view)

The Qovery Engine (open-source) is in charge of bootstrapping your infrastructure, managing upgrades, and deploying your applications.

A user installs Qovery on their cloud account (simplistic view)

Here is an example of what happens when a user installs Qovery on their cloud account. The interesting part is that a local Qovery Engine (2) pulls a task from the control plane, generates all the configuration files (Terraform, Helm, and others), and bootstraps the full infrastructure on the target remote cloud account. The process is fully automatic and takes between 10 and 30 minutes, depending on the target cloud provider. Once the infrastructure is ready, a remote Qovery Engine is created and initiates a secured connection to the control plane.
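
To make the "generate configuration, then bootstrap" step more concrete, here is a minimal Python sketch. It is an illustration only, not the Qovery Engine's actual code: BootstrapTask, render_config_files, bootstrap, the file names, and the variable names inside the files are all hypothetical, and the apply callback stands in for whatever tool would really consume each file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BootstrapTask:
    """Hypothetical description of an install task pulled from the control plane."""
    cluster_name: str
    cloud_provider: str
    region: str


def render_config_files(task):
    """Render illustrative config files, keyed by file name, in apply order."""
    terraform_vars = (
        f'cluster_name = "{task.cluster_name}"\n'
        f'region = "{task.region}"\n'
    )
    helm_values = f"clusterName: {task.cluster_name}\n"
    # Dicts keep insertion order: infrastructure variables first, chart values second.
    return {"terraform.tfvars": terraform_vars, "values.yaml": helm_values}


def bootstrap(task, apply):
    """Render the files, hand each one to `apply`, and return the applied names."""
    applied = []
    for name, content in render_config_files(task).items():
        apply(name, content)  # assumed callback, e.g. a wrapper around a CLI call
        applied.append(name)
    return applied
```

Passing apply as a parameter keeps the sketch runnable without any cloud account: a test can record the calls, while a real system would plug in the actual provisioning tools.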

A user deploys a first application with Qovery (simplistic view)

This is where the user can deploy the first applications and services. Qovery takes care of everything, from the load balancers, temporary domain, and TLS to all the other steps required to make the application ready to be used.

Deployment via Git (simplistic view)

Because Qovery integrates with GitHub, GitLab, and Bitbucket, a commit on a monitored repository automatically triggers a new deployment.
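
The matching logic behind "a commit on a monitored repository triggers a deployment" can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions: the event dictionary is a generic shape with hypothetical keys (repository, branch, commit), not the real webhook payload of any Git provider, and MonitoredApp and deployments_for_push are made-up names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitoredApp:
    """Hypothetical record linking an application to a repository and branch."""
    name: str
    repository: str
    branch: str


def deployments_for_push(event, apps):
    """Return (app name, commit) pairs for every app watching the pushed branch."""
    return [
        (app.name, event["commit"])
        for app in apps
        if app.repository == event["repository"] and app.branch == event["branch"]
    ]


apps = [
    MonitoredApp("api", "acme/backend", "main"),
    MonitoredApp("api-staging", "acme/backend", "staging"),
    MonitoredApp("web", "acme/frontend", "main"),
]
event = {"repository": "acme/backend", "branch": "main", "commit": "abc123"}
to_deploy = deployments_for_push(event, apps)  # only the "api" app matches
```

A push to a repository or branch that no application watches simply yields an empty list, so nothing is deployed.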

Qovery also integrates into an existing CI with an existing Container Registry (simplistic view)

Qovery also integrates with external Container Registries and CI platforms like GitHub Actions, GitLab CI, and CircleCI. I will not cover all those use cases here, but I will in the coming parts, since they are very common among the DevOps and Platform Engineers using Qovery.

Qovery Control Plane

Hundreds of Qovery Engines pull tasks from the Control Plane (simplistic view)

The Qovery Control Plane is in charge of forwarding tasks to the appropriate Qovery Engine. Hundreds of Qovery Engines are connected to the Control Plane, waiting for new tasks. Each connection is initiated by the Qovery Engine to the Control Plane via gRPC/TLS.
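
The pull model described above can be sketched as an in-memory queue per Engine. This is a minimal illustration, not Qovery's implementation: the ControlPlane class, its submit and pull methods, and the engine identifiers are hypothetical, and the gRPC/TLS transport is left out entirely. The point is only that the Engine asks for work, and the Control Plane never needs to open a connection toward the customer's infrastructure.

```python
from collections import defaultdict, deque


class ControlPlane:
    """Toy control plane that keeps one FIFO task queue per engine."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def submit(self, engine_id, task):
        """Queue a task for the engine that manages the target infrastructure."""
        self._queues[engine_id].append(task)

    def pull(self, engine_id):
        """Called by an engine: return its oldest pending task, or None."""
        queue = self._queues[engine_id]
        return queue.popleft() if queue else None


control_plane = ControlPlane()
control_plane.submit("engine-a", "deploy app-1")
control_plane.submit("engine-b", "upgrade cluster")

first = control_plane.pull("engine-a")   # engine-a only ever sees its own tasks
second = control_plane.pull("engine-a")  # nothing left for engine-a
```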

Multiple user interfaces are provided (simplistic view)

The Control Plane also provides a rich open web API that serves our open-source web interface, Terraform Provider and CLI.

This is how the git hook requests are handled by Qovery

Thousands of requests per second can hit the Control Plane. That's why it's composed of multiple services, each with its own responsibility. (We'll talk about it in Part 2.)

Qovery Engine

The Qovery Engine is in charge of executing the tasks pulled from the Control Plane, from infrastructure creation to application deployment and network configuration. The Qovery Engine behaves like a state machine.

Qovery Engine workflow (simplistic view)

The goal is to reach the desired state and report every operation to the Control Plane along the way. If a task execution fails, the Qovery Engine can remediate it by itself. You can watch this short presentation for a deeper view of how the Engine works.
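
Here is a minimal Python sketch of a task runner that behaves like a small state machine and reports each transition. It rests on assumptions of mine: the state names, the run_task function, and the use of a simple retry as the remediation strategy are all hypothetical, since the article does not detail how the Engine actually remediates failures.

```python
from enum import Enum


class TaskState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    FAILED = "failed"
    SUCCEEDED = "succeeded"


def run_task(step, report, max_attempts=3):
    """Run `step` until it succeeds or attempts run out, reporting every state."""
    state = TaskState.PENDING
    report(state)
    for _ in range(max_attempts):
        state = TaskState.RUNNING
        report(state)
        try:
            step()
        except Exception:
            # Assumed remediation: record the failure and try the step again.
            state = TaskState.FAILED
            report(state)
            continue
        state = TaskState.SUCCEEDED
        report(state)
        break
    return state
```

Here report would forward each state to the Control Plane; in a test it can simply be the append method of a list, which makes the sequence of transitions easy to inspect.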

Other services

We developed dozens of services to provide the following features:

  • Remote Shell Service: a service written in Rust to provide remote secure access to a pod with a wonderful Developer Experience.
  • Scheduler: a service written in Rust to schedule tasks over time (e.g., the Start and Stop feature).
  • Agent: a service written in Rust that forwards app logs from Loki, along with Kubernetes metrics and states, to the Control Plane.
  • Webhook Gateway: a service written in Rust built to handle thousands of Git webhook requests per second.
  • WebSocket Gateway: a service written in Rust to provide real-time data to the control plane.
  • Pleco: a tool that automatically removes cloud managed services and Kubernetes resources based on tags with a TTL.
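
As an illustration of the last item, tag-based cleanup with a TTL boils down to comparing a resource's age with the lifetime declared in its tags. The Python sketch below is not Pleco's code: the ttl tag name, its unit (seconds), the resource dictionary shape, and the function names are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def is_expired(created_at, tags, now):
    """A resource is expired once its age reaches the TTL found in its tags."""
    ttl = tags.get("ttl")  # assumed tag name, value in seconds
    if ttl is None:
        return False  # untagged resources are never candidates for removal
    return now >= created_at + timedelta(seconds=int(ttl))


def select_expired(resources, now):
    """Return the names of the resources whose TTL has elapsed."""
    return [
        resource["name"]
        for resource in resources
        if is_expired(resource["created_at"], resource["tags"], now)
    ]


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
resources = [
    {"name": "old-db", "created_at": now - timedelta(hours=2), "tags": {"ttl": "3600"}},
    {"name": "fresh-db", "created_at": now - timedelta(minutes=10), "tags": {"ttl": "3600"}},
    {"name": "prod-db", "created_at": now - timedelta(days=30), "tags": {}},
]
expired = select_expired(resources, now)  # only "old-db" has outlived its TTL
```

Treating a missing tag as "never expires" is the conservative choice here: a resource only becomes deletable when someone opted in by tagging it.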

We also use third-party services like:

  • Postgres and Redis to store the control plane data
  • Auth0 (acquired by Okta) for authentication
  • Kubernetes for container scheduling
  • Loki to store infra logs into S3
  • Chargebee and Stripe for payment
  • Grafana and Tableau for data analysis
  • BigQuery for storing usage events
  • PostHog for product analytics
  • Intercom and Discourse for providing community and dedicated support

Side note: our control plane's main service (core) is not written in Rust (haha) but in Kotlin and with the Spring Boot framework.

3 Architectural Pillars

Zero Trust

Qovery is built to make all the infrastructure of our customers safe from external attacks and even internal attacks. Environment Variables, Cloud Credentials, and Connections to the Qovery Control Plane are all encrypted (KMS). We'll dig into this in the next parts.

Autonomous

If you remove the Control Plane, the Qovery Engine will no longer be able to pull tasks, but the overall remote infrastructure will not be impacted. This means that losing Qovery causes no downtime on the remote infrastructure, since it does not depend on Qovery to run.

Resilient

Since Qovery is built on Kubernetes and the managed services of the supported cloud providers, all resources are configured out of the box to be resilient. For example, when Qovery installs a Kubernetes cluster for you, it runs at least 3 worker nodes across 3 Availability Zones (AZs). So you don't have to worry about whether best practices have been applied.
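
The "at least 3 worker nodes in 3 AZs" rule is easy to express as a check. The Python sketch below is only an illustration of that rule: the function name, its parameters, and the zone names are hypothetical, and it says nothing about how Qovery itself verifies a cluster.

```python
def meets_resilience_baseline(node_zones, min_nodes=3, min_zones=3):
    """Check a cluster given the availability zone of each worker node."""
    enough_nodes = len(node_zones) >= min_nodes
    enough_zones = len(set(node_zones)) >= min_zones
    return enough_nodes and enough_zones


spread = meets_resilience_baseline(["eu-west-3a", "eu-west-3b", "eu-west-3c"])
stacked = meets_resilience_baseline(["eu-west-3a", "eu-west-3a", "eu-west-3b"])
```

The second cluster has three nodes but only two zones, so losing one zone could take out two thirds of its capacity; that is the situation the baseline is meant to rule out.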

Transparency

Qovery V3 will add a fourth pillar - Transparency! In a nutshell, you can track every change that happens on your infrastructure with Qovery. We'll have the chance to talk about it later on.

Wrapping up

This first part gives us a better sense of how Qovery works. Qovery is a remote system divided into two parts. On one side, the control plane is in charge of all the business logic; on the other, the Qovery Engine manages the infrastructure required to deploy the user applications. Security is at the heart of the Qovery design. The Qovery Engine initiates secure connections to the Control Plane and pulls the tasks that it is in charge of executing. Dozens of other services are also involved in providing an outstanding user experience.

In the second part, we will dig deeper into how the control plane works. Stay in touch!
