Bug Hunting and improvements week - what we improved on Qovery

During the past two weeks, our Frontend and Backend teams were busy bees 🐝 with a special sprint dedicated to bug hunting 🪲 and improvements! Yes, both teams' two-week sprints were dedicated to bugs and improvements ONLY: tracking them, then solving them; gotta catch 'em all 🚀
September 26, 2025
Albane Tonnellier
Product Marketing Manager

Why did we use a full sprint for bug hunting?

While it is true that we fix bugs in every sprint (a maximum of 20% of each sprint is dedicated to bug fixes), sometimes that is not enough to fix them all, and as you can imagine, it's not good practice to let them live in the backlog forever. Many companies, such as Dashlane or MyPorsche, work with a "Zero bug policy", and at Qovery, we have our own way of making sure that bugs are taken care of:

  • In a standard sprint, we can use up to 20% of the sprint for bug fixes
  • If bugs make up more than 20% of our backlog, we freeze the next sprint to make it a "bug-only sprint"

The reasoning is the same as for the "Zero bug policy": if you let bugs live in your backlog, there is a chance they will never be fixed, and there is nothing worse for a user than running into the same bug. Over and over!
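
To make the rule concrete, here is a minimal sketch of the threshold check in TypeScript. The `BacklogStats` shape and `nextSprintKind` helper are purely illustrative names, not Qovery's actual tooling:

```typescript
// Illustrative sketch of the 20% rule described above (not Qovery's real tooling).
interface BacklogStats {
  bugs: number;  // open bug tickets
  total: number; // all open tickets (bugs + features + improvements)
}

// A standard sprint may spend up to 20% of its capacity on bug fixes.
const BUG_BUDGET = 0.2;

function nextSprintKind(backlog: BacklogStats): "standard" | "bug-only" {
  if (backlog.total === 0) return "standard";
  const bugShare = backlog.bugs / backlog.total;
  // If bugs exceed what a standard sprint can absorb, freeze the next
  // sprint and dedicate it entirely to bug fixing.
  return bugShare > BUG_BUDGET ? "bug-only" : "standard";
}

console.log(nextSprintKind({ bugs: 9, total: 30 })); // "bug-only" (30% bugs)
```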

What are the improvements?

Frontend

  • (feat) filter logs by pod name
  • (fix) avoid having the same colour for different pods (see the sketch after this list)
  • (fix) refresh application metrics when switching between apps and during deployment
  • (next step) add instance type selection instead of CPU/RAM during the cluster setup
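
The pod-colour fix illustrates a common UI pitfall: hashing a pod name into a small palette can give two pods the same colour. Here is a minimal sketch of collision-free assignment; the palette and helper names are illustrative, not Qovery's actual frontend code:

```typescript
// Illustrative only: assign each pod a distinct colour, reusing palette
// entries only once every colour has been handed out.
const PALETTE = ["#e6194b", "#3cb44b", "#ffe119", "#4363d8", "#f58231"];

const assigned = new Map<string, string>();

function colourForPod(podName: string): string {
  const existing = assigned.get(podName);
  if (existing) return existing; // stable colour across re-renders
  // Hand out the next unused palette entry instead of hashing, so two
  // different pods can only share a colour once the palette is exhausted.
  const colour = PALETTE[assigned.size % PALETTE.length];
  assigned.set(podName, colour);
  return colour;
}

console.log(colourForPod("api-7f9c"));  // "#e6194b"
console.log(colourForPod("worker-2b")); // "#3cb44b"
```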

Backend

  • (fix) Special characters in database credentials could lead to unexpected behaviour (with Postgres and Redis, special characters in passwords were not properly handled and caused connection issues; see the sketch after this list)
  • (feat) [CLI] Allow using a URL to connect via the qovery shell command
  • (feat) Display in logs if the Docker cache is used correctly
  • (feat) Accept instance type selection in cluster setup + provide an endpoint to list available instance types
  • (fix) Desired node count should be greater than or equal to the minimum node count
  • (fix) Can't switch database from private to public and vice versa (container side)
  • (fix) Stopping containerized databases means losing data
  • (fix) "Delete organization" API call not working when there are no clusters
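
The credentials fix above is worth illustrating: the usual culprit is a password dropped unescaped into a connection URI, where characters like `@`, `:` or `/` change how the URI is parsed. A minimal sketch of the safe pattern, assuming you build connection strings yourself (generic TypeScript, not Qovery's internal fix):

```typescript
// Illustrative only: percent-encode credentials before embedding them in a
// connection URI so characters like "@" or "/" cannot break parsing.
function postgresUri(user: string, password: string, host: string, db: string): string {
  const u = encodeURIComponent(user);
  const p = encodeURIComponent(password);
  return `postgresql://${u}:${p}@${host}:5432/${db}`;
}

// Without encoding, the "@" in the password would be read as the
// user-info/host separator and the connection would fail.
console.log(postgresUri("app", "p@ss/w:rd", "db.internal", "orders"));
// postgresql://app:p%40ss%2Fw%3Ard@db.internal:5432/orders
```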

Wrapping up

While it's true that we are putting a lot of work and effort into building our V3, we also want to make sure that V2 stays as stable and user-friendly as possible. Changes like improving application build time or avoiding the same colour for different pods may seem minor, but added up, we believe they will significantly enhance your experience with the product.
