Understanding the Basics of Application Autoscaling

Application autoscaling is a broad subject. At first, it looks simple because everyone understands the goal and, conceptually, how it works, but it’s not that simple in practice.
September 26, 2025
Pierre Mavro
CTO & Co-founder

Let’s start with a simple diagram showing the types of application autoscaling that exist today:

Horizontal Autoscaling vs. Vertical Autoscaling vs. Multi-dimensional Autoscaling

At Qovery, we use horizontal and vertical autoscaling every day in production, at different levels. The results are excellent once the tuning has been done, which takes days or weeks of collecting statistics, analysis, and configuration.

Graph of Qovery Engine autoscaling triggered

Horizontal autoscaling

Horizontal autoscaling scales an application out by adding more app instances and distributing the workload across them, which increases capacity and improves performance.

Application autoscaling is different from cluster autoscaling, but the two are related. When your app runs on a cluster (e.g., Kubernetes), the more instances your app scales up to, the more cluster resources it is likely to consume, and the more likely the cluster is to scale up its nodes.
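
The link between app instances and cluster nodes can be sketched with some simple arithmetic. This is a minimal illustration, not how any particular cluster autoscaler works: the function name, the parameters, and the CPU-only view are assumptions made for the example, and real schedulers also consider memory, affinity rules, and other workloads.

```python
def min_nodes_needed(replicas: int, cpu_per_replica_m: int, node_allocatable_m: int) -> int:
    """Lower bound on the node count needed to host `replicas` instances.

    All CPU values are in millicores. Only CPU requests are considered,
    which is an assumption made to keep the example small.
    """
    if cpu_per_replica_m <= 0 or node_allocatable_m <= 0:
        raise ValueError("CPU values must be positive")
    # An instance cannot be split across nodes, so count whole instances per node.
    replicas_per_node = node_allocatable_m // cpu_per_replica_m
    if replicas_per_node == 0:
        raise ValueError("a single instance does not fit on one node")
    # Integer ceiling division: how many nodes are needed to hold every instance.
    return (replicas + replicas_per_node - 1) // replicas_per_node


# 4 instances of 500m fit on one node with 3500m allocatable,
# but scaling the app out to 20 instances needs at least 3 such nodes.
print(min_nodes_needed(4, 500, 3500))
print(min_nodes_needed(20, 500, 3500))
```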

Horizontal scaling refers to automatically adding or removing instances based on predefined rules or metrics. When the workload increases, additional instances are dynamically provisioned to handle the increased demand. Conversely, excess instances are automatically terminated when the workload decreases to optimize resource utilization and cost.
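
To make "predefined rules or metrics" concrete, here is a minimal sketch of a proportional scaling rule. It is modeled on the formula documented for the Kubernetes Horizontal Pod Autoscaler (desired replicas = ceil(current replicas × current metric / target metric)); the function name, the default bounds, and the tolerance value are assumptions for illustration, and this is not presented as Qovery's implementation.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float,
                     min_replicas: int = 1, max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Return the instance count that should bring the metric back to its target."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current count to avoid flapping.
        desired = current_replicas
    else:
        # Scale the instance count proportionally to the metric overshoot or undershoot.
        desired = math.ceil(current_replicas * ratio)
    # Never go below the floor or above the ceiling configured for the app.
    return max(min_replicas, min(max_replicas, desired))


# Average CPU at 90% with a 60% target: 4 instances become 6.
print(desired_replicas(4, current_metric=90, target_metric=60))
# Average CPU at 30% with a 60% target: 4 instances become 2.
print(desired_replicas(4, current_metric=30, target_metric=60))
```

The minimum and maximum bounds are what keep a misbehaving metric from scaling the app to zero or to an unbounded number of instances.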

Horizontal autoscaling offers several benefits, including:

  • Improved performance: By distributing the workload across multiple instances, horizontal scaling can handle increased traffic or resource-intensive tasks more effectively, reducing response times and improving overall performance.
  • Enhanced availability: Additional instances provide redundancy and fault tolerance. If one instance fails or becomes overloaded, the load can be automatically distributed to other instances, ensuring uninterrupted service availability.
  • Scalability: Horizontal scaling allows for seamless expansion of an application or system by adding more instances. This flexibility enables businesses to accommodate sudden surges in traffic or increased demand without impacting performance.
  • Cost optimization: Autoscaling allows you to allocate resources based on actual demand. Scaling up or down based on workload ensures efficient resource utilization, preventing overprovisioning and reducing unnecessary costs.

Vertical autoscaling

Vertical autoscaling adjusts the resources allocated to a single application instance instead of adding more instances (scaling horizontally). It's like boosting your computer by increasing its CPU power, memory, storage, or network capacity.

With vertical autoscaling, you can improve your application's performance without the need to manage many instances. It simplifies administration and reduces the complexity of handling a distributed system.

However, there's a maximum limit to how much you can upgrade an instance before hitting hardware constraints. Also, scaling up or down vertically may require restarting or reconfiguring the machine, resulting in temporary downtime or disruption.
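
The upper limit described above can be sketched as a sizing rule that is clamped to a ceiling. This is a hypothetical example: the function name, the percentile, the headroom, and the 8192 MiB ceiling are all assumptions chosen for illustration, not values from a real autoscaler.

```python
def recommend_memory_mib(samples_mib: list[int], percentile: int = 95,
                         headroom_pct: int = 15, max_mib: int = 8192) -> tuple[int, bool]:
    """Recommend a memory size from observed usage, capped at `max_mib`.

    Returns (recommended_mib, hit_ceiling). `hit_ceiling` is True when the
    workload wants more than a single instance can be given.
    """
    if not samples_mib:
        raise ValueError("at least one usage sample is required")
    ordered = sorted(samples_mib)
    # Nearest-rank percentile, computed with integers to avoid float rounding.
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    observed = ordered[rank - 1]
    # Add headroom on top of the observed usage, rounding up.
    wanted = (observed * (100 + headroom_pct) + 99) // 100
    return min(wanted, max_mib), wanted > max_mib


usage = [400, 420, 450, 500, 480, 470, 460, 455, 445, 430]
print(recommend_memory_mib(usage))   # sized from the 95th percentile plus headroom
print(recommend_memory_mib([9000]))  # demand exceeds the ceiling, so it is capped
```

When the second value is True, vertical scaling alone can no longer follow the workload. Applying a new size may also require a restart, as noted above.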

Vertical autoscaling is commonly used in traditional setups or when the workload can't be easily distributed across multiple instances. It's handy for applications that require a lot of computational power, memory, or specialized hardware configurations.

Although horizontal autoscaling has gained popularity with cloud computing and containers, vertical autoscaling still plays a role in optimizing the performance and resource utilization of individual instances in specific situations.

Multidimensional autoscaling (Google proprietary)

Multidimensional autoscaling is like having a super-smart system that automatically adjusts the resources of your application or system in multiple dimensions to handle changing demands. It's all about ensuring your application has the right power and capacity when needed.

Think of it as a dynamic team of helpers that can scale up or down in terms of the number of instances and by upgrading or downgrading the resources within each instance. It's like giving your application a turbo boost or dialing it down when the workload changes.

With multidimensional autoscaling, you don't have to adjust resources or add more instances manually. The system takes care of it for you, continuously monitoring metrics like CPU usage, memory, network traffic, or any other custom-defined criteria.

When your application is experiencing high traffic or increased resource demands, multidimensional autoscaling will intelligently add more resources to ensure smooth performance and prevent any slowdowns or crashes. On the other hand, when the workload decreases, it will automatically scale down to optimize resource usage and save costs.
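
As a rough sketch of what adjusting two dimensions at once can look like, the example below picks the instance count from CPU utilization and the per-instance memory from peak memory usage. The split of metrics between dimensions, the `Shape` type, the function name, and all limits are assumptions made for illustration; this is not Google's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    replicas: int     # horizontal dimension: number of instances
    memory_mib: int   # vertical dimension: memory per instance


def next_shape(current: Shape, cpu_pct: int, target_cpu_pct: int, peak_memory_mib: int,
               headroom_pct: int = 20, max_replicas: int = 10,
               max_memory_mib: int = 4096) -> Shape:
    """Compute the next instance count and instance size in a single decision."""
    # Horizontal: proportional rule on CPU, using integer ceiling division.
    replicas = (current.replicas * cpu_pct + target_cpu_pct - 1) // target_cpu_pct
    replicas = max(1, min(max_replicas, replicas))
    # Vertical: peak memory plus headroom, rounded up and capped.
    memory = (peak_memory_mib * (100 + headroom_pct) + 99) // 100
    memory = min(max_memory_mib, memory)
    return Shape(replicas=replicas, memory_mib=memory)


# High traffic: more instances and more memory per instance.
print(next_shape(Shape(3, 512), cpu_pct=90, target_cpu_pct=60, peak_memory_mib=600))
# Low traffic: both dimensions shrink.
print(next_shape(Shape(5, 720), cpu_pct=20, target_cpu_pct=60, peak_memory_mib=300))
```

Each dimension has its own limit here, which is where knowing your application well matters.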

The beauty of multidimensional autoscaling is that it considers multiple factors to make the right decisions. It's like having a super-smart teammate who knows when to boost your application and when to hold back to avoid wasting resources.

By employing multidimensional autoscaling, you can ensure your application stays resilient, responsive, and cost-effective. It's like having a magical elastic system that expands and contracts as needed, effortlessly adapting to the ever-changing demands of your application.

Unfortunately, this feature is exclusive to Google Cloud and unavailable as an open-source project.

Conclusion

Three solutions exist. The most common is definitely horizontal autoscaling. Many large companies already use it, and it works well in many situations. Vertical autoscaling is helpful, but its limitations restrict its use. Multidimensional autoscaling may be the best, but it requires you to know your application very well when setting limits.

Tests are mandatory to ensure the behavior of the autoscaler is the one expected for your application!
