How We Built an Agentic DevOps Copilot to Automate Infrastructure Tasks and Beyond

At Qovery, our goal is simple: eliminate the grunt work of DevOps. The idea of an assistant that can understand developer intent and autonomously take action on infrastructure has always felt like the holy grail. In February 2025, we started building that assistant: our DevOps Copilot. Today, our Agentic DevOps Copilot is live in Alpha. It helps developers automate deployments, optimize infrastructure, and answer advanced configuration questions. But getting here took multiple iterations. Here’s a look at the four technical phases that shaped its design and reliability.
Romaric Philogène
CEO & Co-founder
[Demo videos: the Agentic DevOps Copilot in action]

Phase 1: The Basic Agent

We started with a simple agent architecture: detect a user’s intent, and map that intent to a tool or action.

[Figure: each intent maps to one executed tool]

For example:

“Stop all dev environments after 6pm” → matches the stop-env tool

Each intent had its own mapping logic, and tools were invoked accordingly.
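
To make the Phase 1 design concrete, here is a minimal sketch of a hardcoded intent-to-tool mapping. It is an illustration under assumptions, not Qovery's implementation: the keyword matcher, the `deploy-app` tool, and the function names are hypothetical; only the `stop-env` tool name comes from the example above.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical tools: the bodies only report which tool was invoked.
def stop_env(request: str) -> str:
    return "stop-env invoked"

def deploy_app(request: str) -> str:
    return "deploy-app invoked"

# Hardcoded mapping: each intent is a keyword tuple tied to exactly one tool.
# Supporting a new kind of request means adding a new entry here by hand.
INTENTS: Dict[str, Tuple[Tuple[str, ...], Callable[[str], str]]] = {
    "stop-env": (("stop", "environment"), stop_env),
    "deploy-app": (("deploy",), deploy_app),
}

def detect_intent(request: str) -> Optional[str]:
    """Return the first intent whose keywords all appear in the request."""
    text = request.lower()
    for name, (keywords, _tool) in INTENTS.items():
        if all(keyword in text for keyword in keywords):
            return name
    return None

def handle(request: str) -> str:
    """Detect the intent and invoke the single tool mapped to it."""
    intent = detect_intent(request)
    if intent is None:
        return "no matching intent"
    _keywords, tool = INTENTS[intent]
    return tool(request)

print(handle("Stop all dev environments after 6pm"))
print(handle("How can I optimize my Dockerfile?"))
```

The first request reaches the `stop-env` tool, while the Dockerfile question matches nothing, which is the kind of wall an intent table runs into once requests fall outside the anticipated set.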

✅ Pros:

  • Easy to implement
  • Predictable behavior
  • Clear control over each action

❌ Cons:

  • Every new intent had to be hardcoded
  • Complex workflows required chaining tools manually
  • No flexibility for unexpected or unplanned user requests

This version served us well for a few internal use cases. But the moment real users started asking real questions, like "How can I optimize my Dockerfile?" or "Why is my deployment time high this week?", we hit a wall. We needed something more flexible.

Phase 2: Going Agentic

The second phase was a real leap: we designed an Agentic system.

[Figure: Agentic DevOps Copilot, plan and execution]

Instead of relying on a hardcoded intent-to-tool mapping, the Agentic DevOps Copilot receives user input, analyzes it, and dynamically plans a sequence of tool invocations to fulfill the request. Think of it as having a toolbox and letting the AI figure out how to use the tools in the right order.

Each tool:

  • Has a clear interface (input/output)
  • Is versioned and stateless
  • Can be independently tested and improved
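
As a rough sketch of what such a toolbox could look like, the snippet below models each tool as a named, versioned, stateless function from an input dictionary to an output dictionary, and runs a plan by feeding each output into the next tool. Everything here is an assumption for illustration: the `Tool` class, the `list-envs` and `stop-envs` tools, and the sample data are hypothetical, and the plan is a fixed list standing in for what the model would produce.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

@dataclass(frozen=True)
class Tool:
    name: str
    version: str
    # Stateless: the output depends only on the input dictionary.
    run: Callable[[Dict[str, Any]], Dict[str, Any]]

def list_envs(inp: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical inventory; a real tool would query an API instead.
    envs = [{"name": "dev-1", "kind": "dev"}, {"name": "prod", "kind": "prod"}]
    return {"envs": [e["name"] for e in envs if e["kind"] == inp["kind"]]}

def stop_envs(inp: Dict[str, Any]) -> Dict[str, Any]:
    # Pretend to stop each environment and report what was stopped.
    return {"stopped": list(inp["envs"])}

TOOLS: Dict[str, Tool] = {
    tool.name: tool
    for tool in (
        Tool("list-envs", "1.0.0", list_envs),
        Tool("stop-envs", "1.0.0", stop_envs),
    )
}

def execute_plan(plan: List[str], initial: Dict[str, Any]) -> Dict[str, Any]:
    """Run the named tools in order, passing each output to the next tool."""
    data = initial
    for name in plan:
        data = TOOLS[name].run(data)
    return data

# In the agentic design the model would produce this plan; here it is fixed.
print(execute_plan(["list-envs", "stop-envs"], {"kind": "dev"}))
```

Because each tool is a plain function of its input, it can be unit tested on its own, which is what makes independent improvement practical.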

Benefits:

  • Far more scalable and flexible
  • Can solve unanticipated user needs
  • Encourages clean tool abstraction

[Figure: a few examples of what the Agentic DevOps Copilot can do]

Challenges:

  • Tool chaining is fragile - outputs must match expected inputs
  • If one tool fails or behaves unexpectedly, the whole plan breaks
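
One way to picture the chaining problem is a check that walks a plan and verifies that every tool's required inputs are produced by an earlier step. This is a sketch under assumptions rather than the approach used in the Copilot: the `ToolSpec` type, the tool names, and the key names are all hypothetical.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Set

class ToolSpec(NamedTuple):
    requires: FrozenSet[str]  # keys the tool needs as input
    provides: FrozenSet[str]  # keys the tool adds to the shared data

# Hypothetical tool declarations.
SPECS: Dict[str, ToolSpec] = {
    "list-envs": ToolSpec(frozenset({"kind"}), frozenset({"envs"})),
    "stop-envs": ToolSpec(frozenset({"envs"}), frozenset({"stopped"})),
    "notify-team": ToolSpec(frozenset({"stopped", "channel"}), frozenset({"notified"})),
}

def check_plan(plan: List[str], initial_keys: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the chain lines up."""
    available = set(initial_keys)
    problems: List[str] = []
    for step, name in enumerate(plan):
        spec = SPECS.get(name)
        if spec is None:
            problems.append(f"step {step}: unknown tool {name!r}")
            continue
        missing = spec.requires - available
        if missing:
            problems.append(f"step {step}: {name} is missing {sorted(missing)}")
        available |= spec.provides
    return problems

print(check_plan(["list-envs", "stop-envs"], {"kind"}))
print(check_plan(["list-envs", "stop-envs", "notify-team"], {"kind"}))
```

The second plan is reported as broken because nothing supplies `channel`. A static check like this only catches shape mismatches; a tool that fails at runtime still needs handling during execution.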

That’s when we realized: a dynamic system needs to be able to fail gracefully and recover. That took us to phase three.

Phase 3: Resilience and Recovery

We added resiliency layers and robust retry logic into the agentic execution flow.

Now, if the agent misuses a tool or the tool returns an unexpected output, the system:

  • Analyzes the failure
  • Updates its plan or fixes the step
  • Retries with a corrected approach

This required tracking intermediate state, running validation between tool steps, and allowing re-planning if an execution fails.
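
The loop described above can be sketched as follows. This is a simplified illustration, not the production logic: the `scale-app` tool, the `validate` rule, and especially `replan` are hypothetical stand-ins, with `replan` playing the role of the model call that analyzes a failure and corrects the step.

```python
from typing import Any, Dict, List, Tuple

Step = Tuple[str, Dict[str, Any]]  # (tool name, arguments)

class ToolError(Exception):
    """Raised when a tool is misused or its output fails validation."""

def scale_app(args: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical tool that rejects badly typed arguments.
    replicas = args.get("replicas")
    if not isinstance(replicas, int):
        raise ToolError("replicas must be an integer")
    return {"replicas": replicas}

TOOLS = {"scale-app": scale_app}

def validate(output: Dict[str, Any]) -> bool:
    """Validation between steps: the output must have the expected shape."""
    return isinstance(output, dict) and output.get("replicas", 0) > 0

def replan(step: Step, error: str) -> Step:
    """Stand-in for the model: inspect the failure and return a fixed step."""
    name, args = step
    if "integer" in error:
        return name, {**args, "replicas": int(args["replicas"])}
    return step

def run_with_recovery(plan: List[Step], max_attempts: int = 3) -> Dict[str, list]:
    # Intermediate state: outputs so far plus every failure that was observed.
    state: Dict[str, list] = {"results": [], "failures": []}
    for step in plan:
        for attempt in range(1, max_attempts + 1):
            name, args = step
            try:
                output = TOOLS[name](args)
                if not validate(output):
                    raise ToolError("output failed validation")
            except ToolError as exc:
                state["failures"].append((name, attempt, str(exc)))
                step = replan(step, str(exc))  # retry with a corrected step
                continue
            state["results"].append(output)
            break
        else:
            raise RuntimeError(f"step {name} failed after {max_attempts} attempts")
    return state

print(run_with_recovery([("scale-app", {"replicas": "3"})]))
```

Here the first attempt fails because `replicas` arrives as a string, the failure is recorded, the step is corrected, and the retry succeeds. Capping the number of attempts keeps a step that cannot be fixed from looping forever.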

Without this, reliability drops fast. With it, we started seeing successful completions of multi-step workflows that weren’t even anticipated during development.

Phase 4: Agentic + Memory

At this point, the Agentic Copilot could dynamically respond and recover from errors, but each request was treated in isolation.

[Figure: a conversation without memory (before) and with conversation memory (today)]

That’s not how humans work. If I ask a follow-up question like:

“What about the staging cluster?”

… it should relate to my previous question about the production cluster. But it didn’t.

So we introduced conversation memory. It allows the Agentic DevOps Copilot to:

  • Reuse previous answers
  • Understand references and context
  • Maintain continuity across a session
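
A toy sketch of session memory is shown below. It assumes a keyword-based resolver purely for illustration; in an agentic system the stored history would more likely be passed to the model along with the new message. The `Session` class, the target names, and the CPU usage question are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical cluster names the resolver can recognize.
KNOWN_TARGETS = ("production", "staging")

@dataclass
class Session:
    # One resolved entry per turn, kept for the lifetime of the session.
    history: List[Dict[str, str]] = field(default_factory=list)

    def ask(self, question: str) -> Dict[str, str]:
        """Resolve a question, filling gaps from the previous turn."""
        text = question.lower()
        target = next((t for t in KNOWN_TARGETS if t in text), None)
        metric = "cpu usage" if "cpu" in text else None
        previous = self.history[-1] if self.history else {}
        resolved = {
            "target": target or previous.get("target", "unknown"),
            "metric": metric or previous.get("metric", "unknown"),
        }
        self.history.append(resolved)
        return resolved

session = Session()
print(session.ask("What is the CPU usage of the production cluster?"))
print(session.ask("What about the staging cluster?"))

# Without memory, the same follow-up has nothing to refer back to.
print(Session().ask("What about the staging cluster?"))
```

With the shared session, the follow-up inherits the metric from the earlier question and only swaps the cluster; in a fresh session the metric stays unknown.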

This drastically improved user experience - and opened the door to deeper, multi-step optimization and monitoring tasks.

What's Next

The Agentic DevOps Copilot is just getting started. We’re exploring:

  • Improving the speed of planning: planning a complex task can take up to 10 seconds, which is acceptable in a testing phase but too slow for production.
  • Self-hosted models: we currently use Claude 3.7 Sonnet. Even though we don't send sensitive information, we are a business solution and want to let our users choose models that meet their compliance standards.
  • Long-term memory across sessions: to tailor the experience to each user and learn from previous interactions.

We’re not building another chatbot. We’re building DevOps automation with a brain - so your team can focus on building products, not managing infrastructure.

Want to try it?

Our Copilot is in Alpha. You can ask it:

  • “Generate usage stats over the last 30 days for team X.”
  • “Optimize this Dockerfile.”
  • “Stop all environments inactive for 6h and notify the team.”

If you’re a Qovery user and want early access, check the Slack message in your workspace or contact us directly.

--

Learn more about how we use Qdrant, a powerful open-source vector database.
