Coding Agents Write the Code. Who Verifies It Works? We Built the Answer.
Coding agents are good at reading a spec and producing code. But producing code is one step in a longer process. The real loop is Spec -> Code -> Deploy -> Test -> Verify -> Ship. Agents stop at step two.
Last week I assigned a Linear issue to a coding agent. Straightforward feature - add a filter to an API endpoint, update the tests, adjust the docs. Ten minutes later I had a PR. The code looked clean. The sandbox tests passed.
Then the real work started. I pulled the branch, spun up my local environment, configured the database seed, launched the app, clicked through the feature manually, found two edge cases the agent missed where the filter broke on empty arrays, pushed fixes, waited for CI. The "10-minute PR" cost me an hour of verification.
For well-tested codebases with clear specs, these tools save real time. I use them daily. They're good at reading a specification and producing code.
But producing code is one step in a longer process.
The broken loop
Every developer knows the real loop:
What developers do:
Spec -> Code -> Deploy -> Test -> Verify -> Ship ✓
Here's where coding agents stop:
What coding agents do:
Spec -> Code -> ??? 🛑
Everything after the code is still on you. And "everything after the code" is where most of the risk lives. A function that passes unit tests in a sandbox can still break when it hits a real database with production-shaped data. A UI change that looks correct in isolation can blow up when it interacts with the actual authentication flow. An API endpoint that works in a clean room can time out when it talks to a real upstream service.
The agents write code. The humans verify it works. We're back to doing the hard part manually.
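To make the gap concrete, here is a minimal Python sketch of how a filter can pass a sandbox-style check and still misbehave on an input the check never exercised. The names `filter_items_buggy`, `filter_items_fixed`, the `tags` parameter, and the sample data are hypothetical and are not taken from the PR described earlier; the empty-array behavior shown is just one plausible way such a bug appears.

```python
from typing import Optional

Item = dict  # hypothetical record shape: {"id": int, "tags": list[str]}


def filter_items_buggy(items: list[Item], tags: Optional[list[str]]) -> list[Item]:
    """Return items carrying at least one requested tag.

    Bug: an empty list is falsy, so tags=[] is treated like "no filter"
    and every item comes back instead of none.
    """
    if not tags:
        return items
    return [i for i in items if set(i["tags"]) & set(tags)]


def filter_items_fixed(items: list[Item], tags: Optional[list[str]]) -> list[Item]:
    """Distinguish "filter omitted" (None) from "filter matches nothing" ([])."""
    if tags is None:
        return items
    wanted = set(tags)
    return [i for i in items if wanted & set(i["tags"])]


ITEMS = [
    {"id": 1, "tags": ["api"]},
    {"id": 2, "tags": ["docs"]},
    {"id": 3, "tags": []},
]

# The happy-path check passes for both versions.
assert filter_items_buggy(ITEMS, ["api"]) == [ITEMS[0]]
assert filter_items_fixed(ITEMS, ["api"]) == [ITEMS[0]]

# The difference only shows up with an empty filter array.
print(len(filter_items_buggy(ITEMS, [])))  # 3
print(len(filter_items_fixed(ITEMS, [])))  # 0
```

A test suite that only covers the happy path reports green for both versions, which is why clicking through the deployed feature still catches things the sandbox does not.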
The sandbox trap
Every major coding agent runs in some form of isolation. OpenAI's Codex launched with internet access completely disabled - the agent could only interact with code explicitly provided via GitHub repositories. They walked that back weeks later, adding limited network access, because the constraint was too restrictive for real work.
GitHub Copilot's coding agent limits internet access to "a trusted list of destinations." CI/CD workflows won't run without human approval. The agent can't push to existing branches - only ones it creates.
Claude Code runs on standard GitHub-hosted runners. Secure by default. No access to staging environments, no ability to deploy, no preview URLs.
Every agent operates in a clean room that looks nothing like the environment where its code will actually run.
What each agent can actually do today
| Agent | Write Code | Run Unit Tests | Deploy to Real Env | Run E2E Tests | Preview URL |
| GitHub Copilot | Yes | Yes (CI) | No | No | No |
| OpenAI Codex | Yes | Yes (sandbox) | No | No | No |
| Claude Code | Yes | Yes (runner) | No | No | No |
| Cursor Cloud | Yes | Yes (VM) | No | No | No |
| Qovery Agent | Yes | Yes | Yes | Yes | Yes |
The security-verification tradeoff
Sandboxing agents makes sense. Giving an autonomous AI agent unrestricted access to your infrastructure is a legitimate security concern. Codex isolates for safety. Copilot blocks CI until a human approves. Claude Code stays on GitHub's runners.
But security demands isolation, and verification demands access to real infrastructure. Every vendor chose security. Verification stays unsolved.
The result is a gap. The agent writes the code in a sandbox. A human deploys it to a real environment and verifies it works. The loop stays broken.
Even Cursor's cloud agents, which run in a full VM, address only the development environment problem: giving agents access to repos, dependencies, and build tools. They don't solve the deployment verification problem: spinning up an ephemeral environment with real databases and real services, deploying the code, running end-to-end tests, and handing the reviewer a working preview URL.
That last mile - from "code that compiles" to "software that works in a real environment" - remains a manual human task across the entire market.
😩 Before: Reviewing a 2,000-line diff and hoping it works in production
😎 After: Clicking a preview URL and seeing it work
Give your coding agents real environments.
Qovery Agent connects your AI coding agent to real infrastructure - with governance, audit trails, and preview URLs. Runs on your Kubernetes cluster.
I've been building infrastructure-as-a-service for 5 years at Qovery. We manage deployments on Kubernetes for thousands of engineering teams. When I saw coding agents hitting this wall, the answer was obvious: they need the same thing developers need. Environments.
This is what the Qovery Agent does. Here's the actual flow:
1. You write a spec as a Linear issue.
2. You assign it to the Qovery Agent. It shows up as a real member of your Linear workspace, mentionable and assignable like any teammate.
3. The agent claims the issue, reports that it's starting work, and spins up an ephemeral environment on your Kubernetes cluster via Qovery.
4. Inside that environment, your coding agent runs: Claude Code, Codex, Cursor, Gemini, or OpenCode.
5. It clones the repo, creates a branch, implements the task, deploys the application, and runs the tests.
6. It opens a PR. The reviewer gets a preview URL pointing to a live, deployed version of the feature.
7. Progress shows up in Linear in real time: thoughts, actions, errors, a step-by-step checklist.
8. After the PR is open, the environment stays alive for a configurable grace period, then auto-cleans.
Linear Issue -> Qovery Agent -> K8s Environment -> Your Agent (Claude / Codex / Cursor) -> Code + Deploy + Test -> PR with Preview URL -> Human Reviews Working Software
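The last step of that flow, keeping the environment alive for a grace period and then cleaning it up, comes down to a time-to-live check. The Python sketch below is a hypothetical illustration only: `EphemeralEnv`, `is_expired`, `envs_to_clean`, the environment names, and the 24-hour default are assumptions for the example, not Qovery's actual API or defaults.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class EphemeralEnv:
    """Hypothetical record for one preview environment."""
    name: str
    pr_opened_at: datetime
    grace_period: timedelta = timedelta(hours=24)  # assumed default

    def is_expired(self, now: datetime) -> bool:
        # The environment is eligible for cleanup once the grace period
        # after the PR was opened has fully elapsed.
        return now >= self.pr_opened_at + self.grace_period


def envs_to_clean(envs: list[EphemeralEnv], now: datetime) -> list[str]:
    """Return the names of environments whose grace period has elapsed."""
    return [e.name for e in envs if e.is_expired(now)]


opened = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
envs = [
    EphemeralEnv("env-a", opened),                                 # 24h grace
    EphemeralEnv("env-b", opened, grace_period=timedelta(hours=72)),
]

now = opened + timedelta(hours=30)
print(envs_to_clean(envs, now))  # ['env-a']
```

Making the grace period a per-environment field rather than a global constant is one way to let long reviews keep their preview URL without keeping every environment around.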
The key point: Qovery Agent works with your coding agent. You bring your own. Connect your Anthropic API key and use Claude Code. Connect your OpenAI key and use Codex. Prefer Cursor or Gemini? Plug them in. Five runtimes are supported today. The agent you already use gets the infrastructure it's been missing - it can now deploy, test against real services, and verify its own work.
Security without the tradeoff
Qovery gives agents governed access on your own infrastructure.
Everything the Qovery Agent does runs on your Kubernetes cluster. No data leaves your infrastructure. Inside each workspace container, an HTTP proxy intercepts all outbound agent traffic with full governance controls:
- DLP filters catch API keys, private keys, and sensitive file paths before they can leak
- Domain allowlists and blocklists control what the agent can reach
- A kill switch immediately blocks all outbound traffic if something goes wrong
- A real-time approval queue lets admins review and approve specific requests
- Every agent request is logged in a full audit trail
The agent gets a real environment. The environment is inside your infrastructure. You define the governance rules. The security-verification tradeoff dissolves.
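As a rough mental model, the governance controls described in this section can be thought of as an ordered set of checks applied to each outbound request. The following Python sketch is an assumption-laden illustration, not the Qovery proxy's implementation: the `Policy` class, its field names, the order of the checks, the verdict strings, and the two DLP patterns are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from urllib.parse import urlsplit

# Illustrative DLP patterns only; a real filter would cover far more.
DLP_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),  # shape of an AWS access key ID
]


@dataclass
class Policy:
    """Hypothetical per-workspace egress policy."""
    allowlist: set = field(default_factory=set)
    blocklist: set = field(default_factory=set)
    needs_approval: set = field(default_factory=set)
    kill_switch: bool = False
    audit_log: list = field(default_factory=list)

    def decide(self, url: str, body: str = "") -> str:
        host = urlsplit(url).hostname or ""
        if self.kill_switch:
            verdict = "blocked: kill switch"
        elif any(p.search(body) for p in DLP_PATTERNS):
            verdict = "blocked: dlp"
        elif host in self.blocklist:
            verdict = "blocked: domain"
        elif host in self.needs_approval:
            verdict = "pending: approval"
        elif host in self.allowlist:
            verdict = "allowed"
        else:
            verdict = "blocked: domain"  # default deny
        # Every decision is recorded, whatever the outcome.
        self.audit_log.append((host, verdict))
        return verdict


policy = Policy(
    allowlist={"api.github.com"},
    needs_approval={"registry.npmjs.org"},
)
print(policy.decide("https://api.github.com/repos"))         # allowed
print(policy.decide("https://example.com/upload"))           # blocked: domain
print(policy.decide("https://api.github.com/gists",
                    body="key=AKIAIOSFODNN7EXAMPLE"))         # blocked: dlp
print(policy.decide("https://registry.npmjs.org/left-pad"))  # pending: approval
policy.kill_switch = True
print(policy.decide("https://api.github.com/repos"))         # blocked: kill switch
```

Checking the kill switch first and defaulting to deny are design choices made for this sketch; they keep the failure mode closed when the policy is incomplete.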
What changes
The developer loop becomes:
Linear Issue -> Qovery Agent -> Code + Deploy + Test -> PR with Preview URL -> Human Reviews Working Software
The human's job shifts from "pull the branch, spin up an environment, verify the code works" to "click the preview URL, review working software, decide if it ships."
It's bidirectional too. If the agent gets stuck or the reviewer wants changes, they send a message from Linear. The agent wakes up, reads the instructions, and iterates. No context switching, no terminal juggling.
Coding agents solved the code generation problem. The environment problem is what's left. That's what we built.
Romaric founded Qovery to make Kubernetes accessible to every engineering team. He writes about platform strategy, developer experience, and the future of cloud infrastructure.