The standard for CI today is to run tests in the cloud after a commit has been merged. They serve as a double-check for an engineer: did you forget to test some part of the code that you changed?

CI works for humans. The reason is our long-term understanding of a codebase and its evolution. Engineers new to a codebase know they are new and take more care (or get automated emails telling them they broke CI). Humans familiar with the codebase know implicitly what they need to be testing as they work.

The automated email from CI works because it is rare, because we all develop at human speed and breaking HEAD is OK for a little while. Some teams try auto-revert on CI breakage. This works, but you lose a lot of the value of after-the-fact testing in continual retries. Still, it works. It is better than nightly builds and binary searching your way to the culprit.

It does not work for agents. At least not as of June 2026.

There are two problems here. The first is that agents are always new to a codebase. They don’t have the implicit knowledge of the codebase expert, so they constantly regress parts of the project they have not paid attention to. It is excruciating developing with an agent and CI. Your agent needs to run all the tests as it is developing to make sure it understands the environment.

The second problem is that the agent's context window is dead and gone by the time the poor human driving it is stuck with an automated email saying they broke CI. Your agent should have been quietly solving that problem before making it other people’s problem. It has all the time in the world as long as it is not in anyone’s way. Let the computer drive the computer.

So you need to get your org into a place where agents can run all the tests all the time. There is an easy way to do this. Replace your CI with a merge queue.

Merge queue

A merge queue is a script you run to push to origin/main (instead of using a PR UI like we did back in the GitHub, all-human days). It runs the tests first and pushes only if they pass.

It is very important you run all the tests in the merge queue. Do not have "slow" tests you run daily. In the age of agents, those tests will be broken every day. Someone will spend hours every day chasing after other people's agents. This is not a role anyone on the team wants. It turns out the only thing worse than cleaning up after someone else is cleaning up after someone else's robot.
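To make that concrete, here is a minimal sketch of such a script in Python. Everything specific in it is an assumption for illustration: the `make test` command stands in for whatever runs every test in your repo, the function and parameter names are made up, and it assumes the branch you want to land is checked out. It leans on plain git for safety: a non-force push is rejected if origin/main moved while the tests ran, in which case you run the script again.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical: replace with the command that runs ALL tests, slow ones included.
TEST_CMD = ["make", "test"]

Runner = Callable[[Sequence[str]], int]


def run(cmd: Sequence[str]) -> int:
    """Run a command, echoing it first, and return its exit code."""
    print("+", " ".join(cmd), flush=True)
    return subprocess.run(list(cmd)).returncode


def merge_queue(runner: Runner = run, test_cmd: Sequence[str] = TEST_CMD) -> bool:
    """Rebase onto origin/main, run every test, push only if all steps pass."""
    steps = [
        ["git", "fetch", "origin", "main"],
        ["git", "rebase", "origin/main"],
        list(test_cmd),
        # Non-force push: git refuses it if origin/main moved in the meantime.
        ["git", "push", "origin", "HEAD:main"],
    ]
    for step in steps:
        if runner(step) != 0:
            return False  # stop at the first failure; nothing after it runs
    return True
```

Calling `merge_queue()` from the branch you want to land either pushes a fully tested commit or stops before the push. The `runner` parameter exists only so the sequence can be exercised without touching a real repository.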

After you have a merge queue that works, create a second command to run the merge queue, minus the actual merge. Give it to your agents.

This ensures:

  1. The active context window can run the tests and fix the bugs before inflicting them on other people.
  2. No one, neither human nor machine, breaks the build.
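One way to get that second command is a flag on the same script, so the agent's check and the real merge can never drift apart. The sketch below is an assumption-heavy illustration, not a prescribed layout: the `--check` flag name, the `make test` command, and the function names are all hypothetical. Note that the check still rebases the working branch onto origin/main, so the agent is testing against the code it would actually land on.

```python
import argparse
import subprocess
from typing import Callable, List, Optional, Sequence

# Hypothetical: replace with the command that runs ALL tests.
TEST_CMD = ["make", "test"]


def build_steps(merge: bool, test_cmd: Sequence[str] = TEST_CMD) -> List[List[str]]:
    """Return the merge queue steps; the push is included only when merging."""
    steps = [
        ["git", "fetch", "origin", "main"],
        ["git", "rebase", "origin/main"],
        list(test_cmd),
    ]
    if merge:
        steps.append(["git", "push", "origin", "HEAD:main"])
    return steps


def main(argv: Optional[Sequence[str]] = None,
         runner: Optional[Callable[[List[str]], int]] = None) -> int:
    parser = argparse.ArgumentParser(description="merge queue sketch")
    # Hypothetical flag name: the command you hand to agents.
    parser.add_argument("--check", action="store_true",
                        help="run everything except the final push")
    args = parser.parse_args(argv)
    if runner is None:
        runner = lambda cmd: subprocess.run(cmd).returncode
    for step in build_steps(merge=not args.check):
        if runner(step) != 0:
            return 1  # first failing step stops the run
    return 0
```

An agent would call `main(["--check"])` in a loop until it returns 0, and only then would anyone call `main([])` to actually merge.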

Back when we were all human and CI was slow, you could make arguments for CI instead of merge queues. Those arguments depended on tests being too slow to put in the merge path. That will not stand in 2026: tests must be fast if you want to use agents at all. Now the argument is impossible to make. With agents, CI is useless and the merge queue is vastly superior.

You will need expensive computers to power your merge queue. It is worth it.