The Commit Explosion Nobody Was Ready For
AI agents on GitHub aren't a future trend — they're already here, rewriting the rules of software infrastructure in real time. Commit volume on GitHub has grown roughly 14x year-over-year, and as of early 2025 the platform was tracking toward 14 billion commits annually. The system holding all of that code was designed for humans moving at human speed. It was not designed for this.
Kyle Daigle, GitHub's COO, spoke candidly about the platform's scaling crisis and agent-driven transformation on the Latent Space podcast, laying out exactly what breaks when AI stops being a coding assistant and starts being the one doing most of the committing. From our perspective as practitioners who build AI agents for production systems, his account is one of the most honest assessments of what agent-first development actually costs — technically, organizationally, and socially.
The short version: every layer of GitHub's infrastructure has had to be rethought, not scaled. That distinction matters enormously for any team deploying AI agents today.
What 14x Growth Actually Breaks

GitHub's reliability problems weren't caused by more users. They were caused by a qualitative shift in how code gets produced. Agents push larger commits, more frequently, across more repositories simultaneously. A git push that used to carry a predictable payload now regularly carries a thousand-line diff. Pull requests that arrived in human rhythms now arrive in torrents.
The specific failure points Daigle describes are instructive: a central permissioning database (internally called MySQL One) that predates modern sharding assumptions, job queuing infrastructure designed around human-paced throughput, and monorepo sizes growing faster than the blob infrastructure underneath them. None of these broke because of simple volume — they broke because the *shape* of the workload changed.
GitHub Actions, originally a CI/CD layer, has quietly become the general-purpose compute substrate for agentic workflows. Every agent task is a build. Every build consumes CPU. The result is a platform under diagonal pressure: vertical scaling doesn't solve it, horizontal scaling doesn't solve it, and the only path forward is cracking open services that have run untouched for a decade.
The Trust Problem Is Social, Not Technical

Here's the part that gets less attention: when agents write most of the code and agents review most of the code, the human trust layer doesn't disappear — it just gets thinner and more fragile.
Daigle frames this precisely. Pull requests have always been a mechanism for codifying trust — a senior engineer's approval carries weight because of accumulated social capital, not just technical review. When an agent writes the PR and another agent reviews it, that social signal evaporates. What's left is verification without vouching.
This is why proposals like "prompt requests" (submitting the prompt alongside the generated code) or vouching systems from maintainers like Mitchell Hashimoto keep circulating. None of them have become standard yet, and Daigle is honest about why: the PR workflow worked because GitHub waited for community consensus before cementing any practice. The community hasn't reached consensus on what agent-generated code review should look like. The open source social contract is genuinely under negotiation right now.
When agents review agents and humans only spot-check the output, trust stops being a technical problem and becomes a social one. Any team deploying AI agents into a shared codebase is navigating this tension today, whether they've named it or not. The teams doing it well are the ones treating agent output as something to be audited, not just merged.
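One way to make "audited, not just merged" concrete is a merge gate that separates verification from vouching. The sketch below is a minimal illustration, not GitHub's API: the `Review` and `PullRequest` types, the `merge_decision` function, and the rule itself (an agent-authored change needs at least one human approval) are all our own assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Review:
    reviewer: str      # hypothetical reviewer handle
    is_agent: bool     # True when the reviewer is an AI agent
    approved: bool


@dataclass
class PullRequest:
    author_is_agent: bool
    reviews: List[Review] = field(default_factory=list)


def merge_decision(pr: PullRequest) -> str:
    """Return a merge status under an assumed 'a human must vouch' policy."""
    approvals = [r for r in pr.reviews if r.approved]
    if not approvals:
        return "blocked: no approval"
    # Verification can come from an agent; vouching, in this sketch, cannot.
    human_vouched = any(not r.is_agent for r in approvals)
    if pr.author_is_agent and not human_vouched:
        return "needs human audit"
    return "mergeable"
```

Under this policy an agent-written PR approved only by another agent is routed to a person instead of being merged, which keeps a human signature somewhere in the chain. Where to set that bar is a team decision; the sketch only shows that the rule can be stated explicitly.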
From Mega-Skills to Micro-Skills

One of the most practically useful observations from Daigle's account is the death of the "mega-skill" — those elaborate, multi-step agent workflows that were supposed to orchestrate entire processes end-to-end.
GitHub's internal AI deployment moved away from these after finding them brittle. A workflow that stitches together six tools to produce a full marketing report sounds powerful until one input changes and the whole thing silently degrades. What replaced them is a philosophy of atomic, single-purpose micro-skills: one that extracts the most important marketing signal from any MCP server, one that summarizes for an analyst audience, another that summarizes for a customer audience, and one that parses Obsidian notes for a specific mention. Each does one thing well. Orchestration happens at the prompt level, not the workflow level.
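To show the shape of this idea, here is a minimal sketch of micro-skills as small functions that each take text and return text. Everything in it is hypothetical: the skill names, the `SKILLS` registry, and `run_plan` are ours, not GitHub's internal tooling. The `plan` list stands in for whatever sequence of skills a model would pick from a prompt.

```python
from typing import Callable, Dict, List

# A micro-skill, in this sketch, is any function from text to text.
Skill = Callable[[str], str]


def make_mention_filter(term: str) -> Skill:
    """Build a skill that keeps only the lines mentioning `term`."""
    def skill(text: str) -> str:
        return "\n".join(
            line for line in text.splitlines() if term.lower() in line.lower()
        )
    return skill


def summarize_for_analyst(text: str) -> str:
    """Detailed view: how many lines matched, and all of them."""
    lines = [line for line in text.splitlines() if line.strip()]
    return f"{len(lines)} relevant line(s): " + "; ".join(lines)


def summarize_for_customer(text: str) -> str:
    """Short view: only the first matching line."""
    lines = [line for line in text.splitlines() if line.strip()]
    return lines[0] if lines else "Nothing to report."


# Hypothetical registry. Each entry does one thing and knows nothing of the others.
SKILLS: Dict[str, Skill] = {
    "find_launch_mentions": make_mention_filter("launch"),
    "summarize_for_analyst": summarize_for_analyst,
    "summarize_for_customer": summarize_for_customer,
}


def run_plan(plan: List[str], payload: str) -> str:
    """Apply the named skills in order; the plan is data, not a hard-coded workflow."""
    for name in plan:
        payload = SKILLS[name](payload)
    return payload
```

Swapping the audience means changing one name in the plan, not rewriting a workflow. If one skill breaks, it can be tested and replaced on its own, which is the durability argument in miniature.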
This mirrors what we've seen in production AI agent builds. The most durable AI development approaches are the ones that keep individual agent capabilities narrow and composable, rather than trying to encode business logic into monolithic workflows that become impossible to debug or update.
The other shift Daigle describes is directional: his most valuable agent workflows look *backward* before looking forward. Pull all transcripts, emails, Slack threads, and notes from the past week. Find the patterns. Then ask what to do next. LLMs are genuinely good at retrospection across context — better, arguably, than at predicting forward — and GitHub has built internal tooling (WorkIQ, connected to MCP servers across Teams, Slack, and email) specifically to exploit this.
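The backward-then-forward pattern can be sketched in a few lines. This is an illustration under our own assumptions, not a description of WorkIQ: the `Item` type, the pre-tagged `topics` field, the seven-day window, and the prompt wording are all hypothetical. A real system would have a model do the pattern-finding instead of a counter.

```python
from collections import Counter
from datetime import date, timedelta
from typing import List, NamedTuple, Tuple


class Item(NamedTuple):
    day: date
    source: str               # e.g. "slack", "email", "notes"
    topics: Tuple[str, ...]   # assumed to be tagged upstream


def recurring_topics(
    items: List[Item], today: date, window_days: int = 7, min_count: int = 2
) -> List[str]:
    """Look backward: topics seen in at least `min_count` items within the window."""
    cutoff = today - timedelta(days=window_days)
    counts = Counter(
        topic
        for item in items
        if cutoff <= item.day <= today
        for topic in set(item.topics)  # count each topic once per item
    )
    frequent = [t for t, n in counts.items() if n >= min_count]
    # Most frequent first; ties broken alphabetically so the order is stable.
    return sorted(frequent, key=lambda t: (-counts[t], t))


def build_prompt(topics: List[str]) -> str:
    """Then look forward: ask what to do next, grounded in what recurred."""
    if not topics:
        return "No recurring topics this week. What should I review?"
    return (
        "Recurring topics from the past week: "
        + ", ".join(topics)
        + ". Given these patterns, what should I do next?"
    )
```

The ordering matters more than the code: the forward-looking question is only asked after the retrospective step has narrowed the context down to what actually recurred.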
What Copilot Becomes After Code Completion

The original GitHub Copilot story was about autocomplete. The next chapter is about context. Daigle is explicit that Copilot's evolution — from code completion to CLI, desktop app, cloud agents, and SDK — is converging on a single goal: making GitHub act the way the individual developer wants it to act, with full awareness of their dependencies, their team's decisions, and their own history.
That's a harder problem than generating code. It requires persistent memory, access to ambient context across every communication channel a developer uses, and the judgment to know which context is relevant to a given task. None of the current tooling fully solves it. But the architecture is pointing there: a unified SDK underpinning every Copilot surface, agent compute running on fast-spin Azure VMs, and context pulled from wherever developers actually work.
For teams building on top of GitHub today, the implication is clear: the infrastructure is being rebuilt for an agent-first world, and the teams that understand the new trust model, the new compute model, and the new workflow model will ship faster than those still treating AI as a fancy autocomplete. We're tracking this closely as part of our broader view on where the AI model landscape is heading for builders.
AI agents on GitHub aren't augmenting the software development workflow — they're replacing its assumptions. The infrastructure, the trust model, and the tooling philosophy all have to be rethought from first principles, not patched. The teams that internalize this shift early will have a structural advantage over those still treating agents as productivity features bolted onto human-paced processes.
