The Bottleneck Shifted Before Anyone Staffed for It
Code generation speed, once the central argument for AI adoption, has proven irrelevant to delivery timelines when review capacity does not scale alongside it. A one-developer case study captures the structure: one engineering team moved from two or three PRs per week to eleven in a single day — the code-writing part accelerated, the review queue did not. GitHub acknowledged this directly with its Stacked PRs release, an infrastructure change that only makes sense if the review queue is now the binding constraint.
The economics of enterprise AI tooling reinforce why this catch was missed. Cursor charges per seat; Copilot arrives bundled; the pricing is attached to generation, not to the review labor that generation creates downstream . Teams optimized for what the tool measured — code output — and discovered the cost in the metric the tool did not track: review hours accumulated per sprint. The review debt is structural, not incidental, and it does not self-correct when teams add more AI.