What Compositional Generalization Actually Ends
The deeper consequence of π0.7 is not that one model performs well on unfamiliar tasks — it is that the justification for specialized training pipelines has collapsed. The field's working assumption was that generalist models traded performance ceilings for breadth, requiring task-specific fine-tuning to reach production-grade reliability. π0.7 arriving at specialist-level performance without that fine-tuning is not an incremental improvement on that assumption — it invalidates it. Teams that built their robotics roadmaps around depth-first specialization now hold technical debt that was accrued against a premise that no longer holds. The organizations fastest to reorient around foundation-model generalism will set the deployment standard; those that treat π0.7 as a benchmark to beat rather than a paradigm to adopt will build the next generation of specialized models into irrelevance.