The world gets its first AI engineer, Devin
It can take on jobs on Upwork, learn to use unfamiliar tech, and train and fine-tune models.
The SWE-Bench benchmark, which asks systems to resolve real-world GitHub issues from open-source projects, highlights its problem-solving capabilities.
It resolved 13.86% of issues unassisted, well ahead of the previous best results of 1.96% unassisted and 4.80% assisted.
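
To put those percentages in perspective, the gap can be worked out directly. The short Python sketch below uses only the three figures quoted above. The constant names and the `compare` helper are made up for illustration and are not part of SWE-Bench or any Devin tooling.

```python
# Resolve rates (percent of SWE-Bench issues resolved), as quoted in the text above.
DEVIN_UNASSISTED = 13.86
PRIOR_BEST_UNASSISTED = 1.96
PRIOR_BEST_ASSISTED = 4.80


def compare(new_rate, old_rate):
    """Return (percentage-point gain, ratio) of new_rate over old_rate."""
    return new_rate - old_rate, new_rate / old_rate


# Compare Devin's unassisted rate against each previous best result.
for label, baseline in [
    ("unassisted", PRIOR_BEST_UNASSISTED),
    ("assisted", PRIOR_BEST_ASSISTED),
]:
    gain, ratio = compare(DEVIN_UNASSISTED, baseline)
    print(f"vs prior best ({label}): +{gain:.2f} points, {ratio:.1f}x")
```

By this arithmetic, the reported rate is roughly seven times the previous unassisted best and a little under three times the previous assisted best.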