The world gets its first AI engineer, Devin

It's capable of doing jobs on Upwork, learning to use unfamiliar tech, and training and fine-tuning models.

The SWE-bench benchmark, which tests systems on real-world GitHub issues from open-source projects, highlights its exceptional problem-solving capabilities.

Outperforming all rivals, it resolved 13.86% of issues unassisted, far surpassing the prior best of 1.96% unassisted and 4.80% assisted.
