|src: openai.com

Introducing SWE-bench Verified

We’re releasing a human-validated subset of SWE-bench that more reliably evaluates AI models’ ability to solve real-world software issues.