OpenAI pits AI agents against each other to red team smart contracts

OpenAI said it is becoming increasingly important to evaluate the performance of AI agents in “economically meaningful environments” as their adoption grows.

OpenAI has launched a new benchmark that evaluates how well different AI models detect, patch, and even exploit security vulnerabilities found in crypto smart contracts.

OpenAI released the “EVMbench: Evaluating AI Agents on Smart Contract Security” paper on Wednesday, in collaboration with crypto investment firm Paradigm and crypto security firm OtterSec, to evaluate how much the AI agents could theoretically exploit from 120 smart contract vulnerabilities.

Anthropic’s Claude Opus 4.6 came out on top with an average “detect award” of $37,824, followed by OpenAI’s OC-GPT-5.2 and Google’s Gemini 3 Pro at $31,623 and $25,112, respectively.

If you liked the article, do not forget to share it with your friends. Follow us on Google News too, click on the star and choose us from your favorites.

If you want to read more News articles, you can visit our General category.

Source

Leave a Reply Cancel reply

Related News

The Westies: MGM+ Unveils Photos, Poster, and Trailer for New Drama Series

Bitcoin decouples from tech stocks: Is $60K BTC’s next stop?