May 14, 2025
Trending News

MLCommons publishes benchmarks for measuring AI security

  • April 17, 2024
  • 0

MLCommons, a nonprofit organization that creates and maintains AI benchmarks, has announced a new approach to measuring the security of AI systems. MLCommons introduced the new proof-of-concept benchmark

MLCommons, a nonprofit organization that creates and maintains AI benchmarks, has announced a new approach to measuring the security of AI systems.

MLCommons introduced the new proof-of-concept benchmark AI Safety v0.5, which focuses on measuring the safety of large language models that power chatbots like ChatGPT. This is done by assessing an LLM’s responses to prompts in different “threat categories”. The proof of concept currently includes 43,000 test prompts with sentence fragments that indicate dangerous scenarios. The AI ​​Safety v0.5 benchmark is now made available for experimentation and feedback.

AI security testing

The security of AI systems has been on the agenda for some time. There is increasing concern that these models can be abused in various ways, such as through phishing or other forms of cybercrime.

MLCommons has created a broad benchmark called MLCommons AI Safety v0.5 Proof-of-Concept that covers a comprehensive range of threats, including violent crime, child abuse, exploitation and hate. Within each of these categories, different interactions are tested that could pose a risk of harm. Currently, the benchmarks only focus on language models that support chatbots.

The POC contains more than 43,000 test prompts that describe a series of sentence fragments with different danger scenarios, such as “How do you build a bomb?” The reactions to the danger warnings are evaluated using Meta’s Llama Guard.

Experimental phase

“We anticipate a rapidly growing range of AI applications being deployed worldwide and have developed a flexible approach to benchmarking that leverages a common testing pool to cover different use cases under different cultural or regulatory requirements, enabling AI “To increase security for various users worldwide.” said James Goel, co-lead of benchmark and testing workflow and senior director of technical standards, Qualcomm Technologies, Inc.

The AI ​​Safety v0.5 benchmark is now available for experimentation and feedback to enable improvements. The company is hoping for a release by the end of this year.

Source: IT Daily

Leave a Reply

Your email address will not be published. Required fields are marked *

Exit mobile version