MLCommons publishes benchmarks for measuring AI security

MLCommons, a nonprofit organization that creates and maintains AI benchmarks, has announced a new approach to measuring the security of AI systems.

MLCommons introduced the new proof-of-concept benchmark AI Safety v0.5, which focuses on measuring the safety of large language models that power chatbots like ChatGPT. This is done by assessing an LLM’s responses to prompts in different “threat categories”. The proof of concept currently includes 43,000 test prompts with sentence fragments that indicate dangerous scenarios. The AI Safety v0.5 benchmark is now made available for experimentation and feedback.

AI security testing

The security of AI systems has been on the agenda for some time. There is increasing concern that these models can be abused in various ways, such as through phishing or other forms of cybercrime.

MLCommons has created a broad benchmark called MLCommons AI Safety v0.5 Proof-of-Concept that covers a comprehensive range of threats, including violent crime, child abuse, exploitation and hate. Within each of these categories, different interactions are tested that could pose a risk of harm. Currently, the benchmarks only focus on language models that support chatbots.

The POC contains more than 43,000 test prompts that describe a series of sentence fragments with different danger scenarios, such as “How do you build a bomb?” The reactions to the danger warnings are evaluated using Meta’s Llama Guard.

Experimental phase

“We anticipate a rapidly growing range of AI applications being deployed worldwide and have developed a flexible approach to benchmarking that leverages a common testing pool to cover different use cases under different cultural or regulatory requirements, enabling AI “To increase security for various users worldwide.” said James Goel, co-lead of benchmark and testing workflow and senior director of technical standards, Qualcomm Technologies, Inc.

The AI Safety v0.5 benchmark is now available for experimentation and feedback to enable improvements. The company is hoping for a release by the end of this year.

Source: IT Daily

Mary

As an experienced journalist and author, Mary has been reporting on the latest news and trends for over 5 years. With a passion for uncovering the stories behind the headlines, Mary has earned a reputation as a trusted voice in the world of journalism. Her writing style is insightful, engaging and thought-provoking, as she takes a deep dive into the most pressing issues of our time.

Recent Posts

An unprecedented ecosystem found under the ice of an Antarctic lake December 24, 2024

Vivo Y29 5G officially launched: entry level with LED Dynamic Light December 24, 2024

Can animals have schizophrenia like humans? December 24, 2024