MLCommons and AI Verify to collaborate on AI Safety Initiative

Agree to a memorandum of intent to collaborate on a set of AI safety benchmarks for LLMs

SAN FRANCISCO--(BUSINESS WIRE)--Today in Singapore, MLCommons® and AI Verify signed a memorandum of intent to collaborate on developing a set of common safety testing benchmarks for generative AI models for the betterment of AI safety globally.

A mature safety ecosystem includes collaboration across AI testing companies, national safety institutes, auditors, and researchers. The aim of the AI Safety benchmark effort that this agreement advances is to provide AI developers, integrators, purchasers, and policy makers with a globally accepted baseline approach to safety testing for generative AI.

“There is significant interest in the generative AI community globally to develop a common approach towards generative AI safety evaluations,” said Peter Mattson, MLCommons President and AI Safety working group co-chair. “The MLCommons AI Verify collaboration is a step forward towards creating a global and inclusive standard for AI safety testing, with benchmarks designed to address safety risks across diverse contexts, languages, cultures, and value systems.”

The MLCommons AI Safety working group, a global group of academic researchers, industry technical experts, policy and standards representatives, and civil society advocates, recently announced a v0.5 AI Safety benchmark proof of concept (POC). AI Verify will develop interoperable AI testing tools that will inform an inclusive v1.0 release, which is expected this fall. In addition, they are building a toolkit for interactive testing to support benchmarking and red-teaming.

“Making first moves towards globally accepted AI safety benchmarks and testing standards, AI Verify Foundation is excited to partner with MLCommons to help our partners build trust in their models and applications across the diversity of cultural contexts and languages in which they were developed. We invite more partners to join this effort to promote responsible use of AI in Singapore and the world,” said Dr Ong Chen Hui, Chair of the Governing Committee at AI Verify Foundation.

The AI Safety working group encourages global participation to help shape the v1.0 AI Safety benchmark suite and beyond. To contribute, please join the MLCommons AI Safety working group.

About MLCommons

MLCommons is the world leader in building benchmarks for AI. It is an open engineering consortium with a mission to make AI better for everyone through benchmarks and data. The foundation for MLCommons began with the MLPerf® benchmarks in 2018, which rapidly scaled as a set of industry metrics to measure machine learning (ML) performance and promote transparency of ML and AI techniques. In collaboration with its 125+ members, global technology providers, academics, and researchers, MLCommons is focused on collaborative engineering work that builds tools for the entire AI industry through benchmarks and metrics, public datasets, and best practices.

About the AI Verify Foundation

The AI Verify Foundation aims to harness the collective power and contributions of the global open-source community to develop AI testing tools to enable responsible AI. The Foundation promotes best practices and standards for AI. The not-for-profit Foundation is a wholly owned subsidiary of the Infocomm Media Development Authority of Singapore (IMDA).

Contacts

Kelly Berschauer
kelly@mlcommons.org
