MLPerf Results Show Rapid AI Performance Gains

Latest benchmarks highlight progress in training advanced neural networks and deploying AI models on the edge.

SAN FRANCISCO--(BUSINESS WIRE)--Today, MLCommons®, an open engineering consortium, announced new results from two industry-standard MLPerf™ benchmark suites: Training v3.0, which measures the performance of training machine learning models, and Tiny v1.1, which measures how quickly a trained neural network can process new data on extremely low-power devices in the smallest form factors.

Faster training paves the way for more capable intelligent systems

Training models faster empowers researchers to unlock new capabilities, such as the latest advances in generative AI. The latest MLPerf Training round demonstrates broad industry participation and highlights performance gains of up to 1.54x compared to just six months ago and 33-49x over the first round, reflecting the tremendous rate of innovation in systems for machine learning.

The MLPerf Training benchmark suite comprises full system tests that stress machine learning models, software, and hardware for a broad range of applications. The open-source and peer-reviewed benchmark suite provides a level playing field for competition that drives innovation, performance, and energy efficiency for the entire industry.

In this round, MLPerf Training added two new benchmarks to the suite. The first is a large language model (LLM) using the GPT-3 reference model that reflects the rapid adoption of generative AI. The second is an updated recommender, modified to be more representative of industry practices, using the DLRM-DCNv2 reference model. These new tests help advance AI by ensuring that industry-standard benchmarks are representative of the latest trends in adoption and can help guide customers, vendors, and researchers alike.

“I’m excited to see the debut of GPT-3 and DLRM-DCNv2, which were built based on extensive feedback from the community and leading customers and demonstrate our commitment to keep the MLPerf benchmarks representative of modern machine learning,” said David Kanter, executive director of MLCommons.

The MLPerf Training v3.0 round includes over 250 performance results, an increase of 62% over the last round, from 16 different submitters: ASUSTek, Azure, Dell, Fujitsu, GIGABYTE, H3C, IEI, Intel & Habana Labs, Krai, Lenovo, NVIDIA, NVIDIA + CoreWeave, Quanta Cloud Technology, Supermicro, and xFusion. In particular, MLCommons would like to congratulate first-time MLPerf Training submitters CoreWeave, IEI, and Quanta Cloud Technology.

“It is truly remarkable to witness system engineers continuously pushing the boundaries of performance on workloads that hold utmost value for users via MLPerf,” said Ritika Borkar, co-chair of the MLPerf Training Working Group. “We are particularly thrilled to incorporate an LLM benchmark in this round, as it will inspire system innovation for a workload that has the potential of revolutionizing countless applications.”

MLPerf Tiny Results Reflect the Rapid Pace of Embedded Devices Innovation

Tiny compute devices are a pervasive part of everyone’s everyday life, from tire sensors in your vehicles to your appliances and even your fitness tracker. Tiny devices bring intelligence to life at very little cost.

ML inference on the edge is increasingly attractive as a way to improve the energy efficiency, privacy, responsiveness, and autonomy of edge devices. Tiny ML breaks the traditional paradigm of energy- and compute-hungry ML by eliminating networking overhead, allowing for greater overall efficiency and security relative to a cloud-centric approach. The MLPerf Tiny benchmark suite captures a variety of inference use cases that involve "tiny" neural networks, typically 100 kB and below, that process sensor data, such as audio and vision, to provide endpoint intelligence for low-power devices in the smallest form factors. MLPerf Tiny tests these capabilities in a fair and reproducible manner and also offers optional power measurement.

In this round, the Tiny ML v1.1 benchmarks include 10 submissions from academia, industry organizations, and national labs, producing 159 peer-reviewed results. Submitters include: Bosch, cTuning, fpgaConvNet, Kai Jiang, Krai, Nuvoton, Plumerai, Skymizer, STMicroelectronics, and Syntiant. This round also includes 41 power measurements. MLCommons congratulates Bosch, cTuning, fpgaConvNet, Kai Jiang, Krai, Nuvoton, and Skymizer on their first submissions to MLPerf Tiny.

“I’m particularly excited to see so many companies embrace the Tiny ML benchmark suite,” said David Kanter, executive director of MLCommons. “We had 7 new submitters this round, which demonstrates the value and importance of a standard benchmark to enable device makers and researchers to choose the best solution for their use case.”

“With so many new companies adopting the benchmark suite it’s really extended the range of hardware solutions and innovative software frameworks covered. The v1.1 release includes submissions ranging from tiny and inexpensive microcontrollers to larger FPGAs, showing a large variety of design choices,” said Dr. Csaba Kiraly, co-chair of the MLPerf Tiny Working Group. “And the combined effect of software and hardware performance improvements are 1000-fold in some areas compared to our initial reference benchmark results, which shows the pace that innovation is happening in the field.”

View the Results

To view the results for MLPerf Training v3.0 and MLPerf Tiny v1.1, and to find additional information about the benchmarks, please visit: Training v3.0 and Tiny v1.1.

About MLCommons

MLCommons is an open engineering consortium with a mission to make machine learning better for everyone through benchmarks and data. The foundation for MLCommons began with the MLPerf benchmark in 2018, which rapidly scaled as a set of industry metrics to measure machine learning performance and promote transparency of machine learning techniques. In collaboration with its 50+ members (global technology providers, academics, and researchers), MLCommons is focused on collaborative engineering work that builds tools for the entire machine learning industry through benchmarks and metrics, public datasets, and best practices.

For additional information on MLCommons and details on becoming a Member or Affiliate, please visit http://mlcommons.org/ or contact participation@mlcommons.org.

Contacts

Kelly Berschauer
kelly@mlcommons.org
