Intel secretly optimized its processors for a popular benchmark, inflating results so significantly that they are no longer indicative of real-world performance.
Benchmark organization SPEC (Standard Performance Evaluation Corporation) accuses Intel of fraud. The consortium notes that Intel tuned its compiler for several processors specifically for the SPEC CPU 2017 test. Because of this optimization, the test no longer reflects how well an Intel processor handles a particular type of workload, but only how well the CPU performs in the benchmark itself.
Benchmarks as a yardstick
Benchmarks like SPEC CPU 2017 cover several standardized workloads that processors must handle, and the tests then assign a score to the results. Standardization allows companies to objectively compare components before deciding which part best fits their needs. For example, we use the SPECwpc benchmark to test workstations, giving us an objective basis to compare strengths and weaknesses.
A benchmark test is always somewhat artificial, but in principle a good test contains workloads that are representative of tasks a CPU would perform in the real world. If you see a high score on a type of workload that matters to you, you can assume that your variant of that workload will also run smoothly.
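To make the scoring concrete: SPEC CPU reports per-workload ratios of a reference machine's run time to the tested machine's run time, and combines them into an overall score via a geometric mean. Below is a simplified sketch of that scheme; the workload names are real SPEC CPU 2017 sub-tests, but the times are hypothetical, and the actual rules (multiple iterations, base vs. peak runs) are omitted.

```python
# Simplified sketch of SPEC-CPU-style scoring: per-workload ratio =
# reference time / measured time, overall score = geometric mean.
# Times below are hypothetical, for illustration only.
from math import prod

reference_times = {"600.perlbench_s": 1775, "623.xalancbmk_s": 1420, "657.xz_s": 6182}
measured_times  = {"600.perlbench_s":  310, "623.xalancbmk_s":  150, "657.xz_s":  900}

def spec_style_score(ref, measured):
    """Geometric mean of (reference time / measured time) over all workloads."""
    ratios = [ref[name] / measured[name] for name in ref]
    return prod(ratios) ** (1 / len(ratios))

print(round(spec_style_score(reference_times, measured_times), 2))  # about 7.19
```

Because the score is a geometric mean, a large speedup on a single sub-test lifts the overall number even when every other workload is unchanged.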
No longer representative
According to SPEC, Intel manipulated the test so that it is no longer representative. Intel made very targeted optimizations in its compiler, the component that translates code into instructions for the processor, using prior knowledge of the code of the 523.xalancbmk_r / 623.xalancbmk_s sub-tests of the benchmark. The result is consistent: CPUs score up to nine percent higher on SPEC CPU 2017. That performance increase only applies to the test in question, however; in real-world workloads the processor is up to nine percent slower than the numbers suggest.
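The arithmetic below is a toy illustration of why optimizing one sub-test moves the whole score, not a reconstruction of Intel's actual compiler change: with a geometric mean over (here, hypothetically) four equally weighted sub-tests, a large boost on just one of them inflates the aggregate by roughly nine percent.

```python
# Toy arithmetic (not Intel's actual change): speeding up one sub-test
# inflates a geometric-mean score even though the others are untouched.
from math import prod

def geomean(ratios):
    return prod(ratios) ** (1 / len(ratios))

honest = [5.0, 5.0, 5.0, 5.0]         # per-workload ratios without the trick
gamed  = [5.0, 5.0, 5.0, 5.0 * 1.40]  # hypothetical 40% boost on one sub-test

inflation = geomean(gamed) / geomean(honest) - 1
print(f"{inflation:.1%}")  # about 8.8%: close to the nine percent SPEC describes
```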
SPEC has therefore invalidated more than 2,600 test results from 2022 and 2023. Intel appears to have cheated particularly with its Sapphire Rapids CPUs. Newer versions of Intel's oneAPI compiler no longer include the optimizations, so results for the newer Emerald Rapids chips are representative.
Misleading
The discovery is painful, even if the problem has been fixed for new testing. It shows that Intel is not above manipulating test results to its advantage and misleading customers. There is nothing illegal about the practice, but it deliberately misleads customers and testers who rely on a reputable, reliable benchmark to make purchasing decisions based on those results.