Server Benchmarks: What Do They Test and Do They Really Matter?

Every time a new processor is released, customers get a barrage of benchmark results from vendors. These results are often confusing because they are typically presented as numbers without context. Unless customers dig pretty deeply, they do not know what a benchmark is actually testing, how it works, or what its results are telling them.

Our new system, the Fujitsu SPARC M12, has captured the top slot in both the SPECint_rate2006 and SPECfp_rate2006 benchmarks on both a per-core and a system basis. This is a big deal, but what does it mean? For starters, it shows that the Fujitsu SPARC M12 is a remarkably “balanced” system that is equally adept at traditional business processing (such as transaction processing and optimization problems) and scientific processing, which has a lot in common with today’s AI, Big Data, deep learning, and machine learning processing. This is a big win for customers: you don’t have to worry about whether your workloads fit, since the Fujitsu SPARC M12 demonstrates leading performance on both kinds. But more importantly, what does this mean for how the systems will perform in your data center on your workloads? Let’s take a closer look at these benchmarks and how they relate to your processing.

Each individual benchmark program correlates to a common business task, whether it tests the system’s ability to run multiple tasks simultaneously (common in business and scientific processing), execute complex logic quickly (used in artificial intelligence and deep analytic software), or rapidly compress data files (important to minimize storage requirements). Each one provides a glimpse into how that real-world workload will perform in your environment on the system being tested.

In this blog, we look at two of the world record benchmarks that the Fujitsu SPARC M12 has set. We’ll explore additional benchmarks in the future. (Check out our complete list of Fujitsu SPARC server benchmark records.)

What Does SPECint_rate2006 Test?

SPECint_rate2006 is a standard benchmark that has been around for more than a decade (obviously, given the name, right?). It tests the integer processing performance of the CPU and memory components of a system. However, it’s not a single workload; it’s actually a group of twelve different workloads. The sidebar includes an excerpt from the SPECint_rate2006 website that shows the twelve application areas tested, along with a brief description. You can find the complete table here.

Sidebar: CINT2006 (Integer Components of SPEC CPU2006)

All of these programs are written in either C or C++, which are among the most widely used programming languages in the world today. When the SPECint_rate2006 benchmark is run, multiple copies of each program are run simultaneously, which reflects how most users load their systems with many tasks at once.
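To make the idea of one score from twelve workloads concrete, here is a minimal sketch of how per-workload throughput ratios can be combined. It assumes the approach SPEC describes for its rate metrics (a ratio per workload based on copies run and time relative to a reference machine, combined with a geometric mean). The function names and all timings below are hypothetical and are not real SPEC results.

```python
from statistics import geometric_mean


def rate_ratio(copies, reference_seconds, elapsed_seconds):
    """Throughput ratio for one workload: copies completed, scaled by how
    much faster than the reference time they finished."""
    return copies * reference_seconds / elapsed_seconds


def overall_score(ratios):
    """Combine per-workload ratios into one number with a geometric mean,
    so no single workload dominates the result."""
    return geometric_mean(ratios)


# Hypothetical runs: (copies, reference time in s, elapsed time in s).
hypothetical_runs = [
    (4, 1000.0, 500.0),   # ratio 8.0
    (4, 2000.0, 2000.0),  # ratio 4.0
    (4, 3000.0, 6000.0),  # ratio 2.0
]

ratios = [rate_ratio(*run) for run in hypothetical_runs]
score = overall_score(ratios)
print(ratios, round(score, 3))
```

Because a geometric mean is used, a system has to do well across the whole set of workloads to earn a high score; one outstanding result cannot mask several weak ones.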

Looking at the individual programs, many of them are direct analogues of workloads that are run in businesses today. For example, the compression algorithm makes big files much smaller, which is a common task that is done before data is committed to disk in a business system. The C compiler task is something that developers rely on daily.
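As a rough illustration of the compression task, the sketch below uses Python’s standard bz2 module to shrink some repetitive data and then restore it. The record contents are made up, and highly repetitive data like this compresses far better than typical real files.

```python
import bz2

# Made-up, highly repetitive "business record" data for illustration only.
record = b"2017-04-04,ORDER,widget,qty=10,status=SHIPPED\n"
original = record * 10_000

compressed = bz2.compress(original)      # CPU- and memory-intensive step
restored = bz2.decompress(compressed)    # must reproduce the input exactly

ratio = len(original) / len(compressed)
print(f"{len(original)} bytes -> {len(compressed)} bytes ({ratio:.0f}x smaller)")
```

The work here is almost entirely integer arithmetic and memory access, which is why a compression program is a natural fit for an integer benchmark suite.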

“Combinatorial Optimization” is a program that handles public transportation scheduling and optimization. This is a common task for any business that distributes products or schedules services. Another optimization problem is the game Go simulator, which is a hugely complex computational problem. In Go, players place ‘stones’ on the board to surround territory. Because players can place their stones on almost any open point, the number of possible board positions is astronomically large. This is much like the optimizations that business software frequently performs in order to make business functions operate more efficiently.
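The number of possibilities in Go is finite but enormous. A quick back-of-the-envelope sketch, assuming a standard 19 x 19 board where each point is empty, black, or white (not every such arrangement is a legal position, so this is only an upper bound):

```python
BOARD_POINTS = 19 * 19      # intersections on a standard Go board
STATES_PER_POINT = 3        # empty, black stone, or white stone

upper_bound = STATES_PER_POINT ** BOARD_POINTS
digits = len(str(upper_bound))
print(f"Up to 3^{BOARD_POINTS} arrangements, a number with {digits} digits")
```

No program can examine that many arrangements one by one, so Go software has to search selectively, which is what makes it a demanding test of a processor.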

The benchmark also features an artificial intelligence chess simulator that forces the system to play against itself, thus exercising memory, processors, and the ability to execute complex logic quickly – much like today’s artificial intelligence and deep analytic software.

“Video compression” is a video encoding program that is a common workload for any organization processing video. It also relates to workloads that require strong per-core performance and memory throughput. “Discrete Event Simulation” is a program that simulates a large Ethernet campus network. This is a transaction processing type problem where the software handles the network packets and ensures that they are directed to the right place along the most efficient route.
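For readers unfamiliar with discrete event simulation, the core idea is a queue of timestamped events that is always processed in time order, where handling one event can schedule others. The sketch below is a minimal illustration of that idea only; the packet names, the 0.5-second forwarding delay, and the `simulate` function are hypothetical and far simpler than the network model the benchmark runs.

```python
import heapq


def simulate(events):
    """Process (time, description) events in time order and return the log."""
    queue = list(events)
    heapq.heapify(queue)                 # priority queue ordered by timestamp
    log = []
    while queue:
        now, what = heapq.heappop(queue)
        log.append((now, what))
        # A packet arriving at a switch schedules its own forwarding event.
        if what.startswith("arrive"):
            packet = what.split()[1]
            heapq.heappush(queue, (now + 0.5, f"forward {packet}"))
    return log


# Hypothetical packet arrivals, in seconds, given out of order on purpose.
log = simulate([(2.0, "arrive B"), (1.0, "arrive A")])
for when, what in log:
    print(f"t={when:.1f}s  {what}")
```

A real simulation handles millions of such events, so performance depends heavily on fast integer logic and memory access rather than on floating point math.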

Next, let’s look at SPECfp_rate2006, which is a set of 17 different scientific applications, each run as multiple simultaneous copies to arrive at the benchmark score. Again, almost all businesses run many tasks at once on their systems, much as the SPECfp_rate2006 test does.

What Does SPECfp_rate2006 Test?

SPECfp_rate2006 tests the floating point performance of a system’s CPU and memory. This is very important for organizations such as financial services firms, and is increasingly important to any organization that is running enterprise analytics and Big Data workloads.

The sidebar includes an excerpt from the SPECfp_rate2006 website that shows the various application areas tested, along with a brief description. You can find the complete table here.

Sidebar: CFP2006 (Floating Point Components of SPEC CPU2006)

These are some highly complex workloads that severely stress the CPU and memory of a system. As with the SPECint_rate2006 benchmark, there are also parallels to business processing (particularly Big Data and AI), but they’re a bit more difficult to see, since the descriptions are couched in scientific terms.

Linear regression and its variants are techniques that many of the scientific programs above use to build accurate predictive models and quantify relationships between variables. What do we mean by this?

A quick example would be if you were tasked with predicting house prices in a particular neighborhood. Using regression, you would come up with a variety of factors that might influence housing prices, like the crime rate, quality of schools, income level in the town, plus a bunch of other variables. You would then put the historical measures of these variables – along with historical home prices – into the regression model.

After a lot of mathematical operations, the model would tell you the relative impact that changes in each variable have on home prices.

We might find out that changes in income levels account for 40% of the change in home prices, while changes in the crime rate account for only 10%. It is a very powerful tool that almost every enterprise uses in one analytical tool or another.
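The house price example can be sketched in a few lines. This is a minimal ordinary least squares fit written in plain Python, not the method used by any of the benchmark programs. All of the data is made up: the prices are generated from a known rule (100 + 3 x income - 2 x crime) so that you can check that the fit recovers it. The `solve` and `fit` helpers are illustrative names.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                     # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit(rows, targets):
    """Ordinary least squares with an intercept: solve (X^T X) beta = X^T y."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, targets)) for i in range(k)]
    return solve(xtx, xty)


# Made-up data: (median income in $k, crime incidents per 1,000 residents).
neighborhoods = [(50, 30), (60, 25), (70, 28), (80, 15), (90, 20), (100, 10)]
# Synthetic prices in $k, built from a known rule so the fit can be checked.
prices = [100 + 3.0 * inc - 2.0 * crime for inc, crime in neighborhoods]

intercept, per_income, per_crime = fit(neighborhoods, prices)
print(round(intercept, 6), round(per_income, 6), round(per_crime, 6))
```

Every step here is floating point multiplication, addition, and division. With thousands of variables and millions of records instead of six neighborhoods, the same arithmetic becomes exactly the kind of load that a floating point benchmark measures.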

Big Data and analytics workloads in particular, along with AI, have a lot in common with scientific processing. All of these tasks depend on the ability of the system to support high levels of floating point calculation, the ability to utilize large amounts of memory efficiently, and high I/O capability.


What these benchmark results really show is that the Fujitsu SPARC M12 provides the highest performance while maintaining a balance between integer and floating point workloads. So whether your workload needs high integer performance for databases or business applications, or high floating point performance for enterprise analytics or Big Data processing, the Fujitsu SPARC M12 is an ideal choice. In addition, as the composition of your workloads shifts between the two, you can rest easy knowing that you’ll always have the right system for the job.
