Talos Secure Workstation

The world's first ATX-compatible, workstation-class mainboard for the IBM POWER8 processor.

Oct 25, 2016

Kernel Compilation Showdown

In this update we explore the relative performance of the Talos™ Secure Workstation and one of the most powerful libre-friendly, blob-free x86 machines available.

Background

The ASUS KGPE-D16 is likely the most powerful libre-friendly x86 machine available. When populated with two Opteron G34 Piledriver CPUs clocked at 3.2 GHz, it represents the performance ceiling for x86 workstation- and server-class machines that still respect your freedom. How does it compare against a low-end Talos™ configuration in real life? Let’s find out by measuring the wall time of something we’re all familiar with: a full Linux kernel compilation!

Test Machine Specifications

First, the detailed specifications of both test machines:

KGPE-D16
Talos™ Secure Workstation

Benchmarking

The benchmark is simple: compile a Linux kernel and all relevant modules. To avoid variance in compile times caused by differences in target driver support or optimization passes, we built for a POWER target on both machines; as the commands below show, the KGPE-D16 used a powerpc64le cross-compiler and the Talos™ used its native toolchain. Both machines used an up-to-date copy of Debian Stretch as the build environment. The entire Linux kernel source was loaded into a dedicated tmpfs ramdisk on both machines, and as much other CPU and memory activity as possible was quiesced before the benchmark started.

The commands used on each system were:

KGPE-D16
time CROSS_COMPILE=/usr/bin/powerpc64le-linux-gnu- ARCH=powerpc make -j16

Talos™ Secure Workstation
ppc64_cpu --smt=8
time make -j64
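
The job counts above appear to match the number of hardware threads on each machine (16 on the KGPE-D16, 64 on the Talos™ with SMT8 enabled); that reading is an inference from the commands, not something stated in the original text. As a minimal sketch for reproducing the build on other hardware, the hypothetical helper `build_invocation` below assembles the same `make` invocation and defaults the job count to whatever `os.cpu_count()` reports. It only builds the command; it does not run it.

```python
import os


def build_invocation(jobs=None, arch=None, cross_compile=None):
    """Return (env_overrides, argv) for a kernel build; nothing is executed.

    `jobs` defaults to the number of logical CPUs the OS reports, which is
    an assumption about how the -j values in the article were chosen.
    """
    if jobs is None:
        jobs = os.cpu_count() or 1
    env = {}
    if arch:
        env["ARCH"] = arch
    if cross_compile:
        env["CROSS_COMPILE"] = cross_compile
    return env, ["make", f"-j{jobs}"]


# The two invocations from the article, expressed through the helper.
kgpe_env, kgpe_argv = build_invocation(
    jobs=16, arch="powerpc", cross_compile="/usr/bin/powerpc64le-linux-gnu-"
)
talos_env, talos_argv = build_invocation(jobs=64)

print(kgpe_env, kgpe_argv)
print(talos_env, talos_argv)
```

To run the build for real, the result could be passed to `subprocess.run(argv, env={**os.environ, **env}, cwd=...)` inside the kernel tree and timed with `time.perf_counter()`.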

Results

KGPE-D16
real    12m30.152s
user    146m10.076s
sys     12m34.760s

Talos™ Secure Workstation
real    9m26.571s
user    296m42.477s
sys     28m41.248s

As you can see, even though the KGPE-D16 has double the core count, a higher overall TDP, boost clocks comparable to those of the POWER8, and is a much larger machine overall, the Talos™ Secure Workstation still compiles a kernel in about three quarters of the time required by the KGPE-D16 (9m26s versus 12m30s of wall time)! This represents real savings in developer time and electrical power consumed.
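
For readers who want to check the arithmetic, here is a small sketch that converts the `time` output above into seconds and derives the wall-time ratio. The `busy_threads` figure (CPU time divided by wall time) is our own derived metric, not something reported in the original results, and the helper names are hypothetical.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '12m30.152s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time stamp: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Figures copied from the Results section.
RESULTS = {
    "KGPE-D16": {"real": "12m30.152s", "user": "146m10.076s", "sys": "12m34.760s"},
    "Talos": {"real": "9m26.571s", "user": "296m42.477s", "sys": "28m41.248s"},
}


def summarize(results):
    """Return wall time, total CPU time, and average busy threads per machine."""
    summary = {}
    for name, times in results.items():
        real = to_seconds(times["real"])
        cpu = to_seconds(times["user"]) + to_seconds(times["sys"])
        summary[name] = {"real_s": real, "cpu_s": cpu, "busy_threads": cpu / real}
    return summary


summary = summarize(RESULTS)
wall_ratio = summary["Talos"]["real_s"] / summary["KGPE-D16"]["real_s"]

for name, row in summary.items():
    print(f"{name}: {row['real_s']:.1f} s wall, {row['busy_threads']:.1f} busy threads")
print(f"Talos wall time / KGPE-D16 wall time = {wall_ratio:.3f}")
```

The ratio comes out at roughly 0.755, which is where the "three quarters" figure comes from. Note that the Talos™ run accumulates more total CPU time because its 64 SMT threads share eight cores, so CPU time alone is not a fair comparison between the two machines.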

STREAM Benchmarks

Why is Talos™ so much faster than the KGPE-D16? Much of the increase is directly related to the more powerful cores of the newer POWER8 processor combined with the large caches and fast memory system of the POWER machines. To further explore the contribution of the memory subsystem to the performance differential, we ran STREAM memory bandwidth tests on both of these test machines:

KGPE-D16
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           14129.6     0.049492     0.045295     0.055945
Scale:          15236.6     0.048226     0.042004     0.056949
Add:            15078.2     0.073871     0.063668     0.088305
Triad:          15466.7     0.071833     0.062069     0.088462

Talos™ Secure Workstation
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           36279.5     0.018876     0.017641     0.022749
Scale:          37932.3     0.017997     0.016872     0.021693
Add:            41974.5     0.024270     0.022871     0.026481
Triad:          43338.9     0.023190     0.022151     0.023880

A word of caution: these results exaggerate the memory bandwidth limitations of the Opteron CPUs, because only four of the eight possible memory channels were populated on the KGPE-D16. However, even assuming the Opteron's bandwidth doubles with the remaining four channels populated (which is very unlikely), the Talos™ mainboard would still lead by roughly 25 to 40 percent in this best-case streaming memory benchmark.
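
The best-case comparison can be worked through directly from the tables. The sketch below computes the measured Talos™ to KGPE-D16 bandwidth ratio for each STREAM function, then repeats the calculation with the Opteron figures doubled. The doubling is the deliberately generous assumption from the paragraph above, not a measurement, and `bandwidth_ratios` is a hypothetical helper.

```python
# Best-rate figures (MB/s) copied from the STREAM tables.
STREAM_MBPS = {
    "KGPE-D16": {"Copy": 14129.6, "Scale": 15236.6, "Add": 15078.2, "Triad": 15466.7},
    "Talos": {"Copy": 36279.5, "Scale": 37932.3, "Add": 41974.5, "Triad": 43338.9},
}


def bandwidth_ratios(rates, opteron_scale=1.0):
    """Return Talos / KGPE-D16 bandwidth per function.

    `opteron_scale` multiplies the KGPE-D16 figures; 2.0 models the
    hypothetical (and optimistic) case of perfect scaling to eight channels.
    """
    return {
        fn: rates["Talos"][fn] / (rates["KGPE-D16"][fn] * opteron_scale)
        for fn in rates["Talos"]
    }


measured = bandwidth_ratios(STREAM_MBPS)
best_case = bandwidth_ratios(STREAM_MBPS, opteron_scale=2.0)

for fn in measured:
    print(f"{fn:6s} measured {measured[fn]:.2f}x, Opteron doubled {best_case[fn]:.2f}x")
```

Even in the doubled case every ratio stays above 1.2, so the conclusion does not depend on which STREAM function is used.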

Conclusions

From the test results shown, Talos™ is a significant upgrade from existing libre-friendly, owner-controlled machines. Talos™ will accelerate development and lower ongoing costs, all while preserving owner control.

Conspicuously absent from these tests are Intel processors released concurrently with or subsequent to POWER8. We specifically chose not to compare Talos™ against Xeon® systems because they are not libre-friendly, require network-enabled signed blobs to run continuously during system operation, and otherwise require compromising security and owner control if used as an upgrade from an existing blob-free system such as the KGPE-D16. You can take the raw numbers from our benchmarks above and run the same tests on Xeon® hardware if desired. In general, Xeon® is on par with POWER8, but because of the aforementioned issues with owner control and freedom, Xeon® systems do not present a viable upgrade path for many use cases.

