Many aspects of the PowerPC 970 (G5) architecture have been covered
in depth since Apple's very welcome announcement of the Power Mac G5 at WWDC,
including greatly enhanced bandwidth and 64-bit addressing. There are
some unique aspects of the PPC 970 processor that differentiate it from
its competition and make Apple's application benchmark results quite
plausible.
FMAC Magic
In some ways, Apple actually understates the floating-point
performance of the G5. One of the most common functions used in 3D
rendering and games, Photoshop filters, and other media applications is
the "Multiply Accumulate." Basically, two numbers are multiplied
together and added to a third.
A Pentium 4 requires a minimum of two instructions to complete this
operation: one multiply and one addition.
A PowerPC 970 can perform this common function in one instruction.
Because the PowerPC 970 has two floating-point units, it can perform
two FMAC functions per clock cycle (maximum theoretical throughput).
This is four times the throughput of a Pentium 4.
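To make the instruction counts concrete, here is a minimal Python sketch of a dot product written as a chain of multiply-accumulate steps. The function names are illustrative only, and the counting helper simply restates the one-versus-two instruction figures given above; Python itself does not emit fused instructions, so this models the arithmetic rather than the hardware.

```python
def dot_product(xs, ys):
    """Dot product expressed as repeated multiply-accumulate steps."""
    acc = 0.0
    for x, y in zip(xs, ys):
        # One multiply-accumulate: two numbers multiplied, added to a third.
        acc = x * y + acc
    return acc


def instructions_needed(n_steps, has_fmac):
    """Instructions issued for n_steps multiply-accumulates.

    Assumption: one instruction per step when a fused multiply-accumulate
    exists, and two (one multiply, one add) when it does not.
    """
    return n_steps if has_fmac else 2 * n_steps


if __name__ == "__main__":
    print(dot_product([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
    print(instructions_needed(3, has_fmac=True))
    print(instructions_needed(3, has_fmac=False))
```

Every element of the dot product is one multiply-accumulate, which is why rendering and filter code built on such loops benefits so directly from a fused instruction.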
What about SSE-2?
Intel added a series of vector instructions called SSE-2 to enhance
the floating-point performance of the Pentium 4. SSE-2 does not add a
Multiply Accumulate instruction to the architecture, but rather it
allows the Pentium 4 to perform "packed" operations. This means that it
can perform two of the same operation per clock cycle. For example, it
could perform two additions or two multiplications per clock cycle
(maximum theoretical throughput). In a best case, this would yield half
of the performance of the PowerPC 970 when performing Multiply
Accumulate functions.
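The "four times" and "half" figures follow from simple arithmetic on peak rates. The sketch below is a back-of-the-envelope model, not a measurement: the helper name and the per-clock inputs are taken from the theoretical maximums quoted in this article.

```python
def fmac_per_clock(ops_per_clock, ops_per_fmac):
    """Peak multiply-accumulates per clock.

    ops_per_clock: floating-point instructions completed per clock (peak)
    ops_per_fmac:  instructions needed for one multiply-accumulate
    """
    return ops_per_clock / ops_per_fmac


# Figures as stated in the article (theoretical maximums).
g5 = fmac_per_clock(ops_per_clock=2, ops_per_fmac=1)         # two FPUs, fused instruction
p4_scalar = fmac_per_clock(ops_per_clock=1, ops_per_fmac=2)  # separate multiply and add
p4_sse2 = fmac_per_clock(ops_per_clock=2, ops_per_fmac=2)    # packed, but still unfused

if __name__ == "__main__":
    print(g5 / p4_scalar)  # G5 advantage over scalar Pentium 4 code
    print(p4_sse2 / g5)    # best-case SSE-2 throughput relative to the G5
```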
Precisely
The performance comparison above concerns double precision
arithmetic, a computationally intensive level of precision that is
required for scientific calculations, ray tracing, and other similar
applications.
|  | G5 | G4 | Xeon |
|---|---|---|---|
| Floating Point Units | 2 | 1 | 1 |
| Double Precision Operations | 2/clock | 1/clock | 1/clock or 2/clock (SSE-2) |
| FMAC Throughput | 2/clock | 1/clock | 0.5/clock or 1/clock (SSE-2) |
What about Single Precision?
Single precision is often used in games, filters, and other
applications where less precise results are required. The FMAC
advantage of the G5 continues to hold true in this scenario. The
floating-point units of the G5 have the same ideal throughput for
single precision FMAC calculations: two per clock cycle. The Velocity
Engine has a peak throughput of four single precision FMACs per clock
cycle. Again, this is double the single precision throughput of the
Pentium 4.
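As a rough illustration of what "four single precision FMACs per clock" means, the following Python sketch emulates one vector multiply-add across four lanes. The name `vec_madd_4` is hypothetical and the list-based lanes are only a stand-in for a 128-bit vector register holding four single precision values; real Velocity Engine code would use the vector unit directly.

```python
LANES = 4  # assumption: four single precision values per vector register


def vec_madd_4(a, b, c):
    """Emulate one vector multiply-add: a[i] * b[i] + c[i] in every lane."""
    if not (len(a) == len(b) == len(c) == LANES):
        raise ValueError("each operand must hold exactly four values")
    # On vector hardware all four lanes are computed by a single instruction;
    # here they are simply computed one after another.
    return [x * y + z for x, y, z in zip(a, b, c)]


if __name__ == "__main__":
    print(vec_madd_4([1.0, 2.0, 3.0, 4.0],
                     [2.0, 2.0, 2.0, 2.0],
                     [1.0, 1.0, 1.0, 1.0]))
```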
Why focus on the FMAC?
It is unfair to focus on one instruction when comparing processor
performance. My point in illustrating the difference is this: Apple
chose very specific applications that flex the floating-point and
vector muscle of the G5. The G5 will not exhibit such efficient
performance in many applications. Not all floating-point operations can
be combined into Multiply-Accumulate operations; when they cannot, the
multiplication or addition must be computed on its own. Furthermore,
division considerably slows the floating-point units of the PowerPC 970.
The SPEC scores for G5 reflect this reality quite well.
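How much the FMAC helps depends on how many multiplies and adds can actually be paired. The sketch below is a simplified counting model under stated assumptions: every fused pair saves exactly one instruction, and division, memory stalls, and scheduling are ignored. The function names are illustrative.

```python
def instructions_issued(multiplies, adds, fused_pairs):
    """Instructions needed when fused_pairs multiply/add pairs become FMACs."""
    if fused_pairs < 0 or fused_pairs > min(multiplies, adds):
        raise ValueError("cannot fuse more pairs than there are multiplies or adds")
    # Each fused pair replaces two instructions with one.
    return multiplies + adds - fused_pairs


def ops_per_instruction(multiplies, adds, fused_pairs):
    """Arithmetic operations completed per instruction issued."""
    total_ops = multiplies + adds
    return total_ops / instructions_issued(multiplies, adds, fused_pairs)


if __name__ == "__main__":
    print(ops_per_instruction(100, 100, 100))  # every operation pairs up
    print(ops_per_instruction(100, 100, 50))   # half of the pairs fuse
    print(ops_per_instruction(100, 100, 0))    # nothing fuses
```

In this model the benefit ranges from a full doubling when everything fuses down to none at all, which is the spread the paragraph above describes.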
Other Neat Tricks
The 970 has a unique vector-permute instruction that gives it an
advantage in decryption and similar algorithms. This and its many
data-manipulation instructions allow the processor to move bits around
inside of a value with aplomb. Also, highly efficient fast Fourier
transform (FFT) algorithms have been written for the Velocity Engine. Fourier
transforms are essential to a number of visual processing
applications.
The Bottom Line
So what does it all mean? It is very difficult to compare two
different architectures, but one very important thing should be clear
for Mac users. The Power Mac G5 and the 970 processor are built for the
applications we love to use: Photoshop, Final Cut, Shake, Maya,
etc.
Almost all applications will see an appreciable boost in speed from
the increased system bandwidth and clock speed of the G5, but only some
of them will be PC scorchers. The PC has advantages in certain
applications as well, and I am sure many of these will come to light in
the near future.
It is clear, however, that Apple chose wisely when staging their
cook-off at this year's WWDC.