My Turn

Secrets of the PowerPC 970: Why the G5 Runs So Fast

Chris Lozaga - 2003.07.01

My Turn is Low End Mac's column for reader-submitted articles. It's your turn to share your thoughts on all things Mac (or iPhone, iPod, etc.) and write for the Mac web. Email your submission to Dan Knight.

Many aspects of the PowerPC 970 (G5) architecture have been covered in depth since Apple's very welcome announcement of the Power Mac G5 at WWDC, including its greatly enhanced bandwidth and 64-bit addressing. Less widely discussed are some unique aspects of the PPC 970 that differentiate it from its competition and make Apple's application benchmark results quite plausible.

FMAC Magic

In some ways, Apple actually understates the floating-point performance of the G5. One of the most common operations in 3D rendering, games, Photoshop filters, and other media applications is the "Multiply Accumulate." Two numbers are multiplied together, and the product is added to a third (a × b + c).

A Pentium 4 requires a minimum of two instructions to complete this operation: one multiply and one addition.

A PowerPC 970 can perform this common operation in one instruction. Because the PowerPC 970 has two floating-point units, it can complete two FMAC operations per clock cycle (maximum theoretical throughput). A Pentium 4 using its standard floating-point unit needs two clock cycles for each multiply-accumulate, so the PowerPC 970 has four times its throughput per clock.
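The multiply-accumulate pattern is easiest to see in a dot product, where every step is a multiply followed by an add. The sketch below is illustrative Python, not PowerPC code. The helper names `multiply_accumulate`, `dot`, and `instruction_count` are made up for this example, and the instruction counts assume the simple model described above (one instruction per FMAC, two without it).

```python
def multiply_accumulate(a, b, c):
    """Return a * b + c, the operation an FMAC instruction performs in one step.

    Note: this Python expression rounds twice; a hardware fused
    multiply-add rounds only once, so low-order bits can differ.
    """
    return a * b + c


def dot(xs, ys):
    """Dot product written as a chain of multiply-accumulates."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = multiply_accumulate(x, y, acc)
    return acc


def instruction_count(n, fused):
    """Floating-point instructions for an n-element dot product.

    With a fused multiply-accumulate each element costs one instruction;
    without it, each element costs a multiply plus an add.
    """
    return n if fused else 2 * n


print(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))   # 32.0
print(instruction_count(1000, fused=True))      # 1000
print(instruction_count(1000, fused=False))     # 2000
```

Halving the instruction count is only part of the story; the second floating-point unit is what doubles the rate again.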

What about SSE-2?

Intel added a series of vector instructions called SSE-2 to enhance the floating-point performance of the Pentium 4. SSE-2 does not add a Multiply Accumulate instruction to the architecture. Instead, it allows the Pentium 4 to perform "packed" operations, meaning two of the same operation per clock cycle: two additions or two multiplications (maximum theoretical throughput). In the best case, one packed multiply followed by one packed add completes two multiply-accumulates in two clock cycles, or one per clock. That is half the performance of the PowerPC 970 on Multiply Accumulate operations.
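To make the "packed" idea concrete, here is a small Python model of two-lane packed arithmetic. It is a sketch only: the function names are invented for illustration, tuples stand in for 128-bit registers holding two doubles, and it assumes one packed instruction per clock as described above.

```python
def packed_mul(a, b):
    """One packed instruction: multiply both lanes at once."""
    return (a[0] * b[0], a[1] * b[1])


def packed_add(a, b):
    """One packed instruction: add both lanes at once."""
    return (a[0] + b[0], a[1] + b[1])


def packed_multiply_accumulate(a, b, c):
    """Two multiply-accumulates built from two packed instructions."""
    product = packed_mul(a, b)     # instruction 1
    return packed_add(product, c)  # instruction 2


result = packed_multiply_accumulate((2.0, 3.0), (4.0, 5.0), (1.0, 1.0))
print(result)  # (9.0, 16.0)

# Two results from two instructions: one multiply-accumulate per clock.
results_per_clock = len(result) / 2
print(results_per_clock)  # 1.0
```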

Precisely

The performance comparison above concerns double-precision (64-bit) arithmetic. This computationally intensive level of precision is required for scientific calculations, ray tracing, and other similar applications.

| | G5 | G4 | Xeon |
|---|---|---|---|
| Floating Point Units | 2 | 1 | 1 |
| Double Precision Operations | 2/clock | 1/clock | 1/clock or 2/clock (SSE-2) |
| FMAC Throughput | 2/clock | 1/clock | 0.5/clock or 1/clock (SSE-2) |
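The ratios quoted in the text follow directly from the FMAC Throughput figures above. The short Python sketch below just does that arithmetic; the dictionary and the `g5_advantage` helper are illustrative names, and the figures are per clock cycle only, so they say nothing about differences in clock speed.

```python
from fractions import Fraction

# Peak FMACs per clock, copied from the table above.
FMAC_PER_CLOCK = {
    "G5": Fraction(2),
    "G4": Fraction(1),
    "Xeon": Fraction(1, 2),
    "Xeon (SSE-2)": Fraction(1),
}


def g5_advantage(other):
    """How many times more FMACs per clock the G5 completes than `other`."""
    return FMAC_PER_CLOCK["G5"] / FMAC_PER_CLOCK[other]


for name in ("G4", "Xeon", "Xeon (SSE-2)"):
    print(f"G5 vs {name}: {g5_advantage(name)}x per clock")
# G5 vs G4: 2x per clock
# G5 vs Xeon: 4x per clock
# G5 vs Xeon (SSE-2): 2x per clock
```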

What about Single Precision?

Single precision is often used in games, filters, and other applications where less precise results are acceptable. The FMAC advantage of the G5 continues to hold true in this scenario. The floating-point units of the G5 have the same ideal throughput for single-precision FMAC calculations: two per clock cycle. The Velocity Engine has a peak throughput of four single-precision FMACs per clock cycle. Again, this is double the single-precision throughput of the Pentium 4.

Why focus on the FMAC?

It is unfair to focus on one instruction when comparing processor performance. My point in illustrating the difference is this: Apple chose very specific applications that flex the floating-point and vector muscle of the G5. The G5 will not exhibit such efficient performance in many applications. Not all floating-point operations can be combined into Multiply-Accumulate operations; a lone multiplication or addition must be computed on its own and gains nothing from the FMAC instruction. Furthermore, division slows down the floating-point performance of the PowerPC 970 considerably. The SPEC scores for the G5 reflect this reality quite well.
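How much the FMAC helps depends on how many multiply/add pairs in a given piece of code can actually be fused. The following Python sketch is a deliberately crude model of that point: the function names are hypothetical, every instruction is assumed to cost the same, and division, latency, and memory effects are ignored.

```python
def fp_instructions(multiplies, adds, fusible_pairs):
    """Instructions needed when `fusible_pairs` multiply/add pairs fuse into FMACs."""
    if fusible_pairs > min(multiplies, adds):
        raise ValueError("cannot fuse more pairs than there are multiplies or adds")
    # Each fused pair replaces two instructions with one.
    return multiplies + adds - fusible_pairs


def fmac_speedup(multiplies, adds, fusible_pairs):
    """Instruction-count ratio of unfused code to code using FMACs."""
    unfused = multiplies + adds
    return unfused / fp_instructions(multiplies, adds, fusible_pairs)


print(fmac_speedup(100, 100, 100))           # 2.0   every pair fuses
print(round(fmac_speedup(100, 100, 25), 2))  # 1.14  only a quarter fuse
print(fmac_speedup(0, 200, 0))               # 1.0   additions only, no gain
```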

Other Neat Tricks

The 970 has a unique vector-permute instruction that gives it an advantage in decryption and similar algorithms. This and its many data-manipulation instructions allow the processor to rearrange the bytes and bits inside a value with aplomb. Also, highly efficient fast Fourier transform (FFT) algorithms have been written for the Velocity Engine. Fourier transforms are essential to a number of visual processing applications.
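For readers curious what a vector permute does, here is a Python model of the behavior as I understand the AltiVec definition: each byte of a control vector uses its low five bits to pick one byte from the 32 bytes formed by two source vectors. The function name `vperm` mirrors the instruction mnemonic, but this is a sketch for illustration, not Velocity Engine code.

```python
def vperm(a: bytes, b: bytes, control: bytes) -> bytes:
    """Model of a 16-byte vector permute.

    Each control byte's low 5 bits select one byte from the
    32-byte concatenation of `a` and `b`.
    """
    if not (len(a) == len(b) == len(control) == 16):
        raise ValueError("all three vectors must be 16 bytes")
    table = a + b
    return bytes(table[c & 0x1F] for c in control)


a = bytes(range(16))        # 00 01 ... 0f
b = bytes(range(16, 32))    # 10 11 ... 1f

# Reverse the byte order of `a` in a single permute.
reverse = bytes(range(15, -1, -1))
print(vperm(a, b, reverse).hex())
# 0f0e0d0c0b0a09080706050403020100
```

Because the control vector is ordinary data, the same instruction can shuffle, interleave, or look up bytes in a small table, which is why it suits cipher-style byte substitution.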

The Bottom Line

So what does it all mean? It is very difficult to compare two different architectures, but one very important thing should be clear for Mac users. The Power Mac G5 and the 970 processor are built for the applications we love to use: Photoshop, Final Cut, Shake, Maya, etc.

Almost all applications will see an appreciable boost in speed from the increased system bandwidth and clock speed of the G5, but only some of them will be PC scorchers. The PC has advantages in certain applications as well, and I am sure many of these will come to light in the near future.

It is clear, however, that Apple chose wisely when staging their cook-off at this year's WWDC.
