My Turn

Cross-Platform Benchmarking

Be Careful What You Wish For

Peppermint Pademelon - 2001.09.10

My Turn is Low End Mac's column for reader-submitted articles. It's your turn to share your thoughts on all things Mac (or iPhone, iPod, etc.) and write for the Mac web. Email your submission to Dan Knight.

There have been a number of articles published on Low End Mac discussing the "Megahertz Myth" and how nice it would be if there were cross-platform benchmarks that would really let the truth be told. I'm here to suggest that perhaps the Macintosh community might want to be cautious about pushing for such things, unless it's willing to accept the consequences. Although I'm not saying that this would necessarily be the case, I do think there's a distinct possibility that should some sort of industrywide standard for personal computer benchmarking be adopted, the Macintosh could lose - possibly badly.

The enthusiasm for the idea of getting some numbers to "show how badly the G4 kicks Intel's butt" is natural, given the lovely Photoshop and movie rendering demos that Steve Jobs rolls out every Macworld. But an industrywide benchmark would consist of a lot more than just a very carefully chosen set of Photoshop filters.

There already exist other benchmarks that show the Mac platform in a considerably less rosy light. Well-known examples, which are usually pooh-poohed by Mac enthusiasts, are game frame-rate measurements. The Mac usually loses these quite badly, sometimes by an order of magnitude compared to similarly priced x86 hardware. The excuse usually offered up is that the Macintosh version of a given game is a sloppy port and poorly optimized compared to the PC version. Fair enough, but it is a double standard to dismiss the game benchmarks you lose while embracing the Photoshop wins. After all, the possibility certainly exists that the x86 version of Photoshop is sub-optimal compared to the PPC version, does it not?

For some actual numbers to chew on, I invite you to read this interesting benchmark: http://homepage.mac.com/nopea1/benchmark/ To summarize briefly, this page presents the results of a series of cross-compilation benchmarks performed on both x86 and Macintosh machines running Linux. For comparison's sake, the author of these tests used the results to generate a "bogohurts" rating for every machine he tested. The bogohurts number roughly states how fast an Intel PIII, which was used as the baseline, would have to be running to produce the same results. So how did the Macs do?

The 533 MHz G4 tower scored between 587 and 647 bogohurts, which means it was roughly as fast as a 600 MHz PIII. The 450 MHz G3 iMac scored between 435 and 475 bogohurts, which means it was the same speed, MHz for MHz, as a PIII would have been. That's pretty far from the "up to twice as fast" so often quoted in Apple literature and on the Web.
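The MHz-for-MHz comparison above is easy to check by dividing each bogohurts score by the machine's actual clock speed. Here is a minimal Python sketch of that arithmetic; the function name `per_clock_efficiency` is an illustrative choice, and the only data used are the score ranges quoted above.

```python
def per_clock_efficiency(bogohurts, clock_mhz):
    """Return PIII-equivalent MHz delivered per actual MHz of clock speed."""
    return bogohurts / clock_mhz

# Score ranges quoted in the text: (low, high) bogohurts, then clock speed in MHz.
machines = {
    "G4 tower, 533 MHz": ((587, 647), 533),
    "G3 iMac, 450 MHz": ((435, 475), 450),
}

for name, ((low, high), clock) in machines.items():
    low_eff = per_clock_efficiency(low, clock)
    high_eff = per_clock_efficiency(high, clock)
    print(f"{name}: {low_eff:.2f}x to {high_eff:.2f}x a PIII per MHz")
```

On these figures the G4 comes out roughly 10% to 21% ahead of a PIII per clock, and the G3 lands within a few percent of it either way.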

You could undoubtedly find holes to pick in the above benchmark, but the fact remains that the methodology used to produce it was well documented, unlike Apple's Photoshop tests. It used open-source code, the same code across both platforms, compiled with the same compiler Apple uses to make OS X, not a proprietary application having unknowable differences between platforms. Finally, it concentrated on integer-heavy operations, which is a more realistic depiction of how most people use their computers. So I'm going to use it as a reference as I start drifting off into fantasy below.


Just for fun, let's imagine some industrywide benchmark similar to the bogohurts rating were adopted, except that it's boiled down to a PIII efficiency rating, which we'll call a P3mark. Using it, a machine's clock speed in MHz is multiplied by a fudge factor that represents how efficient it is relative to the Pentium III. For example, if a given processor is 20% more efficient than a PIII and runs at 500 MHz, then it gets a "P3fudge" of 1.2 and a "P3mark" of 600. Further, let's split integer and floating point performance, so we'll have "IP3marks" and "FP3marks," in order to have more numbers to play with. A CPU's IP3marks and FP3marks would be determined by integer and floating-point instruction mix benchmark sets, which would be as similar and fair as possible across CPU platforms. They might allow the manufacturer to tweak the code slightly in the case of processors such as the G4 or P4, which require special optimizations in order to perform well, but as closely as possible each CPU should be required to do a similar amount of work. A common OS, such as Linux or BSD, would be used to bootstrap the tests, but beyond that they'd be as OS-agnostic as possible.
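The rating described above is nothing more than a multiplication, which a few lines of Python make concrete. This is only a sketch of the made-up scheme; the function name `p3mark` is illustrative and does not refer to any real benchmark tool.

```python
def p3mark(clock_mhz, fudge):
    """Scale a clock speed by the chip's efficiency relative to a Pentium III."""
    return clock_mhz * fudge

# The example from the text: a 500 MHz chip that is 20% more efficient than a PIII.
print(f"P3mark: {p3mark(500, 1.2):.0f}")
```

The IP3mark and FP3mark would use the same function, just with separate integer and floating-point fudge factors.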

Now let's suppose we ended up with P3marks something like the following table. I came up with these after browsing benchmark results for various x86 CPUs and making rough guesses for the Motorola chips, based on my personal experience with them (I own an iMac and use a G4 tower at work), the Linux benchmark above, and, yes, Photoshop results.

CPU            IP3fudge  FP3fudge
Intel PIII     1.0       1.0
Intel Celeron  0.9       0.9
Intel P4       0.75      0.9
AMD Athlon     1.1       1.2
AMD Duron      1.0       1.1
Motorola G4    1.2       2.5
Motorola G3    1.1       1.5

Yes, there are undoubtedly things wrong with this table. However, I think it's pretty fair. I honestly don't believe the G4 is 2.5 times faster than a PIII of a given clock speed for most floating point operations, or even close to that, but I wanted to throw a bone to Apple. I'll stand by that 1.2 rating for integer performance. It's actually better than the benchmarks I pointed out would indicate, and it's roughly what the differences in average operations-per-clock for the two CPUs would make one expect.

So, if we pretend those fudge factors are right, what would that mean for the numbers you'd find printed on the box when you bought your computer? Let's start with a high-end comparison:

Machine               IP3mark  FP3mark  Total
Intel P4, 2 GHz       1500     1800     3300
AMD Athlon, 1.4 GHz   1540     1680     3220
Motorola G4, 867 MHz  1040     2167     3207

Yah! We've proved that the MHz Myth is true, and Apple's computers can compete head-on with anything out there, right? I mean, they're all practically tied, and the G4 did it with less than half the MHz rating of the P4. So all is happy, right?

Not so fast. Let's go back and run some numbers on the low end:

Machine                 IP3mark  FP3mark  Total
Intel Celeron, 800 MHz  720      720      1440
AMD Duron, 900 MHz      900      990      1890
Motorola G3, 500 MHz    550      750      1300

Ouch. The base model iMac and the iBook lose, and lose fairly badly, to x86 machines costing quite a bit less than they do. I imagine they would be a really hard sell to anyone who wasn't already a committed Mac user if numbers like that were staring the prospective computer buyer in the face. Even the 700 MHz top-of-the-line iMac doesn't quite tie with the 900 MHz Duron, and, face it, it costs about as much as a 1.4 GHz Athlon, against which it's utterly dwarfed. The 500 MHz PowerBook G4 just barely beats an 800 MHz Duron laptop, which similarly is much cheaper.
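All of the totals in the tables, and the extra comparisons in the paragraph above, come from multiplying clock speed by the fudge factors. The following Python sketch redoes that arithmetic. It assumes fractions of a mark are simply dropped, which is what the G4's FP3mark of 2167 suggests, and the names `FUDGE` and `marks` are illustrative only.

```python
from decimal import Decimal

# (integer factor, floating-point factor) per CPU, copied from the fudge-factor table.
# Stored as strings so Decimal keeps them exact.
FUDGE = {
    "Intel PIII": ("1.0", "1.0"),
    "Intel Celeron": ("0.9", "0.9"),
    "Intel P4": ("0.75", "0.9"),
    "AMD Athlon": ("1.1", "1.2"),
    "AMD Duron": ("1.0", "1.1"),
    "Motorola G4": ("1.2", "2.5"),
    "Motorola G3": ("1.1", "1.5"),
}

def marks(cpu, clock_mhz):
    """Return (IP3mark, FP3mark, total), dropping any fraction of a mark."""
    int_factor, fp_factor = (Decimal(f) for f in FUDGE[cpu])
    ip3 = int(int_factor * clock_mhz)
    fp3 = int(fp_factor * clock_mhz)
    return ip3, fp3, ip3 + fp3

# Machines from the two tables, plus the three extra ones mentioned in the prose.
machines = [
    ("Intel P4", 2000), ("AMD Athlon", 1400), ("Motorola G4", 867),
    ("Intel Celeron", 800), ("AMD Duron", 900), ("Motorola G3", 500),
    ("Motorola G3", 700), ("Motorola G4", 500), ("AMD Duron", 800),
]

for cpu, clock in machines:
    ip3, fp3, total = marks(cpu, clock)
    label = f"{cpu}, {clock} MHz"
    print(f"{label:24} {ip3:5} {fp3:5} {total:5}")
```

Run as written, this gives the 700 MHz G3 a total of 1820 against the 900 MHz Duron's 1890, and the 500 MHz G4 a total of 1850 against the 800 MHz Duron's 1680, which is where "doesn't quite tie" and "just barely beats" come from.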

I think you get the idea. If someone were wandering through a store with X many dollars in their pocket to spend and all the boxes to choose from had directly comparable numbers on them, you can bet they'd buy the box that had the biggest number per dollar ratio. And that box isn't going to have a Macintosh in it.

I'd personally like to know what the truth is in the whole x86 vs. PPC speed wars, but I don't think it's in Apple's interest for that question to be answered once and for all. I'm sure True Believers out there will find flaws in my numbers and disagree, but I really and truly think that if x86 machines and Macintoshes were reduced to numbers based solely on the performance of the hardware, the numbers for the x86 side overall would be bigger. There's just too much technology being thrown at the problem on the x86 side for them not to win. You can certainly nitpick about how inelegant their solutions are, but the fact is they do work.

It'd probably be more useful in the fight to get people to consider the Macintosh a real computing option to focus upon issues where you stand a real chance of winning, and be able to keep winning. Even if by some fluke Motorola could manage to pull out an overall win on some magic benchmark test one day, you know that the next day both Intel and AMD would knock out something new, and honestly, Motorola's CPU division doesn't have the resources to fight that for long.

Sell people on the Macintosh experience. Make hay about how lousy Windows is. If you have to, resort to "Ooooh, Shiny" whilst pointing to the G4 PowerBook. Computers are overpowered today anyway, and that's just getting worse. If you start playing the benchmark game, then you end up contributing to the madness. You'll make Macs just as disposable and short-lived as Intel machines. It's already happening, to some extent.

Think Different, or something.
