RAM speed, bus speed, wait states, caches, and processor speed are all
interrelated in determining computer performance.
Way back when
Once upon a time computer motherboards ran at one speed: the speed
of the CPU. On a 1 MHz Apple II or Commodore 64, the memory and
everything else was tied to that 1 MHz speed. On early PCs, the 4.77
MHz speed of the CPU determined the speed of the rest of the
motherboard.
As processors got faster than that, they began to outrun memory. The
then-typical 150ns memory chips were fine at 6 MHz (1000 ÷ 150 ≈
6.7 MHz).
But the first Macs ran at 8 MHz, which was faster than 150ns memory
allowed. (The same held for 8 MHz and faster PCs.) Rather than use the
more expensive 120ns memory, which would support 8 MHz, designers added
wait states -- with one wait state, the CPU would wait one extra
cycle before reading the next data from memory.
Since the processor wasn't accessing memory every cycle, the
performance hit wasn't huge. And it let designers create more
affordable computers.
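To make the arithmetic concrete, here is a minimal Python sketch of the simplified model used in this article: memory with a given access time can be read at most once per access period, and a wait state is an extra CPU cycle spent waiting. The function names are made up for illustration, and real memory timing (refresh, precharge, page mode) is more involved than this.

```python
import math

def max_clock_mhz(access_ns):
    """Fastest clock (MHz) this memory can keep up with at zero wait states."""
    return 1000 / access_ns

def wait_states(clock_mhz, access_ns):
    """Extra cycles the CPU waits per memory access (simplified model)."""
    # Whole clock cycles needed to cover one memory access.
    cycles_needed = math.ceil(access_ns * clock_mhz / 1000)
    return max(0, cycles_needed - 1)

for clock_mhz, access_ns in [(6, 150), (8, 150), (8, 120)]:
    print(f"{clock_mhz} MHz CPU, {access_ns}ns memory: "
          f"{wait_states(clock_mhz, access_ns)} wait state(s)")
```

Under these assumptions the sketch reports no wait states at 6 MHz, one wait state at 8 MHz with 150ns chips, and none at 8 MHz with 120ns chips, which matches the figures above.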
Less waiting
As CPU speeds hit 12 MHz and beyond, two new schools emerged. One
sought the best possible performance by using memory fast enough to
run at zero wait states. This led to innovations including interleaved
memory and new design schemes for memory. It created some pretty
expensive computers.
The other school sought improved performance without inflated
prices. This is the group that brought the cache to the processor.
Sometimes as small as 16KB, the little bit of zero wait state memory in
the cache meant the processor could usually (80-90% of the time) get
the data it sought from the cache, not the motherboard. Each doubling
of cache size increased the odds of finding the data in the cache, but
also increased the cost of the cache and the computer.
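A rough way to see why a small cache helps so much is to average the cost of hits and misses. The sketch below assumes a cache hit costs one cycle and a trip to main memory costs three cycles (two wait states). Both figures, and the function name, are illustrative assumptions rather than measurements of any particular machine.

```python
def average_access_cycles(hit_rate, cache_cycles=1, memory_cycles=3):
    """Average cycles per memory access for a given cache hit rate."""
    miss_rate = 1 - hit_rate
    return hit_rate * cache_cycles + miss_rate * memory_cycles

# 0.0 stands for a machine with no cache at all.
for hit_rate in (0.0, 0.80, 0.90):
    cycles = average_access_cycles(hit_rate)
    print(f"hit rate {hit_rate:.0%}: {cycles:.2f} cycles per access")
```

With these assumed costs, an 80% hit rate cuts the average from 3 cycles to about 1.4, and 90% brings it to about 1.2. That is most of the benefit of zero wait state memory at a fraction of the cost.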
This was about the time 80ns and even 70ns memory started to become
common.
One of Apple's impressive experiments was the Macintosh IIfx,
introduced in 1990. It ran a 40 MHz 68030 CPU on a 40 MHz motherboard
with special 64-pin 80ns SIMMs and a 32KB level 2 (L2) cache on the
motherboard. Although it had to
access main memory with three wait states (1000 ÷ 80 = 12.5 MHz),
it could access the L2 cache at full speed.
Freeing the CPU from bus speed
The next performance enhancement was clock doubling, running the
inside of the processor at twice the speed of its external bus. The
Motorola 68040 CPU used clock doubling and an 8KB level 1 (L1) cache
right on the chip. This reduced the need for an L2 cache, but using an
L2 cache with the CPU's L1 cache improved performance further.
After clock doubling came tripling, quadrupling, and beyond. The
PowerPC 750 or G3 can run at up to 8x bus speed.
And that's become a real point of contention. Not how fast can the
G3 run, but how fast can the bus run.
The problem
Apple had the foresight to design the Power Mac 7500, 8500, and 9500
to accept a replacement CPU card. These cards sat in a bus that would
run as fast as 50 MHz (the speed varied based on the CPU installed). Of
course, 70ns memory only runs at 14 MHz, so an L2 cache was essential
for reasonable performance. (With a 50 MHz CPU, 70ns memory requires 3
wait states; with a 350 MHz CPU, well, you don't even want to think
about it.)
With 256 KB to 1 MB of L2 cache running at up to 50 MHz, a CPU as
fast as 150 MHz could access the cache at two wait states. To further
help performance, the CPU had either a 32KB or 64KB level 1 cache.
But a lot of things changed with the G3 processor. Perhaps the
biggest difference is support for an L2 cache that isn't tied to
motherboard speed. Using very fast, very expensive memory,
G3 cards can access their L2 cache at speeds up to 300 MHz, six times
faster than the motherboard L2 caches of the past. This is a big
contributor to G3 speed.
But there's a problem: G3 cards are more picky about bus speed and
often will not run at the 50 MHz speed the motherboard was designed
for. For the most part, 42-46 MHz bus speeds provide the best
stability, so many G3 cards are specifically designed to run at 45
MHz.
Why is that a problem? Up to 360 MHz, it isn't, since the G3 can run
at 8x bus speed. But now they're shipping 400 MHz G3 processors. If
your bus won't run reliably at 50 MHz, you can't reach 400 MHz. (Of
course, 360 MHz is only 10% slower than 400 MHz.)
- Update: Accelerate Your Mac has run the XLR8 400 MHz G3 reliably on a 50 MHz bus.
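The relationship between bus speed and top CPU speed is simple multiplication, which the following Python sketch spells out. It assumes the 8x ceiling described above. The function names are hypothetical, and the sketch ignores which intermediate multipliers a given card actually offers.

```python
def max_cpu_mhz(bus_mhz, max_multiplier=8):
    """Highest CPU clock reachable on a given bus at the top multiplier."""
    return bus_mhz * max_multiplier

def bus_needed_mhz(cpu_mhz, max_multiplier=8):
    """Slowest bus that still lets the CPU reach its rated clock."""
    return cpu_mhz / max_multiplier

for bus_mhz in (42, 45, 46, 50):
    print(f"{bus_mhz} MHz bus x 8 = {max_cpu_mhz(bus_mhz)} MHz CPU at most")

print(f"A 400 MHz G3 needs at least a {bus_needed_mhz(400):.0f} MHz bus")
```

On a 45 MHz bus the ceiling works out to 360 MHz, and a 400 MHz part needs the full 50 MHz, which is exactly where many G3 cards become unstable.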
On top of that, the motherboard memory only runs so fast and already
requires wait states between the processor card and RAM. Running at
42-46 MHz pretty much allows use of 70ns RAM at two wait states (to the
system bus). Going any faster would just mean adding wait states, since
RAM itself is only so fast.
It comes down to this: when your CPU has to get data from main
memory instead of the cache, it is limited to the 14 MHz throughput of
70ns memory (or maybe 16.6 MHz with 60ns RAM). Whether the bus is
running at 40 MHz, 50 MHz, or faster, it can't get the information at
bus speed - it must wait until motherboard memory can deliver it.
Moving to 60ns memory and optimizing the motherboard for faster RAM
might increase performance by a few percentage points, but bus speed is
not significant in comparison with the speed of main memory. Better to
add a larger L2 cache and reduce dependence on motherboard memory.
Put another way: you would need 2.5ns RAM to run a 400 MHz G3 at
full speed. Despite all the advances in computer technology, you'd be
hard pressed to find DIMMs faster than 60ns, which provides 16.6 MHz
performance.
- Yes, there is lightning fast static memory available, which is what
they use for the L2 cache. It has to be fast, since you need roughly 3ns
access to do a 1:1 cache with a 300 MHz CPU. But it's also incredibly costly, making
it impractical for use as main memory on a personal computer.
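The same 1000 ÷ speed arithmetic can be run in reverse to see how fast memory would have to be to keep pace with a given clock. This is a small sketch under the article's simplified one-access-per-cycle model, and the function name is an assumption for illustration.

```python
def required_access_ns(clock_mhz):
    """Access time (ns) memory needs to deliver data every clock cycle."""
    return 1000 / clock_mhz

for clock_mhz in (50, 300, 400):
    print(f"{clock_mhz} MHz needs {required_access_ns(clock_mhz):.1f}ns memory")

# How far 60ns DIMMs fall short of a 400 MHz CPU in this model.
shortfall = 60 / required_access_ns(400)
print(f"60ns RAM is {shortfall:.0f} times too slow for 400 MHz")
```

By this measure 60ns RAM is 24 times slower than a 400 MHz G3 would need, which is why the cache carries so much of the load.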
There's an incredible disparity between memory speed and processor
speed, one which will only grow as we move to 600 MHz, 1GHz, and
beyond.
Back to bus speed
There are three ways to address this disparity, each of which will
be used to some degree in future designs.
- Since the CPU operates at some multiple of bus speed, increasing
bus speed allows the same CPU to run more quickly. For instance, a G3
on a 45 MHz bus can only reach 360 MHz, but on the 66 MHz bus in the
current Macs it could reach 533 MHz. So despite the relatively slow
speed of motherboard memory, increasing the bus speed does allow
improved speed. (Apple is slowly moving toward 100 MHz
motherboards.)
- Since the CPU operates at some multiple of bus speed, increasing
that multiplier allows the CPU to run more quickly. There are rumors of
a possible 10x version of the G3, which could hit 450 MHz on the older
Power Macs and 666 MHz in the current models. It's more likely than not
that the G4 will include a 10x multiplier, and possibly 12x or higher.
Combined with a 100 MHz motherboard, 1GHz CPUs become a distinct
possibility.
- Since the L1 cache is faster than the L2 cache and the L2 cache is
a lot faster than motherboard memory, increasing cache size and
possibly adding an L3 cache between motherboard RAM and the L2 cache
will increase performance. This is harder to quantify than bus speeds
and multipliers, but increasing the L1 cache from 64KB to 256 KB would
reduce the number of times the CPU has to look to the L2 cache.
Allowing an L2 cache larger than 1 MB (as the AltiVec G4 does) will
likewise reduce the number of times the CPU has to look beyond the
cache. Adding an L3 cache at motherboard speed may reduce the number of
times main memory (at 16.6 MHz) has to be accessed, although there are
other performance tradeoffs at this level.
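The third approach is the hardest to put numbers on, but a rough Python sketch shows the shape of the tradeoff. Every hit rate and access time below is a made-up illustration, not a measurement. The model is also simplified: it charges each access only the time of the level that finally supplies the data, and ignores the time spent checking the levels that missed.

```python
def effective_access_ns(cache_levels, memory_ns):
    """Average access time for a stack of caches in front of main memory.

    cache_levels: list of (hit_rate, access_ns), fastest level first.
    """
    total_ns = 0.0
    reach = 1.0  # fraction of accesses that get as far as this level
    for hit_rate, access_ns in cache_levels:
        total_ns += reach * hit_rate * access_ns
        reach *= 1 - hit_rate
    # Whatever misses every cache goes to main memory.
    return total_ns + reach * memory_ns

# Hypothetical figures: (hit rate, access time in ns) per level.
l1 = (0.90, 2.5)
l2 = (0.80, 5.0)
l3 = (0.50, 20.0)   # assumed cache at motherboard speed
main_memory_ns = 60.0

without_l3 = effective_access_ns([l1, l2], main_memory_ns)
with_l3 = effective_access_ns([l1, l2, l3], main_memory_ns)
print(f"L1 + L2:      {without_l3:.2f}ns average")
print(f"L1 + L2 + L3: {with_l3:.2f}ns average")
```

With these invented figures the L3 cache trims the average only slightly, because so few accesses get past the L2 cache in the first place. That is consistent with the point that the gains at this level are real but harder to quantify.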
Back to the future
Whether Apple designs G3 models with a faster bus or IBM or Motorola
introduces a PowerPC 750 with a higher multiplier, we can be
confident that future Power Macs will have faster bus speeds, larger
caches, and CPUs with higher multipliers.
And memory will get faster, but slowly. In a year or two, maybe 50ns
memory will be the norm, allowing any access to motherboard memory at
20 MHz, not 16.6 MHz or slower.
Further reading
- Pipelines, MHz, latency, caches, and more, MacKiDo
- Newer Technology on CPU bus speed, Accelerate Your Mac
- Past, present, and future issues of G3 upgrade cards, Accelerate Your Mac
- Think twice about 400 MHz G3 daughter cards, MacCPU