After Moore’s Law: Why Multiple Processors Matter

1999: Transistor density has been doubling roughly every eighteen months like clockwork for 40 years now. That’s Moore’s Law in a nutshell.

Not only has chip density increased, but because of increasing density and an expanding market, prices for the same amount of memory or CPU power continually decrease. (Okay, there are glitches in memory pricing, but over time the trend is more for less.)

Bad news: It can’t go on forever.

In fact, as Roy Brander observes in “The End of Moore’s Law” on osOpinion, within ten years we may run up against the laws of physics. Below a certain feature size, electrons wander freely between circuits, so circuits must stay above that minimum size or they will not work reliably.

Exactly where that point lies – and where our ability to manufacture circuits ends up – is probably somewhere in the 0.10- to 0.03-micron range. (Today’s newest designs use 0.18-micron elements, and density grows with the square of the shrink, so we can still expect at least four times and possibly twenty to thirty times higher density.)
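
For the arithmetic-minded, here is a quick C sketch of that square-law calculation. The 0.18-micron baseline and the 0.10 and 0.03 end points come from the paragraph above; 0.09 microns is the figure cited later in this article:

    #include <stdio.h>

    /* Density gain from a process shrink: transistors per unit
     * area scale roughly as (old feature size / new)^2. */
    int main(void) {
        const double today = 0.18;                    /* microns */
        const double limits[] = { 0.10, 0.09, 0.03 }; /* possible end points */

        for (int i = 0; i < 3; i++) {
            double gain = (today / limits[i]) * (today / limits[i]);
            printf("%.2f micron -> %4.1fx the density of 0.18 micron\n",
                   limits[i], gain);
        }
        return 0;
    }

That works out to roughly 3x at 0.10 micron, 4x at 0.09, and 36x at 0.03 – the “four to twenty or thirty times” ballpark above.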

Even if we make breakthroughs in optical computing or quantum computing, we will eventually hit a limit.

Corollaries to Moore’s Law

Moore’s Law specifically predicts transistor density. Corollaries derived from Moore’s Law include:

  • cost per transistor drops as density increases
  • transistor speed increases as density increases
  • CPUs roughly double in power every 18 months

Power is an intangible – it isn’t just about MHz. Sometimes a chip with a slower clock speed can vastly outperform one with a higher clock speed. Mac users saw this when the 25 MHz 68040 consistently outperformed the 40 MHz 68030 in the “wicked fast” Mac IIfx. We saw it again when the 233 MHz Power Mac G3 held its own against the vastly more expensive Power Mac 9600 with its 350 MHz 604e processor.

And we’ve claimed the same advantage for the G3 (and now the G4) compared with Intel’s Pentium line.

But let’s first look at CPU speed. If today’s processes can eke out a 733 MHz processor (Intel’s just-announced Pentium III), and we can expect to reach at least 0.09-micron chips, we could see 3 GHz processors in about three years. Over the following 4-5 years, technology permitting, we would see dies using 0.03-micron traces and achieving speeds of 20-25 GHz.
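
Here is the back-of-the-envelope math behind those numbers as a small C sketch. The 733 MHz starting point and the 18-month doubling period come from above; the rest is arithmetic:

    #include <stdio.h>
    #include <math.h>

    /* Project clock speed under Moore's Law-style doubling:
     * speed(t) = base * 2^(t / doubling period). */
    int main(void) {
        const double base_mhz = 733.0; /* just-announced Pentium III */
        const double period = 1.5;     /* years per doubling */

        for (int years = 0; years <= 8; years++) {
            double mhz = base_mhz * pow(2.0, years / period);
            printf("year %d: %6.0f MHz\n", years, mhz);
        }
        return 0;
    }

Year three lands near 2,900 MHz – call it 3 GHz – and years seven and eight come out between roughly 18.5 and 29.5 GHz, bracketing the 20-25 GHz projection.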

Those are simply mind-boggling numbers, but if engineers can maintain Moore’s Law for another 7-8 years, that’s what we’ll see. And then we’ll finally have all the processor designs on relatively even footing – they’ll all run at the same clock speed, so the design efficiencies will clearly show themselves.

The Real World

But it won’t necessarily happen. As Brander points out in his article, at a certain point the cost of building the factory (or fab) becomes prohibitive, more than any company or consortium can afford. That could happen within five years.

For the sake of argument, and because it’s a very nice round number, let’s say the industry achieves 10 GHz in about five years – and can go no further.*

What’s an industry built on constantly delivering more speed, more capacity, more power, and lower costs going to do? How can you make a better computer when Moore’s Law finally runs into the brick wall of the laws of physics?

You innovate. One way to increase computing power is to put more on the chip: more instruction pipelines, more registers, more specialized circuitry (such as AltiVec). Give a processor more execution units, and it can do more operations in the same amount of time.
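
To make the execution-unit point concrete, here is a small illustrative C sketch (mine, not from any particular CPU manual). Both functions add up the same array, but in the first each addition depends on the one before it, while in the second the four running sums are independent – so a superscalar processor with several integer units can execute them in the same clock cycle:

    /* Illustrative only: independent operations let a superscalar
     * CPU keep several execution units busy at once. */

    long dependent_chain(const long *a, int n) {
        long sum = 0;
        for (int i = 0; i < n; i++)
            sum += a[i];    /* each add waits for the previous one */
        return sum;
    }

    long independent_sums(const long *a, int n) {
        long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
        int i = 0;
        for (; i + 3 < n; i += 4) {
            s0 += a[i];     /* these four adds don't depend on each  */
            s1 += a[i + 1]; /* other, so four integer units can run  */
            s2 += a[i + 2]; /* them in parallel                      */
            s3 += a[i + 3];
        }
        for (; i < n; i++) /* leftover elements */
            s0 += a[i];
        return s0 + s1 + s2 + s3;
    }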

You also improve the instruction set. Some commands on the G4 run twice as fast as on the G3. Chip designers will keep pushing the envelope to optimize each instruction a CPU processes.

And you use a bigger on-chip cache, so the CPU spends less time waiting for data on the relatively slow memory bus. Back in the Mac IIci era, a 32 KB level 2 (L2) cache was huge; today we’re seeing Intel and Motorola designs that support a 2 MB L2 cache.
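
Here is a small C sketch of why that cache matters. Both functions read the same grid of numbers, but the first walks memory in order, making full use of every cache line it fetches, while the second strides across rows and spends far more time waiting on the memory bus (the grid size is arbitrary):

    #define ROWS 1024
    #define COLS 1024

    static int grid[ROWS][COLS];

    /* Sequential access: each cache line fetched from memory
     * is fully used before the next one is needed. */
    long sum_row_major(void) {
        long sum = 0;
        for (int r = 0; r < ROWS; r++)
            for (int c = 0; c < COLS; c++)
                sum += grid[r][c];
        return sum;
    }

    /* Strided access: consecutive reads land in different cache
     * lines, so the CPU stalls on the slow memory bus far more. */
    long sum_column_major(void) {
        long sum = 0;
        for (int c = 0; c < COLS; c++)
            for (int r = 0; r < ROWS; r++)
                sum += grid[r][c];
        return sum;
    }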

And finally, you abandon the CPU.

The SETI@home Supercomputer

The world’s most powerful supercomputer doesn’t exist as a single computer. It’s the collection of well over a million computers across the globe working on the SETI@home project. At over 6 teraflops, it puts even the G4 to shame.

Some of today’s computers, including some older Macs and clones, support multiple processors. Not a single central processing unit (CPU), but multiple processing units (MPUs). Rumor has it that Apple will introduce MPU G4 machines next year, possibly at the Macworld Expo in January 2000 [it was July 2000]. Based on the efficiencies of the G4 design, a dual-processor G4 system could be more powerful than two single-processor Power Mac G4 computers.

Whether that actually happens or not, adding a second processor will roughly double the computer’s performance. With Apple, Motorola, and IBM losing the MHz race to Intel and AMD, a quad-G4 system at 500 MHz could claim 2 GHz overall speed – and the four AltiVec units, each capable of a 4-wide single-precision multiply-add (8 floating-point operations) per clock cycle, would put it in the 16 GFLOPs range!
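
Here is a minimal sketch of how software would put those processors to work, written with POSIX threads purely for illustration (the Mac OS of the day used Daystar’s multiprocessing API instead). The idea is the same either way: split the job into slices and run one slice per processor, concurrently:

    #include <pthread.h>
    #include <stdio.h>

    #define N     1000000
    #define NPROC 2        /* one thread per processor */

    static double data[N];
    static double partial[NPROC];

    /* Each thread sums its own slice of the array; with two
     * processors, the two slices are summed at the same time. */
    static void *sum_slice(void *arg) {
        long id = (long)arg;
        long lo = id * (N / NPROC), hi = lo + N / NPROC;
        double s = 0.0;
        for (long i = lo; i < hi; i++)
            s += data[i];
        partial[id] = s;
        return NULL;
    }

    int main(void) {
        pthread_t tid[NPROC];
        for (long p = 0; p < NPROC; p++)
            pthread_create(&tid[p], NULL, sum_slice, (void *)p);

        double total = 0.0;
        for (long p = 0; p < NPROC; p++) {
            pthread_join(tid[p], NULL);
            total += partial[p];
        }
        printf("sum = %f\n", total);
        return 0;
    }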

Down the road, when we reach the GHz wall, when the engineers have tweaked every instruction, and when each processor has reached a limit on how many pipelines it can reasonably handle, we’ll still be able to make our computers faster by strapping on more processors.

Already with Mac System 7.x, Apple had patches (created by Daystar) to support two to four processors. That support was vastly improved in Mac OS 9, despite the fact that Apple hadn’t produced a dual-processor system since 1997. And it improves even further when you combine the multiprocessor-aware Mac OS X with the expected multi-CPU G4 systems.

Conclusion

Transistor density will stop growing, probably within the next five or ten years. CPU power will probably peak at about the same time, as engineers use every trick in the book to give their processor the edge over the competition.

After that, the only ways to build a faster computer will be clustering multiple computers (as SETI@home does), using multiple processors, or both.

Update (2018): What has happened is that multiple processors could only go so far before coordinating data between them at high speed became a bottleneck. The solution was to put more than one CPU core on a single die, allowing them to communicate at much higher speeds and share a large, high-speed cache. Intel went from there to adding HyperThreading, so each core can work as though it were two, plus Turbo Boost, which lets some cores run above the chip’s rated speed while staying within thermal limits. In 2018, Intel makes chips with 18-20 cores and twice that many threads, with Turbo Boost speeds as high as 3.7 GHz. (Chips with fewer cores can reach higher speeds, and 4.2 GHz seems to be the Turbo Boost ceiling these days.)

 * As of 2018, the highest clock speed on record belongs to an AMD Bulldozer-based FX-8150 chip overclocked to 8.805 GHz. The fastest production CPU is the IBM zEC12, which runs at 5.5 GHz. After the Pentium 4 briefly peaked at 3.8 GHz, Intel scaled back to focus on energy efficiency and multiple cores on the same die. Today’s CPUs have as many as 20 cores and 40 threads, nominally clocked at 2.0 GHz with Turbo Boost to 3.7 GHz (Intel Xeon Skylake).

keywords: #mooreslaw

short link: https://goo.gl/hNTzP5