[Geek] The state of the X86 market
Jul. 7th, 2006 09:13 pm
On a mailing list, sblackery asked what the difference was between Pentium D and Core Duo processors. My answer went on rather too long; my initial “L” ought to stand for Logorrhea, not Liam. So I’m reposting it here, in the hope that it might prove interesting or useful.
The Pentium D is a dual-core Pentium 4, no more, no less. Intel can't trademark numbers, though, hence the letter: the D is, allegedly, for Dual. The P4 uses the Netburst architecture: looooong pipelines, poor IPC (instructions per clock) performance, designed for really high speeds - up to 10GHz - but Intel couldn't crack the heat-dissipation problems & AMD were pinching its high-end market.
The other line was the Pentium M. This is an improved Pentium III design, which is to say an improved Pentium Pro, a superscalar CPU from ten years ago. The Pentium M was a low-power CPU for portables and embedded systems, based around a tweaked PIII core (with better branch prediction, for instance) mated to a P4 bus, so that it could be used with modern P4 chipsets.
The P4, along with Itanium, was designed in America. The Pentium M was designed in Israel.
The thing is, the P-M (like the P3) has far better IPC than the P4, and is also smaller and cooler. It takes less power, makes less heat, and at high speed (for a P3 - i.e., 1.5 - 2GHz), it stomps all over the P4 on performance per watt. It can't quite match a really high-speed P4 running at nearly 4GHz, but at lower speeds, it's a lot more attractive: it makes for smaller, quieter PCs with less cooling and relatively very good performance.
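To put the IPC argument in concrete terms, here is a rough sketch in Python. It assumes the usual back-of-envelope model that throughput is roughly IPC times clock speed. Every figure in it is a made-up illustrative number, not a measurement of any real chip, and the names `throughput` and `perf_per_watt` are just mine.

```python
# Rough model: instructions per second = IPC * clock frequency.
# All figures below are HYPOTHETICAL, chosen only to show the shape of the
# trade-off; they are not measurements of any real processor.

def throughput(ipc, clock_ghz):
    """Billions of instructions completed per second under the simple model."""
    return ipc * clock_ghz

def perf_per_watt(ipc, clock_ghz, watts):
    """Throughput divided by power draw."""
    return throughput(ipc, clock_ghz) / watts

# A hypothetical long-pipeline, high-clock chip and a hypothetical
# short-pipeline, low-clock chip.
high_clock = {"ipc": 1.0, "clock_ghz": 3.8, "watts": 115.0}
low_clock = {"ipc": 1.6, "clock_ghz": 2.0, "watts": 27.0}

for name, chip in (("high-clock", high_clock), ("low-clock", low_clock)):
    raw = throughput(chip["ipc"], chip["clock_ghz"])
    efficiency = perf_per_watt(**chip)
    print(f"{name}: {raw:.2f} G instr/s, {efficiency:.4f} G instr/s per watt")
```

With numbers like these, the high-clock part still wins on raw throughput, but the low-clock part is several times better per watt, which is the pattern described above.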
So, the Pentium M was enhanced, whereas future versions of the P4 were cancelled.
(The final revision of the P4 design was the revamped "Prescott" core, which has even longer pipelines, so even poorer IPC than the older "Northwood" core and earlier, but it can run Intel's copy of AMD's x86-64 instruction set. AMD's marketdroids renamed their version AMD64, which naturally Intel will not use, so Intel calls it EM64T. It's the same thing.)
The first enhancement of the Pentium M was a slightly faster dual-core version. This is the Core Duo. Like earlier versions, it's still a 32-bit-only CPU, but it uses the same bus as the P4. It's the basis of the Mac mini, the MacBook/MacBook Pro & the iMac. A single-core version, called Core Solo, is sold for low-end kit.
The next step is the Core 2. This is a bigger rework of the Pentium M. It remains bus-compatible with the P4 but gains 64-bit execution as well as 32-bit. It is, from preliminary results, considerably faster than both the P4 and the Pentium M.
(There are rumours that the Prescott P4s don't have a proper 64-bit core, but were a hasty rework of the 32-bit P4 core, to somehow execute x86-64 code by slicing it into 2 32-bit chunks. This would explain the fact that whilst AMD's 64-bit chips are considerably faster than AMD's 32-bit ones, Intel's 64-bit chips are no quicker than their 32-bit ones. However, I find it technically very implausible. I can elaborate if you wish.)
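For anyone wondering what "slicing it into 2 32-bit chunks" would even mean, here is a minimal Python sketch of a 64-bit addition built from 32-bit-wide pieces. It only illustrates the general idea in software. It is not a claim about how any Intel core works, and the function name is my own invention.

```python
MASK32 = 0xFFFFFFFF  # keeps only the low 32 bits of a value

def add64_via_32bit_halves(a, b):
    """Add two unsigned 64-bit values using only 32-bit-wide additions.

    Purely illustrative: the low halves are added first, and any carry out
    of bit 31 is fed into the addition of the high halves.
    """
    lo = (a & MASK32) + (b & MASK32)
    carry = lo >> 32                      # 1 if the low halves overflowed
    hi = ((a >> 32) & MASK32) + ((b >> 32) & MASK32) + carry
    # Reassemble, discarding any carry out of bit 63 (wrap-around).
    return ((hi & MASK32) << 32) | (lo & MASK32)

print(hex(add64_via_32bit_halves(0x00000000FFFFFFFF, 1)))
```

Note that the high half cannot start until the carry from the low half is known, so a single 64-bit operation becomes two dependent steps.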
The Core 2 chip is not out yet. It will be sold in various forms. The desktop version is codenamed "Conroe"; the notebook/low power one is "Merom"; the server version is "Woodcrest", marketed as the Xeon 5100 family.
(Bizarre choice of name. Woodcrest is a whole new core, much better than Netburst, but it goes from just 5000 to 5100. This is presumably to hide the fact that the Xeon 5000, codenamed "Dempsey", is just another Netburst P4 with lamentable performance. However, the Xeon 5000 did also usher in a new Intel server platform, "Bensley", which is far better than previous Intel server offerings - but it was ready before the Core 2-based Xeon was. So they shipped it early with a mild tweak of the previous-model Xeon and called it Xeon 5000 to look impressive.)
So. Summary.
Core Duo: nice low-heat-dissipation 32-bit Pentium M variant. Pentium D: last gasp of the Netburst P4, with a die shrink to 65nm, but still huge, hot and slow. Core 2: not out yet, 64-bit version of Core, significantly quicker than Netburst while being cooler and needing less power than the P4. Core 2 is Intel's answer to Athlon 64 and Opteron. Initial benchmarks suggest that it quite significantly outperforms AMD's 64-bit chips, and it is cooler and less power-hungry than they are, too.
Corollary
AMD has a problem. Intel has turned on a dime (for a huge company - like Microsoft when it suddenly embraced the Web in 1996 or so). It has killed off its entire Netburst line, including some nearly-ready-for-launch P4 designs (e.g. "Tigerton", IIRC), and kept people going for a while by rebranding the P4 as the PD and massively discounting it - I suspect at very thin margins or below cost. It has placed its luck and its future in the hands of its Israeli division, which has reworked the more-than-ten-year-old P3 core and produced something that looks so far like a winner.
AMD has, for now, been outmanoeuvred. Its latest chips are bigger, hotter and slower than Core 2. But it's only just moved to DDR2 memory with the new socket AM2 processors; socket AM3 and DDR3 are coming next year, and will be backwards-compatible. It also has a major potential advantage in HyperTransport and onboard memory controllers, neither of which Intel has - Intel still uses a relatively old-fashioned Front Side Bus and offboard, motherboard-based memory controllers. This does mean Intel has more agility when it comes to changing memory types, though.
AMD still has the edge in big servers - HyperTransport (HT) makes 4-way boxes and upwards much easier and more efficient than Intel's Xeon 5100. HT also allows for dedicated coprocessors that fit into standard CPU sockets, which are just starting to appear. This opens up some interesting new avenues, too.
But AMD does not seem worried. I think it has Plans up its sleeve, which it is, very wisely, keeping quiet for now.
I have no clear idea what, but I think it will involve significant improvements in IPC ratio. One possible way forward, though it's as yet unclear how, is that it might have a technology that allows 2 or more cores to appear as 1 to the OS. This is being referred to as "reverse hyperthreading". People are theorising that this could, somehow, split the work of instruction decoding amongst multiple CPU cores, allowing multicore processors - or even multichip machines - to offer much better single-thread execution speeds. Multicore chips are no faster than single-core ones at running a single thread; they merely excel at running 2 threads concurrently as fast as a single-core chip runs 1.
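The point about multicore and single threads can be sketched with a toy model in Python. It assumes idealised conditions: equal-sized, fully independent threads, with no scheduling overhead and no contention for memory. The function `wall_time` and the 10-second figure are purely illustrative.

```python
import math

def wall_time(threads, cores, seconds_per_thread):
    """Idealised wall-clock time to finish all threads.

    Assumes equal, independent threads with no overhead or contention, so
    the threads simply run in 'rounds' of at most one thread per core.
    """
    rounds = math.ceil(threads / cores)
    return rounds * seconds_per_thread

# Hypothetical workload: each thread needs 10 seconds of CPU time.
for threads, cores in ((1, 1), (1, 2), (2, 1), (2, 2)):
    print(f"{threads} thread(s) on {cores} core(s): "
          f"{wall_time(threads, cores, 10)} s")
```

In this model a second core does nothing for one thread; it only helps once there are two threads to run, which is why something like "reverse hyperthreading" would be such a big deal if it worked.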
There's been little improvement in raw single-thread speeds for a while now, and memory latencies in modern processors are /appalling/, which CPU vendors try to assuage with more and more cache.
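A quick way to see why vendors keep piling on cache is the textbook average-access-time formula: hit time plus miss rate times miss penalty. The Python sketch below uses it with hypothetical latencies and miss rates, none of which are figures for any real system.

```python
def average_access_time(hit_time_ns, miss_rate, miss_penalty_ns):
    """Textbook average memory access time for a single level of cache."""
    return hit_time_ns + miss_rate * miss_penalty_ns

# HYPOTHETICAL numbers: a 1 ns cache hit and a 100 ns trip to main memory.
HIT_NS = 1.0
MEMORY_NS = 100.0

for miss_rate in (0.10, 0.05, 0.02):
    avg = average_access_time(HIT_NS, miss_rate, MEMORY_NS)
    print(f"miss rate {miss_rate:.0%}: average access {avg:.1f} ns")
```

Even a small miss rate dominates the average when main memory is that much slower than cache, so a bigger cache that trims the miss rate pays off out of proportion to its size.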
I think Level 3 cache is possibly going to reappear, as it did, briefly, with the clever but not hugely commercially successful AMD K6-III CPU. Maybe on the motherboard, maybe on daughterboards again (as with Intel's late and unlamented Slot 1 and Slot 2 and AMD's Slot A), maybe on the CPU die.
But I think AMD is hoping to do something dramatically clever to do with improving IPC. I can't wait.
Recently, the microprocessor world has been getting interesting again. I just mourn that it's increasingly all x86 where the real innovation is happening; all the weird and wonderful other architectures are fading away. Consolidation, I guess. 32-bit x86 isn't too bad, I believe, and 64-bit x86 is quite a lot better.