In lifting the lid on the new "10h" architecture, which will power its upcoming Phenom and Barcelona quad-core chips, AMD is throwing down the gauntlet to Intel in the battle for processor supremacy.
To really understand where the quad-core competition stands, one must untangle the race to market from the debate over whose architecture is better. On the first score, Intel is clearly ahead. Intel already offers several quad-core desktop processors as part of its Core 2 Quad and Core 2 Extreme families. On the server side, Intel currently ships no fewer than nine quad-core Xeon server chips.
Intel detailed its other near-term plans at its May 3 Spring Analyst Meeting. The chip giant's laundry list of planned introductions includes two new quads based on Intel's latest 45-nanometer chip technology: Yorkfield for desktops and Harpertown for servers. Both are due in the second half of this year.
AMD, behind in the race to ship quads, is trying to shift the discussion to which company is building a better processor. With Phenom, the just-announced product name for the desktop quads previously called Agena, and with Barcelona, the upcoming quad-core versions of the Opteron server chip, AMD thinks it is. Barcelona is expected to ship sometime this summer; Phenom will follow later this year.
Despite--or perhaps because of--the fact that Intel was the first to ship quads, AMD never hesitates to point out that its initial quad-core processors are fresh, from-the-ground-up designs. "Currently there is no manufacturer on the planet that has native quad-cores. Our competitors have dual, dual cores," is how Ian McNaughton, AMD's FX product manager, put it in a phone interview last month.
That dig refers to the fact that Intel's first-generation quads essentially place two dual-core processors side by side in one package. Intel doesn't think this is an issue. As Intel CEO Paul Otellini put it at the Intel Developer Forum last September: "The initial ones are multi-chip, but so what? You guys are misreading the market if you think people care what's in the package."
Judging by history, PC users are far more likely to care about processor performance than about design issues that are difficult for anyone but electrical engineers to grasp. Indeed, when dual-core processors debuted in 2005, a similar battle of marketing one-upmanship occurred. Then, AMD touted its Athlon 64 X2 processors as "true" dual cores, as compared to the bolted-together 800-series Pentium Ds.
However, the dual-core duel became, and remains, a performance battle. AMD was widely perceived to have taken an initial lead. Intel was seen as recovering the advantage when it introduced its Core 2 Duo family in mid-2006. When Phenom and Barcelona ship later this year, AMD is hoping the new 10h architecture it's using will help it do some performance leapfrogging of its own.
The design incorporates a bunch of enhancements, including new instructions, improved floating-point execution units, faster data transfer between floating-point and general-purpose registers, and 1-Gbyte paging, to name a few. The 10h architecture also incorporates optimizations to make AMD's hardware-based virtualization run faster.
Beyond the computing power packed onto the chip itself, much of the elegance of the AMD approach is evident in the way it handles I/O to external devices as well as interprocessor communications. In contrast to the traditional method of sending outbound data over a frontside bus, AMD has long used its proprietary HyperTransport interface. With 10h, HyperTransport 3 debuts. This next-generation upgrade boosts the total bandwidth of the link to 20.8 Gbytes/sec.
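For readers wondering where a figure like 20.8 Gbytes/sec comes from, the arithmetic can be sketched in a few lines of Python. The inputs are assumptions that do not appear in this article: a 2.6-GHz link clock, double-data-rate signaling, and a 16-bit-wide link, with the total counting both directions. The function name is hypothetical.

```python
def ht_link_bandwidth_gbytes(clock_ghz, width_bits, directions=2):
    """Rough aggregate bandwidth of a point-to-point link, in Gbytes/sec.

    Assumes double-data-rate signaling (two transfers per clock cycle)
    and that every direction runs at the same width and clock.
    """
    transfers_per_ns = clock_ghz * 2                        # DDR: 2 transfers per cycle
    per_direction = transfers_per_ns * width_bits / 8       # bits -> bytes
    return per_direction * directions


# Assumed inputs (not stated in the article): 2.6-GHz clock, 16-bit link.
total = ht_link_bandwidth_gbytes(clock_ghz=2.6, width_bits=16)
print(f"{total:.1f} Gbytes/sec")
```

Under those assumptions, each direction carries 10.4 Gbytes/sec, and the two directions together account for the 20.8 Gbytes/sec quoted above.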
When AMD discussed its quad-core Opteron at the International Solid State Circuits Conference in February, it said the processor contained 450 million transistors and would be fabricated in 65-nm CMOS technology. This puts AMD at something of a disadvantage vis-à-vis Intel, which will ship 45-nm quad-core processors later this year. In terms of chip construction, smaller is better because it enables lower-power operation. It also allows the chip vendor to get higher yields, by placing more processors on each of the large 300-mm wafers on which the chips are made before they're sliced off and individually packaged.
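The wafer math behind "smaller is better" can be illustrated with a widely used back-of-the-envelope estimate of dies per wafer. This is only a sketch: the 280-square-millimeter die area is a hypothetical placeholder (the article gives no die size), the shrink is treated as an ideal geometric scaling of area, and defects are ignored entirely.

```python
import math


def dies_per_wafer(wafer_diameter_mm, die_area_mm2):
    """Estimate gross dies per wafer with a common approximation:
    wafer area divided by die area, minus a term for partial dies
    lost along the wafer's edge. Defect-related yield loss is ignored.
    """
    radius = wafer_diameter_mm / 2
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(gross - edge_loss)


def shrink_area(die_area_mm2, old_nm, new_nm):
    """Ideal shrink: area scales with the square of the feature size."""
    return die_area_mm2 * (new_nm / old_nm) ** 2


HYPOTHETICAL_DIE_MM2 = 280.0  # placeholder value, not a figure from AMD or Intel

at_65nm = dies_per_wafer(300, HYPOTHETICAL_DIE_MM2)
at_45nm = dies_per_wafer(300, shrink_area(HYPOTHETICAL_DIE_MM2, 65, 45))
print(f"65 nm: ~{at_65nm} dies per wafer, 45 nm: ~{at_45nm} dies per wafer")
```

Real shrinks rarely scale this cleanly, but the sketch shows why the same design on a finer process yields noticeably more chips from each 300-mm wafer.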
Intel's sheer size has long given it a big advantage in chip manufacturing. It's currently thought to be well ahead of AMD on the road to 45 nm. Intel said last week that it's in the process of bringing four factories on line to make 45-nm chips. For its part, however, AMD is also rushing to ready 45-nm technology and has reportedly already made prototype wafers at its Fab 36 in Dresden, Germany.
In terms of on-chip features, the four cores of Barcelona are expected to each have their own 512-KB L2 cache and to share a 2-MB L3 cache. The processor will support a fast DDR2/DDR3 memory interface.
Interestingly, the race to four cores is also something of a race to eight cores. AMD is emphasizing that Phenom will support dual-socket motherboards. This will allow two chips, each with four cores, to be placed in the same system, for eight cores overall. Barcelona will allow similar multi-socket configurations (including a four-socket NUMA design at the very high end), as will Intel's offerings.
Diving deeper into the laundry list of arcane technical improvements AMD has packed into the new 10h architecture, which will debut with Phenom and Barcelona, the highlights include the following:
- Beefed-up floating-point support. Earlier processors had 64-bit floating-point execution units. With 10h, AMD will be able to equip Phenom and Barcelona with 128-bit floating-point units, if it so chooses. The wider design will double the performance of floating-point vector operations.
- Instruction-fetching improvements. The fetch window has been widened to 32 bytes from 16 bytes. This will allow the processor to handle a complete sequence of three large instructions per cycle.
- Large page support. As mentioned earlier, the 10h processors now support 1-GB paging. The feature provides a big benefit to applications, such as multimedia, which operate on large data sets.
- Instruction-set improvements. These include the addition of two advanced bit-manipulation instructions, which operate on general-purpose registers.
- Virtual machine optimizations. The 10h architecture includes many improvements to boost the performance of AMD's virtualization technology, as well as compiler-related optimizations.
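To put the large-page item in the list above in perspective, the short Python sketch below counts how many pages it takes to map a data set at different page sizes. The 4-KB and 2-MB sizes are the other page sizes commonly used on x86-64 and are not mentioned in this article; the 8-GB working set is a made-up example, and the helper name is hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# Page sizes to compare. Only the 1-GB size comes from the article;
# the 4-KB and 2-MB entries are assumed for comparison.
PAGE_SIZES = {"4 KB": 4 * KIB, "2 MB": 2 * MIB, "1 GB": 1 * GIB}


def pages_needed(data_bytes, page_bytes):
    """Number of pages required to cover data_bytes (rounded up)."""
    return -(-data_bytes // page_bytes)  # ceiling division on integers


working_set = 8 * GIB  # hypothetical data set size

for label, size in PAGE_SIZES.items():
    print(f"{label} pages: {pages_needed(working_set, size):,}")
# prints:
# 4 KB pages: 2,097,152
# 2 MB pages: 4,096
# 1 GB pages: 8
```

Fewer, larger pages mean fewer address translations for the processor to track, which is the general reason large pages tend to help applications that sweep through big data sets.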