Phenom 9700, AMD's 1st Quad-Core CPU

Top Model With Four Cores - AMD Phenom 9700

Translation: The Spider Spins Its Web.

Today marks a historic occasion for AMD. After delays of more than a year, the company can finally present its new, highly anticipated processor - and not a moment too soon. AMD needs a fresh product. While this CPU was originally meant as a competitor to Intel's Core 2 CPUs, the balance of power in the CPU arena has shifted over the past 18 months. The new processor, dubbed Phenom by AMD, is the first quad-core CPU by AMD and, as the company likes to remind us, the first native quad-core design.

The exhaustion on the faces of our editors in the Munich lab is a testament to the hard work they've put into this article over the past few hours and days. We tested all three models of the new processor, the Phenom 9700, Phenom 9600 and Phenom 9500, running each of them through our benchmark suite. Along with the Phenom processor, AMD is also presenting its "AMD OverDrive" tool.

With the new 7-series chipset family, consisting of the 790FX, 790X and 770, AMD is simultaneously unveiling the Spider platform. Up to four graphics cards can be set up as a Crossfire X configuration using the new 790FX chipset.

Current and new motherboards and processors are fully compatible with one another.

Looking towards Eastern Europe: For the actual introduction of its Phenom processor, AMD invited the press to the Polish capital of Warsaw, where the company held a three-day press conference.

AMD Phenom Launch

Jochen Polster, manager of AMD Germany, opened the event with a keynote addressing the press. For the first time since the acquisition of graphics chip company ATI, AMD is presenting a complete platform consisting of the Phenom processor, the 790FX chipset and the HD3800 graphics card series. With this platform, code-named Spider, AMD aims to offer the basis for a computer that is affordable for everyone.

Jochen Polster emphasised that the Phenom quad-core processor does not represent a high-end model for now. AMD plans to price the Phenom models markedly lower than Intel's quad-core models.

The gaming market has always been a driving force in PC sales. With the 790FX chipset, AMD now offers buyers the possibility of building a system with up to four graphics cards in a Crossfire configuration. The appropriate driver is expected to be released in January 2008.

Since we already covered the HD3800 series of graphics cards in a separate launch article, we will concentrate exclusively on the Phenom quad-core processor and the new 790FX chipset in this article.

The Phenom In Detail - A Revamped Athlon 64

AMD has thoroughly reworked the core of the Phenom processor compared to the Athlon 64, succeeding in raising the number of instructions per clock cycle (IPC). According to AMD, we should expect to see a performance increase of up to 25% at the same clock speed.
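As a rough sanity check on that claim, per-core throughput can be modeled as IPC multiplied by clock frequency. The sketch below assumes the full 25% gain applies (AMD only says "up to"), and the function name is purely illustrative.

```python
def equivalent_clock_ghz(new_clock_ghz: float, ipc_gain: float) -> float:
    """Clock an older core would need to match a newer core with higher IPC.

    Assumes throughput = IPC * clock, which ignores memory and I/O effects.
    """
    return new_clock_ghz * (1.0 + ipc_gain)


# A 2.3 GHz core with the claimed best-case 25% IPC gain
print(round(equivalent_clock_ghz(2.3, 0.25), 3))
```

Under this simplified model, a 2.3 GHz Phenom core would correspond to an Athlon 64 core running at just under 2.9 GHz in the best case.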

Like several of the later Athlon 64 models, the Phenom is manufactured on a 65 nm production process. In its presentation, AMD stated that it will begin transitioning to a 45 nm process starting in 2008. Unlike Intel's quad-core solutions, which consist of two dual-core processors combined in one CPU package, AMD's Phenom uses a single die comprising four cores. The resulting die has an area of 285 mm² and consists of 600 million transistors. That means that the transistor count has more than doubled compared to the Athlon 64 X2, which consisted of 227 million transistors.

The BIOS POST message

The downside of the single-die quad-core approach is a greater risk of manufacturing defects and thus lower yields. If even one of the cores suffers a manufacturing defect, the entire quad-core CPU is defective. AMD has found a solution for this case, though: the defective core is deactivated, and the processor is sold as a three-core model. In an interview in Warsaw, AMD officially confirmed that the tri-core models are indeed quad-cores with one deactivated core. In the end, this is a boon to the consumer. Intel would have to sell a processor with one defective core in the notebook sector, since its desktop line does not include a single-core Core 2 processor, whereas AMD's customers will be able to purchase an inexpensive tri-core CPU. For now, however, it is unclear when the Phenom X3 processors will go on sale.
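The link between die area and yield can be illustrated with the simple Poisson yield model, in which the fraction of defect-free dies falls exponentially with area. This is only a sketch: the defect density is a hypothetical value picked for illustration, and the half-size die is a made-up comparison point, not the size of any actual Intel die.

```python
import math


def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies with zero defects under the simple Poisson yield model."""
    area_cm2 = die_area_mm2 / 100.0
    return math.exp(-defects_per_cm2 * area_cm2)


DEFECT_DENSITY = 0.5  # defects per cm^2; hypothetical, chosen only for illustration

monolithic = poisson_yield(285.0, DEFECT_DENSITY)     # one 285 mm^2 quad-core die
half_size = poisson_yield(285.0 / 2, DEFECT_DENSITY)  # hypothetical die of half the area

print(f"285 mm^2 die:   {monolithic:.1%} defect-free")
print(f"142.5 mm^2 die: {half_size:.1%} defect-free")
```

In this model the large die's yield is the square of the half-size die's yield. With a two-die package, a defect costs only one small die rather than the whole quad-core, which is the loss that selling tri-core parts helps to offset.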

Stars Core Micro-Architecture

While AMD's quad-core processor was still in development, the new micro-architecture was referred to as K10. Now, with the official introduction, it is being rechristened the Stars core micro-architecture.

The last time AMD introduced a completely new micro-architecture was in September of 2003 with the launch of the Athlon 64. During the long development time for the Phenom processor, a great number of alterations were made to the core design, resulting in a performance increase at the same clock speed.

Technology I - Advanced Memory Prefetcher, SSE4a

As our avid readers will undoubtedly remember, Intel introduced the first SIMD extensions to the x86 ISA in the shape of the MMX instruction set. As a countermove, AMD implemented 3DNow! in its own processors. This resulted in a situation where software did not benefit from the same kind of performance boost on both companies' processors, since it had to be specially optimized to take advantage of each extension. Thankfully, this kind of competition and incompatibility died down, and the SSE, SSE2 and SSE3 extensions used by AMD and Intel were identical. However, the two chipmakers are now parting ways once again, to the detriment of users and programmers. With the launch of the Penryn core, Intel introduced the SSE4.1 instruction set. AMD, meanwhile, is implementing SSE4a (formerly known as SSE128) in the new Stars core micro-architecture.

The Phenom's SSE unit is being widened to 128 bits, up from the Athlon 64's 64-bit unit. Additionally, AMD is adding four new instructions, namely EXTRQ/INSERTQ and MOVNTSD/MOVNTSS. Two more instructions, LZCNT and POPCNT, which are primarily used for bit manipulation (counting leading zero bits and set bits, respectively), are included as well.
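To make the two bit-counting instructions concrete, the following sketch emulates in plain Python what LZCNT and POPCNT compute for a fixed-width operand. The function names and the default 32-bit width are choices made for this illustration. The hardware instructions do the same work in a single operation.

```python
def popcnt(value: int, width: int = 32) -> int:
    """Number of set bits in a `width`-bit word (what POPCNT computes)."""
    return bin(value & ((1 << width) - 1)).count("1")


def lzcnt(value: int, width: int = 32) -> int:
    """Number of leading zero bits in a `width`-bit word (what LZCNT computes).

    Returns `width` for an input of zero.
    """
    masked = value & ((1 << width) - 1)
    return width - masked.bit_length()


print(popcnt(0b1011_0000))  # three bits are set
print(lzcnt(1))             # only bit 0 is set, so 31 zero bits lead a 32-bit word
```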

Sadly, Intel's SSE4.1 and AMD's SSE4a are incompatible with one another - a fact that may soon cause problems for programmers and users alike.
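In practice, software deals with diverging extensions by checking which features the CPU reports and selecting a code path at run time. The sketch below shows the idea. The flag spellings follow those Linux uses in /proc/cpuinfo, while the function name and the two example flag sets are invented for illustration and are not complete feature lists.

```python
def pick_code_path(cpu_flags: set[str]) -> str:
    """Choose an implementation based on which extensions the CPU reports."""
    if "sse4_1" in cpu_flags:
        return "sse4.1"
    if "sse4a" in cpu_flags:
        return "sse4a"
    if "sse2" in cpu_flags:
        return "sse2"
    return "scalar"


# Made-up, abbreviated flag sets for an AMD-style and an Intel-style CPU.
phenom_like = {"mmx", "sse", "sse2", "sse4a", "popcnt"}
penryn_like = {"mmx", "sse", "sse2", "ssse3", "sse4_1"}

print(pick_code_path(phenom_like))
print(pick_code_path(penryn_like))
```

Each additional vendor-specific extension adds another branch of this kind, along with a separate optimized routine to write and test.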

The advanced memory prefetcher can load data directly from the RAM to the core's L1 cache without needing to take a detour through the L2 cache first. Thus, the data can be loaded into the processor with a much lower latency. Simultaneously, this also results in a lower load on the L2 cache, which can instead buffer data more efficiently, in turn translating into an overall performance boost.

Furthermore, the prefetcher identifies recurring data patterns and can pre-fetch them even before they are requested.
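One common way to recognize a recurring access pattern is stride detection: when consecutive addresses differ by the same amount twice in a row, the next address is fetched ahead of time. The toy model below illustrates that general technique. It is a textbook simplification and makes no claim about how AMD's prefetcher is actually built.

```python
from typing import Optional


class StridePrefetcher:
    """Toy stride detector: predicts the next address once a stride repeats."""

    def __init__(self) -> None:
        self.last_addr: Optional[int] = None
        self.last_stride: Optional[int] = None

    def access(self, addr: int) -> Optional[int]:
        """Record an access; return an address to prefetch, or None."""
        prediction = None
        if self.last_addr is not None:
            stride = addr - self.last_addr
            # Only prefetch when the same non-zero stride is seen twice in a row.
            if stride != 0 and stride == self.last_stride:
                prediction = addr + stride
            self.last_stride = stride
        self.last_addr = addr
        return prediction


prefetcher = StridePrefetcher()
predictions = [prefetcher.access(a) for a in (0x1000, 0x1040, 0x1080, 0x10C0)]
print(predictions)
```

The first two accesses only establish the stride. From the third access on, the model requests the next 64-byte step before the program asks for it.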

x86 instructions are between 1 and 15 bytes long. Compared to the Athlon 64 core, the buffer for fetching instructions was increased to 32 bytes, allowing the core to process more instructions simultaneously. Thus, as our diagram shows, up to three instructions can be processed at the same time, depending on their length.
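The effect of a wider fetch window can be sketched by counting how many whole instructions fit into it. This simplified model ignores alignment and every other pipeline detail. The instruction lengths are hypothetical, and the 16-byte comparison window is an assumption, since the text does not state the Athlon 64's buffer size.

```python
def instructions_per_fetch(lengths: list[int], window_bytes: int, max_decode: int = 3) -> int:
    """Count how many whole instructions from the front of `lengths` fit in one fetch window.

    `max_decode` caps the result at the three instructions mentioned in the text.
    """
    used = 0
    count = 0
    for length in lengths:
        if count == max_decode or used + length > window_bytes:
            break
        used += length
        count += 1
    return count


long_instructions = [7, 8, 9]  # hypothetical instruction lengths in bytes

print(instructions_per_fetch(long_instructions, window_bytes=16))  # assumed smaller window
print(instructions_per_fetch(long_instructions, window_bytes=32))  # the Phenom's 32-byte window
```

With these lengths, only two instructions fit in the assumed 16-byte window, while the 32-byte window holds all three.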

Technology II - Branch Prediction, Stack Counter

Object-oriented programming languages such as C++, Delphi and Java cause the most problems for branch prediction units. When branching occurs in assembly code, the question is not only whether or not a jump occurs, but also what code module the jump points to. AMD has analyzed the current crop of compilers and tweaked its branch prediction logic to increase the likelihood that the processor chooses the right branch/path. This allows many programs to execute faster.
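The two questions raised here, whether a jump is taken and where it leads, map onto two classic predictor structures: a saturating counter for the direction and a table of remembered targets for indirect jumps such as virtual method calls. The sketch below shows the textbook versions of both. The class names are illustrative, and nothing here reflects AMD's actual predictor design.

```python
class TwoBitPredictor:
    """Textbook 2-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self) -> None:
        self.state = 2  # start in "weakly taken"

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


class LastTargetPredictor:
    """Remembers the most recent target of each indirect branch, keyed by branch address."""

    def __init__(self) -> None:
        self.targets: dict[int, int] = {}

    def predict(self, branch_addr: int):
        return self.targets.get(branch_addr)

    def update(self, branch_addr: int, target: int) -> None:
        self.targets[branch_addr] = target


# A loop branch that is taken nine times and then falls through once.
direction = TwoBitPredictor()
hits = 0
for taken in [True] * 9 + [False]:
    hits += direction.predict() == taken
    direction.update(taken)
print(f"direction hits: {hits}/10")
```

A last-target table works well when a call site keeps invoking the same method. It mispredicts whenever the object type changes between calls, which is one reason polymorphic code is harder to predict.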

When program code is executed, the memory address of the top of the stack, which is basically a buffer for data, is stored in the ESP register. Until now, while decoding x86 instructions, the processor had to issue separate micro-ops to update the ESP register, which cost processor time. AMD's Phenom now comes with a sideband counter that tracks the stack independently and automatically adjusts the ESP register. Thus, the instructions for updating the ESP no longer have to be executed, speeding up overall program execution.
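The saving can be illustrated with a small counting model. It assumes that, without the sideband counter, every stack instruction costs one ESP-update micro-op, and that with it the adjustments accumulate in a delta that is folded in only when an instruction reads ESP explicitly. Both the model and the "use_esp" marker are simplifications invented for this sketch, not a description of the real hardware.

```python
def count_esp_uops(instructions: list[str], sideband: bool) -> int:
    """Count the ESP-adjustment micro-ops executed for a sequence of stack instructions.

    Without the sideband counter, every push/pop/call/ret costs one ESP update.
    With it, the adjustments accumulate in a delta that is applied only when an
    instruction reads ESP explicitly (modeled here as "use_esp").
    """
    uops = 0
    pending_delta = 0
    for op in instructions:
        if op in ("push", "call"):
            delta = -4  # the 32-bit stack grows downward
        elif op in ("pop", "ret"):
            delta = 4
        elif op == "use_esp":
            if sideband and pending_delta != 0:
                uops += 1  # one synchronizing update folds in the whole delta
                pending_delta = 0
            continue
        else:
            continue
        if sideband:
            pending_delta += delta
        else:
            uops += 1
    return uops


trace = ["push", "push", "call", "push", "use_esp", "pop", "ret", "pop", "pop"]
print(count_esp_uops(trace, sideband=False))
print(count_esp_uops(trace, sideband=True))
```

For this hypothetical trace, the model counts eight ESP updates without the sideband counter and a single synchronizing update with it.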

Technology III - Virtualisation, L3-Cache, HTT 3.0

The virtualisation functionality integrated in the Phenom processor has also received a notable performance boost. Now, operating systems in virtual environments can interact directly with the hypervisor, the management software for virtual machines. This reduces the switching times between the hypervisor and the virtual machines.

This functionality is already found in the Barcelona-based Opteron processors of the server segment. Since both processors, i.e. the Phenom and the new Opteron, use the same core design, this function can now be used on the desktop without limitation as well.

L3 Cache

As a result of AMD's decision to use a single-die design for its quad-core processor, the chipmaker is able to let all four cores share a common L3 cache. Each of the four processor cores possesses its own 512 kB L2 cache. Additionally, all cores have access to the same data pool through the 2 MB of L3 cache. This leads to an additional increase in performance.

Additionally, the L3 cache acts as a write buffer for the system memory, which also brings a small performance increase with it.

Unlike the Athlon 64 X2, the Phenom processor no longer uses the Hypertransport 2.0 interface. Instead, AMD pairs it with the faster Hypertransport 3.0, which increases the available bandwidth to 20.8 GB/s and can result in better 3D performance. This is an especially important factor when a multi-card Crossfire setup is used.

Hypertransport versions and their respective bandwidths:

  • Version 1.0: 6.4 GB/s, 1600 MHz
  • Version 2.0: 8.0 GB/s, 2000 MHz
  • Version 3.0: 20.8 GB/s, 3600 MHz
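The figures for versions 1.0 and 2.0 are consistent with reading the "MHz" value as an effective transfer rate on a 16-bit link, counted in both directions. That reading is an assumption on our part, and the function below is only a sketch of the arithmetic under it.

```python
def ht_bandwidth_gb_s(transfers_mt_s: float, link_bits: int = 16, directions: int = 2) -> float:
    """Aggregate link bandwidth in GB/s for a given transfer rate in MT/s."""
    bytes_per_transfer = link_bits / 8
    return transfers_mt_s * bytes_per_transfer * directions / 1000.0


for rate in (1600, 2000, 3600, 5200):
    print(f"{rate} MT/s -> {ht_bandwidth_gb_s(rate):.1f} GB/s")
```

Under the same assumption, 20.8 GB/s corresponds to 5200 MT/s, whereas 3600 MT/s works out to 14.4 GB/s. The version 3.0 entry may therefore pair the interface's peak bandwidth with a lower operating rate.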

Since the Hypertransport protocol is backward compatible, users will be able to plug the Phenom into older AM2 socket boards as well as the new AM2+ generation.

Direct AMD Comparison - Phenom And Athlon X2

In the following table we compare the most important technical characteristics of the Phenom and the Athlon X2.

The bottom of the Phenom CPU.

Phenom vs. Athlon X2

                     Phenom                                                               Athlon X2
Code Name            Agena                                                                Windsor, Brisbane, Orleans, Lima, Manila
Clock Speed          max. 2.30 GHz                                                        max. 3.20 GHz
Hypertransport       HTT 1.0, HTT 2.0, HTT 3.0                                            HTT 1.0, HTT 2.0
L1 Cache             4x 64+64 kB                                                          2x 64+64 kB
L2 Cache             4x 512 kB                                                            2x 1 MB, 2x 512 kB
L3 Cache             2 MB                                                                 none
Fabrication Process  65 nm                                                                90 nm, 65 nm
Transistors          600 M                                                                227 M, 154 M
Die Area             285 mm²                                                              183 mm², 230 mm²
TDP                  125 W                                                                125 W, 89 W, 65 W, 62 W, 45 W, 35 W
Cool'n'Quiet         Version 2.0                                                          Version 1.0
Instruction Sets     MMX, 3DNow!, NX, X86-64, Pacifica, Presidio, SSE, SSE2, SSE3, SSE4A  MMX, 3DNow!, NX, X86-64, Pacifica, Presidio, SSE, SSE2, SSE3

The Phenom processor carries the code name Agena and uses the B2 stepping.

Phenom - Models

Three different CPU models were mentioned in the first slide of the presentation we received in preparation for AMD's event.

Phenom CPUs
Name             Clock Speed  L2 Cache   L3 Cache  TDP
AMD Phenom 9700  2.4 GHz      4x 512 kB  2 MB      125 W
AMD Phenom 9600  2.3 GHz      4x 512 kB  2 MB      95 W
AMD Phenom 9500  2.2 GHz      4x 512 kB  2 MB      95 W

The first slide of AMD's Phenom presentation

We tested all three models extensively.

AMD Phenom 9700

AMD Phenom 9600

AMD Phenom 9500

However, once the presentation reached the point where specific models were mentioned, AMD only spoke about two of the models.

Now there's only talk of two CPUs...

As you can imagine, the journalists present asked why AMD's presentation was limited to the Phenom 9500 and 9600 models and what had happened to the 9700...

Phenom Caught A Processor Bug - Remembering The Pentium 60

A statement about the Phenom 9700

Somehow, we couldn't help but think back to the P60 as Dave Everitt informed the audience that a processor bug had found its way into the early Phenom processor samples. The bug causes the system to freeze when a certain combination of instructions coincides with extraordinarily high traffic.

This bug can only be reproduced in the lab and does not occur under normal, real-world conditions. It is still present in the 2.2 GHz and 2.3 GHz versions of the Phenom (9500 and 9600).

As a result of this bug, the 2.4 GHz version of the Phenom with the model number 9700 has been pushed back to January of 2008. When it comes out, that version will not contain the bug.

This turn of events caught both the press and AMD's employees by surprise, as the bug was completely unknown before the launch event. Currently, many online stores still list the Phenom 9700, but any attempt to order it should end in a cancellation by the retailer.

Still listed - Phenom 9700

We did not encounter any crashes or instabilities with the CPU we received for testing.

We should mention that bugs like these are nothing extraordinary and are a comparatively commonplace occurrence. The processor makers list these bugs in so-called errata that detail the specifics of each bug. In most cases, the error is fixed in the next stepping of the CPU without the user ever knowing it existed in the first place. Intel, for example, details how the errata on each of its processors can be provoked to cause an exception or error. In other words, the fact that the first batch of Phenom processors has a bug shouldn't be dwelled on all that much.