The Presler Core
With its 65nm process engineering reaching maturity, Intel had a golden opportunity to refresh its processor lineup for both the desktop and mobile platforms. Massive effort went into the Yonah core for the laptop market, Intel's first dual-core mobile processor, now known as the Core Duo (and incidentally the design basis of the upcoming Merom and Conroe processors). The desktop side was not to be outdone, and the new Pentium D 900-series debuted at about the same time as Core Duo. While the new Pentium D based on the Presler core wasn't as groundbreaking as Yonah, it had enough advancements to count as an evolutionary step up.
The old Smithfield-based Pentium D 800-series processors had a single large die (read as 'expensive') that fused two cores with 1MB of L2 cache each; they were basically two Prescott-class processors put together. Sharing the same roots as their single-core Pentium 4 counterparts, they had an identical feature set in every way: Netburst architecture, a deep 31-stage pipeline (per core), Execute Disable Bit, Intel EM64T and Enhanced Intel SpeedStep Technology (EIST). However, unlike the 600-series Pentium 4 processors, which sport a 2MB L2 cache for their single core, the 800-series Pentium D has 2MB of L2 cache in total, meaning only 1MB for the exclusive use of each core. Given the 90nm process technology of the time and the CPU structure of the Smithfield, Intel could only qualify these processors up to 3.2GHz, and they were still operating off the old 800MHz FSB. The reduced cache available to each core and the limited FSB greatly hampered the Pentium D's dual-core processing throughput, on top of other shortcomings such as its limited clock speed range and the high cost of manufacturing the Smithfield.
The newer Presler core of the Pentium D 900-series takes a different route to dual-core. Instead of one monolithic die with two cores, Intel went with two separate dies (with one core each) put together on the same package (as shown in the above diagram). Since each die can also stand on its own as a single-core Pentium 4, Intel can maximize processor production output while minimizing the chances and cost of discarding low-yielding or faulty dies.
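To make the yield argument a little more concrete, here is a minimal sketch using a simple Poisson defect model, which is a common textbook approximation and not something Intel has published for these parts. The defect density is a hypothetical value picked purely for illustration, and the 162mm² area is the combined Presler figure quoted later in this article, used for both cases only so that like is compared with like.

```python
import math

def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies expected to be defect-free under a simple Poisson model."""
    area_cm2 = die_area_mm2 / 100.0
    return math.exp(-area_cm2 * defects_per_cm2)

# Hypothetical defect density, chosen only for illustration.
DEFECTS_PER_CM2 = 0.5

# Case 1: one monolithic die holding both cores.
# A single fatal defect anywhere on the die scraps both cores.
monolithic_area_mm2 = 162.0
monolithic_yield = poisson_yield(monolithic_area_mm2, DEFECTS_PER_CM2)

# Case 2: two separate single-core dies of half the area each.
# Every die is tested on its own, and only known-good dies are paired up.
single_area_mm2 = monolithic_area_mm2 / 2
single_yield = poisson_yield(single_area_mm2, DEFECTS_PER_CM2)

print(f"monolithic dual-core die yield: {monolithic_yield:.1%}")
print(f"single-core die yield:          {single_yield:.1%}")
```

Under this model the fraction of usable silicon is always higher for the smaller dies, whatever defect density is plugged in, and any good die that is not paired can still be sold as a single-core part.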
Each of these dies is manufactured on the 65nm process technology, has a full 2MB L2 cache, and supports all of the features found in the previous generation plus Intel Virtualization Technology (VT). While virtualization was always possible via the software stack alone, hardware support in the CPU greatly simplifies the software side of things and improves the experience of running multiple virtual environments concurrently for dedicated uses without compromising stability or performance. Let's get back to the processor talk, as we'll be covering more about VT in a separate article. The single-core version goes by the codename Cedar Mill, and you can identify Pentium 4 processors using it by their CPU model numbers: the 6x1 class (without VT) and the 6x2 class (with VT). As mentioned earlier, Presler is basically two Cedar Mill cores put together on the same package.
The Presler has 2 x 2MB of L2 cache for a grand total of 4MB per CPU but, like its predecessor, each core can access only its own cache. With more cache per core close to the processing units than on the Pentium D 800-series, the Pentium D 900-series is bound to benefit from the larger pool of ready data to feed its deep processing pipelines. Should there be a branch prediction error, the chances of a cache hit when retrieving the necessary data improve with the doubled capacity. Of course, with a larger cache, the Pentium D 900-series has a better prefetch mechanism to fill it up as well.
The L2 cache of a modern processor requires significant die real estate, and with Presler having twice that of the Smithfield, the transistor count has shot up to a staggering 376 million (both cores combined) from the 230 million on the Smithfield. The 65nm process technology has enabled Intel to pack more onto the processor and, surprisingly, to produce it for less than its predecessor! To prove the point, the die size of the Smithfield is 206mm² while the superior Presler core is only 162mm² (both cores combined). Given the same 300mm wafers used for production, Intel can now obtain up to 25% more Pentium D processors per wafer. And since the same dies are also used for the top-end Pentium 4 series, Intel gets more out of its manufacturing effort at a reduced cost. As such, the 900-series processors retail at the same price as the 800-series or slightly less. Interestingly, the Thermal Design Power envelope of these new processors remains the same as the older generation's despite the new process technology. It is highly likely that the added features and the vast increase in transistor count negated any power savings obtainable.
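The per-wafer claim can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below uses only the die sizes, transistor counts and wafer diameter quoted above; the `gross_dies` helper is our own simplification that divides wafer area by die area and ignores edge loss, scribe lines and defects, so its absolute numbers are an upper bound and not Intel's real output.

```python
import math

SMITHFIELD_MM2 = 206.0   # die size quoted in the article
PRESLER_MM2 = 162.0      # both Presler dies combined, as quoted in the article
WAFER_DIAMETER_MM = 300.0

def gross_dies(die_area_mm2: float, wafer_diameter_mm: float = WAFER_DIAMETER_MM) -> int:
    """Upper bound on processors per wafer: wafer area divided by silicon per processor.

    Simplification: ignores edge loss, scribe lines and defective dies.
    """
    wafer_area_mm2 = math.pi * (wafer_diameter_mm / 2) ** 2
    return math.floor(wafer_area_mm2 / die_area_mm2)

area_ratio = SMITHFIELD_MM2 / PRESLER_MM2
smithfield_per_wafer = gross_dies(SMITHFIELD_MM2)
presler_per_wafer = gross_dies(PRESLER_MM2)

# Transistor density in millions of transistors per mm^2.
smithfield_density = 230.0 / SMITHFIELD_MM2
presler_density = 376.0 / PRESLER_MM2

print(f"area ratio (Smithfield / Presler): {area_ratio:.2f}")
print(f"gross processors per wafer: {smithfield_per_wafer} vs {presler_per_wafer}")
print(f"density (M transistors/mm^2): {smithfield_density:.2f} vs {presler_density:.2f}")
```

The area ratio works out to about 1.27, which is in the same ballpark as the 25% gain mentioned above, while the transistor density roughly doubles with the move from 90nm to 65nm.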
The new Pentium Extreme Edition processors based on the Pentium D 900-series have another two aces up their sleeves: an FSB boost to 1066MHz for higher throughput, and higher clock speeds of up to 3.73GHz on the 965 model. These come in addition to Hyper-Threading technology, which gives them a total of 4 logical CPUs. Together with the enhancements made to the Pentium D lineup, the top-of-the-line Pentium XE processors can pose a formidable challenge to AMD's best Athlon 64 X2 and Athlon 64 FX processors. Laid out below is a comparison table of the features and characteristics of the current best-in-class CPUs from both Intel and AMD:
|Processor Name||Pentium Extreme Edition||Pentium Extreme Edition||Pentium 4 Extreme Edition||AMD Athlon 64||AMD Athlon 64||AMD Athlon 64 X2|
|Processor Model||965 / 955||840||-||FX-60||FX-57||4800+|
|Processor Frequency||3.73GHz / 3.46GHz||3.2GHz||3.73GHz||2.6GHz||2.8GHz||2.4GHz|
|No. of Cores||2||2||1||2||1||2|
|No. of Logical Processors||4||4||2||2||1||2|
|Front Side Bus (MHz)||1066||800||1066||-||-||-|
|HyperTransport Bus||-||-||-||1GHz (2000MT/s)||1GHz (2000MT/s)||1GHz (2000MT/s)|
|L1 Cache (data + instruction)||(16KB + 12KB) x 2||(16KB + 12KB) x 2||16KB + 12KB||(64KB + 64KB) x 2||64KB + 64KB||(64KB + 64KB) x 2|
|L2 Cache||2MB x 2||1MB x 2||2MB||1MB x 2||1MB||1MB x 2|
|VID (V)||1.20 - 1.3375||1.20 - 1.40||1.25 - 1.40||1.35 - 1.40||1.35 - 1.40||1.35 - 1.40|
|Icc (max) (A)||125||125||119||80||74.9||80|
|Execute Disable Bit||Yes||Yes||Yes||Yes||Yes||Yes|
|Intel EM64T / AMD64||Yes||Yes||Yes||Yes||Yes||Yes|
|Enhanced Intel SpeedStep Technology (EIST) / AMD Cool 'n' Quiet||No||No||No||Yes||Yes||Yes|
|Process Technology||65nm||90nm||90nm||90nm SOI||90nm SOI||90nm SOI|
|Processor Codename||Presler||Smithfield||Prescott 2M||Toledo||San Diego||Toledo|
|No. of Transistors||376 million||230 million||169 million||233.2 million||113 million||233.2 million|
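Two rows of the table lend themselves to a quick calculation: the front side bus speed and the logical processor count. The sketch below is ours and rests on one assumption not stated in this article, namely that the FSB data path is 64 bits (8 bytes) wide, so that peak bandwidth is simply the effective bus speed multiplied by 8 bytes. The function names are made up for illustration.

```python
# Assumption (not stated in the article): the FSB data path is 64 bits wide.
FSB_WIDTH_BYTES = 8

def fsb_peak_gb_per_s(effective_mhz: float) -> float:
    """Theoretical peak FSB bandwidth in GB/s for a given effective bus speed."""
    return effective_mhz * 1e6 * FSB_WIDTH_BYTES / 1e9

def logical_processors(cores: int, hyper_threading: bool) -> int:
    """Logical CPUs seen by the OS: two threads per core with Hyper-Threading."""
    return cores * (2 if hyper_threading else 1)

for name, fsb_mhz, cores, ht in [
    ("Pentium XE 965 / 955", 1066, 2, True),
    ("Pentium XE 840", 800, 2, True),
    ("Pentium 4 XE 3.73GHz", 1066, 1, True),
]:
    print(f"{name}: {fsb_peak_gb_per_s(fsb_mhz):.1f} GB/s peak FSB, "
          f"{logical_processors(cores, ht)} logical processors")
```

Keep in mind that both cores of a Pentium D or Pentium XE share this one bus, so the peak figure is a ceiling for the whole package and not a per-core number.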