Return of HyperThreading, Turbo Mode and More
Tri-Cache Level Structure
Not only has Intel gone with a very AMD-like integrated memory controller and NUMA architecture, even the new cache hierarchy on the Core i7 is similar to that of AMD's Barcelona. Again, this move was obvious: with the memory controller now integrated, Intel no longer needs the large L1 and L2 caches that bolstered its Core 2 processors. Instead, a massive 8MB of L3 cache is found on the Core i7, along with a new second-level TLB to improve virtual address translations.
Speaking of the new L3 cache, Intel does not actually count it as part of the new 'core'. Instead, it is classified under the (strangely named, if logical) 'uncore' portion of the processor. This fits the scalable nature of the Core i7: the QPI and the memory controller are other examples of the basic building blocks that go into the design of a Nehalem processor and are considered part of the 'uncore'. Intel has even hinted at adding a GPU as another such building block, in a direct counter to AMD's as yet unreleased Fusion.
The Return of HyperThreading
What you should know is that beneath all the additions, the basic quad-core design at the heart of the Nehalem is still mostly the same as the Core 2 processor. Of course, there have been some enhancements to the internal algorithms and branch prediction capabilities, but the main story here is the return of an 'old' feature. Yes, the notorious HyperThreading of the Pentium 4 era is back.
Now known as Simultaneous Multi-threading (SMT), Intel claims that it has been enhanced, but the idea is very similar: each processor core is fed more than one thread simultaneously. As such, the quad-core Core i7 will be seen as 8 logical cores by the OS. With multithreaded applications more prevalent than in the Pentium 4 days, and with the staggering memory bandwidth now found on the Nehalem, Intel may have a point about HyperThreading Redux having some utility this time around. Like before, though, it is heavily reliant on the threaded nature of the applications, which are still few and far between in the mainstream area. Workstation and server workloads are far more threaded and would gain a good deal more from HyperThreading.
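To make the '8 logical cores' point concrete, here is a minimal Python sketch of how software sees an SMT processor. The helper name `logical_cpus` is hypothetical and introduced only for illustration, and the default of two threads per core is an assumption that matches the quad-core, 8-thread figure quoted above.

```python
import os


def logical_cpus(physical_cores: int, threads_per_core: int = 2) -> int:
    """Return the number of logical CPUs the OS would enumerate.

    threads_per_core=2 is an assumption matching the article's
    quad-core / 8-logical-core description of the Core i7.
    """
    if physical_cores < 1 or threads_per_core < 1:
        raise ValueError("core and thread counts must be positive")
    return physical_cores * threads_per_core


if __name__ == "__main__":
    # What the article describes for a quad-core Core i7 with SMT enabled.
    print("Core i7 (4 cores, SMT on):", logical_cpus(4))
    # What the OS reports on the machine running this script.
    # os.cpu_count() counts logical CPUs and may return None.
    print("This machine:", os.cpu_count())
```

A thread pool sized from `os.cpu_count()` would therefore start eight workers on such a chip, which only pays off if the application actually has that many threads of work to run.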
Finally, Intel has a Turbo Mode option for the Core i7, where the user can define the clock speeds for each core within the quad-core Core i7 individually (by adjusting the clock ratio). So if your application is only using up to two cores, the Core i7 can allow for higher clock speeds for those two cores in use. It's a compromise made so that users can have the best of both worlds - better performance with higher clocks when fewer cores are used, while retaining the capability of more cores when the application needs it (scaling the clock speeds down to default or to a less aggressive turbo level with all cores engaged).
** Updated on 10th December 2008 **
While the Turbo Mode option is available on all Core i7 models, only the Extreme Edition model has the core ratios completely unlocked, so one can manually increase or decrease the default Turbo Mode multipliers at will. Additionally, the Extreme Edition has no strict TDP limitation, which would otherwise throttle CPU performance once the chip hits its thermal envelope (as is the case for the other two Core i7 models).
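The relationship between clock ratio and clock speed described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 133.33MHz base clock is inferred from the frequencies quoted in this article (for example, 2.66GHz at a 20x ratio), and the `ILLUSTRATIVE_TURBO_STEPS` table is a made-up example of per-core-count ratio bumps, not Intel's official settings.

```python
# Assumption: base clock in MHz, inferred from the quoted model frequencies.
BCLK_MHZ = 133.33

# Hypothetical example: extra ratio steps keyed by number of active cores.
# Fewer active cores get a more aggressive bump.
ILLUSTRATIVE_TURBO_STEPS = {1: 2, 2: 1, 3: 1, 4: 1}


def core_clock_ghz(multiplier: int, bclk_mhz: float = BCLK_MHZ) -> float:
    """Core clock = clock ratio (multiplier) x base clock."""
    return multiplier * bclk_mhz / 1000.0


def turbo_multiplier(default_multiplier: int, active_cores: int,
                     steps: dict = ILLUSTRATIVE_TURBO_STEPS) -> int:
    """Return the clock ratio applied for a given number of active cores."""
    if active_cores not in steps:
        raise ValueError("active_cores must be one of %s" % sorted(steps))
    return default_multiplier + steps[active_cores]


if __name__ == "__main__":
    default_ratio = 20  # assumed ratio for a 2.66GHz part
    for cores in sorted(ILLUSTRATIVE_TURBO_STEPS):
        ratio = turbo_multiplier(default_ratio, cores)
        print(f"{cores} active core(s): {ratio}x = {core_clock_ghz(ratio):.2f}GHz")
```

On an Extreme Edition part, the unlocked ratios would amount to editing the step table freely. On the other models the steps are fixed and the TDP limit can still pull the clocks back down.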
Now that you know the important features of the Core i7, here's how it stacks up against current quad-core 'equivalents' from AMD and Intel.
|Processor Name||Core i7||Core 2 Extreme / Quad (45nm)||AMD Phenom X4|
|Processor Model||i7-965 Extreme Edition, i7-940, i7-920||QX9770, QX9650, Q9650, Q9550, Q9450, Q9400, Q9300||9950 'Black Edition', 9850 'Black Edition', 9850, 9750, 9650, 9550|
|Processor Frequency||3.2GHz, 2.93GHz, 2.66GHz||3.2GHz, 3.0GHz, 2.83GHz, 2.66GHz, 2.5GHz||2.6GHz, 2.5GHz, 2.4GHz, 2.3GHz, 2.2GHz|
|No. of Cores||4||4||4|
|Front Side Bus (MHz)||-||1333||-|
|HyperTransport Bus / QuickPath Interconnect||6.4GT/sec for i7-965 XE, 4.8GT/sec for i7-940, i7-920||-||2.0GHz (9950, 9850 only), 1.8GHz|
|L1 Cache (data + instruction)||(32KB + 32KB) x 4||(32KB + 32KB) x 4||(64KB + 64KB) x 4|
|L2 Cache||256KB x 4||6MB x 2, 3MB x 2 (Q9300 only)||512KB x 4|
|Memory Controller||Integrated Triple Channel (up to DDR3-1066)||External - Chipset Dependent||Integrated Dual Channel (up to DDR2-1066)|
|TDP (W)||130||95 - 130||95 - 125|
|Instruction Set Support||MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2||MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1||MMX, SSE, SSE2, SSE3, SSE4a|
|Execute Disable Bit||Yes||Yes||Yes|
|Intel EM64T / AMD64||Yes||Yes||Yes|
|Enhanced Intel SpeedStep Technology (EIST) / AMD Cool 'n' Quiet||Yes||Yes||Yes|
|Virtualization Technology||Yes (Enhanced)||Yes (Enhanced)||Yes|
|Process Technology||45nm||45nm||65nm SOI|
|No. of Transistors||731 million||820 million||450 million|
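To put the memory controller and QPI rows of the table into perspective, the following Python sketch converts them into theoretical peak bandwidth figures. It rests on two assumptions that are not stated in the table: each memory channel is 64 bits wide, and each QPI link carries 2 bytes of data per transfer in each direction. Real-world throughput will be lower than these peaks.

```python
def peak_memory_bandwidth_gbs(channels: int, transfers_mts: float,
                              bus_width_bits: int = 64) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    channels       -- number of memory channels (3 for Core i7, 2 for Phenom X4)
    transfers_mts  -- transfer rate in MT/s (1066 for DDR3-1066 or DDR2-1066)
    bus_width_bits -- assumed width of one channel (64 bits)
    """
    bytes_per_transfer = bus_width_bits / 8
    return channels * transfers_mts * 1e6 * bytes_per_transfer / 1e9


def qpi_bandwidth_gbs(gigatransfers_per_sec: float,
                      bytes_per_transfer: int = 2) -> float:
    """Theoretical QPI bandwidth per direction in GB/s (assumed 2-byte payload)."""
    return gigatransfers_per_sec * bytes_per_transfer


if __name__ == "__main__":
    i7 = peak_memory_bandwidth_gbs(channels=3, transfers_mts=1066)
    phenom = peak_memory_bandwidth_gbs(channels=2, transfers_mts=1066)
    print(f"Core i7, triple-channel DDR3-1066: {i7:.1f} GB/s")
    print(f"Phenom X4, dual-channel DDR2-1066: {phenom:.1f} GB/s")
    print(f"QPI at 6.4GT/sec: {qpi_bandwidth_gbs(6.4):.1f} GB/s per direction")
    print(f"QPI at 4.8GT/sec: {qpi_bandwidth_gbs(4.8):.1f} GB/s per direction")
```

At the same transfer rate, the third memory channel alone gives the Core i7 a 50% higher theoretical peak than a dual-channel design.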