GeForce GTX 200 Architectural Enhancements - Part 2
Texture Filtering and ROP Improvements
Previously, the GeForce 9800 GTX could address and filter up to 64 pixels per clock. The GeForce GTX 200 GPUs raise this to 80 pixels per clock on the high-end model, while keeping texture addressing and filtering evenly balanced. A more efficient scheduler also lets the new GPUs get closer to their theoretical peak rates than the GeForce 9 series could.
The amount of texture hardware per TPC is the same as before, but NVIDIA has increased the number of shaders by adding one more SM to each TPC, which gives a higher shader-to-texture ratio. The aim is a GPU better balanced for modern games and applications, which are shifting toward more complex shaders.
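The ratio shift can be checked with some quick arithmetic. The sketch below is only an illustration and leans on the commonly published unit counts, which are assumptions not stated in the text above: 8 stream processors per SM and 8 texture units per TPC on both designs, with 8 TPCs of 2 SMs each on the GeForce 9800 GTX and 10 TPCs of 3 SMs each on the GeForce GTX 280. The function name is hypothetical.

```python
from fractions import Fraction

# Assumed per-unit counts from published specs (not given in the article).
SPS_PER_SM = 8          # stream processors in each SM
TEX_UNITS_PER_TPC = 8   # texture address/filter units in each TPC


def shader_to_texture(tpcs, sms_per_tpc):
    """Return (stream processors, texture units, ratio) for a TPC layout."""
    shaders = tpcs * sms_per_tpc * SPS_PER_SM
    tex_units = tpcs * TEX_UNITS_PER_TPC
    return shaders, tex_units, Fraction(shaders, tex_units)


# Assumed layouts: 8 TPCs x 2 SMs versus 10 TPCs x 3 SMs.
gf9800gtx = shader_to_texture(tpcs=8, sms_per_tpc=2)
gtx280 = shader_to_texture(tpcs=10, sms_per_tpc=3)

for name, (sp, tu, ratio) in (("GeForce 9800 GTX", gf9800gtx),
                              ("GeForce GTX 280", gtx280)):
    print(f"{name}: {sp} SPs, {tu} texture units, ratio {ratio}:1")
```

Under these assumptions the texture unit totals come out at 64 and 80, matching the per-clock figures quoted earlier, while the shader-to-texture ratio moves from 2:1 to 3:1.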
In terms of ROP (Raster Operations Processor) hardware, the larger number of TPCs calls for more ROPs to keep the GPU balanced. Hence, there are now 8 ROP partitions on the GTX 200 compared to 6 on the GeForce 8800 GTX. Additionally, ROP frame buffer blending for pixels in the 8-bit unsigned integer data format now runs at twice the previous speed.
Geometry shading and stream out performance have also received an upgrade. NVIDIA has boosted the number of internal output buffer structures by up to six times for the GTX 200. This was done after feedback that the quantity available on the GeForce 9 series was inadequate for certain applications.
Using the Right Amount of Power
As one would expect from their transistor count, the GTX 200 GPUs can consume a staggering amount of power. NVIDIA itself states that the GTX 280 has a maximum TDP of 236W in full 3D mode. NVIDIA has therefore taken pains to introduce a more aggressive dynamic power management architecture for the GTX 200.
Besides supporting NVIDIA's new Hybrid SLI technology, which can turn off the discrete GPU completely (so that it draws 0W) in favor of the integrated GPU on the latest NVIDIA nForce motherboards, the GTX 200 can effectively 'turn off' sections of the GPU thanks to its clock-gating circuitry. According to NVIDIA, the GTX 200 consumes around 25W at idle, going up to 35W when playing back HD videos (e.g. using the PureVideo HD capable VP2). These clock and voltage adjustments happen automatically inside the GPU and respond within milliseconds to its utilization rate, so as to maximize power savings while still allowing the GPU to perform at its peak when necessary.
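To make the idea of utilization-driven power modes more concrete, here is a rough sketch of how such a policy could be modeled. It is not NVIDIA's actual logic, which is not described in detail here: the mode names, the `pick_mode` and `average_power` helpers and the 10% utilization threshold are all hypothetical. Only the wattages come from NVIDIA's stated figures above, and the 236W value is a worst-case ceiling rather than a typical draw.

```python
# Approximate power figures quoted by NVIDIA for the GTX 280, in watts.
POWER_MODES_W = {
    "hybrid_power": 0,   # discrete GPU switched off, integrated GPU in use
    "idle_2d": 25,       # idle desktop
    "hd_video": 35,      # HD video playback
    "full_3d": 236,      # maximum TDP, used here as an upper bound
}


def pick_mode(gpu_util, video_engine_busy, igpu_can_take_over=False):
    """Hypothetical policy: pick the lowest-power mode that fits the load.

    gpu_util is a fraction from 0.0 to 1.0. The 0.10 threshold is invented
    for illustration and does not reflect real driver behavior.
    """
    if gpu_util > 0.10:
        return "full_3d"
    if video_engine_busy:
        return "hd_video"
    if igpu_can_take_over:
        return "hybrid_power"
    return "idle_2d"


def average_power(samples):
    """Upper-bound average power (W) over (gpu_util, video_engine_busy) samples."""
    modes = [pick_mode(util, video) for util, video in samples]
    return sum(POWER_MODES_W[m] for m in modes) / len(modes)


# A made-up trace: idle, video playback, a burst of 3D work, idle again.
trace = [(0.0, False), (0.02, True), (0.9, False), (0.0, False)]
print([pick_mode(u, v) for u, v in trace])
print(average_power(trace))
```

The point of the sketch is simply that the faster the GPU can drop back to a low-power mode after a burst of work, the lower the average draw over a typical desktop session.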
Using the latest version of the GPU-Z utility, we managed to catch a snapshot of the clock speeds on the GeForce GTX 280 ramping up from its idle state when starting 3DMark Vantage. Of course, we aren't completely sure about the accuracy of GPU-Z's readings, given how new the GTX 280 is, but the results seem fairly reasonable.
In case you were keeping count of everything we have mentioned (plus some minor items that we have left out), below is NVIDIA's own tally of the improvements made from the GeForce 8800 GTX to the GeForce GTX 280. (Of course, some of the figures here, such as FB bandwidth and PCI Express, are moot, as these are theoretical numbers that are unlikely to be reached in either case.)