2D Dual-Rate Texturing in D-Series GPUs
For every GPU generation, the performance teams within Imagination run a wide range of content, analysing the different workload types and their bottlenecks. This analysis revealed that many modern games spend an increasing amount of time executing post-processing algorithms for depth of field, bloom, blur and other effects.
Most of these post-processing passes are texture-sampling-heavy filter effects that are modest in ALU requirements but bottlenecked by the throughput of the Texture Processing Unit (TPU). One way to resolve this would be brute force: change the ratio of TPUs to USC/ALU throughput by adding more TPUs. However, our analysis indicated this was not a good strategy, for several reasons.
First, in regular render passes the ratio of ALU to TPU throughput in D-Series GPUs was already optimal, so adding another TPU would bring no benefit: the workload would simply become ALU-limited. Second, other processing passes were TPU-heavy but also bandwidth-heavy, so boosting the TPU would not help there either, as there would be insufficient bandwidth to feed the extra TPU throughput.
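The bottleneck reasoning above can be illustrated with a toy model. The sketch below is not Imagination's analysis method: it assumes a pass takes as long as its slowest resource (ALU, TPU or memory bandwidth), and every workload figure, rate and name in it is hypothetical and chosen only to show the shape of the argument.

```python
from dataclasses import dataclass


@dataclass
class RenderPass:
    """Hypothetical per-pass workload, in arbitrary units."""
    name: str
    alu_ops: float      # ALU work in the pass
    tex_samples: float  # texture samples issued to the TPU
    bytes_moved: float  # memory traffic generated by the pass


def pass_time(p, alu_rate, tpu_rate, bandwidth):
    """Return (time, limiter), assuming the slowest resource sets the time."""
    times = {
        "ALU": p.alu_ops / alu_rate,
        "TPU": p.tex_samples / tpu_rate,
        "bandwidth": p.bytes_moved / bandwidth,
    }
    limiter = max(times, key=times.get)
    return times[limiter], limiter


def speedup_from_doubling_tpu(p, alu_rate=1.0, tpu_rate=1.0, bandwidth=1.0):
    """Speedup of one pass when only the TPU rate is doubled."""
    before, _ = pass_time(p, alu_rate, tpu_rate, bandwidth)
    after, _ = pass_time(p, alu_rate, 2 * tpu_rate, bandwidth)
    return before / after


# Illustrative passes only; the numbers are assumptions, not measurements.
regular = RenderPass("regular", alu_ops=100, tex_samples=100, bytes_moved=60)
bw_heavy = RenderPass("bandwidth-heavy", alu_ops=20, tex_samples=100, bytes_moved=100)
tpu_bound = RenderPass("TPU-bound filter", alu_ops=20, tex_samples=100, bytes_moved=30)

if __name__ == "__main__":
    for p in (regular, bw_heavy, tpu_bound):
        print(p.name, speedup_from_doubling_tpu(p))
```

Under these assumptions the balanced pass becomes ALU-limited and the bandwidth-heavy pass stays bandwidth-limited, so neither gains from the extra TPU throughput. Only a pass that is TPU-bound with ALU and bandwidth headroom speeds up.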