It is based on a 2-issue VLIW architecture with a SIMD instruction in each slot. Operating at 50 MHz, it achieves 400 MFLOPS by executing two 4-channel floating-point operations simultaneously, transforming 12.5M vertices/s.
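The headline figures can be cross-checked with simple arithmetic. The sketch below assumes that each SIMD channel retires one floating-point operation per cycle, which the text implies but does not state; the constant and function names are illustrative only.

```python
# Figures quoted in the text.
CLOCK_HZ = 50_000_000       # 50 MHz operating frequency
ISSUE_SLOTS = 2             # 2-issue VLIW
SIMD_CHANNELS = 4           # 4-channel floating-point SIMD per slot
VERTEX_RATE = 12_500_000    # 12.5M vertices/s

def peak_flops(clock_hz, slots, channels):
    """Peak rate, assuming one floating-point op per channel per cycle."""
    return clock_hz * slots * channels

def cycles_per_vertex(clock_hz, vertex_rate):
    """Average clock cycles available to transform one vertex."""
    return clock_hz / vertex_rate

print(peak_flops(CLOCK_HZ, ISSUE_SLOTS, SIMD_CHANNELS) / 1e6, "MFLOPS")
print(cycles_per_vertex(CLOCK_HZ, VERTEX_RATE), "cycles per vertex")
```

Under that assumption, 50 MHz x 2 slots x 4 channels gives the quoted 400 MFLOPS, and the quoted vertex rate corresponds to 4 cycles per vertex.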
AMT with data forwarding reduces data hazards, which improves performance by cutting pipeline bubbles, and reduces the datapath's accesses to the register files, which lowers power consumption.
A geometry-content-aware technique called ERAT is developed to reduce power consumption and increase performance by rejecting redundant triangles after the transform stage.
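The text does not specify which triangles ERAT treats as redundant. As a rough illustration of rejecting triangles after the transform stage, the sketch below applies two standard culling tests to clip-space vertices: a trivial frustum reject and a back-face or zero-area test. These tests, the counter-clockwise front-face convention, and all function names are assumptions chosen for illustration, not the processor's actual criteria.

```python
def outside_same_plane(tri):
    """True if all three clip-space vertices (x, y, z, w) lie beyond
    the same clip plane, so the triangle cannot be visible."""
    for axis in range(3):
        if all(v[axis] > v[3] for v in tri):
            return True
        if all(v[axis] < -v[3] for v in tri):
            return True
    return False

def signed_area(tri):
    """Signed screen-space area after the perspective divide.
    Positive means counter-clockwise winding (assumed front-facing)."""
    (x0, y0), (x1, y1), (x2, y2) = [(v[0] / v[3], v[1] / v[3]) for v in tri]
    return 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))

def is_redundant(tri):
    """Decide whether a transformed triangle can be dropped early."""
    if outside_same_plane(tri):
        return True
    # Conservative choice: keep triangles with w <= 0 for a later clipper.
    if any(v[3] <= 0 for v in tri):
        return False
    # Back-facing or degenerate triangles contribute no pixels.
    return signed_area(tri) <= 0

front = [(0, 0, 0, 1), (1, 0, 0, 1), (0, 1, 0, 1)]
back = [(0, 0, 0, 1), (0, 1, 0, 1), (1, 0, 0, 1)]
offscreen = [(2, 0, 0, 1), (3, 0, 0, 1), (2, 1, 0, 1)]
kept = [t for t in (front, back, offscreen) if not is_redundant(t)]
```

Every triangle dropped here skips all downstream work, which is where the power and performance gains from such a technique would come from.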
(a) shows the power reduction obtained when all three proposed key techniques are employed.
(b) shows that a 1.82-times improvement is achieved compared with the state-of-the-art vertex processor.