Optimizing STT-RAM Based Register File Energy Consumption on GPGPU with Delta Compression
Published in the 53rd Design Automation Conference, 2016
To facilitate efficient context switches, GPUs usually employ a large-capacity register file to accommodate a massive amount of context information. However, the large register file introduces high power consumption, owing to high leakage power SRAM cells. Emerging non-volatile STT-RAM memory has recently been studied as a potential replacement to alleviate the leakage challenge when constructing register files on GPUs. Unfortunately, due to the long write latency and high energy consumption associated with write operations in STT-RAM, simply replacing SRAM with STTRAM for register files would incur non-trivial performance overhead and only bring marginal energy benefits.
In this paper, we propose to optimize STT-RAM based GPU register files for better energy-efficiency and performance via two techniques. First, we employ a light-weight compression framework with awareness of register value similarity. It is coupled with a group-based write driver control to mitigate the high energy overhead caused bySTT-RAMwrites. Second,toaddressthelongwrite latency overhead of STT-RAM, we propose a centralized SRAMbased write buffer design to efficiently absorb STT-RAM writes with better buffer utilization, rather than the conventional design with distributed per-bank based write buffers. The experimental results show that our STT-RAM based register file design consumes only 37.4% energy over the SRAM baseline, while incurring only negligible performance degradation.
Recommended citation: Hang Zhang, Xuhao Chen, Nong Xiao, Fang Liu, Optimizing STT-RAM Based Register File Energy Consumption on GPGPU with Delta Compression, In the Proceedings of the 53rd Design Automation Conference (DAC-53), Austin, TX, June 2016.
Download Paper | Download Slides