Title: Architectural Support for Address Translation on GPUs
Topic: Explores GPU memory management units (MMUs), which consist of translation lookaside buffers (TLBs) and page table walkers (PTWs); analyzes current designs and proposes augmentations.
Idea: In heterogeneous system architectures, a unified virtual address space shared by CPU and GPU memory offers programmability benefits but needs efficient hardware support, and the GPU's warp-based execution model plays an important role in how that hardware should be designed. Because a CPU-like MMU design would hurt GPU performance, the authors explore how MMU design interacts with cache-conscious warp/wavefront scheduling and how the TLB affects dynamic warp formation. They study and propose methods in three areas: address translation for GPUs, cache-conscious warp scheduling, and thread block compaction.
Benefits of their work: removes the need for the CPU to manage the GPU MMU, supports multiple contexts, and supports application libraries.
General insight: The default warp scheduling policy breaks temporal locality, and more sophisticated warp scheduling schemes lose effectiveness.
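The locality point can be illustrated with a toy simulation. The sketch below is not the paper's model: it assumes a small fully associative LRU TLB, assumes that "default" scheduling means round-robin interleaving of warps, and uses a run-to-completion order as a stand-in for a locality-preserving scheduler. All names and parameters (`LruTlb`, `warp_trace`, 8 warps, 4 pages per warp, 8 TLB entries) are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class LruTlb:
    """Toy fully associative TLB with LRU replacement (assumed design)."""

    def __init__(self, entries):
        self.entries = entries
        self.map = OrderedDict()  # virtual page number -> present
        self.misses = 0

    def access(self, vpn):
        if vpn in self.map:
            self.map.move_to_end(vpn)  # mark as most recently used
        else:
            self.misses += 1
            self.map[vpn] = True
            if len(self.map) > self.entries:
                self.map.popitem(last=False)  # evict least recently used


def warp_trace(warp_id, pages_per_warp, repeats):
    """Each warp loops over its own private set of pages."""
    base = warp_id * pages_per_warp
    return [base + p for _ in range(repeats) for p in range(pages_per_warp)]


def round_robin(traces):
    """Interleave warps: one access per warp per turn."""
    order = []
    for step in range(max(len(t) for t in traces)):
        for t in traces:
            if step < len(t):
                order.append(t[step])
    return order


def run_to_completion(traces):
    """Run each warp's accesses back to back before switching warps."""
    return [vpn for t in traces for vpn in t]


def miss_rate(order, entries):
    tlb = LruTlb(entries)
    for vpn in order:
        tlb.access(vpn)
    return tlb.misses / len(order)


# Hypothetical workload: 8 warps, 4 pages each, 16 passes, 8-entry TLB.
traces = [warp_trace(w, pages_per_warp=4, repeats=16) for w in range(8)]
print(miss_rate(round_robin(traces), entries=8))        # 1.0
print(miss_rate(run_to_completion(traces), entries=8))  # 0.0625
```

In this toy setup the interleaved order cycles through 32 distinct pages before any page is reused, so an 8-entry LRU TLB misses on every access, while the run-to-completion order only pays the cold misses for each warp's 4 pages. Real GPU schedulers and TLB hierarchies are far more complex; the sketch only shows why the interleaving order matters for translation reuse.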