Paper Note on APCM

Title: Access Pattern-Aware Cache Management for Improving Data Utilization in GPU

Topic: A cache management mechanism that improves GPU performance based on data access patterns

Idea: The GPU memory hierarchy does not serve warps well, and previous warp-level cache mechanisms are too coarse-grained: streaming data cannot benefit, and some instructions with strong temporal locality are forced to bypass the cache. APCM detects the locality type of each static load by monitoring an exemplar warp and uses the cache tag array to track data sharing, then applies a different strategy per locality type, with corresponding hardware support. Streaming data bypasses the L1 cache and goes directly to L2; inter-warp data uses the default LRU policy; intra-warp data is protected by an algorithm that estimates the data's lifespan and keeps it cached until its reuse has mostly ended. A sketch of the classification and per-type dispatch is shown below.
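
The following is a minimal sketch of the idea, not the paper's hardware design: classify each static load from the statistics gathered while a monitored exemplar warp runs, then pick a per-type L1 policy. The `LoadStats` fields, thresholds, and PC values are illustrative assumptions.

```cpp
#include <cstdio>
#include <unordered_map>

enum class LocalityType { Unknown, Streaming, IntraWarp, InterWarp };

struct LoadStats {          // gathered while the exemplar warp executes
    int accesses = 0;       // L1 accesses issued by this static load
    int intraWarpHits = 0;  // re-references by the same warp
    int interWarpHits = 0;  // re-references by other warps (seen via the tag array)
};

LocalityType classify(const LoadStats& s) {
    if (s.accesses == 0) return LocalityType::Unknown;
    if (s.intraWarpHits == 0 && s.interWarpHits == 0)
        return LocalityType::Streaming;           // no reuse observed
    return (s.intraWarpHits >= s.interWarpHits)
               ? LocalityType::IntraWarp           // protect the line in L1
               : LocalityType::InterWarp;          // leave to default LRU
}

const char* policyFor(LocalityType t) {
    switch (t) {
        case LocalityType::Streaming: return "bypass L1, fetch from L2";
        case LocalityType::IntraWarp: return "cache in L1 and protect the line";
        case LocalityType::InterWarp: return "cache in L1 under LRU";
        default:                      return "default policy until classified";
    }
}

int main() {
    // Hypothetical per-load monitor table, keyed by load PC.
    std::unordered_map<unsigned, LoadStats> monitor = {
        {0x100, {64, 0, 0}},   // streaming load
        {0x140, {64, 48, 2}},  // intra-warp reuse dominates
        {0x180, {64, 3, 40}},  // inter-warp sharing dominates
    };
    for (const auto& [pc, stats] : monitor)
        std::printf("load PC 0x%x -> %s\n", pc, policyFor(classify(stats)));
    return 0;
}
```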

Contribution: Analysis of the weaknesses of previous GPU cache management schemes. A hardware-based solution for improving GPU L1 cache performance.

Intellectual merit: Uses the notion of access pattern similarity (APS) to measure the consistency of access patterns. Analyzes the locality data types for per-load optimization. Uses a protection algorithm to exploit the dominant intra-warp locality (see the sketch after this item).
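
Below is a minimal sketch of the lifespan-based protection idea, assuming the controller can attach an estimated remaining-reuse count to each protected line (e.g., derived from the exemplar warp's observed intra-warp re-references). The field names and the counter scheme are illustrative, not the paper's exact design.

```cpp
#include <cstdio>

struct LineState {
    bool protectedLine = false;
    int  remainingReuses = 0;   // estimated reuses left before protection ends
};

// Install a line fetched by a load classified as intra-warp locality.
void onFill(LineState& line, int estimatedReuses) {
    line.protectedLine   = estimatedReuses > 0;
    line.remainingReuses = estimatedReuses;
}

// Each hit consumes one expected reuse; once the estimate is spent, the line
// is unprotected and becomes a normal LRU victim again.
void onHit(LineState& line) {
    if (line.protectedLine && --line.remainingReuses <= 0)
        line.protectedLine = false;
}

bool isEvictable(const LineState& line) { return !line.protectedLine; }

int main() {
    LineState line;
    onFill(line, 3);                        // expect about 3 more uses by this warp
    for (int i = 0; i < 3; ++i) onHit(line);
    std::printf("evictable after reuses: %s\n", isEvictable(line) ? "yes" : "no");
    return 0;
}
```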

Strengths: Clearly illustrates the strategies and hardware design of APCM; detailed experiments demonstrate the performance gains.

Weakness: Heuristic-based; it assumes the locality type of each load tends not to change during execution.
