TY - GEN
T1 - Increasing TLB reach by exploiting clustering in page translations
AU - Pham, Binh
AU - Bhattacharjee, Abhishek
AU - Eckert, Yasuko
AU - Loh, Gabriel H.
PY - 2014
Y1 - 2014
AB - The steadily increasing sizes of main memory capacities require corresponding increases in the processor's translation lookaside buffer (TLB) resources to avoid performance bottlenecks. Large operating system page sizes can mitigate the bottleneck with a smaller TLB, but most OSs and applications do not fully utilize the large-page support in current hardware. Recent work has shown that, while not guaranteed, some virtual-to-physical page mappings exhibit 'contiguous' spatial locality in which consecutive virtual pages map to consecutive physical pages. Such locality provides opportunities to coalesce 'adjacent' TLB entries for increased reach. We observe that beyond simple adjacent-entry coalescing, many more translations exhibit 'clustered' spatial locality in which a group or cluster of nearby virtual pages map to a similarly clustered set of physical pages. In this work, we provide a detailed characterization of the spatial locality among the virtual-to-physical translations. Based on this characterization, we present a multi-granular TLB organization that significantly increases its effective reach and reduces miss rates substantially while requiring no additional OS support. Our evaluation shows that the multi-granular design outperforms conventional TLBs and the recently proposed coalesced TLBs technique.
UR - http://www.scopus.com/inward/record.url?scp=84903973894&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84903973894&partnerID=8YFLogxK
U2 - 10.1109/HPCA.2014.6835964
DO - 10.1109/HPCA.2014.6835964
M3 - Conference contribution
AN - SCOPUS:84903973894
SN - 9781479930975
T3 - Proceedings - International Symposium on High-Performance Computer Architecture
SP - 558
EP - 567
BT - 20th IEEE International Symposium on High Performance Computer Architecture, HPCA 2014
PB - IEEE Computer Society
T2 - 20th IEEE International Symposium on High Performance Computer Architecture, HPCA 2014
Y2 - 15 February 2014 through 19 February 2014
ER -