TY - GEN
T1 - CoLT: Coalesced Large-Reach TLBs
T2 - 2012 IEEE/ACM 45th International Symposium on Microarchitecture, MICRO 2012
AU - Pham, Binh
AU - Vaidyanathan, Viswanathan
AU - Jaleel, Aamer
AU - Bhattacharjee, Abhishek
PY - 2012
Y1 - 2012
N2 - Translation Lookaside Buffers (TLBs) are critical to system performance, particularly as applications demand larger working sets and with the adoption of virtualization. Architectural support for superpages has previously been proposed to improve TLB performance. By allocating contiguous physical pages to contiguous virtual pages, the operating system (OS) constructs superpages, which need just one TLB entry rather than the hundreds required for the constituent base pages. While this greatly reduces TLB misses, these gains are often offset by the implementation difficulties of generating and managing ample contiguity for superpages. We show, however, that basic OS memory allocation mechanisms such as buddy allocators and memory compaction naturally assign contiguous physical pages to contiguous virtual pages. Our real-system experiments show that while usually insufficient for superpages, these intermediate levels of contiguity exist under various system conditions and even under high load. In response, we propose Coalesced Large-Reach TLBs (CoLT), which leverage this intermediate contiguity to coalesce multiple virtual-to-physical page translations into single TLB entries. We show that CoLT implementations eliminate 40% to 58% of TLB misses on average, improving performance by 14%. Overall, we demonstrate that the OS naturally generates page allocation contiguity. CoLT exploits this contiguity to eliminate TLB misses for next-generation, big-data applications with low-overhead implementations.
AB - Translation Lookaside Buffers (TLBs) are critical to system performance, particularly as applications demand larger working sets and with the adoption of virtualization. Architectural support for superpages has previously been proposed to improve TLB performance. By allocating contiguous physical pages to contiguous virtual pages, the operating system (OS) constructs superpages, which need just one TLB entry rather than the hundreds required for the constituent base pages. While this greatly reduces TLB misses, these gains are often offset by the implementation difficulties of generating and managing ample contiguity for superpages. We show, however, that basic OS memory allocation mechanisms such as buddy allocators and memory compaction naturally assign contiguous physical pages to contiguous virtual pages. Our real-system experiments show that while usually insufficient for superpages, these intermediate levels of contiguity exist under various system conditions and even under high load. In response, we propose Coalesced Large-Reach TLBs (CoLT), which leverage this intermediate contiguity to coalesce multiple virtual-to-physical page translations into single TLB entries. We show that CoLT implementations eliminate 40% to 58% of TLB misses on average, improving performance by 14%. Overall, we demonstrate that the OS naturally generates page allocation contiguity. CoLT exploits this contiguity to eliminate TLB misses for next-generation, big-data applications with low-overhead implementations.
UR - http://www.scopus.com/inward/record.url?scp=84876544775&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84876544775&partnerID=8YFLogxK
U2 - 10.1109/MICRO.2012.32
DO - 10.1109/MICRO.2012.32
M3 - Conference contribution
AN - SCOPUS:84876544775
SN - 9780769549248
T3 - Proceedings - 2012 IEEE/ACM 45th International Symposium on Microarchitecture, MICRO 2012
SP - 258
EP - 269
BT - Proceedings - 2012 IEEE/ACM 45th International Symposium on Microarchitecture, MICRO 2012
Y2 - 1 December 2012 through 5 December 2012
ER -