TY - GEN
T1 - CrossPrefetch
T2 - 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2024
AU - Garg, Shaleen
AU - Zhang, Jian
AU - Pitchumani, Rekha
AU - Parashar, Manish
AU - Xie, Bing
AU - Kannan, Sudarsun
N1 - Publisher Copyright:
© 2024 Copyright is held by the owner/author(s). Publication rights licensed to ACM.
PY - 2024/4/27
Y1 - 2024/4/27
N2 - We introduce CrossPrefetch, a novel cross-layered I/O prefetching mechanism that operates across the OS and a user-level runtime to achieve optimal performance. Existing OS prefetching mechanisms suffer from rigid interfaces that do not provide information to applications on the prefetch effectiveness, suffer from high concurrency bottlenecks, and are inefficient in utilizing available system memory. CrossPrefetch addresses these limitations by dividing responsibilities between the OS and runtime, minimizing overhead, and achieving low cache misses, lock contentions, and higher I/O performance.CrossPrefetch tackles the limitations of rigid OS prefetching interfaces by maintaining and exporting cache state and prefetch effectiveness to user-level runtimes. It also addresses scalability and concurrency bottlenecks by distinguishing between regular I/O and prefetch operations paths and introduces fine-grained prefetch indexing for shared files. Finally, CrossPrefetch designs low-interference access pattern prediction combined with support for adaptive and aggressive techniques to exploit memory capacity and storage bandwidth. Our evaluation of CrossPrefetch, encompassing microbenchmarks, macrobenchmarks, and real-world workloads, illustrates performance gains of up to 1.22x-3.7x in I/O throughput. We also evaluate CrossPrefetch across different file systems and local and remote storage configurations.
AB - We introduce CrossPrefetch, a novel cross-layered I/O prefetching mechanism that operates across the OS and a user-level runtime to achieve optimal performance. Existing OS prefetching mechanisms suffer from rigid interfaces that do not provide information to applications on the prefetch effectiveness, suffer from high concurrency bottlenecks, and are inefficient in utilizing available system memory. CrossPrefetch addresses these limitations by dividing responsibilities between the OS and runtime, minimizing overhead, and achieving low cache misses, lock contentions, and higher I/O performance.CrossPrefetch tackles the limitations of rigid OS prefetching interfaces by maintaining and exporting cache state and prefetch effectiveness to user-level runtimes. It also addresses scalability and concurrency bottlenecks by distinguishing between regular I/O and prefetch operations paths and introduces fine-grained prefetch indexing for shared files. Finally, CrossPrefetch designs low-interference access pattern prediction combined with support for adaptive and aggressive techniques to exploit memory capacity and storage bandwidth. Our evaluation of CrossPrefetch, encompassing microbenchmarks, macrobenchmarks, and real-world workloads, illustrates performance gains of up to 1.22x-3.7x in I/O throughput. We also evaluate CrossPrefetch across different file systems and local and remote storage configurations.
UR - http://www.scopus.com/inward/record.url?scp=85191510193&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85191510193&partnerID=8YFLogxK
U2 - 10.1145/3617232.3624872
DO - 10.1145/3617232.3624872
M3 - Conference contribution
AN - SCOPUS:85191510193
T3 - International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
SP - 102
EP - 116
BT - Spring Cycle
PB - Association for Computing Machinery
Y2 - 27 April 2024 through 1 May 2024
ER -