TY - GEN
T1 - Read as needed
T2 - 18th USENIX Conference on File and Storage Technologies, FAST 2020
AU - He, Jun
AU - Wu, Kan
AU - Kannan, Sudarsun
AU - Arpaci-Dusseau, Andrea C.
AU - Arpaci-Dusseau, Remzi H.
N1 - Funding Information:
We thank Suparna Bhattacharya (our shepherd), the anonymous reviewers and the members of ADSL for their valuable input. This material was supported by funding from NSF CNS-1838733, CNS-1763810 and Microsoft Gray Systems Laboratory. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and may not reflect the views of NSF or any other institutions.
Publisher Copyright:
Copyright © Proc. of the 18th USENIX Conference on File and Storage Tech., FAST 2020. All rights reserved.
PY - 2020
Y1 - 2020
AB - We describe WiSER, a clean-slate search engine designed to exploit high-performance SSDs with the philosophy "read as needed". WiSER utilizes many techniques to deliver high throughput and low latency with a relatively small amount of main memory; the techniques include an optimized data layout, a novel two-way cost-aware Bloom filter, adaptive prefetching, and space-time trade-offs. In a system with memory that is significantly smaller than the working set, these techniques increase storage space usage (up to 50%), but reduce read amplification by up to 3x, increase query throughput by up to 2.7x, and reduce latency by 16x when compared to the state-of-the-art Elasticsearch. We believe that the philosophy of "read as needed" can be applied to more applications as the read performance of storage devices keeps improving.
UR - http://www.scopus.com/inward/record.url?scp=85091829645&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85091829645&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85091829645
T3 - Proceedings of the 18th USENIX Conference on File and Storage Technologies, FAST 2020
SP - 59
EP - 73
BT - Proceedings of the 18th USENIX Conference on File and Storage Technologies, FAST 2020
PB - USENIX Association
Y2 - 25 February 2020 through 27 February 2020
ER -