How to copy files

Yang Zhan, Alex Conway, Yizheng Jiao, Nirjhar Mukherjee, Ian Groombridge, Michael A. Bender, Martín Farach-Colton, William Jannen, Rob Johnson, Donald E. Porter, Jun Yuan

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Scopus citations

Abstract

Making logical copies, or clones, of files and directories is critical to many real-world applications and workflows, including backups, virtual machines, and containers. An ideal clone implementation meets the following performance goals: (1) creating the clone has low latency; (2) reads are fast in all versions (i.e., spatial locality is always maintained, even after modifications); (3) writes are fast in all versions; (4) the overall system is space efficient. Implementing a clone operation that realizes all four properties, which we call a nimble clone, is a long-standing open problem. This paper describes nimble clones in BetrFS, an open-source, full-path-indexed, and write-optimized file system. The key observation behind our work is that standard copy-on-write heuristics can be too coarse to be space efficient, or too fine-grained to preserve locality. On the other hand, a write-optimized key-value store, as used in BetrFS or an LSM-tree, can decouple the logical application of updates from the granularity at which data is physically copied. In our write-optimized clone implementation, data sharing among clones is only broken when a clone has changed enough to warrant making a copy, a policy we call copy-on-abundant-write. We demonstrate that the algorithmic work needed to batch and amortize the cost of BetrFS clone operations does not erode the performance advantages of baseline BetrFS; BetrFS performance even improves in a few cases. BetrFS cloning is efficient; for example, when using the clone operation for container creation, BetrFS outperforms a simple recursive copy by up to two orders-of-magnitude and outperforms file systems that have specialized LXC backends by 3-4×.
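
The copy-on-abundant-write policy described in the abstract can be illustrated with a toy model. The sketch below is not the BetrFS implementation; it only shows the general idea of buffering a clone's updates against shared data and copying once enough of them have accumulated. The class names (SharedExtent, Clone), the list-of-blocks representation, and the threshold parameter are all assumptions made for illustration.

```python
class SharedExtent:
    """Hypothetical stand-in for data shared by several clones."""

    def __init__(self, blocks):
        self.blocks = list(blocks)


class Clone:
    """A logical copy that buffers its own writes on top of shared data."""

    def __init__(self, shared, threshold):
        self.shared = shared        # data still shared with other clones
        self.private = None         # private copy, created only after enough writes
        self.pending = {}           # buffered updates: block index -> new value
        self.threshold = threshold  # assumed cutoff for "abundant" writes

    def write(self, index, value):
        if self.private is not None:
            # Sharing is already broken: write in place.
            self.private[index] = value
            return
        # Apply the update logically by buffering it; nothing is copied yet.
        self.pending[index] = value
        if len(self.pending) >= self.threshold:
            # Enough has changed: copy once and apply the whole batch.
            self.private = list(self.shared.blocks)
            for i, v in self.pending.items():
                self.private[i] = v
            self.pending.clear()
            self.shared = None

    def read(self, index):
        if self.private is not None:
            return self.private[index]
        # Buffered updates take precedence over the shared data.
        return self.pending.get(index, self.shared.blocks[index])

    def is_sharing(self):
        return self.private is None


base = SharedExtent(["a", "b", "c", "d"])
first = Clone(base, threshold=2)
second = Clone(base, threshold=2)

first.write(0, "x")   # buffered; `first` still shares data with `second`
first.write(1, "y")   # threshold reached; `first` gets its own copy
```

In this toy model a single small write costs no copying at all, while a clone that keeps changing eventually receives a contiguous private copy. That is the trade-off the abstract attributes to decoupling logical updates from physical copy granularity.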

Original language: English (US)
Title of host publication: Proceedings of the 18th USENIX Conference on File and Storage Technologies, FAST 2020
Publisher: USENIX Association
Pages: 75-89
Number of pages: 15
ISBN (Electronic): 9781939133120
State: Published - 2020
Event: 18th USENIX Conference on File and Storage Technologies, FAST 2020 - Santa Clara, United States
Duration: Feb 25 2020 to Feb 27 2020

Publication series

Name: Proceedings of the 18th USENIX Conference on File and Storage Technologies, FAST 2020

Conference

Conference: 18th USENIX Conference on File and Storage Technologies, FAST 2020
Country/Territory: United States
City: Santa Clara
Period: 2/25/20 to 2/27/20

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Software
  • Computer Networks and Communications
