Abstract
To facilitate load balancing, distributed systems store data redundantly. We evaluate the load balancing performance of storage schemes in which each object is stored at d different nodes, and each node stores the same number of objects. In our model, the load offered for the objects is sampled uniformly at random from all the load vectors with a fixed cumulative value. We find that the load balance in a system of n nodes improves multiplicatively with d as long as d = o(log n), and improves exponentially once d = Θ(log n). We show that the load balance improves in the same way with d when the service choices are created with XORs of r objects rather than object replicas. In such redundancy schemes, storage overhead is reduced multiplicatively by r. However, recovery of an object requires downloading content from r nodes. At the same time, the load balance increases additively by r. We express the system's load balance in terms of the maximal spacing, or the maximum of d consecutive spacings, between the order statistics of uniform random variables. Using this connection and the limit results on the maximal d-spacings, we derive our main results.
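The quantity named in the abstract, the maximum of d consecutive spacings between uniform order statistics, can be illustrated with a minimal Python sketch. It is not the paper's analysis. The function names `uniform_spacings` and `max_d_spacing` are hypothetical, and treating the spacings as wrapping around (so that the last spacing is adjacent to the first) is an assumption made here for illustration. The sketch relies on the standard fact that the n spacings induced by n - 1 uniform points on [0, 1] are uniformly distributed over the vectors of nonnegative entries summing to 1, which matches a load vector with a fixed cumulative value.

```python
import random


def uniform_spacings(n, rng):
    """Return the n spacings induced by n - 1 uniform points on [0, 1].

    The spacings are nonnegative and sum to 1.
    """
    points = sorted(rng.random() for _ in range(n - 1))
    edges = [0.0] + points + [1.0]
    return [b - a for a, b in zip(edges, edges[1:])]


def max_d_spacing(spacings, d):
    """Return the largest sum of d consecutive spacings.

    Windows wrap around the end of the list; this wrap-around is an
    assumption of the sketch, not something stated in the abstract.
    """
    n = len(spacings)
    window = sum(spacings[:d])  # window starting at index 0
    best = window
    for start in range(1, n):
        # Slide the window by one: add the entering spacing, drop the leaving one.
        window += spacings[(start + d - 1) % n] - spacings[start - 1]
        best = max(best, window)
    return best


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs agree
    spacings = uniform_spacings(1000, rng)
    for d in (1, 2, 4, 8):
        # Dividing by d gives the largest average over d consecutive spacings.
        print(d, max_d_spacing(spacings, d) / d)
```

Dividing the maximal d-spacing by d gives the largest average over d consecutive spacings, which can never exceed the single largest spacing. That makes it a convenient quantity to watch as d grows in a simulation like this one.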
Original language | English (US) |
---|---|
Article number | 9335615 |
Pages (from-to) | 3623-3644 |
Number of pages | 22 |
Journal | IEEE Transactions on Information Theory |
Volume | 67 |
Issue number | 6 |
State | Published - Jun 2021 |
All Science Journal Classification (ASJC) codes
- Information Systems
- Computer Science Applications
- Library and Information Sciences
Keywords
- Load balancing
- Distributed storage
- Distributed systems
- Redundant storage