TY - JOUR
T1 - Center-based clustering under perturbation stability
AU - Awasthi, Pranjal
AU - Blum, Avrim
AU - Sheffet, Or
N1 - Funding Information:
This work was supported in part by the National Science Foundation under grant CCF-0830540, as well as by CyLab at Carnegie Mellon under grants DAAD19-02-1-0389 and W911NF-09-1-0273 from the Army Research Office. Corresponding author. E-mail addresses: pawasthi@cs.cmu.edu (P. Awasthi), avrim@cs.cmu.edu (A. Blum), osheffet@cs.cmu.edu (O. Sheffet).
PY - 2012/1/15
Y1 - 2012/1/15
N2 - Clustering under most popular objective functions is NP-hard, even to approximate well, and so unlikely to be efficiently solvable in the worst case. Recently, Bilu and Linial (2010) [11] suggested an approach aimed at bypassing this computational barrier by using properties of instances one might hope to hold in practice. In particular, they argue that instances in practice should be stable to small perturbations in the metric space and give an efficient algorithm for clustering instances of the Max-Cut problem that are stable to perturbations of size O(n^{1/2}). In addition, they conjecture that instances stable to as little as O(1) perturbations should be solvable in polynomial time. In this paper we prove that this conjecture is true for any center-based clustering objective (such as k-median, k-means, and k-center). Specifically, we show we can efficiently find the optimal clustering assuming only stability to factor-3 perturbations of the underlying metric in spaces without Steiner points, and stability to factor 2+√3 perturbations for general metrics. In particular, we show for such instances that the popular Single-Linkage algorithm combined with dynamic programming will find the optimal clustering. We also present NP-hardness results under a weaker but related condition.
AB - Clustering under most popular objective functions is NP-hard, even to approximate well, and so unlikely to be efficiently solvable in the worst case. Recently, Bilu and Linial (2010) [11] suggested an approach aimed at bypassing this computational barrier by using properties of instances one might hope to hold in practice. In particular, they argue that instances in practice should be stable to small perturbations in the metric space and give an efficient algorithm for clustering instances of the Max-Cut problem that are stable to perturbations of size O(n^{1/2}). In addition, they conjecture that instances stable to as little as O(1) perturbations should be solvable in polynomial time. In this paper we prove that this conjecture is true for any center-based clustering objective (such as k-median, k-means, and k-center). Specifically, we show we can efficiently find the optimal clustering assuming only stability to factor-3 perturbations of the underlying metric in spaces without Steiner points, and stability to factor 2+√3 perturbations for general metrics. In particular, we show for such instances that the popular Single-Linkage algorithm combined with dynamic programming will find the optimal clustering. We also present NP-hardness results under a weaker but related condition.
KW - Analysis of algorithms
KW - Clustering
KW - Stability conditions
KW - k-Means
KW - k-Median
UR - http://www.scopus.com/inward/record.url?scp=80054691918&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80054691918&partnerID=8YFLogxK
U2 - 10.1016/j.ipl.2011.10.006
DO - 10.1016/j.ipl.2011.10.006
M3 - Article
AN - SCOPUS:80054691918
SN - 0020-0190
VL - 112
SP - 49
EP - 54
JO - Information Processing Letters
JF - Information Processing Letters
IS - 1-2
ER -