TY - GEN

T1 - Accurate and nearly optimal sublinear approximations to ulam distance

AU - Naumovitz, Timothy

AU - Saks, Michael

AU - Seshadhri, C.

N1 - Publisher Copyright:
Copyright © by SIAM.

PY - 2017

Y1 - 2017

N2 - The Ulam distance between two permutations of length n is the minimum number of insertions and deletions needed to transform one sequence into the other. Equivalently, the Ulam distance d is n minus the length of the longest common subsequence (LCS) between the permutations. Our main result is an algorithm, that for any fixed " > 0, provides a (1 + ∈)-multiplicative approximation for d in ∼O∈(n=d + p n) time, which has been shown to be optimal up to polylogarithmic factors. This is the first sublinear time algorithm (provided that d = (log n)!(1)) that obtains arbitrarily good multiplicative approximations to the Ulam distance. The previous best bound is an O(1)-approximation (with a large constant) by Andoni and Nguyen (2010) with the same running time bound (ignoring polylogarithmic factors). The improvement in the approximation factor from O(1) to (1+∈) allows for significantly more powerful sublinear algorithms. For example, for any fixed δ > 0, we can get additive δn approximations for the LCS between permutations in ∼O∈( p √n) time. Previous sublinear algorithms require δ to be at least 11=C, where C is the approximation factor, which is close to 1 when C is large. Our algorithm is obtained by abstracting the basic algorithmic framework of Andoni and Nguyen, and combining it with the sublinear approximations for the longest increasing subsequence by Saks and Seshadhri (2010).

AB - The Ulam distance between two permutations of length n is the minimum number of insertions and deletions needed to transform one sequence into the other. Equivalently, the Ulam distance d is n minus the length of the longest common subsequence (LCS) between the permutations. Our main result is an algorithm, that for any fixed " > 0, provides a (1 + ∈)-multiplicative approximation for d in ∼O∈(n=d + p n) time, which has been shown to be optimal up to polylogarithmic factors. This is the first sublinear time algorithm (provided that d = (log n)!(1)) that obtains arbitrarily good multiplicative approximations to the Ulam distance. The previous best bound is an O(1)-approximation (with a large constant) by Andoni and Nguyen (2010) with the same running time bound (ignoring polylogarithmic factors). The improvement in the approximation factor from O(1) to (1+∈) allows for significantly more powerful sublinear algorithms. For example, for any fixed δ > 0, we can get additive δn approximations for the LCS between permutations in ∼O∈( p √n) time. Previous sublinear algorithms require δ to be at least 11=C, where C is the approximation factor, which is close to 1 when C is large. Our algorithm is obtained by abstracting the basic algorithmic framework of Andoni and Nguyen, and combining it with the sublinear approximations for the longest increasing subsequence by Saks and Seshadhri (2010).

UR - http://www.scopus.com/inward/record.url?scp=85016170911&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85016170911&partnerID=8YFLogxK

U2 - 10.1137/1.9781611974782.131

DO - 10.1137/1.9781611974782.131

M3 - Conference contribution

AN - SCOPUS:85016170911

T3 - Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms

SP - 2012

EP - 2031

BT - 28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017

A2 - Klein, Philip N.

PB - Association for Computing Machinery

T2 - 28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017

Y2 - 16 January 2017 through 19 January 2017

ER -