TY - JOUR

T1 - Predictive learning on hidden tree-structured Ising models

AU - Nikolakakis, Konstantinos E.

AU - Kalogerias, Dionysios S.

AU - Sarwate, Anand D.

N1 - Funding Information:
This work was supported in part by DARPA and SSC Pacific under contract N66001-15-C-4070, the United States National Science Foundation under award CCF-1453432, and the United States National Institutes of Health under award 1R01DA040487.
Publisher Copyright:
© 2021 Konstantinos E. Nikolakakis, Dionysios S. Kalogerias and Anand D. Sarwate.

PY - 2021

Y1 - 2021

N2 - We provide high-probability sample complexity guarantees for exact structure recovery and accurate predictive learning using noise-corrupted samples from an acyclic (tree-shaped) graphical model. The hidden variables follow a tree-structured Ising model distribution, whereas the observable variables are generated by a binary symmetric channel taking the hidden variables as its input (flipping each bit independently with some constant probability q ∈ [0, 1/2)). In the absence of noise, predictive learning on Ising models was recently studied by Bresler and Karzand (2020); this paper quantifies how noise in the hidden model impacts the tasks of structure recovery and marginal distribution estimation by proving upper and lower bounds on the sample complexity. Our results generalize state-of-the-art bounds reported in prior work, and they exactly recover the noiseless case (q = 0). In fact, for any tree with p vertices and probability of incorrect recovery δ > 0, the sufficient number of samples remains logarithmic as in the noiseless case, i.e., O(log(p/δ)), while the dependence on q is O(1/(1 - 2q)^4), for both aforementioned tasks. We also present a new equivalent of Isserlis’ Theorem for sign-valued tree-structured distributions, yielding a new low-complexity algorithm for higher-order moment estimation.

AB - We provide high-probability sample complexity guarantees for exact structure recovery and accurate predictive learning using noise-corrupted samples from an acyclic (tree-shaped) graphical model. The hidden variables follow a tree-structured Ising model distribution, whereas the observable variables are generated by a binary symmetric channel taking the hidden variables as its input (flipping each bit independently with some constant probability q ∈ [0, 1/2)). In the absence of noise, predictive learning on Ising models was recently studied by Bresler and Karzand (2020); this paper quantifies how noise in the hidden model impacts the tasks of structure recovery and marginal distribution estimation by proving upper and lower bounds on the sample complexity. Our results generalize state-of-the-art bounds reported in prior work, and they exactly recover the noiseless case (q = 0). In fact, for any tree with p vertices and probability of incorrect recovery δ > 0, the sufficient number of samples remains logarithmic as in the noiseless case, i.e., O(log(p/δ)), while the dependence on q is O(1/(1 - 2q)^4), for both aforementioned tasks. We also present a new equivalent of Isserlis’ Theorem for sign-valued tree-structured distributions, yielding a new low-complexity algorithm for higher-order moment estimation.

KW - Chow-Liu Algorithm

KW - Distribution Estimation

KW - Hidden Markov Random Fields

KW - Ising Model

KW - Noisy Data

KW - Predictive Learning

KW - Structure Learning

UR - http://www.scopus.com/inward/record.url?scp=85105890907&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85105890907&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:85105890907

VL - 22

JO - Journal of Machine Learning Research

JF - Journal of Machine Learning Research

SN - 1532-4435

ER -