TY - GEN
T1 - Influencers and the giant component
T2 - 2021 SIAM International Conference on Data Mining, SDM 2021
AU - Rezaei, Aria
AU - Gao, Jie
AU - Sarwate, Anand D.
N1 - Publisher Copyright:
© 2021 by SIAM.
PY - 2021
Y1 - 2021
N2 - The presence of correlation is known to make privacy protection more difficult. We investigate the privacy of socially contagious attributes on a network of individuals, where each individual possessing that attribute may influence a number of others into adopting it. We show that for contagions following the Independent Cascade model there exists a giant connected component of infected nodes, containing a constant fraction of all the nodes who all receive the contagion from the same set of sources. We further show that it is extremely hard to hide the existence of this giant connected component if we want to obtain an estimate of the activated users at an acceptable level. Moreover, an adversary possessing this knowledge can predict the real status (“active” or “inactive”) with decent probability for many of the individuals regardless of the privacy (perturbation) mechanism used. As a case study, we show that the Wasserstein mechanism, a state-of-the-art privacy mechanism designed specifically for correlated data, introduces a noise with magnitude of order Ω(n) in the count estimation in our setting. We provide theoretical guarantees for two classes of random networks: Erdös-Rényi graphs and Chung-Lu power-law graphs under the Independent Cascade model. Experiments demonstrate that a giant connected component of infected nodes can indeed appear in real-world networks and a simple inference attack can reveal the status of a good fraction of nodes.
AB - The presence of correlation is known to make privacy protection more difficult. We investigate the privacy of socially contagious attributes on a network of individuals, where each individual possessing that attribute may influence a number of others into adopting it. We show that for contagions following the Independent Cascade model there exists a giant connected component of infected nodes, containing a constant fraction of all the nodes who all receive the contagion from the same set of sources. We further show that it is extremely hard to hide the existence of this giant connected component if we want to obtain an estimate of the activated users at an acceptable level. Moreover, an adversary possessing this knowledge can predict the real status (“active” or “inactive”) with decent probability for many of the individuals regardless of the privacy (perturbation) mechanism used. As a case study, we show that the Wasserstein mechanism, a state-of-the-art privacy mechanism designed specifically for correlated data, introduces a noise with magnitude of order Ω(n) in the count estimation in our setting. We provide theoretical guarantees for two classes of random networks: Erdös-Rényi graphs and Chung-Lu power-law graphs under the Independent Cascade model. Experiments demonstrate that a giant connected component of infected nodes can indeed appear in real-world networks and a simple inference attack can reveal the status of a good fraction of nodes.
UR - http://www.scopus.com/inward/record.url?scp=85120945298&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85120945298&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85120945298
T3 - SIAM International Conference on Data Mining, SDM 2021
SP - 217
EP - 225
BT - SIAM International Conference on Data Mining, SDM 2021
PB - Siam Society
Y2 - 29 April 2021 through 1 May 2021
ER -