Sharp bounds for the maximum of the chi-square index in a class of contingency tables with given marginals

Bahman Kalantari, I. Lari, A. Rizzi, B. Simeone

Research output: Contribution to journalArticle

6 Citations (Scopus)

Abstract

Our research is motivated by the problem of finding a joint frequency distribution with given marginals which is 'as far as possible' from the independent distribution with the same marginals. If the distance between these two distributions is measured by Pearson's chi-square index, the problem can be formulated as maximizing a quadratic convex separable function subject to transportation constraints. We present three heuristics for this problem: two of them are greedy heuristics; the third one amounts to solve a linear relaxation of the problem, and yields also an upper bound of the optimum; thus one can estimate the relative error E of any given heuristic. The lower bounds given by the three heuristics may be improved via Frank-Wolfe's algorithm. Numerical experiments on 600 randomly generated test problems with up to 50 rows and 100 columns show that the above heuristics provide sharp bounds on the optimum (often one has E < 0.01). Even more interestingly, these bounds become sharper and sharper as the problem size increases.

Original languageEnglish (US)
Pages (from-to)19-34
Number of pages16
JournalComputational Statistics and Data Analysis
Volume16
Issue number1
DOIs
StatePublished - Jan 1 1993

Fingerprint

Chi-square
Sharp Bound
Contingency Table
Heuristics
Experiments
Linear Relaxation
Greedy Heuristics
Relative Error
Test Problems
Numerical Experiment
Class
Contingency table
Lower bound
Upper bound
Estimate

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Statistics, Probability and Uncertainty
  • Electrical and Electronic Engineering
  • Computational Mathematics
  • Numerical Analysis
  • Statistics and Probability

Cite this

@article{c2d8d8bac83c4a9abb03a05e11eff789,
title = "Sharp bounds for the maximum of the chi-square index in a class of contingency tables with given marginals",
abstract = "Our research is motivated by the problem of finding a joint frequency distribution with given marginals which is 'as far as possible' from the independent distribution with the same marginals. If the distance between these two distributions is measured by Pearson's chi-square index, the problem can be formulated as maximizing a quadratic convex separable function subject to transportation constraints. We present three heuristics for this problem: two of them are greedy heuristics; the third one amounts to solve a linear relaxation of the problem, and yields also an upper bound of the optimum; thus one can estimate the relative error E of any given heuristic. The lower bounds given by the three heuristics may be improved via Frank-Wolfe's algorithm. Numerical experiments on 600 randomly generated test problems with up to 50 rows and 100 columns show that the above heuristics provide sharp bounds on the optimum (often one has E < 0.01). Even more interestingly, these bounds become sharper and sharper as the problem size increases.",
author = "Bahman Kalantari and I. Lari and A. Rizzi and B. Simeone",
year = "1993",
month = "1",
day = "1",
doi = "10.1016/0167-9473(93)90242-L",
language = "English (US)",
volume = "16",
pages = "19--34",
journal = "Computational Statistics and Data Analysis",
issn = "0167-9473",
publisher = "Elsevier",
number = "1",

}

Sharp bounds for the maximum of the chi-square index in a class of contingency tables with given marginals. / Kalantari, Bahman; Lari, I.; Rizzi, A.; Simeone, B.

In: Computational Statistics and Data Analysis, Vol. 16, No. 1, 01.01.1993, p. 19-34.

Research output: Contribution to journalArticle

TY - JOUR

T1 - Sharp bounds for the maximum of the chi-square index in a class of contingency tables with given marginals

AU - Kalantari, Bahman

AU - Lari, I.

AU - Rizzi, A.

AU - Simeone, B.

PY - 1993/1/1

Y1 - 1993/1/1

N2 - Our research is motivated by the problem of finding a joint frequency distribution with given marginals which is 'as far as possible' from the independent distribution with the same marginals. If the distance between these two distributions is measured by Pearson's chi-square index, the problem can be formulated as maximizing a quadratic convex separable function subject to transportation constraints. We present three heuristics for this problem: two of them are greedy heuristics; the third one amounts to solve a linear relaxation of the problem, and yields also an upper bound of the optimum; thus one can estimate the relative error E of any given heuristic. The lower bounds given by the three heuristics may be improved via Frank-Wolfe's algorithm. Numerical experiments on 600 randomly generated test problems with up to 50 rows and 100 columns show that the above heuristics provide sharp bounds on the optimum (often one has E < 0.01). Even more interestingly, these bounds become sharper and sharper as the problem size increases.

AB - Our research is motivated by the problem of finding a joint frequency distribution with given marginals which is 'as far as possible' from the independent distribution with the same marginals. If the distance between these two distributions is measured by Pearson's chi-square index, the problem can be formulated as maximizing a quadratic convex separable function subject to transportation constraints. We present three heuristics for this problem: two of them are greedy heuristics; the third one amounts to solve a linear relaxation of the problem, and yields also an upper bound of the optimum; thus one can estimate the relative error E of any given heuristic. The lower bounds given by the three heuristics may be improved via Frank-Wolfe's algorithm. Numerical experiments on 600 randomly generated test problems with up to 50 rows and 100 columns show that the above heuristics provide sharp bounds on the optimum (often one has E < 0.01). Even more interestingly, these bounds become sharper and sharper as the problem size increases.

UR - http://www.scopus.com/inward/record.url?scp=38249003741&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=38249003741&partnerID=8YFLogxK

U2 - 10.1016/0167-9473(93)90242-L

DO - 10.1016/0167-9473(93)90242-L

M3 - Article

VL - 16

SP - 19

EP - 34

JO - Computational Statistics and Data Analysis

JF - Computational Statistics and Data Analysis

SN - 0167-9473

IS - 1

ER -