### Abstract

Our research is motivated by the problem of finding a joint frequency distribution with given marginals which is 'as far as possible' from the independent distribution with the same marginals. If the distance between these two distributions is measured by Pearson's chi-square index, the problem can be formulated as maximizing a quadratic convex separable function subject to transportation constraints. We present three heuristics for this problem: two of them are greedy heuristics; the third one amounts to solve a linear relaxation of the problem, and yields also an upper bound of the optimum; thus one can estimate the relative error E of any given heuristic. The lower bounds given by the three heuristics may be improved via Frank-Wolfe's algorithm. Numerical experiments on 600 randomly generated test problems with up to 50 rows and 100 columns show that the above heuristics provide sharp bounds on the optimum (often one has E < 0.01). Even more interestingly, these bounds become sharper and sharper as the problem size increases.

Original language | English (US) |
---|---|

Pages (from-to) | 19-34 |

Number of pages | 16 |

Journal | Computational Statistics and Data Analysis |

Volume | 16 |

Issue number | 1 |

DOIs | |

State | Published - Jan 1 1993 |

### Fingerprint

### All Science Journal Classification (ASJC) codes

- Computational Theory and Mathematics
- Statistics, Probability and Uncertainty
- Electrical and Electronic Engineering
- Computational Mathematics
- Numerical Analysis
- Statistics and Probability

### Cite this

*Computational Statistics and Data Analysis*,

*16*(1), 19-34. https://doi.org/10.1016/0167-9473(93)90242-L

}

*Computational Statistics and Data Analysis*, vol. 16, no. 1, pp. 19-34. https://doi.org/10.1016/0167-9473(93)90242-L

**Sharp bounds for the maximum of the chi-square index in a class of contingency tables with given marginals.** / Kalantari, Bahman; Lari, I.; Rizzi, A.; Simeone, B.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Sharp bounds for the maximum of the chi-square index in a class of contingency tables with given marginals

AU - Kalantari, Bahman

AU - Lari, I.

AU - Rizzi, A.

AU - Simeone, B.

PY - 1993/1/1

Y1 - 1993/1/1

N2 - Our research is motivated by the problem of finding a joint frequency distribution with given marginals which is 'as far as possible' from the independent distribution with the same marginals. If the distance between these two distributions is measured by Pearson's chi-square index, the problem can be formulated as maximizing a quadratic convex separable function subject to transportation constraints. We present three heuristics for this problem: two of them are greedy heuristics; the third one amounts to solve a linear relaxation of the problem, and yields also an upper bound of the optimum; thus one can estimate the relative error E of any given heuristic. The lower bounds given by the three heuristics may be improved via Frank-Wolfe's algorithm. Numerical experiments on 600 randomly generated test problems with up to 50 rows and 100 columns show that the above heuristics provide sharp bounds on the optimum (often one has E < 0.01). Even more interestingly, these bounds become sharper and sharper as the problem size increases.

AB - Our research is motivated by the problem of finding a joint frequency distribution with given marginals which is 'as far as possible' from the independent distribution with the same marginals. If the distance between these two distributions is measured by Pearson's chi-square index, the problem can be formulated as maximizing a quadratic convex separable function subject to transportation constraints. We present three heuristics for this problem: two of them are greedy heuristics; the third one amounts to solve a linear relaxation of the problem, and yields also an upper bound of the optimum; thus one can estimate the relative error E of any given heuristic. The lower bounds given by the three heuristics may be improved via Frank-Wolfe's algorithm. Numerical experiments on 600 randomly generated test problems with up to 50 rows and 100 columns show that the above heuristics provide sharp bounds on the optimum (often one has E < 0.01). Even more interestingly, these bounds become sharper and sharper as the problem size increases.

UR - http://www.scopus.com/inward/record.url?scp=38249003741&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=38249003741&partnerID=8YFLogxK

U2 - 10.1016/0167-9473(93)90242-L

DO - 10.1016/0167-9473(93)90242-L

M3 - Article

VL - 16

SP - 19

EP - 34

JO - Computational Statistics and Data Analysis

JF - Computational Statistics and Data Analysis

SN - 0167-9473

IS - 1

ER -