### Abstract

Our research is motivated by the problem of finding a joint frequency distribution with given marginals which is 'as far as possible' from the independent distribution with the same marginals. If the distance between these two distributions is measured by Pearson's chi-square index, the problem can be formulated as maximizing a quadratic convex separable function subject to transportation constraints. We present three heuristics for this problem: two of them are greedy heuristics; the third one amounts to solve a linear relaxation of the problem, and yields also an upper bound of the optimum; thus one can estimate the relative error E of any given heuristic. The lower bounds given by the three heuristics may be improved via Frank-Wolfe's algorithm. Numerical experiments on 600 randomly generated test problems with up to 50 rows and 100 columns show that the above heuristics provide sharp bounds on the optimum (often one has E < 0.01). Even more interestingly, these bounds become sharper and sharper as the problem size increases.

Original language | English (US) |
---|---|

Pages (from-to) | 19-34 |

Number of pages | 16 |

Journal | Computational Statistics and Data Analysis |

Volume | 16 |

Issue number | 1 |

DOIs | |

State | Published - Jan 1 1993 |

### Fingerprint

### All Science Journal Classification (ASJC) codes

- Statistics and Probability
- Computational Mathematics
- Computational Theory and Mathematics
- Applied Mathematics

### Keywords

- Association indexes
- Contingency tables
- Fréchet class
- Heuristics
- Quadratic transportation

### Cite this

*Computational Statistics and Data Analysis*,

*16*(1), 19-34. https://doi.org/10.1016/0167-9473(93)90242-L

}

*Computational Statistics and Data Analysis*, vol. 16, no. 1, pp. 19-34. https://doi.org/10.1016/0167-9473(93)90242-L

**Sharp bounds for the maximum of the chi-square index in a class of contingency tables with given marginals.** / Kalantari, Bahman; Lari, I.; Rizzi, A.; Simeone, B.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Sharp bounds for the maximum of the chi-square index in a class of contingency tables with given marginals

AU - Kalantari, Bahman

AU - Lari, I.

AU - Rizzi, A.

AU - Simeone, B.

PY - 1993/1/1

Y1 - 1993/1/1

N2 - Our research is motivated by the problem of finding a joint frequency distribution with given marginals which is 'as far as possible' from the independent distribution with the same marginals. If the distance between these two distributions is measured by Pearson's chi-square index, the problem can be formulated as maximizing a quadratic convex separable function subject to transportation constraints. We present three heuristics for this problem: two of them are greedy heuristics; the third one amounts to solve a linear relaxation of the problem, and yields also an upper bound of the optimum; thus one can estimate the relative error E of any given heuristic. The lower bounds given by the three heuristics may be improved via Frank-Wolfe's algorithm. Numerical experiments on 600 randomly generated test problems with up to 50 rows and 100 columns show that the above heuristics provide sharp bounds on the optimum (often one has E < 0.01). Even more interestingly, these bounds become sharper and sharper as the problem size increases.

AB - Our research is motivated by the problem of finding a joint frequency distribution with given marginals which is 'as far as possible' from the independent distribution with the same marginals. If the distance between these two distributions is measured by Pearson's chi-square index, the problem can be formulated as maximizing a quadratic convex separable function subject to transportation constraints. We present three heuristics for this problem: two of them are greedy heuristics; the third one amounts to solve a linear relaxation of the problem, and yields also an upper bound of the optimum; thus one can estimate the relative error E of any given heuristic. The lower bounds given by the three heuristics may be improved via Frank-Wolfe's algorithm. Numerical experiments on 600 randomly generated test problems with up to 50 rows and 100 columns show that the above heuristics provide sharp bounds on the optimum (often one has E < 0.01). Even more interestingly, these bounds become sharper and sharper as the problem size increases.

KW - Association indexes

KW - Contingency tables

KW - Fréchet class

KW - Heuristics

KW - Quadratic transportation

UR - http://www.scopus.com/inward/record.url?scp=38249003741&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=38249003741&partnerID=8YFLogxK

U2 - 10.1016/0167-9473(93)90242-L

DO - 10.1016/0167-9473(93)90242-L

M3 - Article

AN - SCOPUS:38249003741

VL - 16

SP - 19

EP - 34

JO - Computational Statistics and Data Analysis

JF - Computational Statistics and Data Analysis

SN - 0167-9473

IS - 1

ER -