TY - JOUR
T1 - Performance bounds for column-block partitioning of parallel Gaussian elimination and Gauss-Jordan methods
AU - Gerasoulis, Apostolos
AU - Yang, Tao
N1 - Funding Information:
* The work presented here was in part supported by ARPA contract DABT-63-93-C-0064, by the Office of Naval research under grant NO001493101 14, by NSF grant CCR-9409695, a startup fund and faculty fellowship from University of California at Santa Barbara. The content of the information herein does not necessarily reflect the position of the Government and official endorsement should not be inferred. * Corresponding author. E-mail: tyang@cs.ucsb.edu. ’ E-mail: gerasoul@cs.rutgers.edu.
PY - 1994/12
Y1 - 1994/12
N2 - Column-block partitioning is commonly used in the parallelization of Gaussian Elimination (GE) and Gauss-Jordan (GJ) algorithms. It is therefore of interest to know performance bounds of such partitioning on scalable distributed-memory parallel architectures. In this paper, we use a graph-theoretic approach in deriving asymptotic performance lower bounds of column-block partitioning for both GE and GJ. The new contribution is the incorporation of communication cost in the analysis which results in the derivation of sharper lower bounds. We use our scheduling system PYRROS to experimentally compare the actual run time performance with that derived by these lower bounds on the nCUBE-2 hypercube parallel machine.
AB - Column-block partitioning is commonly used in the parallelization of Gaussian Elimination (GE) and Gauss-Jordan (GJ) algorithms. It is therefore of interest to know performance bounds of such partitioning on scalable distributed-memory parallel architectures. In this paper, we use a graph-theoretic approach in deriving asymptotic performance lower bounds of column-block partitioning for both GE and GJ. The new contribution is the incorporation of communication cost in the analysis which results in the derivation of sharper lower bounds. We use our scheduling system PYRROS to experimentally compare the actual run time performance with that derived by these lower bounds on the nCUBE-2 hypercube parallel machine.
UR - http://www.scopus.com/inward/record.url?scp=33750099398&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33750099398&partnerID=8YFLogxK
U2 - 10.1016/0168-9274(94)00051-4
DO - 10.1016/0168-9274(94)00051-4
M3 - Article
AN - SCOPUS:33750099398
SN - 0168-9274
VL - 16
SP - 283
EP - 297
JO - Applied Numerical Mathematics
JF - Applied Numerical Mathematics
IS - 1-2
ER -