A sequential split-and-conquer approach for the analysis of big dependent data in computer experiments

Chengrui Li, Ying Hung, Minge Xie

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Massive correlated data with many inputs are often generated from computer experiments to study complex systems. The Gaussian process (GP) model is a widely used tool for the analysis of computer experiments. Although GPs provide a simple and effective approximation to computer experiments, two critical issues remain unresolved. One is the computational issue in GP estimation and prediction, where intensive manipulations of a large correlation matrix are required. For a large sample size and with a large number of variables, this task is often unstable or infeasible. The other issue is how to improve the naive plug-in predictive distribution, which is known to underestimate the uncertainty. In this article, we introduce a unified framework that can tackle both issues simultaneously. It consists of a sequential split-and-conquer procedure, an information combining technique using confidence distributions (CD), and a frequentist predictive distribution based on the combined CD. It is shown that the proposed method maintains the same asymptotic efficiency as the conventional likelihood inference under mild conditions, but dramatically reduces the computation in both estimation and prediction. The predictive distribution contains comprehensive information for inference and provides a better quantification of predictive uncertainty as compared with the plug-in approach. Simulations are conducted to compare the estimation and prediction accuracy with some existing methods, and the computational advantage of the proposed method is also illustrated. The proposed method is demonstrated by a real data example based on tens of thousands of computer experiments generated from a computational fluid dynamics simulator.
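
As a rough illustration of the "split, estimate, combine" idea mentioned in the abstract, the sketch below combines per-block estimates by inverse-variance weighting, which is the standard way to merge normal confidence distributions for a common scalar parameter. It is not the article's sequential procedure: it assumes the blocks are independent and approximately normal, whereas the article addresses dependence between blocks. The function names `combine_normal_cds` and `split_and_combine_mean` are hypothetical and chosen only for this example.

```python
import numpy as np


def combine_normal_cds(estimates, std_errors):
    """Combine normal confidence distributions N(est_i, se_i^2).

    Assumes the block estimates are independent (an assumption of this
    sketch, not of the article). Returns the combined point estimate and
    its standard error under inverse-variance weighting.
    """
    estimates = np.asarray(estimates, dtype=float)
    weights = 1.0 / np.asarray(std_errors, dtype=float) ** 2
    combined = np.sum(weights * estimates) / np.sum(weights)
    combined_se = np.sqrt(1.0 / np.sum(weights))
    return combined, combined_se


def split_and_combine_mean(y, n_blocks):
    """Toy divide-and-combine estimate of a mean.

    Splits y into n_blocks consecutive blocks, estimates the mean and its
    standard error on each block, then combines the block-level results.
    """
    blocks = np.array_split(np.asarray(y, dtype=float), n_blocks)
    est = [b.mean() for b in blocks]
    se = [b.std(ddof=1) / np.sqrt(len(b)) for b in blocks]
    return combine_normal_cds(est, se)
```

Each block only requires computation on its own data, which is where the computational saving of a divide-and-combine scheme comes from; handling correlation across blocks, as the article does, needs more than this independent-block weighting.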

Original language: English (US)
Pages (from-to): 712-730
Number of pages: 19
Journal: Canadian Journal of Statistics
Volume: 48
Issue number: 4
State: Published - Dec 2020

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Keywords

  • Computer experiment
  • confidence distribution
  • divide-conquer-combine method
  • predictive distribution
