A simulation-based tree method for building linear models with interactions

Jin Wang, Javier Cabrera, Kwok Leung Tsui

Research output: Contribution to journal › Article › peer-review

Abstract

Linear models are the most common predictive models for a continuous, discrete, or categorical response and often include interaction terms. With more than a few predictors, however, interactions tend to be neglected because they add too many terms to the model. In this paper, we propose a simulation-based tree method to detect the interactions that contribute to prediction. In the method, we first bootstrap the observations and randomly choose a number of variables to build trees. The interactions between the roots and the corresponding leaves are collected, and the number of times each interaction appears is counted. To obtain a benchmark for the number of times each interaction appears in the trees, the response values are replaced by randomly generated values and the procedure is repeated. Interactions that occur more often than the benchmark are added to the regression model. Finally, we select variables by running LASSO on the model containing the main effects and the interactions obtained. In the experiments, our method performs well, especially on data sets with many interactions.
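
The counting and benchmarking steps summarized in the abstract can be sketched in code. The following is a minimal illustration under stated assumptions, not the authors' implementation. It assumes depth-two regression trees, reads "interaction between the root and a leaf" as the pair formed by the root's split variable and the split variable of each child node, draws the substitute responses from a standard Gaussian, and takes the largest null count as the benchmark. None of these details are given in the abstract. The function names (`best_split`, `count_interactions`, `select_interactions`) and parameters are hypothetical, and the final LASSO step is omitted.

```python
import random
from collections import Counter


def best_split(X, y, rows, feats):
    """Return (feature, threshold) of the squared-error-minimising split,
    or (None, None) if no split is possible."""
    best_feat, best_thr, best_score = None, None, float("-inf")
    n = len(rows)
    total = sum(y[i] for i in rows)
    for j in feats:
        order = sorted(rows, key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += y[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between tied values
            # Minimising the within-node SSE is equivalent to maximising this.
            score = left_sum ** 2 / k + (total - left_sum) ** 2 / (n - k)
            if score > best_score:
                best_feat, best_thr, best_score = j, lo, score
    return best_feat, best_thr


def count_interactions(X, y, n_trees, n_feats, rng, min_node=10):
    """Count (root variable, child variable) pairs over bootstrapped trees."""
    n, p = len(X), len(X[0])
    counts = Counter()
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap the observations
        feats = rng.sample(range(p), n_feats)        # random subset of variables
        root, thr = best_split(X, y, rows, feats)
        if root is None:
            continue
        for go_left in (True, False):
            node = [i for i in rows if (X[i][root] <= thr) == go_left]
            if len(node) < min_node:
                continue
            child, _ = best_split(X, y, node, feats)
            if child is not None and child != root:
                counts[tuple(sorted((root, child)))] += 1
    return counts


def select_interactions(X, y, n_trees=200, n_feats=4, seed=0):
    """Keep the pairs that occur more often than the noise-response benchmark."""
    rng = random.Random(seed)
    observed = count_interactions(X, y, n_trees, n_feats, rng)
    # Assumption: substitute responses are standard Gaussian draws.
    noise = [rng.gauss(0.0, 1.0) for _ in y]
    null = count_interactions(X, noise, n_trees, n_feats, rng)
    # Assumption: the benchmark is the largest count seen under pure noise.
    benchmark = max(null.values(), default=0)
    kept = sorted(pair for pair, c in observed.items() if c > benchmark)
    return kept, observed, benchmark
```

Here `X` is a list of rows of numeric predictors and `y` a list of responses, and `n_feats` must not exceed the number of predictors. The returned pairs would then be added as product terms alongside the main effects before the LASSO selection step.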

Original language: English (US)
Pages (from-to): 404-413
Number of pages: 10
Journal: Communications in Statistics - Theory and Methods
Issue number: 2
State: Published - 2022

All Science Journal Classification (ASJC) codes

  • Statistics and Probability

Keywords

  • Simulation
  • interaction
  • prediction
  • regression
  • tree

