Broken rail prediction with machine learning-based approach

Zhipeng Zhang, Kang Zhou, Xiang Liu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution



Broken rails are the most frequent cause of freight train derailments in the United States. According to the U.S. Federal Railroad Administration (FRA) railroad accident database, broken rails caused more than 900 Class I railroad freight-train derailments between 2000 and 2017. In 2017 alone, broken rail-caused freight train derailments cost Class I railroads $15.8 million in track and rolling stock damage. Preventing broken rails is therefore crucial for reducing derailment risk. Although data in the railroad industry are growing rapidly, little prior research has used these data to uncover the relationship between real-world factors and broken rail occurrence. This article aims to predict the occurrence of broken rails with a machine learning approach that simultaneously accounts for track files, traffic information, maintenance history, and prior defect information. A model based on the extreme gradient boosting (XGBoost) algorithm is developed using several types of variables: track characteristics (e.g., rail profile information, rail laid information), traffic-related information (e.g., gross tonnage recorded by time, number of passing cars), maintenance records (e.g., rail grinding and track ballast cleaning), and historical rail defect records. Area Under the Curve (AUC) is used as the evaluation metric for the prediction accuracy of the developed model. The preliminary result shows that the one-year AUC of the XGBoost-based prediction model is 0.83, which is higher than that of two comparison models, logistic regression and random forests. Furthermore, the feature importance analysis shows that segment length, traffic tonnage, number of car passes, rail age, and the number of detected defects in the past six months have relatively greater importance for the prediction of broken rails. The prediction model and outcomes, along with future research on the relationship between broken rails and broken rail-caused derailments, can benefit railroad maintenance planning and capital planning.
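For readers unfamiliar with the evaluation metric named in the abstract, AUC can be read as the probability that a randomly chosen segment with a broken rail receives a higher predicted risk than a randomly chosen segment without one. The following is a minimal pure-Python sketch of that pairwise definition, not the authors' implementation; the function name `roc_auc`, the labels, and the risk scores are all hypothetical and do not come from the paper's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical track segments: 1 = broken rail occurred within the
# prediction horizon, 0 = no broken rail. Scores are made-up model outputs.
observed = [0, 0, 1, 0, 1, 0, 0, 1]
predicted_risk = [0.10, 0.35, 0.80, 0.20, 0.30, 0.05, 0.40, 0.90]

auc = roc_auc(observed, predicted_risk)
print(f"AUC = {auc:.3f}")  # 13 of 15 pairs ranked correctly -> AUC = 0.867
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the 0.83 reported above. The pairwise loop is quadratic in the number of segments; for large datasets a rank-based computation or a library routine would be the usual choice.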

Original language: English (US)
Title of host publication: 2020 Joint Rail Conference, JRC 2020
Publisher: American Society of Mechanical Engineers (ASME)
ISBN (Electronic): 9780791883587
State: Published - 2020
Event: 2020 Joint Rail Conference, JRC 2020 - St. Louis, United States
Duration: Apr 20 2020 - Apr 22 2020

Publication series

Name: 2020 Joint Rail Conference, JRC 2020


Conference: 2020 Joint Rail Conference, JRC 2020
Country/Territory: United States
City: St. Louis

All Science Journal Classification (ASJC) codes

  • Mechanical Engineering
  • Transportation

