TY - JOUR
T1 - Toward the development of a conventional time series based web error forecasting framework
AU - Roy, Arunava
AU - Pham, Hoang
N1 - Publisher Copyright: © 2017, Springer Science+Business Media, LLC.
PY - 2018/4/1
Y1 - 2018/4/1
N2 - Web reliability is gaining importance as social networks, mailing systems and other online applications grow exponentially in popularity. To enhance the reliability of an existing web system, web administrators must know which web errors are present in the system, how various workload characteristics influence the manifestation of those errors, and how the workload characteristics relate to one another. In practice, however, it is often not possible to establish a general correspondence among the workload characteristics. Moreover, the prediction and estimation of the cumulative occurrences of source content failures, and of the corresponding time between failures of a web system, have received little attention from the reliability research community. In this work, the authors therefore present a well-defined procedure (a forecasting framework) that web administrators can use to analyze and enhance the reliability of the web sites under their supervision. The framework first takes the HTTP access and error logs and extracts the necessary information on workloads, web errors and the corresponding time between failures. Principal component analysis, correlation analysis and change point analysis are then performed to select the number of independent variables. Next, various time series based forecasting models are developed to forecast the cumulative occurrences of source content failures and the corresponding time between failures. The multivariate models also include various uncorrelated workloads and the exogenous and endogenous noises for forecasting the web errors and the corresponding time between failures. The proposed methodology has been validated with usage statistics collected from the web sites of two highly renowned Indian academic institutions.
KW - HTTP logs
KW - Forecasting
KW - Multivariate Time Series
KW - Univariate Time Series
KW - Web Server
KW - Web Software Reliability
UR - http://www.scopus.com/inward/record.url?scp=85025821745&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85025821745&partnerID=8YFLogxK
U2 - 10.1007/s10664-017-9530-4
DO - 10.1007/s10664-017-9530-4
M3 - Article
AN - SCOPUS:85025821745
SN - 1382-3256
VL - 23
SP - 570
EP - 644
JO - Empirical Software Engineering
JF - Empirical Software Engineering
IS - 2
ER -