TB Research

Application of CNN-LSTM Model in Pulmonary Tuberculosis Incidence Rate Prediction in China

Shiyu Zhang

Abstract

Objective: To investigate the feasibility of using a CNN-LSTM model to predict the incidence rate of pulmonary tuberculosis. Methods: Monthly tuberculosis incidence data for China from January 2012 to December 2021 were obtained from the National Influenza Center, and the monthly incidence rates from 2012 to 2020 were used as the training set. A CNN-LSTM model was built in Python: a convolutional neural network (CNN) first extracted features from the data, and a long short-term memory (LSTM) network then fitted the incidence of pulmonary tuberculosis. The monthly incidence rates for 2021 were used as the test set to evaluate predictive performance, and a seasonal autoregressive integrated moving average (SARIMA) model was established as a control for comparing the two models. Results: The root mean square error (RMSE) and mean absolute error (MAE) of the CNN-LSTM model were 0.7377 and 0.5287, respectively, for fitting, and 0.9139 and 1.0700, respectively, for prediction, both better than those of the SARIMA(1,1,1)(0,1,1)₁₂ model. Conclusion: The CNN-LSTM model fits and predicts well and has application value in predicting the incidence of pulmonary tuberculosis.
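
The data split and error metrics described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the series is a synthetic placeholder rather than real incidence data, the 12-month lookback window and the helper names (`make_windows`, `rmse`, `mae`) are assumptions made for illustration, and a seasonal-naive forecast stands in for a trained model so that the example runs without a deep learning library.

```python
import math

def make_windows(series, lookback=12):
    """Turn a series into (input window, next value) pairs for supervised training.

    The lookback of 12 months is an assumption; the abstract does not state one.
    """
    inputs, targets = [], []
    for i in range(len(series) - lookback):
        inputs.append(series[i:i + lookback])
        targets.append(series[i + lookback])
    return inputs, targets

def rmse(actual, predicted):
    """Root mean square error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mae(actual, predicted):
    """Mean absolute error."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)

# Synthetic placeholder for 120 monthly values (January 2012 to December 2021).
# These are NOT real incidence rates.
series = [6.0 + math.sin(2 * math.pi * month / 12) for month in range(120)]

# 2012-2020 (108 months) for training, 2021 (12 months) for testing.
train, test = series[:108], series[108:]

# Windowed samples that a sequence model such as a CNN-LSTM could be fitted on.
X_train, y_train = make_windows(train, lookback=12)

# Stand-in forecast: repeat the last observed year (seasonal-naive baseline).
forecast = train[-12:]
test_rmse = rmse(test, forecast)
test_mae = mae(test, forecast)
```

With a real series, `forecast` would be replaced by the output of the fitted model, and the same `rmse` and `mae` calls would score both the CNN-LSTM and the SARIMA predictions. For any set of errors, MAE cannot exceed RMSE, which gives a quick sanity check on a reported pair.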

MeSH terms

  • Incidence
  • Python (programming language)
  • Autoregressive integrated moving average
  • Mean squared error
  • Test set
  • Convolutional neural network
  • Computer science
  • Artificial neural network
  • Artificial intelligence
  • Pulmonary tuberculosis
  • Statistics
  • Tuberculosis
  • Machine learning