Abstract

Spirometry is the most accessible and widely used Pulmonary Function Test (PFT); however, it cannot measure static lung volumes, which are essential to support the diagnosis of hyperinflation or lung restriction. This study aimed to fill this gap by using Machine Learning (ML) models to predict these values and to automatically classify the patient's ventilatory pattern.

This is a retrospective study of 8140 anonymized PFTs containing biological, anthropometric, spirometric and whole-body plethysmography data, to which we applied seven ML models to classify the ventilatory pattern and three to infer the values of Residual Volume (RV), Total Lung Capacity (TLC) and RV/TLC.

The best-performing classification model was Extreme Gradient Boosting (XGBoost), with an area under the receiver operating characteristic curve (AUC) of 0.881 for air trapping (presence or absence of elevated RV/TLC) and an AUC of 0.874 for the ventilatory pattern (normal static lung volumes, large lungs, hyperinflation, air trapping, simple restriction or complex restriction).
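For readers less familiar with the metric, the binary AUC reported for air trapping can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below illustrates that rank-based definition only; the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study data or its code.

```python
def roc_auc(labels, scores):
    """Binary AUC via pairwise comparison (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 1 = elevated RV/TLC, 0 = not elevated.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.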

For regression, the best-performing algorithm was Linear Regression (LR), with an R² of 0.539 for RV, 0.856 for TLC, 0.752 for RV/TLC and 0.442 for the RV/TLC z-score.
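Assuming the coefficient of determination reported above follows the standard definition, it compares the squared prediction error with the variance of the measured values. Here y_i is a measured value (for example, plethysmographic TLC), ŷ_i the model prediction, and ȳ the mean of the measured values; these symbols are introduced for illustration only.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

Under this definition, a value of 0.856 for TLC would mean that the model accounts for roughly 86% of the variance in measured TLC on the evaluated data.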

This research demonstrates the viability of incorporating ML into spirometry to provide a more comprehensive analysis of lung function. These techniques can give deeper insight into a patient's lung health, supporting more accurate diagnosis and treatment plans.