Abstract

Introduction: Idiopathic pulmonary fibrosis (IPF) is a fibrotic lung disease leading to loss of respiratory function. The IPF-PRO Registry is a prospective multi-centre registry that provides a valuable resource to investigate biomarkers associated with disease progression in patients with IPF.

Aim: To develop a biomarker-inclusive model predictive of disease progression or death in patients with IPF.

Methods: We applied a machine learning-based random forest survival approach to clinical data and blood samples taken from 558 patients with IPF at enrolment into the IPF-PRO Registry to identify demographic/clinical parameters and gene and protein expression predictive of disease progression, defined as a composite of a >10% decline in FVC % predicted, lung transplant, or death. To maximize feature relevance and minimize redundancy among the biomarkers, we pre-selected features using univariable Cox regression models in conjunction with feature correlation.
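
The feature pre-selection step can be illustrated with a minimal sketch. It is not the registry analysis itself: it assumes the univariable Cox p-values have already been computed elsewhere, and the greedy filtering rule, the function names (`pearson`, `preselect`), the cutoffs (`p_cutoff`, `corr_cutoff`), and the toy marker data are all hypothetical choices made for illustration. The abstract does not state how relevance and correlation were combined.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant feature: treat as uncorrelated
    return cov / (sx * sy)

def preselect(features, p_values, p_cutoff=0.05, corr_cutoff=0.8):
    """Greedy relevance/redundancy filter (illustrative assumption).

    features: dict mapping feature name -> list of values, one per patient
    p_values: dict mapping feature name -> univariable Cox p-value,
              assumed to be precomputed with a survival library
    """
    # Relevance: keep features passing the p-value cutoff, most significant first.
    candidates = sorted(
        (name for name in features if p_values[name] <= p_cutoff),
        key=lambda name: p_values[name],
    )
    selected = []
    for name in candidates:
        # Redundancy: skip a feature that is strongly correlated
        # with any feature already selected.
        if all(abs(pearson(features[name], features[s])) < corr_cutoff
               for s in selected):
            selected.append(name)
    return selected

# Toy data: hypothetical markers, not values from the IPF-PRO Registry.
features = {
    "marker_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "marker_b": [2.1, 3.9, 6.2, 8.0, 9.9],  # nearly collinear with marker_a
    "marker_c": [5.0, 1.0, 4.0, 2.0, 3.0],
    "marker_d": [0.3, 0.1, 0.4, 0.1, 0.5],  # fails the relevance cutoff
}
p_values = {"marker_a": 0.001, "marker_b": 0.004,
            "marker_c": 0.02, "marker_d": 0.40}

selected = preselect(features, p_values)
print(selected)  # ['marker_a', 'marker_c']
```

In this toy example, `marker_b` is dropped as redundant with the more significant `marker_a`, and `marker_d` is dropped as insufficiently relevant. The surviving features would then be passed to the survival model.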

Results: The final model for prediction of the composite endpoint included, among other parameters, DLCO % predicted, body mass index, SP-D protein expression, and MIXL1 gene expression. The combination of demographic/clinical parameters and gene or protein expression predicted the composite endpoint better than demographic/clinical parameters alone. We validated the model on a separate sub-cohort of 247 patients from the IPF-PRO Registry.

Conclusions: The set of biomarkers identified in this analysis may serve as a panel to predict disease progression or death in patients with IPF.