Abstract

RATIONALE:
The 2021 ERS/ATS standards for pulmonary function test (PFT) interpretation add complexity through new decision trees and the use of Z-scores, which better reflect population variation. This study investigates the speed and accuracy with which experts can interpret PFTs under these standards.
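For readers unfamiliar with Z-score based interpretation, the idea can be sketched as follows. This is a minimal illustration, not the method used by the experts or by the software in this study. It assumes the LMS form of reference equations (L: skewness, M: predicted median, S: coefficient of variation), a lower limit of normal at the 5th percentile (Z = -1.645), and three severity bands (mild down to -2.5, moderate down to -4.0, severe below that). The function names and the example parameter values are hypothetical.

```python
import math

# Lower limit of normal: 5th percentile of a standard normal distribution.
LLN_Z = -1.645


def lms_z_score(measured, L, M, S):
    """Z-score of a measurement given LMS reference parameters.

    L is the skewness (Box-Cox power), M the predicted median and
    S the coefficient of variation. These would normally come from
    a reference equation for the patient's age, height and sex.
    """
    if L == 0:
        # Limiting case of the Box-Cox transform.
        return math.log(measured / M) / S
    return ((measured / M) ** L - 1) / (L * S)


def severity_from_z(z):
    """Map a Z-score to a severity band (assumed thresholds, see lead-in)."""
    if z >= LLN_Z:
        return "normal"
    if z >= -2.5:
        return "mild"
    if z >= -4.0:
        return "moderate"
    return "severe"


# Hypothetical example: measured FEV1 of 3.0 L against a predicted median of 4.0 L.
z = lms_z_score(3.0, L=1.0, M=4.0, S=0.125)
print(z, severity_from_z(z))
```

Because the Z-score already accounts for the expected spread at a given age and height, a single set of thresholds applies to every patient, which is the main change from fixed percent-of-predicted cut-offs.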
METHODS:
Four experts and automatic software (ArtiQ.PFT) interpreted and classified as many lung function tests as possible in one uninterrupted hour. For each case they reported the spirometry, lung volume and diffusion interpretations, the lung function and diffusion severities, and the bronchodilator response (BDR).
RESULTS:
Experts classified on average 17 cases in 1 hour (2.8 min/case) with a mean accuracy of 80% (Table 1), whereas the software correctly classified five patients per second. The large standard deviations reveal substantial variation between experts. Experts disagreed on whether pre- or post-bronchodilator measurements should be used to determine patterns and severity. Unlike the GOLD criteria, the ERS/ATS standards give no clear guidance on this point (e.g., should the post-bronchodilator FEV1 replace the pre-bronchodilator value when grading obstructive severity?).
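The pre/post question matters because one test yields two candidate FEV1 values, and the grade can depend on which is chosen. The sketch below is illustrative only. It assumes a significant BDR is defined as a change of more than 10% relative to the predicted value, and it uses percent of predicted merely to show how far the two candidates can diverge. The function names and numbers are hypothetical and are not taken from the study data.

```python
def bdr_percent_of_predicted(pre, post, predicted):
    """Change after bronchodilator, expressed as a percentage of the predicted value."""
    return 100.0 * (post - pre) / predicted


def is_significant_bdr(pre, post, predicted, threshold=10.0):
    """True if the change exceeds the assumed threshold (in % of predicted)."""
    return bdr_percent_of_predicted(pre, post, predicted) > threshold


def percent_predicted(value, predicted):
    """A measurement expressed as a percentage of its predicted value."""
    return 100.0 * value / predicted


# Hypothetical FEV1 values in litres.
pre_fev1, post_fev1, predicted_fev1 = 2.0, 2.5, 4.0

print(bdr_percent_of_predicted(pre_fev1, post_fev1, predicted_fev1))
print(is_significant_bdr(pre_fev1, post_fev1, predicted_fev1))

# The two candidate inputs to severity grading differ noticeably.
print(percent_predicted(pre_fev1, predicted_fev1))
print(percent_predicted(post_fev1, predicted_fev1))
```

Two raters who agree on every measurement can therefore still assign different severities if one grades on the pre-bronchodilator value and the other on the post-bronchodilator value.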
CONCLUSION:
This study shows that lung function interpretation using 2021 ERS/ATS standards requires significant time from experts and is associated with large inter-rater variability, whereas automatic software is accurate, consistent and fast.

TABLE 1 - Classification accuracy (*from Stanojevic et al., 2021)

Measure                                   Mean ± SD
Number of patients in 1 h                 17 ± 5
Spirometry interpretation (fig 8*)        88 ± 14 %
Bronchodilator response                   91 ± 13 %
Lung volumes interpretation (fig 10*)     85 ± 12 %
Lung function severity                    68 ± 22 %
Diffusion interpretation (fig 11*)        89 ± 10 %
Diffusion severity                        59 ± 29 %