Abstract

Methods

Generative adversarial networks can synthesise realistic images. We trained two deep learning models to identify UIP-like features on HRCT. Each model outputs a 5-point UIP PIOPED score (a likelihood of definite UIP on HRCT). The first model (output: RealCT UIP score) was trained on real HRCT data only, using 500 unique 4-slice montages from 264 HRCTs. The second model (output: SynthCT UIP score) was trained on 500 real and 500 synthetic 4-slice montages (Figure 1) from the same 264 HRCTs. In a national registry of 504 patients with suspected IPF, we used Cox regression to compare how well the model scores and expert radiologist evaluation (RadiologistCT UIP score) predicted mortality.
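
The Cox regression used for this comparison can be written out explicitly. As a sketch, assuming each UIP score enters as a single continuous covariate alongside total fibrosis extent (the symbols below are illustrative and are not defined in the study), the mortality hazard for a patient is modelled as

\[
h(t \mid x) = h_0(t)\,\exp\left(\beta_{\mathrm{UIP}}\, x_{\mathrm{UIP}} + \beta_{\mathrm{fib}}\, x_{\mathrm{fib}}\right),
\qquad
\mathrm{HR}_{\mathrm{UIP}} = \exp\left(\beta_{\mathrm{UIP}}\right)
\]

where \(h_0(t)\) is the unspecified baseline hazard, \(x_{\mathrm{UIP}}\) is the UIP score being tested, and \(x_{\mathrm{fib}}\) is total fibrosis extent. Under this assumed coding, the hazard ratio is the multiplicative change in mortality hazard per one-unit increase in the UIP score at a fixed fibrosis extent.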

Results

On univariable analysis, controlling for total fibrosis extent, RealCT UIP scores and SynthCT UIP scores predicted mortality (n=504, HR 1.99, p<0.0001 and HR 2.30, p<0.0001, respectively). On bivariable analysis of SynthCT UIP scores and RadiologistCT UIP scores, controlling for total fibrosis extent, only SynthCT UIP scores predicted mortality (n=504, HR 1.24, p=0.007). SynthCT UIP scores independently predicted mortality in patients with an "indeterminate" HRCT pattern (n=89, HR 1.96, p=0.002) and trended toward significance in patients who underwent surgical lung biopsy (n=78, HR 1.62, p=0.058).

Conclusion

Synthetic HRCTs can be used to augment the training of prognostic deep learning models with better-than-human performance.