Abstract

Background: Missing sputum samples are common and hamper the use of paired sputum-based outcome measures. The aim of this study was to illustrate the effects of different methods for handling missing data.

Methods: Data on mannitol-induced sputum from three intervention studies were pooled. Unconditional median imputation (UCM) and multiple imputation (MI) of eosinophil and neutrophil counts at baseline and follow-up were evaluated using two approaches. First, their effect on the prediction of improvement in lung function and airway hyperreactivity was assessed. Second, missing samples were simulated at different prevalences (20%, 40% and 60%) in patients with complete data, and the accuracy of imputation was evaluated using the Bland-Altman (BA) approach.
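As an illustration of the second approach, the sketch below simulates missing values at a given prevalence and fills them by unconditional median imputation. It is a minimal example under stated assumptions, not the study's implementation: the function names are hypothetical, and values are removed completely at random, a missingness mechanism the abstract does not specify.

```python
import random
import statistics

def simulate_missing(values, prevalence, seed=0):
    """Set a fixed fraction of entries to None.

    Assumption: entries are removed completely at random; the study's
    actual missingness mechanism is not described in the abstract.
    """
    rng = random.Random(seed)
    n_missing = round(len(values) * prevalence)
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]

def median_impute(values):
    """Replace each None with the median of the observed (non-None) entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to compute a median from")
    med = statistics.median(observed)
    return [med if v is None else v for v in values]

# Hypothetical eosinophil counts (%), for illustration only.
complete = [0.5, 1.0, 2.0, 3.5, 4.0, 6.0, 8.0, 12.0, 15.0, 30.0]
with_gaps = simulate_missing(complete, prevalence=0.4)
imputed = median_impute(with_gaps)
```

Because every gap receives the same value, UCM leaves the median of the observed data unchanged but shrinks the variability of the completed variable, whereas MI draws imputations from a model and so retains more of that variability.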

Results: Of the 135 patients included, paired sputum samples were collected in 70 patients while 120 had at least one sample collected. Patients with no sample collected were excluded.
The imputation methods had similar effects on the prediction model for improvement in clinical outcomes (n=120) in terms of β, SE and significance level, with the effect estimates slightly more conservative using MI.

Using the BA approach (n=70), UCM and MI both introduced a systematic proportional bias, albeit of different sign. For eosinophil count, median raw bias ranged from -2.5 to 5.8 using UCM and from -12.0 to 6.4 using MI. The limits of agreement were consistently narrower using UCM.
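For orientation, the quantities named here can be computed as in the following sketch. It uses the conventional Bland-Altman definitions (mean difference as bias, bias ± 1.96 SD as limits of agreement, and the slope of the differences on the pairwise means as an indicator of proportional bias). The function name and the example numbers are hypothetical, and the study's own summary (a median raw bias) may have been derived differently.

```python
import statistics

def bland_altman(imputed, observed):
    """Return (bias, lower_loa, upper_loa, slope) for paired measurements.

    bias  : mean of the differences (imputed - observed)
    LoA   : bias +/- 1.96 * SD of the differences
    slope : least-squares slope of the differences on the pairwise means;
            a non-zero slope suggests proportional bias.
    """
    diffs = [i - o for i, o in zip(imputed, observed)]
    means = [(i + o) / 2 for i, o in zip(imputed, observed)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    mean_of_means = statistics.mean(means)
    sxx = sum((m - mean_of_means) ** 2 for m in means)
    if sxx == 0:
        raise ValueError("pairwise means are constant; slope is undefined")
    sxy = sum((m - mean_of_means) * (d - bias) for m, d in zip(means, diffs))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd, sxy / sxx

# Hypothetical paired values, for illustration only.
bias, lower, upper, slope = bland_altman(
    imputed=[2.0, 4.0, 6.0, 8.0],
    observed=[1.0, 5.0, 5.0, 9.0],
)
```

Narrower limits of agreement indicate less scatter of the imputed values around the observed ones. They do not by themselves indicate an absence of bias.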

Conclusion: Imputation addresses, rather than ignores, missing data and greatly increases statistical power, but these advantages must be weighed against the introduction of a systematic proportional bias. UCM and MI performed similarly despite the superior statistical properties of MI.