Robustness to Missing Data: Breakdown Point Analysis

Missing data are pervasive in econometric applications, and it is rarely plausible that the data are missing (completely) at random. This paper proposes a methodology for studying the robustness of results drawn from incomplete datasets. Selection is measured as the squared Hellinger divergence between the distributions of complete and incomplete observations, which has a natural interpretation. The breakdown point is defined as the minimal amount of selection needed to overturn a given result. Reporting point estimates and lower confidence intervals of the breakdown point is a simple, concise way to communicate a result’s robustness. An estimator of the breakdown point of results from GMM models is proposed and shown to be root-n consistent and asymptotically normal under mild assumptions. Confidence intervals are constructed with a simple bootstrap procedure. The paper concludes with a simulation study illustrating the good finite-sample performance of the procedure.
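
To make the two central quantities concrete, the following is a minimal sketch of a toy case, not the GMM estimator proposed in the paper. It assumes a binary outcome, a known share of complete observations, and the conclusion "the population mean exceeds a threshold". The function names (squared_hellinger, toy_breakdown_point), the parameter names, and the normalization of the squared Hellinger divergence with a factor of 1/2 (so that it lies between 0 and 1) are all illustrative assumptions rather than definitions taken from the paper.

```python
import math


def squared_hellinger(p, q):
    """Squared Hellinger divergence between two discrete distributions.

    Assumed convention: H^2 = 0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2,
    which is 0 for identical distributions and 1 for disjoint supports.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same support size")
    return 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))


def toy_breakdown_point(p_complete, share_observed, threshold):
    """Toy breakdown point for the claim 'population mean > threshold'.

    Hypothetical setup: the outcome is binary, complete cases have mean
    p_complete, and incomplete cases have an unknown mean q. The population
    mean is share_observed * p_complete + (1 - share_observed) * q.

    Returns the smallest squared Hellinger divergence between
    Bernoulli(q) and Bernoulli(p_complete) for which the claim fails,
    or None if no q in [0, 1] can overturn the claim.
    """
    s = share_observed
    if not 0.0 < s < 1.0:
        raise ValueError("share_observed must lie strictly between 0 and 1")

    # Largest q at which the population mean is still <= threshold.
    q_star = (threshold - s * p_complete) / (1.0 - s)

    if q_star >= p_complete:
        # The claim already fails with no selection at all (q = p_complete).
        return 0.0
    if q_star < 0.0:
        # Even q = 0 leaves the population mean above the threshold.
        return None

    # For q below p_complete the divergence grows as q moves away from
    # p_complete, so the least selection that overturns the claim is at q_star.
    return squared_hellinger([q_star, 1.0 - q_star],
                             [p_complete, 1.0 - p_complete])


if __name__ == "__main__":
    # 80% of observations complete, complete-case mean 0.6, claim: mean > 0.5.
    print(toy_breakdown_point(p_complete=0.6, share_observed=0.8, threshold=0.5))
```

In this toy case a larger return value means more selection is required to overturn the claim, so the claim is more robust; a value of 0 means the claim does not hold even when the data are missing completely at random, and None means no amount of selection can overturn it.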