Flexible Imputation of Missing Data

Purchasing Options

ISBN 9781439868249
Cat# K13103



eBook (VitalSource)
ISBN 9781439868256
Cat# KE13273




  • Provides an accessible introduction to multiple imputation for handling missing data
  • Examines various missing-data problems and presents strategies for tackling them
  • Includes many examples using real data
  • Supported by an R package, enabling the reader to replicate the analyses and use the methods in their own work
  • All material downloadable from www.multiple-imputation.com


Missing data form a problem in every scientific discipline, yet the techniques required to handle them are complicated and often lacking. One of the great ideas in statistical science—multiple imputation—fills gaps in the data with plausible values, the uncertainty of which is coded in the data itself. It also solves other problems, many of which are missing data problems in disguise.

Flexible Imputation of Missing Data presents a practical guide for handling missing data under the framework of multiple imputation, supported by many examples using real data taken from the author’s extensive experience in collaborative research. Detailed guidance on implementation in R using the author’s package MICE is included throughout the book.

Assuming familiarity with basic statistical concepts and multivariate methods, Flexible Imputation of Missing Data is intended for two audiences:

  • (Bio)statisticians, epidemiologists, and methodologists in the social and health sciences
  • Substantive researchers who do not call themselves statisticians, but who possess the necessary skills to understand the principles and to follow the recipes

This graduate-tested book avoids mathematical and technical details as much as possible: formulas are accompanied by a verbal statement that explains the formula in layperson terms. Readers less concerned with the theoretical underpinnings will be able to pick up the general idea, and technical material is available for those who desire deeper understanding. The analyses can be replicated in R using a dedicated package developed by the author.

Table of Contents

The problem of missing data
Concepts of MCAR, MAR and MNAR
Simple solutions that do not (always) work
Multiple imputation in a nutshell
Goal of the book
What the book does not cover
Structure of the book

Multiple imputation
Historic overview
Incomplete data concepts
Why and when multiple imputation works
Statistical intervals and tests
Evaluation criteria
When to use multiple imputation
How many imputations?

Univariate missing data
How to generate multiple imputations
Imputation under the normal linear normal
Imputation under non-normal distributions
Predictive mean matching
Categorical data
Other data types
Classification and regression trees
Multilevel data
Non-ignorable methods

Multivariate missing data
Missing data pattern
Issues in multivariate imputation
Monotone data imputation
Joint Modeling
Fully Conditional Specification
FCS and JM

Imputation in practice
Overview of modeling choices
Ignorable or non-ignorable?
Model form and predictors
Derived variables
Algorithmic options

Analysis of imputed data
What to do with the imputed data?
Parameter pooling
Statistical tests for multiple imputation
Stepwise model selection

Case studies
Measurement issues
Too many columns
Sensitivity analysis
Correct prevalence estimates from self-reported data
Enhancing comparability

Selection issues
Correcting for selective drop-out
Correcting for non-response

Longitudinal data
Long and wide format
SE Fireworks Disaster Study
Time raster imputation

Some dangers, some do's and some don'ts
Other applications
Future developments

Appendices: Software
Other software

Author Index
Subject Index

Editorial Reviews

"This book would be well suited as a textbook, especially at the graduate level, possibly for biostatisticians, epidemiologists, or applied scientists and users of statistical methodology. …a very enjoyable read, and—at least in my opinion—it is a book that belongs on everyone’s shelf as it does open one’s eyes to a problem that has surrounded us (and that many of us have ignored!) for a very long time."
—Wolfgang S. Jank, Journal of the American Statistical Association, June 2013

"From the first lines of Chapter 1 throughout the entire monograph, the author presents numerous R language codes, so the book also serves as a good introduction to R. Each chapter is complete with various examples and exercises. The book is very useful to graduate students and researchers for solving practical problems with real data."
—Technometrics, February 2013

"It’s excellent and I highly recommend it. … van Buuren’s book is great even if you don’t end up using the algorithm described in the book … he supplies lots of intuition, examples, and graphs."
—Andrew Gelman, Columbia University

"… a beautiful book that is so full of guidance for statisticians … exceptionally up to date and has more useful wisdom about dealing with common missing data problems than any other source I've seen."
—Frank Harrell, Vanderbilt University

"I’m delighted to see this new book on multiple imputation by Stef van Buuren … This book represents a 'no nonsense' straightforward approach to the application of multiple imputation. I particularly like Stef’s use of graphical displays … It’s great to have Stef’s book on multiple imputation, and I look forward to seeing more editions as this rapidly developing methodology continues to become even more effective at handling missing data problems in practice."
—From the Foreword by Donald B. Rubin