WK81: Missing data: consequences and solutions
Although researchers do their best to avoid missing data, it remains a common problem in medical and epidemiological studies. How serious your missing data problem is and how best to deal with it depend on how much data is missing and why it is missing. This three-day course gives you the tools to evaluate and handle missing data in medical and epidemiological studies with different missing data rates.
(If a course is marked [full], you can still sign up, but you will be placed on a waiting list. Once a spot opens up, we will contact you, and you can then decide whether to participate in the course.)
Dates: 21, 22, 23 January 2019
Tuition fee: € 950,-
Course description and topics
There are various methods to deal with missing data. Simple solutions are to ignore the missing values by deleting all cases with missing values from the analysis, or to use a regression model to estimate the missing values. There are also more advanced methods, such as Multiple Imputation. Multiple Imputation with the Multivariate Imputation by Chained Equations (MICE) procedure is a promising technique that works well in various missing data situations. With Multiple Imputation, several complete datasets are generated. The data analysis is carried out in each dataset, and the results are pooled using special calculation rules (called Rubin’s rules). These steps will be discussed during the course, as will the question of how to use different missing data methods in medical and epidemiological datasets.
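To give an impression of the pooling step, the sketch below applies Rubin’s rules to a single parameter (for example a regression coefficient) estimated in each imputed dataset. It is a minimal illustration rather than course material: the function name `pool_rubin` and the example numbers are hypothetical, and it pools only the point estimate and its standard error (no degrees of freedom or confidence interval).

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one parameter over m imputed datasets using Rubin's rules.

    estimates  : the point estimate obtained in each imputed dataset
    std_errors : the matching standard error from each imputed dataset
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two imputations and matching lengths")

    pooled = mean(estimates)                      # average of the m estimates
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                 # variance between imputations (m - 1 denominator)
    total = within + (1 + 1 / m) * between        # total variance of the pooled estimate
    return pooled, math.sqrt(total)


# Hypothetical results from m = 3 imputed datasets
pooled_estimate, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The pooled standard error is larger than the standard error in any single imputed dataset whenever the estimates differ between datasets. That extra term reflects the uncertainty caused by the missing values.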
Before you use a method to handle missing data, you must gain insight into the effect of missing data on your study results. Consequences of various rates of missing data for your study results will be explored and discussed during the course. In general, there are three missing data mechanisms: missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR). Knowledge of these mechanisms is important because it tells you how well you can estimate and replace the missing values, and therefore how well you can solve the missing data problem in your study. Furthermore, it is important to check whether your imputation strategy was successful (imputation diagnostics), which will also be discussed during the course.
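The difference between the three mechanisms can be illustrated with a small simulation. The sketch below is an assumption-laden toy example, not one of the course datasets: the variables (age and blood pressure), the missingness probabilities and the function name `complete_case_means` are all made up for illustration. Blood pressure is made missing in three ways, and the mean of the remaining observed values is compared with the mean of the full data.

```python
import random
from statistics import mean


def complete_case_means(n=20000, seed=1):
    """Simulate blood pressure (bp) that depends on age, delete values under
    MCAR, MAR and MNAR, and return the mean of the values that remain."""
    rng = random.Random(seed)
    full, mcar, mar, mnar = [], [], [], []
    for _ in range(n):
        age = rng.gauss(50, 10)
        bp = 120 + 0.5 * (age - 50) + rng.gauss(0, 10)
        full.append(bp)
        # MCAR: every value has the same 30% chance of being missing
        if rng.random() > 0.3:
            mcar.append(bp)
        # MAR: missingness depends only on age, which is always observed
        if rng.random() > (0.5 if age > 50 else 0.1):
            mar.append(bp)
        # MNAR: missingness depends on the (unobserved) bp value itself
        if rng.random() > (0.5 if bp > 120 else 0.1):
            mnar.append(bp)
    return {"full": mean(full), "MCAR": mean(mcar),
            "MAR": mean(mar), "MNAR": mean(mnar)}


means = complete_case_means()
for mechanism, value in means.items():
    print(f"{mechanism:5s} mean bp of observed values: {value:.1f}")
```

In this setup the observed mean under MCAR stays close to the full-data mean, while under MAR and MNAR it is too low. Under MAR the bias can in principle be corrected using the observed variable age. Under MNAR the observed data alone are not enough.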
Each course day starts with lectures in the morning, followed by computer exercises in the afternoon. During the computer exercises you will practise various ways to explore missing data problems, as well as simple and more advanced missing data methods such as Multiple Imputation, using SPSS software. You will work with real epidemiological and medical datasets.
Martijn W. Heymans, PhD, course coordinator
Department of Epidemiology & Biostatistics. Amsterdam UMC, location VUmc
Dr. Martijn Heymans's expertise is in prognostic and prediction modeling, missing data and longitudinal data analysis. He has (co-)authored more than 80 scientific publications, teaches courses in epidemiology, applied biostatistics and regression techniques, and works as a statistical consultant.
Iris Eekhout, PhD
Department of Epidemiology & Biostatistics. Amsterdam UMC, location VUmc
Department Child Health, Netherlands Organisation for Applied Scientific Research (TNO), Leiden
Iris Eekhout completed a master's degree in Clinical Psychology and a master's degree in Methodology and Statistics at the University of Leiden. She did a PhD project on missing data methods at the department of Epidemiology and Biostatistics of the VU University medical center, which focused on methods to handle missing questionnaire items and total scores. Currently, Iris teaches in several EpidM courses and works as a statistician at TNO.
Missing data consequences
- Examples of missing data in different epidemiological and medical research designs.
- The meaning of missing data mechanisms (MCAR, MAR, MNAR).
- Consequences and impact of missing data rates for statistical analyses.
- Ways to evaluate various missing data situations and mechanisms.
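One basic way to evaluate a missing data situation is to compare an observed variable between cases with and without a missing value on another variable. The sketch below shows the idea on a tiny, made-up dataset. The helper `compare_by_missingness` and the variable names are hypothetical, and a real analysis would also use a formal test (for example a t-test) on a much larger sample.

```python
from statistics import mean


def compare_by_missingness(rows, target, covariate):
    """Return the mean of `covariate` for rows where `target` is missing
    and for rows where `target` is observed (None marks a missing value)."""
    when_missing = [r[covariate] for r in rows
                    if r[target] is None and r[covariate] is not None]
    when_observed = [r[covariate] for r in rows
                     if r[target] is not None and r[covariate] is not None]
    return mean(when_missing), mean(when_observed)


# Made-up example: blood pressure (bp) is missing for some participants
rows = [
    {"age": 62, "bp": None},
    {"age": 58, "bp": None},
    {"age": 66, "bp": None},
    {"age": 45, "bp": 118},
    {"age": 39, "bp": 112},
    {"age": 51, "bp": 125},
    {"age": 47, "bp": 121},
    {"age": 58, "bp": 131},
]

age_when_bp_missing, age_when_bp_observed = compare_by_missingness(rows, "bp", "age")
```

A clear difference between the two groups suggests that the data are not MCAR, because missingness is related to an observed variable. Such a comparison cannot distinguish MAR from MNAR, since that would require the missing values themselves.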
Missing data solutions
- The application of simple missing data methods.
- The theory and practice of Multiple Imputation.
- Data analysis after Multiple Imputation.
- How to evaluate imputation success by using imputation diagnostics.
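The impute, analyse and pool workflow behind these topics can be sketched in a few lines. The example below is a deliberately simplified stand-in and not the MICE procedure used in the course. It assumes a single incomplete variable y and a complete variable x, imputes y by stochastic regression imputation, and does not redraw the regression parameters for each imputation, so the pooled standard error will be somewhat too small. All names and the simulated data are hypothetical.

```python
import math
import random
from statistics import mean, variance


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, math.sqrt(rss / (len(xs) - 2))


def impute_and_pool(xs, ys, m=20, seed=0):
    """Impute missing ys (None) m times, estimate the mean of y in each
    completed dataset, and pool the m results with Rubin's rules."""
    rng = random.Random(seed)
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b, sd = fit_line([x for x, _ in observed], [y for _, y in observed])

    estimates, variances = [], []
    for _ in range(m):
        # Step 1: impute = regression prediction plus random residual noise
        completed = [y if y is not None else a + b * x + rng.gauss(0, sd)
                     for x, y in zip(xs, ys)]
        # Step 2: analyse the completed dataset (here: mean and its variance)
        estimates.append(mean(completed))
        variances.append(variance(completed) / len(completed))

    # Step 3: pool with Rubin's rules
    pooled = mean(estimates)
    total_var = mean(variances) + (1 + 1 / m) * variance(estimates)
    return pooled, math.sqrt(total_var)


# Simulated MAR data: y is more often missing when x is high
rng = random.Random(42)
xs = [rng.gauss(0, 1) for _ in range(2000)]
full_ys = [2 + x + rng.gauss(0, 1) for x in xs]
ys = [None if rng.random() < (0.6 if x > 0 else 0.1) else y
      for x, y in zip(xs, full_ys)]

pooled_mean, pooled_se = impute_and_pool(xs, ys)
complete_case_mean = mean([y for y in ys if y is not None])
full_data_mean = mean(full_ys)
```

In this simulated MAR situation the complete-case mean is clearly too low, while the pooled mean after imputation is close to the mean of the full data. Chained equations extend the same idea to datasets where several variables have missing values.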
1. The participant is able to distinguish between different missing data mechanisms called missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR).
2. The participant can apply basic evaluation procedures to make a valid assumption about the missing data mechanism.
3. The participant understands the working of the most frequently used methods to handle missing data in epidemiological and medical datasets.
4. The participant recognizes the strengths and limitations of the most frequently used methods to handle missing data in various missing data situations.
5. The participant is able to work with SPSS to investigate missing data and to work with the best missing data methods for various missing data situations.
6. The participant is able to use Multiple Imputation by the Multivariate Imputation by Chained Equations (MICE) procedure in SPSS and RStudio.
7. The participant understands how multiple imputation works and how a multiple imputation model should be specified.
8. The participant understands how to handle missing questionnaire data and can comprehend the difference between handling item scores at item level and at total score level.
9. The participant understands the practical solutions to handle missing data in longitudinal studies.
10. The participant is able to work with SPSS and RStudio to handle missing data in questionnaires and in longitudinal studies.
Target group and course pre-requisites
Target group: The course is designed for PhD students, practitioners and applied researchers working in epidemiology, medicine, public health, psychology or human movement sciences. It is intended for everyone who wants to learn about missing data, for example because missing data may be present in your own research and you are about to start your data analysis, or because you want to learn how to judge articles or research grants that report missing data. It is also important to be able to judge the impact of missing data in practice-related research.
Participants are assumed to be familiar with the following at the start of this course:
- Knowledge of basic statistical tests such as t-tests and regression analyses.
- Knowledge of some basic SPSS commands.
On the first day of the course you will receive a package which contains copies of all lecture presentations and computer exercises, assignments, and feedback on these assignments.
Exam and accreditation
Participants who take this course as part of the Master Epidemiology always complete the course with an exam. Other participants can choose if they want to complete the course with an exam.
The exam will be in English. You will receive a certificate showing the credits (study points/EC) only if you pass the exam.
The examination dates can be found on the website of EpidM.
Anyone who wants to participate in the examination must register via the website at least four weeks before the exam: https://www.epidm.nl/nl/tentamens/
The reference material for the examination and practice questions can be found on the Canvas page of the course (see above).
The use of e-books is forbidden during EpidM examinations.
Only for Dutch students!
If you wish to be considered for accreditation points connected to this course, you must sign the attendance list on the last day of the course.
To qualify for the accreditation points, you must have been present throughout the course.