- Provides an alternative and intuitive understanding of statistical modeling using vector geometry
- Uses vector geometry extensively to explain the problems with collinearity in linear models and other complex statistical models
- Introduces a wide range of statistical models as analytical tools, from simple regression, analysis of covariance, multilevel models, latent growth models and growth mixture models to partial least squares regression
- Examples come from real research settings and are discussed in great detail without over-simplification
- Discusses vital but often poorly understood statistical concepts, such as mathematical coupling, regression to the mean, collinearity, the reversal paradox and statistical interaction

While biomedical researchers may be able to follow instructions in the manuals accompanying the statistical software packages, they do not always have sufficient knowledge to choose the appropriate statistical methods and correctly interpret their results. **Statistical Thinking in Epidemiology** examines common methodological and statistical problems in the use of correlation and regression in medical and epidemiological research: mathematical coupling, regression to the mean, collinearity, the reversal paradox, and statistical interaction.

**Statistical Thinking in Epidemiology** is about thinking statistically when looking at problems in epidemiology. The authors focus on several methods and look at them in detail: specific examples in epidemiology illustrate how different model specifications can imply different causal relationships amongst variables, and model interpretation is undertaken with appropriate consideration of the context of implicit or explicit causal relationships. This book is intended for applied statisticians and epidemiologists, but can also be very useful for clinical and applied health researchers who want to have a better understanding of statistical thinking.

*Throughout the book, the statistical software packages R and Stata are used for general statistical modeling, and Amos and Mplus are used for structural equation modeling.*

**Introduction**

Uses of Statistics in Medicine and Epidemiology

Structure and Objectives of This Book

Nomenclature in This Book

Glossary

Vector Geometry of Linear Models for Epidemiologists

Introduction

Basic Concepts of Vector Geometry in Statistics

Correlation and Simple Regression in Vector Geometry

Linear Multiple Regression in Vector Geometry

Significance Testing of Correlation and Simple Regression in Vector Geometry

Significance Testing of Multiple Regression in Vector Geometry

Summary

Path Diagrams and Directed Acyclic Graphs

Introduction

Path Diagrams

Directed Acyclic Graphs

Direct and Indirect Effects

Summary

Introduction

Historical Background

Why Should Change Not Be Regressed on Initial Value? A Review of the Problem

Proposed Solutions in the Literature

Comparison between Oldham’s Method and Blomqvist’s Formula

Oldham’s Method and Blomqvist’s Formula Answer Two Different Questions

What Is Galton’s Regression to the Mean?

Testing the Correct Null Hypothesis

Evaluation of the Categorisation Approach

Testing the Relation between Changes and Initial Values When There Are More than Two Occasions

Discussion

Analysis of Change in Pre-/Post-Test Studies

Introduction

Analysis of Change in Randomised Controlled Trials

Comparison of Six Methods

Analysis of Change in Non-Experimental Studies: Lord’s Paradox

ANCOVA and

Conclusion

Collinearity and Multicollinearity

Introduction: Problems of Collinearity in Linear Regression

Collinearity

Multicollinearity

Mathematical Coupling and Collinearity

Vector Geometry of Collinearity

Geometrical Illustration of Principal Components Analysis as a Solution to Multicollinearity

Example: Mineral Loss in Patients Receiving Parenteral Nutrition

Solutions to Collinearity

Conclusion

Is ‘Reversal Paradox’ a Paradox?

A Plethora of Paradoxes: The Reversal Paradox

Background: The Foetal Origins of Adult Disease Hypothesis (Barker’s Hypothesis)

Vector Geometry of the Foetal Origins Hypothesis

Reversal Paradox and Adjustment for Current Body Size: Empirical Evidence from Meta-Analysis

Discussion

Conclusion

Testing Statistical Interaction

Introduction: Testing Interactions in Epidemiological Research

Testing Statistical Interaction between Categorical Variables

Testing Statistical Interaction between Continuous Variables

Partial Regression Coefficient for Product Term in Regression Models

Categorization of Continuous Explanatory Variables

The Four-Model Principle in the Foetal Origins Hypothesis

Categorization of Continuous Covariates and Testing Interaction

Discussion

Conclusion

Finding Growth Trajectories in Lifecourse Research

Introduction

Current Approaches to Identifying Postnatal Growth Trajectories in Lifecourse Research

Discussion

Partial Least Squares Regression for Lifecourse Research

Introduction

Data

OLS Regression

PLS Regression

Discussion

Conclusion

Concluding Remarks

References

**Dr Yu-Kang Tu** is a Senior Clinical Research Fellow in the Division of Biostatistics, School of Medicine, and in the Leeds Dental Institute, University of Leeds, Leeds, UK. He was a visiting Associate Professor at the National Taiwan University, Taipei, Taiwan. First trained as a dentist and then as an epidemiologist, he has published extensively in dental, medical, epidemiological and statistical journals. He is interested in developing statistical methodologies to solve statistical and methodological problems such as mathematical coupling, regression to the mean, collinearity and the reversal paradox. His current research focuses on applying latent variable methods, e.g. structural equation modelling and latent growth curve modelling, to lifecourse epidemiology. More recently, he has been working on applying partial least squares regression to epidemiological data.

**Prof Mark S Gilthorpe** is Professor of Statistical Epidemiology, Division of Biostatistics, School of Medicine, University of Leeds, Leeds, UK. Having completed a single honours degree in Mathematical Physics (University of Nottingham), he undertook a PhD in Mathematical Modelling (University of Aston in Birmingham), before initially embarking upon a career as a self-employed Systems and Data Analyst and Computer Programmer, and eventually becoming an academic in biomedicine. Academic posts include systems and data analyst of UK regional routine hospital data in the Department of Public Health and Epidemiology, University of Birmingham; Head of Biostatistics at the Eastman Dental Institute, University College London; and founder and Head of the Division of Biostatistics, School of Medicine, University of Leeds. His research has consistently focused on developing and promoting robust and sophisticated modelling methodologies for non-experimental (and sometimes large and complex) observational data within biomedicine, leading to extensive publications in dental, medical, epidemiological and statistical journals.

"The graphical explanations proposed are quite convincing and these tools should be more exploited in statistical classes."

—Sophie Donnet, Université Paris-Dauphine, *CHANCE*, 25.4