- Do Storks Bring Babies?
Karl Pearson and spurious correlation

Jerzy Neyman, storks, and babies

Is Poisson regression the solution to the stork problem?

Further reading

- Risks and Rates
What is a rate?

Closed and open populations

Measures of time

Numerators for rates: counts

Numerators that may be mistaken for counts

Prevalence proportions

Denominators for rates: count denominators for incidence proportions (risks)

Denominators for rates: person-time for incidence rates

Rate numerators and denominators for recurrent events

Rate denominators other than person-time

Different incidence rates tell different stories

Potential advantages of incidence rates compared with incidence proportions (risks)

Potential advantages of incidence proportions (risks) compared with incidence rates

Limitations of risks and rates

Radioactive decay: an example of exponential decline

The relevance of exponential decay to human populations

Relationships between rates, risks, and hazards

Further reading

- Rate Ratios and Differences
Estimated associations and causal effects

Sources of bias in estimates of causal effect

Estimation versus prediction

Ratios and differences for risks and rates

Relationships between measures of association in a closed population

The hypothetical TEXCO study

Breaking the rules: Army data for Companies A and B

Relationships between odds ratios, risk ratios, and rate ratios in case-control studies

Symmetry of measures of association

Convergence problems for estimating associations

Some history regarding the choice between ratios and differences

Other influences on the choice between use of ratios or differences

The data may sometimes be used to choose between a ratio or a difference

- The Poisson Distribution
Alpha particle radiation

The Poisson distribution

Prussian soldiers kicked to death by horses

Variances, standard deviations, and standard errors for counts and rates

An example: mortality from Alzheimer’s disease

Large sample P-values for counts, rates, and their differences using the Wald statistic

Comparisons of rates as differences versus ratios

Large sample P-values for counts, rates, and their differences using the score statistic

Large sample confidence intervals for counts, rates, and their differences

Large sample P-values for counts, rates, and their ratios

Large sample confidence intervals for ratios of counts and rates

A constant rate based on more person-time is more precise

Exact methods

What is a Poisson process?

Simulated examples

What if the data are not from a Poisson process? Part 1, overdispersion

What if the data are not from a Poisson process? Part 2, underdispersion

Must anything be rare?

Bicyclist deaths in and

- Criticism of Incidence Rates
Florence Nightingale, William Farr, and hospital mortality rates Debate in

Florence Nightingale, William Farr, and hospital mortality rates Debate in -

Criticism of rates in the British Medical Journal in

Criticism of incidence rates in

- Stratified Analysis: Standardized Rates
Why standardize?

External weights from a standard population: direct standardization

Comparing directly standardized rates

Choice of the standard influences the comparison of standardized rates

Standardized comparisons versus adjusted comparisons from variance-minimizing methods

Stratified analyses

Variations on directly standardized rates

Internal weights from a population: indirect standardization

The standardized mortality ratio (SMR)

Advantages of SMRs compared with SRRs (ratios of directly standardized rates)

Disadvantages of SMRs compared with SRRs (ratios of directly standardized rates)

The terminology of direct and indirect standardization

P-values for directly standardized rates

Confidence intervals for directly standardized rates

P-values and confidence intervals for SRRs (ratios of directly standardized rates)

Large sample P-values and confidence intervals for SMRs

Small sample P-values and confidence intervals for SMRs

Standardized rates should not be used as regression outcomes

Standardization is not always the best choice

- Stratified Analysis: Inverse-variance and Mantel-Haenszel Methods
Inverse-variance methods

Inverse-variance analysis of rate ratios

Inverse-variance analysis of rate differences

Choosing between rate ratios and differences

Mantel-Haenszel methods

Mantel-Haenszel analysis of rate ratios

Mantel-Haenszel analysis of rate differences

P-values for stratified rate ratios or differences

Analysis of sparse data

Maximum-likelihood stratified methods

Stratified methods versus regression

- Collapsibility and Confounding
What is collapsibility?

The British X-Trial: introducing variation in risk

Rate ratios and differences are noncollapsible because exposure influences person-time

Which estimate of the rate ratio should we prefer?

Behavior of risk ratios and differences

Hazard ratios and odds ratios

Comparing risks with other outcome measures

The Italian X-Trial: -levels of risk under no exposure

The American X-Cohort study: -levels of risk in a cohort study

The Swedish X-Cohort study: a collapsible risk ratio in confounded data

A summary of findings

A different view of collapsibility

Practical implications: avoid common outcomes

Practical implications: use risks or survival functions

Practical implications: case-control studies

Practical implications: uniform risk

Practical implications: use all events

- Poisson Regression for Rate Ratios
The Poisson regression model for rate ratios

A short comparison with ordinary linear regression

A Poisson model without variables

A Poisson regression model with one explanatory variable

The iteration log

The header information above the table of estimates

Using a generalized linear model to estimate rate ratios

An alternative parameterization for Poisson models: a regression trick

Further comments about person-time

A short summary

- Poisson Regression for Rate Differences
A regression model for rate differences

Florida and Alaska cancer mortality: regression models that fail

Florida and Alaska cancer mortality: regression models that succeed

A generalized linear model with a power link

A caution

- Linear Regression
Limitations of ordinary least squares linear regression

Florida and Alaska cancer mortality rates

Weighted least squares linear regression

Importance weights for weighted least squares linear regression

Comparison of Poisson, weighted least squares, and ordinary least squares regression

Exposure to a carcinogen: ordinary linear regression ignores the precision of each rate

Differences in homicide rates: simple averages versus population-weighted averages

The place of ordinary least squares linear regression for the analysis of incidence rates

Variance weighted least squares regression

Cautions regarding inverse-variance weights

Why use variance weighted least squares?

A short comparison of weighted Poisson regression, variance weighted least squares, and weighted linear regression

Problems when age-standardized rates are used as outcomes

Ratios and spurious correlation

Linear regression with ln(rate) as the outcome

Predicting negative rates

Summary

- Model Fit
Tabular and graphic displays

Goodness of fit tests: deviance and Pearson statistics

A conditional moment chi-squared test of fit

Limitations of goodness-of-fit statistics

Measures of dispersion

Robust variance estimator as a test of fit

Comparing models using the deviance

Comparing models using Akaike and Bayesian information criteria

Example 1: using Stata’s generalized linear model command to decide between a rate ratio or a rate difference model for the randomized controlled trial of exercise and falls

Example 2: a rate ratio or a rate difference model for hypothetical data regarding the association between fall rates and age

A test of the model link

Residuals, influence analysis, and other measures

Adding model terms to improve fit

A caution

- Adjusting Standard Errors and Confidence Intervals
Estimating the variance without regression

Poisson regression

Rescaling the variance using the Pearson dispersion statistic

Robust variance

Generalized Estimating Equations

Using the robust variance to study length of hospital stay

Computer intensive methods

The bootstrap idea

The bootstrap normal method

The bootstrap percentile method

The bootstrap bias-corrected percentile method

The bootstrap bias-corrected and accelerated method

The bootstrap-t method

Which bootstrap CI is best?

Permutation and Randomization

Randomization to nearly equal groups

Better randomization using the randomized block design of the original study

A summary

- Storks and Babies, Revisited
Neyman’s approach to his data

Using methods for incidence rates

A model that uses the stork/women ratio

- Flexible Treatment of Continuous Variables
The problem

Quadratic splines

Fractional polynomials

Flexible adjustment for time

Which method is best?

- Judging Variation in Size of an Association
An example: shoes and falls

Problem 1: Using subgroup P-values for interpretation

Problem 2: Failure to include main effect terms when interaction terms are used

Problem 3: Incorrectly concluding that there is no variation in association

Problem 4: Interaction may be present on a ratio scale but not on a difference scale, and vice versa

Problem 5: Failure to report all subgroup estimates in an evenhanded manner

- Negative Binomial Regression
Negative binomial regression is a random effects or mixed model

An example: accidents among workers in a munitions factory

Introducing equal person-time in the homicide data

Letting person-time vary in the homicide data

Estimating a rate ratio for the homicide data

Another example using hypothetical data for five regions

Unobserved heterogeneity

Observing heterogeneity in the shoe data

Underdispersion

A rate difference negative binomial regression model

Conclusion

- Clustered Data
Data from fictitious nursing homes

Results from , data simulations for the nursing homes

A single random set of data for the nursing homes

Variance adjustment methods

Generalized estimating equations (GEE)

Mixed model methods

What do mixed models estimate?

Mixed model estimates for the nursing home intervention

Simulation results for some mixed models

Mixed models weight observations differently than Poisson regression

Which should we prefer for clustered data, variance-adjusted or mixed models?

Additional model commands for clustered data

Further reading

- Longitudinal Data
Just use rates

Using rates to evaluate governmental policies

Study designs for governmental policies

A fictitious water treatment and US mortality -

Poisson regression

Population-averaged estimates (GEE)

Conditional Poisson regression, a fixed-effects approach

Negative binomial regression

Which method is best?

Water treatment in only states

Conditional Poisson regression for the -state water-treatment data

A published study

- Matched Data
Matching in case-control studies

Matching in randomized controlled trials

Matching in cohort studies

Matching to control confounding in some randomized trials and cohort studies

A benefit of matching; only matched sets with at least one outcome are needed

Study designs that match a person to themselves

A matched analysis can account for matching ratios that are not constant

Choosing between risks and rates for the crash data in Tables and

Stratified methods for estimating risk ratios for matched data

Odds ratios, risk ratios, cell A, and matched data

Regression analysis of matched data for the odds ratio

Regression analysis of matched data for the risk ratio

Matched analysis of rates with one outcome event

Matched analysis of rates for recurrent events

The randomized trial of exercise and falls; some problems revealed

Final words

- Marginal Methods
What are margins?

Converting logistic regression results into risk ratios or risk differences: marginal standardization

Estimating a rate difference from a rate ratio model

Death by age and sex: a short example

Skunk bite data: a long example

Obtaining the rate difference: crude rates

Using the robust variance

Adjusting for age

Full adjustment for age and sex

Marginal commands for interactions

Marginal methods for a continuous variable

Using a rate difference model to estimate a rate ratio: use the ln scale

- Bayesian Methods
Cancer mortality rate in Alaska

The rate ratio for falling in a trial of exercise

- Exact Poisson Regression
A simple example

A perfectly predicted outcome

Memory problems

A caveat

- Instrumental Variables
The problem: what does a randomized controlled trial estimate?

Analysis by treatment received may yield biased estimates of treatment effect

Using an instrumental variable

Two-stage linear regression for instrumental variables

Generalized method of moments

Generalized method of moments for rates

What does an instrumental variable analysis estimate?

There is no free lunch

Final comments

- Hazards