1st Edition

Statistics in Human Genetics and Molecular Biology

By Cavan Reilly
Copyright 2009
    280 Pages, 24 B/W Illustrations
    Published by Chapman & Hall

    Focusing on the roles of different segments of DNA, Statistics in Human Genetics and Molecular Biology provides a basic understanding of problems arising in the analysis of genetic and genomic data. It presents statistical applications in genetic mapping, DNA/protein sequence alignment, and analyses of gene expression data from microarray experiments.

    The text introduces a diverse set of problems and several approaches that have been used to address them. It discusses basic molecular biology and likelihood-based statistics, along with physical mapping, markers, linkage analysis, parametric and nonparametric linkage, sequence alignment, and feature recognition. The text illustrates methods that are widely used by researchers who analyze genomic data, such as hidden Markov models and the extreme value distribution. It also covers the detection of differential gene expression, as well as classification and cluster analysis using gene expression data sets.

    Ideal for graduate students in statistics, biostatistics, computer science, and related fields in applied mathematics, this text presents various approaches to help students solve problems at the interface of these areas.

    Table of Contents

    Basic Molecular Biology for Statistical Genetics and Genomics
        Mendelian genetics
        Cell biology
        Genes and chromosomes
        DNA
        RNA
        Proteins
        Some basic laboratory techniques
        Bibliographic notes and further reading

    Basics of Likelihood-Based Statistics
        Conditional probability and Bayes theorem
        Likelihood-based inference
        Maximum likelihood estimates
        Likelihood ratio tests
        Empirical Bayes analysis
        Markov chain Monte Carlo sampling
        Bibliographic notes and further reading

    Markers and Physical Mapping
        Introduction
        Types of markers
        Physical mapping of genomes
        Radiation hybrid mapping

    Basic Linkage Analysis
        Production of gametes and data for genetic mapping
        Some ideas from population genetics
        The idea of linkage analysis
        Quality of genetic markers
        Two point parametric linkage analysis
        Multipoint parametric linkage analysis
        Computation of pedigree likelihoods

    Extensions of the Basic Model for Parametric Linkage
        Introduction
        Penetrance
        Phenocopies
        Heterogeneity in the recombination fraction
        Relating genetic maps to physical maps
        Multilocus models

    Nonparametric Linkage and Association Analysis
        Introduction
        Sib-pair method
        Identity by descent
        Affected sib-pair (ASP) methods
        QTL mapping in human populations
        A case study: dealing with heterogeneity in QTL mapping
        Linkage disequilibrium
        Association analysis

    Sequence Alignment
        Sequence alignment
        Dot plots
        Finding the most likely alignment
        Dynamic programming
        Using dynamic programming to find the alignment
        Global versus local alignments

    Significance of Alignments and Alignment in Practice
        Statistical significance of sequence similarity
        Distributions of maxima of sets of iid random variables
        Rapid methods of sequence alignment
        Internet resources for computational biology

    Hidden Markov Models
        Statistical inference for discrete parameter finite state space Markov chains
        Hidden Markov models
        Estimation for hidden Markov models
        Parameter estimation
        Integration over the model parameters

    Feature Recognition in Biopolymers
        Gene transcription
        Detection of transcription factor binding sites
        Computational gene recognition

    Multiple Alignment and Sequence Feature Discovery
        Introduction
        Dynamic programming
        Progressive alignment methods
        Hidden Markov models
        Block motif methods
        Enumeration based methods
        A case study: detection of conserved elements in mRNA

    Statistical Genomics
        Functional genomics
        The technology
        Spotted cDNA arrays
        Oligonucleotide arrays
        Normalization

    Detecting Differential Expression
        Introduction
        Multiple testing and the false discovery rate
        Significance analysis for microarrays
        Model based empirical Bayes approach
        A case study: normalization and differential detection

    Cluster Analysis in Genomics
        Introduction
        Some approaches to cluster analysis
        Determining the number of clusters
        Biclustering

    Classification in Genomics
        Introduction
        Cross-validation
        Methods for classification
        Aggregating classifiers
        Evaluating performance of a classifier

    References

    Index

    Exercises appear at the end of each chapter.

    Biography

    Cavan Reilly is an associate professor of biostatistics at the University of Minnesota.

    Thankfully, some brave souls are willing to serve as guides to rigorous application and understanding of statistical approaches to genetically informative data. Cavan Reilly is among them. … The book is self-contained and well organized, covering a substantial breadth of the core topics in genetics and genomics. … this book is a valuable reference source for both statistics-oriented and human-genetics-oriented researchers and graduate students to learn the specialized methodology for analysis of diverse genetic data. … a useful textbook for beginners trained in applied mathematics and statistics to take in a panoramic snapshot of the very evolving field of statistical genetics and genomics.
    —Xiang-Yang Lou and David B. Allison, Biometrics, December 2011

    Very useful for those taking courses in statistics and geneticists.
    Pediatric Endocrinology Reviews, Vol. 7, No. 4, June 2010