1st Edition

Introduction to Functional Data Analysis

By Piotr Kokoszka, Matthew Reimherr Copyright 2018
    306 Pages
    by Chapman & Hall


    Introduction to Functional Data Analysis provides a concise textbook introduction to the field. It explains how to analyze functional data, both at exploratory and inferential levels. It also provides a systematic and accessible exposition of the methodology and the required mathematical framework.





    The book can be used as a textbook for a semester-long course on FDA for advanced undergraduate or MS statistics majors, as well as for MS and PhD students in other disciplines, including applied mathematics, environmental science, public health, medical research, geophysical sciences and economics. It can also be used for self-study and as a reference for researchers in those fields who wish to acquire a solid understanding of FDA methodology and practical guidance for its implementation. Each chapter contains plentiful examples of relevant R code and theoretical and data analytic problems.





    The material of the book can be roughly divided into four parts of approximately equal length: 1) basic concepts and techniques of FDA, 2) functional regression models, 3) sparse and dependent functional data, and 4) introduction to the Hilbert space framework of FDA. The book assumes an advanced undergraduate background in calculus, linear algebra, distributional probability theory, foundations of statistical inference, and some familiarity with R programming. Other required statistics background is provided in scalar settings before the related functional concepts are developed. Most chapters end with references to more advanced research for those who wish to gain a more in-depth understanding of a specific topic.

    First steps in the analysis of functional data



    Basis expansions



    Sample mean and covariance



    Principal component functions



    Analysis of BOA stock returns



    Diffusion tensor imaging



    Problems




    Further topics in exploratory FDA



    Derivatives



    Penalized smoothing



    Curve alignment



    Further reading



    Problems




    Mathematical framework for functional data



    Square integrable functions



    Random functions



    Linear transformations




    Scalar-on-function regression



    Examples



    Review of standard regression theory



    Difficulties specific to functional regression



    Estimation through a basis expansion



    Estimation with a roughness penalty



    Regression on functional principal components



    Implementation in the refund package



    Nonlinear scalar-on-function regression



    Problems




    Functional response models



    Least squares estimation and application to angular motion



    Penalized least squares estimation



    Functional regressors



    Penalized estimation in the refund package



    Estimation based on functional principal components



    Test of no effect



    Verification of the validity of a functional linear model



    Extensions and further reading



    Problems



    Functional generalized linear models



    Background



    Scalar-on-function GLM's



    Functional response GLM



    Implementation in the refund package



    Application to DTI



    Further reading



    Problems




    Sparse FDA



    Introduction



    Mean function estimation



    Covariance function estimation



    Sparse functional PCA



    Sparse functional regression



    Problems




    Functional time series



    Fundamental concepts of time series analysis



    Functional autoregressive process



    Forecasting with the Hyndman-Ullah method



    Forecasting with multivariate predictors



    Long-run covariance function



    Testing stationarity of functional time series



    Generation and estimation of the FAR(1) model using package fda



    Conditions for the existence of the FAR(1) process



    Further reading and other topics



    Problems




    Spatial functional data and models



    Fundamental concepts of spatial statistics



    Functional spatial fields



    Functional kriging



    Mean function estimation



    Implementation in the R package geofd



    Other topics and further reading



    Problems




    Elements of Hilbert space theory



    Hilbert space



    Projections and orthonormal sets



    Linear operators



    Basics of spectral theory



    Tensors



    Problems





    Random functions



    Random elements in metric spaces



    Expectation and covariance in a Hilbert space



    Gaussian functions and limit theorems



    Functional principal components



    Problems




    Inference from a random sample



    Consistency of sample mean and covariance

    Biography

    Piotr Kokoszka is a professor of statistics at Colorado State University. His research interests include functional data analysis, with emphasis on dependent data structures, and applications to geosciences and finance. He is a coauthor of the monograph Inference for Functional Data with Applications (with L. Horváth). He is an associate editor of several journals, including Computational Statistics and Data Analysis, Journal of Multivariate Analysis, Journal of Time Series Analysis, and Scandinavian Journal of Statistics.





    Matthew Reimherr is an assistant professor of statistics at Pennsylvania State University. His research interests include functional data analysis, with emphasis on longitudinal studies and applications to genetics and public health. He is an associate editor of Statistical Modeling.

    "This well-written book provides a great and intuitive introduction to functional data analysis (FDA) which has emerged as an important area in statistics and found tons of scientific applications...This book succeeds at introducing this novel statistical concept and methodology while keeps the level of mathematical and statistical sophistication required to understand at the level of an introductory graduate-level course, which makes for pleasant reading. A nice feature of the book is its strong focus on implementation using R, which makes it a great candidate of textbooks or reference books for (master-level) graduate students and applied researchers...Some unique features of this book as compared to existing ones include (1) its strong focus on implementation using R; (2) chapters on Sparse FDA, generalized functional linear models, functional time series, and spatial functional data; (3) well-designed exercises that can be used as homework problems."

    ~Xianyang Zhang, Texas A&M University

    "The main advantage of the book is its emphasis introducing the material through realistic examples and computational tools, while also providing mathematical guidance for the methodologies. Also, important topics like functional time series and spatial functional data are not adequately covered in comparable texts like Ramsay and Silverman, Ramsay and Hooker, Ferraty and Vieu, and Hsing and Eubank. In that respect, the book offers additional and practically relevant material and perspective."

    ~Debashis Paul, University of California, Davis

    "The classic tools from the field of functional data analysis are introduced comprehensively and immediately put into a framework of potential application. I would probably advise any reader that is new to functional data analysis to start by reading this book."

    ~Claudia Klüppelberg, Technische Universität München

    "Being more advanced and up to date than the Ramsay and Silverman, it complements various topics that are just briefly mentioned or not covered at all by Ramsay and Silverman."

    ~Laura Sangalli, Politecnico di Milano

    "As a relatively young subfield of statistics, functional data analysis (FDA) has not had a large glut of textbooks pertaining to it. The most famous of the FDA books is the classic text by J. O. Ramsay and B. W. Silverman [Functional data analysis, Springer Ser. Statist., Springer, New York, 1997; second edition, 2005; MR2168993], which introduced many statisticians to the area. Ramsay and Silverman [Applied functional data analysis, Springer Ser. Statist., Springer, New York, 2002; MR1910407] provided a useful collection of FDA case studies, and Ramsay, G. Hooker and S. Graves [Functional data analysis with R and MATLAB, Use R, Springer, New York, 2009, doi:10.1007/978-0-387-98185-7] presented R and MATLAB code for analyzing real functional data sets. [F. Ferraty and P. Vieu, Nonparametric functional data analysis, Springer Ser. Statist., Springer, New York, 2006; MR2229687] and [T. Hsing and R. L. Eubank, Theoretical foundations of functional data analysis, with an introduction to linear operators, Wiley Ser. Probab. Stat., Wiley, Chichester, 2015; MR3379106] are well-respected theoretical presentations of FDA.

    This book by Kokoszka and Reimherr provides a nice mix of foundational material, accessible theory, and practical examples (including much R code). It is a valuable addition to the FDA literature, and is perhaps an ideal choice of a course textbook for either an undergraduate or graduate course in FDA, whereas several of the other textbooks are more valuable as references for researchers and practitioners than as tutorials for learners. At the end of each chapter is a nice variety of problems that instructors could use for homework assignments.

    Chapter 1 introduces basic terminology related to FDA, such as the ubiquitous tool of basis expansion and the distinction between dense and sparse functional data. Summary statistics and plots (sample mean and covariance functions, principal components analysis (PCA), functional boxplots) for FDA are briefly presented. Chapter 2 continues basic FDA topics with a discussion of derivative information, penalized smoothing, and alignment/registration of curves.

    The theoretical underpinnings of FDA are presented quickly in Chapter 3, where topics such as square integrable functions, random functions following some distribution, and operator theory are defined briefly. A fuller coverage of theoretical concerns is saved for (the optional in a course setting) Chapters 10 and 11. The heart of the book is Chapters 4 through 9, which cover functional linear models in detail, before moving on to specialized FDA topics such as sparse FDA, functional time series, and spatial functional data.

    Scalar-on-function regression, in which the response is a scalar and the predictor is a function, is treated in Chapter 4, and illustrated via the use of the refund package in R. Nonlinear scalar-on-function regression is briefly mentioned. Chapter 5 covers both the function-on-scalar regression case and the fully functional regression model in which both response and predictor are functions. Testing and validation of the functional linear model are also shown. Chapter 6 covers functional generalized linear models (GLMs) which have a nonnormal scalar response and a functional predictor. The somewhat nebulous situation with functional-response GLMs is briefly covered as well.

    The next chapter deals with sparse functional data, and presents methods for mean function estimation, covariance function estimation, PCA, and regression in the sparse case when relatively few points are measured for each observed curve. Functional time series occur when the sample functions are observed sequentially over time rather than cross-sectionally. The assumption of independent functional data fails in this case, and Chapter 8 presents a functional autoregressive model for such data that can be used for forecasting. Spatial functional data may commonly be encountered in geostatistics when curves are observed both over time and at various spatial locations. Chapter 9 discusses models for such data and prediction using functional kriging.

    Chapter 12 discusses treating a functional data set as a sample from some population of functions and performing inference on the population. Of particular interest are the methods presented for formal hypothesis tests and confidence bands about the population mean function.

    Clustering and classification of functional data are not discussed in detail in this book, nor is FDA on manifolds, although references are given to guide readers to recent research in these areas."

    ~David Benner Hitchcock