Utility-Based Learning from Data provides a pedagogical, self-contained treatment of probability estimation methods, developed coherently from the viewpoint of a decision maker who acts in an uncertain environment. The approach is motivated by the idea that probabilistic models are usually not learned for their own sake; rather, they are used to make decisions. Specifically, the authors adopt the point of view of a decision maker who
(i) operates in an uncertain environment where the consequences of every possible outcome are explicitly monetized,
(ii) bases his decisions on a probabilistic model, and
(iii) builds and assesses his models accordingly.
These assumptions are naturally expressed in the language of utility theory, which is well known from finance and decision theory. By taking this point of view, the book sheds light on and generalizes some popular statistical learning approaches, connecting ideas from information theory, statistics, and finance. It strikes a balance between rigor and intuition, conveying the main ideas to as wide an audience as possible.
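As a rough illustration of this viewpoint (not taken from the book), the sketch below scores two candidate models by the expected log wealth growth of an investor who bets on a horse race in proportion to each model's probabilities. The probabilities, the odds, and the function names are all hypothetical and chosen only for the example; logarithmic utility is assumed as one simple choice among many.

```python
import math

def expected_log_growth(true_probs, model_probs, odds):
    """Expected log wealth growth when the investor bets the fraction
    model_probs[y] of wealth on outcome y and is paid odds[y] per unit bet.
    Outcomes are drawn from true_probs (hypothetical, for illustration)."""
    return sum(p * math.log(q * o)
               for p, q, o in zip(true_probs, model_probs, odds) if p > 0)

def relative_entropy(p, q):
    """Relative entropy (Kullback-Leibler divergence) of p with respect to q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical three-horse race: all numbers below are assumptions.
true_probs = [0.5, 0.3, 0.2]
odds = [2.0, 4.0, 5.0]          # payoff per unit bet on each horse
model_a = [0.5, 0.3, 0.2]       # a model that matches the true probabilities
model_b = [1 / 3, 1 / 3, 1 / 3] # an uninformative model

growth_a = expected_log_growth(true_probs, model_a, odds)
growth_b = expected_log_growth(true_probs, model_b, odds)

# For this log-utility investor, the growth advantage of the correct model
# equals the relative entropy between the two models and does not depend on the odds.
advantage = growth_a - growth_b
print(advantage, relative_entropy(true_probs, model_b))
```

Under these assumptions, the model closer to the true probabilities yields the higher expected growth rate, which is the sense in which a model's quality can be stated in the decision maker's own monetary terms.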
Table of Contents
Notions from Utility Theory
Model Performance Measurement
The Viewpoint of This Book
Organization of This Book
Some Probabilistic Concepts
Entropy and Relative Entropy
The Horse Race
The Basic Idea of an Investor in a Horse Race
The Expected Wealth Growth Rate
The Kelly Investor
Entropy and Wealth Growth Rate
The Conditional Horse Race
Elements of Utility Theory
Beginnings: The St. Petersburg Paradox
Some Popular Utility Functions
The Horse Race and Utility
The Discrete Unconditional Horse Races
Discrete Conditional Horse Races
Continuous Unconditional Horse Races
Continuous Conditional Horse Races
Select Methods for Measuring Model Performance
Rank-Based Methods for Two-State Models
Performance Measurement via Loss Function
A Utility-Based Approach to Information Theory
Interpreting Entropy and Relative Entropy in the Discrete Horse Race Context
(U,O)-Entropy and Relative (U,O)-Entropy for Discrete Unconditional Probabilities
Conditional (U,O)-Entropy and Conditional Relative (U,O)-Entropy for Discrete Probabilities
U-Entropy for Discrete Unconditional Probabilities
Utility-Based Model Performance Measurement
Utility-Based Performance Measures for Discrete Probability Models
Revisiting the Likelihood Ratio
Utility-Based Performance Measures for Discrete Conditional Probability Models
Utility-Based Performance Measures for Probability Density Models
Utility-Based Performance Measures for Conditional Probability Density Models
Monetary Value of a Model Upgrade
Select Methods for Estimating Probabilistic Models
Classical Parametric Methods
Regularized Maximum Likelihood Inference
Minimum Relative Entropy (MRE) Methods
A Utility-Based Approach to Probability Estimation
Discrete Probability Models
Conditional Density Models
Probability Estimation via Relative U-Entropy Minimization
Expressing the Data Constraints in Purely Economic Terms
Model Performance Measures and MRE for Leveraged Investors
Model Performance Measures and MRE for Investors in Incomplete Markets
Utility-Based Performance Measures for Regression Models
Three Credit Risk Models
The Gail Breast Cancer Model
A Text Classification Model
Exercises appear at the end of most chapters.
Craig Friedman is a managing director and head of research in the Quantitative Analytics group at Standard & Poor’s in New York. Dr. Friedman is also a fellow of New York University’s Courant Institute of Mathematical Sciences. He is an associate editor of both the International Journal of Theoretical and Applied Finance and the Journal of Credit Risk.
Sven Sandow is an executive director in risk management at Morgan Stanley in New York. Dr. Sandow is also a fellow of New York University’s Courant Institute of Mathematical Sciences. He holds a Ph.D. in physics and has published articles in scientific journals on various topics in physics, finance, statistics, and machine learning.
The contents of this book reflect Dr. Sandow’s opinions and do not represent the views of Morgan Stanley.
Utility-Based Learning from Data is an excellent treatment of data-driven statistics for decision-making. Friedman and Sandow lucidly describe the connections between different branches of statistics and econometrics, such as utility theory, maximum entropy, and Bayesian analysis. A must-read for serious statisticians!
—Marco Avellaneda, Professor of Mathematics, New York University, and Risk Magazine Quant of the Year 2010
Combining insights from both theory and practice, this is a model trade book about modeling trading books.
—Peter Carr, Global Head of Market Modeling, Morgan Stanley, and Executive Director, Masters in Math Finance, New York University
Utility-Based Learning from Data connects key ideas from utility theory with methods from statistics, machine learning, and information theory. It presents, using decision-theoretic principles, a framework for building models that can be used by decision makers. By adopting the utility-based approach, Friedman and Sandow are able to adapt models to the risk preferences of the model user, while maintaining tractability. It is a much-needed and comprehensive book, which should help put model-building for use by decision makers on more solid ground.
—Gregory Piatetsky-Shapiro, editor of KDnuggets.com, co-founder and past Chair of SIGKDD, and founder of the Knowledge Discovery and Data Mining (KDD) conferences