Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data

Hardback
ISBN 9781574443448
Cat# SL3445
 

Features

  • Explores GenIQ, a non-statistical machine learning model that is effective in finding the best possible subset of variables
  • Demonstrates profiling techniques for identifying best customers; expands the discussion of the benefits of predictive profiling to include look-alike profiling
  • Applies the CHAID algorithm in unconventional ways to market segment classification and other problems
  • Examines three concepts in model assessment: traditional decile analysis, precision, and separability
  • Exposes weaknesses of decile analysis and offers a new bootstrap approach for measuring database model efficiency

Summary

    Traditional statistical methods are limited in their ability to meet the modern challenge of mining large amounts of data. Data miners, analysts, and statisticians are searching for new data mining techniques with greater predictive power, an attribute critical for reliable models and analyses.
    Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data delivers a collection of successful database marketing methodologies for big data. This compendium solves common database marketing problems by applying new hybrid modeling techniques that combine traditional statistical and new machine learning methods. The book delivers a thorough analysis of these cutting-edge techniques, which include non-statistical machine learning and genetic intelligent hybrid models.
    By following the step-by-step procedures detailed in the text, database marketing professionals can learn how to apply the proper statistical techniques to any database marketing challenge. The practical case studies and examples provided involve real problems and real data, and are taken from a variety of industries, including banking, insurance, finance, retail, and telecommunications.

    Table of Contents

    Introduction


    The Personal Computer and Statistics
    Statistics and Data Analysis
    EDA
    The EDA Paradigm
    EDA Weaknesses
    Small and Big Data
    Data Mining Paradigm
    Statistics and Machine Learning
    Statistical Learning
    References
    Two Simple Data Mining Methods for Variable Assessment
    Correlation Coefficient
    Scatterplots
    Data Mining
    Smoothed Scatterplot
    General Association Test
    Summary
    References
    Logistic Regression: The Workhorse of Database Response Modeling
    Logistic Regression Model
    Case Study
    Logits and Logit Plots
    The Importance of Straight Data
    Re-expressing for Straight Data
    Straight Data for Case Study
    Techniques When Bulging Rule Does Not Apply
    Re-expressing MOS_OPEN
    Assessing the Importance of Variables
    Important Variables for Case Study
    Relative Importance of the Variables
    Best Subset of Variables for Case Study
    Visual Indicators of Goodness of Model Predictions
    Evaluating the Data Mining Work
    Smoothing a Categorical Variable
    Additional Data Mining Work for Case Study
    Summary
    Ordinary Regression: The Workhorse of Database Profit Modeling
    Ordinary Regression Model
    Illustration
    Mini Case Study
    Important Variables for Mini Case Study
    Best Subset of Variables for Case Study
    Summary
    CHAID for Interpreting a Logistic Regression Model
    Logistic Regression Model
    Database Marketing Response Model Case Study
    CHAID
    Multivariable CHAID Trees
    CHAID Market Segmentation
    CHAID Tree Graphs
    Summary
    The Importance of the Regression Coefficient
    The Ordinary Regression Model
    Four Questions
    Important Predictor Variables
    P-values and BIG Data
    Returning to Question #1
    Predictor Variable’s Effect On Prediction
    The Caveat
    Returning to Question #2
    Ranking Predictor Variables By Effect On Prediction
    Returning to Question #3
    Returning to Question #4
    Summary
    Reference
    The Predictive Contribution Coefficient: A Measure of Predictive Importance
    Background
    Illustration of Decision Rule, Predictive Contribution Coefficient
    Calculation of Predictive Contribution Coefficient
    Extra Illustration of Predictive Contribution Coefficient
    Summary
    Reference
    CHAID for Specifying a Model with Interaction Variables
    Interaction Variables
    Strategy for Modeling with Interaction Variables
    Strategy Based on the Notion of a Special Point
    Example of a Response Model with an Interaction Variable
    CHAID for Uncovering Relationships
    Illustration of CHAID Specifying a Model
    An Exploratory Look
    Database Implication
    Summary
    Reference
    Market Segment Classification Modeling with Logistic Regression
    Binary Logistic Regression
    Polychotomous Logistic Regression Model
    Model Building With PLR
    Market Segmentation Classification Model
    Summary
    CHAID as a Method for Filling in Missing Values
    Introduction to the Problem of Missing Data
    Missing-data Assumption
    CHAID Imputation
    Illustration
    CHAID Most-likely Category Imputation for a Categorical Variable
    Summary
    Reference
    Identifying Your Best Customers: Descriptive, Predictive and Look-alike Profiling
    Some Definitions
    Illustration of a Flawed Targeting Effort
    Well-Defined Targeting Effort
    Predictive Profiles
    Continuous Trees
    Look-alike Profiling
    Look-alike Tree Characteristics
    Summary

    Assessment of Database Marketing Models


    Accuracy for Response Model
    Accuracy for Profit Model
    Decile Analysis and Cum Lift for Response Model
    Decile Analysis and Cum Lift for Profit Model
    Precision for Response Model
    Construction of SWMAD
    Separability for Response and Profit Models
    Guidelines for Using Cum Lift, HL/SWMAD and CV
    Summary
    Bootstrapping in Database Marketing: A New Approach for Validating Models
    Traditional Model Validation
    Illustration
    Three Questions
    The Bootstrap
    How to Bootstrap
    Bootstrap Decile Analysis Validation
    Another Question
    Bootstrap Assessment of Model Implementation Performance
    Bootstrap Assessment of Model Efficiency
    Summary
    Reference

    Visualization of Database Models


    Brief History of the Graph
    Star Graph Basics
    Star Graphs for Single Variables
    Star Graphs for Many Variables Considered Jointly
    Profile Curves Method
    Illustration
    Summary
    SAS Code for Star Graphs for Each Demographic Variable About the Deciles
    SAS Code for Star Graphs for Each Decile About the Demographic Variables
    SAS Code for Profile Curves: All Deciles
    Reference
    Genetic Modeling in Database Marketing: The GenIQ Model
    What Is Optimization?
    What Is Genetic Modeling?
    Genetic Modeling: An Illustration
    Parameters for Controlling a Genetic Model Run
    Genetic Modeling: Strengths and Limitations
    Goals of Modeling in Database Marketing
    The GenIQ Response Model
    The GenIQ Profit Model
    Case Study-Response Model
    Case Study-Profit Model
    Summary
    Reference
    Finding the Best Variables for Database Marketing Models
    Background
    Weakness in the Variable Selection Methods
    Goals of Modeling in Database Marketing
    Variable Selection with GenIQ
    Nonlinear Alternative to Logistic Regression Model
    Summary
    Reference

    Interpretation of Coefficient-free Models


    The Linear Regression Coefficient
    Illustration for the Simple Ordinary Regression Model
    The Quasi-Regression Coefficient for Simple Regression Models
    Partial Quasi-RC for the Everymodel
    Quasi-RC for a Coefficient-free Model
    Summary