Statistical Data Mining Using SAS Applications, Second Edition describes statistical data mining concepts and demonstrates the features of user-friendly data mining SAS tools. Integrating the statistical and graphical analysis tools available in SAS systems, the book provides complete statistical data mining solutions that require neither writing SAS program code nor using the point-and-click approach. Each chapter emphasizes step-by-step instructions for using SAS macros and interpreting the results. Compiled data mining SAS macro files are available for download on the author’s website. By following the step-by-step instructions and downloading the SAS macros, analysts can perform complete data mining analyses quickly and effectively.
New to the Second Edition—General Features
- Access to SAS macros directly from the desktop
- Compatible with SAS version 9, SAS Enterprise Guide, and SAS Learning Edition
- Reorganization of all help files to an appendix
- Ability to create publication quality graphics
- Macro-call error check
New Features in These SAS-Specific Macro Applications
- Converting PC data files to SAS data (EXLSAS2 macro)
- Randomly splitting data (RANSPLIT2)
- Frequency analysis (FREQ2)
- Univariate analysis (UNIVAR2)
- PCA and factor analysis (FACTOR2)
- Multiple linear regressions (REGDIAG2)
- Logistic regression (LOGIST2)
- CHAID analysis (CHAID2)
Requiring no experience with SAS programming, this resource supplies instructions and tools for quickly performing exploratory statistical methods, regression analysis, logistic regression, multivariate methods, and classification analysis. It presents an accessible, SAS macro-oriented approach while offering comprehensive data mining solutions.
Data Mining: A Gentle Introduction
Introduction
Data Mining: Why It Is Successful in the IT World
Benefits of Data Mining
Data Mining: Users
Data Mining: Tools
Data Mining: Steps
Problems in the Data Mining Process
SAS Software: The Leader in Data Mining
Introduction of User-Friendly SAS Macros for Statistical Data Mining
Preparing Data for Data Mining
Introduction
Data Requirements in Data Mining
Ideal Structures of Data for Data Mining
Understanding the Measurement Scale of Variables
Entire Database or Representative Sample
Sampling for Data Mining
User-Friendly SAS Applications Used in Data Preparation
Exploratory Data Analysis
Introduction
Exploring Continuous Variables
Data Exploration: Categorical Variable
SAS Macro Applications Used in Data Exploration
Unsupervised Learning Methods
Introduction
Applications of Unsupervised Learning Methods
Principal Component Analysis (PCA)
Exploratory Factor Analysis (EFA)
Disjoint Cluster Analysis (DCA)
Biplot Display of PCA, EFA, and DCA Results
PCA and EFA Using SAS Macro FACTOR2
Disjoint Cluster Analysis Using SAS Macro DISJCLS2
Supervised Learning Methods: Prediction
Introduction
Applications of Supervised Predictive Methods
Multiple Linear Regression Modeling
Binary Logistic Regression Modeling
Ordinal Logistic Regression
Survey Logistic Regression
Multiple Linear Regression Using SAS Macro REGDIAG2
Lift Chart Using SAS Macro LIFT2
Scoring New Regression Data Using the SAS Macro RSCORE2
Logistic Regression Using SAS Macro LOGIST2
Scoring New Logistic Regression Data Using the SAS Macro LSCORE2
Case Study 1: Modeling Multiple Linear Regressions
Case Study 2: If-Then Analysis and Lift Charts
Case Study 3: Modeling Multiple Linear Regression with Categorical Variables
Case Study 4: Modeling Binary Logistic Regression
Case Study 5: Modeling Binary Multiple Logistic Regression
Case Study 6: Modeling Ordinal Multiple Logistic Regression
Supervised Learning Methods: Classification
Introduction
Discriminant Analysis
Stepwise Discriminant Analysis
Canonical Discriminant Analysis
Discriminant Function Analysis
Applications of Discriminant Analysis
Classification Tree Based on CHAID
Applications of CHAID
Discriminant Analysis Using SAS Macro DISCRIM2
Decision Tree Using SAS Macro CHAID2
Case Study 1: Canonical Discriminant Analysis and Parametric Discriminant Function Analysis
Case Study 2: Nonparametric Discriminant Function Analysis
Case Study 3: Classification Tree Using CHAID
Advanced Analytics and Other SAS Data Mining Resources
Introduction
Artificial Neural Network Methods
Market Basket Analysis
SAS Software: The Leader in Data Mining
Appendix I: Instruction for Using the SAS Macros
Appendix II: Data Mining SAS Macro Help Files
Appendix III: Instruction for Using the SAS Macros with Enterprise Guide Code Window
Index
A Summary and References appear at the end of each chapter.
Biography
George Fernandez is a professor of applied statistical methods and the director of the Center for Research Design and Analysis at the University of Nevada in Reno.
Its key features include the provision of case studies throughout the sections, downloadable macros and instructions on how to run them. … The step-by-step instructions and the graphical representations of data make it particularly useful to those wishing to communicate complex and technical data to a largely non-specialist audiences.
—Kassim S. Mwitondi, Journal of Applied Statistics, 2012
If I had to recommend a good introduction to data mining, I would choose this one.
—J. A. Pardo, Complutense University of Madrid, Madrid, Spain, in Statistical Papers, 2012
Like the first edition of the book, this new edition provides a high-level introduction to some important concepts and algorithms in data mining. … the author presents broad statistical data mining solutions without writing SAS program codes. One of the nicest features of this book is that it gives access to SAS macros directly from the desktop and offers to create publication quality graphs. … this new edition provides a simple and straightforward introduction to data mining, along with a number of detailed, worked case studies.
—Technometrics, February 2011
Praise for the First Edition:
The macros integrate nicely with SAS’s output delivery system … . this is a book that could serve as an easy-to-read introduction to some classical statistical techniques that are used in data mining, and, with the associated macros, provide an opportunity to see those techniques in action.
—Journal of the American Statistical Association, June 2004, Vol. 99, No. 466
Use of these data mining SAS macros facilitated reliable conversion, examination, and analysis of the data, and selection of best statistical models despite the great size of the data sets. …
—Christopher Ross, US Bureau of Land Management
An excellent treatment of data mining using SAS applications is provided in this book. … This book would be suitable for students (as a textbook), data analysts, and experienced SAS programmers. No SAS programming experience, however, is required to benefit from the book.
—Computing Reviews, June 2003
… the book provides a welcome contrast to treatments of data mining that focus on only the most novel aspects of the subject. Dr. Fernandez is quite right in pointing out that a lot of data mining can be carried out by standard statistical methods in familiar packages. The book also has a healthy emphasis on the use of cross validation (a hallmark of data mining). This and other concepts are well illustrated with numerous examples. Finally, the book demonstrates that the fancy (and expensive) user interfaces sported by many data mining work benches are not essential to the data mining enterprise and might even be counterproductive.
—Computational Statistics, 2005