Data Mining with R: Learning with Case Studies

Purchasing Options

ISBN 9781439810187
Cat# K10510



eBook (VitalSource)
ISBN 9781439876404
Cat# KE13702



  • Covers the main data mining techniques through carefully selected case studies
  • Describes code and approaches that can be easily reproduced or adapted to your own problems
  • Requires no prior experience with R
  • Includes introductions to R and MySQL basics
  • Provides a fundamental understanding of the merits, drawbacks, and analysis objectives of the data mining techniques
  • Offers data and R code on a supporting website


The versatile capabilities and large set of add-on packages make R an excellent alternative to many existing and often expensive data mining tools. Exploring this area from the perspective of a practitioner, Data Mining with R: Learning with Case Studies uses practical examples to illustrate the power of R and data mining.

Assuming no prior knowledge of R or data mining/statistical techniques, the book covers a diverse set of problems that pose different challenges in terms of size, type of data, goals of analysis, and analytical tools. To present the main data mining processes and techniques, the author takes a hands-on approach that utilizes a series of detailed, real-world case studies:

  1. Predicting algae blooms
  2. Predicting stock market returns
  3. Detecting fraudulent transactions
  4. Classifying microarray samples

With these case studies, the author supplies all necessary steps, code, and data.

Web Resource
A supporting website mirrors the do-it-yourself approach of the text. It offers a collection of freely available R source files that encompass all the code used in the case studies. The site also provides the data sets from the case studies as well as an R package of several functions.

Table of Contents

How to Read This Book
A Short Introduction to R
A Short Introduction to MySQL

Predicting Algae Blooms
Problem Description and Objectives
Data Description
Loading the Data into R
Data Visualization and Summarization
Unknown Values
Obtaining Prediction Models
Model Evaluation and Selection
Predictions for the 7 Algae

Predicting Stock Market Returns
Problem Description and Objectives
The Available Data
Defining the Prediction Tasks
The Prediction Models
From Predictions into Actions
Model Evaluation and Selection
The Trading System

Detecting Fraudulent Transactions
Problem Description and Objectives
The Available Data
Defining the Data Mining Tasks
Obtaining Outlier Rankings

Classifying Microarray Samples
Problem Description and Objectives
The Available Data
Gene (Feature) Selection
Predicting Cytogenetic Abnormalities



Index of Data Mining Topics

Index of R Functions

Editorial Reviews

This is certainly one of the best books for a direct implementation of data mining algorithms. Another good point of the book is that for most of the problems there are different ways to solve them. … an invaluable resource for data miners, R programmers, as well as people involved in fields such as fraud detection and stock market prediction. If you’re serious about data mining and want to learn from experiences in the field, don’t hesitate!
—Sandro Saitta, Data Mining Research blog, May 2011

If you want to learn how to analyze your data with a free software package that has been built by expert statisticians and data miners, this is your book. A broad range of real-world case studies highlights the breadth and depth of the R software.
—Bernhard Pfahringer, University of Waikato, New Zealand

Both R novices and experts will find this a great reference for data mining.
—Intelligent Trading blog and R-bloggers, November 2010