- Focuses on approaches specifically for mining graph data, such as the use of graph kernels
- Requires no prerequisites in mathematics or data mining
- Provides numerous worked examples with R source code available online
- Includes exercises and real-world applications at the end of each chapter
- Offers lecture slides on the first author's website

*Solutions manual available upon qualifying course adoption*

*Discover Novel and Insightful Knowledge from Data Represented as a Graph*

*Hands-On Application of Graph Data Mining*

Each chapter in the book focuses on a graph mining task, such as link analysis, cluster analysis, and classification. Through applications using real data sets, the book demonstrates how computational techniques can help solve real-world problems. The applications covered include network intrusion detection, tumor cell diagnostics, face recognition, predictive toxicology, mining metabolic and protein-protein interaction networks, and community detection in social networks.

*Develops Intuition through Easy-to-Follow Examples and Rigorous Mathematical Foundations*

Every algorithm and example is accompanied by R code. This lets readers see how the algorithmic techniques correspond to the process of graph data analysis and apply the graph mining techniques in practice. The text also gives a rigorous, formal explanation of the mathematics underlying each technique.

*Makes Graph Mining Accessible to Various Levels of Expertise*

Assuming no prior knowledge of mathematics or data mining, this self-contained book is accessible to students, researchers, and practitioners of graph data mining. It is suitable as a primary textbook for graph mining or as a supplement to a standard data mining course. It can also be used as a reference for researchers in computer, information, and computational science as well as a handy guide for data analytics practitioners.

**Introduction** *Kanchana Padmanabhan, William Hendrix, and Nagiza F. Samatova*

Graph Mining Applications

Book Structure

**An Introduction to Graph Theory** *Stephen Ware*

What Is a Graph?

Vertices and Edges

Comparing Graphs

Directed Graphs

Families of Graphs

Weighted Graphs

Graph Representations

**An Introduction to R** *Neil Shah*

What Is R?

What Can R Do?

R Packages

Why Use R?

Common R Functions

R Installation

**An Introduction to Kernel Functions** *John Jenkins*

Kernel Methods on Vector Data

Extending Kernel Methods to Graphs

Choosing Suitable Graph Kernel Functions

Kernels in This Book

Analyzing Links

Metrics for Analyzing Networks

The PageRank Algorithm

Hyperlink-Induced Topic Search (HITS)

Link Prediction

Applications

**Graph-Based Proximity Measures** *Kevin A. Wilson, Nathan D. Green, Laxmikant Agrawal, Xibin Gao, Dinesh Madhusoodanan, Brian Riley, and James P. Sigmon*

Defining the Proximity of Vertices in Graphs

Evaluating Relatedness Using Neumann Kernels

Applications

**Frequent Subgraph Mining** *Brent E. Harrison, Jason C. Smith, Stephen G. Ware, Hsiao-Wei Chen, Wenbin Chen, and Anjali Khatri*

About Frequent Subgraph Mining

The gSpan Algorithm

The SUBDUE Algorithm

Mining Frequent Subtrees with SLEUTH

Applications

**Cluster Analysis** *Kanchana Padmanabhan, Brent Harrison, Kevin Wilson, Michael L. Warren, Katie Bright, Justin Mosiman, Jayaram Kancherla, Hieu Phung, Benjamin Miller, and Sam Shamseldin*

Introduction

Minimum Spanning Tree Clustering

Shared Nearest Neighbor Clustering

Betweenness Centrality Clustering

Highly Connected Subgraph Clustering

Maximal Clique Enumeration

Clustering Vertices with Kernel

Application

How to Choose a Clustering Technique

**Classification** *Srinath Ravindran, John Jenkins, Huseyin Sencan, Jay Prakash Goel, Saee Nirgude, Kalindi K. Raichura, Suchetha M. Reddy, and Jonathan S. Tatagiri*

Overview of Classification

Classification of Vector Data: Support Vector Machines

Classifying Graphs and Vertices

Applications

**Dimensionality Reduction** *Madhuri R. Marri, Lakshmi Ramachandran, Pradeep Murukannaiah, Padmashree Ravindra, Amrita Paul, Da Young Lee, David Funk, Shanmugapriya Murugappan, and William Hendrix*

Multidimensional Scaling

Kernel Principal Component Analysis

Linear Discriminant Analysis

Applications

**Graph-Based Anomaly Detection** *Kanchana Padmanabhan, Zhengzhang Chen, Sriram Lakshminarasimhan, Siddarth Shankar Ramaswamy, and Bryan Thomas Richardson*

Types of Anomalies

Random Walk Algorithm

GBAD Algorithm

Tensor-Based Anomaly Detection Algorithm

Applications

**Performance Metrics for Graph Mining Tasks** *Kanchana Padmanabhan and John Jenkins*

Introduction

Supervised Learning Performance Metrics

Unsupervised Learning Performance Metrics

Optimizing Metrics

Statistical Significance Techniques

Model Comparison

Handling the Class Imbalance Problem in Supervised Learning

Other Issues

Application Domain-Specific Measures

**Introduction to Parallel Graph Mining** *William Hendrix, Mekha Susan Varghese, Nithya Natesan, Kaushik Tirukarugavur Srinivasan, Vinu Balajee, and Yu Ren*

Parallel Computing Overview

Embarrassingly Parallel Computation

Calling Parallel Codes in R

Creating Parallel Codes in R Using Rmpi

Practical Issues in Parallel Programming

**Index**

Exercises and Bibliography appear at the end of each chapter.

Nagiza F. Samatova is an associate professor of computer science at North Carolina State University and a senior research scientist at Oak Ridge National Laboratory.