Practical Data Mining

Purchasing Options

Hardback
$89.95
ISBN 9781439868362
Cat# K13109
eBook
ISBN 9781439868379
Cat# KE13279
 

Features

    • Provides sequential presentations of practical strategies for a variety of data mining scenarios
    • Organized according to problem for quick-and-easy access and maximum utility
    • Includes workable solutions, with explained technical components, potential problems, and pitfalls along the way
    • Supplies tools to allow readers to plan a principled attack on a data mining problem by filling out checklists and running decision trees

    Summary

    Used by corporations, industry, and government to inform and fuel everything from focused advertising to homeland security, data mining can be a very useful tool across a wide range of applications. Unfortunately, most books on the subject are designed for the computer scientist and statistical illuminati and leave the reader largely adrift in technical waters.

    Revealing the lessons known to the seasoned expert, yet rarely written down for the uninitiated, Practical Data Mining explains the ins and outs of the detection, characterization, and exploitation of actionable patterns in data. This working field manual outlines the what, when, why, and how of data mining and offers an easy-to-follow, six-step spiral process. Catering to IT consultants, professional data analysts, and sophisticated data owners, this systematic yet informal treatment will help readers answer questions such as:

    • What process model should I use to plan and execute a data mining project?
    • How is a quantitative business case developed and assessed?
    • What are the skills needed for different data mining projects?
    • How do I track and evaluate data mining projects?
    • How do I choose the best data mining techniques?

    Helping you avoid common mistakes, the book describes specific genres of data mining practice. Most chapters contain one or more case studies with detailed project descriptions, methods used, challenges encountered, and results obtained. The book includes working checklists for each phase of the data mining process. Your passport to successful technical and planning discussions with management, senior scientists, and customers, these checklists lay out the right questions to ask and the right points to make from an insider’s point of view.

    Visit the book’s webpage for access to additional resources—including checklists, figures, PowerPoint slides, and a small set of simple prototype data mining tools.

    http://www.celestech.com/PracticalDataMining

    Table of Contents

    What Is Data Mining and What Can It Do?
    Introduction
    A Brief Philosophical Discussion
    The Most Important Attribute of the Successful Data Miner: Integrity
    What Does Data Mining Do?
    What Do We Mean By Data?
    Data Complexity
    Computational Complexity
    Summary

    The Data Mining Process
    Introduction
    Discovery and Exploitation
    Eleven Key Principles of Information Driven Data Mining
    Key Principles Expanded
    Type of Models: Descriptive, Predictive, Forensic
    Data Mining Methodologies
    A Generic Data Mining Process
    RAD Skill Set Designators
    Summary

    Problem Definition (Step 1)
    Introduction
    Problem Definition Task 1: Characterize Your Problem
    Problem Definition Checklist
    Candidate Solution Checklist
    Problem Definition Task 2: Characterizing Your Solution
    Problem Definition Case Study
    Summary

    Data Evaluation (Step 2)
    Introduction
    Data Accessibility Checklist
    How Much Data Do You Need?
    Data Staging
    Methods Used for Data Evaluation
    Data Evaluation Case Study: Estimating the Information Content Features
    Some Simple Data Evaluation Methods
    Data Quality Checklist
    Summary

    Feature Extraction and Enhancement (Step 3)
    Introduction: A Quick Tutorial on Feature Space
    Characterizing and Resolving Data Problems
    Principal Component Analysis
    Synthesis of Features
    Degapping
    Summary

    Prototyping Plan and Model Development (Step 4)
    Introduction
    Step 4A: Prototyping Plan
    Prototyping Plan Case Study
    Step 4B: Prototyping/Model Development
    Model Development Case Study
    Summary

    Model Evaluation (Step 5)
    Introduction
    Evaluation Goals and Methods
    What Does Accuracy Mean?
    Summary

    Implementation (Step 6)
    Introduction
    Quantifying the Benefits of Data Mining
    Tutorial on Ensemble Methods
    Getting It Wrong: Mistakes Every Data Miner Has Made
    Summary

    Supervised Learning Genre Section 1—Detecting and Characterizing Known Patterns
    Introduction
    Representative Example of Supervised Learning: Building a Classifier
    Specific Challenges, Problems, and Pitfalls of Supervised Learning
    Recommended Data Mining Architectures for Supervised Learning
    Descriptive Analysis
    Predictive Modeling
    Summary

    Forensic Analysis Genre Section 2—Detecting, Characterizing, and Exploiting Hidden Patterns
    Introduction
    Genre Overview
    Recommended Data Mining Architectures for Unsupervised Learning
    Examples and Case Studies for Unsupervised Learning
    Tutorial on Neural Networks
    Making Syntactic Methods Smarter: The Search Engine Problem
    Summary

    Genre Section 3—Knowledge: Its Acquisition, Representation, and Use
    Introduction to Knowledge Engineering
    Computing with Knowledge
    Inferring Knowledge from Data: Machine Learning
    Summary

    References
    Glossary
    Index

    Author Bio(s)

    Monte F. Hancock, Jr., BA, MS, is Chief Scientist for Celestech, Inc., which has offices in Falls Church, Virginia, and Phoenix, Arizona. He was also a Technical Fellow at Northrop Grumman and Chief Cognitive Research Scientist for CSI, Inc., and worked as a software architect and engineer at Harris Corporation and HRB Singer, Inc. He has over 30 years of industry experience in software engineering and data mining technology development.

    He is also Adjunct Full Professor of Computer Science for the Webster University Space Coast Region, where he serves as Program Mentor for the Master of Science Degree in Computer Science. Monte has served for 26 years on the adjunct faculty in the Mathematics and Computer Science Department of the Hamilton Holt School of Rollins College, Winter Park, Florida, and served 3 semesters as adjunct Instructor in Computer Science at Pennsylvania State University.

    Monte teaches secondary Mathematics, AP Physics, Chemistry, Logic, Western Philosophy, and Church History at New Covenant School, and New Testament Greek at Heritage Christian Academy, both in Melbourne, Florida. He was a mathematics curriculum developer for the Department of Continuing Education of the University of Florida in Gainesville, and serves on the Industry Advisory Panels in Computer Science for both the Florida Institute of Technology and Brevard Community College in Melbourne, Florida. Monte has twice served on panels for the National Science Foundation.

    Monte has served on many program committees for international data mining conferences and was a Session Chair for KDD. He has presented 15 conference papers, edited several book chapters, and co-authored the book Data Mining Explained with Rhonda Delmater (Digital Press, 2001).

    Monte is cited in (among others):

    • "Who’s Who in the World" (2009–2012)
    • "Who’s Who in America" (2009–2012)
    • "Who’s Who in Science and Engineering" (2006–2012)
    • "Who’s Who in the Media and Communication" (1st ed.)
    • "Who’s Who in the South and Southwest" (23rd–25th ed.)
    • "Who’s Who Among America’s Teachers" (2006, 2007)
    • "Who’s Who in Science and Theology" (2nd ed.)

    Editorial Reviews

    Achieves a unique and delicate balance between depth, breadth, and clarity.
    —Stefan Joe-Yen, Cognitive Research Engineer, Northrop Grumman Corporation & Adjunct Professor, Department of Computer Science, Webster University

    Used as a primer for the recent graduate or as a refresher for the grizzled veteran, Practical Data Mining is a must-have book for anyone in the field of data mining and analytics.
    —Chad Sessions, Program Manager, Advanced Analytics Group (AAG)