1st Edition

Data Classification: Algorithms and Applications

Edited By Charu C. Aggarwal
Copyright 2015
    707 Pages
    by Chapman & Hall

    708 Pages 84 B/W Illustrations
    by Chapman & Hall

    Comprehensive Coverage of the Entire Area of Classification

    Research on the problem of classification tends to be fragmented across areas such as pattern recognition, databases, data mining, and machine learning. Addressing the work of these different communities in a unified way, Data Classification: Algorithms and Applications explores the underlying algorithms of classification as well as applications of classification in a variety of problem domains, including text, multimedia, social network, and biological data.

    This comprehensive book focuses on three primary aspects of data classification:

    • Methods: The book first describes common techniques used for classification, including probabilistic methods, decision trees, rule-based methods, instance-based methods, support vector machine methods, and neural networks.

    • Domains: The book then examines specific methods used for data domains such as multimedia, text, time-series, network, discrete sequence, and uncertain data. It also covers large data sets and data streams, given the recent importance of the big data paradigm.

    • Variations: The book concludes with insights into variations of the classification process. It discusses ensembles, rare-class learning, distance function learning, active learning, visual learning, transfer learning, and semi-supervised learning, as well as the evaluation of classifiers.

    Table of Contents

    An Introduction to Data Classification
    Feature Selection for Classification: A Review
    Probabilistic Models for Classification
    Decision Trees: Theory and Algorithms
    Rule-Based Classification
    Instance-Based Learning: A Survey
    Support Vector Machines
    Neural Networks: A Review
    A Survey of Stream Classification Algorithms
    Big Data Classification
    Text Classification
    Multimedia Classification
    Time Series Data Classification
    Discrete Sequence Classification
    Collective Classification of Network Data
    Uncertain Data Classification
    Rare Class Learning
    Distance Metric Learning for Data Classification
    Ensemble Learning
    Semi-Supervised Learning
    Transfer Learning
    Active Learning: A Survey
    Visual Classification
    Evaluation of Classification Methods
    Educational and Software Resources for Data Classification
    Index

    Biography

    Charu C. Aggarwal is a research scientist at the IBM T.J. Watson Research Center. A fellow of the IEEE and the ACM, he is the author or editor of ten books, an associate editor of several journals, and the vice-president of the SIAM Activity Group on Data Mining. Dr. Aggarwal has published over 200 papers, has applied for or been granted over 80 patents, and has received numerous honors, including the IBM Outstanding Technical Achievement Award and the EDBT 2014 Test of Time Award. His research interests include performance analysis, databases, and data mining. He earned a Ph.D. from the Massachusetts Institute of Technology.