Parallel Computing for Data Science: With Examples in R, C++ and CUDA

1st Edition

Norman Matloff

Chapman and Hall/CRC
Published June 4, 2015
Reference - 328 Pages - 7 B/W Illustrations
ISBN 9781466587014 - CAT# K20322
Series: Chapman & Hall/CRC The R Series



  • Focuses on applications in the data sciences, including statistics, data mining, and machine learning
  • Discusses structures common in data science, such as network data models
  • Emphasizes general principles throughout, such as avoiding factors that reduce the speed of parallel programs
  • Covers the main types of computing platforms: multicore, cluster, and GPU
  • Explains how the Thrust package eases the programming of multicore machines and GPUs and enables the same code to be used on either platform
  • Provides code for the examples on the author’s web page


Parallel Computing for Data Science: With Examples in R, C++ and CUDA is one of the first parallel computing books to concentrate exclusively on parallel data structures, algorithms, software tools, and applications in data science. It includes examples not only from the classic "n observations, p variables" matrix format but also from time series, network graph models, and numerous other structures common in data science. The examples illustrate the range of issues encountered in parallel programming.

With the main focus on computation, the book shows how to compute on three types of platforms: multicore systems, clusters, and graphics processing units (GPUs). It also discusses software packages that span more than one type of hardware and can be used from more than one type of programming language. Readers will find that the foundation established in this book will generalize well to other languages, such as Python and Julia.