Introduction to Data Science: Data Analysis and Prediction Algorithms with R

1st Edition

Rafael A. Irizarry

Chapman and Hall/CRC
October 22, 2019 Forthcoming
Textbook - 784 Pages
ISBN 9780367357986 - CAT# 316898
Series: Chapman & Hall/CRC Data Science Series

USD$99.95

Summary

The book begins by going over the basics of R and the tidyverse. You learn R throughout the book, but the first part covers the building blocks needed to keep learning during the rest of the book. Data Visualization is the topic of the second part. The growing availability of informative datasets and software tools has led to increased reliance on data visualizations in many fields. We demonstrate how to use ggplot2 to generate graphs and describe important Data Visualization principles. The third part uses several examples to familiarize the reader with Data Wrangling. Among the specific skills learned are web scraping, using regular expressions, and joining and reshaping data tables using the tidyverse tools. In the fourth part, the importance of Statistics in data analysis is demonstrated by answering case study questions using Probability, Statistical Inference, and Regression. In the fifth part, several challenges lead us to introduce Machine Learning. We use the caret package to build prediction algorithms, including K-nearest Neighbors and Random Forests. The final chapter introduces the tools we use on a day-to-day basis in data science projects: RStudio, the UNIX/Linux shell, Git and GitHub, and knitr and R Markdown.

Key Features:

  • Covers the basics of R and the tidyverse
  • Demonstrates how to use ggplot2 to generate graphs and describes important Data Visualization principles
  • Introduces Data Wrangling topics such as web scraping, using regular expressions, and joining and reshaping data tables using the tidyverse tools
  • Illustrates the importance of statistics in data analysis using case studies
  • Uses the caret package to build prediction algorithms including K-nearest Neighbors and Random Forests
  • Includes tools used on a day-to-day basis in data science projects including RStudio, UNIX/Linux shell, Git and GitHub, and knitr and R Markdown