Data Wrangling with R: Load, explore, transform and visualize data for modeling with tidyverse libraries
- Length: 384 pages
- Edition: 1
- Language: English
- Publisher: Packt Publishing
- Publication Date: 2023-02-23
- ISBN-10: 1803235403
- ISBN-13: 9781803235400
- Sales Rank: #1504944
Take your data wrangling skills to the next level by gaining a deep understanding of tidyverse libraries and effectively preparing your data for impressive analysis
Purchase of the print or Kindle book includes a free PDF eBook
Key Features
- Explore state-of-the-art libraries for data wrangling in R and learn to prepare your data for analysis
- Find out how to work with different data types such as strings, numbers, dates, and times
- Build your first model and visualize data with ease using advanced plot types and ggplot2
Book Description
In this information era, where large volumes of data are being generated every day, companies want to get a better grip on it to perform more efficiently than before. This is where skillful data analysts and data scientists come into play, wrangling and exploring data to generate valuable business insights. In order to do that, you’ll need plenty of tools that enable you to extract the most useful knowledge from data.
Data Wrangling with R will help you to gain a deep understanding of ways to wrangle and prepare datasets for exploration, analysis, and modeling. This book enables you to get your data ready for more efficient analyses, develop your first data model, and perform effective data visualization.
The book begins by teaching you how to load and explore datasets. Then, you’ll get to grips with the modern concepts and tools of data wrangling. As data wrangling and visualization are intrinsically connected, you’ll go over best practices to plot data and extract insights from it. The chapters are designed to help you learn about modeling as you build a data science project from end to end and become familiar with RStudio, including building an application with Shiny.
By the end of this book, you’ll have learned how to create your first data model and build an application with Shiny in R.
What you will learn
- Discover how to load datasets and explore data in R
- Work with different types of variables in datasets
- Create basic and advanced visualizations
- Find out how to build your first data model
- Create graphics step by step using ggplot2 in Microsoft Power BI
- Get familiar with building an application in R with Shiny
Who this book is for
If you are a professional data analyst, data scientist, or beginner who wants to learn more about data wrangling, this book is for you. Familiarity with the basic concepts of R programming or any other object-oriented programming language will help you to grasp the concepts taught in this book. Data analysts looking to improve their data manipulation and visualization skills will also benefit immensely from this book.
Cover
Copyright
Contributors
Table of Contents
Preface
Part 1: Load and Explore Data
Chapter 1: Fundamentals of Data Wrangling
- What is data wrangling?
- Why data wrangling?
- Benefits
- The key steps of data wrangling
- Frameworks in Data Science
- Summary
- Exercises
- Further reading
Chapter 2: Loading and Exploring Datasets
- Technical requirements
- How to load files to RStudio
- Loading a CSV file to R
- Tibbles versus Data Frames
- Saving files
- A workflow for data exploration
- Loading and viewing
- Descriptive statistics
- Missing values
- Data distributions
- Visualizations
- Basic Web Scraping
- Getting data from an API
- Summary
- Exercises
- Further reading
Chapter 3: Basic Data Visualization
- Technical requirements
- Data visualization
- Creating single-variable plots
- Dataset
- Boxplots
- Density plot
- Creating two-variable plots
- Scatterplot
- Bar plot
- Line plot
- Working with multiple variables
- Plots side by side
- Summary
- Exercises
- Further reading
Part 2: Data Wrangling
Chapter 4: Working with Strings
- Introduction to stringr
- Detecting patterns
- Subset strings
- Managing lengths
- Mutating strings
- Joining and splitting
- Ordering strings
- Working with regular expressions
- Learning the basics
- Creating frequency data summaries in R
- Regexps in practice
- Creating a contingency table using gmodels
- Text mining
- Tokenization
- Stemming and lemmatization
- TF-IDF
- N-grams
- Factors
- Summary
- Exercises
- Further reading
Chapter 5: Working with Numbers
- Technical requirements
- Numbers in vectors, matrices, and data frames
- Vectors
- Matrices
- Data frames
- Math operations with variables
- apply functions
- Descriptive statistics
- Correlation
- Summary
- Exercises
- Further reading
Chapter 6: Working with Date and Time Objects
- Technical requirements
- Introduction to date and time
- Date and time with lubridate
- Arithmetic operations with datetime
- Time zones
- Date and time using regular expressions (regexps)
- Practicing
- Summary
- Exercises
- Further reading
Chapter 7: Transformations with Base R
- Technical requirements
- The dataset
- Slicing and filtering
- Slicing
- Filtering
- Grouping and summarizing
- Replacing and filling
- Arranging
- Creating new variables
- Binding
- Using data.table
- Summary
- Exercises
- Further reading
Chapter 8: Transformations with Tidyverse Libraries
- Technical requirements
- What is tidy data
- The pipe operator
- Slicing and filtering
- Slicing
- Filtering
- Grouping and summarizing data
- Replacing and filling data
- Arranging data
- Creating new variables
- The mutate function
- Joining datasets
- Left Join
- Right join
- Inner join
- Full join
- Anti-join
- Reshaping a table
- Do more with tidyverse
- Summary
- Exercises
- Further reading
Chapter 9: Exploratory Data Analysis
- Technical requirements
- Loading the dataset to RStudio
- Understanding the data
- Treating missing data
- Exploring and visualizing the data
- Univariate analysis
- Multivariate analysis
- Exploring
- Analysis report
- Report
- Next steps
- Summary
- Exercises
- Further reading
Part 3: Data Visualization
Chapter 10: Introduction to ggplot2
- Technical requirements
- The grammar of graphics
- Data
- Geometry
- Aesthetics
- Statistics
- Coordinates
- Facets
- Themes
- The basic syntax of ggplot2
- Plot types
- Histograms
- Boxplot
- Scatterplot
- Bar plots
- Line plots
- Smooth geometry
- Themes
- Summary
- Exercises
- Further reading
Chapter 11: Enhanced Visualizations with ggplot2
- Technical requirements
- Facet grids
- Map plots
- Time series plots
- 3D plots
- Adding interactivity to graphics
- Summary
- Exercises
- Further reading
Chapter 12: Other Data Visualization Options
- Technical requirements
- Plotting graphics in Microsoft Power BI using R
- Preparing data for plotting
- Creating word clouds in RStudio
- Summary
- Exercises
- Further reading
Part 4: Modeling
Chapter 13: Building a Model with R
- Technical requirements
- Machine learning concepts
- Classification models
- Regression models
- Supervised and unsupervised learning
- Understanding the project
- The dataset
- The project
- The algorithm
- Preparing data for modeling in R
- Exploring the data with a few visualizations
- Selecting the best variables
- Modeling
- Training
- Testing and evaluating the model
- Predicting
- Summary
- Exercises
- Further reading
Chapter 14: Build an Application with Shiny in R
- Technical requirements
- Learning the basics of Shiny
- Get started
- Basic functions
- Creating an application
- The project
- Coding
- Deploying the application on the web
- Summary
- Exercises
- Further reading
Conclusion
References
Index
Other Books You May Enjoy
How to download source code?
1. Go to: https://github.com/PacktPublishing
2. In the Find a repository… box, search for the book title: Data Wrangling with R: Load, explore, transform and visualize data for modeling with tidyverse libraries. If the full title returns no results, search for the main title only.
3. Click the book title in the search results.
4. Click Code to download.