SQL for Data Analytics: Harness the power of SQL to extract insights from data, 3rd Edition
- Length: 540 pages
- Edition: 3
- Language: English
- Publisher: Packt Publishing
- Publication Date: 2022-08-29
- ISBN-10: 180181287X
- ISBN-13: 9781801812870
- Sales Rank: #0 (See Top 100 Books)
Take your first steps to becoming a fully qualified data analyst by learning how to explore complex datasets
Key Features
- Master each concept through practical exercises and activities
- Discover various statistical techniques to analyze your data
- Implement everything you’ve learned on a real-world case study to uncover valuable insights
Book Description
Every day, businesses operate around the clock, and a huge amount of data is generated at a rapid pace. This book helps you analyze this data and identify key patterns and behaviors that can help you and your business understand your customers at a deep, fundamental level.
SQL for Data Analytics, Third Edition is a great way to get started with data analysis, showing how to effectively sort and process information from raw data, even without any prior experience.
You will begin by learning how to form hypotheses and generate descriptive statistics that can provide key insights into your existing data. As you progress, you will learn how to write SQL queries to aggregate, calculate, and combine SQL data from sources outside of your current dataset. You will also discover how to work with advanced data types, like JSON. By exploring advanced techniques, such as geospatial analysis and text analysis, you will be able to understand your business at a deeper level. Finally, the book lets you in on the secret to getting information faster and more effectively by using advanced techniques like profiling and automation.
By the end of this book, you will be proficient in the efficient application of SQL techniques in everyday business scenarios and looking at data with the critical eye of�analytics professional.
What you will learn
- Use SQL to clean, prepare, and combine different datasets
- Aggregate basic statistics using GROUP BY clauses
- Perform advanced statistical calculations using a WINDOW function
- Import data into a database to combine with other tables
- Export SQL query results into various sources
- Analyze special data types in SQL, including geospatial, date/time, and JSON data
- Optimize queries and automate tasks
- Think about data problems and find answers using SQL
Who this book is for
If you’re a database engineer looking to transition into analytics or a backend engineer who wants to develop a deeper understanding of production data and gain practical SQL knowledge, you will find this book useful. This book is also ideal for data scientists or business analysts who want to improve their data analytics skills using SQL.
Basic familiarity with SQL (such as basic SELECT, WHERE, and GROUP BY clauses) as well as a good understanding of linear algebra, statistics, and PostgreSQL 14 are necessary to make the most of this SQL data analytics book.
SQL for Data Analytics Third edition Preface About the Book About the Authors Audience About the Chapters Conventions Setting up Your Environment Installing PostgreSQL 14 Downloading and Installing PostgreSQL on Windows Setting the PATH Variable Installation on macOS Installing Python Installing Python on Windows Installing Python on Linux Installing Python on macOS Installing Git Installing Git on Windows or macOS Installing Git on Linux Loading the Sample Datasets – Windows Loading the Sample Datasets – Linux Loading the Sample Datasets – macOS Running SQL files Accessing the Code Files 1. Understanding and Describing Data Introduction Data Analytics and Statistics Activity 1.01: Classifying a New Dataset Types of Statistics Methods of Descriptive Statistics Univariate Analysis Data Frequency Distribution Exercise 1.01: Creating a Histogram Quantiles Exercise 1.02: Calculating the Quartiles for Add-On Sales Central Tendency Exercise 1.03: Calculating the Central Tendency of Add-On Sales Dispersion Exercise 1.04: Dispersion of Add-On Sales Bivariate Analysis Scatterplots Linear Trend Analysis and Pearson Correlation Coefficient Exercise 1.05: Calculating the Pearson Correlation Coefficient for Two Variables Interpreting and Analyzing the Correlation Coefficient Time Series Data Activity 1.02: Exploring Dealership Sales Data Working with Missing Data Statistical Significance Testing Common Statistical Significance Tests SQL and Analytics Summary 2. The Basics of SQL for Analytics Introduction The World of Data Types of Data Relational Databases and SQL Advantages and Disadvantages of SQL Databases PostgreSQL Relational Database Management System (RDBMS) Exercise 2.01: Running Your First SELECT Query SELECT Statement The WHERE Clause The AND/OR Clause The IN/NOT IN Clause ORDER BY Clause The LIMIT Clause IS NULL/IS NOT NULL Clause Exercise 2.02: Querying the salespeople Table Using Basic Keywords in a SELECT Query Activity 2.01: Querying the customers Table Using Basic Keywords in a SELECT Query Creating Tables Creating Blank Tables Basic Data Types of SQL Numeric Character Boolean Datetime Data Structures: JSON and Arrays Column Constraints Simple CREATE Statement Exercise 2.03: Creating a Table in SQL Creating Tables with SELECT Updating Tables Adding and Removing Columns Adding New Data Updating Existing Rows Exercise 2.04: Updating the Table to Increase the Price of a Vehicle Deleting Data and Tables Deleting Values from a Row Deleting Rows from a Table Deleting Tables Exercise 2.05: Deleting an Unnecessary Reference Table Activity 2.02: Creating and Modifying Tables for Marketing Operations SQL and Analytics Summary 3. SQL for Data Preparation Introduction Assembling Data Connecting Tables Using JOIN Types of Joins Inner Joins Outer Joins Cross Joins Exercise 3.01: Using Joins to Analyze a Sales Dealership Subqueries Unions Exercise 3.02: Generating an Elite Customer Party Guest List Using UNION Common Table Expressions Cleaning Data The CASE WHEN Function Exercise 3.03: Using the CASE WHEN Function to Get Regional Lists The COALESCE Function The NULLIF Function The LEAST/GREATEST Functions The Casting Function Transforming Data The DISTINCT and DISTINCT ON Functions Activity 3.01: Building a Sales Model Using SQL Techniques Summary 4. Aggregate Functions for Data Analysis Introduction Aggregate Functions Exercise 4.01: Using Aggregate Functions to Analyze Data Aggregate Functions with the GROUP BY Clause The GROUP BY Clause Multiple-Column GROUP BY Exercise 4.02: Calculating the Cost by Product Type Using GROUP BY Grouping Sets Ordered Set Aggregates Aggregate Functions with the HAVING Clause Exercise 4.03: Calculating and Displaying Data Using the HAVING Clause Using Aggregates to Clean Data and Examine Data Quality Finding Missing Values with GROUP BY Measuring Data Uniqueness with Aggregates Activity 4.01: Analyzing Sales Data Using Aggregate Functions Summary 5. Window Functions for Data Analysis Introduction Window Functions The Basics of Window Functions Exercise 5.01: Analyzing Customer Data Fill Rates over Time The WINDOW Keyword Statistics with Window Functions Exercise 5.02: Rank Order of Hiring Window Frame Exercise 5.03: Team Lunch Motivation Activity 5.01: Analyzing Sales Using Window Frames and Window Functions Summary 6. Importing and Exporting Data Introduction The COPY Command Running the psql Command The COPY Statement \COPY with psql Creating Temporary Views Configuring COPY and \COPY Using COPY and \COPY to Bulk Upload Data to Your Database Exercise 6.01: Exporting Data to a File for Further Processing in Excel Using Python with your Database Getting Started with Python Improving PostgreSQL Access in Python with SQLAlchemy and pandas What is SQLAlchemy? Using Python with SQLAlchemy and pandas Reading and Writing to a Database with pandas Writing Data to the Database Using Python Exercise 6.02: Reading, Visualizing, and Saving Data in Python Improving Python Write Speed with COPY Reading and Writing CSV Files with Python Best Practices for Importing and Exporting Data Going Passwordless Activity 6.01: Using an External Dataset to Discover Sales Trends Summary 7. Analytics Using Complex Data Types Introduction Date and Time Data types for Analysis The DATE Data type Transforming Date Data types Intervals Exercise 7.01: Analytics with Time Series Data Performing Geospatial Analysis in PostgreSQL Latitude and Longitude Representing Latitude and Longitude in PostgreSQL Exercise 7.02: Geospatial Analysis Using Array Data types in PostgreSQL Starting with Arrays Exercise 7.03: Analyzing Sequences Using Arrays Using JSON Data types in PostgreSQL JSONB: Pre-Parsed JSON Accessing Data from a JSON or JSONB Field Leveraging the JSON Path Language for JSONB Fields Creating and Modifying Data in a JSONB Field Exercise 7.04: Searching through JSONB Text Analytics Using PostgreSQL Tokenizing Text Exercise 7.05: Performing Text Analytics Performing Text Search Optimizing Text Search on PostgreSQL Activity 7.01: Sales Search and Analysis Summary 8. Performant SQL Introduction The Importance of Highly Efficient SQL Database Scanning Methods Query Planning Exercise 8.01: Interpreting the Query Planner Activity 8.01: Query Planning Index Scanning The B-Tree Index Exercise 8.02: Creating an Index Scan Activity 8.02: Implementing Index Scans The Hash Index Exercise 8.03: Generating Several Hash Indexes to Investigate Performance Activity 8.03: Implementing Hash Indexes Effective Index Use Killing Queries Exercise 8.04: Canceling a Long-Running Query Functions and Triggers Function Definitions Exercise 8.05: Creating Functions without Arguments Activity 8.04: Defining a Largest Sale Value Function Exercise 8.06: Creating Functions with Arguments The \df and \sf commands Activity 8.05: Creating Functions with Arguments Triggers Exercise 8.07: Creating Triggers to Update Fields Activity 8.06: Creating a Trigger to Track Average Purchases Summary 9. Using SQL to Uncover the Truth: A Case Study Introduction Case Study The Scientific Method Exercise 9.01: Preliminary Data Collection Using SQL Techniques Exercise 9.02: Extracting the Sales Information Activity 9.01: Quantifying the Sales Drop Exercise 9.03: Launch Timing Analysis Activity 9.02: Analyzing the Difference in the Sales Price Hypothesis Exercise 9.04: Analyzing Sales Growth by Email Opening Rate Exercise 9.05: Analyzing the Performance of the Email Marketing Campaign Conclusions In-Field Testing Summary Appendix 1. Understanding and Describing Data Activity 1.01: Classifying a New Dataset Activity 1.02: Exploring Dealership Sales Data 2. The Basics of SQL for Analytics Activity 2.01: Querying the customers Table Using Basic Keywords in a SELECT Query Activity 2.02: Creating and Modifying Tables for Marketing Operations 3. SQL for Data Preparation Activity 3.01: Building a Sales Model Using SQL Techniques 4. Aggregate Functions for Data Analysis Activity 4.01: Analyzing Sales Data Using Aggregate Functions 5. Window Functions for Data Analysis Activity 5.01: Analyzing Sales Using Window Frames and Window Functions 6. Importing and Exporting Data Activity 6.01: Using an External Dataset to Discover Sales Trends 7. Analytics Using Complex Data Types Activity 7.01: Sales Search and Analysis 8. Performant SQL Activity 8.01: Query Planning Activity 8.02: Implementing Index Scans Activity 8.03: Implementing Hash Indexes Activity 8.04: Defining a Largest Sale Value Function Activity 8.05: Creating Functions with Arguments Activity 8.06: Creating a Trigger to Track Average Purchases 9. Using SQL to Uncover the Truth: A Case Study Activity 9.01: Quantifying the Sales Drop Activity 9.02: Analyzing the Difference in the Sales Price Hypothesis Hey!
Donate to keep this site alive
How to download source code?
1. Go to: https://github.com/PacktPublishing
2. In the Find a repository… box, search the book title: SQL for Data Analytics: Harness the power of SQL to extract insights from data, 3rd Edition
, sometime you may not get the results, please search the main title.
3. Click the book title in the search results.
3. Click Code to download.
1. Disable the AdBlock plugin. Otherwise, you may not get any links.
2. Solve the CAPTCHA.
3. Click download link.
4. Lead to download server to download.