Python for Data Science: A Hands-On Introduction
- Length: 248 pages
- Edition: 1
- Language: English
- Publisher: No Starch Press
- Publication Date: 2022-08-02
- ISBN-10: 1718502206
- ISBN-13: 9781718502208
- Sales Rank: #1405332
A hands-on, real-world introduction to data analysis with the Python programming language, loaded with wide-ranging examples.
Python is an ideal choice for accessing, manipulating, and gaining insights from data of all kinds. Python for Data Science introduces you to the Pythonic world of data analysis with a learn-by-doing approach rooted in practical examples and hands-on activities. You’ll learn how to write Python code to obtain, transform, and analyze data, practicing state-of-the-art data processing techniques for use cases in business management, marketing, and decision support.
You will discover Python’s rich set of built-in data structures for basic operations, as well as its robust ecosystem of open-source libraries for data science, including NumPy, pandas, scikit-learn, matplotlib, and more. Examples show how to load data in various formats, how to streamline, group, and aggregate data sets, and how to create charts, maps, and other visualizations. Later chapters go in-depth with demonstrations of real-world data applications, including using location data to power a taxi service, market basket analysis to identify items commonly purchased together, and machine learning to predict stock prices.
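As a rough illustration of the market basket idea mentioned above, the following minimal sketch counts how often pairs of items appear in the same purchase, using only Python's built-in data structures and standard library. It is not taken from the book: the transactions are invented, `pair_support` is a hypothetical helper name, and simple pair counting stands in for the Apriori algorithm that the book's table of contents lists.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
]


def pair_support(baskets):
    """Return the share of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        counts.update(combinations(sorted(basket), 2))
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items()}


support = pair_support(transactions)
best_pair = max(support, key=support.get)
print(best_pair, support[best_pair])  # ('bread', 'butter') 0.75
```

Counting every pair directly is fine for a handful of baskets, but the number of candidate pairs grows quickly with the number of distinct items, so larger datasets usually call for an algorithm that prunes infrequent itemsets first.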
Title Page
Copyright
About the Author
Introduction
Using Python for Data Science; Who Should Read This Book?; What’s in the Book?
Chapter 1: The Basics of Data
Categories of Data; Unstructured Data; Structured Data; Semistructured Data; Time Series Data; Sources of Data; APIs; Web Pages; Databases; Files; The Data Processing Pipeline; Acquisition; Cleansing; Transformation; Analysis; Storage; The Pythonic Way; Summary
Chapter 2: Python Data Structures
Lists; Creating a List; Using Common List Object Methods; Using Slice Notation; Using a List as a Queue; Using a List as a Stack; Using Lists and Stacks for Natural Language Processing; Making Improvements with List Comprehensions; Tuples; A List of Tuples; Immutability; Dictionaries; A List of Dictionaries; Adding to a Dictionary with setdefault(); Loading JSON into a Dictionary; Sets; Removing Duplicates from Sequences; Performing Common Set Operations; Exercise #1: Improved Photo Tag Analysis; Summary
Chapter 3: Python Data Science Libraries
NumPy; Installing NumPy; Creating a NumPy Array; Performing Element-Wise Operations; Using NumPy Statistical Functions; Exercise #2: Using NumPy Statistical Functions; pandas; pandas Installation; pandas Series; Exercise #3: Combining Three Series; pandas DataFrames; Exercise #4: Using Different Joins; scikit-learn; Installing scikit-learn; Obtaining a Sample Dataset; Loading the Sample Dataset into a pandas DataFrame; Splitting the Sample Dataset into a Training Set and a Test Set; Transforming Text into Numerical Feature Vectors; Training and Evaluating the Model; Making Predictions on New Data; Summary
Chapter 4: Accessing Data from Files and APIs
Importing Data Using Python’s open() Function; Text Files; Tabular Data Files; Exercise #5: Opening JSON Files; Binary Files; Exporting Data to Files; Accessing Remote Files and APIs; How HTTP Requests Work; The urllib3 Library; The Requests Library; Exercise #6: Accessing an API with Requests; Moving Data to and from a DataFrame; Importing Nested JSON Structures; Converting a DataFrame to JSON; Exercise #7: Manipulating Complex JSON Structures; Loading Online Data into a DataFrame with pandas-datareader; Summary
Chapter 5: Working with Databases
Relational Databases; Understanding SQL Statements; Getting Started with MySQL; Defining the Database Structure; Inserting Data into the Database; Querying Database Data; Exercise #8: Performing a One-to-Many Join; Using Database Analytics Tools; NoSQL Databases; Key-Value Stores; Document-Oriented Databases; Exercise #9: Inserting and Querying Multiple Documents; Summary
Chapter 6: Aggregating Data
Data to Aggregate; Combining DataFrames; Grouping and Aggregating the Data; Viewing Specific Aggregations by MultiIndex; Slicing a Range of Aggregated Values; Slicing Within Aggregation Levels; Adding a Grand Total; Adding Subtotals; Exercise #10: Excluding Total Rows from the DataFrame; Selecting All Rows in a Group; Summary
Chapter 7: Combining Datasets
Combining Built-in Data Structures; Combining Lists and Tuples with +; Combining Dictionaries with **; Combining Corresponding Rows from Two Structures; Implementing Different Types of Joins for Lists; Concatenating NumPy Arrays; Exercise #11: Adding New Rows/Columns to a NumPy Array; Combining pandas Data Structures; Concatenating DataFrames; Joining Two DataFrames; Summary
Chapter 8: Creating Visualizations
Common Visualizations; Line Graphs; Bar Graphs; Pie Charts; Histograms; Plotting with Matplotlib; Installing Matplotlib; Using matplotlib.pyplot; Working with Figure and Axes Objects; Exercise #12: Combining Bins into an “Other” Slice; Using Other Libraries with Matplotlib; Plotting pandas Data; Plotting Geospatial Data with Cartopy; Exercise #13: Drawing a Map with Cartopy and Matplotlib; Summary
Chapter 9: Analyzing Location Data
Obtaining Location Data; Turning a Human-Readable Address into Geo Coordinates; Getting the Geo Coordinates of a Moving Object; Spatial Data Analysis with geopy and Shapely; Finding the Closest Object; Finding Objects in a Certain Area; Exercise #14: Defining Two or More Polygons; Combining Both Approaches; Exercise #15: Further Improving the Pick-Up Algorithm; Combining Spatial and Nonspatial Data; Deriving Nonspatial Attributes; Exercise #16: Filtering Data with a List Comprehension; Joining Spatial and Nonspatial Datasets; Summary
Chapter 10: Analyzing Time Series Data
Regular vs. Irregular Time Series; Common Time Series Analysis Techniques; Calculating Percentage Changes; Rolling Window Calculations; Calculating the Percentage Change of a Rolling Average; Multivariate Time Series; Processing Multivariate Time Series; Analyzing Dependencies Between Variables; Exercise #17: Adding More Metrics to Analyze Dependencies; Summary
Chapter 11: Gaining Insights from Data
Association Rules; Support; Confidence; Lift; The Apriori Algorithm; Creating a Transaction Dataset; Identifying Frequent Itemsets; Generating Association Rules; Visualizing Association Rules; Gaining Actionable Insights from Association Rules; Generating Recommendations; Planning Discounts Based on Association Rules; Exercise #18: Mining Real Transaction Data; Summary
Chapter 12: Machine Learning for Data Analysis
Why Machine Learning?; Types of Machine Learning; Supervised Learning; Unsupervised Learning; How Machine Learning Works; Data to Learn From; A Statistical Model; Previously Unseen Data; A Sentiment Analysis Example: Classifying Product Reviews; Obtaining Product Reviews; Cleansing the Data; Splitting and Transforming the Data; Training the Model; Evaluating the Model; Exercise #19: Expanding the Example Set; Predicting Stock Trends; Getting Data; Deriving Features from Continuous Data; Generating the Output Variable; Training and Evaluating the Model; Exercise #20: Experimenting with Different Stocks and New Metrics; Summary
Index
How to download source code?
1. Go to https://nostarch.com/
2. Search for the book title: Python for Data Science: A Hands-On Introduction. If the search returns no results, search for the main title only.
3. Click the book title in the search results.
4. Download the source code.