Hands-on Data Analysis and Visualization with Pandas: Engineer, Analyse and Visualize Data, Using Powerful Python Libraries
- Length: 316 pages
- Edition: 1
- Language: English
- Publisher: BPB Publications
- Publication Date: 2020-08-14
- ISBN-10: 9389845645
- ISBN-13: 9789389845648
- Sales Rank: #4701029
Get familiar with various Supervised, Unsupervised and Reinforcement learning algorithms
Key Features
- Understand the types of Machine learning.
- Get familiar with different Feature extraction methods.
- Get an overview of how Neural Network Algorithms work.
- Learn how to implement Decision Trees and Random Forests.
- The book not only explains the Classification algorithms but also discusses their derivations and mathematical modeling.
Description
This book covers important concepts and topics in Machine Learning. It begins with Data Cleansing and presents an overview of Feature Selection. It then discusses training and testing and cross-validation, and covers algorithms and implementations of the most common Feature Selection techniques. The book then focuses on Linear Regression and Gradient Descent. Important Classification techniques such as K-nearest neighbors, logistic regression, Naïve Bayes, and Linear Discriminant Analysis are covered. It then gives an overview of Neural Networks, explaining the biological background, the limitations of the perceptron, and the backpropagation model. Support Vector Machines and Kernel methods are also included. It then shows how to implement Decision Trees and Random Forests.
Towards the end, the book gives a brief overview of Unsupervised Learning. Various Feature Extraction techniques, such as the Fourier Transform, STFT, and Local Binary Patterns, are covered. The book also discusses Principal Component Analysis and its implementation.
What will you learn
- Learn how to prepare Data for Machine Learning.
- Learn how to implement learning algorithms from scratch.
- Use scikit-learn to implement algorithms.
Who this book is for
The book is designed for undergraduate and postgraduate Computer Science students and for professionals who intend to switch to the fascinating world of Machine Learning. It requires basic knowledge of programming fundamentals, Python in particular.
Table of Contents
- Front matter: Cover Page; Title Page; Copyright Page; Dedication Page; About the Author; About the Reviewer; Acknowledgement; Preface; Errata; Table of Contents
1. Introduction to Data Analysis: Structure; Objective; Inspiration for data analysis; What is data science?; Structured data; Semi-structured data; Unstructured data; Domain expertise; Maths and statistics; Artificial intelligence; Machine learning; Supervised learning; Unsupervised learning; Reinforcement learning; Data infrastructure; Data analysis process; Business requirements; Data collection; Data cleansing; Data exploring and visualization; Data modeling; Model validation and testing; Deployment; Why Python for data analysis?; Growth of Python programming language; Prototype extension; Python libraries for data analysis and installation; JupyterLab; Pandas; Numpy; Matplotlib; Seaborn; Conclusion; Exercises; Answers
2. JupyterLab: Structure; Objective; Introduction to JupyterLab; Architecture; Components; Console; Interface; Creating a new notebook; Cell modes; Code; Markdown; Headings; Lists and unordered lists; Formatting words; Math equations; Load images; Load videos; Raw; Menu; Magic commands; %bash; !; %run; Keyboard shortcuts; Conclusion; Exercises; Answers
3. Python Overview: Structure; Objective; Python, Hello World; Variables and data types; Strings; Integer and float; List; Tuple; Dictionary; Sets; Functions; Docstrings; Function arguments; Required arguments; Keyword arguments; Default arguments; Variable-length arguments; Lambda; List comprehensions; Functional programming using (map, filter, and reduce); Map; Filter; Reduce; Working with datetime objects; Constructing datetime objects; Timedelta; Conclusion; Exercises; Answers
4. Introduction to Numpy: Structure; Objective; Ndarray; Difference between List and Numpy arrays; Storage; Type check; Speed; Copying arrays; Mathematical operations; Trigonometric functions; Statistical operations; Reshaping; Vertical and horizontal stacking of Numpy arrays; Fancy indexing; Indexing with Boolean arrays; Broadcasting; Conclusion; Exercises; Answers
5. Introduction to Pandas: Structure; Objective; Data structures in pandas; Series; Creating a series object with strings; Creating a series object with integers; Creating a series object with booleans; Creating a series object with tuples; Creating a series object with a dictionary; Series attributes; Series methods; DataFrames; Create a dataframe from the dictionary; Create a dataframe object with a list of lists; Creating a dataframe object with a list of dictionaries; DataFrame methods and attributes; Head; Tail; Shape; Index; Count; Columns; Conclusion; Exercises; Answers
6. Data Analysis: Structure; Objective; Handling different file formats; Handling rows and columns; Rename column headers; Select a series from dataframe; Updating a series object; Datatype conversion; Adding new series to a dataframe; Deleting series objects; Filtering rows and columns; Selecting rows and columns using Index; Loc; iLoc; Groupby; The internal implementation of groupby; Splitting the Object; Applying a function; Combine; Aggregate; Transform; Filter; Concatenate DataFrames; Merge DataFrames; Many to one; Inner join; Joins; Left join; Right join; Outer join; Purging duplicate rows; Data Transformations; Apply; Map; ApplyMap; Crosstab; Cleansing the Data; Handling missing values; Find NaN in series and dataframe objects; Select the missing values; Drop missing values from Series; Drop NaN from DataFrame; Fill missing values; Forward and backward fill; Filling the DateTime values; Replacing individual values; Pivot and pivot table; Pivot table; Grouper; Handling large datasets; Optimizing the data types; Memory usage; Datatype subtypes; Optimizing the Numeric columns; Optimizing the Object columns; Converting to Pickle file; Read the file in chunks; Modin pandas; Dask; Conclusion; Exercises; Answers
7. Time Series Analysis: Structure; Objective; Creating time series data; Start and end dates; Periods; Frequency weekly, monthly, hourly and seconds; Converting string-based dates to datetime objects; Unix / Epoch time; Time series analysis using a real-time dataset; Slicing and indexing; Resampling; Plotting; Handling timezones; Shifting or lagging; Handling holidays; Conclusion; Exercises; Answers
8. Introduction to Statistics: Structure; Objective; Population; Sample; Types of data; Categorical variables; Numerical data; Levels of measurement; Qualitative nominal; Qualitative ordinal; Quantitative interval; Example; Quantitative ratio; Example; Descriptive statistics; Measures of central tendency; Mean; Median; Mode; Measures of variability; Range; Quartile; Deviation/Variance; Standard deviation; Coefficient of variation; Covariance; Inferential statistics; What is a distribution?; Standardization (Z-score); Central limit theorem; Standard error; Confidence intervals; Hypothesis Testing; Null hypothesis; Alternate hypothesis; Errors; Type 1 error; Type 2 error; One-tailed test; Two-tailed test; Hypothesis testing steps; The T-Test (Student T-Test); Z-Test; Conclusion; Exercises; Answers
9. Matplotlib: Structure; Objective; Why data visualization?; Matplotlib architecture; Backend layer; Artistic layer; Scripting layer; Chart properties; Figure and axes; Labeling the axes; Formatting; Saving the chart; Adding annotations; Legends; Controlling axes limits; Controlling xticks, y_ticks, and tick_labels; Scatter plot; Bar plot; Histograms; Line plot; Pie chart; Subplots; Conclusion; Exercises; Answers
10. Seaborn: Structure; Objective; Why Seaborn?; Matplotlib versus Seaborn; Seaborn chart properties; Styles; Axes spines; Controlling style properties; Scaling plot elements; Color palette; Qualitative; Sequential; Diverging; About pokemon; Importing libraries and dataset; Visualizing statistical relationships; Plotting categorical variables; Visualizing the distribution of the data; Univariate; Bivariate; Multiplot grids; Pair grid; Conclusion; Exercises; Answers
11. Exploratory Data Analysis: Structure; Objective; A little story, Titanic; Importing libraries and dataset; Handling missing values; Variable identification; Categorical nominal; Categorical ordinal; Numerical continuous; Numerical discrete; Univariate analysis; Bivariate analysis; Scatter plot; Line plot; HeatMap; Bar plot; Multivariate analysis; Handling outliers; Feature Selection; Conclusion; Exercises; Answers