Practical Data Science with Jupyter: Explore Data Cleaning, Pre-processing, Data Wrangling, Feature Engineering and Machine Learning using Python and Jupyter
- Length: 360 pages
- Edition: 1
- Language: English
- Publisher: BPB Publications
- Publication Date: 2021-03-01
- ISBN-10: 9389898064
- ISBN-13: 9789389898064
- Sales Rank: #1673217
Solve business problems with data-driven techniques and easy-to-follow Python examples
Key Features
- Essential coverage of statistics and data science techniques.
- Exposure to Jupyter, PyCharm, and use of GitHub.
- Real use-cases, best practices, and smart techniques on the use of data science for data applications.
Description
This book begins with an introduction to data science, followed by the Python concepts it relies on. You will learn how to interact with various databases and how to implement statistics concepts in Python. You will learn how to import various types of data into Python, which is the first step of the data analysis process. Once you are comfortable importing data, you will clean the dataset and then learn about various visualization charts. The book then shows how to apply feature engineering techniques to make your data more valuable to an algorithm. You will also get to know various machine learning algorithms and concepts, time-series data, and a few real-world case studies. Finally, the book presents some best practices that will help you become industry-ready.
This book focuses on how to practice data science techniques while learning their concepts using Python and Jupyter. It answers the most common question, how to get started with data science, rather than explaining the mathematics and statistics behind machine learning algorithms.
What you will learn
- Rapid understanding of Python concepts for data science applications.
- Understand and practice how to run data analysis with data science techniques and algorithms.
- Learn feature engineering, how to deal with different datasets, and currently trending machine learning algorithms.
- Become self-sufficient to perform data science tasks with the best tools and techniques.
Who this book is for
This book is for a beginner or an experienced professional who is thinking about a career or a career switch to Data Science. Each chapter contains easy-to-follow Python examples.
About the Author
Prateek Gupta is a data enthusiast who loves data-driven technologies. He completed his B.Tech in Computer Science & Engineering and currently works as a Data Scientist at an IT company. He has a total of 9 years of experience in the software industry and currently works in the computer vision area. He has implemented various end-to-end data science projects for fishing, winery, and ecommerce clients. The object detection and recognition models and product recommendation engines he implemented have solved many business problems for various clients. His main areas of interest are natural language processing and computer vision. In his leisure time, he writes posts about artificial intelligence on his blog.
Blog link: http://dsbyprateekg.blogspot.com/
LinkedIn Profile: https://www.linkedin.com/in/prateek-gupta-64203354/
Table of Contents
- Cover Page; Title Page; Copyright Page; Dedication Page; About the Author; Acknowledgement; Preface; Errata
- 1. Data Science Fundamentals: Structure; Objective; What is data?; Structured data; Unstructured data; Semi-structured data; What is data science?; What does a data scientist do?; Real-world use cases of data science; Why Python for data science?; Conclusion
- 2. Installing Software and System Setup: Structure; Objective; System requirements; Downloading Anaconda; Installing the Anaconda on Windows; Installing the Anaconda in Linux; How to install a new Python library in Anaconda?; Open your notebook – Jupyter; Know your notebook; Conclusion
- 3. Lists and Dictionaries: Structure; Objective; What is a list?; How to create a list?; Different list manipulation operations; Difference between Lists and Tuples; What is a Dictionary?; How to create a dictionary?; Some operations with dictionary; Conclusion
- 4. Package, Function, and Loop: Structure; Objective; The help() function in Python; How to import a Python package?; How to create and call a function?; Passing parameter in a function; Default parameter in a function; How to use unknown parameters in a function?; A global and local variable in a function; What is a Lambda function?; Understanding main in Python; while and for loop in Python; Conclusion
- 5. NumPy Foundation: Structure; Objective; Importing a NumPy package; Why use NumPy array over list?; NumPy array attributes; Creating NumPy arrays; Accessing an element of a NumPy array; Slicing in NumPy array; Array concatenation; Conclusion
- 6. Pandas and DataFrame: Structure; Objective; Importing Pandas; Pandas data structures; Series; DataFrame; .loc[] and .iloc[]; Some Useful DataFrame Functions; Handling missing values in DataFrame; Conclusion
- 7. Interacting with Databases: Structure; Objective; What is SQLAlchemy?; Installing SQLAlchemy package; How to use SQLAlchemy?; SQLAlchemy engine configuration; Creating a table in a database; Inserting data in a table; Update a record; How to join two tables; Inner join; Left join; Right join; Conclusion
- 8. Thinking Statistically in Data Science: Structure; Objective; Statistics in data science; Types of statistical data/variables; Mean, median, and mode; Basics of probability; Statistical distributions; Poisson distribution; Binomial distribution; Normal distribution; Pearson correlation coefficient; Probability Density Function (PDF); Real-world example; Statistical inference and hypothesis testing; Conclusion
- 9. How to Import Data in Python?: Structure; Objective; Importing text data; Importing CSV data; Importing Excel data; Importing JSON data; Importing pickled data; Importing a compressed data; Conclusion
- 10. Cleaning of Imported Data: Structure; Objective; Know your data; Analyzing missing values; Dropping missing values; Automatically fill missing values; How to scale and normalize data?; How to parse dates?; How to apply character encoding?; Cleaning inconsistent data; Conclusion
- 11. Data Visualization: Structure; Objective; Bar chart; Line chart; Histograms; Scatter plot; Stacked plot; Box plot; Conclusion
- 12. Data Pre-processing: Structure; Objective; About the case-study; Importing the dataset; Exploratory data analysis; Data cleaning and pre-processing; Feature Engineering; Conclusion
- 13. Supervised Machine Learning: Structure; Objective; Some common ML terms; Introduction to machine learning (ML); Supervised learning; Unsupervised learning; Semi-supervised learning; Reinforcement learning; List of common ML algorithms; Supervised ML fundamentals; Logistic Regression; Decision Tree Classifier; K-Nearest Neighbor Classifier; Linear Discriminant Analysis (LDA); Gaussian Naive Bayes Classifier; Support Vector Classifier; Solving a classification ML problem; About the dataset; Attribute information; Why train/test split and cross-validation?; Solving a regression ML problem; How to tune your ML model?; How to handle categorical variables in sklearn?; The advanced technique to handle missing data; Conclusion
- 14. Unsupervised Machine Learning: Structure; Objective; Why unsupervised learning?; Unsupervised learning techniques; Clustering; K-mean clustering; Hierarchical clustering; t-SNE; Principal Component Analysis (PCA); Case study; Validation of unsupervised ML; Conclusion
- 15. Handling Time-Series Data: Structure; Objective; Why time-series is important?; How to handle date and time?; Transforming a time-series data; Manipulating a time-series data; Comparing time-series growth rates; How to change time-series frequency?; Conclusion
- 16. Time-Series Methods: Structure; Objective; What is time-series forecasting?; Basic steps in forecasting; Time-series forecasting techniques; Autoregression (AR); Moving Average (MA); Autoregressive Moving Average (ARMA); Autoregressive Integrated Moving Average (ARIMA); Seasonal Autoregressive Integrated Moving-Average (SARIMA); Seasonal Autoregressive Integrated Moving-Average with Exogenous Regressors (SARIMAX); Vector Autoregression Moving-Average (VARMA); Holt Winter’s Exponential Smoothing (HWES); Forecast future traffic to a web page; Conclusion
- 17. Case Study-1: Predict whether or not an applicant will be able to repay a loan; Conclusion
- 18. Case Study-2: Build a prediction model that will accurately classify which text messages are spam; Conclusion
- 19. Case Study-3: Build a film recommendation engine; Conclusion
- 20. Case Study-4: Predict house sales in King County, Washington State, USA, using regression; Conclusion
- 21. Python Virtual Environment: Structure; Objective; What is a Python virtual environment?; How to create and activate a virtual environment?; How to open Jupyter notebook with this new environment?; How to set an activated virtual environment in PyCharm IDE?; What is requirements.txt file?; What is README.md file?; Upload your project in GitHub; Conclusion
- 22. Introduction to An Advanced Algorithm - CatBoost: Structure; Objective; What is a Gradient Boosting algorithm?; Introduction to CatBoost; Install CatBoost in Python virtual environment; How to solve a classification problem with CatBoost?; Push your notebook in your GitHub repository; Conclusion
- 23. Revision of All Chapters’ Learning: Conclusion
- Index