Advances in Data Science and Analytics: Concepts and Paradigms
- Length: 352 pages
- Edition: 1
- Language: English
- Publisher: Wiley-Scrivener
- Publication Date: 2022-11-15
- ISBN-10: 111979188X
- ISBN-13: 9781119791881
Presenting the concepts and advances of data science and analytics, this volume, written and edited by a global team of experts, also covers practical applications that can be used across multiple disciplines and industries, for both the engineer and the student, focusing on machine learning, big data, business intelligence, and analytics.
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data. Data science is related to data mining, deep learning, and big data. Data analytics software is a more focused version of this and can even be considered part of the larger process. Analytics is devoted to realizing actionable insights that can be applied immediately based on existing queries. For the purposes of this volume, data science is an umbrella term that encompasses data analytics, data mining, machine learning, and several other related disciplines. While a data scientist is expected to forecast the future based on past patterns, data analysts extract meaningful insights from various data sources.
Although data mining and other related areas have been around for a few decades, data science and analytics are still evolving quickly, and the processes and technologies change almost daily. This volume provides an overview of some of the most important advances in these areas today, including practical coverage of the daily applications. Valuable as a learning tool for beginners in this area as well as a daily reference for engineers and scientists working in these areas, this is a must-have for any library.
Cover Title Page Copyright Page Contents Preface
Chapter 1 Implementation Tools for Generating Statistical Consequence Using Data Visualization Techniques 1.1 Introduction 1.2 Literature Review 1.3 Tools in Data Visualization 1.4 Methodology 1.4.1 Plotting the Data 1.4.2 Plotting the Model on Data 1.4.3 Quantifying Linear Relationships 1.4.4 Covariance vs. Correlation 1.5 Conclusion References
Chapter 2 Decision Making and Predictive Analysis for Real Time Data 2.1 Introduction 2.2 Data Analytics 2.2.1 Descriptive Analytics 2.2.2 Diagnostic Analytics 2.2.3 Predictive Analytics 2.2.4 Prescriptive Analytics 2.3 Predictive Modeling 2.4 Categories of Predictive Models 2.5 Process of Predictive Modeling 2.5.1 Requirement Gathering 2.5.2 Data Gathering 2.5.3 Data Analysis and Massaging 2.5.4 Machine Learning Statistics 2.5.5 Predictive Modeling 2.5.6 Prediction and Decision Making 2.6 Predictive Analytics Opportunities 2.6.1 Detecting Fraud 2.6.2 Reduction of Risk 2.6.3 Marketing Campaign Optimization 2.6.4 Operation Improvement 2.6.5 Clinical Decision Support System 2.7 Classification of Predictive Analytics Models 2.7.1 Predictive Models 2.7.2 Descriptive Models 2.7.3 Decision Models 2.8 Predictive Analytics Techniques 2.8.1 Predictive Analytics Software 2.8.2 The Importance of Good Data 2.8.3 Predictive Analytics vs. Business Intelligence 2.8.4 Pricing Information 2.9 Data Analysis Tools 2.9.1 Excel 2.9.2 Tableau 2.9.3 Power BI 2.9.4 Fine Report 2.9.5 R & Python 2.10 Advantages & Disadvantages of Predictive Modeling 2.10.1 Advantages 2.10.2 Disadvantages 2.10.2.1 Data Labeling 2.10.2.2 Obtaining Massive Training Datasets 2.10.2.3 The Explainability Problem 2.10.2.4 Generalizability of Learning 2.10.2.5 Bias in Algorithms and Data 2.11 Predictive Analytics Biggest Impact 2.11.1 Predicting Demand 2.11.2 Transformation Using Technology and Process 2.11.3 Improved Pricing 2.11.4 Predictive Maintenance 2.12 Application of Predictive Analytics 2.12.1 Financial and Banking Services 2.12.2 Retail 2.12.3 Health and Insurance 2.12.4 Oil and Gas Utilities 2.12.5 Public Sector 2.13 Future Scope of Predictive Modeling 2.13.1 Technological Advancements 2.13.2 Changes in Work 2.13.3 Risk Mitigation 2.14 Conclusion References
Chapter 3 Optimizing Water Quality with Data Analytics and Machine Learning 3.1 Introduction 3.2 Related Work 3.3 Data Sources and Collection 3.4 Water Demand Forecasting 3.4.1 Network Flow and Zone Demand Estimation 3.4.2 Demand Forecasting 3.4.2.1 Feature Importance 3.4.2.2 Forecast Horizon 3.4.3 Performance Characterization 3.5 Re-Chlorination Optimization 3.5.1 Data 3.5.2 Water Age Estimation 3.5.2.1 Travel Time Estimation 3.5.2.2 Residential Time Estimation 3.5.3 Ammonia Prediction 3.5.4 Optimization Model Definition 3.5.5 Improvements in Customer Water Quality 3.5.6 Plant Dosing Optimization 3.6 Conclusion Acknowledgements References
Chapter 4 Lip Reading Framework using Deep Learning and Machine Learning 4.1 Introduction 4.1.1 Overview 4.1.2 Motivation 4.1.3 Lip Reading System Outcomes and Deliverables 4.2 The Emergence and Definition of the Lip-Reading System 4.2.1 Background of Domain 4.2.2 Identified Problems 4.2.3 Tools and Technologies Used 4.2.4 Implementation Aspects 4.2.4.1 Data Preparation 4.3 Design and Components of Lip-Reading System 4.4 Lip Reading System Architecture 4.5 Testing 4.6 Problems Encountered During Implementation 4.6.1 Assumptions and Constraints 4.7 Conclusion 4.8 Future Work References
Chapter 5 New Perspective to Management, Economic Growth and Debt Nexus Analysis: Evidence from Indian Economy 5.1 Introduction 5.2 Literature Review 5.2.1 External Debt and Economic Growth 5.2.2 Trade Openness, FDI, and Economic Growth 5.2.3 FDI and Economic Growth 5.3 Data 5.3.1 Analytical Framework and Data Description 5.3.2 Theoretical Background and Specifications 5.3.2.1 Model Specification 5.4 Methodology and Findings 5.4.1 Unit Root Testing 5.4.2 Cointegration 5.4.3 Vector Error Correction Model 5.4.4 Long-Run Relationship Estimation 5.4.5 Causality Test 5.5 Conclusion and Policy Implications Declarations Availability of Data and Materials Competing Interests Funding Authors’ Contributions Acknowledgments References
Chapter 6 Data-Driven Delay Analysis with Applications to Railway Networks 6.1 Introduction 6.2 Related Works 6.3 Background Knowledge 6.3.1 Background and Problem Formulation 6.3.1.1 Train Delay 6.3.1.2 Delay Propagation 6.3.2 Preliminaries 6.3.2.1 Bayesian Inference 6.3.2.2 Markov Property 6.4 Delay Propagation Model 6.4.1 Conditional Bayesian Delay Propagation 6.4.1.1 Delay Self-Propagation 6.4.1.2 Incremental Run-Time Delay 6.4.1.3 Incremental Dwell Time Delay 6.4.1.4 Accumulative Departure Delay 6.4.2 Cross-Line Propagation, Backward Propagation and Train Connection Propagation 6.5 Primary Delay Tracing Back 6.5.1 Delay Candidates Selection 6.5.2 Relation Construction 6.5.2.1 Preceding and Following Trains 6.5.2.2 Preceding and Connecting Trains 6.6 Evaluation on Dwell Time Improvement Strategy 6.7 Experiments 6.7.1 Experiment Setting 6.7.2 Temporal Prediction of Delay Propagation 6.7.3 Spatial Prediction of Delay Propagation 6.7.4 Case Study of Primary Delay Tracing Down 6.7.5 Evaluation of Dwell Time Improvement Strategy 6.8 Conclusion References
Chapter 7 Proposing a Framework to Analyze Breast Cancer in Mammogram Images Using Global Thresholding, Gray Level Co-Occurrence Matrix, and Convolutional Neural Network (CNN) 7.1 Introduction & Purpose of Study 7.1.1 Segmentation 7.1.1.1 Types of Segmentation 7.1.2 Compression 7.2 Literature Review & Motivation 7.3 Proposed Work 7.3.1 Algorithm 7.3.2 Explanation 7.3.3 Flowchart 7.4 Observation Tables and Figures 7.5 Conclusion 7.6 Future Work References
Chapter 8 IoT Technologies for Smart Healthcare 8.1 Introduction 8.2 Literature Review 8.2.1 IoT-Based Smart Health 8.2.2 Advantages of Applying IoT in Health 8.3 Findings 8.3.1 Significant Features and Applications of IoT in Health 8.3.1.1 Simultaneous Monitoring and Reporting 8.3.1.2 End-to-End Connectivity and Affordability 8.3.1.3 Data Analysis 8.3.1.4 Tracking, Alerts, and Remote Medical Care 8.3.1.5 Research 8.3.1.6 Patient-Generated Health Data (PGHD) 8.3.1.7 Management of Chronic Diseases and Preventative Care 8.3.1.8 Home-Based and Short-Term Care 8.4 Case Study: CyberMed as an IoT-Based Smart Health Model 8.5 Discussions 8.5.1 Limitations of Adopting IoT in Health 8.5.1.1 Data Security and Privacy 8.5.1.2 Connectivity 8.5.1.3 Compatibility and Data Integration 8.5.1.4 Implementation Cost 8.5.1.5 Complexity and Risk of Errors 8.6 Future Insights 8.7 Conclusions References
Chapter 9 Enhancement of Scalability of SVM Classifiers for Big Data 9.1 Introduction 9.2 Support Vector Machine 9.2.1 Challenges 9.3 Parallel and Distributed Mechanism 9.3.1 Shared-Memory Parallelism 9.4 Distributed Big Data Architecture 9.4.1 Hadoop MapReduce 9.4.2 Spark 9.4.3 AKKA 9.5 Distributed High Performance Computing 9.5.1 GASNet 9.5.2 Charm++ 9.6 GPU Based Parallelism 9.6.1 CUDA 9.6.2 OpenCL 9.7 Parallel and Distributed SVM Algorithms 9.7.1 LS-SVM 9.7.2 Cascade SVM 9.7.3 DC SVM 9.7.4 Parallel Distributed Multiclass SVM Algorithms 9.8 Conclusion and Future Research Directions References
Chapter 10 Electrical Network-Related Incident Prediction Based on Weather Factors 10.1 Introduction 10.2 Related Work 10.3 Methodology 10.3.1 Binary Classification of Incident and Normality 10.3.2 Incident Categorization Using Natural Language Processing 10.3.3 Classification of Multiple Types of Incidents 10.4 Experiments 10.4.1 Data Sets 10.4.2 Evaluation Metrics 10.4.3 Binary Classification 10.4.4 Incident Categorization 10.4.5 Multi-Class Classification 10.5 Conclusion and Future Work Acknowledgements References
Chapter 11 Green IoT: Environment-Friendly Approach to IoT 11.1 Introduction 11.2 G-IoT (Green Internet of Things) 11.3 Layered Architecture of G-IoT 11.3.1 Data Center/Cloud 11.3.2 Data Analytics and Control Applications It 11.3.3 Data Aggregation and Storage 11.3.4 Edge Computing 11.3.5 Communication and Processing Unit 11.4 Techniques for Implementation of G-IoT 11.5 Power Saving Methods Based on Components 11.6 Applications of G-IoT 11.7 Challenges and Future Scope 11.8 Case Study 11.9 Conclusion References
Chapter 12 Big-Data Analytics: A New Paradigm Shift in Micro Finance Industry 12.1 Introduction 12.2 Reality of Area and Transcendent Difficulties 12.2.1 Probable Overlending 12.2.2 Information Imbalance 12.2.3 Retreating Not-for-Profit Sector 12.2.4 Neighbourhood Pressure 12.3 Data Analytics in Microfinance 12.3.1 Types of Data Analytics Used in Microfinance 12.3.2 Use of Big Data in Microfinance Industry 12.3.3 Risk and Data Based Credit Decisions 12.3.4 Product Development and Selection 12.3.5 Product or Service Positioning 12.3.6 M-Commerce and E-Payments 12.3.7 Making Reliable Credit Decisions 12.3.8 Big Data-Driven Model Promises Psychometric Evaluations 12.3.9 Product Build-Up, Service Positioning, and Offering 12.4 Opportunities and Risks in Using Data Analytics 12.5 Risk in Utilizing Big Data 12.6 Conclusion References
Chapter 13 Big Data Storage and Analysis 13.1 Introduction 13.1.1 6 V’s of Big Data 13.1.2 Types of Data 13.1.3 Issues in Handling Big Data 13.2 Hadoop as a Solution to Challenges of Big Data 13.2.1 The Hadoop Ecosystem 13.2.2 Rack Awareness Policy in HDFS 13.3 In-Memory Storage and NoSQL 13.3.1 Key-Value Data Stores 13.3.2 Document Stores 13.3.3 Wide Column Stores 13.3.4 Graph Stores 13.3.5 Multi-Modal Databases 13.4 Advantages of NoSQL Database 13.5 Conclusion References
Chapter 14 A Framework for Analysing Social Media and Digital Data by Applying Machine Learning Techniques for Pandemic Management 14.1 Introduction 14.2 Literature Review 14.3 Understanding Pandemic Analogous to a Disaster 14.4 Application of Machine Learning Techniques at Various Phases of Pandemic Management 14.4.1 Mitigation Phase 14.4.2 Preparedness Phase 14.4.3 Response Phase 14.4.4 Recovery Phase 14.5 Generalized Framework to Apply Machine Learning Techniques for Pandemic Management 14.6 Conclusion References
About the Editors Index EULA