Big Data and Analytics

Length: 224 pages
Edition: 1
Language: English
Publisher: Notion Press
Publication Date: 2022-01-12
ISBN-10: B09QF8JN82
ISBN-13: 9798885304870
Sales Rank: #9918439

Big data is a state-of-the-art technology that is changing system design and decision-making. Hadoop is a distributed framework that enables the effective management of big data. This book combines the theoretical and practical facets of big data technology. The first few chapters provide a theoretical introduction to big data and Hadoop, with individual chapters covering different components of the Hadoop ecosystem. The rest of the book provides lab tutorials that give basic working knowledge of the different components and show how they can be used together to develop a big data application.

Key features of the book include:

It provides background on the big data problem and introduces Hadoop in terms of how it addresses that problem.

It covers all the processes of the big data lifecycle and the different components of Hadoop that serve these processes.

It offers dedicated lab tutorials for installation and demonstration of the different components of the Hadoop ecosystem.

Cover 
Title Page
Copyright Page
Contents
Preface
Acknowledgments
Chapter 1: Introduction to Big Data and Hadoop
    Objectives
    1.1: Big Data Concept
    1.2: IBM’s 3V Model
    1.3: How Can Big Data Benefit Businesses
    1.4: Hadoop and its Applications
    1.5: Limitations of Existing Architectures
    1.6: Key Characteristics of Hadoop
    1.7: Key Differences Between RDBMS and Hadoop
    Objective-Type Questions
    Short Answer Questions
    Long Answer Questions
Chapter 2: Hadoop and HDFS Architecture
    Objectives
    2.1: Hadoop Ecosystem
    2.2: Core Components of Hadoop 2.X
    2.3: Hadoop Functions
    2.4: Hadoop 2.X Cluster Architecture – Federation and Availability
    2.5: Resource Management in Hadoop 2.X
    2.6: Cluster Modes
    2.7: Configuration Files
    Objective-Type Questions
    Short Answer Questions
    Long Answer Questions
Chapter 3: Basics of MapReduce Framework
    Objectives
    3.1: MapReduce Programming Paradigm
    3.2: Traditional Way of Processing Large Data
    3.3: The MapReduce Approach
    3.4: MapReduce vs Traditional Programming Approaches
    3.5: MapReduce Implementation
    3.6: MapReduce Architecture
    Objective-Type Questions
    Short Answer Questions
    Long Answer Questions
Chapter 4: Advanced Concepts in MapReduce Programming
    Objectives
    4.1: Input Splits in MapReduce
    4.2: Partitioner
    4.3: Combiner
    4.4: Map and Reduce Side Joins
    4.5: Counters
    4.6: Input Formats
    4.7: MRUnit Testing Framework
    Objective-Type Questions
    Short Answer Questions
    Long Answer Questions
Chapter 5: Pig
    Objectives
    5.1: Introduction to Pig
    5.2: The Yahoo! Story
    5.3: Key Characteristics
    5.4: Performance: Pig vs. MapReduce
    5.5: Limitations
    5.6: Applications
    5.7: Working With Pig
    Objective-Type Questions
    Short Answer Questions
    Long Answer Questions
Chapter 6: Hive
    Objectives
    6.1: Background
    6.2: The Facebook Story
    6.3: Hive Basics
    6.4: Differences Between Hive and Pig
    6.5: Differences Between Hive and Traditional RDBMS
    6.6: Hive Architecture
    6.7: Components of Hive
    6.8: Limitations of Hive
    6.9: Hive Scripting
    Objective-Type Questions
    Short Answer Questions
    Long Answer Questions
Chapter 7: NoSQL Databases and HBase
    Objectives
    7.1: Introduction
    7.2: Need for HBase
    7.3: Classification of NoSQL Databases
    7.4: Defining HBase
    7.5: Uses of HBase
    7.6: Limitations of HBase
    7.7: Components of HBase
    7.8: HBase Storage Architecture
    7.9: Need for Zookeeper
    7.10: Working in HBase
    Objective-Type Questions
    Short Answer Questions
    Long Answer Questions
Chapter 8: Oozie
    Objectives
    8.1: Understanding Oozie
    8.2: Functional Components of Oozie
    Objective-Type Questions
    Short Answer Questions
    Long Answer Questions
Chapter 9: Integrating R With Hadoop
    Objectives
    9.1: Introduction to R
    9.2: Using R With Hadoop
    9.3: Integration Methods for R and Hadoop
    9.4: Solving Problems With R and Hadoop
    Objective-Type Questions
    Short Answer Questions
    Long Answer Questions
Chapter 10: Setting Up Hadoop Standalone Cluster
Chapter 11: Setting Up Hadoop Multi-Node Cluster
Chapter 12: Basic HDFS Commands
Chapter 13: Writing and Executing MapReduce Programs
Chapter 14: Advanced Programming in MapReduce
Chapter 15: Pig Commands and Scripting
Chapter 16: Hive Commands and Scripting
Chapter 17: Working in HBase
Chapter 18: Job Management in Oozie
Chapter 19: Data Loading Techniques
Chapter 20: Project
Appendix
Index
About the Author

Computers & Technology
- Databases & Big Data

Data Mining

Everyday Data Visualization: Design Effective Charts and Dashboards

Cracking the Data Science Interview: Unlock insider tips from industry experts to master the data science field

Ultimate Snowflake Architecture for Cloud Data Warehousing: Architect, Manage, Secure, and Optimize Your Data Infrastructure Using Snowflake for ... and Informed Decisions

Data Analytics for Marketing: A practical guide to analyzing marketing data using Python

Modern Data Mining with Python: A risk-managed approach to developing and deploying explainable and efficient algorithms using ModelOps

Excel BI and Dashboards in 7 Days: Build interactive dashboards for powerful data visualization and insights