Learning Elasticsearch 7.x: Index, Analyze, Search and Aggregate Your Data Using Elasticsearch

Length: 310 pages
Edition: 1
Language: English
Publisher: BPB Publications
Publication Date: 2020-12-03
ISBN-10: 9389898307
ISBN-13: 9789389898309
Sales Rank: #3170421

A step-by-step guide that will teach you how to use Elasticsearch in your application effectively

Key Features

Get familiar with the core concepts of Elasticsearch.
Understand how the search engine works and how Elasticsearch is different from other similar tools.
Learn to install Elasticsearch on different operating systems.
Get familiar with the components of Elastic Stack such as Kibana, Logstash, and Beats.
Learn how to import data from different sources such as RDBMS and files.

Description
In the modern information technology age, we are flooded with data, so we need to know how to handle it and transform it into meaningful information. This book helps you manage that data using Elasticsearch.

The book starts by covering the fundamentals of Elasticsearch and the concepts behind it. After the introduction, you will learn how to install Elasticsearch on different platforms. You will then move on to index management, where you will learn to create, update, and delete Elasticsearch indices. Next, you will see how the Query DSL works and how to write complex search queries with it. With these basics covered, you will move on to advanced topics: handling geodata, which can be used to plot data on a map, and data analysis using aggregations. You will then learn how to tune Elasticsearch performance. The book ends with a chapter on Elasticsearch administration.

What you will learn

Learn how to create and manage a cluster
Work with different components of Elastic Stack
Review the list of top Information Security certifications.
Get to know more about Elasticsearch Index Management.
Understand how to improve performance by tuning Elasticsearch.

Who this book is for
This book is for developers, architects, DBAs, DevOps engineers, and other readers who want to learn Elasticsearch efficiently and apply it in their applications, whether new or existing. It is also useful for those who want to explore their data using Elasticsearch. Basic computer programming knowledge is a prerequisite.

About the Author
Anurag Srivastava works as a Deputy Manager in the R&D centre of an air conditioning company. With 14+ years of experience in the software industry, he has led and handled teams and clients for more than 7 years. He has extensive experience with the Elastic Stack, creating dashboards from system metrics data, log data, application data, and relational databases.

He blogs regularly on technical subjects at bqstack and Medium.

LinkedIn profile: https://www.linkedin.com/in/anubioinfo/

Cover Page
Title Page
Copyright Page
Dedication Page
About the Author
About the Reviewer
Acknowledgement
Preface
Errata
Table of Contents
1. Getting Started with Elasticsearch
    Introduction
    Structure
    Objectives
    Introduction to Elasticsearch
    What is Elasticsearch
    The basic concepts of Elasticsearch
        Node
            Master node
            Data node
            Ingest node
            Machine learning node
        Cluster
        Documents
        Index
        Shard
    Use cases of Elasticsearch
        Data search
        Data logging and analysis
        Application performance monitoring
        System performance monitoring
        Data Visualization
    Different clients for Elasticsearch
        Java
        PHP
        Perl
        Python
        .NET
        Ruby
        JavaScript
    How to use Elasticsearch
        Elasticsearch as a primary data source
        Elasticsearch as a secondary data source for searching
        Elasticsearch as a standalone system
    Conclusion
    Questions
2. Installing Elasticsearch
    Introduction
    Structure
    Objectives
    What’s new in Elasticsearch 7.x
        Adaptive replica selection
        Skip shard refreshes
        One shard per index by default
        Support for small heap
    Installing Elasticsearch
        Installing Elasticsearch on Linux or macOS
            Installing Elasticsearch on Linux
            Installing Elasticsearch on macOS
        Installing Elasticsearch using the Debian package
            Installing the Debian package manually
        Installing Elasticsearch using the RPM package
            Installing the RPM package manually
    Start the Elasticsearch service and verify
    Elasticsearch REST APIs
        cat APIs
        cat API parameters
        Verbose
        Help
        Headers
        Response formats
        Sort
        cat count API
        cat health API
        cat indices API
        cat master API
        cat nodes API
        cat shards API
        Cluster APIs
        Cluster health API
        Cluster stats API
    Conclusion
    Questions
3. Working with Elastic Stack
    Introduction
    Structure
    Objectives
    What is Elastic Stack
    Elasticsearch
    Logstash
        Logstash input plugin
        Logstash filter plugin
        Logstash output plugin
        Fetch Apache logs using Logstash
    Kibana
    Beats
        Filebeat
            Configure input
            Configure output
        Metricbeat
            Configure Metricbeat
            Enabling the required modules
            Output configuration
        Packetbeat
            Configuring Packetbeat
        Winlogbeat
            Configure Winlogbeat
        Auditbeat
            Configuring Auditbeat
        Heartbeat
            Configuring Heartbeat
        Functionbeat
            Configuring Functionbeat
    Conclusion
    Questions
4. Preparing Your Data
    Introduction
    Structure
    Objectives
    Why it is important to prepare the data before indexing
    An introduction to Elasticsearch analyzers
        Built-in analyzer
            Standard analyzer
            Simple analyzer
            Whitespace analyzer
            Stop analyzer
            Keyword analyzer
            Pattern analyzer
            Language analyzers
            Fingerprint analyzer
            Custom analyzer
    Tokenizers
        Word oriented tokenizers
            Standard tokenizer
            Letter tokenizer
            Lowercase tokenizer
            Whitespace tokenizer
            UAX URL email tokenizer
            Classic tokenizer
        Partial word tokenizers
            N-gram tokenizer
            Edge n-gram tokenizer
        Structured text tokenizers
            Keyword tokenizer
            Pattern tokenizer
    Token filters
    Character filters
        HTML strip character filter
        Mapping the char filter
        Pattern replace character filter
    Normalizers
    Conclusion
    Questions
5. Importing Data into Elasticsearch
    Introduction
    Structure
    Objectives
    Why is data so important for business
    Data shipping
    Data ingestion
    Data storage
    Data visualization
    Importing data into Elasticsearch using different Beats
        Pull Apache logs using Filebeat
        Pull server metrics using Metricbeat
        Pulling network data using Packetbeat
        Pulling CSV data using Logstash
    Conclusion
    Questions
6. Managing Your Index
    Introduction
    Structure
    Objectives
    Creating index along with mapping
        Creating an index without any document
        Creating index along with the documents
        Get mapping of the index
        Create a mapping of the index
    Index management
    Performing index-level operations
        Close index
        Delete index
        Freeze index
        Refresh index
        Force merge index
        Clear index cache
        Flush index
        Add lifecycle policy
    Index APIs
        Index management
        Creating an index
            Delete index
            Get index
            Close index
            Open index
            Index Exist API
            Shrink index
            Freeze index
            Unfreeze index
            Split index
            Clone index
            Rollover index
        Index settings
            Update index settings
            Get index settings
        Manage index templates
            Creating an index template
            Get index template
            Delete index template
    Index lifecycle management
    Conclusion
    Questions
7. Applying Search on Your Data
    Introduction
    Structure
    Objective
    URI search
        Empty search
        Field search
    Request body search
        Query versus filter
        Query
        Query types
            Full-text search
            Term-level queries
            Compound queries
    Multi-search
        Multi-search API
        Multi search template
    Explain API
    Profile API
    Conclusion
    Questions
8. Handling Geo with Elasticsearch
    Introduction
    Structure
    Objective
    Geodata type
    Geo point data
        Creating mapping
        Saving geo point data
    Geo shape data
        Creating mapping
        Saving geo point data
            Point
            LineString
            Polygon
            MultiPoint
            MultiLineString
            MultiPolygon
            Geometry collection
            Envelope
            Circle
    Geo queries
        Geo-distance queries
        Geo-polygon queries
        Geo-bounding box queries
        Geo-shape queries
    Use case
        Restaurant search
    Aggregate restaurant based on the distance
    Conclusion
    Questions
9. Aggregating Your Data
    Introduction
    Structure
    Objective
    Introduction to Elasticsearch aggregation
    Bucket aggregation
        Range aggregation
        Composite aggregation
            Terms
            Histogram
            Date histogram
        Terms aggregation
        Filter aggregation
        Filters aggregation
        Geo distance aggregation
    Metrics aggregation
        Min aggregation
        Max aggregation
        Avg aggregation
        Sum aggregation
        Value count aggregation
        Stats aggregation
        Extended stats aggregation
        Percentiles aggregation
    Matrix aggregation
        Matrix stats aggregation
    Pipeline aggregation
        Avg bucket aggregation
        Max bucket aggregation
        Sum bucket aggregation
    Conclusion
    Questions
10. Improving the Performance
    Introduction
    Structure
    Objectives
    Introduction
    Tuning Elasticsearch indexing speed
        Bulk requests instead of a single request
        Smart use of the Elasticsearch Cluster
        Increasing the refresh interval
        Disable replicas
        Using auto-generated ids
        Tweaking the indexing buffer size
        Use of faster hardware
        Allocating memory to the filesystem cache
    Tuning Elasticsearch search speed
        Document modelling
        Search a few fields if possible
        Pre-index data
        Mapping of identifiers as keyword
        We should force merge the read-only indices
        Use filter instead of the query
        Increase the replica count
        Fetch only the required fields
        Use of faster hardware
        Allocate memory to the filesystem cache
        Avoid including stop words in the search
        Avoid the script in the query
    Tuning Elasticsearch for disk usage
        Shrink index
        Force merge
        Disable the unrequired features
        Avoid dynamic string mappings
        Disable _source
        Use the smallest numeric type
    Elasticsearch best practices
        Always define the mapping
        Do your capacity planning
        Avoid split-brain problem
        Enable the slow query log
    Conclusion
    Questions
11. Administering Elasticsearch
    Structure
    Objectives
    Elasticsearch security
        Configuring TLS
        Elasticsearch cluster passwords
        Configuring role-based access using Kibana
            Creating users
            Creating roles
    Index aliases
    Repository and snapshot
        Creating the repository
        Taking the snapshot
        Restoring a snapshot
    Elastic common schema
        Why do we need a common schema
        Introduction to Elastic common schema
        ECS general guidelines
        ECS field name guidelines
        Getting started with ECS
    Conclusion
    Questions
Index