SAP Data Intelligence: The Comprehensive Guide
- Length: 765 pages
- Edition: F
- Language: English
- Publisher: SAP Press
- Publication Date: 2021-10-25
- ISBN-10: 1493221620
- ISBN-13: 9781493221622
- Sales Rank: #2842974
Manage your data landscape with SAP Data Intelligence! Begin by understanding its architecture and capabilities and then see how to set up and install SAP Data Intelligence with step-by-step instructions. Walk through SAP Data Intelligence applications and learn how to use them for data governance, orchestration, and machine learning. Integrate with ABAP-based systems, SAP Vora, SAP Analytics Cloud, and more. Manage, secure, and operate SAP Data Intelligence with this all-in-one guide!
- Install and configure SAP Data Intelligence in the cloud or on-premise
- Govern, process, and orchestrate data workflows with tools like the metadata explorer and the SAP Data Intelligence modeler
- Use machine learning, SAP Analytics Cloud, and SAP Data Warehouse Cloud to enrich business data
Configuration
Build your SAP Data Intelligence landscape! Use SAP Cloud Appliance Library for cloud deployment, including provisioning, sizing, and accessing the launchpad. Perform on-premise installations using tools like the maintenance planner.
Capabilities
Put the core capabilities of SAP Data Intelligence to work! Manage and govern your data with the metadata explorer, use the modeler application to create data processing pipelines, create apps with the Jupyter Notebook, and more.
Integration and Administration
Integrate, manage, and operate SAP Data Intelligence! Get step-by-step instructions for integration with SAP and non-SAP systems. Learn about key administration tasks and make sure your landscape is secure and running smoothly.
- Configuration and installation
- Data governance
- Data processing pipelines
- Docker images
- ML Scenario Manager
- Jupyter Notebook
- Python SDK
- Integration
- Administration
- Security
- Application lifecycle management
- Use cases
Dear Reader
Notes on Usage
Table of Contents
Preface
Why Read This Book?
Audience
Structure of the Book
Acknowledgments
Conclusion

Part I Getting Started

1 The Data Fabric for the Intelligent Enterprise
1.1 Data Fabric
1.1.1 Trends
1.1.2 Benefits
1.2 Data Orchestration
1.3 SAP Business Technology Platform
1.4 SAP Data Intelligence
1.5 Summary

2 Architecture and Capabilities
2.1 Genesis of SAP Data Intelligence
2.1.1 Features from SAP Leonardo Machine Learning Foundation
2.1.2 Evolution from SAP Data Hub to SAP Data Intelligence
2.2 SAP Data Intelligence Architecture
2.3 Deployment Options and Bring Your Own License Model
2.4 Kubernetes Cluster and Containers
2.4.1 Overview of Kubernetes
2.4.2 Kubernetes Cluster Architecture
2.4.3 Container Runtimes
2.4.4 Pods and Workloads
2.4.5 Resources and Policies
2.4.6 Kubernetes and SAP Data Intelligence
2.5 SAP Data Intelligence Launchpad
2.5.1 Persona-Based Application
2.5.2 Overview of Applications
2.6 Summary

3 Setup and Installation
3.1 Landscape Sizing
3.1.1 Sizing Various SAP Data Intelligence Components
3.1.2 Minimum Sizing and Initial Sizing for SAP Data Intelligence
3.1.3 Understanding the T-Shirt Sizing Approach
3.2 SAP Cloud Appliance Library
3.2.1 Getting Started with SAP Cloud Appliance Library
3.2.2 Deploying SAP Solutions in the Cloud
3.2.3 Activating and Creating Solution Instances
3.2.4 Security Considerations for SAP Cloud Appliance Library
3.3 On-Demand Cloud Provisioning and Instance Sizing
3.3.1 Sizing with SAP Cloud Appliance Library
3.3.2 Supported Cloud Providers for SAP Cloud Appliance Library
3.3.3 Understanding Costs and Payments
3.3.4 Backing Up, Restoring, and Terminating an Instance
3.4 Setting Up SAP Data Intelligence on SAP Cloud Appliance Library
3.4.1 Prerequisites for Cloud Provider Account
3.4.2 Connecting to SAP Cloud Appliance Library
3.4.3 Creating and Accessing the Solution
3.4.4 Accessing the Jump Box for Monitoring and Troubleshooting
3.4.5 Running the Solution
3.4.6 Access through Browser Using Local Hosts File
3.4.7 Personalization
3.5 SAP Data Intelligence 3.0 Installation On-Premise
3.5.1 Planning and Prerequisites for an On-Premise Installation
3.5.2 Modular Deployment with SLC Bridge
3.5.3 Installing SAP Data Intelligence with the Maintenance Planner and SLC Bridge
3.6 Summary

4 Using SAP Data Intelligence Applications
4.1 SAP Data Intelligence Launchpad Applications
4.2 Applications for Data Engineers
4.2.1 Connection Management
4.2.2 Metadata Explorer
4.2.3 Modeler
4.2.4 Customer Data Export
4.3 Applications for Data Scientists
4.3.1 ML Scenario Manager
4.3.2 Vora Tools
4.4 Applications for Modelers and Auditors
4.4.1 Monitoring Applications
4.4.2 Audit and System Logs
4.5 Applications for System Administrators
4.5.1 Policy Management
4.5.2 Handling Privileges
4.5.3 System Management
4.5.4 License Management
4.6 Summary

Part II Data Management, Orchestration, and Machine Learning

5 Metadata-Driven Data Governance
5.1 Metadata Explorer for Data Governance
5.1.1 Intelligent Information Management with the Discovery Dashboard
5.1.2 Metadata Crawlers to Explore, Classify, and Label Data Assets
5.1.3 Managing Metadata Data across a Connected System Landscape
5.2 Data Profiling to Understand Data
5.2.1 Profiling Data Sets from Connections
5.2.2 Profiling Actions and Monitor
5.2.3 Viewing Profile Fact Sheets
5.3 Managing Publications and Data Catalogs
5.3.1 Catalog of Published Data Sets
5.3.2 Automatic Tags and Hierarchical Tagging
5.3.3 Using Tags as Search Filters
5.3.4 Managing Publications in the Catalog
5.3.5 Lineage Depth Set in Publication Processing
5.4 Defining Data Quality Rules and Running Rulebooks
5.4.1 Rules Determining Business Data Compliance
5.4.2 Categories to Organize Business Rules
5.4.3 Using the Match Pattern Operator
5.4.4 Running and Monitoring Rulebooks
5.4.5 Business Glossary of Terms and Definitions
5.5 Data Lineage from Transformation History
5.5.1 Lineage Analyses for Tracing Data Sets to Sources
5.5.2 Lineage Extraction and Supported Sources
5.5.3 Understanding and Configuring the Lineage View
5.6 Summary

6 Modeling Data Processing Pipelines
6.1 Using the SAP Data Intelligence Modeler
6.1.1 Flow-Based Paradigm as a Network of Information
6.1.2 Data Pipeline Engine in the Flow-Based Modeler
6.1.3 Navigating the Modeler Panes and Toolbars
6.1.4 Built-In Operators
6.1.5 Creating and Validating Graphs
6.2 Creating and Managing Connections
6.2.1 Creating Connections
6.2.2 Connecting to Cloud Foundry
6.2.3 Managing Certificates
6.2.4 Authorizations for Connections
6.3 Self-Service Data Preparation with the Metadata Explorer
6.3.1 Preparing Data for Accurate Results and Better Insights
6.3.2 Self-Service Data Preparation with the Metadata Explorer
6.3.3 Transforming Structured Data Sets
6.3.4 Managing Data Preparation Actions
6.3.5 Processing Data Preparation Actions
6.4 Integrating, Processing, and Orchestrating Workflows
6.4.1 Graph Snippets as a Group of Operators
6.4.2 Working with Data Workflow Operators
6.4.3 Integrating SAP Cloud Applications
6.4.4 Change Data Capture Graph
6.4.5 Custom Operators
6.5 Scheduling and Monitoring Data Pipelines
6.5.1 Scheduling and Monitoring Data Pipelines
6.5.2 Trace Messages
6.5.3 Tracking Model Metrics
6.5.4 Kubernetes Dashboard and Cluster Logs
6.6 Summary

7 Creating Operators and Data Types
7.1 Creating Custom Operators
7.1.1 Visibility of Events
7.1.2 Compatibility of Port Types
7.1.3 Creating and Editing Operators
7.2 Implementing Runtime Operators
7.2.1 Subengines in SAP Data Intelligence Modeler
7.2.2 Working with Subengines to Create Operators
7.3 Creating Data Types
7.3.1 Predefined Global Scalar Types
7.3.2 Defining Your Own Custom Data Types
7.3.3 Leveraging Data Types in Graphs
7.4 Summary

8 Building Docker Images
8.1 Containers in Pods and Pods in Clusters
8.1.1 Delivery of Data-Driven Applications
8.1.2 Helm: Package Manager for Kubernetes
8.1.3 Dockerfiles: Predefined Runtime Environments
8.2 Assembling a Docker Image
8.2.1 Building Docker Images through Dockerfiles
8.2.2 Enhancing Docker Images with Different Package Managers
8.3 Dockerfile Inheritance
8.4 Using Docker with Python
8.5 Summary

9 Machine Learning
9.1 Machine Learning with SAP
9.1.1 Machine Learning Solutions in the SAP Landscape
9.1.2 TEI Methodology in Machine Learning
9.1.3 Transforming Business Use Cases with Machine Learning
9.1.4 Data-Driven Approach versus Traditional Rule-Based Approach
9.1.5 Machine Learning Tasks in Enterprise Contexts
9.1.6 Architectural Principles for Machine Learning
9.2 Machine Learning with SAP Data Intelligence
9.2.1 Scalable Data Pipelines in Complex Data Landscapes
9.2.2 Data and Algorithms as Assets for Machine Learning
9.2.3 Leveraging Open-Source Environments and Skills
9.3 Using the ML Scenario Manager
9.3.1 ML Scenario Manager Overview
9.3.2 Setting Up a Scenario in ML Scenario Manager
9.3.3 Integrating Hyperscale Data and Targets
9.3.4 Leveraging Scenario Templates for Machine Learning
9.3.5 Dockerfile Building and Grouping
9.3.6 Implementing TensorFlow Pipelines
9.3.7 Training and Deploying Models with New Versions
9.3.8 Metrics Explorer and Machine Learning Tracking SDK
9.3.9 Run Collection and Run Performance
9.3.10 Visualizing SAP Data Intelligence Metrics with SAP Analytics Cloud
9.4 ML Data Manager in Data Workspaces and Data Collections
9.4.1 Data Workspaces and Data Collections
9.4.2 Organizing Data Sets in Data Lakes
9.4.3 Curating a Data Collection
9.4.4 Registering a Data Set
9.5 Summary

10 Jupyter Notebook
10.1 Jupyter Notebook Fundamentals
10.1.1 Interactive Tool for Data Science Projects
10.1.2 Jupyter Notebook Dashboard and User Interface
10.1.3 Data Analysis in Jupyter Notebook
10.2 Working with SAP HANA Cloud
10.2.1 SAP HANA Cloud: Cloud Database as a Service
10.2.2 Exploring SAP HANA Cloud on an SAP BTP Trial Account
10.2.3 Understanding the SAP HANA Cockpit and SAP HANA Database Explorer
10.2.4 Using Jupyter Notebook in SAP BTP and Integration with SAP HANA Cloud
10.2.5 SAP Data Intelligence Connection
10.3 Data Science Experiments with Jupyter Notebook
10.3.1 SAP HANA Embedded Machine Learning
10.3.2 Machine Learning Core Operators
10.3.3 SAP HANA ML Training Operator
10.3.4 SAP HANA ML Inference Operator
10.4 JupyterLab as the Next-Gen Jupyter Notebook
10.4.1 JupyterLab: The Next-Gen User Interface with Built-In Libraries
10.4.2 Accessing Jupyter Notebook Artifacts from JupyterLab
10.4.3 SAP HANA Python Client API
10.5 Summary

11 SAP Data Intelligence Python SDK
11.1 Using SAP Data Intelligence Python SDK
11.1.1 Setting a Context in Jupyter Notebook
11.1.2 Data Lake API for SDL
11.1.3 Retrieving Machine Learning Scenario Metadata
11.1.4 Training Container Using the SDK
11.1.5 Executing and Deploying Pipelines
11.2 Accessing Artifacts Using Methods
11.3 Machine Learning Tracking SDK
11.3.1 Initializing Run for an Experiment
11.3.2 Grouping Runs in Run Collections
11.3.3 Analyzing Metrics and Logs
11.4 Summary

Part III Integration

12 Integrating with ABAP Systems
12.1 Integration Scenarios
12.1.1 Scenarios and Use Cases for Integration
12.1.2 ABAP Metadata in the Metadata Explorer
12.2 Provisioning Data from ABAP Systems
12.2.1 Exposing the CDS View
12.2.2 Connection Prerequisites for Data Extraction
12.2.3 Connecting On-Premise Systems with the Cloud Connector
12.3 Using Operators to Trigger Execution in an ABAP System
12.3.1 ABAP Operators to Trigger Function Modules or BAPIs
12.3.2 Prerequisites for ABAP Operators in Remote Systems
12.4 SAP BW/4HANA and SAP Data Intelligence Hybrid Data Virtualization
12.4.1 Prerequisites in SAP Business Warehouse
12.4.2 Using Connection Type HANA_DB
12.4.3 Authorization Check for Services
12.4.4 SAP BW Operator for Pipeline
12.5 Additional Connectivity
12.5.1 SAP Information Steward
12.5.2 SAP HANA for SQL Data Warehousing
12.6 Summary

13 Integrating with Non-SAP Systems
13.1 Non-SAP Cloud System Connectivity
13.1.1 Amazon S3
13.1.2 Amazon Redshift
13.1.3 Windows Azure Storage Blob
13.1.4 Microsoft Azure SQL Data Warehouse
13.1.5 Microsoft Azure Data Lake
13.1.6 Google Cloud Storage
13.1.7 Google BigQuery
13.1.8 IBM Cloud Storage
13.2 Non-SAP On-Premise System Connectivity
13.2.1 Oracle Relational Database Management System
13.2.2 Microsoft SQL Server
13.3 Summary

14 Integrating Big Data Workloads with SAP Vora
14.1 SAP Vora in Kubernetes Framework
14.1.1 System Management
14.1.2 SAP Vora Engine Architecture
14.1.3 Accessing SAP Vora User Interface
14.1.4 SAP Vora Data Preview
14.1.5 Using SQL Editor
14.1.6 Using SQL Scripts
14.2 Data Modeling in SAP Vora
14.2.1 Creating Database Schemas
14.2.2 Creating Partition Schemes
14.2.3 Creating Tables and Views
14.2.4 Creating Calculated Columns
14.2.5 Additional Functions for Views
14.3 Hierarchies in SAP Vora
14.3.1 SAP Vora SQL for Hierarchical Data Analysis
14.3.2 Using Adjacency Table to Render a Hierarchy
14.3.3 Caching Hierarchies with Materialized Views
14.4 Full-Text Search in SAP Vora
14.4.1 Text Analysis Graphs in Modeler
14.4.2 Linguistic and Semantic Analysis
14.4.3 Full-Text Search on a Document Collection
14.5 Summary

15 Integrating with SAP Data Warehouse Cloud
15.1 Overview of SAP Data Warehouse Cloud
15.1.1 SAP Cloud Services Ecosystem
15.1.2 Setting Up the Trial Tenant
15.2 Understanding Spaces
15.2.1 Spaces as Virtual Workspaces
15.2.2 Development in a Space
15.2.3 Managing Spaces
15.3 Exploring Connections and Using the Data Builder
15.3.1 Available Connection Types
15.3.2 Data Builder: Model to Business Catalog
15.3.3 Space-Aware Integrated Story Builder
15.4 Data Builder in SAP Data Warehouse Cloud versus Pipelines in SAP Data Intelligence
15.5 Summary

16 Integrating with SAP Analytics Cloud
16.1 Overview of SAP Analytics Cloud
16.1.1 Solution to Analyze, Plan, Predict, and Collaborate
16.1.2 Fundamental Components: Data, Models, and Stories
16.2 Use Operators: Read File, Formatter, and Producer
16.2.1 Read File Operator
16.2.2 Decode Table Operator
16.2.3 SAP Analytics Cloud Formatter
16.2.4 SAP Analytics Cloud Producer
16.3 Pipelines to Train, Predict, and Visualize Data
16.3.1 Using the Dataset API
16.3.2 Data Set Provision and Consumption
16.4 Summary

Part IV System Management, Security, and Operations

17 Administration
17.1 System Management Command-Line Client Reference
17.1.1 Command-Line Client for SAP Data Intelligence
17.1.2 Using the VCTL Tool: JavaScript Utility
17.1.3 Useful Commands for Command-Line Client
17.2 Administration Applications
17.2.1 Administrator Access
17.2.2 System Management
17.2.3 License Management
17.2.4 Connection Management
17.3 Monitoring the SAP Data Intelligence Modeler
17.3.1 Monitoring the Status of Graph Execution
17.3.2 Tracing Messages to Isolate Problems and Errors
17.3.3 Downloading Diagnostic Information for Graphs
17.4 SAP Data Intelligence System Logging
17.4.1 Kubernetes Cluster-Level Logging Mechanism
17.4.2 Browsing Application Logs in the Diagnostics Kibana Web User Interface
17.4.3 Aggregating Logs in External Logging Service
17.5 System Diagnostics
17.5.1 SAP Data Intelligence Diagnostics: Diagnostics Grafana
17.5.2 Kubernetes Cluster Metrics
17.5.3 Integrating Diagnostics with External APM Solution
17.6 Summary

18 Security
18.1 Approach to Data Protection
18.1.1 Business Semantics for Industry-Specific Legislations
18.1.2 Functions for Data Privacy Compliance
18.1.3 Security Features for Data Protection and Privacy
18.2 Authenticating Services and Users
18.2.1 Roles and Scope-Driven User Access Control
18.2.2 SAP BTP User Account and Authentication
18.2.3 Self-Signed Certificate Authority and TLS
18.2.4 Leveraging Policy Management for Access Control
18.2.5 Enabling Security Features on Kubernetes Cluster
18.3 Securely Connecting On-Premise Systems
18.3.1 Cloud Connector
18.3.2 Site-to-Site Virtual Private Network
18.3.3 Virtual Private Cloud Peering
18.4 Summary

19 Maintenance
19.1 Understanding Operational Modes or Run Levels
19.2 Switching the Platform to Maintenance Mode
19.2.1 Enabling or Disabling Maintenance Mode
19.2.2 Restarting SAP Data Intelligence Services
19.2.3 Setting Up a Remote Connection to SAP
19.3 Increasing System Management Persistent Volume Size
19.3.1 Persistent Volume Error Handling
19.3.2 Changing the Persistent Storage Size of the SAP Vora Disk Engine
19.3.3 Changing the Buffer and File Size of the SAP Vora Disk Engine
19.4 Performing Backups
19.5 Summary

20 Application Lifecycle Management
20.1 Version Control System
20.2 Git
20.2.1 Git Basics and Terminology
20.2.2 Git Integration and CI/CD Process
20.2.3 Setting Up Your Environment for Git Workflows
20.3 Continuous Integration and Continuous Delivery
20.3.1 Continuous Integration Best Practices
20.3.2 Leveraging SAP Solutions for CI/CD
20.4 DevOps Fundamentals and Tools
20.4.1 The Core Tenets of DevOps
20.4.2 Implement Tooling for DevOps
20.4.3 DevOps for Hybrid Architectures
20.5 SAP Data Intelligence as the MLOps Platform
20.5.1 Production Lifecycle of Machine Learning Models
20.5.2 MLOps Challenges
20.5.3 MLOps Capabilities
20.6 Migrating from SAP Leonardo Machine Learning Foundation
20.6.1 Bring Your Own Model
20.6.2 Migrating the Training Data
20.6.3 Adding the Training Data to a Data Lake
20.7 Summary

21 Business Content and Use Cases
21.1 Digital Transformation and SAP Data Intelligence
21.2 Business Content by Industry
21.3 Finance Use Cases
21.4 Supply Chain Use Cases
21.5 Manufacturing Use Cases
21.6 Summary

A Outlook and Roadmap
A.1 Release Management
A.2 Recent Innovations
A.3 Roadmap Explorer
A.4 Future Outlook

B The Authors
Index
Service Pages
Legal Notes