Graph Data Science with Neo4j: Learn how to use Neo4j 5 with Graph Data Science library 2.0 and its Python driver for your project
- Length: 288 pages
- Edition: 1
- Language: English
- Publisher: Packt Publishing
- Publication Date: 2023-01-31
- ISBN-10: 180461274X
- ISBN-13: 9781804612743
Supercharge your data with the limitless potential of Neo4j 5, the premier graph database for cutting-edge machine learning
Purchase of the print or Kindle book includes a free PDF eBook
Key Features
- Extract meaningful information from graph data with Neo4j’s latest version 5
- Integrate graph algorithms into a regular machine learning pipeline in Python
- Learn the core principles of the Graph Data Science library to make predictions and create data science pipelines
Book Description
Neo4j, along with its Graph Data Science (GDS) library, is a complete solution to store, query, and analyze graph data. As graph databases become more popular among developers, data scientists are likely to encounter them during their careers. Working with graph algorithms to extract contextual information and improve overall model prediction performance is therefore becoming an indispensable skill.
This practical guide to Neo4j and the GDS library lets data scientists working with Python put their knowledge to work. It offers step-by-step explanations of essential concepts and practical instructions for implementing data science techniques on graph data using Neo4j 5 and its associated libraries. You’ll start by querying Neo4j with Cypher and learn how to characterize graph datasets. As you get the hang of running graph algorithms on graph data stored in Neo4j, you’ll explore the new and advanced capabilities of the GDS library that enable you to make predictions and write data science pipelines. Using the newly released GDS Python driver, you’ll be able to integrate graph algorithms into your ML pipeline.
By the end of this book, you’ll be able to take advantage of the relationships in your dataset to improve your current model and make other types of elaborate predictions.
What you will learn
- Use the Cypher query language to query graph databases such as Neo4j
- Build graph datasets from your own data and public knowledge graphs
- Make graph-specific predictions such as link prediction
- Explore the latest version of Neo4j to build a graph data science pipeline
- Run a scikit-learn prediction algorithm with graph data
- Train a predictive embedding algorithm in GDS and manage the model store
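To give a rough idea of what "graph features" in the list above can mean, here is a minimal sketch in plain Python. It is not taken from the book and does not use Neo4j or the GDS library: the function name `graph_features` and the toy edge list are hypothetical, chosen only for illustration. It computes two simple per-node metrics, degree and triangle count, from an undirected edge list.

```python
from collections import defaultdict
from itertools import combinations


def graph_features(edges):
    """Return {node: (degree, triangle_count)} for an undirected edge list.

    Hypothetical helper for illustration only; a real project would
    typically obtain these metrics from a graph library or database.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors[u].add(v)
        neighbors[v].add(u)

    features = {}
    for node, adjacent in neighbors.items():
        # A triangle through `node` is a pair of its neighbors that are
        # themselves connected to each other.
        triangles = sum(
            1 for a, b in combinations(adjacent, 2) if b in neighbors[a]
        )
        features[node] = (len(adjacent), triangles)
    return features


# Toy graph (assumed data): a triangle a-b-c plus a pendant node d.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
features = graph_features(edges)
```

Each `(degree, triangle_count)` tuple could then be added as extra columns to a feature table for a scikit-learn model. In the setting the book describes, such metrics would presumably come from Neo4j and the GDS library rather than from hand-written code.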
Who this book is for
If you’re a data scientist or data professional with a foundation in the basics of Neo4j and are now ready to understand how to build advanced analytics solutions, you’ll find this graph data science book useful. Familiarity with the major components of a data science project in Python and Neo4j is necessary to follow the concepts covered in this book.
Table of Contents
- Graph Data Science with Neo4j
- Contributors
- About the author
- About the reviewers
- Preface
- Who this book is for
- What this book covers
- To get the most out of this book
- Download the example code files
- Conventions used
- Get in touch
- Share Your Thoughts
- Download a free PDF copy of this book
Part 1 – Creating Graph Data in Neo4j
Chapter 1: Introducing and Installing Neo4j
- Technical requirements
- What is a graph database?
- Databases
- Graph database
- Finding or creating a graph database
- A note about the graph dataset’s format
- Modeling your data as a graph
- Neo4j in the graph databases landscape
- Neo4j ecosystem
- Setting up Neo4j
- Downloading and starting Neo4j Desktop
- Creating our first Neo4j database
- Creating a database in the cloud – Neo4j Aura
- Inserting data into Neo4j with Cypher, the Neo4j query language
- Extracting data from Neo4j with Cypher pattern matching
- Summary
- Further reading
- Exercises
Chapter 2: Importing Data into Neo4j to Build a Knowledge Graph
- Technical requirements
- Importing CSV data into Neo4j with Cypher
- Discovering the Netflix dataset
- Defining the graph schema
- Importing data
- Introducing the APOC library to deal with JSON data
- Browsing the dataset
- Getting to know and installing the APOC plugin
- Loading data
- Dealing with temporal data
- Discovering the Wikidata public knowledge graph
- Data format
- Query language – SPARQL
- Enriching our graph with Wikidata information
- Loading data into Neo4j for one person
- Importing data for all people
- Dealing with spatial data in Neo4j
- Importing data in the cloud
- Summary
- Further reading
- Exercises
Part 2 – Exploring and Characterizing Graph Data with Neo4j
Chapter 3: Characterizing a Graph Dataset
- Technical requirements
- Characterizing a graph from its node and edge properties
- Link direction
- Link weight
- Node type
- Computing the graph degree distribution
- Definition of a node’s degree
- Computing the node degree with Cypher
- Visualizing the degree distribution with NeoDash
- Installing and using the Neo4j Python driver
- Counting node labels and relationship types in Python
- Building the degree distribution of a graph
- Improved degree distribution
- Learning about other characterizing metrics
- Triangle count
- Clustering coefficient
- Summary
- Further reading
- Exercises
Chapter 4: Using Graph Algorithms to Characterize a Graph Dataset
- Technical requirements
- Digging into the Neo4j GDS library
- GDS content
- Installing the GDS library with Neo4j Desktop
- GDS project workflow
- Projecting a graph for use by GDS
- Native projections
- Cypher projections
- Computing a node’s degree with GDS
- stream mode
- The YIELD keyword
- write mode
- mutate mode
- Algorithm configuration
- Other centrality metrics
- Understanding a graph’s structure by looking for communities
- Number of components
- Modularity and the Louvain algorithm
- Summary
- Further reading
Chapter 5: Visualizing Graph Data
- Technical requirements
- The complexity of graph data visualization
- Physical networks
- General case
- Visualizing a small graph with networkx and matplotlib
- Visualizing a graph with known coordinates
- Visualizing a graph with unknown coordinates
- Configuring object display
- Discovering the Neo4j Bloom graph application
- What is Bloom?
- Bloom installation
- Selecting data with Neo4j Bloom
- Configuring the scene in Bloom
- Visualizing large graphs with Gephi
- Installing Gephi and its required plugin
- Using APOC Extended to synchronize Neo4j and Gephi
- Configuring the view in Gephi
- Summary
- Further reading
- Exercises
Part 3 – Making Predictions on a Graph
Chapter 6: Building a Machine Learning Model with Graph Features
- Technical requirements
- Introducing the GDS Python client
- GDS Python principles
- Input and output types
- Creating a projected graph from Python
- Running GDS algorithms from Python and extracting data in a dataframe
- write mode
- stream mode
- Dropping the projected graph
- Using features from graph algorithms in a scikit-learn pipeline
- Machine learning tasks with graphs
- Our task
- Computing features
- Extracting and visualizing data
- Building the model
- Summary
- Further reading
- Exercise
Chapter 7: Automatically Extracting Features with Graph Embeddings for Machine Learning
- Technical requirements
- Introducing graph embedding algorithms
- Defining embeddings
- Graph embedding classification
- Using a transductive graph embedding algorithm
- Understanding the Node2Vec algorithm
- Using Node2Vec with GDS
- Training an inductive embedding algorithm
- Understanding GraphSAGE
- Introducing the GDS model catalog
- Training GraphSAGE with GDS
- Computing new node representations
- Summary
- Further reading
- Exercises
Chapter 8: Building a GDS Pipeline for Node Classification Model Training
- Technical requirements
- The GDS pipelines
- What is a pipeline?
- Building and training a pipeline
- Creating the pipeline and choosing the features
- Setting the pipeline configuration
- Training the pipeline
- Making predictions
- Computing the confusion matrix
- Using embedding features
- Choosing the graph embedding algorithm to use
- Training using Node2Vec
- Training using GraphSAGE
- Summary
- Further reading
- Exercise
Chapter 9: Predicting Future Edges
- Technical requirements
- Introducing the LP problem
- LP examples
- LP with the Netflix dataset
- Framing an LP problem
- LP features
- Topological features
- Features based on node properties
- Building an LP pipeline with the GDS
- Creating and configuring the pipeline
- Pipeline training and testing
- Summary
- Further reading
Chapter 10: Writing Your Custom Graph Algorithms with the Pregel API in Java
- Technical requirements
- Introducing the Pregel API
- GDS’s features
- The Pregel API
- Implementing the PageRank algorithm
- The PageRank algorithm
- Simple Python implementation
- Pregel Java implementation
- Implementing the tolerance-stopping criteria
- Testing our code
- Test for the PageRank class
- Test for the PageRankTol class
- Using our algorithm from Cypher
- Adding annotations
- Building the JAR file
- Updating the Neo4j configuration
- Testing our procedure
- Summary
- Further reading
- Exercises
Back matter
- Index
- Why subscribe?
- Other Books You May Enjoy
- Packt is searching for authors like you
- Share Your Thoughts
- Download a free PDF copy of this book
How to download source code?
1. Go to: https://github.com/PacktPublishing
2. In the Find a repository… box, search for the book title: Graph Data Science with Neo4j: Learn how to use Neo4j 5 with Graph Data Science library 2.0 and its Python driver for your project. If the full title returns no results, search for the main title only.
3. Click the book title in the search results.
4. Click Code to download.