Basic workflow

One of the most common problems on graphs is finding the shortest path between nodes. This example shows how to create a GDS graph from Neo4j data, run a path finding algorithm, and write the results back to Neo4j.

Create the graph

The following Cypher query creates the graph of a small train network in the Neo4j database. Each relationship includes a distance numerical property representing the distance between two stations.

CREATE
 // Add the stations
 (a:Station {name: 'Kings Cross'}),
 (b:Station {name: 'Euston'}),
 (c:Station {name: 'Camden Town'}),
 (d:Station {name: 'Mornington Crescent'}),
 (e:Station {name: 'Kentish Town'}),
 // Add the connections between stations
 (a)-[:CONNECTION {distance: 0.7}]->(b),
 (b)-[:CONNECTION {distance: 1.3}]->(c),
 (b)-[:CONNECTION {distance: 0.7}]->(d),
 (d)-[:CONNECTION {distance: 0.6}]->(c),
 (c)-[:CONNECTION {distance: 1.3}]->(e)

The graph looks as follows:

Tube graph.

The next query creates an in-memory graph called trainGraph from the :Station nodes and the :CONNECTION relationships with their distance property.

MATCH (source:Station)-[r:CONNECTION]->(target:Station)
RETURN gds.graph.project(
 'trainGraph',
 source,
 target,
 { relationshipProperties: r { .distance } }
)

Run the algorithm in `stream` mode

A good first candidate algorithm to calculate the shortest path in a graph is the Dijkstra Source-Target Shortest Path algorithm. To try it out, use it with the stream mode to see the result in the query output.

MATCH (1)
 (source:Station {name: 'Kings Cross'}),
 (target:Station {name: 'Kentish Town'})
CALL gds.shortestPath.dijkstra.stream( (2)
 'trainGraph', (3)
 { (4)
 sourceNode: source,
 targetNode: target,
 relationshipWeightProperty: 'distance'
 }
)
YIELD (5)
 index,
 sourceNode,
 targetNode,
 totalCost,
 path
RETURN (6)
 index,
 gds.util.asNode(sourceNode).name AS sourceNodeName,
 gds.util.asNode(targetNode).name AS targetNodeName,
 totalCost,
 nodes(path) AS path
ORDER BY index

1 MATCH clause to define the source and target nodes.

2 gds.shortestPath.dijkstra algorithm running in stream mode.

3 Name of the projected graph to run the algorithm on.

4 Configuration parameters listed in the Syntax section of the algorithm (Stream mode panel).

5 Result fields listed in the Syntax section of the algorithm (Stream mode panel). Include only the ones you need.

6 Query result fields, typically the result fields from the YIELD clause wrapped in Cypher functions. The gds.util.asNode() function retrieves the Neo4j node corresponding to a projected node. In the case of path finding algorithms, the nodes() Cypher function is useful to return a node path as a list of nodes.

Query result
index	sourceNodeName	targetNodeName	totalCost	path
0	"Kings Cross"	"Kentish Town"	3.3	[Node[0], Node[1], Node[2], Node[4]]

Write the results

If the results of the algorithm are as expected, the next step can be to write them back to the Neo4j database. The following query is very similar to the stream query, except for the addition of some configuration parameters specific to the write mode and the different format of the result.

MATCH (1)
 (source:Station {name: 'Kings Cross'}),
 (target:Station {name: 'Kentish Town'})
CALL gds.shortestPath.dijkstra.write( (2)
 'trainGraph', (3)
 { (4)
 sourceNode: source,
 targetNode: target,
 relationshipWeightProperty: 'distance',
 writeRelationshipType: 'PATH',
 writeNodeIds: true,
 writeCosts: true
 }
)
YIELD relationshipsWritten
RETURN relationshipsWritten

1 MATCH clause to define the source and target nodes.

2 gds.shortestPath.dijkstra algorithm running in write mode.

3 Name of the projected graph.

4 Configuration parameters listed in the Syntax section of the algorithm (Write mode panel). In this case, the three parameters writeRelationshipType, writeNodeIds, and writeCosts are used to create the new :PATH relationship with its totalCost, nodeIds, and costs properties.

Query result
relationshipsWritten
1

Query the Neo4j database

To check that the results of the algorithm have been correctly written back to Neo4j, you can run a Cypher query that includes the new relationships and relationship properties written in the previous step (in this case, the PATH relationship with its nodeIds, costs, and totalCost properties).

MATCH (source)-[r:PATH]->(target)
RETURN
 source.name,
 [nodeId IN r.nodeIds | gds.util.asNode(nodeId).name] AS nodeNames,
 r.costs,
 r.totalCost,
 target.name

Query result
source.name	nodeNames	r.costs	r.totalCost	target.name
"Kings Cross"	["Kings Cross", "Euston", "Camden Town", "Kentish Town"]	[0.0, 0.7, 2.0, 3.3]	3.3	"Kentish Town"

Next steps

This example covers the basics of using a GDS algorithm. The next example shows a complete end-to-end workflow, including the use of the output of an algorithm with another algorithm.