Similarity functions

Definitions

The Neo4j GDS library provides a set of measures for calculating the similarity between two arrays of numbers, p_s and p_t.

The similarity functions fall into two groups. Categorical measures treat the arrays as sets and calculate similarity from the intersection of the two sets. Numerical measures calculate similarity from how close the numbers at each position are to each other.
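The difference between the two groups shows up when the same values appear in a different order. The following is a minimal Python sketch, not the GDS implementation: the helper names `jaccard` and `euclidean_distance` are illustrative stand-ins written from the textbook definitions.

```python
import math


def jaccard(ps, pt):
    # Categorical: the arrays are reduced to sets, so element order is ignored.
    a, b = set(ps), set(pt)
    return len(a & b) / len(a | b)


def euclidean_distance(ps, pt):
    # Numerical: values are compared position by position.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ps, pt)))


original = [1.0, 2.0, 3.0]
reordered = [3.0, 2.0, 1.0]  # same values, different positions

print(jaccard(original, reordered))             # identical sets
print(euclidean_distance(original, reordered))  # positions differ
```

Under these assumptions the categorical measure reports the two arrays as identical, while the numerical measure reports a non-zero distance.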

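| Function name | Formula | Type | Value range |
|---|---|---|---|
| gds.similarity.jaccard | $J(p_s, p_t) = \frac{\lvert p_s \cap p_t \rvert}{\lvert p_s \cup p_t \rvert}$ | Categorical | [0, 1] |
| gds.similarity.overlap | $O(p_s, p_t) = \frac{\lvert p_s \cap p_t \rvert}{\min(\lvert p_s \rvert, \lvert p_t \rvert)}$ | Categorical | [0, 1] |
| gds.similarity.cosine | $\cos(p_s, p_t) = \frac{\sum_i p_{s,i} \, p_{t,i}}{\sqrt{\sum_i p_{s,i}^2} \, \sqrt{\sum_i p_{t,i}^2}}$ | Numerical | [-1, 1] |
| gds.similarity.pearson | $r(p_s, p_t) = \frac{\sum_i (p_{s,i} - \bar{p}_s)(p_{t,i} - \bar{p}_t)}{\sqrt{\sum_i (p_{s,i} - \bar{p}_s)^2} \, \sqrt{\sum_i (p_{t,i} - \bar{p}_t)^2}}$ | Numerical | [-1, 1] |
| gds.similarity.euclideanDistance | $ED(p_s, p_t) = \sqrt{\sum_i (p_{s,i} - p_{t,i})^2}$ | Numerical | [0, ∞) |
| gds.similarity.euclidean | $E(p_s, p_t) = \frac{1}{1 + ED(p_s, p_t)}$ | Numerical | (0, 1] |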
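To make the six measures concrete, here is a minimal Python sketch written from their textbook definitions. It is not the GDS source code: the function names are illustrative, the inputs are assumed to be non-empty arrays of equal length without null values, and the handling of degenerate inputs (empty sets, zero norms) is an assumption made only to keep the sketch runnable.

```python
import math


def jaccard(ps, pt):
    # Size of the intersection divided by size of the union (arrays as sets).
    a, b = set(ps), set(pt)
    union = a | b
    return len(a & b) / len(union) if union else 0.0  # empty case: assumption


def overlap(ps, pt):
    # Size of the intersection divided by the size of the smaller set.
    a, b = set(ps), set(pt)
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0  # empty case: assumption


def cosine(ps, pt):
    # Dot product divided by the product of the Euclidean norms.
    dot = sum(x * y for x, y in zip(ps, pt))
    norms = math.sqrt(sum(x * x for x in ps)) * math.sqrt(sum(y * y for y in pt))
    return dot / norms if norms else 0.0  # zero-norm case: assumption


def pearson(ps, pt):
    # Covariance of the mean-centred arrays divided by the product of their spreads.
    n = len(ps)
    mean_s, mean_t = sum(ps) / n, sum(pt) / n
    cov = sum((x - mean_s) * (y - mean_t) for x, y in zip(ps, pt))
    spread = math.sqrt(sum((x - mean_s) ** 2 for x in ps)) * math.sqrt(
        sum((y - mean_t) ** 2 for y in pt)
    )
    return cov / spread if spread else 0.0  # constant-array case: assumption


def euclidean_distance(ps, pt):
    # Square root of the summed squared differences at each position.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ps, pt)))


def euclidean_similarity(ps, pt):
    # Maps a distance in [0, inf) to a similarity in (0, 1].
    return 1.0 / (1.0 + euclidean_distance(ps, pt))


if __name__ == "__main__":
    ps = [1.0, 5.0, 3.0, 6.7]
    pt = [5.0, 2.5, 3.1, 9.0]
    for fn in (jaccard, overlap, cosine, pearson, euclidean_distance, euclidean_similarity):
        print(fn.__name__, fn(ps, pt))
```

The sample arrays are the ones used in the Examples section, so the printed values can be compared against the results listed there.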

Examples

An example of usage for each function is provided below:

Jaccard similarity function
RETURN gds.similarity.jaccard(
 [1.0, 5.0, 3.0, 6.7],
 [5.0, 2.5, 3.1, 9.0]
) AS jaccardSimilarity
Table 1. Results
jaccardSimilarity

0.142857142857143

Overlap similarity function
RETURN gds.similarity.overlap(
 [1.0, 5.0, 3.0, 6.7],
 [5.0, 2.5, 3.1, 9.0]
) AS overlapSimilarity
Table 2. Results
overlapSimilarity

0.25

Cosine similarity function
RETURN gds.similarity.cosine(
 [1.0, 5.0, 3.0, 6.7],
 [5.0, 2.5, 3.1, 9.0]
) AS cosineSimilarity
Table 3. Results
cosineSimilarity

0.882757381034594

Pearson similarity function
RETURN gds.similarity.pearson(
 [1.0, 5.0, 3.0, 6.7],
 [5.0, 2.5, 3.1, 9.0]
) AS pearsonSimilarity
Table 4. Results
pearsonSimilarity

0.468277483648113

Euclidean similarity function
RETURN gds.similarity.euclidean(
 [1.0, 5.0, 3.0, 6.7],
 [5.0, 2.5, 3.1, 9.0]
) AS euclideanSimilarity
Table 5. Results
euclideanSimilarity

0.160030485454022

Euclidean distance function
RETURN gds.similarity.euclideanDistance(
 [1.0, 5.0, 3.0, 6.7],
 [5.0, 2.5, 3.1, 9.0]
) AS euclideanDistance
Table 6. Results
euclideanDistance

5.248809388804284

The functions also accept vectors in which one or more values are null. The intersection-based functions, Jaccard and Overlap, exclude null values from the sets and from the computation. All other functions replace each null value with 0.0. See the examples below.

Jaccard with null values
RETURN gds.similarity.jaccard(
 [1.0, null, 3.0],
 [1.0, 2.0, 3.0]
) AS jaccardSimilarity
Table 7. Results
jaccardSimilarity

0.666666666666667

Cosine with null values
RETURN gds.similarity.cosine(
 [1.0, null, 3.0],
 [1.0, 2.0, 3.0]
) AS cosineSimilarity
Table 8. Results
cosineSimilarity

0.845154254728517
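The null-handling rule described above can be sketched in Python, with `None` standing in for a Cypher null. This is an illustrative approximation rather than the library's code: the helper names `drop_nulls`, `zero_fill`, `jaccard_with_nulls`, and `cosine_with_nulls` are hypothetical.

```python
import math


def drop_nulls(values):
    # Set-based measures: nulls are excluded before the sets are compared.
    return {v for v in values if v is not None}


def zero_fill(values):
    # Position-based measures: each null is replaced with 0.0.
    return [0.0 if v is None else v for v in values]


def jaccard_with_nulls(ps, pt):
    a, b = drop_nulls(ps), drop_nulls(pt)
    union = a | b
    return len(a & b) / len(union) if union else 0.0  # empty case: assumption


def cosine_with_nulls(ps, pt):
    xs, ys = zero_fill(ps), zero_fill(pt)
    dot = sum(x * y for x, y in zip(xs, ys))
    norms = math.sqrt(sum(x * x for x in xs)) * math.sqrt(sum(y * y for y in ys))
    return dot / norms if norms else 0.0  # zero-norm case: assumption


ps = [1.0, None, 3.0]
pt = [1.0, 2.0, 3.0]

print(jaccard_with_nulls(ps, pt))  # sets {1.0, 3.0} and {1.0, 2.0, 3.0}
print(cosine_with_nulls(ps, pt))   # vectors [1.0, 0.0, 3.0] and [1.0, 2.0, 3.0]
```

With the inputs from the two null examples, the sketch compares two of three distinct values for Jaccard and treats the missing position as 0.0 for cosine.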
