Use vector search with Spanner Graph

This page describes how to use vector search in Spanner Graph to find K-nearest neighbors (KNN) and approximate nearest neighbors (ANN). You can use vector distance functions to perform KNN and ANN vector search for use cases like similarity search or retrieval-augmented generation for generative AI applications.

Spanner Graph supports the following distance functions to perform KNN vector similarity search:

  • COSINE_DISTANCE(): measures the cosine of the angle between two vectors.
  • EUCLIDEAN_DISTANCE(): measures the shortest distance between two vectors.
  • DOT_PRODUCT(): calculates the cosine of the angle multiplied by the product of corresponding vector magnitudes. If you know that all the vector embeddings in your dataset are normalized, then you can use DOT_PRODUCT() as a distance function.

For more information, see Perform vector similarity search in Spanner by finding the K-nearest neighbors.
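The relationship between the three measures can be sketched in plain Python. This is a minimal illustration that assumes the conventional mathematical definitions (cosine distance as one minus the cosine of the angle); the helper names are hypothetical and are not part of any Spanner client library.

```python
import math

def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    """Straight-line (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """One minus the cosine of the angle between two non-zero vectors."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return 1.0 - dot_product(a, b) / (norm_a * norm_b)

def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot_product(v, v))
    return [x / norm for x in v]

# Two illustrative four-dimensional embeddings, normalized to unit length.
a = normalize([0.3, 0.5, 0.8, 0.7])
b = normalize([0.2, 0.4, 0.9, 0.6])

# For unit-length vectors, the dot product equals the cosine similarity.
print(math.isclose(dot_product(a, b), 1.0 - cosine_distance(a, b)))
# For unit-length vectors, squared Euclidean distance is 2 - 2 * dot product.
print(math.isclose(euclidean_distance(a, b) ** 2, 2.0 - 2.0 * dot_product(a, b)))
```

Both identities hold only for normalized vectors, which is why the dot product is a reasonable substitute for a distance function in that case. Note that a larger dot product indicates greater similarity, whereas a smaller distance does.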

Spanner Graph also supports the following approximate distance functions to perform ANN vector similarity search:

  • APPROX_COSINE_DISTANCE(): measures the approximate cosine of the angle between two vectors.
  • APPROX_EUCLIDEAN_DISTANCE(): measures the approximate shortest distance between two vectors.
  • APPROX_DOT_PRODUCT(): calculates the approximate cosine of the angle multiplied by the product of corresponding vector magnitudes. If you know that all the vector embeddings in your dataset are normalized, then you can use APPROX_DOT_PRODUCT() as a distance function.

For more information, see Find approximate nearest neighbors, create vector index, and query vector embeddings.

Before you begin

To run the examples in this document, you must first follow the steps in Set up and query Spanner Graph to do the following:

  1. Create an instance.
  2. Create a database with a Spanner Graph schema.
  3. Insert essential graph data.

After you insert the essential graph data, make the following updates to your database.

Insert additional vector data in graph database

To make the required updates to your graph database, do the following:

  1. Add a new column, nick_name_embeddings, to the Account input table.

    ALTER TABLE Account
    ADD COLUMN nick_name_embeddings ARRAY<FLOAT32>(vector_length=>4);
    
  2. Add data to the nick_name column.

    UPDATE Account SET nick_name = "Fund for a refreshing tropical vacation" WHERE id = 7;
    UPDATE Account SET nick_name = "Fund for a rainy day!" WHERE id = 16;
    UPDATE Account SET nick_name = "Saving up for travel" WHERE id = 20;
    
  3. Create embeddings for the text in the nick_name column, and populate them into the new nick_name_embeddings column.

    To generate Vertex AI embeddings for your operational data in Spanner Graph, see Get Vertex AI text embeddings.

    For illustrative purposes, our examples use artificial, low-dimensional vector values.

    UPDATE Account SET nick_name_embeddings = ARRAY<FLOAT32>[0.3, 0.5, 0.8, 0.7] WHERE id = 7;
    UPDATE Account SET nick_name_embeddings = ARRAY<FLOAT32>[0.4, 0.9, 0.7, 0.1] WHERE id = 16;
    UPDATE Account SET nick_name_embeddings = ARRAY<FLOAT32>[0.2, 0.5, 0.6, 0.6] WHERE id = 20;
    
  4. Add two new columns to the AccountTransferAccount input table: notes and notes_embeddings.

    ALTER TABLE AccountTransferAccount
    ADD COLUMN notes STRING(MAX);
    ALTER TABLE AccountTransferAccount
    ADD COLUMN notes_embeddings ARRAY<FLOAT32>(vector_length=>4);
    
  5. Add data to the notes column, create embeddings for that text, and populate them into the notes_embeddings column.

    To generate Vertex AI embeddings for your operational data in Spanner Graph, see Get Vertex AI text embeddings.

    For illustrative purposes, our examples use artificial, low-dimensional vector values.

    UPDATE AccountTransferAccount
    SET notes = "for shared cost of dinner",
        notes_embeddings = ARRAY<FLOAT32>[0.3, 0.5, 0.8, 0.7]
    WHERE id = 16 AND to_id = 20;
    UPDATE AccountTransferAccount
    SET notes = "fees for tuition",
        notes_embeddings = ARRAY<FLOAT32>[0.1, 0.9, 0.1, 0.7]
    WHERE id = 20 AND to_id = 7;
    UPDATE AccountTransferAccount
    SET notes = 'loved the lunch',
        notes_embeddings = ARRAY<FLOAT32>[0.4, 0.5, 0.7, 0.9]
    WHERE id = 20 AND to_id = 16;
    
  6. After adding new columns to the Account and AccountTransferAccount input tables, update the property graph definition using the following statements. For more information, see Update existing node or edge definitions.

    CREATE OR REPLACE PROPERTY GRAPH FinGraph
      NODE TABLES (Account, Person)
      EDGE TABLES (
        PersonOwnAccount
          SOURCE KEY (id) REFERENCES Person (id)
          DESTINATION KEY (account_id) REFERENCES Account (id)
          LABEL Owns,
        AccountTransferAccount
          SOURCE KEY (id) REFERENCES Account (id)
          DESTINATION KEY (to_id) REFERENCES Account (id)
          LABEL Transfers
      );
    

Find K-nearest neighbors

The following examples use the EUCLIDEAN_DISTANCE() function to perform KNN vector search on the nodes and edges of your graph database.

Perform KNN vector search on graph nodes

You can perform a KNN vector search on the nick_name_embeddings property of the Account node. This KNN vector search returns the account owner's name and the account's nick_name. In the following example, the result shows the two nearest neighbors for the query "accounts for leisure travel and vacation", which is represented by the [0.2, 0.4, 0.9, 0.6] vector embedding.

GRAPH FinGraph
MATCH (p:Person)-[:Owns]->(a:Account)
RETURN p.name, a.nick_name
ORDER BY EUCLIDEAN_DISTANCE(a.nick_name_embeddings,
  -- An illustrative embedding for 'accounts for leisure travel and vacation'
  ARRAY<FLOAT32>[0.2, 0.4, 0.9, 0.6])
LIMIT 2;

Results

name  nick_name
Alex  Fund for a refreshing tropical vacation
Dana  Saving up for travel
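To see why these two accounts rank first, the ordering can be reproduced outside the database. The following is a minimal Python sketch of an exact (brute-force) KNN ranking over the three illustrative nick_name_embeddings values inserted earlier; the k_nearest helper is hypothetical and only mirrors the ORDER BY ... LIMIT 2 logic of the query.

```python
import math

def euclidean_distance(a, b):
    """Straight-line (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Account id -> illustrative nick_name_embeddings value from this page.
nick_name_embeddings = {
    7: [0.3, 0.5, 0.8, 0.7],   # "Fund for a refreshing tropical vacation"
    16: [0.4, 0.9, 0.7, 0.1],  # "Fund for a rainy day!"
    20: [0.2, 0.5, 0.6, 0.6],  # "Saving up for travel"
}

# Illustrative embedding for 'accounts for leisure travel and vacation'.
query = [0.2, 0.4, 0.9, 0.6]

def k_nearest(query, embeddings, k):
    """Return the k ids whose vectors are closest to the query (exact search)."""
    ranked = sorted(embeddings, key=lambda i: euclidean_distance(embeddings[i], query))
    return ranked[:k]

top_two = k_nearest(query, nick_name_embeddings, 2)
print(top_two)  # [7, 20]
```

Accounts 7 and 20 are at distances of roughly 0.20 and 0.32 from the query vector, while account 16 is at roughly 0.76, which matches the two rows returned by the graph query.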

Perform KNN vector search on graph edges

You can perform a KNN vector search on the notes_embeddings property of the Transfers edge. This KNN vector search returns the name of the owner of the sending account and the transfer's notes. In the following example, the result shows the two nearest neighbors for the query "food expenses", which is represented by the [0.2, 0.4, 0.9, 0.6] vector embedding.

GRAPH FinGraph
MATCH (p:Person)-[:Owns]->(:Account)-[t:Transfers]->(:Account)
WHERE t.notes_embeddings IS NOT NULL
RETURN p.name, t.notes
ORDER BY EUCLIDEAN_DISTANCE(t.notes_embeddings,
  -- An illustrative vector embedding for 'food expenses'
  ARRAY<FLOAT32>[0.2, 0.4, 0.9, 0.6])
LIMIT 2;

Results

name  notes
Lee   for shared cost of dinner
Dana  loved the lunch
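The IS NOT NULL filter matters because only some transfers carry notes and embeddings. The following minimal Python sketch mirrors that filter-then-rank pattern with the three illustrative notes_embeddings values from this page. The fourth edge, which has no embedding, is a hypothetical stand-in for transfers that were never updated.

```python
import math

def euclidean_distance(a, b):
    """Straight-line (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# (source account id, destination account id, notes, notes_embeddings)
transfers = [
    (16, 20, "for shared cost of dinner", [0.3, 0.5, 0.8, 0.7]),
    (20, 7, "fees for tuition", [0.1, 0.9, 0.1, 0.7]),
    (20, 16, "loved the lunch", [0.4, 0.5, 0.7, 0.9]),
    # Hypothetical edge without notes or an embedding (assumption for illustration).
    (7, 16, None, None),
]

# Illustrative embedding for 'food expenses'.
query = [0.2, 0.4, 0.9, 0.6]

# Equivalent of WHERE t.notes_embeddings IS NOT NULL: skip edges with no vector.
with_embeddings = [t for t in transfers if t[3] is not None]

# Equivalent of ORDER BY EUCLIDEAN_DISTANCE(...) LIMIT 2.
ranked = sorted(with_embeddings, key=lambda t: euclidean_distance(t[3], query))
top_notes = [t[2] for t in ranked[:2]]
print(top_notes)
```

Without the filter, the distance computation in this sketch would fail on the edge that has no vector, so excluding such edges up front keeps the ranking well defined.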

Create a vector index and find approximate nearest neighbors

To perform an ANN search, you must create a specialized vector index that Spanner Graph uses to accelerate the vector search. The vector index must use a specific distance metric. You can choose the distance metric most appropriate for your use case by setting the distance_type parameter to one of COSINE, DOT_PRODUCT, or EUCLIDEAN. For more information, see VECTOR INDEX statements.

In the following example, you create a vector index using the Euclidean distance type on the nick_name_embeddings column of the Account input table:

CREATE VECTOR INDEX NickNameEmbeddingIndex
ON Account(nick_name_embeddings)
WHERE nick_name_embeddings IS NOT NULL
OPTIONS (distance_type = 'EUCLIDEAN', tree_depth = 2, num_leaves = 1000);

Perform ANN vector search on graph nodes

After you create a vector index, you can perform an ANN vector search on the nick_name_embeddings property of the Account node. The ANN vector search returns the account owner's name and the account's nick_name. In the following example, the result shows the top two approximate nearest neighbors for the query "accounts for leisure travel and vacation", which is represented by the [0.2, 0.4, 0.9, 0.6] vector embedding.

The graph hint forces the query optimizer to use the specified vector index in the query execution plan.

GRAPH FinGraph
MATCH (@{FORCE_INDEX=NickNameEmbeddingIndex} a:Account)
WHERE a.nick_name_embeddings IS NOT NULL
RETURN a, APPROX_EUCLIDEAN_DISTANCE(a.nick_name_embeddings,
  -- An illustrative embedding for 'accounts for leisure travel and vacation'
  ARRAY<FLOAT32>[0.2, 0.4, 0.9, 0.6],
  options => JSON '{"num_leaves_to_search": 10}') AS distance
ORDER BY distance
LIMIT 2
NEXT
MATCH (p:Person)-[:Owns]->(a)
RETURN p.name, a.nick_name;

Results

name  nick_name
Alex  Fund for a refreshing tropical vacation
Dana  Saving up for travel
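The num_leaves and num_leaves_to_search options suggest a partition-based index: vectors are grouped into leaves, and a query scans only the leaves closest to it. The following Python sketch is a toy illustration of that general idea, not Spanner's actual implementation. The two leaf centroids, the second query vector, and the helper names build_leaves and ann_search are all assumptions made for this example.

```python
import math

def euclidean_distance(a, b):
    """Straight-line (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_leaves(vectors, centroids):
    """Assign each vector id to the leaf whose centroid is nearest."""
    leaves = {i: [] for i in range(len(centroids))}
    for vec_id, vec in vectors.items():
        nearest = min(range(len(centroids)),
                      key=lambda i: euclidean_distance(vec, centroids[i]))
        leaves[nearest].append(vec_id)
    return leaves

def ann_search(query, vectors, centroids, leaves, num_leaves_to_search, k):
    """Scan only the closest leaves, then rank the candidates found there."""
    leaf_order = sorted(range(len(centroids)),
                        key=lambda i: euclidean_distance(query, centroids[i]))
    candidates = [vid for leaf in leaf_order[:num_leaves_to_search]
                  for vid in leaves[leaf]]
    ranked = sorted(candidates, key=lambda vid: euclidean_distance(vectors[vid], query))
    return ranked[:k]

# Illustrative nick_name_embeddings values from this page, keyed by account id.
vectors = {
    7: [0.3, 0.5, 0.8, 0.7],
    16: [0.4, 0.9, 0.7, 0.1],
    20: [0.2, 0.5, 0.6, 0.6],
}
# Hypothetical leaf centroids (an assumption; a real index learns its own partitions).
centroids = [[0.25, 0.5, 0.7, 0.65], [0.4, 0.9, 0.7, 0.1]]
leaves = build_leaves(vectors, centroids)

travel_query = [0.2, 0.4, 0.9, 0.6]
print(ann_search(travel_query, vectors, centroids, leaves, 1, 2))

# A hypothetical second query that falls in the smaller leaf.
other_query = [0.4, 0.8, 0.7, 0.3]
print(ann_search(other_query, vectors, centroids, leaves, 1, 2))
print(ann_search(other_query, vectors, centroids, leaves, 2, 2))
```

For the travel query, scanning one leaf already returns the same two accounts as the exact search. For the second query, scanning one leaf finds only a single candidate, and scanning both leaves recovers the second neighbor. This is the trade-off in such a scheme: searching more leaves tends to improve recall at the cost of more work.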


Last updated 2025-11-24 UTC.