parshvadaftari/vector-store

Vector Store

A small vector store built on NumPy, written to understand how semantic search works, with support for multiple similarity metrics.

Introduction

This project provides a lightweight vector store for semantic search. It uses the sentence-transformers library to embed documents and queries, and NumPy to compute similarity between the resulting vectors. Three metrics are supported: Euclidean distance, cosine similarity, and Manhattan distance.

Features

  • Embedding documents using sentence-transformers.
  • Support for multiple similarity metrics.
  • Efficient similarity calculations using numpy.
  • Batch query support.
  • Saving and loading the vector store to/from files.

Installation

To install the required dependencies, run:

pip install -r requirements.txt

Usage

Basic Usage

  1. Initialize the Vector Store:
from vectorstore.vectorstore import Vectorstore, SimMetric
from sentence_transformers import SentenceTransformer
from documents import document
docs = document.split('\n')
embedder = SentenceTransformer('all-MiniLM-L6-v2')
store = Vectorstore.from_docs(docs, embedder, similarity_metric=SimMetric.MANHATTAN)
# OR: construct the store directly, then build it explicitly
store = Vectorstore(docs, embedder, similarity_metric=SimMetric.MANHATTAN)
store.build_store()
  2. Single Query:
query = "What happens when someone steps into the circle of birch trees during the solstice?"
results, exectime = store.search(query, k=3)
print("Top results:", results[0])
print(f"Search time: {exectime} ms\n")
  3. Multiple Queries:
queries = ["What does Rachel discover in the library, and who presents it to her?", "What unusual phenomenon is associated with the ancient bell in the village?"]
batch_results, batch_exectime = store.search(queries, k=2)
print("Batch results:")
print("Query 1:", batch_results[0], "\n")
print("Query 2:", batch_results[1])
print(f"Search time: {batch_exectime} ms\n")
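
How the store ranks documents internally is not shown in this README. As a rough sketch of what a batched top-k search can look like in NumPy, assuming document and query embeddings are already available as 2-D arrays, the following ranks documents by cosine similarity. The function `top_k_cosine` and the toy vectors are hypothetical and are not part of this project's API.

```python
import numpy as np

def top_k_cosine(doc_vecs, query_vecs, k):
    """Return (indices, scores) of the k most similar docs for each query row."""
    # Normalize rows so that a dot product equals cosine similarity.
    docs = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    queries = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    sims = queries @ docs.T  # shape: (n_queries, n_docs)
    # Sort each row by descending similarity and keep the first k columns.
    order = np.argsort(-sims, axis=1)[:, :k]
    return order, np.take_along_axis(sims, order, axis=1)

# Toy 2-D "embeddings" standing in for real model output (assumption).
doc_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
query_vecs = np.array([[2.0, 0.1], [0.1, 3.0]])
indices, scores = top_k_cosine(doc_vecs, query_vecs, k=2)
print(indices)  # one row of document indices per query
```

Scoring all queries with a single matrix product avoids a Python loop over queries, which is the usual reason to pass a batch rather than searching one query at a time.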

Similarity Metrics

The vector store supports the following similarity metrics:

  • EUCLIDEAN: Euclidean distance.
  • COSINE: Cosine similarity.
  • MANHATTAN: Manhattan distance.

You can set the similarity metric when initializing the vector store:

store = Vectorstore.from_docs(docs, embedder, similarity_metric=SimMetric.COSINE)
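
For reference, the three metrics in this section can be written in a few lines of NumPy. This is a minimal sketch using the standard definitions, not the project's own code; the function names are hypothetical. Note that the two distances rank documents in ascending order (smaller is closer), while cosine similarity ranks in descending order (larger is closer).

```python
import numpy as np

def euclidean_distance(query, docs):
    # L2 norm of the difference; smaller means more similar.
    return np.linalg.norm(docs - query, axis=1)

def manhattan_distance(query, docs):
    # Sum of absolute coordinate differences; smaller means more similar.
    return np.abs(docs - query).sum(axis=1)

def cosine_similarity(query, docs):
    # Dot product divided by the product of norms; larger means more similar.
    return (docs @ query) / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query))

# Toy vectors: two "documents" and one "query" (assumption, for illustration).
docs = np.array([[3.0, 4.0], [1.0, 0.0]])
query = np.array([1.0, 0.0])

euclid = euclidean_distance(query, docs)
manhattan = manhattan_distance(query, docs)
cosine = cosine_similarity(query, docs)
print(euclid, manhattan, cosine)
```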

Saving and Loading the Vector Store

You can save and load the vector store to/from files using the save_store and load_store methods:

# Save the vector store
store.save_store('vector_store.npz')
# Load the vector store
store.load_store('vector_store.npz')
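
The `.npz` extension suggests NumPy's archive format, although this README does not show how `save_store` serializes its data. As an illustration only, a store's documents and embeddings could be round-tripped with `np.savez` and `np.load` as follows; `save_vectors` and `load_vectors` are hypothetical helpers, not this project's methods.

```python
import os
import tempfile
import numpy as np

def save_vectors(path, docs, vectors):
    # Store the raw texts and their embeddings as named arrays in one archive.
    np.savez(path, docs=np.array(docs), vectors=vectors)

def load_vectors(path):
    # Plain string and float arrays load without needing pickle support.
    with np.load(path) as data:
        return data["docs"].tolist(), data["vectors"]

docs = ["first document", "second document"]
vectors = np.array([[0.1, 0.2], [0.3, 0.4]])  # placeholder embeddings

path = os.path.join(tempfile.mkdtemp(), "vector_store.npz")
save_vectors(path, docs, vectors)
loaded_docs, loaded_vectors = load_vectors(path)
```

Persisting the embeddings alongside the texts means documents do not have to be re-embedded on every start, which is typically the slowest step.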
