vector-embedding-api

vector-embedding-api provides a Flask API server and client to generate text embeddings using either OpenAI's embedding model or the SentenceTransformers library. The server supports in-memory LRU caching for faster retrieval, batch processing for handling multiple texts at once, and a health status endpoint for monitoring the server's status.

SentenceTransformers supports over 500 models via HuggingFace Hub.
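For context, the local model type wraps the standard SentenceTransformers API. A minimal sketch of loading the default model from server.conf and encoding text (illustrative only, not the server's exact code):

from sentence_transformers import SentenceTransformer

# Load the model named in server.conf (any Hugging Face Hub model ID works)
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# encode() returns one vector per input string
embeddings = model.encode(["Your text here", "Another text"])
print(embeddings.shape)  # (2, 384) for all-MiniLM-L6-v2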

Features 🎯

  • POST endpoint to create text embeddings
    • sentence_transformers
    • OpenAI text-embedding-ada-002
  • In-memory LRU cache for quick retrieval of embeddings
  • Batch processing to handle multiple texts in a single request
  • Easy setup with configuration file
  • Health status endpoint
  • Python client utility for submitting text or files

Installation 💻

To run this server locally, follow the steps below:

Clone the repository: 📦

git clone https://github.com/deadbits/vector-embedding-api.git
cd vector-embedding-api

Set up a virtual environment (optional but recommended): 🐍

virtualenv -p /usr/bin/python3.10 venv
source venv/bin/activate

Install the required dependencies: 🛠️

pip install -r requirements.txt

Usage

Modify the server.conf configuration file: ⚙️

[main]
openai_api_key = YOUR_OPENAI_API_KEY
sent_transformers_model = sentence-transformers/all-MiniLM-L6-v2
use_cache = true/false
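
These settings can be read with Python's standard configparser. The sketch below assumes the [main] section shown above and is illustrative only, not necessarily how server.py parses it:

import configparser

# Read the [main] section of server.conf
config = configparser.ConfigParser()
config.read("server.conf")

openai_api_key = config.get("main", "openai_api_key")
model_name = config.get("main", "sent_transformers_model")
use_cache = config.getboolean("main", "use_cache")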

Start the server: 🚀

python server.py

The server should now be running on http://127.0.0.1:5000/.

API Endpoints 🌐

Client Usage

A small Python client is provided to assist with submitting text strings or files.

Usage:

python3 client.py -t "Your text here" -m local
python3 client.py -f /path/to/yourfile.txt -m openai

POST /submit

Submits an individual text string or a list of text strings for embedding generation.

Request Parameters

  • text: The text string or list of text strings to generate the embedding for. (Required)
  • model: Type of model to be used, either local for SentenceTransformer models or openai for OpenAI's model. Default is local.

Response

  • embedding: The generated embedding array.
  • status: Status of the request, either success or error.
  • elapsed: Time taken to generate the embedding, in milliseconds.
  • model: The model used to generate the embedding.
  • cache: Boolean indicating if the result was retrieved from cache. (Optional)
  • message: Error message if the status is error. (Optional)
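
The endpoint can also be called directly with the requests library. A minimal sketch, assuming the default address from the Usage section (client.py wraps the same call):

import requests

# Submit a single text string for embedding
resp = requests.post(
    "http://127.0.0.1:5000/submit",
    json={"text": "Your text here", "model": "local"},
)

result = resp.json()[0]  # responses are returned as a list, one entry per input text
if result["status"] == "success":
    print(result["model"], result["elapsed"], len(result["embedding"]))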

GET /health

Checks the server's health status.

Response

  • cache.enabled: Boolean indicating whether the cache is enabled
  • cache.max_size: Maximum cache size
  • cache.size: Current cache size
  • models.openai: Boolean indicating if OpenAI embeddings are enabled. (Optional)
  • models.sentence-transformers: Name of sentence-transformers model in use.

{
  "cache": {
    "enabled": true,
    "max_size": 500,
    "size": 0
  },
  "models": {
    "openai": true,
    "sentence-transformers": "sentence-transformers/all-MiniLM-L6-v2"
  }
}
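
A quick way to probe the endpoint from Python (illustrative sketch, assuming the default server address):

import requests

# Check server health and print cache/model status
health = requests.get("http://127.0.0.1:5000/health").json()
print(health["cache"], health["models"])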

Example Usage

Send a POST request to the /submit endpoint with JSON payload:

{
  "text": "Your text here",
  "model": "local"
}

For a multi-text submission:

{
  "text": ["Text1 goes here", "Text2 goes here"],
  "model": "openai"
}

You'll receive a response containing the embedding and additional information:

[
  {
    "embedding": [...],
    "status": "success",
    "elapsed": 123,
    "model": "sentence-transformers/all-MiniLM-L6-v2"
  }
]

A multi-text submission returns one entry per input text:

[
  {
    "embedding": [...],
    "status": "success",
    "elapsed": 123,
    "model": "openai"
  },
  {
    "embedding": [...],
    "status": "success",
    "elapsed": 123,
    "model": "openai"
  }
]
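
A batch submission from Python might look like the following sketch (assuming the default server address; as shown above, the response is a list with one entry per input text):

import requests

# Submit several texts in one request and iterate over the per-text results
texts = ["Text1 goes here", "Text2 goes here"]
resp = requests.post(
    "http://127.0.0.1:5000/submit",
    json={"text": texts, "model": "openai"},
)

for text, result in zip(texts, resp.json()):
    if result["status"] == "success":
        print(text, "->", len(result["embedding"]), "dimensions")
    else:
        print(text, "->", result.get("message"))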
