Mlawrence95/postgres-vector-database-service

Vector Database Retrieval API

Monitoring out of the box for API and Database

Middleware pipes FastAPI telemetry to Prometheus and Grafana. To see API metrics with no additional work, import the Grafana dashboard from infra/grafana/dashboard.json.

The dashboard is adapted from https://grafana.com/grafana/dashboards/16110-fastapi-observability/

Running and Debugging

To isolate an issue, you can generally use the commands in /scripts or the code in /tests. The DB can be run in isolation, but the API depends on the DB. When the service is up, visit http://localhost:8080/docs to view the FastAPI docs.

Build containers with code snapshot

bash scripts/build_container.sh

Run the service

docker compose up

Run with hot reloads for API code (slow)

docker compose watch retrieval_service

Attach shell to the API service

docker compose exec retrieval_service /bin/bash 

Manually populate the test database (rerunning duplicates data entries)

docker compose exec retrieval_service \
    /bin/bash -c "source scripts/api_debug_setup.sh"

Reset database values

Data persists between sessions. To delete it, run:

bash scripts/wipe_database.sh

Warning: This will wipe real data too! Only use on test runs.

Tear down

docker compose down

Monitor or restart services

docker compose ps 
docker compose restart retrieval_service

Dev environment setup

Prereqs:

  • Docker
  • Conda

conda env create -f environment.yaml
conda activate poetry_env
poetry install

Updating Python Dependencies

Add the dependency change to pyproject.toml, then update the poetry.lock file:

conda activate poetry_env
poetry update

Finally, rebuild the Docker containers.

Ping Service

Single embedding

curl -X 'POST' \
  'http://localhost:8080/similar' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "embedding": [
    0.5,
    0.2,
    -0.1
  ],
  "k": 50,
  "metric": "cosine_distance"
}'

Supported metrics are max_inner_product, cosine_distance, and L2_distance.
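
The same request can be made from Python. The following is a minimal sketch using only the standard library; it assumes the endpoint and payload shape shown in the curl example above. The helper names `build_similar_request` and `query_similar` are hypothetical and are not part of this repository.

```python
import json
import urllib.request

# Metric names as listed in this README.
SUPPORTED_METRICS = {"max_inner_product", "cosine_distance", "L2_distance"}


def build_similar_request(embedding, k, metric="cosine_distance",
                          base_url="http://localhost:8080"):
    """Build (but do not send) a POST request for the /similar endpoint."""
    if metric not in SUPPORTED_METRICS:
        raise ValueError(f"unsupported metric: {metric!r}")
    payload = {"embedding": list(embedding), "k": k, "metric": metric}
    return urllib.request.Request(
        url=f"{base_url}/similar",
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json",
                 "Content-Type": "application/json"},
        method="POST",
    )


def query_similar(embedding, k, metric="cosine_distance"):
    """Send the request and return the decoded JSON response."""
    request = build_similar_request(embedding, k, metric)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

With the service running, `query_similar([0.5, 0.2, -0.1], k=50)` sends the same body as the curl command. Validating the metric name on the client side is a convenience of this sketch; how the API itself responds to an unknown metric is not covered here.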

Bulk inference

curl -X 'POST' \
  'http://localhost:8080/bulk_similar' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "embedding_list": [
    [0.5,0.2,-0.1],
    [0.1,0.89,-0.21]
  ],
  "k": 2,
  "metric": "cosine_distance"
}' \
| jq .

Example output:

{
  "most_similar": [
    [
      {
        "name": "sherbert",
        "embedding": [
          11.0,
          2.0,
          3.0
        ],
        "distance": 0.11676757169186391
      },
      {
        "name": "kuma",
        "embedding": [
          1.0,
          6.0,
          3.0
        ],
        "distance": 0.6231326316557115
      }
    ],
    [
      {
        "name": "kuma",
        "embedding": [
          1.0,
          6.0,
          3.0
        ],
        "distance": 0.22904390253150264
      },
      {
        "name": "mike",
        "embedding": [
          1.0,
          2.0,
          3.0
        ],
        "distance": 0.6368304002307885
      }
    ]
  ]
}
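
The distances in the output above are consistent with cosine distance defined as 1 minus cosine similarity. The sketch below recomputes them in plain Python from the query vectors and the returned embeddings. It is a sanity check only, not the service's implementation, and it ranks only the three embeddings that appear in the output.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Query vectors from the bulk request, stored embeddings from the output.
queries = [[0.5, 0.2, -0.1], [0.1, 0.89, -0.21]]
stored = {
    "sherbert": [11.0, 2.0, 3.0],
    "kuma": [1.0, 6.0, 3.0],
    "mike": [1.0, 2.0, 3.0],
}

for query in queries:
    # Smaller distance means more similar, so sort ascending.
    ranked = sorted(stored, key=lambda name: cosine_distance(query, stored[name]))
    for name in ranked[:2]:  # k = 2, as in the request
        print(name, cosine_distance(query, stored[name]))
```

Each query prints its two nearest names in the same order as the corresponding inner list of most_similar.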

Load testing

To send requests in bulk for testing, run:

python scripts/spam_requests.py --spam_seconds=10
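
For intuition, a time-bounded request loop could look roughly like the sketch below. This is not the contents of scripts/spam_requests.py, which is not shown here. The function `spam` and its `send` and `clock` parameters are hypothetical names chosen for illustration.

```python
import time


def spam(send, spam_seconds, clock=time.monotonic):
    """Call send() repeatedly for spam_seconds; return (ok, failed) counts."""
    ok = failed = 0
    deadline = clock() + spam_seconds
    while clock() < deadline:
        try:
            send()  # one request; any exception counts as a failure
            ok += 1
        except Exception:
            failed += 1
    return ok, failed
```

Here `send` would be any zero-argument callable that issues one request, for example a POST to http://localhost:8080/similar. Taking the clock as a parameter keeps the loop easy to test without waiting in real time.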

Monitoring

Metrics from the Retrieval API are scraped by Prometheus and can be visualized in Grafana.

Existing features

  • Vector Database via custom extension on Postgres
  • API for querying the nearest neighbors
  • Limited upload support
  • Metrics/Monitoring dash via Prometheus/Grafana
    • Dashboard for API and Database
  • Bulk inference

Future work

About

Vector database service in Postgres with monitoring (docker swarm)
