BUG: Race Condition Risk in EmbeddingService Model Initialization #193

@g-k-s-03

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

Issue Overview

The model property in EmbeddingService lazy-loads the SentenceTransformer model without any synchronization.
If several concurrent requests access the property before the first load finishes, each one can start its own load. In a multi-threaded or async environment this wastes memory and can exhaust resources or crash the service.

Steps to Reproduce

  1. Start the backend service.
  2. Trigger multiple concurrent calls to any method that accesses EmbeddingService.model (e.g., get_embedding or get_embeddings).
  3. Observe that the model is loaded multiple times concurrently (check logs or GPU memory usage).
  4. Optionally, run stress tests with multiple async profile summaries to see potential blocking or crashes.
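The steps above can be approximated without the real backend. The following is a minimal sketch, not the project's code: `FakeModel` and `UnsafeEmbeddingService` are hypothetical stand-ins that copy only the unsynchronized lazy-load pattern, with a short sleep in place of the slow model load.

```python
import threading
import time


class FakeModel:
    """Hypothetical stand-in for SentenceTransformer; counts how often it is built."""

    load_count = 0
    _count_lock = threading.Lock()

    def __init__(self) -> None:
        time.sleep(0.2)  # simulate a slow model load
        with FakeModel._count_lock:
            FakeModel.load_count += 1


class UnsafeEmbeddingService:
    """Hypothetical service mirroring the unsynchronized lazy load described here."""

    def __init__(self) -> None:
        self._model = None

    @property
    def model(self) -> FakeModel:
        if self._model is None:  # several threads can pass this check together
            self._model = FakeModel()
        return self._model


def count_loads(num_threads: int = 8) -> int:
    """Access `model` from many threads at once and return how many loads ran."""
    FakeModel.load_count = 0
    service = UnsafeEmbeddingService()
    barrier = threading.Barrier(num_threads)

    def worker() -> None:
        barrier.wait()  # release all threads at the same moment
        _ = service.model

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return FakeModel.load_count


if __name__ == "__main__":
    print("model loads:", count_loads())
```

On most machines this should report more than one load, although the exact number depends on thread scheduling.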

Expected Behavior
The model should be loaded only once, regardless of how many concurrent requests access it.
No race conditions or duplicate memory usage should occur.

Actual Behavior
Multiple threads or async tasks can instantiate the model multiple times concurrently.
This may lead to high memory usage, GPU resource exhaustion, or task failures.

Suggested Improvements
Use a thread-safe lock when lazy-loading the model:

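import threading

from sentence_transformers import SentenceTransformer

class EmbeddingService:
    # Class-level lock that guards lazy initialization of the model.
    _model_lock = threading.Lock()

    @property
    def model(self) -> SentenceTransformer:
        # Fast path: skip the lock once the model has been loaded.
        if self._model is None:
            with self._model_lock:
                # Re-check under the lock: another thread may have loaded it meanwhile.
                if self._model is None:
                    self._model = SentenceTransformer(self.model_name, device=self.device)
        return self._model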

This ensures that each EmbeddingService instance loads the model at most once, even when many threads access the property at the same time, which prevents duplicate loads and the extra memory they consume.
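The locking pattern can be checked in isolation. This is a minimal sketch under stated assumptions: `CountingModel` and `SafeEmbeddingService` are hypothetical stand-ins (no real SentenceTransformer is loaded), and the service is assumed to initialize `_model` to `None` in its constructor.

```python
import threading
import time


class CountingModel:
    """Hypothetical stand-in for SentenceTransformer; counts constructions."""

    load_count = 0
    _count_lock = threading.Lock()

    def __init__(self, name: str) -> None:
        time.sleep(0.1)  # simulate a slow model load
        self.name = name
        with CountingModel._count_lock:
            CountingModel.load_count += 1


class SafeEmbeddingService:
    """Hypothetical service using double-checked locking for the lazy load."""

    _model_lock = threading.Lock()

    def __init__(self, model_name: str = "fake-model") -> None:
        self.model_name = model_name
        self._model = None  # assumed initial state

    @property
    def model(self) -> CountingModel:
        if self._model is None:  # fast path once loaded
            with self._model_lock:
                if self._model is None:  # re-check while holding the lock
                    self._model = CountingModel(self.model_name)
        return self._model


def count_loads_safe(num_threads: int = 8) -> int:
    """Access `model` from many threads at once and return how many loads ran."""
    CountingModel.load_count = 0
    service = SafeEmbeddingService()
    barrier = threading.Barrier(num_threads)

    def worker() -> None:
        barrier.wait()  # release all threads at the same moment
        _ = service.model

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return CountingModel.load_count


if __name__ == "__main__":
    print("model loads:", count_loads_safe())
```

Because the lock is a class attribute, loads on different service instances are also serialized. A per-instance lock created in the constructor would avoid that, at the cost of one lock object per instance.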

Record

  • I agree to follow this project's Code of Conduct
  • I want to work on this issue
