Skip to content

This project implements a Key-Value storage engine as a console application. The system supports basic operations and builds upon the Log-Structured Merge-Tree (LSM) architecture to ensure efficient data persistence and retrieval.

Notifications You must be signed in to change notification settings

vasicm4/key-value-engine

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

NoSqlDB

Overview

This project implements a Key-Value storage engine as a console application. The system supports basic operations and builds upon the Log-Structured Merge-Tree (LSM) architecture to ensure efficient data persistence and retrieval.

Core Operations

  • PUT(key, value): Insert a new record. Keys are strings, values are byte arrays (or strings converted to byte arrays).
  • GET(key): Retrieve a value by key.
  • DELETE(key): Remove a record by key.

These operations rely on a well-defined write path (WAL → Memtable → SSTable) and read path (Memtable → Cache → SSTables with Bloom filters, Summary, Index, Data).

Components

Write Path

  • Write-Ahead Log (WAL): Records all operations for durability. Implemented as segmented logs with CRC integrity checks.
  • Memtable: In-memory structure (hash map, skip list, or B-tree) for fast inserts and deletes. Flushed to disk when full. Supports multiple instances (N-1 read-only, 1 read-write).
  • SSTable: Immutable disk structures containing Data, Bloom Filter, Index, Summary, and Metadata (Merkle tree for validation).

Read Path

  • Checks Memtable, then Cache, then queries SSTables using Bloom filters and index structures until the record is found.

LSM Trees

  • Organizes SSTables into levels.
  • Supports size tiered compaction algorithm.

Cache

  • Implemented with LRU policy to accelerate lookups.

Block Manager

  • Handles block-level disk I/O with Block Cache (LRU). Supports block sizes of 4KB, 8KB, or 16KB.

Extensions

  • External configuration via JSON for system parameters.
  • Rate limiting with Token Bucket algorithm.
  • Working with Probabilistic data structures: Bloom Filter, Count-Min Sketch, HyperLogLog, and SimHash (fingerprint + Hamming distance).

About

This project implements a Key-Value storage engine as a console application. The system supports basic operations and builds upon the Log-Structured Merge-Tree (LSM) architecture to ensure efficient data persistence and retrieval.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 5

Languages