
Conversation

@JSCU-CNI

This PR adds a LevelDB storage implementation to dissect.database.

Also adds support for serialization formats that build on top of LevelDB: IndexedDB, and Chromium's LocalStorage and SessionStorage. Please let us know if these formats should be structured differently in this project.

Makes use of two (pure Python and/or Rust) dependencies: cramjam (for LevelDB Snappy decompression) and v8serialize (for IndexedDB V8 JavaScript object deserialization). We do not have the time or resources to port these dependencies to dissect.util or dissect.*; hopefully these dependencies can be accepted.

@JSCU-CNI (Author) commented Nov 5, 2025

Please let us know if there is anything we can do to move this PR forward or to ease the review process.

@Schamper (Member) commented Nov 5, 2025

> Please let us know if there is anything we can do to move this PR forward or to ease the review process.

If you could clone me, that'd be great.

Unfortunately this is a huge PR and I simply have not gotten around to looking at it yet. Between reviewing all the other PRs and working on large PRs myself, I'm simply stretched thin.
What would help is if you could provide me with a prioritization of PRs from your side, so I can look at them in that order.

@Schamper (Member) left a comment:

How you doin'? There are a lot of unnecessary patterns in here: class-level type hints for no apparent reason, and methods that could easily be inlined or replaced by inheritance. You can take my comments on the earlier files as generic comments that apply to the rest as well (it's very slow to review a large PR on GitHub).


```python
for record in self._leveldb.records:
    if record.state == c_leveldb.RecordState.LIVE and (
        record.key[0:5] == b"META:" or record.key[0:11] == b"METAACCESS:"
```
@Schamper (Member):

Suggested change:
```diff
-        record.key[0:5] == b"META:" or record.key[0:11] == b"METAACCESS:"
+        record.key.startswith((b"META:", b"METAACCESS:"))
```
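For reference, `bytes.startswith` accepts a tuple of prefixes, so the two slice comparisons collapse into a single call. A minimal sketch (the keys below are made up for illustration):

```python
# bytes.startswith accepts a tuple of prefixes; any match returns True.
keys = [b"META:example.com", b"METAACCESS:example.com", b"VERSION"]

for key in keys:
    # Equivalent to key[0:5] == b"META:" or key[0:11] == b"METAACCESS:"
    print(key, key.startswith((b"META:", b"METAACCESS:")))
```

This also avoids keeping the prefix lengths (5 and 11) in sync with the literals by hand.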

Comment on lines +52 to +53
```python
meta_keys.setdefault(meta_key.key, [])
meta_keys[meta_key.key].append(meta_key)
```
@Schamper (Member):

Suggested change:
```diff
-meta_keys.setdefault(meta_key.key, [])
-meta_keys[meta_key.key].append(meta_key)
+meta_keys.setdefault(meta_key.key, []).append(meta_key)
```
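The chained form works because `dict.setdefault` returns the value stored under the key, inserting the default first when the key is missing. A small sketch with hypothetical keys:

```python
meta_keys: dict[bytes, list[str]] = {}

# setdefault returns the list stored under the key (creating it if absent),
# so the append can be chained onto the same expression.
meta_keys.setdefault(b"host-a", []).append("first")
meta_keys.setdefault(b"host-a", []).append("second")

print(meta_keys)  # {b'host-a': ['first', 'second']}
```

A `collections.defaultdict(list)` would achieve the same in a loop, at the cost of a different mapping type.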

Comment on lines +55 to +56
```python
for meta in meta_keys.values():
    yield Store(self, meta)
```
@Schamper (Member):

Suggested change:
```diff
-for meta in meta_keys.values():
-    yield Store(self, meta)
+return [Store(self, meta) for meta in meta_keys.values()]
```

Comment on lines +68 to +71
```python
host: str
records: list[Key]
meta: list[MetaKey]
```
@Schamper (Member):

Suggested change (remove these lines):
```diff
-host: str
-records: list[Key]
-meta: list[MetaKey]
```
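For context on the "class-level type hints for no apparent reason" remark above: on a plain class (not a dataclass), bare annotations like these create no attributes at runtime and only populate `__annotations__`. A minimal sketch of the difference, with hypothetical classes:

```python
from dataclasses import dataclass


class Plain:
    # A bare annotation creates no attribute: Plain().host raises AttributeError.
    host: str


@dataclass
class Data:
    # Under @dataclass, the same annotation generates an __init__ parameter.
    host: str


print(Plain.__annotations__)       # {'host': <class 'str'>}
print(Data(host="example").host)   # example
```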

Comment on lines +83 to +98
```python
@property
def records(self) -> Iterator[RecordKey]:
    """Yield all records related to this store."""

    if self._records:
        yield from self._records

    # e.g. with "_https://google.com\x00\x01MyKey", the prefix would be "_https://google.com\x00"
    prefix = RecordKey.prefix + self.host.encode("iso-8859-1") + b"\x00"
    prefix_len = len(prefix)

    for record in self._local_storage._leveldb.records:
        if record.key[:prefix_len] == prefix:
            key = RecordKey(self, record.key, record.value, record.state, record.sequence)
            self._records.append(key)
            yield key
```
@Schamper (Member):

A few things about this:

  • A property that is a generator doesn't feel very safe/stable.
  • The cache is dangerous: as soon as you do a single partial iteration (without exhausting the generator), a later call will only iterate the records that had been read up until then.
  • The cache in its current implementation will yield duplicate records: it doesn't return after yielding from the cache, so it falls through and reads everything again.

It's probably fine not caching this.
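One way to keep memoization without the partial-iteration and duplicate-yield hazards described above is to cache a fully materialized list instead of caching inside a generator; a minimal sketch under simplified names (not the PR's code):

```python
from functools import cached_property


class Store:
    def __init__(self, raw_records: list[bytes]):
        self._raw_records = raw_records

    @cached_property
    def records(self) -> list[bytes]:
        # The list is built in full on first access and stored on the
        # instance, so every later access sees the same complete result.
        return [r for r in self._raw_records if r.startswith(b"_")]


store = Store([b"_https://example.com\x00\x01MyKey", b"META:example.com"])
print(store.records)
print(store.records)  # second access hits the cache, no re-iteration
```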

```python
)


class BlinkHostObjectHandlerDecodeError(v8serialize.DecodeV8SerializeError):
```
@Schamper (Member):

Importing this file will now fail when the dependency is missing.
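One possible pattern (not necessarily how the maintainers would want it solved) is to guard the import and define the subclass conditionally, so the module still imports when the optional dependency is absent; a sketch:

```python
try:
    import v8serialize
except ImportError:
    v8serialize = None

if v8serialize is not None:
    # The subclass can only be defined when the dependency is installed.
    class BlinkHostObjectHandlerDecodeError(v8serialize.DecodeV8SerializeError):
        pass
else:
    class BlinkHostObjectHandlerDecodeError(Exception):
        """Fallback so importing this module does not require v8serialize."""
```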

@Schamper (Member):

Isn't this more a LevelDB util?

```python
- https://github.com/protocolbuffers/protobuf/blob/main/python/google/protobuf/internal/decoder.py
"""

varint_limit: int = 10
```
@Schamper (Member):

Just limit or LIMIT might be a better name.
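For context, the limit of 10 matches the maximum encoded length of a 64-bit protobuf-style varint: 7 payload bits per byte, so ceil(64 / 7) = 10 bytes. A minimal decoder sketch under that assumption (illustrative, not the PR's code):

```python
LIMIT = 10  # ceil(64 / 7): a 64-bit varint never spans more than 10 bytes


def decode_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode a little-endian base-128 varint; return (value, next position)."""
    result = shift = 0
    for i in range(LIMIT):
        byte = buf[pos + i]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # a clear high bit marks the final byte
            return result, pos + i + 1
        shift += 7
    raise ValueError("varint exceeds 10 bytes; input is likely corrupt")


print(decode_varint(b"\xac\x02"))  # (300, 2)
```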

```toml
]

leveldb = [
    "cramjam>=2.11.0,<3", # required for snappy decompression
```
@Schamper (Member):

Can hopefully soon be replaced with dissect.util once that's merged.

@Schamper (Member):

Shouldn't these files be in `tests/_data/indexeddb` for the IndexedDB ones?
