Add LevelDB #5
Conversation
Please let us know if there is anything we can do to move this PR forward or to ease the review process.

If you could clone me, that'd be great. Unfortunately this is a huge PR and I simply have not gotten around to looking at it yet. Between reviewing all the other PRs and working on large PRs myself, I'm simply stretched thin.
Schamper left a comment
How you doin'? There are a lot of unnecessary patterns in here (class-level type hints for no apparent reason, methods that could easily be inlined or more easily replaced by inheritance). You can take my comments on the earlier files as generic comments for the rest as well (it's very slow to review a large PR on GitHub).
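For illustration, a generic sketch (not code from this PR) of the class-level type hint pattern being flagged: a bare annotation only populates `__annotations__` and is redundant when the attribute is assigned in `__init__` from an already-annotated parameter:

```python
class Example:
    host: str  # bare class-level annotation: creates no attribute, only an __annotations__ entry

    def __init__(self, host: str):
        # the parameter annotation already documents the type; the class-level
        # annotation above adds nothing unless this is a dataclass or similar
        self.host = host
```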
```python
for record in self._leveldb.records:
    if record.state == c_leveldb.RecordState.LIVE and (
        record.key[0:5] == b"META:" or record.key[0:11] == b"METAACCESS:"
```
```diff
-record.key[0:5] == b"META:" or record.key[0:11] == b"METAACCESS:"
+record.key.startswith((b"META:", b"METAACCESS:"))
```
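For reference, `bytes.startswith` accepts a tuple of prefixes and matches if any of them match, which also avoids hardcoding the slice lengths:

```python
assert b"META:example".startswith((b"META:", b"METAACCESS:"))
assert b"METAACCESS:example".startswith((b"META:", b"METAACCESS:"))
assert not b"VERSION".startswith((b"META:", b"METAACCESS:"))
```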
```python
meta_keys.setdefault(meta_key.key, [])
meta_keys[meta_key.key].append(meta_key)
```
```diff
-meta_keys.setdefault(meta_key.key, [])
-meta_keys[meta_key.key].append(meta_key)
+meta_keys.setdefault(meta_key.key, []).append(meta_key)
```
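For reference, `dict.setdefault` returns the stored value (inserting the default first if the key is missing), so the lookup and the append chain into one expression:

```python
meta_keys: dict[bytes, list] = {}
meta_keys.setdefault(b"key", []).append("first")   # inserts the new list, then appends
meta_keys.setdefault(b"key", []).append("second")  # reuses the existing list
assert meta_keys == {b"key": ["first", "second"]}
```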
```python
for meta in meta_keys.values():
    yield Store(self, meta)
```
```diff
-for meta in meta_keys.values():
-    yield Store(self, meta)
+return [Store(self, meta) for meta in meta_keys.values()]
```
```python
host: str
records: list[Key]
meta: list[MetaKey]
```
```diff
-host: str
-records: list[Key]
-meta: list[MetaKey]
```
```python
@property
def records(self) -> Iterator[RecordKey]:
    """Yield all records related to this store."""

    if self._records:
        yield from self._records

    # e.g. with "_https://google.com\x00\x01MyKey", the prefix would be "_https://google.com\x00"
    prefix = RecordKey.prefix + self.host.encode("iso-8859-1") + b"\x00"
    prefix_len = len(prefix)

    for record in self._local_storage._leveldb.records:
        if record.key[:prefix_len] == prefix:
            key = RecordKey(self, record.key, record.value, record.state, record.sequence)
            self._records.append(key)
            yield key
```
A few things about this:
- A property that is a generator doesn't feel very safe/stable.
- The cache is dangerous: as soon as you do a single generator iteration (but don't exhaust the generator), subsequent iterations will only yield the records read up until then.
- The cache in its current implementation will yield duplicate records (it doesn't exit after reading from the cached records).

It's probably fine not to cache this.
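A minimal sketch of the uncached variant this suggests, reusing names from the diff above (`RecordKey`, `self.host`, `self._local_storage._leveldb`); a sketch, not a definitive implementation:

```python
@property
def records(self) -> Iterator[RecordKey]:
    """Yield all records related to this store."""
    prefix = RecordKey.prefix + self.host.encode("iso-8859-1") + b"\x00"

    # No cache: every access re-reads the LevelDB records, avoiding both the
    # partial-iteration and the duplicate-yield issues described above.
    for record in self._local_storage._leveldb.records:
        if record.key.startswith(prefix):
            yield RecordKey(self, record.key, record.value, record.state, record.sequence)
```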
```python
class BlinkHostObjectHandlerDecodeError(v8serialize.DecodeV8SerializeError):
```
Importing this file will now fail when the dependency is missing.
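One common way to soften this (a sketch, assuming the subclass can live behind the import guard):

```python
try:
    import v8serialize

    class BlinkHostObjectHandlerDecodeError(v8serialize.DecodeV8SerializeError):
        """Raised when a Blink host object cannot be decoded."""

except ImportError:
    # The module still imports without v8serialize; consumers must check for
    # None (or catch the resulting NameError) before using IndexedDB support.
    v8serialize = None
```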
Isn't this more a LevelDB util?
```python
    - https://github.com/protocolbuffers/protobuf/blob/main/python/google/protobuf/internal/decoder.py
    """

    varint_limit: int = 10
```
Just `limit` or `LIMIT` might be a better name.
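For context, a minimal sketch of the kind of protobuf-style varint decoder the snippet references, with the 10-byte limit (the maximum length of a 64-bit varint) as a constant per the naming suggestion:

```python
VARINT_LIMIT = 10  # a 64-bit value encodes to at most 10 varint bytes

def decode_varint(buf: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode an unsigned varint from buf, returning (value, new_offset)."""
    result = 0
    for i in range(VARINT_LIMIT):
        byte = buf[offset + i]
        result |= (byte & 0x7F) << (7 * i)  # 7 payload bits per byte, little-endian
        if not byte & 0x80:  # a clear high bit marks the final byte
            return result, offset + i + 1
    raise ValueError("varint exceeds maximum length")
```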
```toml
leveldb = [
    "cramjam>=2.11.0,<3",  # required for snappy decompression
```
Can hopefully soon be replaced with dissect.util once that's merged.
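For reference, a sketch of how cramjam might be used here; this assumes LevelDB's block compression uses the raw (unframed) Snappy format, which cramjam exposes as `snappy.decompress_raw`:

```python
import cramjam

def decompress_block(data: bytes) -> bytes:
    # cramjam returns a cramjam.Buffer; bytes() copies it out
    return bytes(cramjam.snappy.decompress_raw(data))
```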
Shouldn't these files be in `tests/_data/indexeddb` for the IndexedDB ones?
This PR adds a LevelDB storage implementation to dissect.database. It also adds support for serialization formats built on top of LevelDB: IndexedDB, and Chromium's LocalStorage and SessionStorage. Please let us know if these formats should be structured differently in this project.

It makes use of two (pure Python and/or Rust) dependencies: cramjam (for LevelDB Snappy decompression) and v8serialize (for IndexedDB v8 JavaScript object deserialization). We do not have the time or resources to port these dependencies to dissect.util or dissect.*, so hopefully these dependencies can be accepted.
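A hypothetical usage sketch based on the snippets quoted in this review; the import path and class name are illustrative, not necessarily the PR's actual API:

```python
from dissect.database.leveldb import LevelDB  # hypothetical import path

db = LevelDB("path/to/leveldb")  # hypothetical constructor

# The record attributes (key, value, state, sequence) appear in the diffs above.
for record in db.records:
    print(record.sequence, record.state, record.key, record.value)
```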