Skip to content

Obtaining aggregate information of hashed directory #7

@ruffsl

Description

@ruffsl

It would be handy to obtain an aggregate of the information that dirhash used to compute the final hash for the root directory. For example, in the form of a ordered dictionary data structure that could be pretty printed to a yaml or json file. These printed files could be easily diffable, enabling use cases for logging or highlighting file tree or content changes to end users.

I see from the scantree examples, the .apply() function can be used for such recursive transforms:

Details
hello_count_tree =  tree.apply(
    file_apply=lambda path: {
        'name': path.name,
        'count': sum([
            w.lower() == 'hello'
            for w in path.as_pathlib().read_text().split()
        ])
    },
    dir_apply=lambda dir_: {
        'name': dir_.path.name,
        'count': sum(e['count'] for e in dir_.entries),
        'sub_counts': [e for e in dir_.entries]
    },
)
from pprint import pprint
pprint(hello_count_tree)
{'count': 3,
 'name': 'dir',
 'sub_counts': [{'count': 2, 'name': 'file1.txt'},
                {'count': 1,
                 'name': 'd1',
                 'sub_counts': [{'count': 1, 'name': 'file2.txt'},
                                {'count': 0,
                                 'name': 'd2',
                                 'sub_counts': [{'count': 0,
                                                 'name': 'file3.txt'}]}]}]}

However, the root_node internally computed for this is not easily accessed without reimementing much of the library internals.

root_node = scantree(

_, dirhash_ = root_node.apply(file_apply=file_apply, dir_apply=dir_apply)

I'm also unsure yet how to leverage scantree with the RecursionPath class to render/print this aggregate data structure.

https://github.com/andhus/scantree/blob/25b51ca9be973389d671565de20cbb021871521d/src/scantree/_path.py#L16

Related: colcon/colcon-package-selection#44

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions