Obtaining aggregate information of hashed directory #7
Description
It would be handy to obtain an aggregate of the information that dirhash uses to compute the final hash for the root directory, for example as an ordered dictionary that could be pretty-printed to a YAML or JSON file. Such files would be easy to diff, which enables use cases like logging, or highlighting file tree or content changes to end users.
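To make the request concrete, here is a minimal sketch of the kind of aggregate I have in mind, written with the standard library only. The function names `build_manifest` and `dump_manifest`, the dictionary keys, and the way child hashes are combined are all assumptions made for illustration; this is not dirhash's actual algorithm, and the hashes it produces will not match dirhash output.

```python
import hashlib
import json
import os


def build_manifest(directory):
    """Return a nested dict describing `directory`, one node per entry.

    NOTE: stand-in scheme for illustration only, not dirhash's algorithm.
    """
    entries = []
    # Sort by name so the structure (and the hash) is deterministic.
    for entry in sorted(os.scandir(directory), key=lambda e: e.name):
        if entry.is_dir(follow_symlinks=False):
            entries.append(build_manifest(entry.path))
        elif entry.is_file(follow_symlinks=False):
            with open(entry.path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            entries.append({"name": entry.name, "hash": digest})

    # Hypothetical combination rule: hash each child's name and hash in order.
    combined = hashlib.sha256()
    for child in entries:
        combined.update(child["name"].encode())
        combined.update(b"\0")
        combined.update(child["hash"].encode())
        combined.update(b"\0")

    return {
        "name": os.path.basename(os.path.abspath(directory)),
        "hash": combined.hexdigest(),
        "entries": entries,
    }


def dump_manifest(manifest):
    """Serialize with stable formatting so two dumps diff cleanly."""
    return json.dumps(manifest, indent=2)
```

Writing `dump_manifest(build_manifest(path))` to a file before and after a change would give two text files whose diff points at the changed file and at every directory hash above it.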
I see from the scantree examples that the .apply() method can be used for such recursive transforms:
hello_count_tree = tree.apply(
    file_apply=lambda path: {
        'name': path.name,
        'count': sum([
            w.lower() == 'hello'
            for w in path.as_pathlib().read_text().split()
        ])
    },
    dir_apply=lambda dir_: {
        'name': dir_.path.name,
        'count': sum(e['count'] for e in dir_.entries),
        'sub_counts': [e for e in dir_.entries]
    },
)
from pprint import pprint
pprint(hello_count_tree)

{'count': 3,
 'name': 'dir',
 'sub_counts': [{'count': 2, 'name': 'file1.txt'},
                {'count': 1,
                 'name': 'd1',
                 'sub_counts': [{'count': 1, 'name': 'file2.txt'},
                                {'count': 0,
                                 'name': 'd2',
                                 'sub_counts': [{'count': 0,
                                                 'name': 'file3.txt'}]}]}]}

However, the root_node that dirhash computes internally for this is not easily accessed without reimplementing much of the library internals.
dirhash-python/src/dirhash/__init__.py, line 269 in 37c8974:

    root_node = scantree(

dirhash-python/src/dirhash/__init__.py, line 296 in 37c8974:

    _, dirhash_ = root_node.apply(file_apply=file_apply, dir_apply=dir_apply)
I'm also not yet sure how to leverage scantree with the RecursionPath class to render or print this aggregate data structure.
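Independently of how the aggregate is obtained, the rendering and diffing step could look roughly like the sketch below. It assumes the aggregate is a plain nested dict like the `hello_count_tree` output above; `diff_aggregates` is a hypothetical helper name, and only `json` and `difflib` from the standard library are used.

```python
import difflib
import json


def diff_aggregates(old, new):
    """Return unified-diff lines between two nested-dict aggregates."""
    # sort_keys and a fixed indent keep the rendering stable between runs.
    old_lines = json.dumps(old, indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(new, indent=2, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="before", tofile="after", lineterm=""
        )
    )


before = {
    "name": "dir",
    "count": 3,
    "sub_counts": [{"name": "file1.txt", "count": 2}],
}
after = {
    "name": "dir",
    "count": 2,
    "sub_counts": [{"name": "file1.txt", "count": 1}],
}

for line in diff_aggregates(before, after):
    print(line)
```

An identical pair of aggregates yields an empty list, so the same helper could double as a cheap "did anything change" check in logs.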
Related: colcon/colcon-package-selection#44