ParallelWalk
Multiprocess directory walker
Designed for large file systems
ParallelWalk is intended for environments with very large storage arrays reached over very fast network connections. It scans directories in parallel across multiple processes and is meant to run on multicore systems.
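To illustrate the general idea of a multiprocess directory walk, here is a minimal sketch. It is not the code in walker.py: the names scan_dir and parallel_walk are hypothetical, and it uses a multiprocessing.Pool that scans one directory level at a time, whereas the -q option suggests the real tool is built around queues.

```python
import os
from multiprocessing import Pool


def scan_dir(path):
    """List a single directory and return (subdirectories, files)."""
    subdirs, files = [], []
    try:
        with os.scandir(path) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    subdirs.append(entry.path)
                else:
                    files.append(entry.path)
    except OSError:
        # Unreadable or vanished directory: skip it in this sketch.
        pass
    return subdirs, files


def parallel_walk(root, workers=4):
    """Walk root level by level, scanning each level's directories in parallel.

    Hypothetical helper for illustration; `workers` is an assumed parameter.
    """
    all_files = []
    pending = [root]
    with Pool(processes=workers) as pool:
        while pending:
            next_level = []
            # Each worker process lists one directory from the current level.
            for subdirs, files in pool.map(scan_dir, pending):
                next_level.extend(subdirs)
                all_files.extend(files)
            pending = next_level
    return all_files
```

When calling parallel_walk from a script, place the call under an `if __name__ == "__main__":` guard so that worker processes can import the module safely on platforms that do not fork.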
Baseline stats
On a 24-core system with a 10 Gbps network connection, ParallelWalk can scan over 6 million files in under 14 minutes.
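As a rough sanity check, the throughput implied by that figure can be worked out directly. The per-core number assumes the work is spread evenly over all 24 cores, which is an assumption and not something the measurement states.

```python
# Figures from the baseline: over 6 million files in under 14 minutes.
files_scanned = 6_000_000
elapsed_seconds = 14 * 60
cores = 24

# Lower bound on overall throughput, since more files were scanned in less time.
files_per_second = files_scanned / elapsed_seconds

# Assumption: every core contributes equally.
files_per_second_per_core = files_per_second / cores

print(f"at least {files_per_second:,.0f} files/s overall")
print(f"about {files_per_second_per_core:,.0f} files/s per core")
```

That works out to a little over 7,100 files per second overall.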
Usage: walker.py
usage: walker.py [-h] -p ROOT_PATH [-d] [-q QUEUE_SIZE]

optional arguments:
  -h, --help            show this help message and exit
  -p ROOT_PATH, --path ROOT_PATH
                        Root path to scan
  -d, --debug           Send debugging output to stderr
  -q QUEUE_SIZE, --queue-size QUEUE_SIZE
                        Length of queues. Should be larger than total file
                        count.
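For readers who want to see how the options fit together, the following argparse sketch would produce a usage line like the one above. It is a reconstruction from the help text, not the actual source of walker.py: the function name build_parser, the integer type for the queue size, and the absence of a default value are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="walker.py")
    parser.add_argument(
        "-p", "--path",
        dest="root_path",
        required=True,
        help="Root path to scan",
    )
    parser.add_argument(
        "-d", "--debug",
        action="store_true",
        help="Send debugging output to stderr",
    )
    parser.add_argument(
        "-q", "--queue-size",
        dest="queue_size",
        type=int,  # assumption: the queue length is parsed as an integer
        help="Length of queues. Should be larger than total file count.",
    )
    return parser


# Example with an illustrative path and queue size.
args = build_parser().parse_args(["-p", "/mnt/archive", "-q", "10000000"])
print(args.root_path, args.queue_size, args.debug)
```

Because the help text says the queue length should exceed the total file count, a tree of roughly 6 million files would call for a value such as the 10000000 used here; /mnt/archive is only a placeholder path.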
Usage: walker_attr.py
usage: walker_attr.py [-h] -p ROOT_PATH [-d] [-q QUEUE_SIZE]

optional arguments:
  -h, --help            show this help message and exit
  -p ROOT_PATH, --path ROOT_PATH
                        Root path to scan
  -d, --debug           Send debugging output to stderr
  -q QUEUE_SIZE, --queue-size QUEUE_SIZE
                        Length of queues. Should be larger than total file
                        count.