w_crawl
usage:
w_crawl [-h] [-r RCFILE] [--quiet | --verbose | --debug] [--version]
[--max-queue-length MAX_QUEUE_LENGTH] [-W WEST_H5FILE] [--first-iter N_ITER]
[--last-iter N_ITER] [-c CRAWLER_INSTANCE]
[--serial | --parallel | --work-manager WORK_MANAGER] [--n-workers N_WORKERS]
[--zmq-mode MODE] [--zmq-comm-mode COMM_MODE] [--zmq-write-host-info INFO_FILE]
[--zmq-read-host-info INFO_FILE] [--zmq-upstream-rr-endpoint ENDPOINT]
[--zmq-upstream-ann-endpoint ENDPOINT] [--zmq-downstream-rr-endpoint ENDPOINT]
[--zmq-downstream-ann-endpoint ENDPOINT] [--zmq-master-heartbeat MASTER_HEARTBEAT]
[--zmq-worker-heartbeat WORKER_HEARTBEAT] [--zmq-timeout-factor FACTOR]
[--zmq-startup-timeout STARTUP_TIMEOUT] [--zmq-shutdown-timeout SHUTDOWN_TIMEOUT]
task_callable
Crawl a weighted ensemble dataset, executing a function for each iteration. This can be used for postprocessing of trajectories, cleanup of datasets, or anything else that can be expressed as “do X for iteration N, then do something with the result”. Tasks are parallelized by iteration, and no guarantees are made about evaluation order.
Command-line options
optional arguments:
-h, --help show this help message and exit
general options:
-r RCFILE, --rcfile RCFILE
use RCFILE as the WEST run-time configuration file (default: west.cfg)
--quiet emit only essential information
--verbose emit extra information
--debug enable extra checks and emit copious information
--version show program's version number and exit
parallelization options:
--max-queue-length MAX_QUEUE_LENGTH
Maximum number of tasks that can be queued. Useful to limit RAM use for tasks
that have very large requests/responses. Default: no limit.
WEST input data options:
-W WEST_H5FILE, --west-data WEST_H5FILE
Take WEST data from WEST_H5FILE (default: read from the HDF5 file specified in
west.cfg).
iteration range:
--first-iter N_ITER Begin analysis at iteration N_ITER (default: 1).
--last-iter N_ITER Conclude analysis with N_ITER, inclusive (default: last completed iteration).
task options:
-c CRAWLER_INSTANCE, --crawler-instance CRAWLER_INSTANCE
Use CRAWLER_INSTANCE (specified as module.instance) as an instance of
WESTPACrawler to coordinate the calculation. Required only if initialization,
finalization, or task result processing is required.
task_callable Run TASK_CALLABLE (specified as module.function) on each iteration. Required.
parallelization options:
--serial run in serial mode
--parallel run in parallel mode (using processes)
--work-manager WORK_MANAGER
use the given work manager for parallel task distribution. Available work
managers are ('serial', 'threads', 'processes', 'zmq'); default is 'serial'
--n-workers N_WORKERS
Use up to N_WORKERS on this host, for work managers which support this option.
Use 0 for a dedicated server. (Ignored by work managers which do not support
this option.)
options for ZeroMQ (“zmq”) work manager (master or node):
--zmq-mode MODE Operate as a master (server) or a node (workers/client). "server" is a
deprecated synonym for "master" and "client" is a deprecated synonym for
"node".
--zmq-comm-mode COMM_MODE
Use the given communication mode -- TCP or IPC (Unix-domain) -- sockets for
communication within a node. IPC (the default) may be more efficient but is not
available on (exceptionally rare) systems without node-local storage (e.g.
/tmp); on such systems, TCP may be used instead.
--zmq-write-host-info INFO_FILE
Store hostname and port information needed to connect to this instance in
INFO_FILE. This allows the master and nodes assisting in coordinating the
communication of other nodes to choose ports randomly. Downstream nodes read
this file with --zmq-read-host-info and know how to connect.
--zmq-read-host-info INFO_FILE
Read hostname and port information needed to connect to the master (or other
coordinating node) from INFO_FILE. This allows the master and nodes assisting
in coordinating the communication of other nodes to choose ports randomly,
writing that information with --zmq-write-host-info for this instance to read.
--zmq-upstream-rr-endpoint ENDPOINT
ZeroMQ endpoint to which to send request/response (task and result) traffic
toward the master.
--zmq-upstream-ann-endpoint ENDPOINT
ZeroMQ endpoint on which to receive announcement (heartbeat and shutdown
notification) traffic from the master.
--zmq-downstream-rr-endpoint ENDPOINT
ZeroMQ endpoint on which to listen for request/response (task and result)
traffic from subsidiary workers.
--zmq-downstream-ann-endpoint ENDPOINT
ZeroMQ endpoint on which to send announcement (heartbeat and shutdown
notification) traffic toward workers.
--zmq-master-heartbeat MASTER_HEARTBEAT
Every MASTER_HEARTBEAT seconds, the master announces its presence to workers.
--zmq-worker-heartbeat WORKER_HEARTBEAT
Every WORKER_HEARTBEAT seconds, workers announce their presence to the master.
--zmq-timeout-factor FACTOR
Scaling factor for heartbeat timeouts. If the master doesn't hear from a worker
in WORKER_HEARTBEAT*FACTOR, the worker is assumed to have crashed. If a worker
doesn't hear from the master in MASTER_HEARTBEAT*FACTOR seconds, the master is
assumed to have crashed. Both cases result in shutdown.
--zmq-startup-timeout STARTUP_TIMEOUT
Amount of time (in seconds) to wait for communication between the master and at
least one worker. This may need to be changed on very large, heavily-loaded
computer systems that start all processes simultaneously.
--zmq-shutdown-timeout SHUTDOWN_TIMEOUT
Amount of time (in seconds) to wait for workers to shut down.
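The --zmq-timeout-factor description above reduces to two multiplications. The following is a minimal sketch of that arithmetic only; the function name and the numeric values are hypothetical and are not the tool's defaults.

```python
def heartbeat_timeouts(master_heartbeat, worker_heartbeat, factor):
    """Return (worker_timeout, master_timeout) in seconds.

    worker_timeout: silence after which the master presumes a worker crashed.
    master_timeout: silence after which a worker presumes the master crashed.
    """
    worker_timeout = worker_heartbeat * factor
    master_timeout = master_heartbeat * factor
    return worker_timeout, master_timeout


# Illustrative values only (assumptions, not documented defaults).
worker_timeout, master_timeout = heartbeat_timeouts(
    master_heartbeat=10.0, worker_heartbeat=20.0, factor=3.0
)
print(worker_timeout, master_timeout)  # 60.0 30.0
```

With these example values, a worker that stays silent for 60 seconds, or a master that stays silent for 30 seconds, would be treated as crashed.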
westpa.cli.tools.w_crawl module
- class westpa.cli.tools.w_crawl.WESTParallelTool(wm_env=None)
Bases: WESTTool
Base class for command-line tools parallelized with wwmgr. This automatically adds and processes wwmgr command-line arguments and creates a work manager at self.work_manager.
- make_parser_and_process(prog=None, usage=None, description=None, epilog=None, args=None)
A convenience function to create a parser, call add_all_args(), and then call process_all_args(). The argument namespace is returned.
- add_args(parser)
Add arguments specific to this tool to the given argparse parser.
- process_args(args)
Take argparse-processed arguments associated with this tool and deal with them appropriately (setting instance variables, etc)
- go()
Perform the analysis associated with this tool.
- main()
A convenience function to make a parser, parse and process arguments, then run self.go() in the master process.
- class westpa.cli.tools.w_crawl.WESTDataReader
Bases: WESTToolComponent
Tool for reading data from WEST-related HDF5 files. Coordinates finding the main HDF5 file from west.cfg or command line arguments, caching of certain kinds of data (eventually), and retrieving auxiliary data sets from various places.
- add_args(parser)
Add arguments specific to this component to the given argparse parser.
- process_args(args)
Take argparse-processed arguments associated with this component and deal with them appropriately (setting instance variables, etc)
- open(mode='r')
- close()
- property weight_dsspec
- property parent_id_dsspec
- class westpa.cli.tools.w_crawl.IterRangeSelection(data_manager=None)
Bases: WESTToolComponent
Select and record limits on iterations used in analysis and/or reporting. This class provides both the user-facing command-line options and parsing, and the application-side API for recording limits in HDF5.
HDF5 datasets calculated based on a restricted set of iterations should be tagged with the following attributes:
first_iter
The first iteration included in the calculation.
last_iter
One past the last iteration included in the calculation.
iter_step
Blocking or sampling period for iterations included in the calculation.
- add_args(parser)
Add arguments specific to this component to the given argparse parser.
- process_args(args, override_iter_start=None, override_iter_stop=None, default_iter_step=1)
Take argparse-processed arguments associated with this component and deal with them appropriately (setting instance variables, etc)
- iter_block_iter()
Return an iterable of (block_start, block_end) over the blocks of iterations selected by --first-iter/--last-iter/--step-iter.
- n_iter_blocks()
Return the number of blocks of iterations (as returned by iter_block_iter) selected by --first-iter/--last-iter/--step-iter.
- record_data_iter_range(h5object, iter_start=None, iter_stop=None)
Store attributes iter_start and iter_stop on the given HDF5 object (group/dataset).
- record_data_iter_step(h5object, iter_step=None)
Store attribute iter_step on the given HDF5 object (group/dataset).
- check_data_iter_range_least(h5object, iter_start=None, iter_stop=None)
Check that the given HDF5 object contains (as denoted by its iter_start/iter_stop attributes) data at least for the iteration range specified.
- check_data_iter_range_equal(h5object, iter_start=None, iter_stop=None)
Check that the given HDF5 object contains (as denoted by its iter_start/iter_stop attributes) data exactly for the iteration range specified.
- check_data_iter_step_conformant(h5object, iter_step=None)
Check that the given HDF5 object contains per-iteration data at an iteration stride suitable for extracting data with the given stride (in other words, the given iter_step is a multiple of the stride with which data was recorded).
- check_data_iter_step_equal(h5object, iter_step=None)
Check that the given HDF5 object contains per-iteration data at an iteration stride the same as that specified.
- slice_per_iter_data(dataset, iter_start=None, iter_stop=None, iter_step=None, axis=0)
Return the subset of the given dataset corresponding to the given iteration range and stride. Unless otherwise specified, the first dimension of the dataset is the one sliced.
- iter_range(iter_start=None, iter_stop=None, iter_step=None, dtype=None)
Return a sequence for the given iteration numbers and stride, filling in missing values from those stored on self. The smallest data type capable of holding iter_stop is returned unless otherwise specified using the dtype argument.
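The block iteration and stride check described for IterRangeSelection can be illustrated with two standalone functions. This is a sketch, not the class's implementation: the function names are hypothetical, and it assumes iter_stop is exclusive (one past the last iteration), matching the last_iter attribute convention above.

```python
def iter_blocks(iter_start, iter_stop, iter_step):
    """Yield (block_start, block_end) pairs covering [iter_start, iter_stop).

    block_end is exclusive and the final block is truncated at iter_stop.
    """
    for block_start in range(iter_start, iter_stop, iter_step):
        yield block_start, min(iter_stop, block_start + iter_step)


def step_is_conformant(requested_step, recorded_step):
    """True if requested_step is a multiple of the stride used when recording."""
    return requested_step % recorded_step == 0


blocks = list(iter_blocks(1, 11, 4))
print(blocks)                    # [(1, 5), (5, 9), (9, 11)]
print(len(blocks))               # 3, the analogue of n_iter_blocks()
print(step_is_conformant(4, 2))  # True
print(step_is_conformant(3, 2))  # False
```

Under the half-open assumption, iterations 1 through 10 with a step of 4 give three blocks, the last of which is shorter than the others.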
- class westpa.cli.tools.w_crawl.ProgressIndicatorComponent
Bases: WESTToolComponent
- add_args(parser)
Add arguments specific to this component to the given argparse parser.
- process_args(args)
Take argparse-processed arguments associated with this component and deal with them appropriately (setting instance variables, etc)
- westpa.cli.tools.w_crawl.get_object(object_name, path=None)
Attempt to load the given object, using additional path information if given.
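Names such as task_callable and CRAWLER_INSTANCE are given in module.function or module.instance form. The sketch below shows one way such a dotted name can be resolved using the standard library's importlib. It is not the get_object implementation: the helper name is hypothetical, and the optional path argument is not handled.

```python
import importlib


def resolve_dotted_name(object_name):
    """Import the module part of 'module.symbol' and return the named attribute."""
    try:
        module_name, symbol = object_name.rsplit(".", 1)
    except ValueError:
        # No dot present, so there is no module part to import.
        raise ValueError("object_name must be given as module.symbol") from None
    module = importlib.import_module(module_name)
    return getattr(module, symbol)


sqrt = resolve_dotted_name("math.sqrt")
print(sqrt(16.0))  # 4.0
```

Splitting on the last dot lets the module part itself contain dots, as in a package path.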
- class westpa.cli.tools.w_crawl.WESTPACrawler
Bases: object
Base class for general crawling execution. This class only exists on the master.
- initialize(iter_start, iter_stop)
Initialize this crawling process.
- finalize()
Finalize this crawling process.
- process_iter_result(n_iter, result)
Process the result of a per-iteration task.
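The three WESTPACrawler hooks above can be exercised with a small stand-in that does not require WESTPA to be installed. Everything here is illustrative: SegmentCounter, count_segments, crawl_serially, and the fake data are hypothetical; a real crawler would subclass WESTPACrawler; and the task signature (iteration number plus that iteration's data) is an assumption, not something stated on this page.

```python
class SegmentCounter:
    """Stand-in using the WESTPACrawler method names (hypothetical example)."""

    def initialize(self, iter_start, iter_stop):
        # Called once before any task results arrive.
        self.counts = {}

    def process_iter_result(self, n_iter, result):
        # Results may arrive in any order, so key them by iteration number.
        self.counts[n_iter] = result

    def finalize(self):
        # Called once after all results have been processed.
        self.total = sum(self.counts.values())


def count_segments(n_iter, iter_data):
    """Hypothetical per-iteration task: count the entries for one iteration."""
    return len(iter_data)


def crawl_serially(task, crawler, data_by_iter, iter_start, iter_stop):
    """Toy serial driver over [iter_start, iter_stop); not the w_crawl code."""
    crawler.initialize(iter_start, iter_stop)
    for n_iter in range(iter_start, iter_stop):
        crawler.process_iter_result(n_iter, task(n_iter, data_by_iter[n_iter]))
    crawler.finalize()
    return crawler


fake_data = {1: [0.5, 0.5], 2: [0.25, 0.25, 0.5], 3: [1.0]}
crawler = crawl_serially(count_segments, SegmentCounter(), fake_data, 1, 4)
print(crawler.counts, crawler.total)  # {1: 2, 2: 3, 3: 1} 6
```

On the command line, the task would be named as the task_callable argument and the crawler instance passed with -c, both in module.name form as described in the task options above.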
- class westpa.cli.tools.w_crawl.WCrawl
Bases: WESTParallelTool
- prog = 'w_crawl'
- description = 'Crawl a weighted ensemble dataset, executing a function for each iteration.\nThis can be used for postprocessing of trajectories, cleanup of datasets,\nor anything else that can be expressed as "do X for iteration N, then do\nsomething with the result". Tasks are parallelized by iteration, and\nno guarantees are made about evaluation order.\n\n\n-----------------------------------------------------------------------------\nCommand-line options\n-----------------------------------------------------------------------------\n\n'
- add_args(parser)
Add arguments specific to this tool to the given argparse parser.
- process_args(args)
Take argparse-processed arguments associated with this tool and deal with them appropriately (setting instance variables, etc)
- go()
Perform the analysis associated with this tool.
- westpa.cli.tools.w_crawl.entry_point()