w_crawl
usage:
w_crawl [-h] [-r RCFILE] [--quiet | --verbose | --debug] [--version]
[--max-queue-length MAX_QUEUE_LENGTH] [-W WEST_H5FILE] [--first-iter N_ITER]
[--last-iter N_ITER] [-c CRAWLER_INSTANCE]
[--serial | --parallel | --work-manager WORK_MANAGER] [--n-workers N_WORKERS]
[--zmq-mode MODE] [--zmq-comm-mode COMM_MODE] [--zmq-write-host-info INFO_FILE]
[--zmq-read-host-info INFO_FILE] [--zmq-upstream-rr-endpoint ENDPOINT]
[--zmq-upstream-ann-endpoint ENDPOINT] [--zmq-downstream-rr-endpoint ENDPOINT]
[--zmq-downstream-ann-endpoint ENDPOINT] [--zmq-master-heartbeat MASTER_HEARTBEAT]
[--zmq-worker-heartbeat WORKER_HEARTBEAT] [--zmq-timeout-factor FACTOR]
[--zmq-startup-timeout STARTUP_TIMEOUT] [--zmq-shutdown-timeout SHUTDOWN_TIMEOUT]
task_callable
Crawl a weighted ensemble dataset, executing a function for each iteration. This can be used for postprocessing of trajectories, cleanup of datasets, or anything else that can be expressed as “do X for iteration N, then do something with the result”. Tasks are parallelized by iteration, and no guarantees are made about evaluation order.
Command-line options
optional arguments:
-h, --help show this help message and exit
general options:
-r RCFILE, --rcfile RCFILE
use RCFILE as the WEST run-time configuration file (default: west.cfg)
--quiet emit only essential information
--verbose emit extra information
--debug enable extra checks and emit copious information
--version show program's version number and exit
parallelization options:
--max-queue-length MAX_QUEUE_LENGTH
Maximum number of tasks that can be queued. Useful to limit RAM use for tasks
that have very large requests/response. Default: no limit.
WEST input data options:
-W WEST_H5FILE, --west-data WEST_H5FILE
Take WEST data from WEST_H5FILE (default: read from the HDF5 file specified in
west.cfg).
iteration range:
--first-iter N_ITER Begin analysis at iteration N_ITER (default: 1).
--last-iter N_ITER Conclude analysis with N_ITER, inclusive (default: last completed iteration).
task options:
-c CRAWLER_INSTANCE, --crawler-instance CRAWLER_INSTANCE
Use CRAWLER_INSTANCE (specified as module.instance) as an instance of
WESTPACrawler to coordinate the calculation. Required only if initialization,
finalization, or task result processing is required.
task_callable Run TASK_CALLABLE (specified as module.function) on each iteration. Required.
parallelization options:
--serial run in serial mode
--parallel run in parallel mode (using processes)
--work-manager WORK_MANAGER
use the given work manager for parallel task distribution. Available work
managers are ('serial', 'threads', 'processes', 'zmq'); default is 'serial'
--n-workers N_WORKERS
Use up to N_WORKERS on this host, for work managers which support this option.
Use 0 for a dedicated server. (Ignored by work managers which do not support
this option.)
options for ZeroMQ (“zmq”) work manager (master or node):
--zmq-mode MODE Operate as a master (server) or a node (workers/client). "server" is a
deprecated synonym for "master" and "client" is a deprecated synonym for
"node".
--zmq-comm-mode COMM_MODE
Use the given communication mode -- TCP or IPC (Unix-domain) -- sockets for
communication within a node. IPC (the default) may be more efficient but is not
available on (exceptionally rare) systems without node-local storage (e.g.
/tmp); on such systems, TCP may be used instead.
--zmq-write-host-info INFO_FILE
Store hostname and port information needed to connect to this instance in
INFO_FILE. This allows the master and nodes assisting in coordinating the
communication of other nodes to choose ports randomly. Downstream nodes read
this file with --zmq-read-host-info and know where how to connect.
--zmq-read-host-info INFO_FILE
Read hostname and port information needed to connect to the master (or other
coordinating node) from INFO_FILE. This allows the master and nodes assisting
in coordinating the communication of other nodes to choose ports randomly,
writing that information with --zmq-write-host-info for this instance to read.
--zmq-upstream-rr-endpoint ENDPOINT
ZeroMQ endpoint to which to send request/response (task and result) traffic
toward the master.
--zmq-upstream-ann-endpoint ENDPOINT
ZeroMQ endpoint on which to receive announcement (heartbeat and shutdown
notification) traffic from the master.
--zmq-downstream-rr-endpoint ENDPOINT
ZeroMQ endpoint on which to listen for request/response (task and result)
traffic from subsidiary workers.
--zmq-downstream-ann-endpoint ENDPOINT
ZeroMQ endpoint on which to send announcement (heartbeat and shutdown
notification) traffic toward workers.
--zmq-master-heartbeat MASTER_HEARTBEAT
Every MASTER_HEARTBEAT seconds, the master announces its presence to workers.
--zmq-worker-heartbeat WORKER_HEARTBEAT
Every WORKER_HEARTBEAT seconds, workers announce their presence to the master.
--zmq-timeout-factor FACTOR
Scaling factor for heartbeat timeouts. If the master doesn't hear from a worker
in WORKER_HEARTBEAT*FACTOR, the worker is assumed to have crashed. If a worker
doesn't hear from the master in MASTER_HEARTBEAT*FACTOR seconds, the master is
assumed to have crashed. Both cases result in shutdown.
--zmq-startup-timeout STARTUP_TIMEOUT
Amount of time (in seconds) to wait for communication between the master and at
least one worker. This may need to be changed on very large, heavily-loaded
computer systems that start all processes simultaneously.
--zmq-shutdown-timeout SHUTDOWN_TIMEOUT
Amount of time (in seconds) to wait for workers to shut down.