w_select

usage:

w_select [-h] [-r RCFILE] [--quiet | --verbose | --debug] [--version]
               [--max-queue-length MAX_QUEUE_LENGTH] [-W WEST_H5FILE] [--first-iter N_ITER]
               [--last-iter N_ITER] [-p MODULE.FUNCTION] [-v] [-a] [-o OUTPUT]
               [--serial | --parallel | --work-manager WORK_MANAGER] [--n-workers N_WORKERS]
               [--zmq-mode MODE] [--zmq-comm-mode COMM_MODE] [--zmq-write-host-info INFO_FILE]
               [--zmq-read-host-info INFO_FILE] [--zmq-upstream-rr-endpoint ENDPOINT]
               [--zmq-upstream-ann-endpoint ENDPOINT] [--zmq-downstream-rr-endpoint ENDPOINT]
               [--zmq-downstream-ann-endpoint ENDPOINT] [--zmq-master-heartbeat MASTER_HEARTBEAT]
               [--zmq-worker-heartbeat WORKER_HEARTBEAT] [--zmq-timeout-factor FACTOR]
               [--zmq-startup-timeout STARTUP_TIMEOUT] [--zmq-shutdown-timeout SHUTDOWN_TIMEOUT]

Select dynamics segments matching various criteria. This requires a user-provided prediate function. By default, only matching segments are stored. If the -a/–include-ancestors option is given, then matching segments and their ancestors will be stored.

Predicate function

Segments are selected based on a predicate function, which must be callable as predicate(n_iter, iter_group) and return a collection of segment IDs matching the predicate in that iteration.

The predicate may be inverted by specifying the -v/–invert command-line argument.

Output format

The output file (-o/–output, by default “select.h5”) contains the following datasets:

``/n_iter`` [iteration]
  *(Integer)* Iteration numbers for each entry in other datasets.

``/n_segs`` [iteration]
  *(Integer)* Number of segment IDs matching the predicate (or inverted
  predicate, if -v/--invert is specified) in the given iteration.

``/seg_ids`` [iteration][segment]
  *(Integer)* Matching segments in each iteration. For an iteration
  ``n_iter``, only the first ``n_iter`` entries are valid. For example,
  the full list of matching seg_ids in the first stored iteration is
  ``seg_ids[0][:n_segs[0]]``.

``/weights`` [iteration][segment]
  *(Floating-point)* Weights for each matching segment in ``/seg_ids``.

Command-line arguments

optional arguments:

-h, --help            show this help message and exit

general options:

-r RCFILE, --rcfile RCFILE
                      use RCFILE as the WEST run-time configuration file (default: west.cfg)
--quiet               emit only essential information
--verbose             emit extra information
--debug               enable extra checks and emit copious information
--version             show program's version number and exit

parallelization options:

--max-queue-length MAX_QUEUE_LENGTH
                      Maximum number of tasks that can be queued. Useful to limit RAM use for tasks that
                      have very large requests/response. Default: no limit.

WEST input data options:

-W WEST_H5FILE, --west-data WEST_H5FILE
                      Take WEST data from WEST_H5FILE (default: read from the HDF5 file specified in
                      west.cfg).

iteration range:

--first-iter N_ITER   Begin analysis at iteration N_ITER (default: 1).
--last-iter N_ITER    Conclude analysis with N_ITER, inclusive (default: last completed iteration).

selection options:

-p MODULE.FUNCTION, --predicate-function MODULE.FUNCTION
                      Use the given predicate function to match segments. This function should take an
                      iteration number and the HDF5 group corresponding to that iteration and return a
                      sequence of seg_ids matching the predicate, as in ``match_predicate(n_iter,
                      iter_group)``.
-v, --invert          Invert the match predicate.
-a, --include-ancestors
                      Include ancestors of matched segments in output.
output options:
-o OUTPUT, --output OUTPUT

Write output to OUTPUT (default: select.h5).

parallelization options:

--serial              run in serial mode
--parallel            run in parallel mode (using processes)
--work-manager WORK_MANAGER
                      use the given work manager for parallel task distribution. Available work managers
                      are ('serial', 'threads', 'processes', 'zmq'); default is 'serial'
--n-workers N_WORKERS
                      Use up to N_WORKERS on this host, for work managers which support this option. Use
                      0 for a dedicated server. (Ignored by work managers which do not support this
                      option.)

options for ZeroMQ (“zmq”) work manager (master or node):

--zmq-mode MODE       Operate as a master (server) or a node (workers/client). "server" is a deprecated
                      synonym for "master" and "client" is a deprecated synonym for "node".
--zmq-comm-mode COMM_MODE
                      Use the given communication mode -- TCP or IPC (Unix-domain) -- sockets for
                      communication within a node. IPC (the default) may be more efficient but is not
                      available on (exceptionally rare) systems without node-local storage (e.g. /tmp);
                      on such systems, TCP may be used instead.
--zmq-write-host-info INFO_FILE
                      Store hostname and port information needed to connect to this instance in
                      INFO_FILE. This allows the master and nodes assisting in coordinating the
                      communication of other nodes to choose ports randomly. Downstream nodes read this
                      file with --zmq-read-host-info and know where how to connect.
--zmq-read-host-info INFO_FILE
                      Read hostname and port information needed to connect to the master (or other
                      coordinating node) from INFO_FILE. This allows the master and nodes assisting in
                      coordinating the communication of other nodes to choose ports randomly, writing
                      that information with --zmq-write-host-info for this instance to read.
--zmq-upstream-rr-endpoint ENDPOINT
                      ZeroMQ endpoint to which to send request/response (task and result) traffic toward
                      the master.
--zmq-upstream-ann-endpoint ENDPOINT
                      ZeroMQ endpoint on which to receive announcement (heartbeat and shutdown
                      notification) traffic from the master.
--zmq-downstream-rr-endpoint ENDPOINT
                      ZeroMQ endpoint on which to listen for request/response (task and result) traffic
                      from subsidiary workers.
--zmq-downstream-ann-endpoint ENDPOINT
                      ZeroMQ endpoint on which to send announcement (heartbeat and shutdown
                      notification) traffic toward workers.
--zmq-master-heartbeat MASTER_HEARTBEAT
                      Every MASTER_HEARTBEAT seconds, the master announces its presence to workers.
--zmq-worker-heartbeat WORKER_HEARTBEAT
                      Every WORKER_HEARTBEAT seconds, workers announce their presence to the master.
--zmq-timeout-factor FACTOR
                      Scaling factor for heartbeat timeouts. If the master doesn't hear from a worker in
                      WORKER_HEARTBEAT*FACTOR, the worker is assumed to have crashed. If a worker
                      doesn't hear from the master in MASTER_HEARTBEAT*FACTOR seconds, the master is
                      assumed to have crashed. Both cases result in shutdown.
--zmq-startup-timeout STARTUP_TIMEOUT
                      Amount of time (in seconds) to wait for communication between the master and at
                      least one worker. This may need to be changed on very large, heavily-loaded
                      computer systems that start all processes simultaneously.
--zmq-shutdown-timeout SHUTDOWN_TIMEOUT
                      Amount of time (in seconds) to wait for workers to shut down.

westpa.cli.tools.w_select module

class westpa.cli.tools.w_select.WESTParallelTool(wm_env=None)

Bases: WESTTool

Base class for command-line tools parallelized with wwmgr. This automatically adds and processes wwmgr command-line arguments and creates a work manager at self.work_manager.

make_parser_and_process(prog=None, usage=None, description=None, epilog=None, args=None)

A convenience function to create a parser, call add_all_args(), and then call process_all_args(). The argument namespace is returned.

add_args(parser)

Add arguments specific to this tool to the given argparse parser.

process_args(args)

Take argparse-processed arguments associated with this tool and deal with them appropriately (setting instance variables, etc)

go()

Perform the analysis associated with this tool.

main()

A convenience function to make a parser, parse and process arguments, then run self.go() in the master process.

class westpa.cli.tools.w_select.WESTDataReader

Bases: WESTToolComponent

Tool for reading data from WEST-related HDF5 files. Coordinates finding the main HDF5 file from west.cfg or command line arguments, caching of certain kinds of data (eventually), and retrieving auxiliary data sets from various places.

add_args(parser)

Add arguments specific to this component to the given argparse parser.

process_args(args)

Take argparse-processed arguments associated with this component and deal with them appropriately (setting instance variables, etc)

open(mode='r')
close()
property weight_dsspec
property parent_id_dsspec
class westpa.cli.tools.w_select.IterRangeSelection(data_manager=None)

Bases: WESTToolComponent

Select and record limits on iterations used in analysis and/or reporting. This class provides both the user-facing command-line options and parsing, and the application-side API for recording limits in HDF5.

HDF5 datasets calculated based on a restricted set of iterations should be tagged with the following attributes:

first_iter

The first iteration included in the calculation.

last_iter

One past the last iteration included in the calculation.

iter_step

Blocking or sampling period for iterations included in the calculation.

add_args(parser)

Add arguments specific to this component to the given argparse parser.

process_args(args, override_iter_start=None, override_iter_stop=None, default_iter_step=1)

Take argparse-processed arguments associated with this component and deal with them appropriately (setting instance variables, etc)

iter_block_iter()

Return an iterable of (block_start,block_end) over the blocks of iterations selected by –first-iter/–last-iter/–step-iter.

n_iter_blocks()

Return the number of blocks of iterations (as returned by iter_block_iter) selected by –first-iter/–last-iter/–step-iter.

record_data_iter_range(h5object, iter_start=None, iter_stop=None)

Store attributes iter_start and iter_stop on the given HDF5 object (group/dataset)

record_data_iter_step(h5object, iter_step=None)

Store attribute iter_step on the given HDF5 object (group/dataset).

check_data_iter_range_least(h5object, iter_start=None, iter_stop=None)

Check that the given HDF5 object contains (as denoted by its iter_start/iter_stop attributes) data at least for the iteration range specified.

check_data_iter_range_equal(h5object, iter_start=None, iter_stop=None)

Check that the given HDF5 object contains (as denoted by its iter_start/iter_stop attributes) data exactly for the iteration range specified.

check_data_iter_step_conformant(h5object, iter_step=None)

Check that the given HDF5 object contains per-iteration data at an iteration stride suitable for extracting data with the given stride (in other words, the given iter_step is a multiple of the stride with which data was recorded).

check_data_iter_step_equal(h5object, iter_step=None)

Check that the given HDF5 object contains per-iteration data at an iteration stride the same as that specified.

slice_per_iter_data(dataset, iter_start=None, iter_stop=None, iter_step=None, axis=0)

Return the subset of the given dataset corresponding to the given iteration range and stride. Unless otherwise specified, the first dimension of the dataset is the one sliced.

iter_range(iter_start=None, iter_stop=None, iter_step=None, dtype=None)

Return a sequence for the given iteration numbers and stride, filling in missing values from those stored on self. The smallest data type capable of holding iter_stop is returned unless otherwise specified using the dtype argument.

class westpa.cli.tools.w_select.ProgressIndicatorComponent

Bases: WESTToolComponent

add_args(parser)

Add arguments specific to this component to the given argparse parser.

process_args(args)

Take argparse-processed arguments associated with this component and deal with them appropriately (setting instance variables, etc)

westpa.cli.tools.w_select.seg_id_dtype

alias of int64

westpa.cli.tools.w_select.n_iter_dtype

alias of uint32

westpa.cli.tools.w_select.weight_dtype

alias of float64

westpa.cli.tools.w_select.get_object(object_name, path=None)

Attempt to load the given object, using additional path information if given.

class westpa.cli.tools.w_select.WSelectTool

Bases: WESTParallelTool

prog = 'w_select'
description = 'Select dynamics segments matching various criteria. This requires a\nuser-provided prediate function. By default, only matching segments are\nstored. If the -a/--include-ancestors option is given, then matching segments\nand their ancestors will be stored.\n\n\n-----------------------------------------------------------------------------\nPredicate function\n-----------------------------------------------------------------------------\n\nSegments are selected based on a predicate function, which must be callable\nas ``predicate(n_iter, iter_group)`` and return a collection of segment IDs\nmatching the predicate in that iteration.\n\nThe predicate may be inverted by specifying the -v/--invert command-line\nargument.\n\n\n-----------------------------------------------------------------------------\nOutput format\n-----------------------------------------------------------------------------\n\nThe output file (-o/--output, by default "select.h5") contains the following\ndatasets:\n\n  ``/n_iter`` [iteration]\n    *(Integer)* Iteration numbers for each entry in other datasets.\n\n  ``/n_segs`` [iteration]\n    *(Integer)* Number of segment IDs matching the predicate (or inverted\n    predicate, if -v/--invert is specified) in the given iteration.\n\n  ``/seg_ids`` [iteration][segment]\n    *(Integer)* Matching segments in each iteration. For an iteration\n    ``n_iter``, only the first ``n_iter`` entries are valid. For example,\n    the full list of matching seg_ids in the first stored iteration is\n    ``seg_ids[0][:n_segs[0]]``.\n\n  ``/weights`` [iteration][segment]\n    *(Floating-point)* Weights for each matching segment in ``/seg_ids``.\n\n\n-----------------------------------------------------------------------------\nCommand-line arguments\n-----------------------------------------------------------------------------\n'
add_args(parser)

Add arguments specific to this tool to the given argparse parser.

process_args(args)

Take argparse-processed arguments associated with this tool and deal with them appropriately (setting instance variables, etc)

go()

Perform the analysis associated with this tool.

westpa.cli.tools.w_select.entry_point()