westpa.analysis package
This subpackage provides an API to facilitate the analysis of WESTPA
simulation data. Its core abstraction is the Run class. A Run instance
provides a read-only view of a WEST HDF5 (“west.h5”) file.
API reference: https://westpa.readthedocs.io/en/latest/documentation/analysis/
How To
Open a run:
>>> from westpa.analysis import Run
>>> run = Run.open('west.h5')
>>> run
<WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>
Iterate over iterations and walkers:
>>> for iteration in run:
...     for walker in iteration:
...         pass
...
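As a minimal sketch of what the nested loop can be used for, the helper below sums the walker weights within each iteration. The name total_weight_per_iteration is hypothetical and not part of westpa.analysis; it relies only on iterating a run and reading walker.weight, as shown on this page. The stand-in objects exist only so the example runs without a west.h5 file.

```python
import math
from types import SimpleNamespace


def total_weight_per_iteration(run):
    """Return the summed walker weight for each iteration of ``run``.

    ``run`` is assumed to behave like the Run shown above: iterating it
    yields iterations, and iterating an iteration yields walkers that
    expose a ``weight`` attribute.
    """
    return [math.fsum(walker.weight for walker in iteration) for iteration in run]


# Stand-in data (hypothetical): two iterations, each modeled as a list of walkers.
demo_run = [
    [SimpleNamespace(weight=0.5), SimpleNamespace(weight=0.5)],
    [SimpleNamespace(weight=0.25), SimpleNamespace(weight=0.25), SimpleNamespace(weight=0.5)],
]
totals = total_weight_per_iteration(demo_run)
```

With a real run, each total would be expected to stay close to 1, since weighted ensemble resampling conserves total weight.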
Access a particular iteration:
>>> iteration = run.iteration(10)
>>> iteration
Iteration(10, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>)
Access a particular walker:
>>> walker = iteration.walker(4)
>>> walker
Walker(4, Iteration(10, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
Get the weight and progress coordinate values of a walker:
>>> walker.weight
9.876543209876543e-06
>>> walker.pcoords
array([[3.1283207],
[3.073721 ],
[2.959221 ],
[2.6756208],
[2.7888207]], dtype=float32)
Get the parent and children of a walker:
>>> walker.parent
Walker(2, Iteration(9, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
>>> for child in walker.children:
...     print(child)
...
Walker(0, Iteration(11, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
Walker(1, Iteration(11, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
Walker(2, Iteration(11, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
Walker(3, Iteration(11, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
Walker(4, Iteration(11, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
Trace the ancestry of a walker:
>>> trace = walker.trace()
>>> trace
Trace(Walker(4, Iteration(10, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>)))
>>> for walker in trace:
...     print(walker)
...
Walker(1, Iteration(1, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
Walker(4, Iteration(2, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
Walker(5, Iteration(3, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
Walker(6, Iteration(4, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
Walker(9, Iteration(5, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
Walker(8, Iteration(6, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
Walker(8, Iteration(7, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
Walker(13, Iteration(8, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
Walker(2, Iteration(9, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
Walker(4, Iteration(10, <WESTPA Run with 500 iterations at 0x7fcaf8f0d5b0>))
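Conceptually, a trace like this can be reconstructed by following parent links back to the first iteration. The sketch below illustrates that idea using only the documented parent and initial properties of a walker. The function ancestry and the stand-in walkers are hypothetical, and this is not necessarily how Trace is implemented.

```python
from types import SimpleNamespace


def ancestry(walker):
    """Return the lineage of ``walker``, oldest ancestor first.

    Assumes each walker exposes ``initial`` (True when its parent is an
    initial state) and ``parent``, as documented for Walker on this page.
    """
    lineage = [walker]
    while not walker.initial:
        walker = walker.parent
        lineage.append(walker)
    lineage.reverse()
    return lineage


# Stand-in walkers (hypothetical) forming a three-generation chain.
root = SimpleNamespace(name="root", initial=True, parent=None)
middle = SimpleNamespace(name="middle", initial=False, parent=root)
leaf = SimpleNamespace(name="leaf", initial=False, parent=middle)

names = [w.name for w in ancestry(leaf)]
```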
Close a run (and its underlying HDF5 file):
>>> run.close()
>>> run
<Closed WESTPA Run at 0x7fcaf8f0d5b0>
>>> run.h5file
<Closed HDF5 file>
Retrieving Trajectories
Built-in Reader
MD trajectory data stored in the same manner as in the Basic NaCl tutorial
may be retrieved using the built-in BasicMDTrajectory reader with its
default settings:
>>> from westpa.analysis import BasicMDTrajectory
>>> trajectory = BasicMDTrajectory()
Here trajectory is a callable object that takes either a Walker or a Trace
instance as input and returns an MDTraj Trajectory:
>>> traj = trajectory(walker)
>>> traj
<mdtraj.Trajectory with 5 frames, 33001 atoms, 6625 residues, and unitcells at 0x7fcae484ad00>
>>> traj = trajectory(trace)
>>> traj
<mdtraj.Trajectory with 41 frames, 33001 atoms, 6625 residues, and unitcells at 0x7fcae487c790>
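The frame counts above are consistent with the initial point of a segment being kept only for the first walker of the trace: the trace shown earlier contains 10 walkers, each segment has 5 frames including its initial point, and 5 + 9 × 4 = 41. The helper below only restates that arithmetic. Its name is hypothetical, and it is an illustration under that assumption rather than part of westpa.analysis.

```python
def expected_trace_frames(frames_per_segment, num_walkers):
    """Frames in a concatenated trace if only the first segment keeps its initial point.

    ``frames_per_segment`` counts the initial point of a segment.
    """
    if num_walkers < 1:
        raise ValueError("a trace contains at least one walker")
    # The first segment contributes all of its frames; every later segment
    # contributes all frames except its (shared) initial point.
    return frames_per_segment + (num_walkers - 1) * (frames_per_segment - 1)


num_frames = expected_trace_frames(frames_per_segment=5, num_walkers=10)
```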
Minor variations of the “basic” trajectory storage protocol (e.g., use of
different file formats) can be handled by changing the parameters of the
BasicMDTrajectory reader. For example, suppose that instead of storing
the coordinate and topology data for trajectory segments in separate
files (“seg.dcd” and “bstate.pdb”), we store them together in an
MDTraj HDF5 trajectory file (“seg.h5”). This change can be accommodated
by explicitly setting the traj_ext and top parameters of the trajectory
reader:
>>> trajectory = BasicMDTrajectory(traj_ext='.h5', top=None)
Trajectories saved with the HDF5 framework can be read with the
HDF5MDTrajectory reader instead.
Custom Readers
For users requiring greater flexibility, custom trajectory readers can be
implemented using the westpa.analysis.Trajectory class. Implementing
a custom reader requires two ingredients:
- A function for retrieving individual trajectory segments. The function must take a Walker instance as its first argument and return a sequence (e.g., a list, NumPy array, or MDTraj Trajectory) representing the trajectory of the walker. Moreover, it must accept a Boolean keyword argument include_initpoint, which specifies whether the returned trajectory includes its initial point.
- A function for concatenating trajectory segments. A default implementation is provided by the concatenate() function in the westpa.analysis.trajectories module.
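The first ingredient can be sketched as follows. This hypothetical function, get_pcoord_segment, treats the progress coordinate snapshots of a walker as its trajectory. It assumes the first snapshot in walker.pcoords is the initial point of the segment. The stand-in walker is there only so the example runs on its own.

```python
from types import SimpleNamespace


def get_pcoord_segment(walker, include_initpoint=True):
    """Return the progress coordinate snapshots of ``walker`` as a list.

    Assumes the first snapshot in ``walker.pcoords`` is the initial point
    of the segment, so it is dropped when ``include_initpoint`` is False.
    """
    snapshots = list(walker.pcoords)
    return snapshots if include_initpoint else snapshots[1:]


# Stand-in walker (hypothetical) with three one-dimensional snapshots.
demo_walker = SimpleNamespace(pcoords=[[3.1], [3.0], [2.9]])
with_init = get_pcoord_segment(demo_walker)
without_init = get_pcoord_segment(demo_walker, include_initpoint=False)
```

Given the Trajectory(fget=None, *, fconcat=None) signature in the API reference, a function like this would be passed as fget, optionally together with an fconcat suited to the returned sequence type.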
westpa.analysis.core module
- class westpa.analysis.core.Run(h5filename='west.h5')
A read-only view of a WESTPA simulation run.
- Parameters:
h5filename (str or file-like object, default 'west.h5') – Pathname or stream of a main WESTPA HDF5 data file.
- classmethod open(h5filename='west.h5')
Alternate constructor.
- Parameters:
h5filename (str or file-like object, default 'west.h5') – Pathname or stream of a main WESTPA HDF5 data file.
- close()
Close the Run instance by closing the underlying WESTPA HDF5 file.
- property closed
Whether the Run instance is closed.
- Type:
bool
- property summary
Summary data by iteration.
- Type:
pd.DataFrame
- property num_iterations
Number of completed iterations.
- Type:
int
- property num_walkers
Total number of walkers.
- Type:
int
- property num_segments
Total number of trajectory segments (alias of self.num_walkers).
- Type:
int
- class westpa.analysis.core.Iteration(number, run)
An iteration of a WESTPA simulation.
- Parameters:
number (int) – Iteration number (1-based).
run (Run) – Simulation run to which the iteration belongs.
- property h5group
HDF5 group containing the iteration data.
- Type:
h5py.Group
- property summary
Iteration summary.
- Type:
pd.DataFrame
- property segment_summaries
Segment summary data for the iteration.
- Type:
pd.DataFrame
- property pcoords
Progress coordinate snapshots of each walker.
- Type:
3D ndarray
- property weights
Statistical weight of each walker.
- Type:
1D ndarray
- property bin_target_counts
Target count for each bin.
- Type:
1D ndarray, dtype=uint64
- property num_bins
Number of bins.
- Type:
int
- property num_walkers
Number of walkers in the iteration.
- Type:
int
- property num_segments
Number of trajectory segments (alias of self.num_walkers).
- Type:
int
- property auxiliary_data
Auxiliary data stored for the iteration.
- Type:
h5py.Group or None
- property basis_state_summaries
Basis state summary data.
- Type:
pd.DataFrame
- property basis_state_pcoords
Progress coordinates of each basis state.
- Type:
2D ndarray
- property basis_states
Basis states in use for the iteration.
- Type:
list[BasisState]
- property has_target_states
Whether target (sink) states are defined for this iteration.
- Type:
bool
- property target_state_summaries
Target state summary data.
- Type:
pd.DataFrame or None
- property target_state_pcoords
Progress coordinates of each target state.
- Type:
2D ndarray or None
- property target_states
Target states in use for the iteration.
- Type:
list[TargetState]
- bin(index)
Return the bin with the given index.
- Parameters:
index (int) – Bin index (0-based).
- Returns:
The bin indexed by index.
- Return type:
Bin
- walker(index)
Return the walker with the given index.
- Parameters:
index (int) – Walker index (0-based).
- Returns:
The walker indexed by index.
- Return type:
Walker
- basis_state(index)
Return the basis state with the given index.
- Parameters:
index (int) – Basis state index (0-based).
- Returns:
The basis state indexed by index.
- Return type:
BasisState
- target_state(index)
Return the target state with the given index.
- Parameters:
index (int) – Target state index (0-based).
- Returns:
The target state indexed by index.
- Return type:
TargetState
- class westpa.analysis.core.Walker(index, iteration)
A walker in an iteration of a WESTPA simulation.
- Parameters:
index (int) – Walker index (0-based).
iteration (Iteration) – Iteration to which the walker belongs.
- property weight
Statistical weight of the walker.
- Type:
float64
- property pcoords
Progress coordinate snapshots.
- Type:
2D ndarray
- property num_snapshots
Number of snapshots.
- Type:
int
- property segment_summary
Segment summary data.
- Type:
pd.Series
- property parent
The parent of the walker.
- Type:
- property recycled
True if the walker stopped in the sink, False otherwise.
- Type:
bool
- property initial
True if the parent of the walker is an initial state, False otherwise.
- Type:
bool
- property auxiliary_data
Auxiliary data for the walker.
- Type:
dict
- class westpa.analysis.core.BinUnion(indices, mapper)
A (disjoint) union of bins defined by a common bin mapper.
- Parameters:
indices (iterable of int) – The indices of the bins comprising the union.
mapper (BinMapper) – The bin mapper defining the bins.
- union(*others)
Return the union of the bin union and all others.
- class westpa.analysis.core.Bin(index, mapper)
A bin defined by a bin mapper.
- Parameters:
index (int) – The index of the bin.
mapper (BinMapper) – The bin mapper defining the bin.
- class westpa.analysis.core.Trace(walker, source=None, max_length=None)
A trace of a walker’s ancestry.
- Parameters:
walker (Walker) – The terminal walker.
source (Bin, BinUnion, or collections.abc.Container, optional) – A source (macro)state, specified as a container object whose __contains__() method is the indicator function for the corresponding subset of progress coordinate space. The trace is stopped upon encountering a walker that stopped in source.
max_length (int, optional) – The maximum number of walkers in the trace.
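As an illustration of the container form of source, the sketch below defines a hypothetical Interval class whose __contains__() method tests whether the first progress coordinate falls in a half-open range. It assumes the membership test receives a single point in progress coordinate space, given as a sequence with one value per dimension.

```python
class Interval:
    """Half-open interval [lower, upper) on the first progress coordinate."""

    def __init__(self, lower, upper):
        self.lower = lower
        self.upper = upper

    def __contains__(self, pcoord):
        # Assumption: ``pcoord`` is a single point in progress coordinate
        # space, given as a sequence with one value per dimension.
        return self.lower <= pcoord[0] < self.upper


source_state = Interval(0.0, 2.5)
inside = [2.0] in source_state
outside = [3.1] in source_state
```

Given the constructor signature above, such an object would be passed as Trace(walker, source=source_state).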
westpa.analysis.trajectories module
- class westpa.analysis.trajectories.Trajectory(fget=None, *, fconcat=None)
A callable that returns the trajectory of a walker or trace.
- Parameters:
fget (callable) – Function for retrieving a single trajectory segment. Must take a Walker instance as its first argument and accept a Boolean keyword argument include_initpoint. The function should return a sequence (e.g., a list or ndarray) representing the trajectory of the walker. If include_initpoint is True, the trajectory segment should include its initial point. Otherwise, the trajectory segment should exclude its initial point.
fconcat (callable, optional) – Function for concatenating trajectory segments. Must take a sequence of trajectory segments as input and return their concatenation. The default concatenation function is concatenate().
- property segment_collector
Segment retrieval manager.
- Type:
SegmentCollector
- property fget
Function for getting trajectory segments.
- Type:
callable
- property fconcat
Function for concatenating trajectory segments.
- Type:
callable
- class westpa.analysis.trajectories.SegmentCollector(trajectory, use_threads=False, max_workers=None, show_progress=False)
An object that manages the retrieval of trajectory segments.
- Parameters:
trajectory (Trajectory) – The trajectory to which the segment collector is attached.
use_threads (bool, default False) – Whether to use a pool of threads to retrieve trajectory segments asynchronously. Setting this parameter to True may be useful when segment retrieval is an I/O-bound task.
max_workers (int, optional) – Maximum number of threads to use. The default value is specified in the ThreadPoolExecutor documentation.
show_progress (bool, default False) – Whether to show a progress bar when retrieving multiple segments.
- get_segments(walkers, initpoint_mask=None, **kwargs)
Retrieve the trajectories of multiple walkers.
- Parameters:
walkers (sequence of Walker) – The walkers for which to retrieve trajectories.
initpoint_mask (sequence of bool, optional) – A Boolean mask indicating whether each trajectory segment should include (True) or exclude (False) its initial point. Default is all True.
- Returns:
The trajectory of each walker.
- Return type:
list of sequences
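The interplay of initpoint_mask and threaded retrieval can be sketched with the standard library alone. The function collect_segments below is hypothetical and is not the SegmentCollector implementation. It shows one way to call a segment reader on several walkers in a thread pool, pass each mask entry as include_initpoint, and return the results in input order.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_segments(fget, walkers, initpoint_mask=None, max_workers=None):
    """Call ``fget`` on each walker in a thread pool, preserving input order."""
    walkers = list(walkers)
    if initpoint_mask is None:
        initpoint_mask = [True] * len(walkers)  # default: keep every initial point
    if len(initpoint_mask) != len(walkers):
        raise ValueError("initpoint_mask must have one entry per walker")
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [
            executor.submit(fget, walker, include_initpoint=bool(flag))
            for walker, flag in zip(walkers, initpoint_mask)
        ]
        # Reading results in submission order keeps segments aligned with walkers.
        return [future.result() for future in futures]


def demo_fget(walker, include_initpoint=True):
    # Hypothetical reader: a "walker" here is just a list of frames.
    return walker if include_initpoint else walker[1:]


segments = collect_segments(demo_fget, [[0, 1, 2], [2, 3, 4]], initpoint_mask=[True, False])
```

Threads help here only when the reader spends most of its time waiting on file or network I/O, which is the case the use_threads parameter is described for.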
- class westpa.analysis.trajectories.BasicMDTrajectory(top='bstate.pdb', traj_ext='.dcd', state_ext='.xml', sim_root='.')
Trajectory reader for MD trajectories stored as in the Basic Tutorial.
- Parameters:
top (str or mdtraj.Topology, default 'bstate.pdb')
traj_ext (str, default '.dcd')
state_ext (str, default '.xml')
sim_root (str, default '.')
- class westpa.analysis.trajectories.HDF5MDTrajectory
Trajectory reader for MD trajectories stored by the HDF5 framework.
- westpa.analysis.trajectories.concatenate(segments)
Return the concatenation of a sequence of trajectory segments.
- Parameters:
segments (sequence of sequences) – A sequence of trajectory segments.
- Returns:
The concatenation of segments.
- Return type:
sequence
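When a custom reader returns plain Python lists, a concatenation function matching the fconcat contract could look like the sketch below. The name concatenate_lists is hypothetical; this is not the library's concatenate(), whose handling of different sequence types is not described here.

```python
from itertools import chain


def concatenate_lists(segments):
    """Join a sequence of list-like trajectory segments into one list."""
    return list(chain.from_iterable(segments))


joined = concatenate_lists([[0, 1, 2], [3, 4], [5, 6]])
```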
westpa.analysis.statistics module
- westpa.analysis.statistics.time_average(observable, iterations)
Compute the time average of an observable.
- Parameters:
- Returns:
The time average of observable over iterations.
- Return type:
ArrayLike
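One plausible reading of a time average in a weighted ensemble setting is sketched below: take the weight-averaged value of the observable within each iteration, then average those values over the iterations. This is an assumption for illustration, not a statement of how time_average is implemented. The function weighted_time_average is hypothetical, and it assumes observable(iteration) returns one scalar per walker, aligned with iteration.weights.

```python
import math
from types import SimpleNamespace


def weighted_time_average(observable, iterations):
    """Average over iterations of the weight-averaged observable.

    Assumes ``observable(iteration)`` returns one scalar per walker and
    ``iteration.weights`` gives the matching statistical weights.
    """
    iterations = list(iterations)
    if not iterations:
        raise ValueError("need at least one iteration")
    per_iteration = [
        math.fsum(w * x for w, x in zip(it.weights, observable(it)))
        for it in iterations
    ]
    return math.fsum(per_iteration) / len(iterations)


# Stand-in iterations (hypothetical) with one scalar value per walker.
demo_iterations = [
    SimpleNamespace(weights=[0.5, 0.5], values=[1.0, 3.0]),
    SimpleNamespace(weights=[0.25, 0.75], values=[4.0, 0.0]),
]
average = weighted_time_average(lambda it: it.values, demo_iterations)
```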