westpa.core modules

westpa.core module

westpa.core.data_manager module

HDF5 data manager for WEST.

Original HDF5 implementation: Joseph W. Kaus
Current implementation: Matthew C. Zwier

WEST exclusively uses the cross-platform, self-describing file format HDF5 for data storage. This ensures that data is stored efficiently and portably in a manner that is relatively straightforward for other analysis tools (perhaps written in C/C++/Fortran) to access.

The data is laid out in HDF5 as follows:
  • summary – overall summary data for the simulation

  • /iterations/ – data for individual iterations, one group per iteration under /iterations
    • iter_00000001/ – data for iteration 1
      • seg_index – overall information about segments in the iteration, including weight

      • pcoord – progress coordinate data organized as [seg_id][time][dimension]

      • wtg_parents – data used to reconstruct the split/merge history of trajectories

      • recycling – flux and event count for recycled particles, on a per-target-state basis

      • auxdata/ – auxiliary datasets (data stored on the ‘data’ field of Segment objects)

The file root object has an integer attribute ‘west_file_format_version’ which can be used to determine how to access data even as the file format (i.e. organization of data within HDF5 file) evolves.
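As a minimal sketch of the layout above, the following builds the HDF5 path of a per-iteration dataset. The helper names are hypothetical; the eight-digit zero padding is inferred from the iter_00000001 example and the default_iter_prec = 8 attribute of WESTDataManager, and should be treated as an assumption.

```python
def iter_group_path(n_iter, iter_prec=8):
    """Return the path of the group for iteration n_iter (hypothetical helper)."""
    # Zero-pad the iteration number to iter_prec digits (assumed default: 8).
    return '/iterations/iter_{:0{prec}d}'.format(n_iter, prec=iter_prec)


def iter_dataset_path(n_iter, name, iter_prec=8):
    """Return the path of a dataset such as 'pcoord' or 'seg_index'."""
    return '{}/{}'.format(iter_group_path(n_iter, iter_prec), name)


print(iter_dataset_path(1, 'pcoord'))
```

With a library such as h5py, a path like this could be used to index an open file object, and the root attribute west_file_format_version could be read from the file's attrs mapping.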

Version history:
Version 9
  • Basis states are now saved as iter_segid instead of just segid as a pointer label.

  • Initial states are also saved in the iteration 0 file, with a negative sign.

Version 8
  • Added external links to trajectory files in iterations/iter_* groups, if the HDF5 framework was used.

  • Added an iter group for the iteration 0 to store conformations of basis states.

Version 7
  • Removed bin_assignments, bin_populations, and bin_rates from iteration group.

  • Added new_segments subgroup to iteration group

Version 6
  • ???

Version 5
  • moved iter_* groups into a top-level iterations/ group,

  • added in-HDF5 storage for basis states, target states, and generated states

class westpa.core.data_manager.attrgetter(attr, /, *attrs)

Bases: object

Return a callable object that fetches the given attribute(s) from its operand. After f = attrgetter('name'), the call f(r) returns r.name. After g = attrgetter('name', 'date'), the call g(r) returns (r.name, r.date). After h = attrgetter('name.first', 'name.last'), the call h(r) returns (r.name.first, r.name.last).

westpa.core.data_manager.relpath(path, start=None)

Return a relative version of a path

westpa.core.data_manager.dirname(p)

Returns the directory component of a pathname

class westpa.core.data_manager.Segment(n_iter=None, seg_id=None, weight=None, endpoint_type=None, parent_id=None, wtg_parent_ids=None, pcoord=None, status=None, walltime=None, cputime=None, data=None)

Bases: object

A class wrapping segment data that must be passed through the work manager or data manager. Most fields are self-explanatory. One item worth noting is that a negative parent ID means that the segment starts from the initial state with ID -(segment.parent_id+1)
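The parent ID convention can be sketched as a pair of small conversion functions. These helpers are hypothetical and are not part of Segment; they only restate the rule that a negative parent ID encodes the initial state with ID -(parent_id + 1).

```python
def initial_state_id_from_parent(parent_id):
    """Decode the initial state ID from a negative parent ID.

    Returns None when parent_id is non-negative, i.e. when the segment
    continues from a parent segment rather than an initial state.
    """
    if parent_id >= 0:
        return None
    return -(parent_id + 1)


def parent_id_from_initial_state(state_id):
    """Encode an initial state ID as a negative parent ID (inverse mapping)."""
    return -(state_id + 1)


print(initial_state_id_from_parent(-1))
print(parent_id_from_initial_state(4))
```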

SEG_STATUS_UNSET = 0
SEG_STATUS_PREPARED = 1
SEG_STATUS_COMPLETE = 2
SEG_STATUS_FAILED = 3
SEG_INITPOINT_UNSET = 0
SEG_INITPOINT_CONTINUES = 1
SEG_INITPOINT_NEWTRAJ = 2
SEG_ENDPOINT_UNSET = 0
SEG_ENDPOINT_CONTINUES = 1
SEG_ENDPOINT_MERGED = 2
SEG_ENDPOINT_RECYCLED = 3
statuses = {'SEG_STATUS_COMPLETE': 2, 'SEG_STATUS_FAILED': 3, 'SEG_STATUS_PREPARED': 1, 'SEG_STATUS_UNSET': 0}
initpoint_types = {'SEG_INITPOINT_CONTINUES': 1, 'SEG_INITPOINT_NEWTRAJ': 2, 'SEG_INITPOINT_UNSET': 0}
endpoint_types = {'SEG_ENDPOINT_CONTINUES': 1, 'SEG_ENDPOINT_MERGED': 2, 'SEG_ENDPOINT_RECYCLED': 3, 'SEG_ENDPOINT_UNSET': 0}
status_names = {0: 'SEG_STATUS_UNSET', 1: 'SEG_STATUS_PREPARED', 2: 'SEG_STATUS_COMPLETE', 3: 'SEG_STATUS_FAILED'}
initpoint_type_names = {0: 'SEG_INITPOINT_UNSET', 1: 'SEG_INITPOINT_CONTINUES', 2: 'SEG_INITPOINT_NEWTRAJ'}
endpoint_type_names = {0: 'SEG_ENDPOINT_UNSET', 1: 'SEG_ENDPOINT_CONTINUES', 2: 'SEG_ENDPOINT_MERGED', 3: 'SEG_ENDPOINT_RECYCLED'}
static initial_pcoord(segment)

Return the initial progress coordinate point of this segment.

static final_pcoord(segment)

Return the final progress coordinate point of this segment.

property initpoint_type
property initial_state_id
property status_text
property endpoint_type_text
class westpa.core.data_manager.BasisState(label, probability, pcoord=None, auxref=None, state_id=None)

Bases: object

Describes a basis (micro)state. These basis states are used to generate initial states for new trajectories, either at the beginning of the simulation (i.e. at w_init) or due to recycling.

Variables:
  • state_id – Integer identifier of this state, usually set by the data manager.

  • label – A descriptive label for this microstate (may be empty)

  • probability – Probability of this state to be selected when creating a new trajectory.

  • pcoord – The representative progress coordinate of this state.

  • auxref – A user-provided (string) reference for locating data associated with this state (usually a filesystem path).

classmethod states_to_file(states, fileobj)

Write a file defining basis states, which may then be read by states_from_file().

classmethod states_from_file(statefile)

Read a file defining basis states. Each line defines a state, and contains a label, the probability, and optionally a data reference, separated by whitespace, as in:

unbound    1.0

or:

unbound_0    0.6        state0.pdb
unbound_1    0.4        state1.pdb
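
The file format above can be parsed with a few lines of plain Python. This is a sketch of the format only, not the implementation of states_from_file(); the function name is hypothetical, and skipping blank lines and lines starting with # is an assumption.

```python
def parse_basis_state_lines(lines):
    """Parse basis state definitions into (label, probability, auxref) tuples."""
    states = []
    for line in lines:
        fields = line.split()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not fields or fields[0].startswith('#'):
            continue
        label = fields[0]
        probability = float(fields[1])
        # The data reference is optional; use None when it is absent.
        auxref = fields[2] if len(fields) > 2 else None
        states.append((label, probability, auxref))
    return states


example = [
    'unbound_0    0.6        state0.pdb',
    'unbound_1    0.4        state1.pdb',
]
for state in parse_basis_state_lines(example):
    print(state)
```
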
as_numpy_record()

Return the data for this state as a numpy record array.

class westpa.core.data_manager.TargetState(label, pcoord, state_id=None)

Bases: object

Describes a target state.

Variables:
  • state_id – Integer identifier of this state, usually set by the data manager.

  • label – A descriptive label for this microstate (may be empty)

  • pcoord – The representative progress coordinate of this state.

classmethod states_to_file(states, fileobj)

Write a file defining target states, which may then be read by states_from_file().

classmethod states_from_file(statefile, dtype)

Read a file defining target states. Each line defines a state, and contains a label followed by a representative progress coordinate value, separated by whitespace, as in:

bound     0.02

for a single target and one-dimensional progress coordinates or:

bound    2.7    0.0
drift    100    50.0

for two targets and a two-dimensional progress coordinate.
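
A sketch of reading this format follows. It is not the implementation of states_from_file(); the function name is hypothetical, values are read as Python floats rather than with a caller-supplied dtype, and skipping blank and # lines is an assumption.

```python
def parse_target_state_lines(lines):
    """Parse target state definitions into (label, pcoord) tuples."""
    states = []
    for line in lines:
        fields = line.split()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not fields or fields[0].startswith('#'):
            continue
        label = fields[0]
        # Every remaining field is one dimension of the progress coordinate.
        pcoord = [float(value) for value in fields[1:]]
        states.append((label, pcoord))
    return states


two_targets = [
    'bound    2.7    0.0',
    'drift    100    50.0',
]
for state in parse_target_state_lines(two_targets):
    print(state)
```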

class westpa.core.data_manager.InitialState(state_id, basis_state_id, iter_created, iter_used=None, istate_type=None, istate_status=None, pcoord=None, basis_state=None, basis_auxref=None)

Bases: object

Describes an initial state for a new trajectory. These are generally constructed by appropriate modification of a basis state.

Variables:
  • state_id – Integer identifier of this state, usually set by the data manager.

  • basis_state_id – Identifier of the basis state from which this state was generated, or None.

  • basis_state – The BasisState from which this state was generated, or None.

  • iter_created – Iteration in which this state was generated (0 for simulation initialization).

  • iter_used – Iteration in which this state was used to initiate a trajectory (None for unused).

  • istate_type – Integer describing the type of this initial state (ISTATE_TYPE_BASIS for direct use of a basis state, ISTATE_TYPE_GENERATED for a state generated from a basis state, ISTATE_TYPE_RESTART for a state corresponding to the endpoint of a segment in another simulation, or ISTATE_TYPE_START for a state generated from a start state).

  • istate_status – Integer describing whether this initial state has been properly prepared.

  • pcoord – The representative progress coordinate of this state.

ISTATE_TYPE_UNSET = 0
ISTATE_TYPE_BASIS = 1
ISTATE_TYPE_GENERATED = 2
ISTATE_TYPE_RESTART = 3
ISTATE_TYPE_START = 4
ISTATE_UNUSED = 0
ISTATE_STATUS_PENDING = 0
ISTATE_STATUS_PREPARED = 1
ISTATE_STATUS_FAILED = 2
istate_types = {'ISTATE_TYPE_BASIS': 1, 'ISTATE_TYPE_GENERATED': 2, 'ISTATE_TYPE_RESTART': 3, 'ISTATE_TYPE_START': 4, 'ISTATE_TYPE_UNSET': 0}
istate_type_names = {0: 'ISTATE_TYPE_UNSET', 1: 'ISTATE_TYPE_BASIS', 2: 'ISTATE_TYPE_GENERATED', 3: 'ISTATE_TYPE_RESTART', 4: 'ISTATE_TYPE_START'}
istate_statuses = {'ISTATE_STATUS_FAILED': 2, 'ISTATE_STATUS_PENDING': 0, 'ISTATE_STATUS_PREPARED': 1}
istate_status_names = {0: 'ISTATE_STATUS_PENDING', 1: 'ISTATE_STATUS_PREPARED', 2: 'ISTATE_STATUS_FAILED'}
as_numpy_record()
class westpa.core.data_manager.NewWeightEntry(source_type, weight, prev_seg_id=None, prev_init_pcoord=None, prev_final_pcoord=None, new_init_pcoord=None, target_state_id=None, initial_state_id=None)

Bases: object

NW_SOURCE_RECYCLED = 0
class westpa.core.data_manager.ExecutablePropagator(rc=None)

Bases: WESTPropagator

ENV_CURRENT_ITER = 'WEST_CURRENT_ITER'
ENV_CURRENT_SEG_ID = 'WEST_CURRENT_SEG_ID'
ENV_CURRENT_SEG_DATA_REF = 'WEST_CURRENT_SEG_DATA_REF'
ENV_CURRENT_SEG_INITPOINT = 'WEST_CURRENT_SEG_INITPOINT_TYPE'
ENV_PARENT_SEG_ID = 'WEST_PARENT_ID'
ENV_PARENT_DATA_REF = 'WEST_PARENT_DATA_REF'
ENV_BSTATE_ID = 'WEST_BSTATE_ID'
ENV_BSTATE_DATA_REF = 'WEST_BSTATE_DATA_REF'
ENV_ISTATE_ID = 'WEST_ISTATE_ID'
ENV_ISTATE_DATA_REF = 'WEST_ISTATE_DATA_REF'
ENV_STRUCT_DATA_REF = 'WEST_STRUCT_DATA_REF'
ENV_RAND16 = 'WEST_RAND16'
ENV_RAND32 = 'WEST_RAND32'
ENV_RAND64 = 'WEST_RAND64'
ENV_RAND128 = 'WEST_RAND128'
ENV_RANDFLOAT = 'WEST_RANDFLOAT'
static makepath(template, template_args=None, expanduser=True, expandvars=True, abspath=False, realpath=False)
random_val_env_vars()

Return a set of environment variables containing random seeds. These are returned as a dictionary, suitable for use in os.environ.update() or as the env argument to subprocess.Popen(). Every child process executed by exec_child() gets these.
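A dictionary of this kind might be built as sketched below. Only the variable names come from the ENV_RAND* constants above; the bit widths are inferred from those names, and the use of Python's random module and decimal string formatting are assumptions, not the propagator's actual code.

```python
import random


def sketch_random_env_vars(rng=random):
    """Build a dict of random-valued environment variables (illustrative only)."""
    env = {}
    for bits in (16, 32, 64, 128):
        # Assumption: each variable holds an unsigned integer of that width,
        # written as a decimal string.
        env['WEST_RAND{}'.format(bits)] = str(rng.getrandbits(bits))
    # Assumption: the float variable holds a value in [0, 1).
    env['WEST_RANDFLOAT'] = repr(rng.random())
    return env


for name, value in sorted(sketch_random_env_vars().items()):
    print(name, value)
```

Because every value is a string, the result is suitable for os.environ.update() or for the env argument of subprocess.Popen().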

exec_child(executable, environ=None, stdin=None, stdout=None, stderr=None, cwd=None)

Execute a child process with the environment set from the current environment, the values of self.addtl_child_environ, the random numbers returned by self.random_val_env_vars, and the given environ (applied in that order). stdin/stdout/stderr are optionally redirected.

This function waits on the child process to finish, then returns (rc, rusage), where rc is the child’s return code and rusage is the resource usage tuple from os.wait4()
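The layering order described above means later sources override earlier ones. A minimal sketch, using a hypothetical helper and a base argument added only to make the example testable:

```python
import os


def build_child_environ(addtl_child_environ, random_vars, environ=None, base=None):
    """Merge environment sources in the documented order (illustrative only)."""
    # 1. Start from the current process environment (or an explicit base).
    merged = dict(os.environ if base is None else base)
    # 2. Additional child environment configured on the propagator.
    merged.update(addtl_child_environ or {})
    # 3. Random-number variables.
    merged.update(random_vars or {})
    # 4. Caller-supplied variables are applied last, so they win.
    merged.update(environ or {})
    return merged


result = build_child_environ(
    {'B': 'addtl', 'C': 'addtl'},
    {'C': 'random'},
    environ={'A': 'given'},
    base={'A': 'current', 'B': 'current'},
)
print(result)
```

A dictionary built this way could be passed as the env argument of subprocess.Popen(). Note that os.wait4() is available only on Unix-like platforms.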

exec_child_from_child_info(child_info, template_args, environ)
update_args_env_basis_state(template_args, environ, basis_state)
update_args_env_initial_state(template_args, environ, initial_state)
update_args_env_iter(template_args, environ, n_iter)
update_args_env_segment(template_args, environ, segment)
template_args_for_segment(segment)
exec_for_segment(child_info, segment, addtl_env=None)

Execute a child process with environment and template expansion from the given segment.

exec_for_iteration(child_info, n_iter, addtl_env=None)

Execute a child process with environment and template expansion from the given iteration number.

exec_for_basis_state(child_info, basis_state, addtl_env=None)

Execute a child process with environment and template expansion from the given basis state

exec_for_initial_state(child_info, initial_state, addtl_env=None)

Execute a child process with environment and template expansion from the given initial state.

prepare_file_system(segment, environ)
setup_dataset_return(segment=None, subset_keys=None)

Set up temporary files and environment variables that point to them for segment runners to return data. segment is the Segment object that the return data is associated with. subset_keys specifies the names of a subset of data to be returned.

retrieve_dataset_return(state, return_files, del_return_files, single_point)

Retrieve returned data from the temporary locations directed by the environment variables. state is a Segment, BasisState, or InitialState object that the return data is associated with. return_files is a dict where the keys are the dataset names and the values are the paths to the temporary files that contain the returned data. del_return_files is a dict where the keys are the names of datasets to be deleted (if the corresponding value is set to True) once the data is retrieved.

get_pcoord(state)

Get the progress coordinate of the given basis or initial state.

gen_istate(basis_state, initial_state)

Generate a new initial state from the given basis state.

prepare_iteration(n_iter, segments)

Perform any necessary per-iteration preparation. This is run by the work manager.

finalize_iteration(n_iter, segments)

Perform any necessary post-iteration cleanup. This is run by the work manager.

propagate(segments)

Propagate one or more segments, including any necessary per-iteration setup and teardown for this propagator.

westpa.core.data_manager.makepath(template, template_args=None, expanduser=True, expandvars=True, abspath=False, realpath=False)
class westpa.core.data_manager.flushing_lock(lock, fileobj)

Bases: object

class westpa.core.data_manager.expiring_flushing_lock(lock, flush_method, nextsync)

Bases: object

westpa.core.data_manager.seg_id_dtype

alias of int64

westpa.core.data_manager.n_iter_dtype

alias of uint32

westpa.core.data_manager.weight_dtype

alias of float64

westpa.core.data_manager.utime_dtype

alias of float64

westpa.core.data_manager.seg_status_dtype

alias of uint8

westpa.core.data_manager.seg_initpoint_dtype

alias of uint8

westpa.core.data_manager.seg_endpoint_dtype

alias of uint8

westpa.core.data_manager.istate_type_dtype

alias of uint8

westpa.core.data_manager.istate_status_dtype

alias of uint8

westpa.core.data_manager.nw_source_dtype

alias of uint8

class westpa.core.data_manager.WESTDataManager(rc=None)

Bases: object

Data manager for assisting the reading and writing of WEST data from/to HDF5 files.

default_iter_prec = 8
default_we_h5filename = 'west.h5'
default_we_h5file_driver = None
default_flush_period = 60
default_aux_compression_threshold = 1048576
binning_hchunksize = 4096
table_scan_chunksize = 1024
flushing_lock()
expiring_flushing_lock()
process_config()
property system
property closed
iter_group_name(n_iter, absolute=True)
require_iter_group(n_iter)

Get the group associated with n_iter, creating it if necessary.

del_iter_group(n_iter)
get_iter_group(n_iter)
get_seg_index(n_iter)
property current_iteration
open_backing(mode=None)

Open the (already-created) HDF5 file named in self.west_h5filename.

prepare_backing()

Create new HDF5 file

close_backing()
flush_backing()
save_target_states(tstates, n_iter=None)

Save the given target states in the HDF5 file; they will be used for the next iteration to be propagated. A complete set is required, even if nominally appending to an existing set, which simplifies the mapping of IDs to the table.

find_tstate_group(n_iter)
find_ibstate_group(n_iter)
get_target_states(n_iter)

Return a list of TargetState objects representing the target (sink) states that are in use for iteration n_iter. Future iterations are assumed to continue from the most recent set of states.

create_ibstate_group(basis_states, n_iter=None)

Create the group used to store basis states and initial states (whose definitions are always coupled). This group is hard-linked into all iteration groups that use these basis and initial states.

create_ibstate_iter_h5file(basis_states)

Create the per-iteration HDF5 file for the basis states (i.e., iteration 0). This special treatment is needed so that the analysis tools can access basis states more easily.

update_iter_h5file(n_iter, segments)

Write out the per-iteration HDF5 file with given segments and add an external link to it in the main HDF5 file (west.h5) if the link is not present.

get_basis_states(n_iter=None)

Return a list of BasisState objects representing the basis states that are in use for iteration n_iter.

create_initial_states(n_states, n_iter=None)

Create storage for n_states initial states associated with iteration n_iter, and return bare InitialState objects with only state_id set.

update_initial_states(initial_states, n_iter=None)

Save the given initial states in the HDF5 file

get_initial_states(n_iter=None)
get_segment_initial_states(segments, n_iter=None)

Retrieve all initial states referenced by the given segments.

get_unused_initial_states(n_states=None, n_iter=None)

Retrieve any prepared but unused initial states applicable to the given iteration. Up to n_states states are returned; if n_states is None, then all unused states are returned.

prepare_iteration(n_iter, segments)

Prepare for a new iteration by creating space to store the new iteration’s data. The number of segments, their IDs, and their lineage must be determined and included in the set of segments passed in.

Update the per-iteration hard links pointing to the tables of target and initial/basis states for the given iteration. These links are not used by this class, but are remarkably convenient for third-party analysis tools and hdfview.

get_iter_summary(n_iter=None)
update_iter_summary(summary, n_iter=None)
del_iter_summary(min_iter)
update_segments(n_iter, segments)

Update segment information in the HDF5 file; all prior information for each segment is overwritten, except for parent and weight transfer information.

get_segments(n_iter=None, seg_ids=None, load_pcoords=True)

Return the given (or all) segments from a given iteration.

If the optional parameter load_auxdata is true, then all auxiliary datasets available are loaded and mapped onto the data dictionary of each segment. If load_auxdata is None, then use the default self.auto_load_auxdata, which can be set by the option load_auxdata in the [data] section of west.cfg. This essentially requires as much RAM as there is per-iteration auxiliary data, so this behavior is not on by default.

prepare_segment_restarts(segments, basis_states=None, initial_states=None)

Prepare the necessary folders and files for propagating the simulation, given the data stored in the parent per-iteration HDF5 file. basis_states and initial_states should be provided if the segments are newly created.

get_all_parent_ids(n_iter)
get_parent_ids(n_iter, seg_ids=None)

Return a sequence of the parent IDs of the given seg_ids.

get_weights(n_iter, seg_ids)

Return the weights associated with the given seg_ids

get_child_ids(n_iter, seg_id)

Return the seg_ids of segments that have the given segment as a parent.
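The lookup can be sketched as a scan over parent IDs. This is an illustration, not the data manager's code: it assumes the parent IDs of the following iteration are available as a sequence indexed by seg_id, and the function name is hypothetical.

```python
def child_ids(parent_ids, seg_id):
    """Return the indices (seg_ids) whose recorded parent is seg_id."""
    # Assumption: parent_ids[i] is the parent ID of segment i in the
    # iteration that follows the one containing seg_id.
    return [child for child, parent in enumerate(parent_ids) if parent == seg_id]


# Negative entries denote segments started from an initial state.
next_iter_parent_ids = [0, 0, 1, -1, 1]
print(child_ids(next_iter_parent_ids, 1))
```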

get_children(segment)

Return all segments which have the given segment as a parent

prepare_run()
finalize_run()
save_new_weight_data(n_iter, new_weights)

Save a set of NewWeightEntry objects to HDF5. Note that this should be called for the iteration in which the weights appear in their new locations (e.g. for recycled walkers, the iteration following recycling).

get_new_weight_data(n_iter)
find_bin_mapper(hashval)

Check to see if the given hash value is in the binning table. Returns the index in the bin data tables if found, or raises KeyError if not.

get_bin_mapper(hashval)

Look up the given hash value in the binning table, unpickling and returning the corresponding bin mapper if available, or raising KeyError if not.

save_bin_mapper(hashval, pickle_data)

Store the given mapper in the table of saved mappers. If the mapper cannot be stored, PickleError will be raised. Returns the index in the bin data tables where the mapper is stored.
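The find/get/save contract described above can be illustrated with an in-memory stand-in for the HDF5 tables. Everything here is hypothetical: the class name, the use of SHA-256 over the pickled bytes as the hash, and the choice to return the existing index when a hash is saved twice are all assumptions.

```python
import hashlib
import pickle


class MapperTableSketch:
    """In-memory stand-in for a table of pickled bin mappers."""

    def __init__(self):
        self._hashes = []
        self._pickles = []

    def find(self, hashval):
        """Return the table index for hashval, or raise KeyError."""
        try:
            return self._hashes.index(hashval)
        except ValueError:
            raise KeyError(hashval) from None

    def get(self, hashval):
        """Unpickle and return the mapper stored under hashval."""
        return pickle.loads(self._pickles[self.find(hashval)])

    def save(self, hashval, pickle_data):
        """Store pickle_data under hashval and return its index."""
        try:
            return self.find(hashval)  # assumption: reuse an existing entry
        except KeyError:
            self._hashes.append(hashval)
            self._pickles.append(pickle_data)
            return len(self._hashes) - 1


mapper = {'boundaries': [0.0, 1.0, 2.0]}  # placeholder for a real bin mapper
data = pickle.dumps(mapper)
hashval = hashlib.sha256(data).hexdigest()

table = MapperTableSketch()
index = table.save(hashval, data)
print(index, table.get(hashval))
```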

save_iter_binning(n_iter, hashval, pickled_mapper, target_counts)

Save information about the binning used to generate segments for iteration n_iter.

westpa.core.data_manager.normalize_dataset_options(dsopts, path_prefix='', n_iter=0)
westpa.core.data_manager.create_dataset_from_dsopts(group, dsopts, shape=None, dtype=None, data=None, autocompress_threshold=None, n_iter=None)
westpa.core.data_manager.require_dataset_from_dsopts(group, dsopts, shape=None, dtype=None, data=None, autocompress_threshold=None, n_iter=None)
westpa.core.data_manager.calc_chunksize(shape, dtype, max_chunksize=262144)

Calculate a chunk size for HDF5 data, anticipating that access will slice along lower dimensions sooner than higher dimensions.
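One way to read this description is that chunks should be shortest along the leading dimensions and span as much of the trailing dimensions as the size budget allows. The sketch below implements that reading; it is not the actual calc_chunksize algorithm, the function name is hypothetical, and it takes an item size in bytes instead of a dtype.

```python
import math


def sketch_chunk_shape(shape, itemsize, max_chunksize=262144):
    """Shrink leading axes first until one chunk fits in max_chunksize bytes."""
    chunk = list(shape)
    for axis in range(len(chunk)):
        if math.prod(chunk) * itemsize <= max_chunksize:
            break  # the chunk already fits in the budget
        # Bytes occupied by one full slab of all axes after this one.
        rest_bytes = math.prod(chunk[axis + 1:]) * itemsize
        # Keep as many slabs along this axis as fit, but at least one.
        chunk[axis] = max(1, min(chunk[axis], max_chunksize // rest_bytes))
    return tuple(chunk)


# 1000 segments x 100 time points x 3 dimensions of 8-byte floats.
print(sketch_chunk_shape((1000, 100, 3), itemsize=8))
```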

westpa.core.extloader module

westpa.core.extloader.load_module(module_name, path=None)

Load and return the given module, recursively loading containing packages as necessary.

westpa.core.extloader.get_object(object_name, path=None)

Attempt to load the given object, using additional path information if given.

westpa.core.h5io module

Miscellaneous routines to help with HDF5 input and output of WEST-related data.

class westpa.core.h5io.Trajectory(xyz, topology, time=None, unitcell_lengths=None, unitcell_angles=None)

Bases: object

Container object for a molecular dynamics trajectory

A Trajectory represents a collection of one or more molecular structures, generally (but not necessarily) from a molecular dynamics trajectory. The Trajectory stores a number of fields describing the system through time, including the cartesian coordinates of each atom (xyz), the topology of the molecular system (topology), and information about the unitcell if appropriate (unitcell_vectors, unitcell_lengths, unitcell_angles).

A Trajectory should generally be constructed by loading a file from disk. Trajectories can be loaded from (and saved to) the PDB, XTC, TRR, DCD, binpos, NetCDF or MDTraj HDF5 formats.

Trajectory supports fancy indexing, so you can extract one or more frames from a Trajectory as a separate trajectory. For example, to form a trajectory with every other frame, you can slice with traj[::2].

Trajectory uses the nanometer, degree & picosecond unit system.

Examples

>>> # loading a trajectory
>>> import mdtraj as md
>>> md.load('trajectory.xtc', top='native.pdb')
<mdtraj.Trajectory with 1000 frames, 22 atoms at 0x1058a73d0>
>>> # slicing a trajectory
>>> t = md.load('trajectory.h5')
>>> print(t)
<mdtraj.Trajectory with 100 frames, 22 atoms>
>>> print(t[::2])
<mdtraj.Trajectory with 50 frames, 22 atoms>
>>> # calculating the average distance between two atoms
>>> import mdtraj as md
>>> import numpy as np
>>> t = md.load('trajectory.h5')
>>> np.mean(np.sqrt(np.sum((t.xyz[:, 0, :] - t.xyz[:, 21, :])**2, axis=1)))

See also

mdtraj.load

High-level function that loads files and returns an md.Trajectory

n_frames
Type:

int

n_atoms
Type:

int

n_residues
Type:

int

time
Type:

np.ndarray, shape=(n_frames,)

timestep
Type:

float

topology
Type:

md.Topology

top
Type:

md.Topology

xyz
Type:

np.ndarray, shape=(n_frames, n_atoms, 3)

unitcell_vectors
Type:

{np.ndarray, shape=(n_frames, 3, 3), None}

unitcell_lengths
Type:

{np.ndarray, shape=(n_frames, 3), None}

unitcell_angles
Type:

{np.ndarray, shape=(n_frames, 3), None}

property n_frames

Number of frames in the trajectory

Returns:

n_frames – The number of frames in the trajectory

Return type:

int

property n_atoms

Number of atoms in the trajectory

Returns:

n_atoms – The number of atoms in the trajectory

Return type:

int

property n_residues

Number of residues (amino acids) in the trajectory

Returns:

n_residues – The number of residues in the trajectory’s topology

Return type:

int

property n_chains

Number of chains in the trajectory

Returns:

n_chains – The number of chains in the trajectory’s topology

Return type:

int

property top

Alias for self.topology, describing the organization of atoms into residues, bonds, etc

Returns:

topology – The topology object, describing the organization of atoms into residues, bonds, etc

Return type:

md.Topology

property timestep

Timestep between frames, in picoseconds

Returns:

timestep – The timestep between frames, in picoseconds.

Return type:

float

property unitcell_vectors

The vectors that define the shape of the unit cell in each frame

Returns:

vectors – Vectors defining the shape of the unit cell in each frame. The semantics of this array are that the shape of the unit cell in frame i is given by the three vectors value[i, 0, :], value[i, 1, :], and value[i, 2, :].

Return type:

np.ndarray, shape=(n_frames, 3, 3)

property unitcell_volumes

Volumes of unit cell for each frame.

Returns:

volumes – Volumes of the unit cell in each frame, in nanometers^3, or None if the Trajectory contains no unitcell information.

Return type:

{np.ndarray, shape=(n_frames), None}
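
For a single frame, the volume of a unit cell follows from its three box vectors through the scalar triple product |a · (b × c)|. The sketch below uses plain tuples and hypothetical helper names; it illustrates the geometry and is not the code behind unitcell_volumes.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


def unitcell_volume(vectors):
    """Volume of the cell spanned by the rows a, b, c of vectors."""
    a, b, c = vectors
    return abs(dot(a, cross(b, c)))


# Orthorhombic box with edge lengths 2, 3 and 4 nm.
print(unitcell_volume(((2.0, 0.0, 0.0), (0.0, 3.0, 0.0), (0.0, 0.0, 4.0))))
```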

superpose(reference, frame=0, atom_indices=None, ref_atom_indices=None, parallel=True)

Superpose each conformation in this trajectory upon a reference

Parameters:
  • reference (md.Trajectory) – Align self to a particular frame in reference

  • frame (int) – The index of the conformation in reference to align to.

  • atom_indices (array_like, or None) – The indices of the atoms to superpose. If not supplied, all atoms will be used.

  • ref_atom_indices (array_like, or None) – Use these atoms on the reference structure. If not supplied, the same atom indices will be used for this trajectory and the reference one.

  • parallel (bool) – Use OpenMP to run the superposition in parallel over multiple cores

Return type:

self

join(other, check_topology=True, discard_overlapping_frames=False)

Join two trajectories together along the time/frame axis.

This method joins trajectories along the time axis, giving a new trajectory of length equal to the sum of the lengths of self and other. It can also be called by using self + other

Parameters:
  • other (Trajectory or list of Trajectory) – One or more trajectories to join with this one. These trajectories are appended to the end of this trajectory.

  • check_topology (bool) – Ensure that the topology of self and other are identical before joining them. If false, the resulting trajectory will have the topology of self.

  • discard_overlapping_frames (bool, optional) – If True, compare coordinates at trajectory edges to discard overlapping frames. Default: False.

See also

stack

join two trajectories along the atom axis

stack(other, keep_resSeq=True)

Stack two trajectories along the atom axis

This method joins trajectories along the atom axis, giving a new trajectory with a number of atoms equal to the sum of the number of atoms in self and other.

Notes

The resulting trajectory will have the unitcell and time information of the left operand.

Examples

>>> t1 = md.load('traj1.h5')
>>> t2 = md.load('traj2.h5')
>>> # even when t2 contains no unitcell information
>>> t2.unitcell_vectors = None
>>> stacked = t1.stack(t2)
>>> # the stacked trajectory inherits the unitcell information
>>> # from the first trajectory
>>> np.all(stacked.unitcell_vectors == t1.unitcell_vectors)
True

Parameters:
  • other (Trajectory) – The other trajectory to join

  • keep_resSeq (bool, optional, default=True) – see `mdtraj.core.topology.Topology.join` method documentation

See also

join

join two trajectories along the time/frame axis.

slice(key, copy=True)

Slice trajectory, by extracting one or more frames into a separate object

This method can also be called using index bracket notation, i.e., traj[1] == traj.slice(1)

Parameters:
  • key ({int, np.ndarray, slice}) – The slice to take. Can be either an int, a list of ints, or a slice object.

  • copy (bool, default=True) – Copy the arrays after slicing. If you set this to false, then if you modify a slice, you’ll modify the original array since they point to the same data.

property topology

Topology of the system, describing the organization of atoms into residues, bonds, etc

Returns:

topology – The topology object, describing the organization of atoms into residues, bonds, etc

Return type:

md.Topology

property xyz

Cartesian coordinates of each atom in each simulation frame

Returns:

xyz – A three-dimensional numpy array, with the cartesian coordinates of each atom in each frame.

Return type:

np.ndarray, shape=(n_frames, n_atoms, 3)

property unitcell_lengths

Lengths that define the shape of the unit cell in each frame.

Returns:

lengths – Lengths of the unit cell in each frame, in nanometers, or None if the Trajectory contains no unitcell information.

Return type:

{np.ndarray, shape=(n_frames, 3), None}

property unitcell_angles

Angles that define the shape of the unit cell in each frame.

Returns:

angles – The angles between the three unitcell vectors in each frame: alpha, beta, and gamma. alpha gives the angle between vectors b and c, beta gives the angle between vectors c and a, and gamma gives the angle between vectors a and b. The angles are in degrees.

Return type:

np.ndarray, shape=(n_frames, 3)
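
The angle convention can be checked for a single frame by computing each angle from the pair of box vectors it is defined on. This sketch uses plain tuples and hypothetical helper names and is not the code behind unitcell_angles.

```python
import math


def angle_deg(u, v):
    """Angle between two 3-vectors, in degrees."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


def unitcell_angles(a, b, c):
    """Return (alpha, beta, gamma) for box vectors a, b, c."""
    alpha = angle_deg(b, c)  # between b and c
    beta = angle_deg(c, a)   # between c and a
    gamma = angle_deg(a, b)  # between a and b
    return alpha, beta, gamma


print(unitcell_angles((1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 0.0, 1.0)))
```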

property time

The simulation time corresponding to each frame, in picoseconds

Returns:

time – The simulation time corresponding to each frame, in picoseconds

Return type:

np.ndarray, shape=(n_frames,)

openmm_positions(frame)

OpenMM-compatible positions of a single frame.

Examples

>>> t = md.load('trajectory.h5')
>>> context.setPositions(t.openmm_positions(0))
Parameters:

frame (int) – The index of frame of the trajectory that you wish to extract

Returns:

positions – The cartesian coordinates of specific trajectory frame, formatted for input to OpenMM

Return type:

list

openmm_boxes(frame)

OpenMM-compatible box vectors of a single frame.

Examples

>>> t = md.load('trajectory.h5')
>>> context.setPeriodicBoxVectors(*t.openmm_boxes(0))

Parameters:

frame (int) – Return box for this single frame.

Returns:

box – The periodic box vectors for this frame, formatted for input to OpenMM.

Return type:

tuple

static load(filenames, **kwargs)

Load a trajectory from disk

Parameters:
  • filenames ({path-like, [path-like]}) – Either a path or list of paths

  • **kwargs – Other keyword arguments, as requested by the various load functions; which ones apply depends on the file extension.

save(filename, **kwargs)

Save trajectory to disk, in a format determined by the filename extension

Parameters:
  • filename (path-like) – filesystem path in which to save the trajectory. The extension will be parsed and will control the format.

  • lossy (bool) – For .h5 or .lh5, whether or not to use compression.

  • no_models (bool) – For .pdb. TODO: Document this?

  • force_overwrite (bool) – If filename already exists, overwrite it.

save_hdf5(filename, force_overwrite=True)

Save trajectory to MDTraj HDF5 format

Parameters:
  • filename (path-like) – filesystem path in which to save the trajectory

  • force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there

save_lammpstrj(filename, force_overwrite=True)

Save trajectory to LAMMPS custom dump format

Parameters:
  • filename (path-like) – filesystem path in which to save the trajectory

  • force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there

save_xyz(filename, force_overwrite=True)

Save trajectory to .xyz format.

Parameters:
  • filename (path-like) – filesystem path in which to save the trajectory

  • force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there

save_pdb(filename, force_overwrite=True, bfactors=None)

Save trajectory to RCSB PDB format

Parameters:
  • filename (path-like) – filesystem path in which to save the trajectory

  • force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there

  • bfactors (array_like, default=None, shape=(n_frames, n_atoms) or (n_atoms,)) – Save bfactors with pdb file. If the array is two dimensional it should contain a bfactor for each atom in each frame of the trajectory. Otherwise, the same bfactor will be saved in each frame.

save_xtc(filename, force_overwrite=True)

Save trajectory to Gromacs XTC format

Parameters:
  • filename (path-like) – filesystem path in which to save the trajectory

  • force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there

save_trr(filename, force_overwrite=True)

Save trajectory to Gromacs TRR format

Notes

Only the xyz coordinates and the time are saved; the velocities and forces in the TRR file will be zeros.

Parameters:
  • filename (path-like) – filesystem path in which to save the trajectory

  • force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there

save_dcd(filename, force_overwrite=True)

Save trajectory to CHARMM/NAMD DCD format

Parameters:
  • filename (path-like) – filesystem path in which to save the trajectory

  • force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there

save_dtr(filename, force_overwrite=True)

Save trajectory to DESMOND DTR format

Parameters:
  • filename (path-like) – filesystem path in which to save the trajectory

  • force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there

save_binpos(filename, force_overwrite=True)

Save trajectory to AMBER BINPOS format

Parameters:
  • filename (path-like) – filesystem path in which to save the trajectory

  • force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there

save_mdcrd(filename, force_overwrite=True)

Save trajectory to AMBER mdcrd format

Parameters:
  • filename (path-like) – filesystem path in which to save the trajectory

  • force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there

save_netcdf(filename, force_overwrite=True)

Save trajectory in AMBER NetCDF format

Parameters:
  • filename (path-like) – filesystem path in which to save the trajectory

  • force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there

save_netcdfrst(filename, force_overwrite=True)

Save trajectory in AMBER NetCDF restart format

Parameters:
  • filename (path-like) – filesystem path in which to save the restart

  • force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there

Notes

NetCDF restart files can only store a single frame. If only one frame exists, “filename” will be written. Otherwise, “filename.#” will be written, where # is a zero-padded number from 1 to the total number of frames in the trajectory
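
The naming rule in the note above can be sketched as follows. The helper restart_filenames is hypothetical, and the padding width (the number of digits in the frame count) is an assumption; the note only states that the number is zero-padded.

```python
def restart_filenames(filename, n_frames):
    """Return the file names a restart save would produce.

    Hypothetical helper. The padding width (digits in n_frames) is an
    assumption; the documentation only says the number is zero-padded.
    """
    if n_frames == 1:
        # A single frame is written to the name exactly as given.
        return [filename]
    width = len(str(n_frames))
    return [f"{filename}.{i:0{width}d}" for i in range(1, n_frames + 1)]


names = restart_filenames("seg.ncrst", 12)
print(names[0], names[-1])
```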

save_amberrst7(filename, force_overwrite=True)

Save trajectory in AMBER ASCII restart format

Parameters:
  • filename (path-like) – filesystem path in which to save the restart

  • force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there

Notes

Amber restart files can only store a single frame. If only one frame exists, “filename” will be written. Otherwise, “filename.#” will be written, where # is a zero-padded number from 1 to the total number of frames in the trajectory

save_lh5(filename, force_overwrite=True)

Save trajectory in deprecated MSMBuilder2 LH5 (lossy HDF5) format.

Parameters:
  • filename (path-like) – filesystem path in which to save the trajectory

  • force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there

save_gro(filename, force_overwrite=True, precision=3)

Save trajectory in Gromacs .gro format

Parameters:
  • filename (path-like) – Path to save the trajectory

  • force_overwrite (bool, default=True) – Overwrite anything that exists at that filename if it exists

  • precision (int, default=3) – The number of decimal places to use for coordinates in GRO file

save_tng(filename, force_overwrite=True)

Save trajectory to Gromacs TNG format

Parameters:
  • filename (path-like) – filesystem path in which to save the trajectory

  • force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there

save_gsd(filename, force_overwrite=True)

Save trajectory to HOOMD GSD format

Parameters:
  • filename (path-like) – filesystem path in which to save the trajectory

  • force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there

center_coordinates(mass_weighted=False)

Center each trajectory frame at the origin (0,0,0).

This method acts inplace on the trajectory. The centering can be either uniformly weighted (mass_weighted=False) or weighted by the mass of each atom (mass_weighted=True).

Parameters:

mass_weighted (bool, optional (default = False)) – If True, weight atoms by mass when removing COM.

Return type:

self
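
The difference between the two weighting modes can be illustrated on a single frame. This is a plain-Python sketch of the arithmetic only: center_frame is a hypothetical function that returns a new list, whereas the method documented above modifies the trajectory in place.

```python
def center_frame(xyz, masses=None):
    """Return a copy of one frame (a list of [x, y, z]) centered at the origin.

    With masses=None every atom gets equal weight (geometric center);
    otherwise the center of mass is removed.
    """
    weights = masses if masses is not None else [1.0] * len(xyz)
    total = sum(weights)
    center = [
        sum(w * atom[d] for w, atom in zip(weights, xyz)) / total
        for d in range(3)
    ]
    return [[atom[d] - center[d] for d in range(3)] for atom in xyz]


frame = [[0.0, 0.0, 0.0], [4.0, 0.0, 0.0]]
uniform = center_frame(frame)                      # geometric center removed
weighted = center_frame(frame, masses=[3.0, 1.0])  # center of mass removed
print(uniform)
print(weighted)
```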

restrict_atoms(**kwargs)

DEPRECATED: restrict_atoms was replaced by atom_slice and will be removed in 2.0

Retain only a subset of the atoms in a trajectory

Deletes atoms not in atom_indices, and re-indexes those that remain

Parameters:
  • atom_indices (array-like, dtype=int, shape=(n_atoms)) – List of atom indices to keep.

  • inplace (bool, default=True) – If True, the operation is done inplace, modifying self. Otherwise, a copy is returned with the restricted atoms, and self is not modified.

Returns:

traj – The return value is either self, or the new trajectory, depending on the value of inplace.

Return type:

md.Trajectory

atom_slice(atom_indices, inplace=False)

Create a new trajectory from a subset of atoms

Parameters:
  • atom_indices (array-like, dtype=int, shape=(n_atoms)) – List of indices of atoms to retain in the new trajectory.

  • inplace (bool, default=False) – If True, the operation is done inplace, modifying self. Otherwise, a copy is returned with the sliced atoms, and self is not modified.

Returns:

traj – The return value is either self, or the new trajectory, depending on the value of inplace.

Return type:

md.Trajectory

See also

stack

stack multiple trajectories along the atom axis

remove_solvent(exclude=None, inplace=False)

Create a new trajectory without solvent atoms

Parameters:
  • exclude (array-like, dtype=str, shape=(n_solvent_types)) – List of solvent residue names to retain in the new trajectory.

  • inplace (bool, default=False) – If True, the operation is done inplace, modifying self. Otherwise, a copy is returned without the solvent atoms, and self is not modified.

Returns:

traj – The return value is either self, or the new trajectory, depending on the value of inplace.

Return type:

md.Trajectory

smooth(width, order=3, atom_indices=None, inplace=False)

Smooth a trajectory using a zero-delay Butterworth filter. Please note that for optimal results the trajectory should be properly aligned prior to smoothing (see md.Trajectory.superpose).

Parameters:
  • width (int) – This acts very similarly to the window size in a moving average smoother. In this implementation, the frequency of the low-pass filter is taken to be two over this width, so it’s like “half the period” of the sinusoid where the filter starts to kick in. Must be an integer greater than one.

  • order (int, optional, default=3) – The order of the filter. A small odd number is recommended. Higher order filters cut off more quickly, but have worse numerical properties.

  • atom_indices (array-like, dtype=int, shape=(n_atoms), default=None) – List of indices of atoms to retain in the new trajectory. Default is set to None, which applies smoothing to all atoms.

  • inplace (bool, default=False) – If True, the operation is done inplace, modifying self. Otherwise, a smoothed copy is returned, and self is not modified.

Returns:

traj – The return value is either self, or the new smoothed trajectory, depending on the value of inplace.

Return type:

md.Trajectory

make_molecules_whole(inplace=False, sorted_bonds=None)

Only make molecules whole

Parameters:
  • inplace (bool) – If False, a new Trajectory is created and returned. If True, this Trajectory is modified directly.

  • sorted_bonds (array of shape (n_bonds, 2)) – Pairs of atom indices that define bonds, in sorted order. If not specified, these will be determined from the trajectory’s topology.

See also

image_molecules

image_molecules(inplace=False, anchor_molecules=None, other_molecules=None, sorted_bonds=None, make_whole=True)

Recenter and apply periodic boundary conditions to the molecules in each frame of the trajectory.

This method is useful for visualizing a trajectory in which molecules were not wrapped to the periodic unit cell, or in which the macromolecules are not centered with respect to the solvent. It tries to be intelligent in deciding what molecules to center, so you can simply call it and trust that it will “do the right thing”.

Parameters:
  • inplace (bool, default=False) – If False, a new Trajectory is created and returned. If True, this Trajectory is modified directly.

  • anchor_molecules (list of atom sets, optional, default=None) – Molecules that should be treated as “anchors”. These molecules will be centered in the box and put near each other. If not specified, anchor molecules are guessed using a heuristic.

  • other_molecules (list of atom sets, optional, default=None) – Molecules that are not anchors. If not specified, these will be molecules other than the anchor molecules

  • sorted_bonds (array of shape (n_bonds, 2)) – Pairs of atom indices that define bonds, in sorted order. If not specified, these will be determined from the trajectory’s topology. Only relevant if make_whole is True.

  • make_whole (bool) – Whether to make molecules whole.

Returns:

traj – The return value is either self or the new trajectory, depending on the value of inplace.

Return type:

md.Trajectory

See also

Topology.guess_anchor_molecules

westpa.core.h5io.join_traj(trajs, check_topology=True, discard_overlapping_frames=False)

Concatenate multiple trajectories into one long trajectory

Parameters:
  • trajs (iterable of trajectories) – Combine these into one trajectory

  • check_topology (bool) – Make sure topologies match before joining

  • discard_overlapping_frames (bool) – Check for overlapping frames and discard

westpa.core.h5io.in_units_of(quantity, units_in, units_out, inplace=False)

Convert a numerical quantity between unit systems.

Parameters:
  • quantity ({number, np.ndarray, openmm.unit.Quantity}) – quantity can either be a unitted quantity – i.e. instance of openmm.unit.Quantity, or just a bare number or numpy array

  • units_in (str) – If you supply a quantity that’s not an openmm.unit.Quantity, you should tell me what units it is in. If you don’t, I’m just going to echo you back your quantity without doing any unit checking.

  • units_out (str) – A string description of the units you want out. This should look like “nanometers/picosecond” or “nanometers**3” or whatever

  • inplace (bool) – Attempt to do the transformation inplace, by mutating the quantity argument and avoiding a copy. This is only possible if quantity is a writable numpy array.

Returns:

rquantity – The resulting quantity, in the new unit system. If the function was called with inplace=True and quantity was a writable numpy array, rquantity will alias the same memory as the input quantity, which will have been changed inplace. Otherwise, if a copy was required, rquantity will point to new memory.

Return type:

{number, np.ndarray}
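
The arithmetic behind a conversion such as meter**2/second to nanometers**2/picosecond can be checked by hand. The sketch below hard-codes that one conversion using the SI prefixes; m2_per_s_to_nm2_per_ps is a hypothetical name and does not parse unit strings the way in_units_of does.

```python
NM_PER_M = 1e9    # nanometers in one meter
PS_PER_S = 1e12   # picoseconds in one second


def m2_per_s_to_nm2_per_ps(value):
    """Convert a value in meter**2/second to nanometers**2/picosecond."""
    # Length appears squared in the numerator, time once in the denominator.
    return value * NM_PER_M**2 / PS_PER_S


converted = m2_per_s_to_nm2_per_ps(1)
print(converted)
```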

Examples

>>> in_units_of(1, 'meter**2/second', 'nanometers**2/picosecond')
1000000.0

westpa.core.h5io.import_(module)

Import a module, and issue a nice message to stderr if the module isn’t installed.

Currently, this function will print nice error messages for networkx, tables, netCDF4, and openmm.unit, which are optional MDTraj dependencies.

Parameters:

module (str) – The module you’d like to import, as a string

Returns:

module – The module object

Return type:

{module, object}

Examples

>>> # the following two lines are equivalent. the difference is that the
>>> # second will check for an ImportError and print you a very nice
>>> # user-facing message about what's wrong (where you can install the
>>> # module from, etc) if the import fails
>>> import tables
>>> tables = import_('tables')

westpa.core.h5io.ensure_type(val, dtype, ndim, name, length=None, can_be_none=False, shape=None, warn_on_cast=True, add_newaxis_on_deficient_ndim=False)

Typecheck the size, shape and dtype of a numpy array, with optional casting.

Parameters:
  • val ({np.ndarray, None}) – The array to check

  • dtype ({np.dtype, str}) – The dtype you’d like the array to have

  • ndim (int) – The number of dimensions you’d like the array to have

  • name (str) – name of the array. This is used when throwing exceptions, so that we can describe to the user which array is messed up.

  • length (int, optional) – How long should the array be?

  • can_be_none (bool) – Is val == None acceptable?

  • shape (tuple, optional) – What should the shape of the array be? If the provided tuple has Nones in it, those will be interpreted as matching any length in that dimension. So, for example, using the shape spec (None, None, 3) will ensure that the last dimension is of length three without constraining the first two dimensions

  • warn_on_cast (bool, default=True) – Raise a warning when the dtypes don’t match and a cast is done.

  • add_newaxis_on_deficient_ndim (bool, default=False) – Add a new axis to the beginning of the array if the number of dimensions is deficient by one compared to your specification. For instance, if you’re trying to get out an array of ndim == 3, but the user provides an array of shape == (10, 10), a new axis will be created with length 1 in front, so that the return value is of shape (1, 10, 10).

Notes

The returned value will always be C-contiguous.

Returns:

typechecked_val – If val=None and can_be_none=True, then this will return None. Otherwise, it will return val (or a copy of val). If the dtype wasn’t right, it’ll be cast to the right dtype. If the array was not C-contiguous, it’ll be copied as well.

Return type:

np.ndarray, None
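
The shape rules described for shape and add_newaxis_on_deficient_ndim can be sketched on bare shape tuples. Both helpers below are hypothetical and operate on tuples only; they illustrate the matching logic and are not the function's implementation.

```python
def shape_matches(actual, spec):
    """Return True if `actual` satisfies `spec`; None in `spec` matches any length."""
    if len(actual) != len(spec):
        return False
    return all(s is None or s == a for a, s in zip(actual, spec))


def promote_shape(actual, ndim):
    """Prepend a length-1 axis when `actual` is short by exactly one dimension."""
    if len(actual) == ndim - 1:
        return (1,) + tuple(actual)
    return tuple(actual)


print(shape_matches((5, 22, 3), (None, None, 3)))   # last axis has length 3
print(shape_matches((5, 22, 2), (None, None, 3)))   # last axis has length 2
print(promote_shape((10, 10), ndim=3))
```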

class westpa.core.h5io.HDF5TrajectoryFile(filename, mode='r', force_overwrite=True, compression='zlib')

Bases: object

Interface for reading and writing to an MDTraj HDF5 molecular dynamics trajectory file, whose format is described in the MDTraj HDF5 format specification.

This is a file-like object that supports both reading and writing, depending on the mode flag. It implements the context manager protocol, so you can also use it with the Python ‘with’ statement.

The format is extremely flexible and high performance. It can hold a wide variety of information about a trajectory, including fields like the temperature and energies. Because it’s built on the fantastic HDF5 library, it’s easily extensible too.

Parameters:
  • filename (path-like) – Path to the file to open

  • mode ({'r', 'w'}) – Mode in which to open the file. ‘r’ is for reading and ‘w’ is for writing

  • force_overwrite (bool) – In mode=’w’, how do you want to behave if a file by the name of filename already exists? if force_overwrite=True, it will be overwritten.

  • compression ({'zlib', None}) – Apply compression to the file? This will save space, and does not cost too many cpu cycles, so it’s recommended.

root
title
application
topology
randomState
forcefield
reference
constraints

See also

mdtraj.load_hdf5

High-level wrapper that returns a md.Trajectory

distance_unit = 'nanometers'
property root

Direct access to the root group of the underlying Tables HDF5 file handle.

This can be used for random or specific access to the underlying arrays on disk

property title

User-defined title for the data represented in the file

property application

Suite of programs that created the file

property topology

Get the topology out from the file

Returns:

topology – A topology object

Return type:

mdtraj.Topology

property randomState

State of the creator’s internal random number generator at the start of the simulation

property forcefield

Description of the hamiltonian used. A short, human readable string, like AMBER99sbildn.

property reference

A published reference that documents the program or parameters used to generate the data

property constraints

Constraints applied to the bond lengths

Returns:

constraints – A one-dimensional array with an (int, int, float) record type giving the indices of the two atoms involved in each constraint and the distance of the constraint. If no constraint information is in the file, the return value is None.

Return type:

{None, np.array, dtype=[(‘atom1’, ‘<i4’), (‘atom2’, ‘<i4’), (‘distance’, ‘<f4’)]}

read_as_traj(n_frames=None, stride=None, atom_indices=None)

Read a trajectory from the HDF5 file

Parameters:
  • n_frames ({int, None}) – The number of frames to read. If not supplied, all of the remaining frames will be read.

  • stride ({int, None}) – By default all of the frames will be read, but you can pass this flag to read a subset of the data by grabbing only every stride-th frame from disk.

  • atom_indices ({int, None}) – By default all of the atoms will be read, but you can pass this flag to read only a subset of the atoms for the coordinates and velocities fields. Note that you will have to carefully manage the indices and the offsets, since the i-th atom in the topology will not necessarily correspond to the i-th atom in your subset.

Returns:

trajectory – A trajectory object containing the loaded portion of the file.

Return type:

Trajectory
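
The index bookkeeping that the atom_indices note warns about can be kept explicit with two small lookup tables. The selection below is hypothetical; nothing here reads a file.

```python
atom_indices = [2, 5, 7]  # hypothetical selection passed to the reader

# Position in the loaded subset -> index in the full topology.
subset_to_topology = dict(enumerate(atom_indices))

# Index in the full topology -> position in the loaded subset.
topology_to_subset = {atom: i for i, atom in enumerate(atom_indices)}

print(subset_to_topology[0])   # topology index of the first loaded atom
print(topology_to_subset[7])   # subset position of topology atom 7
```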

read(n_frames=None, stride=None, atom_indices=None)

Read one or more frames of data from the file

Parameters:
  • n_frames ({int, None}) – The number of frames to read. If not supplied, all of the remaining frames will be read.

  • stride ({int, None}) – By default all of the frames will be read, but you can pass this flag to read a subset of the data by grabbing only every stride-th frame from disk.

  • atom_indices ({int, None}) – By default all of the atoms will be read, but you can pass this flag to read only a subset of the atoms for the coordinates and velocities fields. Note that you will have to carefully manage the indices and the offsets, since the i-th atom in the topology will not necessarily correspond to the i-th atom in your subset.

Notes

If you’d like more flexible access to the data, that is available by using the pytables group directly, which is accessible via the root property on this class.

Returns:

frames – The returned namedtuple will have the fields “coordinates”, “time”, “cell_lengths”, “cell_angles”, “velocities”, “kineticEnergy”, “potentialEnergy”, “temperature” and “alchemicalLambda”. Each of the fields in the returned namedtuple will either be a numpy array or None, depending on whether that data was saved in the trajectory. All of the data shall be in units of “nanometers”, “picoseconds”, “kelvin”, “degrees” and “kilojoules_per_mole”.

Return type:

namedtuple

write(coordinates, time=None, cell_lengths=None, cell_angles=None, velocities=None, kineticEnergy=None, potentialEnergy=None, temperature=None, alchemicalLambda=None)

Write one or more frames of data to the file

This method saves data that is associated with one or more simulation frames. Note that all of the arguments can either be raw numpy arrays or unitted arrays (with openmm.unit.Quantity). If the arrays are unitted, a unit conversion will be automatically done from the supplied units into the proper units for saving on disk. You won’t have to worry about it.

Furthermore, if you wish to save a single frame of simulation data, you can do so naturally, for instance by supplying a 2d array for the coordinates and a single float for the time. This “shape deficiency” will be recognized, and handled appropriately.

Parameters:
  • coordinates (np.ndarray, shape=(n_frames, n_atoms, 3)) – The cartesian coordinates of the atoms to write. By convention, the lengths should be in units of nanometers.

  • time (np.ndarray, shape=(n_frames,), optional) – You may optionally specify the simulation time, in picoseconds corresponding to each frame.

  • cell_lengths (np.ndarray, shape=(n_frames, 3), dtype=float32, optional) – You may optionally specify the unitcell lengths. The length of the periodic box in each frame, in each direction, a, b, c. By convention the lengths should be in units of nanometers.

  • cell_angles (np.ndarray, shape=(n_frames, 3), dtype=float32, optional) – You may optionally specify the unitcell angles in each frame. Organized analogously to cell_lengths. Gives the alpha, beta and gamma angles respectively. By convention, the angles should be in units of degrees.

  • velocities (np.ndarray, shape=(n_frames, n_atoms, 3), optional) – You may optionally specify the cartesian components of the velocity for each atom in each frame. By convention, the velocities should be in units of nanometers / picosecond.

  • kineticEnergy (np.ndarray, shape=(n_frames,), optional) – You may optionally specify the kinetic energy in each frame. By convention the kinetic energies should be in units of kilojoules per mole.

  • potentialEnergy (np.ndarray, shape=(n_frames,), optional) – You may optionally specify the potential energy in each frame. By convention the potential energies should be in units of kilojoules per mole.

  • temperature (np.ndarray, shape=(n_frames,), optional) – You may optionally specify the temperature in each frame. By convention the temperatures should be in units of Kelvin.

  • alchemicalLambda (np.ndarray, shape=(n_frames,), optional) – You may optionally specify the alchemical lambda in each frame. These have no units, but are generally between zero and one.

seek(offset, whence=0)

Move to a new file position

Parameters:
  • offset (int) – A number of frames.

  • whence ({0, 1, 2}) – 0: offset from the start of the file; offset should be >= 0. 1: move relative to the current position; offset may be positive or negative. 2: move relative to the end of the file; offset should be <= 0. Seeking beyond the end of a file is not supported.
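
The three whence modes can be sketched as a pure function over frame counts. resolve_seek is a hypothetical stand-alone helper, and the exception types it raises are assumptions rather than documented behavior.

```python
def resolve_seek(position, n_frames, offset, whence=0):
    """Return the frame position that a seek would move to.

    Hypothetical helper; the exception types are assumptions.
    """
    if whence == 0:
        new = offset                # from the start of the file
    elif whence == 1:
        new = position + offset     # relative to the current position
    elif whence == 2:
        new = n_frames + offset     # relative to the end of the file
    else:
        raise ValueError("whence must be 0, 1, or 2")
    if not 0 <= new <= n_frames:
        raise IndexError("cannot seek outside the file")
    return new


print(resolve_seek(position=3, n_frames=10, offset=2, whence=1))
print(resolve_seek(position=3, n_frames=10, offset=-4, whence=2))
```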

tell()

Current file position

Returns:

offset – The current frame in the file.

Return type:

int

close()

Close the HDF5 file handle

flush()

Write all buffered data to the disk file.

class westpa.core.h5io.Frames(coordinates, time, cell_lengths, cell_angles, velocities, kineticEnergy, potentialEnergy, temperature, alchemicalLambda)

Bases: tuple

Create new instance of Frames(coordinates, time, cell_lengths, cell_angles, velocities, kineticEnergy, potentialEnergy, temperature, alchemicalLambda)

alchemicalLambda

Alias for field number 8

cell_angles

Alias for field number 3

cell_lengths

Alias for field number 2

coordinates

Alias for field number 0

kineticEnergy

Alias for field number 5

potentialEnergy

Alias for field number 6

temperature

Alias for field number 7

time

Alias for field number 1

velocities

Alias for field number 4
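
The field numbering above follows directly from the constructor's argument order. The sketch below rebuilds an equivalent namedtuple locally (it is a stand-in, not imported from westpa) to show that access by name and access by position are interchangeable.

```python
from collections import namedtuple

FIELDS = (
    "coordinates", "time", "cell_lengths", "cell_angles", "velocities",
    "kineticEnergy", "potentialEnergy", "temperature", "alchemicalLambda",
)
# Local stand-in with the same field order; not imported from westpa.
Frames = namedtuple("Frames", FIELDS)

# Fields that were not saved in a trajectory are None.
empty = Frames(*([None] * len(FIELDS)))
frames = empty._replace(coordinates=[[0.0, 0.0, 0.0]], time=[0.0])

print(frames.time is frames[1])          # "time" is field number 1
print(FIELDS.index("alchemicalLambda"))  # position of the last field
print(frames.velocities)                 # unset field
```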

class westpa.core.h5io.WESTTrajectory(coordinates, topology=None, time=None, iter_labels=None, seg_labels=None, pcoords=None, parent_ids=None, unitcell_lengths=None, unitcell_angles=None)

Bases: Trajectory

A subclass of mdtraj.Trajectory that contains the trajectory of atom coordinates with pointers denoting the iteration number and segment index of each frame.

iter_label_values()
seg_label_values(iteration=None)
property label_values
property iter_labels

Iteration index corresponding to each frame

Returns:

iter_labels – The iteration index corresponding to each frame

Return type:

np.ndarray, shape=(n_frames,)

property seg_labels

Segment index corresponding to each frame

Returns:

seg_labels – The segment index corresponding to each frame

Return type:

np.ndarray, shape=(n_frames,)

property pcoords
property parent_ids
join(other, check_topology=True, discard_overlapping_frames=False)

Join two Trajectory objects. This overrides mdtraj.Trajectory.join so that it also handles WESTPA pointers. Please see mdtraj.Trajectory.join’s documentation for more details.

slice(key, copy=True)

Slice the Trajectory. This overrides mdtraj.Trajectory.slice so that it also handles WESTPA pointers. Please see mdtraj.Trajectory.slice’s documentation for more details.

westpa.core.h5io.resolve_filepath(path, constructor=<class 'h5py._hl.files.File'>, cargs=None, ckwargs=None, **addtlkwargs)

Use a combined filesystem and HDF5 path to open an HDF5 file and return the appropriate object. Returns (h5file, h5object). The file is opened using constructor(filename, *cargs, **ckwargs).

westpa.core.h5io.calc_chunksize(shape, dtype, max_chunksize=262144)

Calculate a chunk size for HDF5 data, anticipating that access will slice along lower dimensions sooner than higher dimensions.

westpa.core.h5io.tostr(b)

Convert a nonstandard string object b to str, handling the case where b is bytes.
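
A minimal sketch of this kind of conversion is shown below. The name tostr_sketch is hypothetical and the UTF-8 encoding is an assumption; the description above does not state which encoding is used.

```python
def tostr_sketch(b):
    """Return `b` as str, decoding bytes as UTF-8 (the encoding is an assumption)."""
    if isinstance(b, bytes):
        return b.decode("utf-8")
    return str(b)


print(tostr_sketch(b"iter_00000001"))
print(tostr_sketch("already text"))
```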

westpa.core.h5io.is_within_directory(directory, target)
westpa.core.h5io.safe_extract(tar, path='.', members=None, *, numeric_owner=False)
westpa.core.h5io.create_hdf5_group(parent_group, groupname, replace=False, creating_program=None)

Create (or delete and recreate) an HDF5 group named groupname within the enclosing Group (object) parent_group. If replace is True, then the group is replaced if present; if False, then an error is raised if the group is present. After the group is created, HDF5 attributes are set using stamp_creator_data.

westpa.core.h5io.stamp_creator_data(h5group, creating_program=None)

Mark the following on the HDF5 group h5group:

creation_program:

The name of the program that created the group

creation_user:

The username of the user who created the group

creation_hostname:

The hostname of the machine on which the group was created

creation_time:

The date and time at which the group was created, in the current locale.

creation_unix_time:

The Unix time (seconds from the epoch, UTC) at which the group was created.

This is meant to facilitate tracking the flow of data, but should not be considered a secure paper trail (after all, anyone with write access to the HDF5 file can modify these attributes).
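
The five attributes listed above can be assembled with the standard library alone. In this sketch a plain dict stands in for the HDF5 group's attributes, creator_attrs is a hypothetical name, and the fallback to None when no user name can be resolved is an assumption.

```python
import getpass
import socket
import time


def creator_attrs(creating_program=None):
    """Build creator metadata as a plain dict (stand-in for HDF5 attributes)."""
    try:
        user = getpass.getuser()
    except Exception:
        # No resolvable user name, for example in some containers.
        user = None
    now = time.time()
    return {
        "creation_program": creating_program,
        "creation_user": user,
        "creation_hostname": socket.gethostname(),
        # Formatted for the current locale, as described above.
        "creation_time": time.strftime("%c", time.localtime(now)),
        "creation_unix_time": now,
    }


attrs = creator_attrs("w_example")
print(sorted(attrs))
```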

westpa.core.h5io.get_creator_data(h5group)

Read back creator data as written by stamp_creator_data, returning a dictionary with keys as described for stamp_creator_data. Missing fields are denoted with None. The creation_time field is returned as a string.

westpa.core.h5io.load_west(filename)

Load WESTPA trajectory files from disk.

Parameters:

filename (str) – String filename of HDF5 trajectory file.

westpa.core.h5io.stamp_iter_range(h5object, start_iter, stop_iter)

Mark that the HDF5 object h5object (dataset or group) contains data from iterations start_iter <= n_iter < stop_iter.

westpa.core.h5io.get_iter_range(h5object)

Read back iteration range data written by stamp_iter_range

westpa.core.h5io.stamp_iter_step(h5group, iter_step)

Mark that the HDF5 object h5group (dataset or group) contains data with an iteration step (stride) of iter_step.

westpa.core.h5io.get_iter_step(h5group)

Read back iteration step (stride) written by stamp_iter_step

westpa.core.h5io.check_iter_range_least(h5object, iter_start, iter_stop)

Return True if the iteration range [iter_start, iter_stop) is the same as or entirely contained within the iteration range stored on h5object.

westpa.core.h5io.check_iter_range_equal(h5object, iter_start, iter_stop)

Return True if the iteration range [iter_start, iter_stop) is the same as the iteration range stored on h5object.
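
Both checks compare half-open ranges. In the sketch below a (start, stop) tuple stands in for whatever stamp_iter_range stored on the object; the helper names are hypothetical.

```python
def range_at_least(stored, iter_start, iter_stop):
    """True if [iter_start, iter_stop) lies within the stored half-open range."""
    stored_start, stored_stop = stored
    return stored_start <= iter_start and iter_stop <= stored_stop


def range_equal(stored, iter_start, iter_stop):
    """True if [iter_start, iter_stop) is exactly the stored range."""
    return (iter_start, iter_stop) == tuple(stored)


stored = (1, 101)  # data for iterations 1 through 100
print(range_at_least(stored, 10, 51))
print(range_at_least(stored, 90, 111))
print(range_equal(stored, 1, 101))
```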

westpa.core.h5io.get_iteration_entry(h5object, n_iter)

Create a slice for data corresponding to iteration n_iter in h5object.

westpa.core.h5io.get_iteration_slice(h5object, iter_start, iter_stop=None, iter_stride=None)

Create a slice for data corresponding to iterations [iter_start,iter_stop), with stride iter_stride, in the given h5object.
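
Turning an iteration range into array indices amounts to subtracting the first stored iteration. The sketch below assumes the stored data has one row per iteration starting at stored_start; iteration_slice is a hypothetical helper, and raising IndexError for an out-of-range start is an assumption.

```python
def iteration_slice(stored_start, iter_start, iter_stop=None, iter_stride=None):
    """Map iterations [iter_start, iter_stop) to a slice into stored rows.

    Assumes row 0 holds iteration `stored_start` and rows are consecutive.
    """
    if iter_stop is None:
        iter_stop = iter_start + 1  # a single iteration
    if iter_start < stored_start:
        raise IndexError("requested iteration precedes the stored data")
    return slice(iter_start - stored_start, iter_stop - stored_start, iter_stride)


data = list(range(10, 20))  # row i holds iteration 10 + i
picked = data[iteration_slice(10, 12, 16, 2)]
print(picked)
```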

westpa.core.h5io.label_axes(h5object, labels, units=None)

Stamp the given HDF5 object with axis labels. This stores the axis labels in an array of strings in an attribute called axis_labels on the given object. units if provided is a corresponding list of units.

class westpa.core.h5io.WESTPAH5File(*args, **kwargs)

Bases: File

Generalized input/output for WESTPA simulation (or analysis) data.

Create a new file object.

See the h5py user guide for a detailed explanation of the options.

name

Name of the file on disk, or file-like object. Note: for files created with the ‘core’ driver, HDF5 still requires this be non-empty.

mode

  • r – Readonly, file must exist (default)

  • r+ – Read/write, file must exist

  • w – Create file, truncate if exists

  • w- or x – Create file, fail if exists

  • a – Read/write if exists, create otherwise

driver

Name of the driver to use. Legal values are None (default, recommended), ‘core’, ‘sec2’, ‘direct’, ‘stdio’, ‘mpio’, ‘ros3’.

libver

Library version bounds. Supported values: ‘earliest’, ‘v108’, ‘v110’, ‘v112’ and ‘latest’. The ‘v108’, ‘v110’ and ‘v112’ options can only be specified with the HDF5 1.10.2 library or later.

userblock_size

Desired size of user block. Only allowed when creating a new file (mode w, w- or x).

swmr

Open the file in SWMR read mode. Only used when mode = ‘r’.

rdcc_nbytes

Total size of the dataset chunk cache in bytes. The default size is 1024**2 (1 MiB) per dataset. Applies to all datasets unless individually changed.

rdcc_w0

The chunk preemption policy for all datasets. This must be between 0 and 1 inclusive and indicates the weighting according to which chunks which have been fully read or written are penalized when determining which chunks to flush from cache. A value of 0 means fully read or written chunks are treated no differently than other chunks (the preemption is strictly LRU) while a value of 1 means fully read or written chunks are always preempted before other chunks. If your application only reads or writes data once, this can be safely set to 1. Otherwise, this should be set lower depending on how often you re-read or re-write the same data. The default value is 0.75. Applies to all datasets unless individually changed.

rdcc_nslots

The number of chunk slots in the raw data chunk cache for this file. Increasing this value reduces the number of cache collisions, but slightly increases the memory used. Due to the hashing strategy, this value should ideally be a prime number. As a rule of thumb, this value should be at least 10 times the number of chunks that can fit in rdcc_nbytes bytes. For maximum performance, this value should be set approximately 100 times that number of chunks. The default value is 521. Applies to all datasets unless individually changed.

track_order

Track dataset/group/attribute creation order under root group if True. If None use global default h5.get_config().track_order.

fs_strategy

The file space handling strategy to be used. Only allowed when creating a new file (mode w, w- or x). Defined as:

  • “fsm” – FSM, Aggregators, VFD

  • “page” – Paged FSM, VFD

  • “aggregate” – Aggregators, VFD

  • “none” – VFD

If None use HDF5 defaults.

fs_page_size

File space page size in bytes. Only used when fs_strategy=”page”. If None use the HDF5 default (4096 bytes).

fs_persist

A boolean value to indicate whether free space should be persistent or not. Only allowed when creating a new file. The default value is False.

fs_threshold

The smallest free-space section size that the free space manager will track. Only allowed when creating a new file. The default value is 1.

page_buf_size

Page buffer size in bytes. Only allowed for HDF5 files created with fs_strategy=”page”. Must be a power of two and greater than or equal to the file space page size used when creating the file. It is not used by default.

min_meta_keep

Minimum percentage of metadata to keep in the page buffer before allowing pages containing metadata to be evicted. Applicable only if page_buf_size is set. Default value is zero.

min_raw_keep

Minimum percentage of raw data to keep in the page buffer before allowing pages containing raw data to be evicted. Applicable only if page_buf_size is set. Default value is zero.

locking

The file locking behavior. Defined as:

  • False (or “false”) – Disable file locking

  • True (or “true”) – Enable file locking

  • “best-effort” – Enable file locking but ignore some errors

  • None – Use HDF5 defaults

Warning

The HDF5_USE_FILE_LOCKING environment variable can override this parameter.

Only available with HDF5 >= 1.12.1 or 1.10.x >= 1.10.7.

alignment_threshold

Together with alignment_interval, this property ensures that any file object greater than or equal in size to the alignment threshold (in bytes) will be aligned on an address which is a multiple of alignment interval.

alignment_interval

This property should be used in conjunction with alignment_threshold. See the description above. For more details, see https://portal.hdfgroup.org/display/HDF5/H5P_SET_ALIGNMENT

meta_block_size

Set the current minimum size, in bytes, of new metadata block allocations. See https://portal.hdfgroup.org/display/HDF5/H5P_SET_META_BLOCK_SIZE

Additional keywords

Passed on to the selected file driver.

default_iter_prec = 8
replace_dataset(*args, **kwargs)
iter_object_name(n_iter, prefix='', suffix='')

Return a properly-formatted per-iteration name for iteration n_iter. (This is used in create/require/get_iter_group, but may also be useful for naming datasets on a per-iteration basis.)
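A minimal sketch of the naming scheme, assuming the name is "iter_" followed by the iteration number zero-padded to default_iter_prec digits (consistent with the iter_00000001 group name in the data layout). The standalone function iter_name is hypothetical and is not the WESTPA method itself.

```python
DEFAULT_ITER_PREC = 8  # mirrors default_iter_prec above


def iter_name(n_iter, prefix='', suffix='', prec=DEFAULT_ITER_PREC):
    """Hypothetical stand-in: build a zero-padded per-iteration name."""
    return '{prefix}iter_{n_iter:0{prec}d}{suffix}'.format(
        prefix=prefix, n_iter=n_iter, prec=prec, suffix=suffix
    )


group_name = iter_name(1)
dataset_name = iter_name(42, prefix='avg_', suffix='_pcoord')
```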

create_iter_group(n_iter, group=None)

Create a per-iteration data storage group for iteration number n_iter in the group group (which is ‘/iterations’ by default).

require_iter_group(n_iter, group=None)

Ensure that a per-iteration data storage group for iteration number n_iter is available in the group group (which is ‘/iterations’ by default).

get_iter_group(n_iter, group=None)

Get the per-iteration data group for iteration number n_iter from within the group group (‘/iterations’ by default).

class westpa.core.h5io.WESTIterationFile(file, mode='r', force_overwrite=True, compression='zlib', link=None)

Bases: HDF5TrajectoryFile

read(frame_indices=None, atom_indices=None)

Read one or more frames of data from the file

Parameters:
  • n_frames ({int, None}) – The number of frames to read. If not supplied, all of the remaining frames will be read.

  • stride ({int, None}) – By default all of the frames will be read, but you can pass this flag to read a subset of the data by grabbing only every stride-th frame from disk.

  • atom_indices ({int, None}) – By default all of the atoms will be read, but you can pass this flag to read only a subset of the atoms for the coordinates and velocities fields. Note that you will have to carefully manage the indices and the offsets, since the i-th atom in the topology will not necessarily correspond to the i-th atom in your subset.

Notes

If you’d like more flexible access to the data, that is available by using the pytables group directly, which is accessible via the root property on this class.

Returns:

frames – The returned namedtuple will have the fields “coordinates”, “time”, “cell_lengths”, “cell_angles”, “velocities”, “kineticEnergy”, “potentialEnergy”, “temperature” and “alchemicalLambda”. Each of the fields in the returned namedtuple will either be a numpy array or None, depending on whether that data was saved in the trajectory. All of the data is in units of “nanometers”, “picoseconds”, “kelvin”, “degrees” and “kilojoules_per_mole”.

Return type:

namedtuple

has_topology()
has_pointer()
has_restart(segment)
write_data(where, name, data)
read_data(where, name)
read_as_traj(iteration=None, segment=None, atom_indices=None)

Read a trajectory from the HDF5 file

Parameters:
  • n_frames ({int, None}) – The number of frames to read. If not supplied, all of the remaining frames will be read.

  • stride ({int, None}) – By default all of the frames will be read, but you can pass this flag to read a subset of the data by grabbing only every stride-th frame from disk.

  • atom_indices ({int, None}) – By default all of the atoms will be read, but you can pass this flag to read only a subset of the atoms for the coordinates and velocities fields. Note that you will have to carefully manage the indices and the offsets, since the i-th atom in the topology will not necessarily correspond to the i-th atom in your subset.

Returns:

trajectory – A trajectory object containing the loaded portion of the file.

Return type:

Trajectory

read_restart(segment)
write_segment(segment, pop=False)
class westpa.core.h5io.DSSpec

Bases: object

Generalized WE dataset access

get_iter_data(n_iter, seg_slice=(slice(None, None, None),))
get_segment_data(n_iter, seg_id)
class westpa.core.h5io.FileLinkedDSSpec(h5file_or_name)

Bases: DSSpec

Provide facilities for accessing WESTPA HDF5 files, including auto-opening and the ability to pickle references to such files for transmission (through, e.g., the work manager), provided that the HDF5 file can be accessed by the same path on both the sender and receiver.

property h5file

Lazily open HDF5 file. This is required because allowing an open HDF5 file to cross a fork() boundary generally corrupts the internal state of the HDF5 library.
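The lazy-open pattern can be sketched without HDF5. The class below is a hypothetical stand-in that uses an ordinary binary file instead of an HDF5 file; the names LazyFileRef and handle are assumptions made for this example. It opens the file only on first access and keeps only the path in its pickled state, so an open handle is never carried across a process boundary.

```python
import os
import tempfile


class LazyFileRef:
    """Hypothetical stand-in for a file-linked object (plain file, not HDF5)."""

    def __init__(self, path):
        self.path = path
        self._handle = None  # nothing is opened at construction time

    @property
    def handle(self):
        # Open on first access, so each process opens its own handle.
        if self._handle is None:
            self._handle = open(self.path, 'rb')
        return self._handle

    def close(self):
        if self._handle is not None:
            self._handle.close()
            self._handle = None

    def __getstate__(self):
        # Only the path is serialized; the open handle is deliberately dropped.
        return {'path': self.path}

    def __setstate__(self, state):
        self.path = state['path']
        self._handle = None


# Demonstration with a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'example')

ref = LazyFileRef(tmp.name)
opened_before_access = ref._handle is not None
data = ref.handle.read()
state = ref.__getstate__()
ref.close()
os.remove(tmp.name)
```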

class westpa.core.h5io.SingleDSSpec(h5file_or_name, dsname, alias=None, slice=None)

Bases: FileLinkedDSSpec

classmethod from_string(dsspec_string, default_h5file)
class westpa.core.h5io.SingleIterDSSpec(h5file_or_name, dsname, alias=None, slice=None)

Bases: SingleDSSpec

get_iter_data(n_iter, seg_slice=(slice(None, None, None),))
class westpa.core.h5io.SingleSegmentDSSpec(h5file_or_name, dsname, alias=None, slice=None)

Bases: SingleDSSpec

get_iter_data(n_iter, seg_slice=(slice(None, None, None),))
get_segment_data(n_iter, seg_id)
class westpa.core.h5io.FnDSSpec(h5file_or_name, fn)

Bases: FileLinkedDSSpec

get_iter_data(n_iter, seg_slice=(slice(None, None, None),))
class westpa.core.h5io.MultiDSSpec(dsspecs)

Bases: DSSpec

get_iter_data(n_iter, seg_slice=(slice(None, None, None),))
class westpa.core.h5io.IterBlockedDataset(dataset_or_array, attrs=None)

Bases: object

classmethod empty_like(blocked_dataset)
cache_data(max_size=None)

Cache this dataset in RAM. If max_size is given, then only cache if the entire dataset fits in max_size bytes. If max_size is the string ‘available’, then only cache if the entire dataset fits in available RAM, as defined by the psutil module.
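The caching decision described above can be summarized as follows. This is a sketch under stated assumptions: should_cache is a hypothetical function, a max_size of None is taken to mean "always cache", and the amount of available RAM is passed in as available_bytes rather than queried from psutil.

```python
def should_cache(dataset_nbytes, max_size=None, available_bytes=None):
    """Hypothetical sketch of the max_size rule for cache_data()."""
    if max_size is None:
        # Assumption: without a limit the dataset is always cached.
        return True
    if max_size == 'available':
        if available_bytes is None:
            raise ValueError("available_bytes is required when max_size='available'")
        return dataset_nbytes <= available_bytes
    return dataset_nbytes <= max_size


fits_limit = should_cache(10_000, max_size=65_536)
exceeds_limit = should_cache(100_000, max_size=65_536)
fits_ram = should_cache(100_000, max_size='available', available_bytes=1_000_000)
```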

drop_cache()
iter_entry(n_iter)
iter_slice(start=None, stop=None)

westpa.core.progress module

westpa.core.progress.linregress(x, y=None, alternative='two-sided')

Calculate a linear least-squares regression for two sets of measurements.

Parameters:
  • x (array_like) – Two sets of measurements. Both arrays should have the same length. If only x is given (and y=None), then it must be a two-dimensional array where one dimension has length 2. The two sets of measurements are then found by splitting the array along the length-2 dimension. In the case where y=None and x is a 2x2 array, linregress(x) is equivalent to linregress(x[0], x[1]).

  • y (array_like) – Two sets of measurements. Both arrays should have the same length. If only x is given (and y=None), then it must be a two-dimensional array where one dimension has length 2. The two sets of measurements are then found by splitting the array along the length-2 dimension. In the case where y=None and x is a 2x2 array, linregress(x) is equivalent to linregress(x[0], x[1]).

  • alternative ({'two-sided', 'less', 'greater'}, optional) –

    Defines the alternative hypothesis. Default is ‘two-sided’. The following options are available:

    • 'two-sided': the slope of the regression line is nonzero

    • 'less': the slope of the regression line is less than zero

    • 'greater': the slope of the regression line is greater than zero

    Added in version 1.7.0.

Returns:

result – The return value is an object with the following attributes:

  • slope (float) – Slope of the regression line.

  • intercept (float) – Intercept of the regression line.

  • rvalue (float) – The Pearson correlation coefficient. The square of rvalue is equal to the coefficient of determination.

  • pvalue (float) – The p-value for a hypothesis test whose null hypothesis is that the slope is zero, using Wald Test with t-distribution of the test statistic. See alternative above for alternative hypotheses.

  • stderr (float) – Standard error of the estimated slope (gradient), under the assumption of residual normality.

  • intercept_stderr (float) – Standard error of the estimated intercept, under the assumption of residual normality.

Return type:

LinregressResult instance

See also

scipy.optimize.curve_fit

Use non-linear least squares to fit a function to data.

scipy.optimize.leastsq

Minimize the sum of squares of a set of equations.

Notes

Missing values are considered pair-wise: if a value is missing in x, the corresponding value in y is masked.

For compatibility with older versions of SciPy, the return value acts like a namedtuple of length 5, with fields slope, intercept, rvalue, pvalue and stderr, so one can continue to write:

slope, intercept, r, p, se = linregress(x, y)

With that style, however, the standard error of the intercept is not available. To have access to all the computed values, including the standard error of the intercept, use the return value as an object with attributes, e.g.:

result = linregress(x, y)
print(result.intercept, result.intercept_stderr)

Examples

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from scipy import stats
>>> rng = np.random.default_rng()

Generate some data:

>>> x = rng.random(10)
>>> y = 1.6*x + rng.random(10)

Perform the linear regression:

>>> res = stats.linregress(x, y)

Coefficient of determination (R-squared):

>>> print(f"R-squared: {res.rvalue**2:.6f}")
R-squared: 0.717533

Plot the data along with the fitted line:

>>> plt.plot(x, y, 'o', label='original data')
>>> plt.plot(x, res.intercept + res.slope*x, 'r', label='fitted line')
>>> plt.legend()
>>> plt.show()

Calculate 95% confidence interval on slope and intercept:

>>> # Two-sided inverse Students t-distribution
>>> # p - probability, df - degrees of freedom
>>> from scipy.stats import t
>>> tinv = lambda p, df: abs(t.ppf(p/2, df))
>>> ts = tinv(0.05, len(x)-2)
>>> print(f"slope (95%): {res.slope:.6f} +/- {ts*res.stderr:.6f}")
slope (95%): 1.453392 +/- 0.743465
>>> print(f"intercept (95%): {res.intercept:.6f}"
...       f" +/- {ts*res.intercept_stderr:.6f}")
intercept (95%): 0.616950 +/- 0.544475
westpa.core.progress.nop()
class westpa.core.progress.ProgressIndicator(stream=None, interval=1)

Bases: object

draw_fancy()
draw_simple()
draw()
clear()
property operation
property extent
property progress
new_operation(operation, extent=None, progress=0)
start()
stop()

westpa.core.segment module

class westpa.core.segment.Segment(n_iter=None, seg_id=None, weight=None, endpoint_type=None, parent_id=None, wtg_parent_ids=None, pcoord=None, status=None, walltime=None, cputime=None, data=None)

Bases: object

A class wrapping segment data that must be passed through the work manager or data manager. Most fields are self-explanatory. One item worth noting is that a negative parent ID means that the segment starts from the initial state with ID -(segment.parent_id+1)
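The negative parent ID convention can be written out as a pair of conversions. The two function names below are hypothetical and are not part of the Segment class; they only restate the formula -(parent_id+1) given above.

```python
def initial_state_id_from_parent(parent_id):
    """Hypothetical helper: decode an initial state ID from a negative parent ID."""
    if parent_id >= 0:
        raise ValueError('non-negative parent IDs refer to parent segments')
    return -(parent_id + 1)


def parent_id_for_initial_state(state_id):
    """Hypothetical helper: encode an initial state ID as a negative parent ID."""
    return -(state_id + 1)


first_state = initial_state_id_from_parent(-1)   # initial state 0
fifth_state = initial_state_id_from_parent(-5)   # initial state 4
```

Because the mapping is offset by one, initial state 0 is stored as parent ID -1 and does not collide with segment 0.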

SEG_STATUS_UNSET = 0
SEG_STATUS_PREPARED = 1
SEG_STATUS_COMPLETE = 2
SEG_STATUS_FAILED = 3
SEG_INITPOINT_UNSET = 0
SEG_INITPOINT_CONTINUES = 1
SEG_INITPOINT_NEWTRAJ = 2
SEG_ENDPOINT_UNSET = 0
SEG_ENDPOINT_CONTINUES = 1
SEG_ENDPOINT_MERGED = 2
SEG_ENDPOINT_RECYCLED = 3
statuses = {'SEG_STATUS_COMPLETE': 2, 'SEG_STATUS_FAILED': 3, 'SEG_STATUS_PREPARED': 1, 'SEG_STATUS_UNSET': 0}
initpoint_types = {'SEG_INITPOINT_CONTINUES': 1, 'SEG_INITPOINT_NEWTRAJ': 2, 'SEG_INITPOINT_UNSET': 0}
endpoint_types = {'SEG_ENDPOINT_CONTINUES': 1, 'SEG_ENDPOINT_MERGED': 2, 'SEG_ENDPOINT_RECYCLED': 3, 'SEG_ENDPOINT_UNSET': 0}
status_names = {0: 'SEG_STATUS_UNSET', 1: 'SEG_STATUS_PREPARED', 2: 'SEG_STATUS_COMPLETE', 3: 'SEG_STATUS_FAILED'}
initpoint_type_names = {0: 'SEG_INITPOINT_UNSET', 1: 'SEG_INITPOINT_CONTINUES', 2: 'SEG_INITPOINT_NEWTRAJ'}
endpoint_type_names = {0: 'SEG_ENDPOINT_UNSET', 1: 'SEG_ENDPOINT_CONTINUES', 2: 'SEG_ENDPOINT_MERGED', 3: 'SEG_ENDPOINT_RECYCLED'}
static initial_pcoord(segment)

Return the initial progress coordinate point of this segment.

static final_pcoord(segment)

Return the final progress coordinate point of this segment.

property initpoint_type
property initial_state_id
property status_text
property endpoint_type_text

westpa.core.sim_manager module

class westpa.core.sim_manager.timedelta

Bases: object

Difference between two datetime values.

timedelta(days=0, seconds=0, microseconds=0, milliseconds=0, minutes=0, hours=0, weeks=0)

All arguments are optional and default to 0. Arguments may be integers or floats, and may be positive or negative.

days

Number of days.

max = datetime.timedelta(days=999999999, seconds=86399, microseconds=999999)
microseconds

Number of microseconds (>= 0 and less than 1 second).

min = datetime.timedelta(days=-999999999)
resolution = datetime.timedelta(microseconds=1)
seconds

Number of seconds (>= 0 and less than 1 day).

total_seconds()

Total seconds in the duration.
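A short example using the standard library datetime.timedelta, which this module re-exports. The variable names are illustrative only.

```python
from datetime import timedelta

# Keyword arguments are normalized into days, seconds and microseconds.
elapsed = timedelta(hours=1, minutes=30, seconds=15.5)
parts = (elapsed.days, elapsed.seconds, elapsed.microseconds)

# total_seconds() collapses the three fields into a single float.
total = elapsed.total_seconds()

# Negative durations keep seconds and microseconds non-negative
# by borrowing from the days field.
negative = timedelta(seconds=-1)
negative_parts = (negative.days, negative.seconds, negative.microseconds)
```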

class westpa.core.sim_manager.zip_longest

Bases: object

zip_longest(iter1 [,iter2 [...]], [fillvalue=None]) --> zip_longest object

Return a zip_longest object whose .__next__() method returns a tuple where the i-th element comes from the i-th iterable argument. The .__next__() method continues until the longest iterable in the argument sequence is exhausted and then it raises StopIteration. When the shorter iterables are exhausted, the fillvalue is substituted in their place. The fillvalue defaults to None or can be specified by a keyword argument.
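A brief example of the fill behavior, using itertools.zip_longest from the standard library. The seg_ids and weights lists are made-up sample data.

```python
from itertools import zip_longest

seg_ids = [0, 1, 2]
weights = [0.5, 0.5]  # deliberately one element shorter

# Without fillvalue, the missing entry is padded with None.
padded = list(zip_longest(seg_ids, weights))

# With fillvalue, the missing entry is padded with the given value.
filled = list(zip_longest(seg_ids, weights, fillvalue=0.0))
```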

exception westpa.core.sim_manager.PickleError

Bases: Exception

westpa.core.sim_manager.weight_dtype

alias of float64

class westpa.core.sim_manager.Segment(n_iter=None, seg_id=None, weight=None, endpoint_type=None, parent_id=None, wtg_parent_ids=None, pcoord=None, status=None, walltime=None, cputime=None, data=None)

Bases: object

A class wrapping segment data that must be passed through the work manager or data manager. Most fields are self-explanatory. One item worth noting is that a negative parent ID means that the segment starts from the initial state with ID -(segment.parent_id+1)

SEG_STATUS_UNSET = 0
SEG_STATUS_PREPARED = 1
SEG_STATUS_COMPLETE = 2
SEG_STATUS_FAILED = 3
SEG_INITPOINT_UNSET = 0
SEG_INITPOINT_CONTINUES = 1
SEG_INITPOINT_NEWTRAJ = 2
SEG_ENDPOINT_UNSET = 0
SEG_ENDPOINT_CONTINUES = 1
SEG_ENDPOINT_MERGED = 2
SEG_ENDPOINT_RECYCLED = 3
statuses = {'SEG_STATUS_COMPLETE': 2, 'SEG_STATUS_FAILED': 3, 'SEG_STATUS_PREPARED': 1, 'SEG_STATUS_UNSET': 0}
initpoint_types = {'SEG_INITPOINT_CONTINUES': 1, 'SEG_INITPOINT_NEWTRAJ': 2, 'SEG_INITPOINT_UNSET': 0}
endpoint_types = {'SEG_ENDPOINT_CONTINUES': 1, 'SEG_ENDPOINT_MERGED': 2, 'SEG_ENDPOINT_RECYCLED': 3, 'SEG_ENDPOINT_UNSET': 0}
status_names = {0: 'SEG_STATUS_UNSET', 1: 'SEG_STATUS_PREPARED', 2: 'SEG_STATUS_COMPLETE', 3: 'SEG_STATUS_FAILED'}
initpoint_type_names = {0: 'SEG_INITPOINT_UNSET', 1: 'SEG_INITPOINT_CONTINUES', 2: 'SEG_INITPOINT_NEWTRAJ'}
endpoint_type_names = {0: 'SEG_ENDPOINT_UNSET', 1: 'SEG_ENDPOINT_CONTINUES', 2: 'SEG_ENDPOINT_MERGED', 3: 'SEG_ENDPOINT_RECYCLED'}
static initial_pcoord(segment)

Return the initial progress coordinate point of this segment.

static final_pcoord(segment)

Return the final progress coordinate point of this segment.

property initpoint_type
property initial_state_id
property status_text
property endpoint_type_text
class westpa.core.sim_manager.InitialState(state_id, basis_state_id, iter_created, iter_used=None, istate_type=None, istate_status=None, pcoord=None, basis_state=None, basis_auxref=None)

Bases: object

Describes an initial state for a new trajectory. These are generally constructed by appropriate modification of a basis state.

Variables:
  • state_id – Integer identifier of this state, usually set by the data manager.

  • basis_state_id – Identifier of the basis state from which this state was generated, or None.

  • basis_state – The BasisState from which this state was generated, or None.

  • iter_created – Iteration in which this state was generated (0 for simulation initialization).

  • iter_used – Iteration in which this state was used to initiate a trajectory (None for unused).

  • istate_type – Integer describing the type of this initial state (ISTATE_TYPE_BASIS for direct use of a basis state, ISTATE_TYPE_GENERATED for a state generated from a basis state, ISTATE_TYPE_RESTART for a state corresponding to the endpoint of a segment in another simulation, or ISTATE_TYPE_START for a state generated from a start state).

  • istate_status – Integer describing whether this initial state has been properly prepared.

  • pcoord – The representative progress coordinate of this state.

ISTATE_TYPE_UNSET = 0
ISTATE_TYPE_BASIS = 1
ISTATE_TYPE_GENERATED = 2
ISTATE_TYPE_RESTART = 3
ISTATE_TYPE_START = 4
ISTATE_UNUSED = 0
ISTATE_STATUS_PENDING = 0
ISTATE_STATUS_PREPARED = 1
ISTATE_STATUS_FAILED = 2
istate_types = {'ISTATE_TYPE_BASIS': 1, 'ISTATE_TYPE_GENERATED': 2, 'ISTATE_TYPE_RESTART': 3, 'ISTATE_TYPE_START': 4, 'ISTATE_TYPE_UNSET': 0}
istate_type_names = {0: 'ISTATE_TYPE_UNSET', 1: 'ISTATE_TYPE_BASIS', 2: 'ISTATE_TYPE_GENERATED', 3: 'ISTATE_TYPE_RESTART', 4: 'ISTATE_TYPE_START'}
istate_statuses = {'ISTATE_STATUS_FAILED': 2, 'ISTATE_STATUS_PENDING': 0, 'ISTATE_STATUS_PREPARED': 1}
istate_status_names = {0: 'ISTATE_STATUS_PENDING', 1: 'ISTATE_STATUS_PREPARED', 2: 'ISTATE_STATUS_FAILED'}
as_numpy_record()
westpa.core.sim_manager.grouper(n, iterable, fillvalue=None)

Collect data into fixed-length chunks or blocks
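A sketch of how such chunking can be done, assuming the function follows the well-known itertools recipe of the same name. The name grouper_sketch is used here to make clear that this is an illustration rather than the WESTPA source.

```python
from itertools import zip_longest


def grouper_sketch(n, iterable, fillvalue=None):
    """Illustrative chunker based on the itertools 'grouper' recipe."""
    # n references to the SAME iterator: each output tuple pulls n
    # consecutive items from it.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


chunks = list(grouper_sketch(3, 'ABCDEFG', 'x'))
```

The last chunk is padded with the fill value when the input length is not a multiple of n.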

exception westpa.core.sim_manager.PropagationError

Bases: RuntimeError

class westpa.core.sim_manager.WESimManager(rc=None)

Bases: object

process_config()
register_callback(hook, function, priority=0)

Registers a callback to execute during the given hook into the simulation loop. The optional priority is used to order when the function is called relative to other registered callbacks.
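The general idea of priority-ordered hooks can be sketched as below. This is not the WESTPA implementation: CallbackRegistry is a hypothetical class, hooks are identified by plain strings for simplicity, and the rule that lower priority values run first is an assumption made for this example.

```python
from collections import defaultdict


class CallbackRegistry:
    """Hypothetical, simplified registry of prioritized callbacks."""

    def __init__(self):
        self._callbacks = defaultdict(list)

    def register_callback(self, hook, function, priority=0):
        self._callbacks[hook].append((priority, function))
        # Assumption: lower priority values run first. The sort is stable,
        # so equal priorities keep their registration order.
        self._callbacks[hook].sort(key=lambda entry: entry[0])

    def invoke_callbacks(self, hook, *args, **kwargs):
        for _priority, function in self._callbacks[hook]:
            function(*args, **kwargs)


calls = []
registry = CallbackRegistry()
registry.register_callback('pre_we', lambda: calls.append('late'), priority=10)
registry.register_callback('pre_we', lambda: calls.append('early'), priority=-5)
registry.invoke_callbacks('pre_we')
```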

invoke_callbacks(hook, *args, **kwargs)
load_plugins(plugins=None)
report_bin_statistics(bins, target_states, save_summary=False)
get_bstate_pcoords(basis_states, label='basis')

For each of the given basis_states, calculate progress coordinate values as necessary. The HDF5 file is not updated.

report_basis_states(basis_states, label='basis')
report_target_states(target_states)
initialize_simulation(basis_states, target_states, start_states, segs_per_state=1, suppress_we=False)

Initialize a new weighted ensemble simulation, taking segs_per_state initial states from each of the given basis_states.

w_init is the forward-facing version of this function

prepare_iteration()
finalize_iteration()

Clean up after an iteration and prepare for the next.

get_istate_futures()

Add n_states initial states to the internal list of initial states assigned to recycled particles. Spare states are used if available; otherwise new states are created. If creating new initial states requires generation, then a set of futures is returned representing the work manager tasks that perform the necessary generation work.

propagate()
save_bin_data()

Calculate and write flux and transition count matrices to HDF5. Population and rate matrices are likely useless at the single-tau level and are no longer written.

check_propagation()

Check for failures in propagation or initial state generation, and raise an exception if any are found.

run_we()

Run the weighted ensemble algorithm based on the binning in self.final_bins and the recycled particles in self.to_recycle, creating and committing the next iteration’s segments to storage as well.

prepare_new_iteration()

Commit data for the coming iteration to the HDF5 file.

run()
prepare_run()

Prepare a new run.

finalize_run()

Perform cleanup at the normal end of a run

pre_propagation()
post_propagation()
pre_we()
post_we()

westpa.core.states module

class westpa.core.states.Segment(n_iter=None, seg_id=None, weight=None, endpoint_type=None, parent_id=None, wtg_parent_ids=None, pcoord=None, status=None, walltime=None, cputime=None, data=None)

Bases: object

A class wrapping segment data that must be passed through the work manager or data manager. Most fields are self-explanatory. One item worth noting is that a negative parent ID means that the segment starts from the initial state with ID -(segment.parent_id+1)

SEG_STATUS_UNSET = 0
SEG_STATUS_PREPARED = 1
SEG_STATUS_COMPLETE = 2
SEG_STATUS_FAILED = 3
SEG_INITPOINT_UNSET = 0
SEG_INITPOINT_CONTINUES = 1
SEG_INITPOINT_NEWTRAJ = 2
SEG_ENDPOINT_UNSET = 0
SEG_ENDPOINT_CONTINUES = 1
SEG_ENDPOINT_MERGED = 2
SEG_ENDPOINT_RECYCLED = 3
statuses = {'SEG_STATUS_COMPLETE': 2, 'SEG_STATUS_FAILED': 3, 'SEG_STATUS_PREPARED': 1, 'SEG_STATUS_UNSET': 0}
initpoint_types = {'SEG_INITPOINT_CONTINUES': 1, 'SEG_INITPOINT_NEWTRAJ': 2, 'SEG_INITPOINT_UNSET': 0}
endpoint_types = {'SEG_ENDPOINT_CONTINUES': 1, 'SEG_ENDPOINT_MERGED': 2, 'SEG_ENDPOINT_RECYCLED': 3, 'SEG_ENDPOINT_UNSET': 0}
status_names = {0: 'SEG_STATUS_UNSET', 1: 'SEG_STATUS_PREPARED', 2: 'SEG_STATUS_COMPLETE', 3: 'SEG_STATUS_FAILED'}
initpoint_type_names = {0: 'SEG_INITPOINT_UNSET', 1: 'SEG_INITPOINT_CONTINUES', 2: 'SEG_INITPOINT_NEWTRAJ'}
endpoint_type_names = {0: 'SEG_ENDPOINT_UNSET', 1: 'SEG_ENDPOINT_CONTINUES', 2: 'SEG_ENDPOINT_MERGED', 3: 'SEG_ENDPOINT_RECYCLED'}
static initial_pcoord(segment)

Return the initial progress coordinate point of this segment.

static final_pcoord(segment)

Return the final progress coordinate point of this segment.

property initpoint_type
property initial_state_id
property status_text
property endpoint_type_text
class westpa.core.states.BasisState(label, probability, pcoord=None, auxref=None, state_id=None)

Bases: object

Describes a basis (micro)state. These basis states are used to generate initial states for new trajectories, either at the beginning of the simulation (i.e. at w_init) or due to recycling.

Variables:
  • state_id – Integer identifier of this state, usually set by the data manager.

  • label – A descriptive label for this microstate (may be empty)

  • probability – Probability of this state to be selected when creating a new trajectory.

  • pcoord – The representative progress coordinate of this state.

  • auxref – A user-provided (string) reference for locating data associated with this state (usually a filesystem path).

classmethod states_to_file(states, fileobj)

Write a file defining basis states, which may then be read by states_from_file().

classmethod states_from_file(statefile)

Read a file defining basis states. Each line defines a state, and contains a label, the probability, and optionally a data reference, separated by whitespace, as in:

unbound    1.0

or:

unbound_0    0.6        state0.pdb
unbound_1    0.4        state1.pdb
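The file format above can be read with a few lines of Python. This is a minimal sketch of the parsing only: parse_basis_state_lines is a hypothetical function that returns plain tuples rather than BasisState objects, and it handles nothing beyond whitespace-separated fields and blank lines.

```python
def parse_basis_state_lines(lines):
    """Hypothetical parser: return (label, probability, auxref) tuples."""
    states = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        label = fields[0]
        probability = float(fields[1])
        # The data reference is optional.
        auxref = fields[2] if len(fields) > 2 else None
        states.append((label, probability, auxref))
    return states


single = parse_basis_state_lines(['unbound    1.0'])
pair = parse_basis_state_lines([
    'unbound_0    0.6        state0.pdb',
    'unbound_1    0.4        state1.pdb',
])
```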
as_numpy_record()

Return the data for this state as a numpy record array.

class westpa.core.states.InitialState(state_id, basis_state_id, iter_created, iter_used=None, istate_type=None, istate_status=None, pcoord=None, basis_state=None, basis_auxref=None)

Bases: object

Describes an initial state for a new trajectory. These are generally constructed by appropriate modification of a basis state.

Variables:
  • state_id – Integer identifier of this state, usually set by the data manager.

  • basis_state_id – Identifier of the basis state from which this state was generated, or None.

  • basis_state – The BasisState from which this state was generated, or None.

  • iter_created – Iteration in which this state was generated (0 for simulation initialization).

  • iter_used – Iteration in which this state was used to initiate a trajectory (None for unused).

  • istate_type – Integer describing the type of this initial state (ISTATE_TYPE_BASIS for direct use of a basis state, ISTATE_TYPE_GENERATED for a state generated from a basis state, ISTATE_TYPE_RESTART for a state corresponding to the endpoint of a segment in another simulation, or ISTATE_TYPE_START for a state generated from a start state).

  • istate_status – Integer describing whether this initial state has been properly prepared.

  • pcoord – The representative progress coordinate of this state.

ISTATE_TYPE_UNSET = 0
ISTATE_TYPE_BASIS = 1
ISTATE_TYPE_GENERATED = 2
ISTATE_TYPE_RESTART = 3
ISTATE_TYPE_START = 4
ISTATE_UNUSED = 0
ISTATE_STATUS_PENDING = 0
ISTATE_STATUS_PREPARED = 1
ISTATE_STATUS_FAILED = 2
istate_types = {'ISTATE_TYPE_BASIS': 1, 'ISTATE_TYPE_GENERATED': 2, 'ISTATE_TYPE_RESTART': 3, 'ISTATE_TYPE_START': 4, 'ISTATE_TYPE_UNSET': 0}
istate_type_names = {0: 'ISTATE_TYPE_UNSET', 1: 'ISTATE_TYPE_BASIS', 2: 'ISTATE_TYPE_GENERATED', 3: 'ISTATE_TYPE_RESTART', 4: 'ISTATE_TYPE_START'}
istate_statuses = {'ISTATE_STATUS_FAILED': 2, 'ISTATE_STATUS_PENDING': 0, 'ISTATE_STATUS_PREPARED': 1}
istate_status_names = {0: 'ISTATE_STATUS_PENDING', 1: 'ISTATE_STATUS_PREPARED', 2: 'ISTATE_STATUS_FAILED'}
as_numpy_record()
class westpa.core.states.TargetState(label, pcoord, state_id=None)

Bases: object

Describes a target state.

Variables:
  • state_id – Integer identifier of this state, usually set by the data manager.

  • label – A descriptive label for this microstate (may be empty)

  • pcoord – The representative progress coordinate of this state.

classmethod states_to_file(states, fileobj)

Write a file defining target states, which may then be read by states_from_file().

classmethod states_from_file(statefile, dtype)

Read a file defining target states. Each line defines a state, and contains a label followed by a representative progress coordinate value, separated by whitespace, as in:

bound     0.02

for a single target and one-dimensional progress coordinates or:

bound    2.7    0.0
drift    100    50.0

for two targets and a two-dimensional progress coordinate.
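A minimal sketch of reading this format, assuming each remaining field on a line is one progress coordinate dimension. The function parse_target_state_lines is hypothetical and returns plain tuples rather than TargetState objects; its dtype argument is any callable that converts a string, standing in for the dtype parameter above.

```python
def parse_target_state_lines(lines, dtype=float):
    """Hypothetical parser: return (label, pcoord) tuples."""
    targets = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        label = fields[0]
        # Every field after the label is one pcoord dimension.
        pcoord = [dtype(value) for value in fields[1:]]
        targets.append((label, pcoord))
    return targets


one_dim = parse_target_state_lines(['bound     0.02'])
two_dim = parse_target_state_lines([
    'bound    2.7    0.0',
    'drift    100    50.0',
])
```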

westpa.core.states.pare_basis_initial_states(basis_states, initial_states, segments=None)

Given iterables of basis and initial states (and optionally segments that use them), return minimal sets (as in __builtins__.set) of states needed to describe the history of the given segments and initial states.

westpa.core.states.return_state_type(state_obj)

Convenience function for returning the state ID and type of the state_obj pointer

westpa.core.systems module

class westpa.core.systems.NopMapper

Bases: BinMapper

Put everything into one bin.

assign(coords, mask=None, output=None)
class westpa.core.systems.WESTSystem(rc=None)

Bases: object

A description of the system being simulated, including the dimensionality and data type of the progress coordinate, the number of progress coordinate entries expected from each segment, and binning. To construct a simulation, the user must subclass WESTSystem and set several instance variables.

At a minimum, the user must subclass WESTSystem and override initialize() to set the data type and dimensionality of progress coordinate data and define a bin mapper.

Variables:
  • pcoord_ndim – The number of dimensions in the progress coordinate. Defaults to 1 (i.e. a one-dimensional progress coordinate).

  • pcoord_dtype – The data type of the progress coordinate, which must be callable (e.g. np.float32 and long will work, but '<f4' and '<i8' will not). Defaults to np.float64.

  • pcoord_len – The length of the progress coordinate time series generated by each segment, including both the initial and final values. Defaults to 2 (i.e. only the initial and final progress coordinate values for a segment are returned from propagation).

  • bin_mapper – A bin mapper describing the progress coordinate space.

  • bin_target_counts – A vector of target counts, one per bin.
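The shape of a typical subclass can be sketched without WESTPA installed. Everything below is a stand-in: StandInSystem is NOT the real WESTSystem, it only carries the documented attributes and defaults, float stands in for np.float64, and a plain list of bin boundaries stands in for a real bin mapper object.

```python
class StandInSystem:
    """Stand-in carrying the documented defaults; not the real WESTSystem."""

    def __init__(self):
        self.pcoord_ndim = 1         # one-dimensional progress coordinate
        self.pcoord_dtype = float    # stand-in for np.float64
        self.pcoord_len = 2          # initial and final values only
        self.bin_mapper = None
        self.bin_target_counts = None

    def initialize(self):
        pass


class MySystem(StandInSystem):
    def initialize(self):
        self.pcoord_ndim = 2
        self.pcoord_len = 11
        # Hypothetical placeholder: bin boundaries instead of a mapper object.
        self.bin_mapper = [0.0, 1.0, 2.0, 5.0]
        n_bins = len(self.bin_mapper) - 1
        # One target count per bin.
        self.bin_target_counts = [4] * n_bins


system = MySystem()
system.initialize()
```

In a real simulation the subclass would derive from WESTSystem and assign an actual bin mapper; only the pattern of overriding initialize() carries over from this sketch.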

property bin_target_counts
initialize()

Prepare this system object for use in simulation or analysis, creating a bin space, setting replicas per bin, and so on. This function is called whenever a WEST tool creates an instance of the system driver.

prepare_run()

Prepare this system for use in a simulation run. Called by w_run in all worker processes.

finalize_run()

A hook for system-specific processing for the end of a simulation run (as defined by such things as maximum wallclock time, rather than perhaps more scientifically-significant definitions of “the end of a simulation run”)

new_pcoord_array(pcoord_len=None)

Return an appropriately-sized and -typed pcoord array for a timepoint, segment, or number of segments. If pcoord_len is not specified (or None), then a length appropriate for a segment is returned.

new_region_set()

westpa.core.textio module

Miscellaneous routines to help with input and output of WEST-related data in text format

class westpa.core.textio.NumericTextOutputFormatter(output_file, mode='wt', emit_header=None)

Bases: object

comment_string = '# '
emit_header = True
close()
write(str)
writelines(sequence)
write_comment(line)

Writes a line beginning with the comment string

write_header(line)

Appends a line to those written when the file header is written. The appropriate comment string will be prepended, so line should not include a comment character.

westpa.core.we_driver module

class westpa.core.we_driver.Segment(n_iter=None, seg_id=None, weight=None, endpoint_type=None, parent_id=None, wtg_parent_ids=None, pcoord=None, status=None, walltime=None, cputime=None, data=None)

Bases: object

A class wrapping segment data that must be passed through the work manager or data manager. Most fields are self-explanatory. One item worth noting is that a negative parent ID means that the segment starts from the initial state with ID -(segment.parent_id+1)

SEG_STATUS_UNSET = 0
SEG_STATUS_PREPARED = 1
SEG_STATUS_COMPLETE = 2
SEG_STATUS_FAILED = 3
SEG_INITPOINT_UNSET = 0
SEG_INITPOINT_CONTINUES = 1
SEG_INITPOINT_NEWTRAJ = 2
SEG_ENDPOINT_UNSET = 0
SEG_ENDPOINT_CONTINUES = 1
SEG_ENDPOINT_MERGED = 2
SEG_ENDPOINT_RECYCLED = 3
statuses = {'SEG_STATUS_COMPLETE': 2, 'SEG_STATUS_FAILED': 3, 'SEG_STATUS_PREPARED': 1, 'SEG_STATUS_UNSET': 0}
initpoint_types = {'SEG_INITPOINT_CONTINUES': 1, 'SEG_INITPOINT_NEWTRAJ': 2, 'SEG_INITPOINT_UNSET': 0}
endpoint_types = {'SEG_ENDPOINT_CONTINUES': 1, 'SEG_ENDPOINT_MERGED': 2, 'SEG_ENDPOINT_RECYCLED': 3, 'SEG_ENDPOINT_UNSET': 0}
status_names = {0: 'SEG_STATUS_UNSET', 1: 'SEG_STATUS_PREPARED', 2: 'SEG_STATUS_COMPLETE', 3: 'SEG_STATUS_FAILED'}
initpoint_type_names = {0: 'SEG_INITPOINT_UNSET', 1: 'SEG_INITPOINT_CONTINUES', 2: 'SEG_INITPOINT_NEWTRAJ'}
endpoint_type_names = {0: 'SEG_ENDPOINT_UNSET', 1: 'SEG_ENDPOINT_CONTINUES', 2: 'SEG_ENDPOINT_MERGED', 3: 'SEG_ENDPOINT_RECYCLED'}
static initial_pcoord(segment)

Return the initial progress coordinate point of this segment.

static final_pcoord(segment)

Return the final progress coordinate point of this segment.

property initpoint_type
property initial_state_id
property status_text
property endpoint_type_text
class westpa.core.we_driver.InitialState(state_id, basis_state_id, iter_created, iter_used=None, istate_type=None, istate_status=None, pcoord=None, basis_state=None, basis_auxref=None)

Bases: object

Describes an initial state for a new trajectory. These are generally constructed by appropriate modification of a basis state.

Variables:
  • state_id – Integer identifier of this state, usually set by the data manager.

  • basis_state_id – Identifier of the basis state from which this state was generated, or None.

  • basis_state – The BasisState from which this state was generated, or None.

  • iter_created – Iteration in which this state was generated (0 for simulation initialization).

  • iter_used – Iteration in which this state was used to initiate a trajectory (None for unused).

  • istate_type – Integer describing the type of this initial state (ISTATE_TYPE_BASIS for direct use of a basis state, ISTATE_TYPE_GENERATED for a state generated from a basis state, ISTATE_TYPE_RESTART for a state corresponding to the endpoint of a segment in another simulation, or ISTATE_TYPE_START for a state generated from a start state).

  • istate_status – Integer describing whether this initial state has been properly prepared.

  • pcoord – The representative progress coordinate of this state.

ISTATE_TYPE_UNSET = 0
ISTATE_TYPE_BASIS = 1
ISTATE_TYPE_GENERATED = 2
ISTATE_TYPE_RESTART = 3
ISTATE_TYPE_START = 4
ISTATE_UNUSED = 0
ISTATE_STATUS_PENDING = 0
ISTATE_STATUS_PREPARED = 1
ISTATE_STATUS_FAILED = 2
istate_types = {'ISTATE_TYPE_BASIS': 1, 'ISTATE_TYPE_GENERATED': 2, 'ISTATE_TYPE_RESTART': 3, 'ISTATE_TYPE_START': 4, 'ISTATE_TYPE_UNSET': 0}
istate_type_names = {0: 'ISTATE_TYPE_UNSET', 1: 'ISTATE_TYPE_BASIS', 2: 'ISTATE_TYPE_GENERATED', 3: 'ISTATE_TYPE_RESTART', 4: 'ISTATE_TYPE_START'}
istate_statuses = {'ISTATE_STATUS_FAILED': 2, 'ISTATE_STATUS_PENDING': 0, 'ISTATE_STATUS_PREPARED': 1}
istate_status_names = {0: 'ISTATE_STATUS_PENDING', 1: 'ISTATE_STATUS_PREPARED', 2: 'ISTATE_STATUS_FAILED'}
as_numpy_record()
exception westpa.core.we_driver.ConsistencyError

Bases: RuntimeError

exception westpa.core.we_driver.AccuracyError

Bases: RuntimeError

class westpa.core.we_driver.NewWeightEntry(source_type, weight, prev_seg_id=None, prev_init_pcoord=None, prev_final_pcoord=None, new_init_pcoord=None, target_state_id=None, initial_state_id=None)

Bases: object

NW_SOURCE_RECYCLED = 0
class westpa.core.we_driver.WEDriver(rc=None, system=None)

Bases: object

A class implementing Huber & Kim’s weighted ensemble algorithm over Segment objects. This class handles all binning, recycling, and preparation of new Segment objects for the next iteration. Binning is accomplished using system.bin_mapper, and per-bin target counts are taken from system.bin_target_counts.

The workflow is as follows:

  1. Call new_iteration() every new iteration, providing any recycling targets that are in force and any available initial states for recycling.

  2. Call assign() to assign segments to bins based on their initial and end points. This returns the number of walkers that were recycled.

  3. Call construct_next() to run the weighted ensemble algorithm. The initial states used to recycle walkers are those supplied through new_iteration() or add_initial_states().
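
The call order described in this workflow can be sketched as follows. This is an illustration under stated assumptions, not WESTPA code. The final step is shown as construct_next(), the method documented below as running recycling and split/merge. RecordingDriver is a hypothetical stand-in that only records which methods were called, so the example runs without westpa installed.

```python
def run_one_iteration(driver, segments, initial_states, target_states):
    """Drive one weighted ensemble iteration in the documented order.

    `driver` is assumed to expose new_iteration(), assign(), and
    construct_next() with the signatures listed in this reference.
    """
    # Step 1: declare recycling targets and the initial states available to recycle to.
    driver.new_iteration(initial_states=initial_states, target_states=target_states)
    # Step 2: bin the completed segments by their initial and final points.
    driver.assign(segments)
    # Step 3: recycle, split, and merge to build the next iteration's segments.
    driver.construct_next()
    return list(driver.next_iter_segments)


class RecordingDriver:
    """Hypothetical stand-in for WEDriver that only records the call order."""

    def __init__(self):
        self.calls = []
        self.next_iter_segments = []
        self._assigned = []

    def new_iteration(self, initial_states=None, target_states=None):
        self.calls.append('new_iteration')

    def assign(self, segments, initializing=False):
        self.calls.append('assign')
        self._assigned = list(segments)

    def construct_next(self):
        self.calls.append('construct_next')
        # A real driver splits and merges here; this stub passes segments through.
        self.next_iter_segments = list(self._assigned)


driver = RecordingDriver()
result = run_one_iteration(driver, ['seg0', 'seg1'], initial_states=[], target_states=[])
print(driver.calls)
```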

Note the presence of flux_matrix, transition_matrix, current_iter_segments, next_iter_segments, recycling_segments, initial_binning, final_binning, next_iter_binning, and new_weights (to be documented soon).

weight_split_threshold = 2.0
weight_merge_cutoff = 1.0
largest_allowed_weight = 1.0
smallest_allowed_weight = 1e-310
process_config()
property next_iter_segments

Newly-created segments for the next iteration

property current_iter_segments

Segments for the current iteration

property next_iter_assignments

Bin assignments (indices) for initial points of next iteration.

property current_iter_assignments

Bin assignments (indices) for endpoints of current iteration.

property recycling_segments

Segments designated for recycling

property n_recycled_segs

Number of segments recycled this iteration

property n_istates_needed

Number of initial states needed to support recycling for this iteration

check_threshold_configs()

Check whether the weight threshold parameters are valid.
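
This reference does not spell out which conditions are checked. Purely as an illustration, a validity check on a pair of weight bounds might look like the sketch below. The function name and all of its rules are assumptions and are not taken from WESTPA.

```python
def validate_weight_bounds(smallest_allowed_weight, largest_allowed_weight):
    """Hypothetical validity check for a pair of weight bounds.

    Assumed rules (not taken from WESTPA): both values convert to float,
    neither is negative, and the smallest bound is strictly below the largest.
    """
    try:
        smallest = float(smallest_allowed_weight)
        largest = float(largest_allowed_weight)
    except (TypeError, ValueError) as exc:
        raise ValueError("weight bounds must be numeric") from exc
    if smallest < 0 or largest < 0:
        raise ValueError("weight bounds must be non-negative")
    if not smallest < largest:
        raise ValueError("smallest_allowed_weight must be below largest_allowed_weight")
    return smallest, largest


# The class defaults listed above (1e-310 and 1.0) pass this hypothetical check.
print(validate_weight_bounds(1e-310, 1.0))
```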

clear()

Explicitly delete all Segment-related state.

new_iteration(initial_states=None, target_states=None, new_weights=None, bin_mapper=None, bin_target_counts=None)

Prepare for a new iteration. initial_states is a sequence of all InitialState objects valid for use in generating new segments for the next iteration (after the one being begun with the call to new_iteration); that is, these are states available to recycle to. Target states which generate recycling events are specified in target_states, a sequence of TargetState objects. Both initial_states and target_states may be empty as required.

The optional new_weights is a sequence of NewWeightEntry objects which will be used to construct the initial flux matrix.

The given bin_mapper will be used for assignment, and bin_target_counts used for splitting/merging target counts; each will be obtained from the system object if omitted or None.

add_initial_states(initial_states)

Add newly-prepared initial states to the pool available for recycling.

property all_initial_states

Return an iterator over all initial states (available or used)

assign(segments, initializing=False)

Assign segments to initial and final bins, and update the (internal) lists of used and available initial states. If initializing is True, then the “final” bin assignments will be identical to the initial bin assignments, a condition required for seeding a new iteration from pre-existing segments.

populate_initial(initial_states, weights, system=None)

Create walkers for a new weighted ensemble simulation.

One segment is created for each provided initial state, then binned and split/merged as necessary. After this function is called, next_iter_segments will yield the new segments to create, used_initial_states will contain data about which of the provided initial states were used, and avail_initial_states will contain data about which initial states were unused (because their corresponding walkers were merged out of existence).

rebin_current(parent_segments)

Reconstruct walkers for the current iteration based on (presumably) new binning. The previous iteration’s segments must be provided (as parent_segments) in order to update endpoint types appropriately.

construct_next()

Construct walkers for the next iteration, by running weighted ensemble recycling and bin/split/merge on the segments previously assigned to bins using assign. Enough unused initial states must be present in self.avail_initial_states for every recycled walker to be assigned an initial state.

After this function completes, self.flux_matrix contains a valid flux matrix for this iteration (including any contributions from recycling from the previous iteration), and self.next_iter_segments contains a list of segments ready for the next iteration, with appropriate values set for weight, endpoint type, parent walkers, and so on.

westpa.core.wm_ops module

westpa.core.wm_ops.get_pcoord(state)
westpa.core.wm_ops.gen_istate(basis_state, initial_state)
westpa.core.wm_ops.prep_iter(n_iter, segments)
westpa.core.wm_ops.post_iter(n_iter, segments)
westpa.core.wm_ops.propagate(basis_states, initial_states, segments)

westpa.core.yamlcfg module

YAML-based configuration files for WESTPA

westpa.core.yamlcfg.YLoader

alias of CLoader

class westpa.core.yamlcfg.NopMapper

Bases: BinMapper

Put everything into one bin.

assign(coords, mask=None, output=None)
exception westpa.core.yamlcfg.ConfigValueWarning

Bases: UserWarning

westpa.core.yamlcfg.warn_dubious_config_entry(entry, value, expected_type=None, category=<class 'westpa.core.yamlcfg.ConfigValueWarning'>, stacklevel=1)
westpa.core.yamlcfg.check_bool(value, action='warn')

Check that the given value is boolean in type. If not, either raise a warning (if action=='warn') or an exception (action=='raise').
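
The warn-or-raise behaviour described above can be sketched with the standard warnings module. This is not the WESTPA implementation. check_bool_sketch and the local ConfigValueWarning class are stand-ins, and the exception type raised for action='raise' is an assumption.

```python
import warnings


class ConfigValueWarning(UserWarning):
    """Local stand-in for the warning class of the same name listed above."""


def check_bool_sketch(value, action='warn'):
    """Warn or raise when `value` is not a bool, depending on `action`."""
    if action not in ('warn', 'raise'):
        raise ValueError(f"invalid action {action!r}")
    if isinstance(value, bool):
        return
    message = f"dubious boolean value {value!r}"
    if action == 'warn':
        warnings.warn(message, category=ConfigValueWarning, stacklevel=2)
    else:
        raise TypeError(message)  # the exception type is an assumption


# Capture the warning issued for a non-boolean value.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    check_bool_sketch('yes')
    check_bool_sketch(True)  # a real bool produces no warning

print(len(caught), caught[0].category.__name__)
```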

exception westpa.core.yamlcfg.ConfigItemMissing(key, message=None)

Bases: KeyError

exception westpa.core.yamlcfg.ConfigItemTypeError(key, expected_type, message=None)

Bases: TypeError

exception westpa.core.yamlcfg.ConfigValueError(key, value, message=None)

Bases: ValueError

class westpa.core.yamlcfg.YAMLConfig

Bases: object

preload_config_files = ['/etc/westpa/westrc', '/home/docs/.westrc']
update_from_file(file, required=True)
require(key, type_=None)

Ensure that a configuration item with the given key is present. If the optional type_ is given, additionally require that the item has that type.

require_type_if_present(key, type_)

Ensure that the configuration item with the given key has the given type.
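
The difference between require and require_type_if_present can be illustrated with a small dictionary-backed stand-in. MiniConfig is hypothetical. The flat string keys, the returned values, and the exception constructors are assumptions. Only the exception base classes (KeyError and TypeError) follow the listing above.

```python
class ConfigItemMissing(KeyError):
    """Stand-in mirroring the documented base class (KeyError)."""


class ConfigItemTypeError(TypeError):
    """Stand-in mirroring the documented base class (TypeError)."""


class MiniConfig:
    """Hypothetical dictionary-backed configuration, for illustration only."""

    def __init__(self, data):
        self._data = dict(data)

    def require(self, key, type_=None):
        # The key must be present; the type is checked only when type_ is given.
        if key not in self._data:
            raise ConfigItemMissing(key)
        item = self._data[key]
        if type_ is not None and not isinstance(item, type_):
            raise ConfigItemTypeError(f"{key!r} must be of type {type_.__name__}")
        return item

    def require_type_if_present(self, key, type_):
        # A missing key is acceptable here; only a present item is type-checked.
        if key in self._data and not isinstance(self._data[key], type_):
            raise ConfigItemTypeError(f"{key!r} must be of type {type_.__name__}")


config = MiniConfig({'max_iterations': 100})
print(config.require('max_iterations', int))
config.require_type_if_present('not_set', str)  # absent key: no error
```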

coerce_type_if_present(key, type_)
get(key, default=None)
get_typed(key, type_, default=<object object>)
get_path(key, default=<object object>, expandvars=True, expanduser=True, realpath=True, abspath=True)
get_pathlist(key, default=<object object>, sep=':', expandvars=True, expanduser=True, realpath=True, abspath=True)
get_python_object(key, default=<object object>, path=None)
get_choice(key, choices, default=<object object>, value_transform=None)
class westpa.core.yamlcfg.YAMLSystem(rc=None)

Bases: object

A description of the system being simulated, including the dimensionality and data type of the progress coordinate, the number of progress coordinate entries expected from each segment, and binning. To construct a simulation, the user must subclass WESTSystem and set several instance variables.

At a minimum, the user must subclass WESTSystem and override initialize() to set the data type and dimensionality of progress coordinate data and define a bin mapper.

Variables:
  • pcoord_ndim – The number of dimensions in the progress coordinate. Defaults to 1 (i.e. a one-dimensional progress coordinate).

  • pcoord_dtype – The data type of the progress coordinate, which must be callable (e.g. np.float32 and long will work, but '<f4' and '<i8' will not). Defaults to np.float64.

  • pcoord_len – The length of the progress coordinate time series generated by each segment, including both the initial and final values. Defaults to 2 (i.e. only the initial and final progress coordinate values for a segment are returned from propagation).

  • bin_mapper – A bin mapper describing the progress coordinate space.

  • bin_target_counts – A vector of target counts, one per bin.
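
The requirement that pcoord_dtype be callable can be illustrated with plain Python. The sketch below uses the built-in float as a stand-in for np.float64 and nested lists in place of a NumPy array. make_pcoord_buffer is a hypothetical helper, and the (pcoord_len, pcoord_ndim) layout for a single segment is an assumption.

```python
def make_pcoord_buffer(pcoord_len=2, pcoord_ndim=1, pcoord_dtype=float):
    """Hypothetical helper: zero-filled rows of shape (pcoord_len, pcoord_ndim)."""
    if not callable(pcoord_dtype):
        raise TypeError(f"pcoord_dtype must be callable, got {pcoord_dtype!r}")
    # Calling the type with no arguments yields its zero value (float() == 0.0).
    return [[pcoord_dtype() for _ in range(pcoord_ndim)] for _ in range(pcoord_len)]


print(make_pcoord_buffer())              # [[0.0], [0.0]]
print(callable(float), callable('<f4'))  # True False
```

A type object such as float can be called to produce a value, whereas a dtype string such as '<f4' cannot, which is why the string forms are rejected.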

property bin_target_counts
initialize()

Prepare this system object for use in simulation or analysis, creating a bin space, setting replicas per bin, and so on. This function is called whenever a WEST tool creates an instance of the system driver.

prepare_run()

Prepare this system for use in a simulation run. Called by w_run in all worker processes.

finalize_run()

A hook for system-specific processing at the end of a simulation run (as defined by such things as maximum wallclock time, rather than perhaps more scientifically significant definitions of “the end of a simulation run”).

new_pcoord_array(pcoord_len=None)

Return an appropriately-sized and -typed pcoord array for a timepoint, segment, or number of segments. If pcoord_len is not specified (or None), then a length appropriate for a segment is returned.

new_region_set()