westpa.core modules
westpa.core module
westpa.core.data_manager module
HDF5 data manager for WEST.
Original HDF5 implementation: Joseph W. Kaus. Current implementation: Matthew C. Zwier.
WEST exclusively uses the cross-platform, self-describing file format HDF5 for data storage. This ensures that data is stored efficiently and portably in a manner that is relatively straightforward for other analysis tools (perhaps written in C/C++/Fortran) to access.
The data is laid out in HDF5 as follows:
- summary – overall summary data for the simulation
- /iterations/ – data for individual iterations, one group per iteration under /iterations
  - iter_00000001/ – data for iteration 1
    - seg_index – overall information about segments in the iteration, including weight
    - pcoord – progress coordinate data organized as [seg_id][time][dimension]
    - wtg_parents – data used to reconstruct the split/merge history of trajectories
    - recycling – flux and event count for recycled particles, on a per-target-state basis
    - auxdata/ – auxiliary datasets (data stored on the 'data' field of Segment objects)
The file root object has an integer attribute ‘west_file_format_version’ which can be used to determine how to access data even as the file format (i.e. organization of data within HDF5 file) evolves.
Version history:
- Version 9
  - Basis states are now saved as iter_segid instead of just segid as a pointer label.
  - Initial states are also saved in the iteration 0 file, with a negative sign.
- Version 8
  - Added external links to trajectory files in iterations/iter_* groups, if the HDF5 framework was used.
  - Added an iter group for iteration 0 to store conformations of basis states.
- Version 7
  - Removed bin_assignments, bin_populations, and bin_rates from the iteration group.
  - Added a new_segments subgroup to the iteration group.
- Version 6
  - ???
- Version 5
  - Moved iter_* groups into a top-level iterations/ group.
  - Added in-HDF5 storage for basis states, target states, and generated states.
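Both the layout and the version attribute are plain HDF5 objects, so they can be inspected directly with h5py. A minimal sketch (the tiny file built here only mimics the layout described above; real files are produced by WESTPA under the default name west.h5):

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.gettempdir(), 'demo_west.h5')

# Build a tiny file that mimics the documented layout (for illustration only).
with h5py.File(path, 'w') as f:
    f.attrs['west_file_format_version'] = 9
    iter_grp = f.create_group('iterations/iter_00000001')
    # pcoord is organized as [seg_id][time][dimension]
    iter_grp.create_dataset('pcoord', data=np.zeros((4, 11, 1)))

# Read it back the way an external analysis tool might.
with h5py.File(path, 'r') as f:
    version = int(f.attrs['west_file_format_version'])
    pcoord = f['iterations/iter_00000001/pcoord'][:]
```

Checking `west_file_format_version` first lets a tool adapt to layout changes across the versions listed above.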
- class westpa.core.data_manager.attrgetter(attr, /, *attrs)
Bases:
object
Return a callable object that fetches the given attribute(s) from its operand. After f = attrgetter('name'), the call f(r) returns r.name. After g = attrgetter('name', 'date'), the call g(r) returns (r.name, r.date). After h = attrgetter('name.first', 'name.last'), the call h(r) returns (r.name.first, r.name.last).
- westpa.core.data_manager.relpath(path, start=None)
Return a relative version of a path
- westpa.core.data_manager.dirname(p)
Returns the directory component of a pathname
- class westpa.core.data_manager.Segment(n_iter=None, seg_id=None, weight=None, endpoint_type=None, parent_id=None, wtg_parent_ids=None, pcoord=None, status=None, walltime=None, cputime=None, data=None)
Bases:
object
A class wrapping segment data that must be passed through the work manager or data manager. Most fields are self-explanatory. One item worth noting is that a negative parent ID means that the segment starts from the initial state with ID -(segment.parent_id+1)
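The sign convention for parent_id can be decoded with a small helper (a hypothetical function for illustration, not part of the Segment API):

```python
def decode_parent(parent_id):
    """Map a segment's parent_id to its lineage source.

    A non-negative value is an ordinary parent seg_id; a negative value
    means the segment starts from the initial state with ID
    -(parent_id + 1), per the convention documented above.
    """
    if parent_id >= 0:
        return ('segment', parent_id)
    return ('initial_state', -(parent_id + 1))
```

For example, a parent_id of -1 denotes initial state 0, and -3 denotes initial state 2.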
- SEG_STATUS_UNSET = 0
- SEG_STATUS_PREPARED = 1
- SEG_STATUS_COMPLETE = 2
- SEG_STATUS_FAILED = 3
- SEG_INITPOINT_UNSET = 0
- SEG_INITPOINT_CONTINUES = 1
- SEG_INITPOINT_NEWTRAJ = 2
- SEG_ENDPOINT_UNSET = 0
- SEG_ENDPOINT_CONTINUES = 1
- SEG_ENDPOINT_MERGED = 2
- SEG_ENDPOINT_RECYCLED = 3
- statuses = {'SEG_STATUS_COMPLETE': 2, 'SEG_STATUS_FAILED': 3, 'SEG_STATUS_PREPARED': 1, 'SEG_STATUS_UNSET': 0}
- initpoint_types = {'SEG_INITPOINT_CONTINUES': 1, 'SEG_INITPOINT_NEWTRAJ': 2, 'SEG_INITPOINT_UNSET': 0}
- endpoint_types = {'SEG_ENDPOINT_CONTINUES': 1, 'SEG_ENDPOINT_MERGED': 2, 'SEG_ENDPOINT_RECYCLED': 3, 'SEG_ENDPOINT_UNSET': 0}
- status_names = {0: 'SEG_STATUS_UNSET', 1: 'SEG_STATUS_PREPARED', 2: 'SEG_STATUS_COMPLETE', 3: 'SEG_STATUS_FAILED'}
- initpoint_type_names = {0: 'SEG_INITPOINT_UNSET', 1: 'SEG_INITPOINT_CONTINUES', 2: 'SEG_INITPOINT_NEWTRAJ'}
- endpoint_type_names = {0: 'SEG_ENDPOINT_UNSET', 1: 'SEG_ENDPOINT_CONTINUES', 2: 'SEG_ENDPOINT_MERGED', 3: 'SEG_ENDPOINT_RECYCLED'}
- static initial_pcoord(segment)
Return the initial progress coordinate point of this segment.
- static final_pcoord(segment)
Return the final progress coordinate point of this segment.
- property initpoint_type
- property initial_state_id
- property status_text
- property endpoint_type_text
- class westpa.core.data_manager.BasisState(label, probability, pcoord=None, auxref=None, state_id=None)
Bases:
object
Describes a basis (micro)state. These basis states are used to generate initial states for new trajectories, either at the beginning of the simulation (i.e. at w_init) or due to recycling.
- Variables:
state_id – Integer identifier of this state, usually set by the data manager.
label – A descriptive label for this microstate (may be empty)
probability – Probability of this state to be selected when creating a new trajectory.
pcoord – The representative progress coordinate of this state.
auxref – A user-provided (string) reference for locating data associated with this state (usually a filesystem path).
- classmethod states_to_file(states, fileobj)
Write a file defining basis states, which may then be read by states_from_file().
- classmethod states_from_file(statefile)
Read a file defining basis states. Each line defines a state, and contains a label, the probability, and optionally a data reference, separated by whitespace, as in:
unbound 1.0
or:
unbound_0 0.6 state0.pdb
unbound_1 0.4 state1.pdb
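A sketch of how such lines could be parsed with plain string splitting (illustrative only; the documented entry point is states_from_file()):

```python
def parse_basis_states(lines):
    """Parse basis-state lines: a label, a probability, and an
    optional whitespace-separated data reference (auxref)."""
    states = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        label = fields[0]
        probability = float(fields[1])
        auxref = fields[2] if len(fields) > 2 else None
        states.append((label, probability, auxref))
    return states

states = parse_basis_states(['unbound_0 0.6 state0.pdb',
                             'unbound_1 0.4 state1.pdb'])
```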
- as_numpy_record()
Return the data for this state as a numpy record array.
- class westpa.core.data_manager.TargetState(label, pcoord, state_id=None)
Bases:
object
Describes a target state.
- Variables:
state_id – Integer identifier of this state, usually set by the data manager.
label – A descriptive label for this microstate (may be empty)
pcoord – The representative progress coordinate of this state.
- classmethod states_to_file(states, fileobj)
Write a file defining target states, which may then be read by states_from_file().
- classmethod states_from_file(statefile, dtype)
Read a file defining target states. Each line defines a state, and contains a label followed by a representative progress coordinate value, separated by whitespace, as in:
bound 0.02
for a single target and one-dimensional progress coordinates or:
bound 2.7 0.0
drift 100 50.0
for two targets and a two-dimensional progress coordinate.
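The target-state format generalizes to any progress-coordinate dimensionality; a parsing sketch (illustrative, not the actual states_from_file() code):

```python
def parse_target_states(lines):
    """Parse target-state lines: a label followed by one progress
    coordinate value per dimension, separated by whitespace."""
    targets = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        targets.append((fields[0], [float(x) for x in fields[1:]]))
    return targets

targets = parse_target_states(['bound 2.7 0.0', 'drift 100 50.0'])
```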
- class westpa.core.data_manager.InitialState(state_id, basis_state_id, iter_created, iter_used=None, istate_type=None, istate_status=None, pcoord=None, basis_state=None, basis_auxref=None)
Bases:
object
Describes an initial state for a new trajectory. These are generally constructed by appropriate modification of a basis state.
- Variables:
state_id – Integer identifier of this state, usually set by the data manager.
basis_state_id – Identifier of the basis state from which this state was generated, or None.
basis_state – The BasisState from which this state was generated, or None.
iter_created – Iteration in which this state was generated (0 for simulation initialization).
iter_used – Iteration in which this state was used to initiate a trajectory (None for unused).
istate_type – Integer describing the type of this initial state (ISTATE_TYPE_BASIS for direct use of a basis state, ISTATE_TYPE_GENERATED for a state generated from a basis state, ISTATE_TYPE_RESTART for a state corresponding to the endpoint of a segment in another simulation, or ISTATE_TYPE_START for a state generated from a start state).
istate_status – Integer describing whether this initial state has been properly prepared.
pcoord – The representative progress coordinate of this state.
- ISTATE_TYPE_UNSET = 0
- ISTATE_TYPE_BASIS = 1
- ISTATE_TYPE_GENERATED = 2
- ISTATE_TYPE_RESTART = 3
- ISTATE_TYPE_START = 4
- ISTATE_UNUSED = 0
- ISTATE_STATUS_PENDING = 0
- ISTATE_STATUS_PREPARED = 1
- ISTATE_STATUS_FAILED = 2
- istate_types = {'ISTATE_TYPE_BASIS': 1, 'ISTATE_TYPE_GENERATED': 2, 'ISTATE_TYPE_RESTART': 3, 'ISTATE_TYPE_START': 4, 'ISTATE_TYPE_UNSET': 0}
- istate_type_names = {0: 'ISTATE_TYPE_UNSET', 1: 'ISTATE_TYPE_BASIS', 2: 'ISTATE_TYPE_GENERATED', 3: 'ISTATE_TYPE_RESTART', 4: 'ISTATE_TYPE_START'}
- istate_statuses = {'ISTATE_STATUS_FAILED': 2, 'ISTATE_STATUS_PENDING': 0, 'ISTATE_STATUS_PREPARED': 1}
- istate_status_names = {0: 'ISTATE_STATUS_PENDING', 1: 'ISTATE_STATUS_PREPARED', 2: 'ISTATE_STATUS_FAILED'}
- as_numpy_record()
- class westpa.core.data_manager.NewWeightEntry(source_type, weight, prev_seg_id=None, prev_init_pcoord=None, prev_final_pcoord=None, new_init_pcoord=None, target_state_id=None, initial_state_id=None)
Bases:
object
- NW_SOURCE_RECYCLED = 0
- class westpa.core.data_manager.ExecutablePropagator(rc=None)
Bases:
WESTPropagator
- ENV_CURRENT_ITER = 'WEST_CURRENT_ITER'
- ENV_CURRENT_SEG_ID = 'WEST_CURRENT_SEG_ID'
- ENV_CURRENT_SEG_DATA_REF = 'WEST_CURRENT_SEG_DATA_REF'
- ENV_CURRENT_SEG_INITPOINT = 'WEST_CURRENT_SEG_INITPOINT_TYPE'
- ENV_PARENT_SEG_ID = 'WEST_PARENT_ID'
- ENV_PARENT_DATA_REF = 'WEST_PARENT_DATA_REF'
- ENV_BSTATE_ID = 'WEST_BSTATE_ID'
- ENV_BSTATE_DATA_REF = 'WEST_BSTATE_DATA_REF'
- ENV_ISTATE_ID = 'WEST_ISTATE_ID'
- ENV_ISTATE_DATA_REF = 'WEST_ISTATE_DATA_REF'
- ENV_STRUCT_DATA_REF = 'WEST_STRUCT_DATA_REF'
- ENV_RAND16 = 'WEST_RAND16'
- ENV_RAND32 = 'WEST_RAND32'
- ENV_RAND64 = 'WEST_RAND64'
- ENV_RAND128 = 'WEST_RAND128'
- ENV_RANDFLOAT = 'WEST_RANDFLOAT'
- static makepath(template, template_args=None, expanduser=True, expandvars=True, abspath=False, realpath=False)
- random_val_env_vars()
Return a set of environment variables containing random seeds. These are returned as a dictionary, suitable for use in os.environ.update() or as the env argument to subprocess.Popen(). Every child process executed by exec_child() gets these.
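A sketch of what such a dictionary might look like, using the ENV_RAND* variable names listed above (the exact value formatting is an assumption for illustration):

```python
import random

def random_val_env_vars():
    """Build WEST_RAND* environment variables holding fresh random
    seeds. Names follow the ENV_RAND* constants above; decimal string
    formatting is assumed here for illustration."""
    return {
        'WEST_RAND16': str(random.getrandbits(16)),
        'WEST_RAND32': str(random.getrandbits(32)),
        'WEST_RAND64': str(random.getrandbits(64)),
        'WEST_RAND128': str(random.getrandbits(128)),
        'WEST_RANDFLOAT': str(random.random()),
    }

env_vars = random_val_env_vars()
```

A propagator script can then read, say, $WEST_RAND32 to seed its own RNG reproducibly per segment.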
- exec_child(executable, environ=None, stdin=None, stdout=None, stderr=None, cwd=None)
Execute a child process with the environment set from the current environment, the values of self.addtl_child_environ, the random numbers returned by self.random_val_env_vars, and the given environ (applied in that order). stdin/stdout/stderr are optionally redirected. This function waits on the child process to finish, then returns (rc, rusage), where rc is the child's return code and rusage is the resource usage tuple from os.wait4().
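The documented layering order can be sketched as follows (a simplified, hypothetical helper; the real method also redirects streams and collects resource usage via os.wait4()):

```python
import os
import subprocess
import sys

def exec_child_sketch(args, addtl_environ=None, random_vars=None, environ=None):
    """Run a child process with the environment layered in the
    documented order: current environment, then addtl_child_environ,
    then random seed variables, then the explicit environ argument.
    Later layers win on conflicts."""
    env = dict(os.environ)
    for layer in (addtl_environ, random_vars, environ):
        if layer:
            env.update(layer)
    proc = subprocess.run(args, env=env, capture_output=True, text=True)
    return proc.returncode, proc.stdout

# The explicit environ layer overrides earlier layers:
rc, out = exec_child_sketch(
    [sys.executable, '-c', 'import os; print(os.environ["DEMO_VAR"])'],
    addtl_environ={'DEMO_VAR': 'base'},
    environ={'DEMO_VAR': 'override'},
)
```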
- exec_child_from_child_info(child_info, template_args, environ)
- update_args_env_basis_state(template_args, environ, basis_state)
- update_args_env_initial_state(template_args, environ, initial_state)
- update_args_env_iter(template_args, environ, n_iter)
- update_args_env_segment(template_args, environ, segment)
- template_args_for_segment(segment)
- exec_for_segment(child_info, segment, addtl_env=None)
Execute a child process with environment and template expansion from the given segment.
- exec_for_iteration(child_info, n_iter, addtl_env=None)
Execute a child process with environment and template expansion from the given iteration number.
- exec_for_basis_state(child_info, basis_state, addtl_env=None)
Execute a child process with environment and template expansion from the given basis state
- exec_for_initial_state(child_info, initial_state, addtl_env=None)
Execute a child process with environment and template expansion from the given initial state.
- prepare_file_system(segment, environ)
- setup_dataset_return(segment=None, subset_keys=None)
Set up temporary files and environment variables that point to them for segment runners to return data. segment is the Segment object that the return data is associated with. subset_keys specifies the names of a subset of data to be returned.
- retrieve_dataset_return(state, return_files, del_return_files, single_point)
Retrieve returned data from the temporary locations directed by the environment variables. state is a Segment, BasisState, or InitialState object that the return data is associated with. return_files is a dict where the keys are the dataset names and the values are the paths to the temporary files that contain the returned data. del_return_files is a dict where the keys are the names of datasets to be deleted (if the corresponding value is set to True) once the data is retrieved.
- get_pcoord(state)
Get the progress coordinate of the given basis or initial state.
- gen_istate(basis_state, initial_state)
Generate a new initial state from the given basis state.
- prepare_iteration(n_iter, segments)
Perform any necessary per-iteration preparation. This is run by the work manager.
- finalize_iteration(n_iter, segments)
Perform any necessary post-iteration cleanup. This is run by the work manager.
- propagate(segments)
Propagate one or more segments, including any necessary per-iteration setup and teardown for this propagator.
- westpa.core.data_manager.makepath(template, template_args=None, expanduser=True, expandvars=True, abspath=False, realpath=False)
- class westpa.core.data_manager.flushing_lock(lock, fileobj)
Bases:
object
- class westpa.core.data_manager.expiring_flushing_lock(lock, flush_method, nextsync)
Bases:
object
- westpa.core.data_manager.seg_id_dtype
alias of
int64
- westpa.core.data_manager.n_iter_dtype
alias of
uint32
- westpa.core.data_manager.weight_dtype
alias of
float64
- westpa.core.data_manager.utime_dtype
alias of
float64
- westpa.core.data_manager.seg_status_dtype
alias of
uint8
- westpa.core.data_manager.seg_initpoint_dtype
alias of
uint8
- westpa.core.data_manager.seg_endpoint_dtype
alias of
uint8
- westpa.core.data_manager.istate_type_dtype
alias of
uint8
- westpa.core.data_manager.istate_status_dtype
alias of
uint8
- westpa.core.data_manager.nw_source_dtype
alias of
uint8
- class westpa.core.data_manager.WESTDataManager(rc=None)
Bases:
object
Data manager for assisting the reading and writing of WEST data from/to HDF5 files.
- default_iter_prec = 8
- default_we_h5filename = 'west.h5'
- default_we_h5file_driver = None
- default_flush_period = 60
- default_aux_compression_threshold = 1048576
- binning_hchunksize = 4096
- table_scan_chunksize = 1024
- flushing_lock()
- expiring_flushing_lock()
- process_config()
- property system
- property closed
- iter_group_name(n_iter, absolute=True)
- require_iter_group(n_iter)
Get the group associated with n_iter, creating it if necessary.
- del_iter_group(n_iter)
- get_iter_group(n_iter)
- get_seg_index(n_iter)
- property current_iteration
- open_backing(mode=None)
Open the (already-created) HDF5 file named in self.west_h5filename.
- prepare_backing()
Create a new HDF5 file.
- close_backing()
- flush_backing()
- save_target_states(tstates, n_iter=None)
Save the given target states in the HDF5 file; they will be used for the next iteration to be propagated. A complete set is required, even if nominally appending to an existing set, which simplifies the mapping of IDs to the table.
- find_tstate_group(n_iter)
- find_ibstate_group(n_iter)
- get_target_states(n_iter)
Return a list of TargetState objects representing the target (sink) states that are in use for iteration n_iter. Future iterations are assumed to continue from the most recent set of states.
- create_ibstate_group(basis_states, n_iter=None)
Create the group used to store basis states and initial states (whose definitions are always coupled). This group is hard-linked into all iteration groups that use these basis and initial states.
- create_ibstate_iter_h5file(basis_states)
Create the per-iteration HDF5 file for the basis states (i.e., iteration 0). This special treatment is needed so that the analysis tools can access basis states more easily.
- update_iter_h5file(n_iter, segments)
Write out the per-iteration HDF5 file with given segments and add an external link to it in the main HDF5 file (west.h5) if the link is not present.
- get_basis_states(n_iter=None)
Return a list of BasisState objects representing the basis states that are in use for iteration n_iter.
- create_initial_states(n_states, n_iter=None)
Create storage for n_states initial states associated with iteration n_iter, and return bare InitialState objects with only state_id set.
- update_initial_states(initial_states, n_iter=None)
Save the given initial states in the HDF5 file
- get_initial_states(n_iter=None)
- get_segment_initial_states(segments, n_iter=None)
Retrieve all initial states referenced by the given segments.
- get_unused_initial_states(n_states=None, n_iter=None)
Retrieve any prepared but unused initial states applicable to the given iteration. Up to n_states states are returned; if n_states is None, then all unused states are returned.
- prepare_iteration(n_iter, segments)
Prepare for a new iteration by creating space to store the new iteration’s data. The number of segments, their IDs, and their lineage must be determined and included in the set of segments passed in.
- update_iter_group_links(n_iter)
Update the per-iteration hard links pointing to the tables of target and initial/basis states for the given iteration. These links are not used by this class, but are remarkably convenient for third-party analysis tools and hdfview.
- get_iter_summary(n_iter=None)
- update_iter_summary(summary, n_iter=None)
- del_iter_summary(min_iter)
- update_segments(n_iter, segments)
Update segment information in the HDF5 file; all prior information for each segment is overwritten, except for parent and weight transfer information.
- get_segments(n_iter=None, seg_ids=None, load_pcoords=True)
Return the given (or all) segments from a given iteration.
If the optional parameter load_auxdata is true, then all available auxiliary datasets are loaded and mapped onto the data dictionary of each segment. If load_auxdata is None, then use the default self.auto_load_auxdata, which can be set by the option load_auxdata in the [data] section of west.cfg. This essentially requires as much RAM as there is per-iteration auxiliary data, so this behavior is not on by default.
- prepare_segment_restarts(segments, basis_states=None, initial_states=None)
Prepare the necessary folders and files, given the data stored in the parent per-iteration HDF5 file, for propagating the simulation. basis_states and initial_states should be provided if the segments are newly created.
- get_all_parent_ids(n_iter)
- get_parent_ids(n_iter, seg_ids=None)
Return a sequence of the parent IDs of the given seg_ids.
- get_weights(n_iter, seg_ids)
Return the weights associated with the given seg_ids
- get_child_ids(n_iter, seg_id)
Return the seg_ids of segments who have the given segment as a parent.
- get_children(segment)
Return all segments which have the given segment as a parent
- prepare_run()
- finalize_run()
- save_new_weight_data(n_iter, new_weights)
Save a set of NewWeightEntry objects to HDF5. Note that this should be called for the iteration in which the weights appear in their new locations (e.g. for recycled walkers, the iteration following recycling).
- get_new_weight_data(n_iter)
- find_bin_mapper(hashval)
Check to see if the given hash value is in the binning table. Returns the index in the bin data tables if found, or raises KeyError if not.
- get_bin_mapper(hashval)
Look up the given hash value in the binning table, unpickling and returning the corresponding bin mapper if available, or raising KeyError if not.
- save_bin_mapper(hashval, pickle_data)
Store the given mapper in the table of saved mappers. If the mapper cannot be stored, PickleError will be raised. Returns the index in the bin data tables where the mapper is stored.
- save_iter_binning(n_iter, hashval, pickled_mapper, target_counts)
Save information about the binning used to generate segments for iteration n_iter.
- westpa.core.data_manager.normalize_dataset_options(dsopts, path_prefix='', n_iter=0)
- westpa.core.data_manager.create_dataset_from_dsopts(group, dsopts, shape=None, dtype=None, data=None, autocompress_threshold=None, n_iter=None)
- westpa.core.data_manager.require_dataset_from_dsopts(group, dsopts, shape=None, dtype=None, data=None, autocompress_threshold=None, n_iter=None)
- westpa.core.data_manager.calc_chunksize(shape, dtype, max_chunksize=262144)
Calculate a chunk size for HDF5 data, anticipating that access will slice along lower dimensions sooner than higher dimensions.
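One plausible strategy matching that description is to shrink the slowest-varying (leading) axes first until the chunk fits in max_chunksize bytes, keeping the trailing axes whole for fast slicing. A sketch under that assumption, not WESTPA's actual algorithm:

```python
import numpy as np

def calc_chunksize_sketch(shape, dtype, max_chunksize=262144):
    """Halve axes starting from the leading dimension until the chunk's
    byte size is at most max_chunksize. A sketch of the documented
    intent, not WESTPA's actual implementation."""
    itemsize = np.dtype(dtype).itemsize
    chunk = list(shape)
    axis = 0
    while itemsize * int(np.prod(chunk)) > max_chunksize:
        if all(c == 1 for c in chunk):
            break  # cannot shrink further
        if chunk[axis] > 1:
            chunk[axis] = (chunk[axis] + 1) // 2
        else:
            axis += 1  # this axis is exhausted; move inward
    return tuple(chunk)

chunk = calc_chunksize_sketch((1000, 100, 3), 'float64')
```

Keeping the trailing (e.g. per-dimension) axes intact means a read of one segment's full pcoord touches few chunks.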
westpa.core.extloader module
- westpa.core.extloader.load_module(module_name, path=None)
Load and return the given module, recursively loading containing packages as necessary.
- westpa.core.extloader.get_object(object_name, path=None)
Attempt to load the given object, using additional path information if given.
westpa.core.h5io module
Miscellaneous routines to help with HDF5 input and output of WEST-related data.
- class westpa.core.h5io.Trajectory(xyz, topology, time=None, unitcell_lengths=None, unitcell_angles=None)
Bases:
object
Container object for a molecular dynamics trajectory
A Trajectory represents a collection of one or more molecular structures, generally (but not necessarily) from a molecular dynamics trajectory. The Trajectory stores a number of fields describing the system through time, including the cartesian coordinates of each atom (xyz), the topology of the molecular system (topology), and information about the unit cell if appropriate (unitcell_vectors, unitcell_lengths, unitcell_angles).
A Trajectory should generally be constructed by loading a file from disk. Trajectories can be loaded from (and saved to) the PDB, XTC, TRR, DCD, binpos, NetCDF or MDTraj HDF5 formats.
Trajectory supports fancy indexing, so you can extract one or more frames from a Trajectory as a separate trajectory. For example, to form a trajectory with every other frame, you can slice with traj[::2].
Trajectory uses the nanometer, degree & picosecond unit system.
Examples
>>> # loading a trajectory
>>> import mdtraj as md
>>> md.load('trajectory.xtc', top='native.pdb')
<mdtraj.Trajectory with 1000 frames, 22 atoms at 0x1058a73d0>
>>> # slicing a trajectory
>>> t = md.load('trajectory.h5')
>>> print(t)
<mdtraj.Trajectory with 100 frames, 22 atoms>
>>> print(t[::2])
<mdtraj.Trajectory with 50 frames, 22 atoms>
>>> # calculating the average distance between two atoms
>>> import mdtraj as md
>>> import numpy as np
>>> t = md.load('trajectory.h5')
>>> np.mean(np.sqrt(np.sum((t.xyz[:, 0, :] - t.xyz[:, 21, :])**2, axis=1)))
See also
mdtraj.load
High-level function that loads files and returns an md.Trajectory
- n_frames
- Type:
int
- n_atoms
- Type:
int
- n_residues
- Type:
int
- time
- Type:
np.ndarray, shape=(n_frames,)
- timestep
- Type:
float
- topology
- Type:
md.Topology
- top
- Type:
md.Topology
- xyz
- Type:
np.ndarray, shape=(n_frames, n_atoms, 3)
- unitcell_vectors
- Type:
{np.ndarray, shape=(n_frames, 3, 3), None}
- unitcell_lengths
- Type:
{np.ndarray, shape=(n_frames, 3), None}
- unitcell_angles
- Type:
{np.ndarray, shape=(n_frames, 3), None}
- property n_frames
Number of frames in the trajectory
- Returns:
n_frames – The number of frames in the trajectory
- Return type:
int
- property n_atoms
Number of atoms in the trajectory
- Returns:
n_atoms – The number of atoms in the trajectory
- Return type:
int
- property n_residues
Number of residues (amino acids) in the trajectory
- Returns:
n_residues – The number of residues in the trajectory’s topology
- Return type:
int
- property n_chains
Number of chains in the trajectory
- Returns:
n_chains – The number of chains in the trajectory’s topology
- Return type:
int
- property top
Alias for self.topology, describing the organization of atoms into residues, bonds, etc
- Returns:
topology – The topology object, describing the organization of atoms into residues, bonds, etc
- Return type:
md.Topology
- property timestep
Timestep between frames, in picoseconds
- Returns:
timestep – The timestep between frames, in picoseconds.
- Return type:
float
- property unitcell_vectors
The vectors that define the shape of the unit cell in each frame
- Returns:
vectors – Vectors defining the shape of the unit cell in each frame. The semantics of this array are that the shape of the unit cell in frame i is given by the three vectors value[i, 0, :], value[i, 1, :], and value[i, 2, :].
- Return type:
np.ndarray, shape=(n_frames, 3, 3)
- property unitcell_volumes
Volumes of unit cell for each frame.
- Returns:
volumes – Volumes of the unit cell in each frame, in nanometers^3, or None if the Trajectory contains no unitcell information.
- Return type:
{np.ndarray, shape=(n_frames), None}
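The connection between unitcell_vectors and unitcell_volumes is a per-frame determinant; a sketch with NumPy (a standalone function for illustration, not mdtraj's internals):

```python
import numpy as np

def unitcell_volumes_from_vectors(vectors):
    """Per-frame unit cell volume (nm^3) as the determinant of each
    frame's (3, 3) box-vector matrix; None if no unitcell info."""
    if vectors is None:
        return None
    # np.linalg.det broadcasts over the leading (frame) axis
    return np.linalg.det(vectors)

# A 2-frame trajectory of cubic boxes with 2 nm and 3 nm edges
vecs = np.array([np.eye(3) * 2.0, np.eye(3) * 3.0])
vols = unitcell_volumes_from_vectors(vecs)
```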
- superpose(reference, frame=0, atom_indices=None, ref_atom_indices=None, parallel=True)
Superpose each conformation in this trajectory upon a reference
- Parameters:
reference (md.Trajectory) – Align self to a particular frame in reference
frame (int) – The index of the conformation in reference to align to.
atom_indices (array_like, or None) – The indices of the atoms to superpose. If not supplied, all atoms will be used.
ref_atom_indices (array_like, or None) – Use these atoms on the reference structure. If not supplied, the same atom indices will be used for this trajectory and the reference one.
parallel (bool) – Use OpenMP to run the superposition in parallel over multiple cores
- Return type:
self
- join(other, check_topology=True, discard_overlapping_frames=False)
Join two trajectories together along the time/frame axis.
This method joins trajectories along the time axis, giving a new trajectory of length equal to the sum of the lengths of self and other. It can also be called by using self + other
- Parameters:
other (Trajectory or list of Trajectory) – One or more trajectories to join with this one. These trajectories are appended to the end of this trajectory.
check_topology (bool) – Ensure that the topology of self and other are identical before joining them. If false, the resulting trajectory will have the topology of self.
discard_overlapping_frames (bool, optional) – If True, compare coordinates at trajectory edges to discard overlapping frames. Default: False.
See also
stack
join two trajectories along the atom axis
- stack(other, keep_resSeq=True)
Stack two trajectories along the atom axis
This method joins trajectories along the atom axis, giving a new trajectory with a number of atoms equal to the sum of the number of atoms in self and other.
Notes
The resulting trajectory will have the unitcell and time information of the left operand.
Examples
>>> t1 = md.load('traj1.h5')
>>> t2 = md.load('traj2.h5')
>>> # even when t2 contains no unitcell information
>>> t2.unitcell_vectors = None
>>> stacked = t1.stack(t2)
>>> # the stacked trajectory inherits the unitcell information
>>> # from the first trajectory
>>> np.all(stacked.unitcell_vectors == t1.unitcell_vectors)
True
- Parameters:
other (Trajectory) – The other trajectory to join
keep_resSeq (bool, optional, default=True) – see the mdtraj.core.topology.Topology.join method documentation
See also
join
join two trajectories along the time/frame axis.
- slice(key, copy=True)
Slice trajectory, by extracting one or more frames into a separate object
This method can also be called using index bracket notation, i.e. traj[1] == traj.slice(1)
- Parameters:
key ({int, np.ndarray, slice}) – The slice to take. Can be either an int, a list of ints, or a slice object.
copy (bool, default=True) – Copy the arrays after slicing. If you set this to false, then if you modify a slice, you’ll modify the original array since they point to the same data.
- property topology
Topology of the system, describing the organization of atoms into residues, bonds, etc
- Returns:
topology – The topology object, describing the organization of atoms into residues, bonds, etc
- Return type:
md.Topology
- property xyz
Cartesian coordinates of each atom in each simulation frame
- Returns:
xyz – A three dimensional numpy array, with the cartesian coordinates of each atom in each frame.
- Return type:
np.ndarray, shape=(n_frames, n_atoms, 3)
- property unitcell_lengths
Lengths that define the shape of the unit cell in each frame.
- Returns:
lengths – Lengths of the unit cell in each frame, in nanometers, or None if the Trajectory contains no unitcell information.
- Return type:
{np.ndarray, shape=(n_frames, 3), None}
- property unitcell_angles
Angles that define the shape of the unit cell in each frame.
- Returns:
angles – The angles between the three unitcell vectors in each frame: alpha, beta, and gamma. alpha gives the angle between vectors b and c, beta gives the angle between vectors c and a, and gamma gives the angle between vectors a and b. The angles are in degrees.
- Return type:
np.ndarray, shape=(n_frames, 3)
- property time
The simulation time corresponding to each frame, in picoseconds
- Returns:
time – The simulation time corresponding to each frame, in picoseconds
- Return type:
np.ndarray, shape=(n_frames,)
- openmm_positions(frame)
OpenMM-compatible positions of a single frame.
Examples
>>> t = md.load('trajectory.h5')
>>> context.setPositions(t.openmm_positions(0))
- Parameters:
frame (int) – The index of frame of the trajectory that you wish to extract
- Returns:
positions – The cartesian coordinates of specific trajectory frame, formatted for input to OpenMM
- Return type:
list
- openmm_boxes(frame)
OpenMM-compatible box vectors of a single frame.
Examples
>>> t = md.load('trajectory.h5')
>>> context.setPeriodicBoxVectors(*t.openmm_boxes(0))
- Parameters:
frame (int) – Return box for this single frame.
- Returns:
box – The periodic box vectors for this frame, formatted for input to OpenMM.
- Return type:
tuple
- static load(filenames, **kwargs)
Load a trajectory from disk
- Parameters:
filenames ({path-like, [path-like]}) – Either a path or list of paths
extension (as requested by the various load functions; it depends on the file extension)
- save(filename, **kwargs)
Save trajectory to disk, in a format determined by the filename extension
- Parameters:
filename (path-like) – filesystem path in which to save the trajectory. The extension will be parsed and will control the format.
lossy (bool) – For .h5 or .lh5, whether or not to use compression.
no_models (bool) – For .pdb. TODO: Document this?
force_overwrite (bool) – If filename already exists, overwrite it.
- save_hdf5(filename, force_overwrite=True)
Save trajectory to MDTraj HDF5 format
- Parameters:
filename (path-like) – filesystem path in which to save the trajectory
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it's already there
- save_lammpstrj(filename, force_overwrite=True)
Save trajectory to LAMMPS custom dump format
- Parameters:
filename (path-like) – filesystem path in which to save the trajectory
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it's already there
- save_xyz(filename, force_overwrite=True)
Save trajectory to .xyz format.
- Parameters:
filename (path-like) – filesystem path in which to save the trajectory
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it's already there
- save_pdb(filename, force_overwrite=True, bfactors=None)
Save trajectory to RCSB PDB format
- Parameters:
filename (path-like) – filesystem path in which to save the trajectory
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it's already there
bfactors (array_like, default=None, shape=(n_frames, n_atoms) or (n_atoms,)) – Save bfactors with pdb file. If the array is two dimensional it should contain a bfactor for each atom in each frame of the trajectory. Otherwise, the same bfactor will be saved in each frame.
- save_xtc(filename, force_overwrite=True)
Save trajectory to Gromacs XTC format
- Parameters:
filename (path-like) – filesystem path in which to save the trajectory
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it's already there
- save_trr(filename, force_overwrite=True)
Save trajectory to Gromacs TRR format
Notes
Only the xyz coordinates and the time are saved; the velocities and forces in the TRR will be zeros
- Parameters:
filename (path-like) – filesystem path in which to save the trajectory
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it's already there
- save_dcd(filename, force_overwrite=True)
Save trajectory to CHARMM/NAMD DCD format
- Parameters:
filename (path-like) – filesystem path in which to save the trajectory
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there
- save_dtr(filename, force_overwrite=True)
Save trajectory to DESMOND DTR format
- Parameters:
filename (path-like) – filesystem path in which to save the trajectory
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there
- save_binpos(filename, force_overwrite=True)
Save trajectory to AMBER BINPOS format
- Parameters:
filename (path-like) – filesystem path in which to save the trajectory
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there
- save_mdcrd(filename, force_overwrite=True)
Save trajectory to AMBER mdcrd format
- Parameters:
filename (path-like) – filesystem path in which to save the trajectory
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there
- save_netcdf(filename, force_overwrite=True)
Save trajectory in AMBER NetCDF format
- Parameters:
filename (path-like) – filesystem path in which to save the trajectory
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there
- save_netcdfrst(filename, force_overwrite=True)
Save trajectory in AMBER NetCDF restart format
- Parameters:
filename (path-like) – filesystem path in which to save the restart
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there
Notes
NetCDF restart files can only store a single frame. If only one frame exists, “filename” will be written. Otherwise, “filename.#” will be written, where # is a zero-padded number from 1 to the total number of frames in the trajectory
- save_amberrst7(filename, force_overwrite=True)
Save trajectory in AMBER ASCII restart format
- Parameters:
filename (path-like) – filesystem path in which to save the restart
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there
Notes
Amber restart files can only store a single frame. If only one frame exists, “filename” will be written. Otherwise, “filename.#” will be written, where # is a zero-padded number from 1 to the total number of frames in the trajectory
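For illustration, the multi-frame restart naming described in the notes above can be sketched in pure Python. Note that `restart_filenames` is a hypothetical helper and the zero-padding width is an assumption (the docstring does not specify it); the actual writer may pad differently.

```python
def restart_filenames(filename, n_frames, pad=None):
    """Sketch of the "filename.#" naming scheme for multi-frame restarts.

    If only one frame exists, just ``filename`` is used; otherwise
    ``filename.#`` with a zero-padded frame number from 1 to ``n_frames``.
    The padding width defaults to the width of the largest number
    (an assumption for illustration).
    """
    if n_frames == 1:
        return [filename]
    pad = pad or len(str(n_frames))
    return [f"{filename}.{i:0{pad}d}" for i in range(1, n_frames + 1)]
```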
- save_lh5(filename, force_overwrite=True)
Save trajectory in deprecated MSMBuilder2 LH5 (lossy HDF5) format.
- Parameters:
filename (path-like) – filesystem path in which to save the trajectory
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there
- save_gro(filename, force_overwrite=True, precision=3)
Save trajectory in Gromacs .gro format
- Parameters:
filename (path-like) – Path to save the trajectory
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there
precision (int, default=3) – The number of decimal places to use for coordinates in GRO file
- save_tng(filename, force_overwrite=True)
Save trajectory to Gromacs TNG format
- Parameters:
filename (path-like) – filesystem path in which to save the trajectory
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there
- save_gsd(filename, force_overwrite=True)
Save trajectory to HOOMD GSD format
- Parameters:
filename (path-like) – filesystem path in which to save the trajectory
force_overwrite (bool, default=True) – Overwrite anything that exists at filename, if it’s already there
- center_coordinates(mass_weighted=False)
Center each trajectory frame at the origin (0,0,0).
This method acts inplace on the trajectory. The centering can be either uniformly weighted (mass_weighted=False) or weighted by the mass of each atom (mass_weighted=True).
- Parameters:
mass_weighted (bool, optional (default = False)) – If True, weight atoms by mass when removing COM.
- Return type:
self
- restrict_atoms(**kwargs)
DEPRECATED: restrict_atoms was replaced by atom_slice and will be removed in 2.0
Retain only a subset of the atoms in a trajectory
Deletes atoms not in atom_indices, and re-indexes those that remain
- Parameters:
atom_indices (array-like, dtype=int, shape=(n_atoms)) – List of atom indices to keep.
inplace (bool, default=True) – If True, the operation is done inplace, modifying self. Otherwise, a copy is returned with the restricted atoms, and self is not modified.
- Returns:
traj – The return value is either self, or the new trajectory, depending on the value of inplace.
- Return type:
md.Trajectory
- atom_slice(atom_indices, inplace=False)
Create a new trajectory from a subset of atoms
- Parameters:
atom_indices (array-like, dtype=int, shape=(n_atoms)) – List of indices of atoms to retain in the new trajectory.
inplace (bool, default=False) – If True, the operation is done inplace, modifying self. Otherwise, a copy is returned with the sliced atoms, and self is not modified.
- Returns:
traj – The return value is either self, or the new trajectory, depending on the value of inplace.
- Return type:
md.Trajectory
See also
stack
stack multiple trajectories along the atom axis
- remove_solvent(exclude=None, inplace=False)
Create a new trajectory without solvent atoms
- Parameters:
exclude (array-like, dtype=str, shape=(n_solvent_types)) – List of solvent residue names to retain in the new trajectory.
inplace (bool, default=False) – If True, the operation is done inplace, modifying self. Otherwise, a copy is returned with the solvent removed, and self is not modified.
- Returns:
traj – The return value is either self, or the new trajectory, depending on the value of inplace.
- Return type:
md.Trajectory
- smooth(width, order=3, atom_indices=None, inplace=False)
Smooth a trajectory using a zero-delay Butterworth filter. Please note that for optimal results the trajectory should be properly aligned prior to smoothing (see md.Trajectory.superpose).
- Parameters:
width (int) – This acts very similarly to the window size in a moving-average smoother. In this implementation, the frequency of the low-pass filter is taken to be two over this width, so it’s like “half the period” of the sinusoid where the filter starts to kick in. Must be an integer greater than one.
order (int, optional, default=3) – The order of the filter. A small odd number is recommended. Higher order filters cutoff more quickly, but have worse numerical properties.
atom_indices (array-like, dtype=int, shape=(n_atoms), default=None) – List of indices of atoms to retain in the new trajectory. Default is set to None, which applies smoothing to all atoms.
inplace (bool, default=False) – If True, the operation is done inplace, modifying self. Otherwise, a new smoothed trajectory is returned, and self is not modified.
- Returns:
traj – The return value is either self, or the new smoothed trajectory, depending on the value of inplace.
- Return type:
md.Trajectory
- make_molecules_whole(inplace=False, sorted_bonds=None)
Only make molecules whole
- Parameters:
inplace (bool) – If False, a new Trajectory is created and returned. If True, this Trajectory is modified directly.
sorted_bonds (array of shape (n_bonds, 2)) – Pairs of atom indices that define bonds, in sorted order. If not specified, these will be determined from the trajectory’s topology.
- image_molecules(inplace=False, anchor_molecules=None, other_molecules=None, sorted_bonds=None, make_whole=True)
Recenter and apply periodic boundary conditions to the molecules in each frame of the trajectory.
This method is useful for visualizing a trajectory in which molecules were not wrapped to the periodic unit cell, or in which the macromolecules are not centered with respect to the solvent. It tries to be intelligent in deciding what molecules to center, so you can simply call it and trust that it will “do the right thing”.
- Parameters:
inplace (bool, default=False) – If False, a new Trajectory is created and returned. If True, this Trajectory is modified directly.
anchor_molecules (list of atom sets, optional, default=None) – Molecules that should be treated as “anchors”. These molecules will be centered in the box and put near each other. If not specified, anchor molecules are guessed using a heuristic.
other_molecules (list of atom sets, optional, default=None) – Molecules that are not anchors. If not specified, these will be the molecules other than the anchor molecules.
sorted_bonds (array of shape (n_bonds, 2)) – Pairs of atom indices that define bonds, in sorted order. If not specified, these will be determined from the trajectory’s topology. Only relevant if make_whole is True.
make_whole (bool) – Whether to make molecules whole.
- Returns:
traj – The return value is either self or the new trajectory, depending on the value of inplace.
- Return type:
md.Trajectory
See also
Topology.guess_anchor_molecules
- westpa.core.h5io.join_traj(trajs, check_topology=True, discard_overlapping_frames=False)
Concatenate multiple trajectories into one long trajectory
- Parameters:
trajs (iterable of trajectories) – Combine these into one trajectory
check_topology (bool) – Make sure topologies match before joining
discard_overlapping_frames (bool) – Check for overlapping frames and discard
- westpa.core.h5io.in_units_of(quantity, units_in, units_out, inplace=False)
Convert a numerical quantity between unit systems.
- Parameters:
quantity ({number, np.ndarray, openmm.unit.Quantity}) – quantity can either be a unitted quantity – i.e. instance of openmm.unit.Quantity, or just a bare number or numpy array
units_in (str) – If you supply a quantity that’s not an openmm.unit.Quantity, you should tell me what units it is in. If you don’t, I’m just going to echo back your quantity without doing any unit checking.
units_out (str) – A string description of the units you want out. This should look like “nanometers/picosecond” or “nanometers**3” or whatever
inplace (bool) – Attempt to do the transformation inplace, by mutating the quantity argument and avoiding a copy. This is only possible if quantity is a writable numpy array.
- Returns:
rquantity – The resulting quantity, in the new unit system. If the function was called with inplace=True and quantity was a writable numpy array, rquantity will alias the same memory as the input quantity, which will have been changed inplace. Otherwise, if a copy was required, rquantity will point to new memory.
- Return type:
{number, np.ndarray}
Examples
>>> in_units_of(1, 'meter**2/second', 'nanometers**2/picosecond')
1000000.0
- westpa.core.h5io.import_(module)
Import a module, and issue a nice message to stderr if the module isn’t installed.
Currently, this function will print nice error messages for networkx, tables, netCDF4, and openmm.unit, which are optional MDTraj dependencies.
- Parameters:
module (str) – The module you’d like to import, as a string
- Returns:
module – The module object
- Return type:
{module, object}
Examples
>>> # the following two lines are equivalent. the difference is that the
>>> # second will check for an ImportError and print you a very nice
>>> # user-facing message about what's wrong (where you can install the
>>> # module from, etc) if the import fails
>>> import tables
>>> tables = import_('tables')
- westpa.core.h5io.ensure_type(val, dtype, ndim, name, length=None, can_be_none=False, shape=None, warn_on_cast=True, add_newaxis_on_deficient_ndim=False)
Typecheck the size, shape and dtype of a numpy array, with optional casting.
- Parameters:
val ({np.ndarray, None}) – The array to check
dtype ({np.dtype, str}) – The dtype you’d like the array to have
ndim (int) – The number of dimensions you’d like the array to have
name (str) – name of the array. This is used when throwing exceptions, so that we can describe to the user which array is messed up.
length (int, optional) – How long should the array be?
can_be_none (bool) – Is val == None acceptable?
shape (tuple, optional) – What should the shape of the array be? If the provided tuple has Nones in it, those will be semantically interpreted as matching any length in that dimension. So, for example, using the shape spec (None, None, 3) will ensure that the last dimension is of length three without constraining the first two dimensions.
warn_on_cast (bool, default=True) – Raise a warning when the dtypes don’t match and a cast is done.
add_newaxis_on_deficient_ndim (bool, default=False) – Add a new axis to the beginning of the array if the number of dimensions is deficient by one compared to your specification. For instance, if you’re trying to get out an array of ndim == 3, but the user provides an array of shape == (10, 10), a new axis will be created with length 1 in front, so that the return value is of shape (1, 10, 10).
Notes
The returned value will always be C-contiguous.
- Returns:
typechecked_val – If val=None and can_be_none=True, then this will return None. Otherwise, it will return val (or a copy of val). If the dtype wasn’t right, it will be cast to the right dtype. If the array was not C-contiguous, it will be copied as well.
- Return type:
np.ndarray, None
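A minimal numpy-based sketch of the checks described above (`ensure_type_sketch` is illustrative, not the real implementation, and omits the `length` and `shape` checks):

```python
import warnings
import numpy as np

def ensure_type_sketch(val, dtype, ndim, name, can_be_none=False,
                       warn_on_cast=True, add_newaxis_on_deficient_ndim=False):
    # Sketch of the typecheck/cast behavior documented above.
    if val is None:
        if can_be_none:
            return None
        raise TypeError(f"{name} must not be None")
    val = np.asarray(val)
    # Cast to the requested dtype, optionally warning.
    if val.dtype != np.dtype(dtype):
        if warn_on_cast:
            warnings.warn(f"Casting {name} from {val.dtype} to {np.dtype(dtype)}")
        val = val.astype(dtype)
    # Add a leading length-1 axis if ndim is deficient by one.
    if val.ndim == ndim - 1 and add_newaxis_on_deficient_ndim:
        val = val[np.newaxis, ...]
    if val.ndim != ndim:
        raise ValueError(f"{name} must have ndim {ndim}, got {val.ndim}")
    # The returned value is always C-contiguous.
    return np.ascontiguousarray(val)
```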
- class westpa.core.h5io.HDF5TrajectoryFile(filename, mode='r', force_overwrite=True, compression='zlib')
Bases:
object
Interface for reading and writing to a MDTraj HDF5 molecular dynamics trajectory file, whose format is described here.
This is a file-like object that supports both reading and writing, depending on the mode flag. It implements the context manager protocol, so you can also use it with the Python ‘with’ statement.
The format is extremely flexible and high performance. It can hold a wide variety of information about a trajectory, including fields like the temperature and energies. Because it’s built on the fantastic HDF5 library, it’s easily extensible too.
- Parameters:
filename (path-like) – Path to the file to open
mode ({'r', 'w'}) – Mode in which to open the file. ‘r’ is for reading and ‘w’ is for writing.
force_overwrite (bool) – In mode=’w’, how to behave if a file named filename already exists. If force_overwrite=True, it will be overwritten.
compression ({'zlib', None}) – Apply compression to the file? This will save space and does not cost too many CPU cycles, so it’s recommended.
- root
- title
- application
- topology
- randomState
- forcefield
- reference
- constraints
See also
mdtraj.load_hdf5
High-level wrapper that returns a
md.Trajectory
- distance_unit = 'nanometers'
- property root
Direct access to the root group of the underlying PyTables HDF5 file handle.
This can be used for random or specific access to the underlying arrays on disk
- property title
User-defined title for the data represented in the file
- property application
Suite of programs that created the file
- property topology
Get the topology out from the file
- Returns:
topology – A topology object
- Return type:
mdtraj.Topology
- property randomState
State of the creator’s internal random number generator at the start of the simulation
- property forcefield
Description of the Hamiltonian used. A short, human-readable string, like AMBER99sbildn.
- property reference
A published reference that documents the program or parameters used to generate the data
- property constraints
Constraints applied to the bond lengths
- Returns:
constraints – A one-dimensional array with an (int, int, float) dtype giving the indices of the two atoms involved in each constraint and the constraint distance. If no constraint information is in the file, the return value is None.
- Return type:
{None, np.ndarray, dtype=[(‘atom1’, ‘<i4’), (‘atom2’, ‘<i4’), (‘distance’, ‘<f4’)]}
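For example, a structured numpy array with this dtype can be built as follows (the atom indices and distances here are illustrative values only, not data from any real file):

```python
import numpy as np

# The structured dtype described above: two atom indices plus a
# constraint distance per entry.
constraint_dtype = np.dtype([('atom1', '<i4'), ('atom2', '<i4'), ('distance', '<f4')])

# Two hypothetical O-H constraints.
constraints = np.array([(0, 1, 0.09572), (0, 2, 0.09572)], dtype=constraint_dtype)
```

Field access then works by name: `constraints['distance']` yields the per-constraint distances as float32.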
- read_as_traj(n_frames=None, stride=None, atom_indices=None)
Read a trajectory from the HDF5 file
- Parameters:
n_frames ({int, None}) – The number of frames to read. If not supplied, all of the remaining frames will be read.
stride ({int, None}) – By default all of the frames will be read, but you can pass this flag to read a subset of the data by grabbing only every stride-th frame from disk.
atom_indices ({int, None}) – By default all of the atoms will be read, but you can pass this flag to read only a subset of the atoms for the coordinates and velocities fields. Note that you will have to carefully manage the indices and the offsets, since the i-th atom in the topology will not necessarily correspond to the i-th atom in your subset.
- Returns:
trajectory – A trajectory object containing the loaded portion of the file.
- Return type:
Trajectory
- read(n_frames=None, stride=None, atom_indices=None)
Read one or more frames of data from the file
- Parameters:
n_frames ({int, None}) – The number of frames to read. If not supplied, all of the remaining frames will be read.
stride ({int, None}) – By default all of the frames will be read, but you can pass this flag to read a subset of the data by grabbing only every stride-th frame from disk.
atom_indices ({int, None}) – By default all of the atoms will be read, but you can pass this flag to read only a subset of the atoms for the coordinates and velocities fields. Note that you will have to carefully manage the indices and the offsets, since the i-th atom in the topology will not necessarily correspond to the i-th atom in your subset.
Notes
If you’d like more flexible access to the data, that is available by using the pytables group directly, which is accessible via the root property on this class.
- Returns:
frames – The returned namedtuple will have the fields “coordinates”, “time”, “cell_lengths”, “cell_angles”, “velocities”, “kineticEnergy”, “potentialEnergy”, “temperature” and “alchemicalLambda”. Each of the fields in the returned namedtuple will either be a numpy array or None, depending on whether that data was saved in the trajectory. All of the data will be in units of “nanometers”, “picoseconds”, “kelvin”, “degrees” and “kilojoules_per_mole”.
- Return type:
namedtuple
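The stride and atom_indices semantics described above can be sketched in plain Python, using nested lists as a stand-in for the on-disk arrays (`select_frames_and_atoms` is a hypothetical helper, not part of the API):

```python
def select_frames_and_atoms(coords, stride=None, atom_indices=None):
    """Sketch of the stride / atom_indices semantics documented above.

    ``coords`` is a nested list shaped [frame][atom][xyz]. Note that atom i
    of the result is the i-th entry of ``atom_indices``, not atom i of the
    full topology -- this is the index-management caveat in the docstring.
    """
    # Keep every stride-th frame (all frames if stride is None).
    frames = coords[::stride] if stride else coords
    # Then keep only the requested atoms, in the requested order.
    if atom_indices is not None:
        frames = [[frame[i] for i in atom_indices] for frame in frames]
    return frames
```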
- write(coordinates, time=None, cell_lengths=None, cell_angles=None, velocities=None, kineticEnergy=None, potentialEnergy=None, temperature=None, alchemicalLambda=None)
Write one or more frames of data to the file
This method saves data that is associated with one or more simulation frames. Note that all of the arguments can either be raw numpy arrays or unitted arrays (with openmm.unit.Quantity). If the arrays are unitted, a unit conversion will be automatically done from the supplied units into the proper units for saving on disk. You won’t have to worry about it.
Furthermore, if you wish to save a single frame of simulation data, you can do so naturally, for instance by supplying a 2d array for the coordinates and a single float for the time. This “shape deficiency” will be recognized, and handled appropriately.
- Parameters:
coordinates (np.ndarray, shape=(n_frames, n_atoms, 3)) – The cartesian coordinates of the atoms to write. By convention, the lengths should be in units of nanometers.
time (np.ndarray, shape=(n_frames,), optional) – You may optionally specify the simulation time, in picoseconds corresponding to each frame.
cell_lengths (np.ndarray, shape=(n_frames, 3), dtype=float32, optional) – You may optionally specify the unitcell lengths. The length of the periodic box in each frame, in each direction, a, b, c. By convention the lengths should be in units of angstroms.
cell_angles (np.ndarray, shape=(n_frames, 3), dtype=float32, optional) – You may optionally specify the unitcell angles in each frame. Organized analogously to cell_lengths. Gives the alpha, beta and gamma angles respectively. By convention, the angles should be in units of degrees.
velocities (np.ndarray, shape=(n_frames, n_atoms, 3), optional) – You may optionally specify the cartesian components of the velocity for each atom in each frame. By convention, the velocities should be in units of nanometers / picosecond.
kineticEnergy (np.ndarray, shape=(n_frames,), optional) – You may optionally specify the kinetic energy in each frame. By convention the kinetic energies should be in units of kilojoules per mole.
potentialEnergy (np.ndarray, shape=(n_frames,), optional) – You may optionally specify the potential energy in each frame. By convention the potential energies should be in units of kilojoules per mole.
temperature (np.ndarray, shape=(n_frames,), optional) – You may optionally specify the temperature in each frame. By convention the temperatures should be in units of Kelvin.
alchemicalLambda (np.ndarray, shape=(n_frames,), optional) – You may optionally specify the alchemical lambda in each frame. These have no units, but are generally between zero and one.
- seek(offset, whence=0)
Move to a new file position
- Parameters:
offset (int) – A number of frames.
whence ({0, 1, 2}) – How to interpret offset: 0 – offset from the start of the file (offset should be >= 0); 1 – move relative to the current position (positive or negative); 2 – move relative to the end of the file (offset should be <= 0). Seeking beyond the end of a file is not supported.
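The whence semantics mirror ordinary file seeking. A sketch of the position arithmetic (`resolve_seek` is a hypothetical helper for illustration):

```python
def resolve_seek(offset, whence, current, n_frames):
    # Sketch of the whence semantics described above, in frames.
    if whence == 0:        # offset from the start of the file
        pos = offset
    elif whence == 1:      # relative to the current position
        pos = current + offset
    elif whence == 2:      # relative to the end of the file
        pos = n_frames + offset
    else:
        raise ValueError("whence must be 0, 1, or 2")
    if not 0 <= pos <= n_frames:
        raise ValueError("seeking beyond the end of the file is not supported")
    return pos
```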
- tell()
Current file position
- Returns:
offset – The current frame in the file.
- Return type:
int
- close()
Close the HDF5 file handle
- flush()
Write all buffered data to the disk file.
- class westpa.core.h5io.Frames(coordinates, time, cell_lengths, cell_angles, velocities, kineticEnergy, potentialEnergy, temperature, alchemicalLambda)
Bases:
tuple
Create new instance of Frames(coordinates, time, cell_lengths, cell_angles, velocities, kineticEnergy, potentialEnergy, temperature, alchemicalLambda)
- alchemicalLambda
Alias for field number 8
- cell_angles
Alias for field number 3
- cell_lengths
Alias for field number 2
- coordinates
Alias for field number 0
- kineticEnergy
Alias for field number 5
- potentialEnergy
Alias for field number 6
- temperature
Alias for field number 7
- time
Alias for field number 1
- velocities
Alias for field number 4
- class westpa.core.h5io.WESTTrajectory(coordinates, topology=None, time=None, iter_labels=None, seg_labels=None, pcoords=None, parent_ids=None, unitcell_lengths=None, unitcell_angles=None)
Bases:
Trajectory
A subclass of mdtraj.Trajectory that contains the trajectory of atom coordinates with pointers denoting the iteration number and segment index of each frame.
- iter_label_values()
- seg_label_values(iteration=None)
- property label_values
- property iter_labels
Iteration index corresponding to each frame
- Returns:
time – The iteration index corresponding to each frame
- Return type:
np.ndarray, shape=(n_frames,)
- property seg_labels
Segment index corresponding to each frame
- Returns:
time – The segment index corresponding to each frame
- Return type:
np.ndarray, shape=(n_frames,)
- property pcoords
- property parent_ids
- join(other, check_topology=True, discard_overlapping_frames=False)
Join two Trajectory objects. This overrides mdtraj.Trajectory.join so that it also handles WESTPA pointers. Please see mdtraj.Trajectory.join’s documentation for more details.
- slice(key, copy=True)
Slice the Trajectory. This overrides mdtraj.Trajectory.slice so that it also handles WESTPA pointers. Please see mdtraj.Trajectory.slice’s documentation for more details.
- westpa.core.h5io.resolve_filepath(path, constructor=<class 'h5py._hl.files.File'>, cargs=None, ckwargs=None, **addtlkwargs)
Use a combined filesystem and HDF5 path to open an HDF5 file and return the appropriate object. Returns (h5file, h5object). The file is opened using constructor(filename, *cargs, **ckwargs).
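A sketch of how such a combined path might be resolved, assuming POSIX-style paths and taking the filesystem part to be the longest prefix that is an existing file (`split_h5_path` and the injectable `isfile` predicate are illustrative; the real function then opens the file with the given constructor):

```python
import os

def split_h5_path(path, isfile=os.path.isfile):
    """Split a combined filesystem/HDF5 path into (filename, h5_path).

    Tries successively shorter prefixes until one names an existing file;
    the remainder is treated as the path within the HDF5 file.
    """
    parts = path.split('/')
    for i in range(len(parts), 0, -1):
        candidate = '/'.join(parts[:i])
        if isfile(candidate):
            return candidate, '/' + '/'.join(parts[i:])
    raise IOError(f'no file found along path {path!r}')
```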
- westpa.core.h5io.calc_chunksize(shape, dtype, max_chunksize=262144)
Calculate a chunk size for HDF5 data, anticipating that access will slice along lower dimensions sooner than higher dimensions.
- westpa.core.h5io.tostr(b)
Convert a nonstandard string object b to str, handling the case where b is bytes.
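A minimal sketch of this behavior (`tostr_sketch` is illustrative):

```python
def tostr_sketch(b):
    # Decode bytes to str; otherwise fall back to str().
    return b.decode() if isinstance(b, bytes) else str(b)
```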
- westpa.core.h5io.is_within_directory(directory, target)
- westpa.core.h5io.safe_extract(tar, path='.', members=None, *, numeric_owner=False)
- westpa.core.h5io.create_hdf5_group(parent_group, groupname, replace=False, creating_program=None)
Create (or delete and recreate) an HDF5 group named groupname within the enclosing group parent_group. If replace is True, then the group is replaced if present; if False, then an error is raised if the group is present. After the group is created, HDF5 attributes are set using stamp_creator_data.
- westpa.core.h5io.stamp_creator_data(h5group, creating_program=None)
Mark the following attributes on the HDF5 group h5group:
- creation_program:
The name of the program that created the group
- creation_user:
The username of the user who created the group
- creation_hostname:
The hostname of the machine on which the group was created
- creation_time:
The date and time at which the group was created, in the current locale.
- creation_unix_time:
The Unix time (seconds from the epoch, UTC) at which the group was created.
This is meant to facilitate tracking the flow of data, but should not be considered a secure paper trail (after all, anyone with write access to the HDF5 file can modify these attributes).
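A sketch of the stamping, with a plain dict standing in for an HDF5 group’s .attrs mapping (`stamp_creator_data_sketch` is illustrative; the real function writes these attributes to the group itself):

```python
import getpass
import socket
import time

def stamp_creator_data_sketch(attrs, creating_program=None):
    """Record the five creator attributes described above into ``attrs``."""
    now = time.time()
    try:
        user = getpass.getuser()
    except Exception:
        user = 'unknown'  # fall back when no login name is available
    attrs['creation_program'] = creating_program or ''
    attrs['creation_user'] = user
    attrs['creation_hostname'] = socket.gethostname()
    attrs['creation_time'] = time.strftime('%c', time.localtime(now))  # locale string
    attrs['creation_unix_time'] = now  # seconds from the epoch
    return attrs
```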
- westpa.core.h5io.get_creator_data(h5group)
Read back creator data as written by stamp_creator_data, returning a dictionary with keys as described for stamp_creator_data. Missing fields are denoted with None. The creation_time field is returned as a string.
- westpa.core.h5io.load_west(filename)
Load WESTPA trajectory files from disk.
- Parameters:
filename (str) – String filename of HDF Trajectory file.
- westpa.core.h5io.stamp_iter_range(h5object, start_iter, stop_iter)
Mark that the HDF5 object h5object (dataset or group) contains data from iterations start_iter <= n_iter < stop_iter.
- westpa.core.h5io.get_iter_range(h5object)
Read back iteration range data written by stamp_iter_range.
- westpa.core.h5io.stamp_iter_step(h5group, iter_step)
Mark that the HDF5 object h5group (dataset or group) contains data with an iteration step (stride) of iter_step.
- westpa.core.h5io.get_iter_step(h5group)
Read back the iteration step (stride) written by stamp_iter_step.
- westpa.core.h5io.check_iter_range_least(h5object, iter_start, iter_stop)
Return True if the iteration range [iter_start, iter_stop) is the same as or entirely contained within the iteration range stored on h5object.
- westpa.core.h5io.check_iter_range_equal(h5object, iter_start, iter_stop)
Return True if the iteration range [iter_start, iter_stop) is the same as the iteration range stored on h5object.
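Both checks reduce to simple half-open interval comparisons. A sketch, with the stored range represented as a (start, stop) tuple (the `*_sketch` helpers are illustrative):

```python
def check_iter_range_least_sketch(stored, iter_start, iter_stop):
    # Is [iter_start, iter_stop) the same as, or contained within,
    # the stored half-open range ``stored = (start, stop)``?
    return stored[0] <= iter_start and iter_stop <= stored[1]

def check_iter_range_equal_sketch(stored, iter_start, iter_stop):
    # Is [iter_start, iter_stop) exactly the stored range?
    return (iter_start, iter_stop) == tuple(stored)
```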
- westpa.core.h5io.get_iteration_entry(h5object, n_iter)
Create a slice for data corresponding to iteration n_iter in h5object.
- westpa.core.h5io.get_iteration_slice(h5object, iter_start, iter_stop=None, iter_stride=None)
Create a slice for data corresponding to iterations [iter_start, iter_stop), with stride iter_stride, in the given h5object.
- westpa.core.h5io.label_axes(h5object, labels, units=None)
Stamp the given HDF5 object with axis labels. This stores the axis labels in an array of strings in an attribute called axis_labels on the given object. units, if provided, is a corresponding list of units.
- class westpa.core.h5io.WESTPAH5File(*args, **kwargs)
Bases:
File
Generalized input/output for WESTPA simulation (or analysis) data.
Create a new file object.
See the h5py user guide for a detailed explanation of the options.
- name
Name of the file on disk, or file-like object. Note: for files created with the ‘core’ driver, HDF5 still requires this be non-empty.
- mode
r – Readonly, file must exist (default)
r+ – Read/write, file must exist
w – Create file, truncate if exists
w- or x – Create file, fail if exists
a – Read/write if exists, create otherwise
- driver
Name of the driver to use. Legal values are None (default, recommended), ‘core’, ‘sec2’, ‘direct’, ‘stdio’, ‘mpio’, ‘ros3’.
- libver
Library version bounds. Supported values: ‘earliest’, ‘v108’, ‘v110’, ‘v112’ and ‘latest’. The ‘v108’, ‘v110’ and ‘v112’ options can only be specified with the HDF5 1.10.2 library or later.
- userblock_size
Desired size of user block. Only allowed when creating a new file (mode w, w- or x).
- swmr
Open the file in SWMR read mode. Only used when mode = ‘r’.
- rdcc_nbytes
Total size of the dataset chunk cache in bytes. The default size is 1024**2 (1 MiB) per dataset. Applies to all datasets unless individually changed.
- rdcc_w0
The chunk preemption policy for all datasets. This must be between 0 and 1 inclusive and indicates the weighting according to which chunks which have been fully read or written are penalized when determining which chunks to flush from cache. A value of 0 means fully read or written chunks are treated no differently than other chunks (the preemption is strictly LRU) while a value of 1 means fully read or written chunks are always preempted before other chunks. If your application only reads or writes data once, this can be safely set to 1. Otherwise, this should be set lower depending on how often you re-read or re-write the same data. The default value is 0.75. Applies to all datasets unless individually changed.
- rdcc_nslots
The number of chunk slots in the raw data chunk cache for this file. Increasing this value reduces the number of cache collisions, but slightly increases the memory used. Due to the hashing strategy, this value should ideally be a prime number. As a rule of thumb, this value should be at least 10 times the number of chunks that can fit in rdcc_nbytes bytes. For maximum performance, this value should be set approximately 100 times that number of chunks. The default value is 521. Applies to all datasets unless individually changed.
- track_order
Track dataset/group/attribute creation order under root group if True. If None use global default h5.get_config().track_order.
- fs_strategy
The file space handling strategy to be used. Only allowed when creating a new file (mode w, w- or x). Defined as:
“fsm” – FSM, Aggregators, VFD
“page” – Paged FSM, VFD
“aggregate” – Aggregators, VFD
“none” – VFD
If None use HDF5 defaults.
- fs_page_size
File space page size in bytes. Only used when fs_strategy=”page”. If None use the HDF5 default (4096 bytes).
- fs_persist
A boolean value to indicate whether free space should be persistent or not. Only allowed when creating a new file. The default value is False.
- fs_threshold
The smallest free-space section size that the free space manager will track. Only allowed when creating a new file. The default value is 1.
- page_buf_size
Page buffer size in bytes. Only allowed for HDF5 files created with fs_strategy=”page”. Must be a power of two value and greater or equal than the file space page size when creating the file. It is not used by default.
- min_meta_keep
Minimum percentage of metadata to keep in the page buffer before allowing pages containing metadata to be evicted. Applicable only if page_buf_size is set. Default value is zero.
- min_raw_keep
Minimum percentage of raw data to keep in the page buffer before allowing pages containing raw data to be evicted. Applicable only if page_buf_size is set. Default value is zero.
- locking
The file locking behavior. Defined as:
False (or “false”) – Disable file locking
True (or “true”) – Enable file locking
“best-effort” – Enable file locking but ignore some errors
None – Use HDF5 defaults
Warning
The HDF5_USE_FILE_LOCKING environment variable can override this parameter.
Only available with HDF5 >= 1.12.1 or 1.10.x >= 1.10.7.
- alignment_threshold
Together with alignment_interval, this property ensures that any file object greater than or equal in size to the alignment threshold (in bytes) will be aligned on an address which is a multiple of alignment interval.
- alignment_interval
This property should be used in conjunction with alignment_threshold. See the description above. For more details, see https://portal.hdfgroup.org/display/HDF5/H5P_SET_ALIGNMENT
- meta_block_size
Set the current minimum size, in bytes, of new metadata block allocations. See https://portal.hdfgroup.org/display/HDF5/H5P_SET_META_BLOCK_SIZE
- Additional keywords
Passed on to the selected file driver.
- default_iter_prec = 8
- replace_dataset(*args, **kwargs)
- iter_object_name(n_iter, prefix='', suffix='')
Return a properly-formatted per-iteration name for iteration n_iter. (This is used in create/require/get_iter_group, but may also be useful for naming datasets on a per-iteration basis.)
- create_iter_group(n_iter, group=None)
Create a per-iteration data storage group for iteration number n_iter in the group group (which is '/iterations' by default).
- require_iter_group(n_iter, group=None)
Ensure that a per-iteration data storage group for iteration number n_iter is available in the group group (which is '/iterations' by default).
- get_iter_group(n_iter, group=None)
Get the per-iteration data group for iteration number n_iter from within the group group ('/iterations' by default).
- class westpa.core.h5io.WESTIterationFile(file, mode='r', force_overwrite=True, compression='zlib', link=None)
Bases:
HDF5TrajectoryFile
- read(frame_indices=None, atom_indices=None)
Read one or more frames of data from the file
- Parameters:
n_frames ({int, None}) – The number of frames to read. If not supplied, all of the remaining frames will be read.
stride ({int, None}) – By default all of the frames will be read, but you can pass this flag to read a subset of the data by grabbing only every stride-th frame from disk.
atom_indices ({int, None}) – By default all of the atoms will be read, but you can pass this flag to read only a subset of the atoms for the coordinates and velocities fields. Note that you will have to carefully manage the indices and the offsets, since the i-th atom in the topology will not necessarily correspond to the i-th atom in your subset.
Notes
If you’d like more flexible access to the data, that is available by using the pytables group directly, which is accessible via the root property on this class.
- Returns:
frames – The returned namedtuple will have the fields "coordinates", "time", "cell_lengths", "cell_angles", "velocities", "kineticEnergy", "potentialEnergy", "temperature" and "alchemicalLambda". Each of the fields in the returned namedtuple will either be a numpy array or None, depending on if that data was saved in the trajectory. All of the data is in units of "nanometers", "picoseconds", "kelvin", "degrees" and "kilojoules_per_mole".
- Return type:
namedtuple
- has_topology()
- has_pointer()
- has_restart(segment)
- write_data(where, name, data)
- read_data(where, name)
- read_as_traj(iteration=None, segment=None, atom_indices=None)
Read a trajectory from the HDF5 file
- Parameters:
n_frames ({int, None}) – The number of frames to read. If not supplied, all of the remaining frames will be read.
stride ({int, None}) – By default all of the frames will be read, but you can pass this flag to read a subset of the data by grabbing only every stride-th frame from disk.
atom_indices ({int, None}) – By default all of the atoms will be read, but you can pass this flag to read only a subset of the atoms for the coordinates and velocities fields. Note that you will have to carefully manage the indices and the offsets, since the i-th atom in the topology will not necessarily correspond to the i-th atom in your subset.
- Returns:
trajectory – A trajectory object containing the loaded portion of the file.
- Return type:
- read_restart(segment)
- write_segment(segment, pop=False)
- class westpa.core.h5io.DSSpec
Bases:
object
Generalized WE dataset access
- get_iter_data(n_iter, seg_slice=(slice(None, None, None),))
- get_segment_data(n_iter, seg_id)
- class westpa.core.h5io.FileLinkedDSSpec(h5file_or_name)
Bases:
DSSpec
Provide facilities for accessing WESTPA HDF5 files, including auto-opening and the ability to pickle references to such files for transmission (through, e.g., the work manager), provided that the HDF5 file can be accessed by the same path on both the sender and receiver.
- property h5file
Lazily open HDF5 file. This is required because allowing an open HDF5 file to cross a fork() boundary generally corrupts the internal state of the HDF5 library.
- class westpa.core.h5io.SingleDSSpec(h5file_or_name, dsname, alias=None, slice=None)
Bases:
FileLinkedDSSpec
- classmethod from_string(dsspec_string, default_h5file)
- class westpa.core.h5io.SingleIterDSSpec(h5file_or_name, dsname, alias=None, slice=None)
Bases:
SingleDSSpec
- get_iter_data(n_iter, seg_slice=(slice(None, None, None),))
- class westpa.core.h5io.SingleSegmentDSSpec(h5file_or_name, dsname, alias=None, slice=None)
Bases:
SingleDSSpec
- get_iter_data(n_iter, seg_slice=(slice(None, None, None),))
- get_segment_data(n_iter, seg_id)
- class westpa.core.h5io.FnDSSpec(h5file_or_name, fn)
Bases:
FileLinkedDSSpec
- get_iter_data(n_iter, seg_slice=(slice(None, None, None),))
- class westpa.core.h5io.MultiDSSpec(dsspecs)
Bases:
DSSpec
- get_iter_data(n_iter, seg_slice=(slice(None, None, None),))
- class westpa.core.h5io.IterBlockedDataset(dataset_or_array, attrs=None)
Bases:
object
- classmethod empty_like(blocked_dataset)
- cache_data(max_size=None)
Cache this dataset in RAM. If max_size is given, then only cache if the entire dataset fits in max_size bytes. If max_size is the string 'available', then only cache if the entire dataset fits in available RAM, as defined by the psutil module.
- drop_cache()
- iter_entry(n_iter)
- iter_slice(start=None, stop=None)
westpa.core.progress module
- westpa.core.progress.linregress(x, y=None, alternative='two-sided')
Calculate a linear least-squares regression for two sets of measurements.
- Parameters:
x, y (array_like) – Two sets of measurements. Both arrays should have the same length. If only x is given (and y=None), then it must be a two-dimensional array where one dimension has length 2. The two sets of measurements are then found by splitting the array along the length-2 dimension. In the case where y=None and x is a 2x2 array, linregress(x) is equivalent to linregress(x[0], x[1]).
alternative ({'two-sided', 'less', 'greater'}, optional) – Defines the alternative hypothesis. Default is 'two-sided'. The following options are available:
’two-sided’: the slope of the regression line is nonzero
’less’: the slope of the regression line is less than zero
’greater’: the slope of the regression line is greater than zero
Added in version 1.7.0.
- Returns:
result – The return value is an object with the following attributes:
- slope : float
Slope of the regression line.
- intercept : float
Intercept of the regression line.
- rvalue : float
The Pearson correlation coefficient. The square of rvalue is equal to the coefficient of determination.
- pvalue : float
The p-value for a hypothesis test whose null hypothesis is that the slope is zero, using the Wald Test with t-distribution of the test statistic. See alternative above for alternative hypotheses.
- stderr : float
Standard error of the estimated slope (gradient), under the assumption of residual normality.
- intercept_stderr : float
Standard error of the estimated intercept, under the assumption of residual normality.
- Return type:
LinregressResult
instance
See also
scipy.optimize.curve_fit
Use non-linear least squares to fit a function to data.
scipy.optimize.leastsq
Minimize the sum of squares of a set of equations.
Notes
Missing values are considered pair-wise: if a value is missing in x, the corresponding value in y is masked.
For compatibility with older versions of SciPy, the return value acts like a namedtuple of length 5, with fields slope, intercept, rvalue, pvalue and stderr, so one can continue to write:
slope, intercept, r, p, se = linregress(x, y)
With that style, however, the standard error of the intercept is not available. To have access to all the computed values, including the standard error of the intercept, use the return value as an object with attributes, e.g.:
result = linregress(x, y)
print(result.intercept, result.intercept_stderr)
Examples
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from scipy import stats
>>> rng = np.random.default_rng()
Generate some data:
>>> x = rng.random(10)
>>> y = 1.6*x + rng.random(10)
Perform the linear regression:
>>> res = stats.linregress(x, y)
Coefficient of determination (R-squared):
>>> print(f"R-squared: {res.rvalue**2:.6f}")
R-squared: 0.717533
Plot the data along with the fitted line:
>>> plt.plot(x, y, 'o', label='original data')
>>> plt.plot(x, res.intercept + res.slope*x, 'r', label='fitted line')
>>> plt.legend()
>>> plt.show()
Calculate 95% confidence interval on slope and intercept:
>>> # Two-sided inverse Students t-distribution
>>> # p - probability, df - degrees of freedom
>>> from scipy.stats import t
>>> tinv = lambda p, df: abs(t.ppf(p/2, df))
>>> ts = tinv(0.05, len(x)-2)
>>> print(f"slope (95%): {res.slope:.6f} +/- {ts*res.stderr:.6f}")
slope (95%): 1.453392 +/- 0.743465
>>> print(f"intercept (95%): {res.intercept:.6f}"
...       f" +/- {ts*res.intercept_stderr:.6f}")
intercept (95%): 0.616950 +/- 0.544475
- westpa.core.progress.nop()
westpa.core.segment module
- class westpa.core.segment.Segment(n_iter=None, seg_id=None, weight=None, endpoint_type=None, parent_id=None, wtg_parent_ids=None, pcoord=None, status=None, walltime=None, cputime=None, data=None)
Bases:
object
A class wrapping segment data that must be passed through the work manager or data manager. Most fields are self-explanatory. One item worth noting is that a negative parent ID means that the segment starts from the initial state with ID -(segment.parent_id+1)
- SEG_STATUS_UNSET = 0
- SEG_STATUS_PREPARED = 1
- SEG_STATUS_COMPLETE = 2
- SEG_STATUS_FAILED = 3
- SEG_INITPOINT_UNSET = 0
- SEG_INITPOINT_CONTINUES = 1
- SEG_INITPOINT_NEWTRAJ = 2
- SEG_ENDPOINT_UNSET = 0
- SEG_ENDPOINT_CONTINUES = 1
- SEG_ENDPOINT_MERGED = 2
- SEG_ENDPOINT_RECYCLED = 3
- statuses = {'SEG_STATUS_COMPLETE': 2, 'SEG_STATUS_FAILED': 3, 'SEG_STATUS_PREPARED': 1, 'SEG_STATUS_UNSET': 0}
- initpoint_types = {'SEG_INITPOINT_CONTINUES': 1, 'SEG_INITPOINT_NEWTRAJ': 2, 'SEG_INITPOINT_UNSET': 0}
- endpoint_types = {'SEG_ENDPOINT_CONTINUES': 1, 'SEG_ENDPOINT_MERGED': 2, 'SEG_ENDPOINT_RECYCLED': 3, 'SEG_ENDPOINT_UNSET': 0}
- status_names = {0: 'SEG_STATUS_UNSET', 1: 'SEG_STATUS_PREPARED', 2: 'SEG_STATUS_COMPLETE', 3: 'SEG_STATUS_FAILED'}
- initpoint_type_names = {0: 'SEG_INITPOINT_UNSET', 1: 'SEG_INITPOINT_CONTINUES', 2: 'SEG_INITPOINT_NEWTRAJ'}
- endpoint_type_names = {0: 'SEG_ENDPOINT_UNSET', 1: 'SEG_ENDPOINT_CONTINUES', 2: 'SEG_ENDPOINT_MERGED', 3: 'SEG_ENDPOINT_RECYCLED'}
- static initial_pcoord(segment)
Return the initial progress coordinate point of this segment.
- static final_pcoord(segment)
Return the final progress coordinate point of this segment.
- property initpoint_type
- property initial_state_id
- property status_text
- property endpoint_type_text
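The negative parent ID convention noted in the class description can be decoded as follows. The helper name below is illustrative only, not part of the Segment API:

```python
# Decode the Segment parent_id convention: a non-negative parent_id refers
# to a parent segment, while a negative parent_id means the segment starts
# from the initial state with ID -(parent_id + 1).
def parent_or_istate(parent_id):
    """Return ('segment', seg_id) or ('initial_state', istate_id)."""
    if parent_id >= 0:
        return ('segment', parent_id)
    return ('initial_state', -(parent_id + 1))

print(parent_or_istate(5))    # ('segment', 5)
print(parent_or_istate(-1))   # ('initial_state', 0)
print(parent_or_istate(-3))   # ('initial_state', 2)
```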
westpa.core.sim_manager module
- class westpa.core.sim_manager.timedelta
Bases:
object
Difference between two datetime values.
timedelta(days=0, seconds=0, microseconds=0, milliseconds=0, minutes=0, hours=0, weeks=0)
All arguments are optional and default to 0. Arguments may be integers or floats, and may be positive or negative.
- days
Number of days.
- max = datetime.timedelta(days=999999999, seconds=86399, microseconds=999999)
- microseconds
Number of microseconds (>= 0 and less than 1 second).
- min = datetime.timedelta(days=-999999999)
- resolution = datetime.timedelta(microseconds=1)
- seconds
Number of seconds (>= 0 and less than 1 day).
- total_seconds()
Total seconds in the duration.
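For example, total_seconds() collapses a mixed days/hours/seconds duration into a single float:

```python
from datetime import timedelta

# total_seconds() combines days, seconds, and microseconds into one value:
# 1 day (86400 s) + 2 hours (7200 s) + 30 s = 93630 s.
delta = timedelta(days=1, hours=2, seconds=30)
print(delta.total_seconds())  # 93630.0
```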
- class westpa.core.sim_manager.zip_longest
Bases:
object
zip_longest(iter1 [,iter2 […]], [fillvalue=None]) –> zip_longest object
Return a zip_longest object whose .__next__() method returns a tuple where the i-th element comes from the i-th iterable argument. The .__next__() method continues until the longest iterable in the argument sequence is exhausted and then it raises StopIteration. When the shorter iterables are exhausted, the fillvalue is substituted in their place. The fillvalue defaults to None or can be specified by a keyword argument.
- exception westpa.core.sim_manager.PickleError
Bases:
Exception
- westpa.core.sim_manager.weight_dtype
alias of
float64
- class westpa.core.sim_manager.Segment(n_iter=None, seg_id=None, weight=None, endpoint_type=None, parent_id=None, wtg_parent_ids=None, pcoord=None, status=None, walltime=None, cputime=None, data=None)
Bases:
object
A class wrapping segment data that must be passed through the work manager or data manager. Most fields are self-explanatory. One item worth noting is that a negative parent ID means that the segment starts from the initial state with ID -(segment.parent_id+1)
- SEG_STATUS_UNSET = 0
- SEG_STATUS_PREPARED = 1
- SEG_STATUS_COMPLETE = 2
- SEG_STATUS_FAILED = 3
- SEG_INITPOINT_UNSET = 0
- SEG_INITPOINT_CONTINUES = 1
- SEG_INITPOINT_NEWTRAJ = 2
- SEG_ENDPOINT_UNSET = 0
- SEG_ENDPOINT_CONTINUES = 1
- SEG_ENDPOINT_MERGED = 2
- SEG_ENDPOINT_RECYCLED = 3
- statuses = {'SEG_STATUS_COMPLETE': 2, 'SEG_STATUS_FAILED': 3, 'SEG_STATUS_PREPARED': 1, 'SEG_STATUS_UNSET': 0}
- initpoint_types = {'SEG_INITPOINT_CONTINUES': 1, 'SEG_INITPOINT_NEWTRAJ': 2, 'SEG_INITPOINT_UNSET': 0}
- endpoint_types = {'SEG_ENDPOINT_CONTINUES': 1, 'SEG_ENDPOINT_MERGED': 2, 'SEG_ENDPOINT_RECYCLED': 3, 'SEG_ENDPOINT_UNSET': 0}
- status_names = {0: 'SEG_STATUS_UNSET', 1: 'SEG_STATUS_PREPARED', 2: 'SEG_STATUS_COMPLETE', 3: 'SEG_STATUS_FAILED'}
- initpoint_type_names = {0: 'SEG_INITPOINT_UNSET', 1: 'SEG_INITPOINT_CONTINUES', 2: 'SEG_INITPOINT_NEWTRAJ'}
- endpoint_type_names = {0: 'SEG_ENDPOINT_UNSET', 1: 'SEG_ENDPOINT_CONTINUES', 2: 'SEG_ENDPOINT_MERGED', 3: 'SEG_ENDPOINT_RECYCLED'}
- static initial_pcoord(segment)
Return the initial progress coordinate point of this segment.
- static final_pcoord(segment)
Return the final progress coordinate point of this segment.
- property initpoint_type
- property initial_state_id
- property status_text
- property endpoint_type_text
- class westpa.core.sim_manager.InitialState(state_id, basis_state_id, iter_created, iter_used=None, istate_type=None, istate_status=None, pcoord=None, basis_state=None, basis_auxref=None)
Bases:
object
Describes an initial state for a new trajectory. These are generally constructed by appropriate modification of a basis state.
- Variables:
state_id – Integer identifier of this state, usually set by the data manager.
basis_state_id – Identifier of the basis state from which this state was generated, or None.
basis_state – The BasisState from which this state was generated, or None.
iter_created – Iteration in which this state was generated (0 for simulation initialization).
iter_used – Iteration in which this state was used to initiate a trajectory (None for unused).
istate_type – Integer describing the type of this initial state (ISTATE_TYPE_BASIS for direct use of a basis state, ISTATE_TYPE_GENERATED for a state generated from a basis state, ISTATE_TYPE_RESTART for a state corresponding to the endpoint of a segment in another simulation, or ISTATE_TYPE_START for a state generated from a start state).
istate_status – Integer describing whether this initial state has been properly prepared.
pcoord – The representative progress coordinate of this state.
- ISTATE_TYPE_UNSET = 0
- ISTATE_TYPE_BASIS = 1
- ISTATE_TYPE_GENERATED = 2
- ISTATE_TYPE_RESTART = 3
- ISTATE_TYPE_START = 4
- ISTATE_UNUSED = 0
- ISTATE_STATUS_PENDING = 0
- ISTATE_STATUS_PREPARED = 1
- ISTATE_STATUS_FAILED = 2
- istate_types = {'ISTATE_TYPE_BASIS': 1, 'ISTATE_TYPE_GENERATED': 2, 'ISTATE_TYPE_RESTART': 3, 'ISTATE_TYPE_START': 4, 'ISTATE_TYPE_UNSET': 0}
- istate_type_names = {0: 'ISTATE_TYPE_UNSET', 1: 'ISTATE_TYPE_BASIS', 2: 'ISTATE_TYPE_GENERATED', 3: 'ISTATE_TYPE_RESTART', 4: 'ISTATE_TYPE_START'}
- istate_statuses = {'ISTATE_STATUS_FAILED': 2, 'ISTATE_STATUS_PENDING': 0, 'ISTATE_STATUS_PREPARED': 1}
- istate_status_names = {0: 'ISTATE_STATUS_PENDING', 1: 'ISTATE_STATUS_PREPARED', 2: 'ISTATE_STATUS_FAILED'}
- as_numpy_record()
- westpa.core.sim_manager.grouper(n, iterable, fillvalue=None)
Collect data into fixed-length chunks or blocks.
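The docstring above matches the standard itertools "grouper" recipe; a sketch consistent with that signature (the actual sim_manager implementation is not reproduced here):

```python
from itertools import zip_longest

# Standard itertools "grouper" recipe: split an iterable into fixed-length
# chunks, padding the last chunk with fillvalue if needed.
def grouper(n, iterable, fillvalue=None):
    """grouper(3, 'ABCDEFG', fillvalue='x') -> ABC DEF Gxx"""
    args = [iter(iterable)] * n  # n references to the SAME iterator
    return zip_longest(*args, fillvalue=fillvalue)

print(list(grouper(3, 'ABCDEFG', fillvalue='x')))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
```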
- exception westpa.core.sim_manager.PropagationError
Bases:
RuntimeError
- class westpa.core.sim_manager.WESimManager(rc=None)
Bases:
object
- process_config()
- register_callback(hook, function, priority=0)
Registers a callback to execute during the given hook into the simulation loop. The optional priority is used to order when the function is called relative to other registered callbacks.
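A minimal sketch of the hook/priority semantics described above, using a hypothetical registry class. WESimManager's internal bookkeeping may differ; ascending-priority invocation order is an assumption here.

```python
from collections import defaultdict

# Illustrative hook registry: callbacks are stored per hook with a priority,
# and invoked in ascending priority order (an assumption for this sketch).
class CallbackRegistry:
    def __init__(self):
        self._callbacks = defaultdict(list)

    def register_callback(self, hook, function, priority=0):
        self._callbacks[hook].append((priority, function))

    def invoke_callbacks(self, hook, *args, **kwargs):
        # Sort by priority only, so uncomparable functions never collide.
        for _, fn in sorted(self._callbacks[hook], key=lambda item: item[0]):
            fn(*args, **kwargs)

calls = []
reg = CallbackRegistry()
reg.register_callback('pre_iteration', lambda: calls.append('late'), priority=10)
reg.register_callback('pre_iteration', lambda: calls.append('early'), priority=0)
reg.invoke_callbacks('pre_iteration')
print(calls)  # ['early', 'late']
```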
- invoke_callbacks(hook, *args, **kwargs)
- load_plugins(plugins=None)
- report_bin_statistics(bins, target_states, save_summary=False)
- get_bstate_pcoords(basis_states, label='basis')
For each of the given basis_states, calculate progress coordinate values as necessary. The HDF5 file is not updated.
- report_basis_states(basis_states, label='basis')
- report_target_states(target_states)
- initialize_simulation(basis_states, target_states, start_states, segs_per_state=1, suppress_we=False)
Initialize a new weighted ensemble simulation, taking segs_per_state initial states from each of the given basis_states.
w_init is the forward-facing version of this function.
- prepare_iteration()
- finalize_iteration()
Clean up after an iteration and prepare for the next.
- get_istate_futures()
Add n_states initial states to the internal list of initial states assigned to recycled particles. Spare states are used if available; otherwise, new states are created. If creating new initial states requires generation, then a set of futures is returned representing work manager tasks corresponding to the necessary generation work.
- propagate()
- save_bin_data()
Calculate and write flux and transition count matrices to HDF5. Population and rate matrices are likely useless at the single-tau level and are no longer written.
- check_propagation()
Check for failures in propagation or initial state generation, and raise an exception if any are found.
- run_we()
Run the weighted ensemble algorithm based on the binning in self.final_bins and the recycled particles in self.to_recycle, creating and committing the next iteration’s segments to storage as well.
- prepare_new_iteration()
Commit data for the coming iteration to the HDF5 file.
- run()
- prepare_run()
Prepare a new run.
- finalize_run()
Perform cleanup at the normal end of a run
- pre_propagation()
- post_propagation()
- pre_we()
- post_we()
westpa.core.states module
- class westpa.core.states.Segment(n_iter=None, seg_id=None, weight=None, endpoint_type=None, parent_id=None, wtg_parent_ids=None, pcoord=None, status=None, walltime=None, cputime=None, data=None)
Bases:
object
A class wrapping segment data that must be passed through the work manager or data manager. Most fields are self-explanatory. One item worth noting is that a negative parent ID means that the segment starts from the initial state with ID -(segment.parent_id+1)
- SEG_STATUS_UNSET = 0
- SEG_STATUS_PREPARED = 1
- SEG_STATUS_COMPLETE = 2
- SEG_STATUS_FAILED = 3
- SEG_INITPOINT_UNSET = 0
- SEG_INITPOINT_CONTINUES = 1
- SEG_INITPOINT_NEWTRAJ = 2
- SEG_ENDPOINT_UNSET = 0
- SEG_ENDPOINT_CONTINUES = 1
- SEG_ENDPOINT_MERGED = 2
- SEG_ENDPOINT_RECYCLED = 3
- statuses = {'SEG_STATUS_COMPLETE': 2, 'SEG_STATUS_FAILED': 3, 'SEG_STATUS_PREPARED': 1, 'SEG_STATUS_UNSET': 0}
- initpoint_types = {'SEG_INITPOINT_CONTINUES': 1, 'SEG_INITPOINT_NEWTRAJ': 2, 'SEG_INITPOINT_UNSET': 0}
- endpoint_types = {'SEG_ENDPOINT_CONTINUES': 1, 'SEG_ENDPOINT_MERGED': 2, 'SEG_ENDPOINT_RECYCLED': 3, 'SEG_ENDPOINT_UNSET': 0}
- status_names = {0: 'SEG_STATUS_UNSET', 1: 'SEG_STATUS_PREPARED', 2: 'SEG_STATUS_COMPLETE', 3: 'SEG_STATUS_FAILED'}
- initpoint_type_names = {0: 'SEG_INITPOINT_UNSET', 1: 'SEG_INITPOINT_CONTINUES', 2: 'SEG_INITPOINT_NEWTRAJ'}
- endpoint_type_names = {0: 'SEG_ENDPOINT_UNSET', 1: 'SEG_ENDPOINT_CONTINUES', 2: 'SEG_ENDPOINT_MERGED', 3: 'SEG_ENDPOINT_RECYCLED'}
- static initial_pcoord(segment)
Return the initial progress coordinate point of this segment.
- static final_pcoord(segment)
Return the final progress coordinate point of this segment.
- property initpoint_type
- property initial_state_id
- property status_text
- property endpoint_type_text
- class westpa.core.states.BasisState(label, probability, pcoord=None, auxref=None, state_id=None)
Bases:
object
Describes a basis (micro)state. These basis states are used to generate initial states for new trajectories, either at the beginning of the simulation (i.e. at w_init) or due to recycling.
- Variables:
state_id – Integer identifier of this state, usually set by the data manager.
label – A descriptive label for this microstate (may be empty)
probability – Probability of this state to be selected when creating a new trajectory.
pcoord – The representative progress coordinate of this state.
auxref – A user-provided (string) reference for locating data associated with this state (usually a filesystem path).
- classmethod states_to_file(states, fileobj)
Write a file defining basis states, which may then be read by states_from_file().
- classmethod states_from_file(statefile)
Read a file defining basis states. Each line defines a state, and contains a label, the probability, and optionally a data reference, separated by whitespace, as in:
unbound 1.0
or:
unbound_0 0.6 state0.pdb
unbound_1 0.4 state1.pdb
- as_numpy_record()
Return the data for this state as a numpy record array.
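A sketch of reading the whitespace-separated basis-state format shown above (label, probability, optional data reference). parse_basis_states is a hypothetical helper for illustration; BasisState.states_from_file is the real reader.

```python
# Parse lines of the form "<label> <probability> [<auxref>]", as in the
# basis-state file format described above. Illustrative helper only.
def parse_basis_states(lines):
    states = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        label, probability = fields[0], float(fields[1])
        auxref = fields[2] if len(fields) > 2 else None
        states.append((label, probability, auxref))
    return states

text = """unbound_0 0.6 state0.pdb
unbound_1 0.4 state1.pdb"""
print(parse_basis_states(text.splitlines()))
# [('unbound_0', 0.6, 'state0.pdb'), ('unbound_1', 0.4, 'state1.pdb')]
```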
- class westpa.core.states.InitialState(state_id, basis_state_id, iter_created, iter_used=None, istate_type=None, istate_status=None, pcoord=None, basis_state=None, basis_auxref=None)
Bases:
object
Describes an initial state for a new trajectory. These are generally constructed by appropriate modification of a basis state.
- Variables:
state_id – Integer identifier of this state, usually set by the data manager.
basis_state_id – Identifier of the basis state from which this state was generated, or None.
basis_state – The BasisState from which this state was generated, or None.
iter_created – Iteration in which this state was generated (0 for simulation initialization).
iter_used – Iteration in which this state was used to initiate a trajectory (None for unused).
istate_type – Integer describing the type of this initial state (ISTATE_TYPE_BASIS for direct use of a basis state, ISTATE_TYPE_GENERATED for a state generated from a basis state, ISTATE_TYPE_RESTART for a state corresponding to the endpoint of a segment in another simulation, or ISTATE_TYPE_START for a state generated from a start state).
istate_status – Integer describing whether this initial state has been properly prepared.
pcoord – The representative progress coordinate of this state.
- ISTATE_TYPE_UNSET = 0
- ISTATE_TYPE_BASIS = 1
- ISTATE_TYPE_GENERATED = 2
- ISTATE_TYPE_RESTART = 3
- ISTATE_TYPE_START = 4
- ISTATE_UNUSED = 0
- ISTATE_STATUS_PENDING = 0
- ISTATE_STATUS_PREPARED = 1
- ISTATE_STATUS_FAILED = 2
- istate_types = {'ISTATE_TYPE_BASIS': 1, 'ISTATE_TYPE_GENERATED': 2, 'ISTATE_TYPE_RESTART': 3, 'ISTATE_TYPE_START': 4, 'ISTATE_TYPE_UNSET': 0}
- istate_type_names = {0: 'ISTATE_TYPE_UNSET', 1: 'ISTATE_TYPE_BASIS', 2: 'ISTATE_TYPE_GENERATED', 3: 'ISTATE_TYPE_RESTART', 4: 'ISTATE_TYPE_START'}
- istate_statuses = {'ISTATE_STATUS_FAILED': 2, 'ISTATE_STATUS_PENDING': 0, 'ISTATE_STATUS_PREPARED': 1}
- istate_status_names = {0: 'ISTATE_STATUS_PENDING', 1: 'ISTATE_STATUS_PREPARED', 2: 'ISTATE_STATUS_FAILED'}
- as_numpy_record()
- class westpa.core.states.TargetState(label, pcoord, state_id=None)
Bases:
object
Describes a target state.
- Variables:
state_id – Integer identifier of this state, usually set by the data manager.
label – A descriptive label for this microstate (may be empty)
pcoord – The representative progress coordinate of this state.
- classmethod states_to_file(states, fileobj)
Write a file defining basis states, which may then be read by states_from_file().
- classmethod states_from_file(statefile, dtype)
Read a file defining target states. Each line defines a state, and contains a label followed by a representative progress coordinate value, separated by whitespace, as in:
bound 0.02
for a single target and one-dimensional progress coordinates or:
bound 2.7 0.0
drift 100 50.0
for two targets and a two-dimensional progress coordinate.
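The same whitespace-separated layout generalizes to any number of progress-coordinate dimensions: one label followed by one value per dimension. A hypothetical parser sketch (TargetState.states_from_file is the real reader):

```python
# Parse lines of the form "<label> <pcoord_0> [<pcoord_1> ...]", as in the
# target-state file format described above. Illustrative helper only.
def parse_target_states(lines):
    states = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        states.append((fields[0], [float(x) for x in fields[1:]]))
    return states

text = """bound 2.7 0.0
drift 100 50.0"""
print(parse_target_states(text.splitlines()))
# [('bound', [2.7, 0.0]), ('drift', [100.0, 50.0])]
```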
- westpa.core.states.pare_basis_initial_states(basis_states, initial_states, segments=None)
Given iterables of basis and initial states (and optionally segments that use them), return minimal sets (as in __builtins__.set) of states needed to describe the history of the given segments and initial states.
- westpa.core.states.return_state_type(state_obj)
Convenience function for returning the state ID and type of the state_obj pointer.
westpa.core.systems module
- class westpa.core.systems.NopMapper
Bases:
BinMapper
Put everything into one bin.
- assign(coords, mask=None, output=None)
- class westpa.core.systems.WESTSystem(rc=None)
Bases:
object
A description of the system being simulated, including the dimensionality and data type of the progress coordinate, the number of progress coordinate entries expected from each segment, and binning. To construct a simulation, the user must subclass WESTSystem and set several instance variables.
At a minimum, the user must subclass WESTSystem and override initialize() to set the data type and dimensionality of progress coordinate data and define a bin mapper.
- Variables:
pcoord_ndim – The number of dimensions in the progress coordinate. Defaults to 1 (i.e. a one-dimensional progress coordinate).
pcoord_dtype – The data type of the progress coordinate, which must be callable (e.g. np.float32 and long will work, but '<f4' and '<i8' will not). Defaults to np.float64.
pcoord_len – The length of the progress coordinate time series generated by each segment, including both the initial and final values. Defaults to 2 (i.e. only the initial and final progress coordinate values for a segment are returned from propagation).
bin_mapper – A bin mapper describing the progress coordinate space.
bin_target_counts – A vector of target counts, one per bin.
- property bin_target_counts
- initialize()
Prepare this system object for use in simulation or analysis, creating a bin space, setting replicas per bin, and so on. This function is called whenever a WEST tool creates an instance of the system driver.
- prepare_run()
Prepare this system for use in a simulation run. Called by w_run in all worker processes.
- finalize_run()
A hook for system-specific processing for the end of a simulation run (as defined by such things as maximum wallclock time, rather than perhaps more scientifically-significant definitions of “the end of a simulation run”)
- new_pcoord_array(pcoord_len=None)
Return an appropriately-sized and -typed pcoord array for a timepoint, segment, or number of segments. If pcoord_len is not specified (or None), then a length appropriate for a segment is returned.
- new_region_set()
westpa.core.textio module
Miscellaneous routines to help with input and output of WEST-related data in text format
- class westpa.core.textio.NumericTextOutputFormatter(output_file, mode='wt', emit_header=None)
Bases:
object
- comment_string = '# '
- emit_header = True
- close()
- write(str)
- writelines(sequence)
- write_comment(line)
Writes a line beginning with the comment string
- write_header(line)
Appends a line to those written when the file header is written. The appropriate comment string will be prepended, so line should not include a comment character.
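A sketch of the comment-prefixed header behavior described above, written against an in-memory buffer rather than the real class. The helper and buffer usage are illustrative, not the NumericTextOutputFormatter API.

```python
import io

# Header lines get the comment string prepended (comment_string = '# '),
# so downstream numeric parsers can skip them; data lines pass through.
COMMENT = '# '

def write_with_header(buf, header_lines, data_lines):
    for line in header_lines:
        buf.write(COMMENT + line + '\n')  # commented header
    for line in data_lines:
        buf.write(line + '\n')            # raw numeric data

buf = io.StringIO()
write_with_header(buf, ['column 0: time', 'column 1: flux'], ['0 0.01', '1 0.02'])
print(buf.getvalue())
# # column 0: time
# # column 1: flux
# 0 0.01
# 1 0.02
```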
westpa.core.we_driver module
- class westpa.core.we_driver.Segment(n_iter=None, seg_id=None, weight=None, endpoint_type=None, parent_id=None, wtg_parent_ids=None, pcoord=None, status=None, walltime=None, cputime=None, data=None)
Bases:
object
A class wrapping segment data that must be passed through the work manager or data manager. Most fields are self-explanatory. One item worth noting is that a negative parent ID means that the segment starts from the initial state with ID -(segment.parent_id+1)
- SEG_STATUS_UNSET = 0
- SEG_STATUS_PREPARED = 1
- SEG_STATUS_COMPLETE = 2
- SEG_STATUS_FAILED = 3
- SEG_INITPOINT_UNSET = 0
- SEG_INITPOINT_CONTINUES = 1
- SEG_INITPOINT_NEWTRAJ = 2
- SEG_ENDPOINT_UNSET = 0
- SEG_ENDPOINT_CONTINUES = 1
- SEG_ENDPOINT_MERGED = 2
- SEG_ENDPOINT_RECYCLED = 3
- statuses = {'SEG_STATUS_COMPLETE': 2, 'SEG_STATUS_FAILED': 3, 'SEG_STATUS_PREPARED': 1, 'SEG_STATUS_UNSET': 0}
- initpoint_types = {'SEG_INITPOINT_CONTINUES': 1, 'SEG_INITPOINT_NEWTRAJ': 2, 'SEG_INITPOINT_UNSET': 0}
- endpoint_types = {'SEG_ENDPOINT_CONTINUES': 1, 'SEG_ENDPOINT_MERGED': 2, 'SEG_ENDPOINT_RECYCLED': 3, 'SEG_ENDPOINT_UNSET': 0}
- status_names = {0: 'SEG_STATUS_UNSET', 1: 'SEG_STATUS_PREPARED', 2: 'SEG_STATUS_COMPLETE', 3: 'SEG_STATUS_FAILED'}
- initpoint_type_names = {0: 'SEG_INITPOINT_UNSET', 1: 'SEG_INITPOINT_CONTINUES', 2: 'SEG_INITPOINT_NEWTRAJ'}
- endpoint_type_names = {0: 'SEG_ENDPOINT_UNSET', 1: 'SEG_ENDPOINT_CONTINUES', 2: 'SEG_ENDPOINT_MERGED', 3: 'SEG_ENDPOINT_RECYCLED'}
- static initial_pcoord(segment)
Return the initial progress coordinate point of this segment.
- static final_pcoord(segment)
Return the final progress coordinate point of this segment.
- property initpoint_type
- property initial_state_id
- property status_text
- property endpoint_type_text
- class westpa.core.we_driver.InitialState(state_id, basis_state_id, iter_created, iter_used=None, istate_type=None, istate_status=None, pcoord=None, basis_state=None, basis_auxref=None)
Bases:
object
Describes an initial state for a new trajectory. These are generally constructed by appropriate modification of a basis state.
- Variables:
state_id – Integer identifier of this state, usually set by the data manager.
basis_state_id – Identifier of the basis state from which this state was generated, or None.
basis_state – The BasisState from which this state was generated, or None.
iter_created – Iteration in which this state was generated (0 for simulation initialization).
iter_used – Iteration in which this state was used to initiate a trajectory (None for unused).
istate_type – Integer describing the type of this initial state (ISTATE_TYPE_BASIS for direct use of a basis state, ISTATE_TYPE_GENERATED for a state generated from a basis state, ISTATE_TYPE_RESTART for a state corresponding to the endpoint of a segment in another simulation, or ISTATE_TYPE_START for a state generated from a start state).
istate_status – Integer describing whether this initial state has been properly prepared.
pcoord – The representative progress coordinate of this state.
- ISTATE_TYPE_UNSET = 0
- ISTATE_TYPE_BASIS = 1
- ISTATE_TYPE_GENERATED = 2
- ISTATE_TYPE_RESTART = 3
- ISTATE_TYPE_START = 4
- ISTATE_UNUSED = 0
- ISTATE_STATUS_PENDING = 0
- ISTATE_STATUS_PREPARED = 1
- ISTATE_STATUS_FAILED = 2
- istate_types = {'ISTATE_TYPE_BASIS': 1, 'ISTATE_TYPE_GENERATED': 2, 'ISTATE_TYPE_RESTART': 3, 'ISTATE_TYPE_START': 4, 'ISTATE_TYPE_UNSET': 0}
- istate_type_names = {0: 'ISTATE_TYPE_UNSET', 1: 'ISTATE_TYPE_BASIS', 2: 'ISTATE_TYPE_GENERATED', 3: 'ISTATE_TYPE_RESTART', 4: 'ISTATE_TYPE_START'}
- istate_statuses = {'ISTATE_STATUS_FAILED': 2, 'ISTATE_STATUS_PENDING': 0, 'ISTATE_STATUS_PREPARED': 1}
- istate_status_names = {0: 'ISTATE_STATUS_PENDING', 1: 'ISTATE_STATUS_PREPARED', 2: 'ISTATE_STATUS_FAILED'}
- as_numpy_record()
- exception westpa.core.we_driver.ConsistencyError
Bases:
RuntimeError
- exception westpa.core.we_driver.AccuracyError
Bases:
RuntimeError
- class westpa.core.we_driver.NewWeightEntry(source_type, weight, prev_seg_id=None, prev_init_pcoord=None, prev_final_pcoord=None, new_init_pcoord=None, target_state_id=None, initial_state_id=None)
Bases:
object
- NW_SOURCE_RECYCLED = 0
- class westpa.core.we_driver.WEDriver(rc=None, system=None)
Bases:
object
A class implementing Huber & Kim's weighted ensemble algorithm over Segment objects. This class handles all binning, recycling, and preparation of new Segment objects for the next iteration. Binning is accomplished using system.bin_mapper, and per-bin target counts are from system.bin_target_counts.
The workflow is as follows:
- Call new_iteration() at the start of each new iteration, providing any recycling targets that are in force and any available initial states for recycling.
- Call assign() to assign segments to bins based on their initial and end points; this returns the number of walkers that were recycled.
- Call run_we(), optionally providing a set of initial states that will be used to recycle walkers.
Note the presence of flux_matrix, transition_matrix, current_iter_segments, next_iter_segments, recycling_segments, initial_binning, final_binning, next_iter_binning, and new_weights (to be documented soon).
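The three-step workflow can be sketched as a call sequence. The toy driver below only records the documented call order; it stands in for westpa.core.we_driver.WEDriver, and run_one_iteration is a hypothetical helper, not part of the WESTPA API:

```python
class ToyDriver:
    """Stand-in for WEDriver that records the documented call order.

    Illustrative only -- the real driver performs binning, recycling,
    and construction of next-iteration segments inside these methods.
    """

    def __init__(self):
        self.calls = []

    def new_iteration(self, initial_states=None, target_states=None):
        self.calls.append('new_iteration')

    def assign(self, segments, initializing=False):
        self.calls.append('assign')
        return 0  # the real assign() returns the number of recycled walkers

    def run_we(self, initial_states=None):
        self.calls.append('run_we')


def run_one_iteration(driver, segments, initial_states, target_states):
    # The documented per-iteration workflow, in order:
    driver.new_iteration(initial_states=initial_states,
                         target_states=target_states)
    n_recycled = driver.assign(segments)
    driver.run_we(initial_states=initial_states)
    return n_recycled
```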
- weight_split_threshold = 2.0
- weight_merge_cutoff = 1.0
- largest_allowed_weight = 1.0
- smallest_allowed_weight = 1e-310
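The two thresholds above are multiples of a bin’s ideal per-walker weight (total bin weight divided by the bin’s target count). The following is a toy sketch of how such thresholds classify walkers within one bin; it is illustrative only and is not WEDriver’s actual split/merge implementation:

```python
def split_merge_decisions(weights, target_count,
                          split_threshold=2.0, merge_cutoff=1.0):
    """Classify walkers in one bin against the ideal per-walker weight.

    ideal = (total bin weight) / (target count).  Walkers heavier than
    split_threshold * ideal are split candidates; walkers lighter than
    merge_cutoff * ideal are merge candidates.  Toy sketch only --
    not the actual WEDriver logic.
    """
    ideal = sum(weights) / target_count
    decisions = []
    for w in weights:
        if w > split_threshold * ideal:
            decisions.append('split')
        elif w < merge_cutoff * ideal:
            decisions.append('merge')
        else:
            decisions.append('keep')
    return decisions
```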
- process_config()
- property next_iter_segments
Newly-created segments for the next iteration
- property current_iter_segments
Segments for the current iteration
- property next_iter_assignments
Bin assignments (indices) for initial points of next iteration.
- property current_iter_assignments
Bin assignments (indices) for endpoints of current iteration.
- property recycling_segments
Segments designated for recycling
- property n_recycled_segs
Number of segments recycled this iteration
- property n_istates_needed
Number of initial states needed to support recycling for this iteration
- check_threshold_configs()
Check that the weight threshold parameters are valid.
- clear()
Explicitly delete all Segment-related state.
- new_iteration(initial_states=None, target_states=None, new_weights=None, bin_mapper=None, bin_target_counts=None)
Prepare for a new iteration.
initial_states is a sequence of all InitialState objects valid for use in generating new segments for the next iteration (after the one begun by this call to new_iteration); that is, these are the states available to recycle to. Target states which generate recycling events are specified in target_states, a sequence of TargetState objects. Both initial_states and target_states may be empty as required.
The optional new_weights is a sequence of NewWeightEntry objects which will be used to construct the initial flux matrix.
The given bin_mapper will be used for assignment, and bin_target_counts for split/merge target counts; each will be obtained from the system object if omitted or None.
- add_initial_states(initial_states)
Add newly-prepared initial states to the pool available for recycling.
- property all_initial_states
Return an iterator over all initial states (available or used)
- assign(segments, initializing=False)
Assign segments to initial and final bins, and update the (internal) lists of used and available initial states. If
initializing
is True, then the “final” bin assignments will be identical to the initial bin assignments, a condition required for seeding a new iteration from pre-existing segments.
- populate_initial(initial_states, weights, system=None)
Create walkers for a new weighted ensemble simulation.
One segment is created for each provided initial state, then binned and split/merged as necessary. After this function is called, next_iter_segments will yield the new segments to create, used_initial_states will contain data about which of the provided initial states were used, and avail_initial_states will contain data about which initial states were unused (because their corresponding walkers were merged out of existence).
- rebin_current(parent_segments)
Reconstruct walkers for the current iteration based on (presumably) new binning. The previous iteration’s segments must be provided (as
parent_segments
) in order to update endpoint types appropriately.
- construct_next()
Construct walkers for the next iteration, by running weighted ensemble recycling and bin/split/merge on the segments previously assigned to bins using
assign
. Enough unused initial states must be present inself.avail_initial_states
for every recycled walker to be assigned an initial state.After this function completes,
self.flux_matrix
contains a valid flux matrix for this iteration (including any contributions from recycling from the previous iteration), andself.next_iter_segments
contains a list of segments ready for the next iteration, with appropriate values set for weight, endpoint type, parent walkers, and so on.
westpa.core.wm_ops module
- westpa.core.wm_ops.get_pcoord(state)
- westpa.core.wm_ops.gen_istate(basis_state, initial_state)
- westpa.core.wm_ops.prep_iter(n_iter, segments)
- westpa.core.wm_ops.post_iter(n_iter, segments)
- westpa.core.wm_ops.propagate(basis_states, initial_states, segments)
westpa.core.yamlcfg module
YAML-based configuration files for WESTPA
- westpa.core.yamlcfg.YLoader
alias of CLoader
- class westpa.core.yamlcfg.NopMapper
Bases:
BinMapper
Put everything into one bin.
- assign(coords, mask=None, output=None)
- exception westpa.core.yamlcfg.ConfigValueWarning
Bases:
UserWarning
- westpa.core.yamlcfg.warn_dubious_config_entry(entry, value, expected_type=None, category=<class 'westpa.core.yamlcfg.ConfigValueWarning'>, stacklevel=1)
- westpa.core.yamlcfg.check_bool(value, action='warn')
Check that the given value is boolean in type. If not, either issue a warning (if action=='warn') or raise an exception (if action=='raise').
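The documented semantics can be modeled with a small standalone function. This is a sketch only; the real westpa.core.yamlcfg.check_bool may use different warning and exception classes than those assumed here:

```python
import warnings


def check_bool(value, action='warn'):
    """Check that value is boolean; warn or raise otherwise.

    Sketch of the documented semantics only -- the actual
    westpa.core.yamlcfg.check_bool may use different warning
    and exception classes.
    """
    if not isinstance(value, bool):
        if action == 'raise':
            raise ValueError(f'{value!r} is not boolean')
        warnings.warn(f'{value!r} is not boolean')
    return value
```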
- exception westpa.core.yamlcfg.ConfigItemMissing(key, message=None)
Bases:
KeyError
- exception westpa.core.yamlcfg.ConfigItemTypeError(key, expected_type, message=None)
Bases:
TypeError
- exception westpa.core.yamlcfg.ConfigValueError(key, value, message=None)
Bases:
ValueError
- class westpa.core.yamlcfg.YAMLConfig
Bases:
object
- preload_config_files = ['/etc/westpa/westrc', '/home/docs/.westrc']
- update_from_file(file, required=True)
- require(key, type_=None)
Ensure that a configuration item with the given key is present. If the optional type_ is given, additionally require that the item has that type.
- require_type_if_present(key, type_)
Ensure that the configuration item with the given key, if present, has the given type.
- coerce_type_if_present(key, type_)
- get(key, default=None)
- get_typed(key, type_, default=<object object>)
- get_path(key, default=<object object>, expandvars=True, expanduser=True, realpath=True, abspath=True)
- get_pathlist(key, default=<object object>, sep=':', expandvars=True, expanduser=True, realpath=True, abspath=True)
- get_python_object(key, default=<object object>, path=None)
- get_choice(key, choices, default=<object object>, value_transform=None)
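A dict-backed sketch of the require/get semantics listed above. MiniConfig is a hypothetical stand-in, not the real YAMLConfig, which additionally handles file loading, dotted keys, typed getters, and path expansion:

```python
class MiniConfig:
    """Dict-backed sketch of YAMLConfig's require/get semantics.

    Illustrative stand-in only -- the real YAMLConfig also loads
    files, resolves dotted keys, and expands paths.
    """

    def __init__(self, data):
        self._data = dict(data)

    def get(self, key, default=None):
        # Return the stored value, or the default if the key is absent.
        return self._data.get(key, default)

    def require(self, key, type_=None):
        # Missing keys are an error; wrong types are an error if type_ given.
        if key not in self._data:
            raise KeyError(f'configuration item missing: {key!r}')
        value = self._data[key]
        if type_ is not None and not isinstance(value, type_):
            raise TypeError(f'{key!r} must be of type {type_.__name__}')
        return value
```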
- class westpa.core.yamlcfg.YAMLSystem(rc=None)
Bases:
object
A description of the system being simulated, including the dimensionality and data type of the progress coordinate, the number of progress coordinate entries expected from each segment, and binning. To construct a simulation, the user must subclass WESTSystem and set several instance variables.
At a minimum, the user must subclass WESTSystem and override initialize() to set the data type and dimensionality of progress coordinate data and define a bin mapper.
- Variables:
pcoord_ndim – The number of dimensions in the progress coordinate. Defaults to 1 (i.e. a one-dimensional progress coordinate).
pcoord_dtype – The data type of the progress coordinate, which must be callable (e.g. np.float32 and long will work, but '&lt;f4' and '&lt;i8' will not). Defaults to np.float64.
pcoord_len – The length of the progress coordinate time series generated by each segment, including both the initial and final values. Defaults to 2 (i.e. only the initial and final progress coordinate values for a segment are returned from propagation).
bin_mapper – A bin mapper describing the progress coordinate space.
bin_target_counts – A vector of target counts, one per bin.
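A minimal subclass following the pattern described above might look like the following. StubSystem stands in for YAMLSystem so the sketch is self-contained; the pcoord settings are example values, and a real simulation would set an actual bin mapper (e.g. a RectilinearBinMapper) rather than None:

```python
class StubSystem:
    """Stand-in for westpa.core.yamlcfg.YAMLSystem, for illustration only."""

    def initialize(self):
        raise NotImplementedError


class MySystem(StubSystem):
    # Hypothetical example: override initialize() to set the
    # progress-coordinate metadata and binning, as documented above.
    def initialize(self):
        self.pcoord_ndim = 1       # one-dimensional progress coordinate
        self.pcoord_dtype = float  # must be callable (np.float32 also works)
        self.pcoord_len = 11       # pcoord values per segment, ends included
        self.bin_mapper = None     # a real run would set a real bin mapper here
        self.bin_target_counts = [4] * 10  # 4 walkers per bin, for 10 bins
```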
- property bin_target_counts
- initialize()
Prepare this system object for use in simulation or analysis, creating a bin space, setting replicas per bin, and so on. This function is called whenever a WEST tool creates an instance of the system driver.
- prepare_run()
Prepare this system for use in a simulation run. Called by w_run in all worker processes.
- finalize_run()
A hook for system-specific processing at the end of a simulation run (as defined by such things as maximum wallclock time, rather than by more scientifically significant definitions of “the end of a simulation run”).
- new_pcoord_array(pcoord_len=None)
Return an appropriately-sized and -typed pcoord array for a timepoint, segment, or number of segments. If
pcoord_len
is not specified (or None), then a length appropriate for a segment is returned.
- new_region_set()