w_fluxanl

w_fluxanl calculates the probability flux of a weighted ensemble simulation based on a pre-defined target state. Also calculates confidence interval of average flux. Monte Carlo bootstrapping techniques are used to account for autocorrelation between fluxes and/or errors that are not normally distributed.

Overview

usage:

w_fluxanl [-h] [-r RCFILE] [--quiet | --verbose | --debug] [--version]
                         [-W WEST_H5FILE] [-o OUTPUT]
                         [--first-iter N_ITER] [--last-iter N_ITER]
                         [-a ALPHA] [--autocorrel-alpha ACALPHA] [-N NSETS] [--evol] [--evol-step ESTEP]

Note: All command line arguments are optional for w_fluxanl.

Command-Line Options

See the general command-line tool reference for more information on the general options.

Input/output options

These arguments allow the user to specify where to read input simulation result data and where to output calculated progress coordinate probability distribution data.

Both input and output files are hdf5 format.:

-W, --west-data file
  Read simulation result data from file *file*. (**Default:** The
  *hdf5* file specified in the configuration file)

-o, --output file
  Store this tool's output in *file*. (**Default:** The *hdf5* file
  **pcpdist.h5**)

Iteration range options

Specify the range of iterations over which to construct the progress coordinate probability distribution.:

--first-iter n_iter
  Construct probability distribution starting with iteration *n_iter*
  (**Default:** 1)

--last-iter n_iter
  Construct probability distribution's time evolution up to (and
  including) iteration *n_iter* (**Default:** Last completed
  iteration)

Confidence interval and bootstrapping options

Specify alpha values of constructed confidence intervals.:

-a alpha
  Calculate a (1 - *alpha*) confidence interval for the mean flux
  (**Default:** 0.05)

--autocorrel-alpha ACalpha
  Identify autocorrelation of fluxes at *ACalpha* significance level.
  Note: Specifying an *ACalpha* level that is too small may result in
  failure to find autocorrelation in noisy flux signals (**Default:**
  Same level as *alpha*)

-N n_sets, --nsets n_sets
  Use *n_sets* samples for bootstrapping (**Default:** Chosen based
  on *alpha*)

--evol
  Calculate the time evolution of flux confidence intervals
  (**Warning:** computationally expensive calculation)

--evol-step estep
  (if ``'--evol'`` specified) Calculate the time evolution of flux
  confidence intervals for every *estep* iterations (**Default:** 1)

Examples

Calculate the time evolution flux every 5 iterations:

w_fluxanl --evol --evol-step 5

Calculate mean flux confidence intervals at 0.01 signicance level and calculate autocorrelations at 0.05 significance:

w_fluxanl --alpha 0.01 --autocorrel-alpha 0.05

Calculate the mean flux confidence intervals using a custom bootstrap sample size of 500:

w_fluxanl --n-sets 500

westpa.cli.tools.w_fluxanl module

westpa.cli.tools.w_fluxanl.fftconvolve(in1, in2, mode='full', axes=None)

Convolve two N-dimensional arrays using FFT.

Convolve in1 and in2 using the fast Fourier transform method, with the output size determined by the mode argument.

This is generally much faster than convolve for large arrays (n > ~500), but can be slower when only a few output values are needed, and can only output float arrays (int or object array inputs will be cast to float).

As of v0.19, convolve automatically chooses this method or the direct method based on an estimation of which is faster.

Parameters:

in1 (array_like) – First input.
in2 (array_like) – Second input. Should have the same number of dimensions as in1.
mode (str {'full', 'valid', 'same'}, optional) –
A string indicating the size of the output:

full
The output is the full discrete linear convolution of the inputs. (Default)

valid
The output consists only of those elements that do not rely on the zero-padding. In ‘valid’ mode, either in1 or in2 must be at least as large as the other in every dimension.

same
The output is the same size as in1, centered with respect to the ‘full’ output.
axes (int or array_like of ints or None, optional) – Axes over which to compute the convolution. The default is over all axes.

Returns:

out – An N-dimensional array containing a subset of the discrete linear convolution of in1 with in2.

Return type:

array

See also

convolve: Uses the direct convolution or FFT convolution algorithm depending on which is faster.
oaconvolve: Uses the overlap-add method to do convolution, which is generally faster when the input arrays are large and significantly different in size.

Examples

Autocorrelation of white noise is an impulse.

>>> import numpy as np
>>> from scipy import signal
>>> rng = np.random.default_rng()
>>> sig = rng.standard_normal(1000)
>>> autocorr = signal.fftconvolve(sig, sig[::-1], mode='full')

>>> import matplotlib.pyplot as plt
>>> fig, (ax_orig, ax_mag) = plt.subplots(2, 1)
>>> ax_orig.plot(sig)
>>> ax_orig.set_title('White noise')
>>> ax_mag.plot(np.arange(-len(sig)+1,len(sig)), autocorr)
>>> ax_mag.set_title('Autocorrelation')
>>> fig.tight_layout()
>>> fig.show()

Gaussian blur implemented using FFT convolution. Notice the dark borders around the image, due to the zero-padding beyond its boundaries. The convolve2d function allows for other types of image boundaries, but is far slower.

>>> from scipy import datasets
>>> face = datasets.face(gray=True)
>>> kernel = np.outer(signal.windows.gaussian(70, 8),
...                   signal.windows.gaussian(70, 8))
>>> blurred = signal.fftconvolve(face, kernel, mode='same')

>>> fig, (ax_orig, ax_kernel, ax_blurred) = plt.subplots(3, 1,
...                                                      figsize=(6, 15))
>>> ax_orig.imshow(face, cmap='gray')
>>> ax_orig.set_title('Original')
>>> ax_orig.set_axis_off()
>>> ax_kernel.imshow(kernel, cmap='gray')
>>> ax_kernel.set_title('Gaussian kernel')
>>> ax_kernel.set_axis_off()
>>> ax_blurred.imshow(blurred, cmap='gray')
>>> ax_blurred.set_title('Blurred')
>>> ax_blurred.set_axis_off()
>>> fig.show()

westpa.cli.tools.w_fluxanl.warn()

Issue a warning, or maybe ignore it or raise an exception.

message: Text of the warning message.
category: The Warning category subclass. Defaults to UserWarning.
stacklevel: How far up the call stack to make this warning appear. A value of 2 for example attributes the warning to the caller of the code calling warn().
source: If supplied, the destroyed object which emitted a ResourceWarning
skip_file_prefixes: An optional tuple of module filename prefixes indicating frames to skip during stacklevel computations for stack frame attribution.

westpa.cli.tools.w_fluxanl.weight_dtype: alias of float64

westpa.cli.tools.w_fluxanl.n_iter_dtype: alias of uint32

class westpa.cli.tools.w_fluxanl.NewWeightEntry(source_type, weight, prev_seg_id=None, prev_init_pcoord=None, prev_final_pcoord=None, new_init_pcoord=None, target_state_id=None, initial_state_id=None)

Bases: object

NW_SOURCE_RECYCLED = 0

class westpa.cli.tools.w_fluxanl.WESTTool

Bases: WESTToolComponent

Base class for WEST command line tools

prog = None

usage = None

description = None

epilog = None

add_args(parser): Add arguments specific to this tool to the given argparse parser.

process_args(args): Take argparse-processed arguments associated with this tool and deal with them appropriately (setting instance variables, etc)

make_parser(prog=None, usage=None, description=None, epilog=None, args=None)

make_parser_and_process(prog=None, usage=None, description=None, epilog=None, args=None): A convenience function to create a parser, call add_all_args(), and then call process_all_args(). The argument namespace is returned.

go(): Perform the analysis associated with this tool.

main(): A convenience function to make a parser, parse and process arguments, then call self.go()

class westpa.cli.tools.w_fluxanl.WESTDataReader

Bases: WESTToolComponent

Tool for reading data from WEST-related HDF5 files. Coordinates finding the main HDF5 file from west.cfg or command line arguments, caching of certain kinds of data (eventually), and retrieving auxiliary data sets from various places.

add_args(parser): Add arguments specific to this component to the given argparse parser.

process_args(args): Take argparse-processed arguments associated with this component and deal with them appropriately (setting instance variables, etc)

open(mode='r')

close()

property weight_dsspec

property parent_id_dsspec

class westpa.cli.tools.w_fluxanl.IterRangeSelection(data_manager=None)

Bases: WESTToolComponent

Select and record limits on iterations used in analysis and/or reporting. This class provides both the user-facing command-line options and parsing, and the application-side API for recording limits in HDF5.

HDF5 datasets calculated based on a restricted set of iterations should be tagged with the following attributes:

first_iter
The first iteration included in the calculation.

last_iter
One past the last iteration included in the calculation.

iter_step
Blocking or sampling period for iterations included in the calculation.

add_args(parser): Add arguments specific to this component to the given argparse parser.

process_args(args, override_iter_start=None, override_iter_stop=None, default_iter_step=1): Take argparse-processed arguments associated with this component and deal with them appropriately (setting instance variables, etc)

iter_block_iter(): Return an iterable of (block_start,block_end) over the blocks of iterations selected by –first-iter/–last-iter/–step-iter.

n_iter_blocks(): Return the number of blocks of iterations (as returned by iter_block_iter) selected by –first-iter/–last-iter/–step-iter.

record_data_iter_range(h5object, iter_start=None, iter_stop=None): Store attributes iter_start and iter_stop on the given HDF5 object (group/dataset)

record_data_iter_step(h5object, iter_step=None): Store attribute iter_step on the given HDF5 object (group/dataset).

check_data_iter_range_least(h5object, iter_start=None, iter_stop=None): Check that the given HDF5 object contains (as denoted by its iter_start/iter_stop attributes) data at least for the iteration range specified.

check_data_iter_range_equal(h5object, iter_start=None, iter_stop=None): Check that the given HDF5 object contains (as denoted by its iter_start/iter_stop attributes) data exactly for the iteration range specified.

check_data_iter_step_conformant(h5object, iter_step=None): Check that the given HDF5 object contains per-iteration data at an iteration stride suitable for extracting data with the given stride (in other words, the given iter_step is a multiple of the stride with which data was recorded).

check_data_iter_step_equal(h5object, iter_step=None): Check that the given HDF5 object contains per-iteration data at an iteration stride the same as that specified.

slice_per_iter_data(dataset, iter_start=None, iter_stop=None, iter_step=None, axis=0): Return the subset of the given dataset corresponding to the given iteration range and stride. Unless otherwise specified, the first dimension of the dataset is the one sliced.

iter_range(iter_start=None, iter_stop=None, iter_step=None, dtype=None): Return a sequence for the given iteration numbers and stride, filling in missing values from those stored on self. The smallest data type capable of holding iter_stop is returned unless otherwise specified using the dtype argument.

westpa.cli.tools.w_fluxanl.extract_fluxes(iter_start=None, iter_stop=None, data_manager=None)

Extract flux values from the WEST HDF5 file for iterations >= iter_start and < iter_stop, optionally using another data manager instance instead of the global one returned by westpa.rc.get_data_manager().

Returns a dictionary mapping target names (if available, target index otherwise) to a 1-D array of type fluxentry_dtype, which contains columns for iteration number, flux, and count.

class westpa.cli.tools.w_fluxanl.WFluxanlTool

Bases: WESTTool

prog = 'w_fluxanl'

description = 'Extract fluxes into pre-defined target states from WEST data,\naverage, and construct confidence intervals. Monte Carlo bootstrapping\nis used to account for the correlated and possibly non-Gaussian statistical\nerror in flux measurements.\n\nAll non-graphical output (including that to the terminal and HDF5) assumes that\nthe propagation/resampling period ``tau`` is equal to unity; to obtain results\nin familiar units, divide all fluxes and multiply all correlation lengths by\nthe true value of ``tau``.\n'

output_format_version = 2

add_args(parser): Add arguments specific to this tool to the given argparse parser.

process_args(args): Take argparse-processed arguments associated with this tool and deal with them appropriately (setting instance variables, etc)

calc_store_flux_data()

calc_evol_flux()

go(): Perform the analysis associated with this tool.

westpa.cli.tools.w_fluxanl.entry_point()