w_reweight

westpa.cli.tools.w_reweight module

class westpa.cli.tools.w_reweight.WESTMasterCommand

Bases: WESTTool

Base class for command-line tools that employ subcommands.

subparsers_title = None

subcommands = None

include_help_command = True

add_args(parser): Add arguments specific to this tool to the given argparse parser.

process_args(args): Take argparse-processed arguments associated with this tool and deal with them appropriately (setting instance variables, etc)

go(): Perform the analysis associated with this tool.

class westpa.cli.tools.w_reweight.WESTParallelTool(wm_env=None)

Bases: WESTTool

Base class for command-line tools parallelized with wwmgr. This automatically adds and processes wwmgr command-line arguments and creates a work manager at self.work_manager.

make_parser_and_process(prog=None, usage=None, description=None, epilog=None, args=None): A convenience function to create a parser, call add_all_args(), and then call process_all_args(). The argument namespace is returned.

add_args(parser): Add arguments specific to this tool to the given argparse parser.

process_args(args): Take argparse-processed arguments associated with this tool and deal with them appropriately (setting instance variables, etc)

go(): Perform the analysis associated with this tool.

main(): A convenience function to make a parser, parse and process arguments, then run self.go() in the master process.

class westpa.cli.tools.w_reweight.WESTKineticsBase(parent)

Bases: WESTSubcommand

Common argument processing for w_direct/w_reweight subcommands. Mostly limited to handling input and output from w_assign.

add_args(parser): Add arguments specific to this component to the given argparse parser.

process_args(args): Take argparse-processed arguments associated with this component and deal with them appropriately (setting instance variables, etc)

class westpa.cli.tools.w_reweight.AverageCommands(parent)

Bases: WESTKineticsBase

default_output_file = 'direct.h5'

add_args(parser): Add arguments specific to this component to the given argparse parser.

process_args(args): Take argparse-processed arguments associated with this component and deal with them appropriately (setting instance variables, etc)

stamp_mcbs_info(dataset)

open_files()

open_assignments()

print_averages(dataset, header, dim=1)

run_calculation(pi, nstates, start_iter, stop_iter, step_iter, dataset, eval_block, name, dim, do_averages=False, **extra)

westpa.cli.tools.w_reweight.generate_future(work_manager, name, eval_block, kwargs)

westpa.cli.tools.w_reweight.mcbs_ci_correl(estimator_datasets, estimator, alpha, n_sets=None, args=None, autocorrel_alpha=None, autocorrel_n_sets=None, subsample=None, do_correl=True, mcbs_enable=None, estimator_kwargs={})

Perform a Monte Carlo bootstrap estimate of the (1-alpha) confidence interval on the given dataset with the given estimator. This routine is appropriate for time-correlated data: it uses the method described in Huber & Kim, “Weighted-ensemble Brownian dynamics simulations for protein association reactions” (1996), doi:10.1016/S0006-3495(96)79552-8 to determine a statistically significant correlation time, reduces the dataset by a factor of that correlation time, and then runs a “classic” Monte Carlo bootstrap.

Returns (estimate, ci_lb, ci_ub, correl_time) where estimate is the application of the given estimator to the input dataset, ci_lb and ci_ub are the lower and upper limits, respectively, of the (1-alpha) confidence interval on estimate, and correl_time is the correlation time of the dataset, significant to (1-autocorrel_alpha).

estimator is called as estimator(dataset, *args, **kwargs). Common estimators include:

np.mean – calculate a confidence interval on the mean of dataset
np.median – calculate a confidence interval on the median of dataset
np.std – calculate a confidence interval on the standard deviation of dataset

n_sets is the number of synthetic data sets to generate using the given estimator; if n_sets is not given, it is chosen using get_bssize().

autocorrel_alpha (which defaults to alpha) can be used to adjust the significance level of the autocorrelation calculation. Note that too high a significance level (too low an alpha) for evaluating the significance of autocorrelation values can result in a failure to detect correlation if the autocorrelation function is noisy.

The given subsample function is used, if provided, to subsample the dataset prior to running the full Monte Carlo bootstrap. If none is provided, then a random entry from each correlated block is used as the value for that block. Other reasonable choices include np.mean, np.median, (lambda x: x[0]) or (lambda x: x[-1]). In particular, using subsample=np.mean will converge to the block averaged mean and standard error, while accounting for any non-normality in the distribution of the mean.
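The reduce-then-bootstrap idea described above can be sketched as follows. This is an illustration under stated assumptions and not the mcbs_ci_correl implementation: block_bootstrap_ci is a hypothetical name, the block length is passed in directly instead of being derived from an autocorrelation test, and the confidence limits are taken as simple percentiles of the synthetic estimates.

```python
import numpy as np


def block_bootstrap_ci(data, estimator, alpha, block_len, n_sets=1000, subsample=None, rng=None):
    """Return (estimate, ci_lb, ci_ub) for time-correlated 1-D data.

    block_len is assumed to be known already; the real routine determines
    a correlation time from the data before reducing it.
    """
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data)
    estimate = estimator(data)

    # Split the series into consecutive blocks (the last may be shorter).
    blocks = [data[i:i + block_len] for i in range(0, len(data), block_len)]

    # Reduce each block to one value: a random entry by default,
    # or whatever the subsample function returns.
    if subsample is None:
        reduced = np.array([rng.choice(b) for b in blocks])
    else:
        reduced = np.array([subsample(b) for b in blocks])

    # Classic bootstrap on the reduced (approximately uncorrelated) data.
    synthetic = np.empty(n_sets)
    for i in range(n_sets):
        resampled = rng.choice(reduced, size=len(reduced), replace=True)
        synthetic[i] = estimator(resampled)

    ci_lb, ci_ub = np.percentile(synthetic, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return estimate, ci_lb, ci_ub
```

For example, block_bootstrap_ci(series, np.mean, alpha=0.05, block_len=4, subsample=np.mean) bootstraps the block means of series, which corresponds to the subsample=np.mean case mentioned above.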

westpa.cli.tools.w_reweight.reweight_for_c(rows, cols, obs, flux, insert, indices, nstates, nbins, state_labels, state_map, nfbins, istate, jstate, stride, bin_last_state_map, bin_state_map, return_obs, obs_threshold=1)

class westpa.cli.tools.w_reweight.FluxMatrix

Bases: object

w_postanalysis_matrix()

class westpa.cli.tools.w_reweight.RWMatrix(parent)

Bases: WESTKineticsBase, FluxMatrix

subcommand = 'init'

default_kinetics_file = 'reweight.h5'

default_output_file = 'reweight.h5'

help_text = 'create a color-labeled transition matrix from a WESTPA simulation'

description = 'Generate a colored transition matrix from a WE assignment file. The subsequent\nanalysis requires that the assignments are calculated using only the initial and\nfinal time points of each trajectory segment. This may require downsampling the\nh5file generated by a WE simulation. In the future w_assign may be enhanced to optionally\ngenerate the necessary assignment file from a h5file with intermediate time points.\nAdditionally, this analysis is currently only valid on simulations performed under\neither equilibrium or steady-state conditions without recycling target states.\n\n-----------------------------------------------------------------------------\nOutput format\n-----------------------------------------------------------------------------\n\nThe output file (-o/--output, by default "reweight.h5") contains the\nfollowing datasets:\n\n ``/bin_populations`` [window, bin]\n The reweighted populations of each bin based on windows. Bins contain\n one color each, so to recover the original un-colored spatial bins,\n one must sum over all states.\n\n ``/iterations`` [iteration]\n *(Structured -- see below)* Sparse matrix data from each\n iteration. They are reconstructed and averaged within the\n w_reweight {kinetics/probs} routines so that observables may\n be calculated. Each group contains 4 vectors of data:\n\n flux\n *(Floating-point)* The weight of a series of flux events\n rows\n *(Integer)* The bin from which a flux event began.\n cols\n *(Integer)* The bin into which the walker fluxed.\n obs\n *(Integer)* How many flux events were observed during this\n iteration.\n\n-----------------------------------------------------------------------------\nCommand-line options\n-----------------------------------------------------------------------------\n'
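To make the sparse per-iteration storage concrete, the sketch below rebuilds a dense bin-to-bin flux matrix from three index-aligned vectors and averages several iterations. It is illustrative only: dense_flux_matrix and average_flux_matrices are hypothetical helpers, the origin-bin and destination-bin vectors are assumed to be available as rows and cols (names borrowed from the reweight_for_c signature), and the plain arithmetic mean is an assumption, since the actual routines may weight or normalize the matrices differently.

```python
import numpy as np


def dense_flux_matrix(rows, cols, flux, nbins):
    """Accumulate flux events into an (nbins, nbins) matrix.

    rows[k] is the bin a flux event started in, cols[k] the bin it
    ended in, and flux[k] the weight carried by that event.
    """
    matrix = np.zeros((nbins, nbins), dtype=np.float64)
    # np.add.at sums repeated (row, col) pairs instead of overwriting them.
    np.add.at(matrix, (np.asarray(rows), np.asarray(cols)), np.asarray(flux, dtype=np.float64))
    return matrix


def average_flux_matrices(per_iteration, nbins):
    """Average dense matrices over a list of (rows, cols, flux) tuples."""
    total = np.zeros((nbins, nbins), dtype=np.float64)
    for rows, cols, flux in per_iteration:
        total += dense_flux_matrix(rows, cols, flux, nbins)
    return total / len(per_iteration)
```

Using np.add.at matters because the same bin pair can appear more than once in an iteration; ordinary fancy-index assignment would keep only the last event.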

add_args(parser): Add arguments specific to this component to the given argparse parser.

process_args(args): Take argparse-processed arguments associated with this component and deal with them appropriately (setting instance variables, etc)

go()

class westpa.cli.tools.w_reweight.RWReweight(parent)

Bases: AverageCommands

help_text = 'Parent class for all reweighting routines, as they all use the same estimator code.'

add_args(parser): Add arguments specific to this component to the given argparse parser.

process_args(args): Take argparse-processed arguments associated with this component and deal with them appropriately (setting instance variables, etc)

accumulate_statistics(start_iter, stop_iter): This function pulls previously generated flux matrix data into memory. The data is assumed to exist within an HDF5 file that is available as a property. The data is kept as a one-dimensional NumPy array for use with the Cython estimator.
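One common way to hold variable-length per-iteration vectors in a single one-dimensional array is to concatenate them and keep an offsets array alongside. The sketch below shows that layout; it is an assumption about the general technique, not a description of how accumulate_statistics actually stores its data, and flatten_iterations and iteration_slice are hypothetical names.

```python
import numpy as np


def flatten_iterations(per_iteration):
    """Concatenate per-iteration vectors into one flat array.

    Returns (flat, offsets) where iteration i occupies
    flat[offsets[i]:offsets[i + 1]].
    """
    lengths = [len(v) for v in per_iteration]
    offsets = np.concatenate(([0], np.cumsum(lengths))).astype(np.intp)
    if per_iteration:
        flat = np.concatenate([np.asarray(v) for v in per_iteration])
    else:
        flat = np.empty(0)
    return flat, offsets


def iteration_slice(flat, offsets, i):
    """Recover the vector belonging to iteration i (0-based)."""
    return flat[offsets[i]:offsets[i + 1]]
```

A flat array plus offsets is straightforward to pass to compiled code, which is a typical reason for choosing this layout over a list of arrays.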

generate_reweight_data(): This function ensures all the appropriate files are loaded, sets appropriate attributes necessary for all calling functions/children, and then calls the function to load in the flux matrix data.

class westpa.cli.tools.w_reweight.RWRate(parent)

Bases: RWReweight

subcommand = 'kinetics'

help_text = 'Generates rate and flux values from a WESTPA simulation via reweighting.'

default_kinetics_file = 'reweight.h5'

default_output_file = 'reweight.h5'

description = 'Calculate average rates from weighted ensemble data using the postanalysis\nreweighting scheme. Bin assignments (usually "assign.h5") and pre-calculated\niteration flux matrices (usually "reweight.h5") data files must have been\npreviously generated using w_reweight matrix (see "w_assign --help" and\n"w_reweight init --help" for information on generating these files).\n\n-----------------------------------------------------------------------------\nOutput format\n-----------------------------------------------------------------------------\nThe output file (-o/--output, usually "kinrw.h5") contains the following\ndataset:\n\n /avg_rates [state,state]\n (Structured -- see below) State-to-state rates based on entire window of\n iterations selected.\n\n /avg_total_fluxes [state]\n (Structured -- see below) Total fluxes into each state based on entire\n window of iterations selected.\n\n /avg_conditional_fluxes [state,state]\n (Structured -- see below) State-to-state fluxes based on entire window of\n iterations selected.\n\nIf --evolution-mode is specified, then the following additional datasets are\navailable:\n\n /rate_evolution [window][state][state]\n (Structured -- see below). State-to-state rates based on windows of\n iterations of varying width. If --evolution-mode=cumulative, then\n these windows all begin at the iteration specified with\n --start-iter and grow in length by --step-iter for each successive\n element. If --evolution-mode=blocked, then these windows are all of\n width --step-iter (excluding the last, which may be shorter), the first\n of which begins at iteration --start-iter.\n\n /target_flux_evolution [window,state]\n (Structured -- see below). Total flux into a given macro state based on\n windows of iterations of varying width, as in /rate_evolution.\n\n /conditional_flux_evolution [window,state,state]\n (Structured -- see below). State-to-state fluxes based on windows of\n varying width, as in /rate_evolution.\n\nThe structure of these datasets is as follows:\n\n iter_start\n (Integer) Iteration at which the averaging window begins (inclusive).\n\n iter_stop\n (Integer) Iteration at which the averaging window ends (exclusive).\n\n expected\n (Floating-point) Expected (mean) value of the observable as evaluated within\n this window, in units of inverse tau.\n\n ci_lbound\n (Floating-point) Lower bound of the confidence interval of the observable\n within this window, in units of inverse tau.\n\n ci_ubound\n (Floating-point) Upper bound of the confidence interval of the observable\n within this window, in units of inverse tau.\n\n stderr\n (Floating-point) The standard error of the mean of the observable\n within this window, in units of inverse tau.\n\n corr_len\n (Integer) Correlation length of the observable within this window, in units\n of tau.\n\nEach of these datasets is also stamped with a number of attributes:\n\n mcbs_alpha\n (Floating-point) Alpha value of confidence intervals. (For example,\n *alpha=0.05* corresponds to a 95% confidence interval.)\n\n mcbs_nsets\n (Integer) Number of bootstrap data sets used in generating confidence\n intervals.\n\n mcbs_acalpha\n (Floating-point) Alpha value for determining correlation lengths.\n\n\n-----------------------------------------------------------------------------\nCommand-line options\n-----------------------------------------------------------------------------\n '
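The cumulative and blocked windowing described in the text above can be written out as a short function. This is a sketch based only on that description: iteration_windows is a hypothetical helper, and it assumes stop_iter is exclusive, consistent with the iter_stop field.

```python
def iteration_windows(start_iter, stop_iter, step_iter, mode):
    """Return a list of (iter_start, iter_stop) averaging windows.

    iter_stop is exclusive. In 'cumulative' mode every window begins at
    start_iter and grows by step_iter; in 'blocked' mode windows are
    consecutive blocks of width step_iter (the last may be shorter).
    """
    if mode not in ('cumulative', 'blocked'):
        raise ValueError('mode must be "cumulative" or "blocked"')

    windows = []
    for block_start in range(start_iter, stop_iter, step_iter):
        # Clip the final window so it never runs past stop_iter.
        block_stop = min(block_start + step_iter, stop_iter)
        if mode == 'cumulative':
            windows.append((start_iter, block_stop))
        else:
            windows.append((block_start, block_stop))
    return windows
```

For start_iter=1, stop_iter=11, and step_iter=4, blocked mode yields (1, 5), (5, 9), (9, 11), while cumulative mode yields (1, 5), (1, 9), (1, 11).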

w_postanalysis_reweight(): This function ensures the data is ready to send in to the estimator and the bootstrapping routine, then does so. Much of this is simply setting up appropriate args and kwargs, then passing them in to the ‘run_calculation’ function, which sets up future objects to send to the work manager. The results are returned, and then written to the appropriate HDF5 dataset. This function is specific for the rates and fluxes from the reweighting method.

go()

class westpa.cli.tools.w_reweight.RWStateProbs(parent)

Bases: RWReweight

subcommand = 'probs'

help_text = 'Calculates color and state probabilities via reweighting.'

default_kinetics_file = 'reweight.h5'

description = 'Calculate average populations from weighted ensemble data using the postanalysis\nreweighting scheme. Bin assignments (usually "assign.h5") and pre-calculated\niteration flux matrices (usually "reweight.h5") data files must have been\npreviously generated using w_reweight matrix (see "w_assign --help" and\n"w_reweight init --help" for information on generating these files).\n\n-----------------------------------------------------------------------------\nOutput format\n-----------------------------------------------------------------------------\n\nThe output file (-o/--output, usually "direct.h5") contains the following\ndataset:\n\n /avg_state_probs [state]\n (Structured -- see below) Population of each state across entire\n range specified.\n\n /avg_color_probs [state]\n (Structured -- see below) Population of each ensemble across entire\n range specified.\n\nIf --evolution-mode is specified, then the following additional datasets are\navailable:\n\n /state_pop_evolution [window][state]\n (Structured -- see below). State populations based on windows of\n iterations of varying width. If --evolution-mode=cumulative, then\n these windows all begin at the iteration specified with\n --start-iter and grow in length by --step-iter for each successive\n element. If --evolution-mode=blocked, then these windows are all of\n width --step-iter (excluding the last, which may be shorter), the first\n of which begins at iteration --start-iter.\n\n /color_prob_evolution [window][state]\n (Structured -- see below). Ensemble populations based on windows of\n iterations of varying width. If --evolution-mode=cumulative, then\n these windows all begin at the iteration specified with\n --start-iter and grow in length by --step-iter for each successive\n element. If --evolution-mode=blocked, then these windows are all of\n width --step-iter (excluding the last, which may be shorter), the first\n of which begins at iteration --start-iter.\n\nThe structure of these datasets is as follows:\n\n iter_start\n (Integer) Iteration at which the averaging window begins (inclusive).\n\n iter_stop\n (Integer) Iteration at which the averaging window ends (exclusive).\n\n expected\n (Floating-point) Expected (mean) value of the observable as evaluated within\n this window, in units of inverse tau.\n\n ci_lbound\n (Floating-point) Lower bound of the confidence interval of the observable\n within this window, in units of inverse tau.\n\n ci_ubound\n (Floating-point) Upper bound of the confidence interval of the observable\n within this window, in units of inverse tau.\n\n stderr\n (Floating-point) The standard error of the mean of the observable\n within this window, in units of inverse tau.\n\n corr_len\n (Integer) Correlation length of the observable within this window, in units\n of tau.\n\n\nEach of these datasets is also stamped with a number of attributes:\n\n mcbs_alpha\n (Floating-point) Alpha value of confidence intervals. (For example,\n *alpha=0.05* corresponds to a 95% confidence interval.)\n\n mcbs_nsets\n (Integer) Number of bootstrap data sets used in generating confidence\n intervals.\n\n mcbs_acalpha\n (Floating-point) Alpha value for determining correlation lengths.\n\n\n-----------------------------------------------------------------------------\nCommand-line options\n-----------------------------------------------------------------------------\n'
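The structured layout listed in the text above maps naturally onto a NumPy structured dtype. The sketch below mirrors the documented field names; the concrete integer and float widths are assumptions, ci_width is a hypothetical helper, and the two records hold placeholder numbers chosen purely for illustration, not results from any simulation.

```python
import numpy as np

# Field names follow the documented structure; the widths are assumed.
ci_dtype = np.dtype([
    ('iter_start', np.uint32),
    ('iter_stop', np.uint32),
    ('expected', np.float64),
    ('ci_lbound', np.float64),
    ('ci_ubound', np.float64),
    ('stderr', np.float64),
    ('corr_len', np.uint32),
])


def ci_width(records):
    """Width of the confidence interval for each record."""
    return records['ci_ubound'] - records['ci_lbound']


# Placeholder records shaped like a two-state /avg_state_probs dataset.
avg_state_probs = np.zeros(2, dtype=ci_dtype)
avg_state_probs[0] = (1, 101, 0.7, 0.65, 0.75, 0.02, 3)
avg_state_probs[1] = (1, 101, 0.3, 0.25, 0.35, 0.02, 3)
```

Indexing by field name, as in avg_state_probs['expected'], returns one value per state, so quantities such as ci_width(avg_state_probs) are computed for all states at once.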

w_postanalysis_stateprobs(): This function ensures the data is ready to send in to the estimator and the bootstrapping routine, then does so. Much of this is simply setting up appropriate args and kwargs, then passing them in to the ‘run_calculation’ function, which sets up future objects to send to the work manager. The results are returned, and then written to the appropriate HDF5 dataset. This function is specific for the color (steady-state) and macrostate probabilities from the reweighting method.

go()

class westpa.cli.tools.w_reweight.RWAll(parent)

Bases: RWMatrix, RWStateProbs, RWRate

subcommand = 'all'

help_text = 'Runs the full suite, including the generation of the flux matrices.'

default_kinetics_file = 'reweight.h5'

default_output_file = 'reweight.h5'

description = 'A convenience function to run init/kinetics/probs. Bin assignments,\nincluding macrostate definitions, are required. (See\n"w_assign --help" for more information).\n\nFor more information on the individual subcommands this subs in for, run\nw_reweight {init/kinetics/probs} --help.\n\n-----------------------------------------------------------------------------\nCommand-line options\n-----------------------------------------------------------------------------\n'

go()

class westpa.cli.tools.w_reweight.RWAverage(parent)

Bases: RWStateProbs, RWRate

subcommand = 'average'

help_text = 'Averages and returns fluxes, rates, and color/state populations.'

default_kinetics_file = 'reweight.h5'

default_output_file = 'reweight.h5'

description = 'A convenience function to run kinetics/probs. Bin assignments,\nincluding macrostate definitions, are required. (See\n"w_assign --help" for more information).\n\nFor more information on the individual subcommands this subs in for, run\nw_reweight {kinetics/probs} --help.\n\n-----------------------------------------------------------------------------\nCommand-line options\n-----------------------------------------------------------------------------\n'

go()

class westpa.cli.tools.w_reweight.WReweight

Bases: WESTMasterCommand, WESTParallelTool

prog = 'w_reweight'

subcommands = [<class 'westpa.cli.tools.w_reweight.RWMatrix'>, <class 'westpa.cli.tools.w_reweight.RWAverage'>, <class 'westpa.cli.tools.w_reweight.RWRate'>, <class 'westpa.cli.tools.w_reweight.RWStateProbs'>, <class 'westpa.cli.tools.w_reweight.RWAll'>]

subparsers_title = 'reweighting kinetics analysis scheme'

westpa.cli.tools.w_reweight.entry_point()