westpa.oldtools.stats package
westpa.oldtools.stats module
westpa.oldtools.stats.accumulator module
westpa.oldtools.stats.edfs module
- class westpa.oldtools.stats.edfs.EDF(values, weights=None)
Bases: object
A class for creating and manipulating empirical distribution functions (cumulative distribution functions derived from sample data).
Construct a new EDF from the given values and (optionally) weights.
- static from_array(array)
- static from_arrays(x, F)
- as_array()
Return this EDF as an (N,2) array, where N is the number of unique values passed to the constructor. Numpy type casting rules are applied (so, for instance, integral abscissae are converted to floating-point values).
- quantiles(p)
Treating the EDF as a quantile function, return the smallest values of the (statistical) variable at which the cumulative probability is at least p. That is, Q(p) = inf {x: p <= F(x)}.
- quantile(p)
- median()
- moment(n)
Calculate the nth moment of this probability distribution:
<x^n> = int_{-inf}^{inf} x^n dF(x)
- cmoment(n)
Calculate the nth central moment of this probability distribution
- mean()
- var()
Return the second central moment of this probability distribution.
- std()
Return the standard deviation (root of the variance) of this probability distribution.
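The quantile and moment definitions above can be illustrated with a small standalone sketch. This is not the EDF class itself and does not import westpa; the helper names (build_edf, quantile, moment, cmoment) are hypothetical, and merging the weights of repeated values is an assumption about how a weighted EDF might be built.

```python
from bisect import bisect_left
from itertools import accumulate


def build_edf(values, weights=None):
    """Return (x, F): sorted unique abscissae and cumulative probabilities.

    Hypothetical helper, not part of westpa.
    """
    if weights is None:
        weights = [1.0] * len(values)
    # Assumption: repeated values pool their weights.
    merged = {}
    for v, w in zip(values, weights):
        merged[v] = merged.get(v, 0.0) + w
    x = sorted(merged)
    total = sum(merged.values())
    F = list(accumulate(merged[v] / total for v in x))
    F[-1] = 1.0  # guard against floating-point round-off
    return x, F


def quantile(x, F, p):
    """Q(p) = inf {x : p <= F(x)}, for 0 <= p <= 1."""
    return x[bisect_left(F, p)]


def moment(x, F, n):
    """<x^n> = sum_i x_i^n dF_i, the discrete form of int x^n dF(x)."""
    dF = [F[0]] + [b - a for a, b in zip(F, F[1:])]
    return sum(xi ** n * d for xi, d in zip(x, dF))


def cmoment(x, F, n):
    """nth central moment: <(x - <x>)^n>."""
    mean = moment(x, F, 1)
    dF = [F[0]] + [b - a for a, b in zip(F, F[1:])]
    return sum((xi - mean) ** n * d for xi, d in zip(x, dF))


x, F = build_edf([1, 2, 3, 4])
median = quantile(x, F, 0.5)   # smallest x with F(x) >= 0.5
mean = moment(x, F, 1)
var = cmoment(x, F, 2)
```

With four equally weighted samples, each step of F is 0.25, so the median under this definition is the lower of the two middle values rather than their average.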
westpa.oldtools.stats.mcbs module
Tools for Monte Carlo bootstrap error analysis
- westpa.oldtools.stats.mcbs.add_mcbs_options(parser)
Add arguments concerning Monte Carlo bootstrap (confidence and bssize) to the given parser.
- westpa.oldtools.stats.mcbs.get_bssize(alpha)
Return a bootstrap data set size appropriate for the given confidence level
- westpa.oldtools.stats.mcbs.bootstrap_ci(estimator, data, alpha, n_sets=None, args=(), kwargs={}, sort=<function msort>, extended_output=False)
Perform a Monte Carlo bootstrap of a (1-alpha) confidence interval for the given estimator. Returns (fhat, ci_lower, ci_upper), where fhat is the result of estimator(data, *args, **kwargs), and ci_lower and ci_upper are the lower and upper bounds of the surrounding confidence interval, calculated by calling estimator(syndata, *args, **kwargs) on each synthetic data set syndata. If n_sets is provided, that is the number of synthetic data sets generated; otherwise an appropriate size is selected automatically (see get_bssize()).
sort, if given, is applied to sort the results of calling estimator on each synthetic data set prior to obtaining the confidence interval.
Individual entries in synthetic data sets are selected by the first index of data, allowing this function to be used on arrays of multidimensional data.
If extended_output is True (it is False by default), instead of returning (fhat, lb, ub), this function returns (fhat, lb, ub, ub-lb, abs((ub-lb)/fhat), max(ub-fhat,fhat-lb)); that is, the estimated value, the lower and upper bounds of the confidence interval, the width of the confidence interval, the relative width of the confidence interval, and the symmetrized error bar of the confidence interval.
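The procedure described above can be sketched in plain Python as follows. This is an illustrative percentile bootstrap, not the westpa implementation: the function name bootstrap_ci_sketch, the rng parameter, the default of 1000 synthetic data sets, and the exact index rounding used to pick the bounds are all assumptions.

```python
import math
import random
import statistics


def bootstrap_ci_sketch(estimator, data, alpha, n_sets=1000,
                        extended_output=False, rng=None):
    """Percentile-bootstrap (1-alpha) confidence interval (illustrative only)."""
    rng = rng or random.Random()
    n = len(data)
    fhat = estimator(data)

    # Evaluate the estimator on each synthetic data set, built by
    # resampling entries of `data` (first index) with replacement.
    results = []
    for _ in range(n_sets):
        syndata = [data[rng.randrange(n)] for _ in range(n)]
        results.append(estimator(syndata))
    results.sort()

    # Assumed rounding: take the alpha/2 and 1-alpha/2 order statistics.
    lbi = int(math.floor(n_sets * alpha / 2))
    ubi = min(n_sets - 1, int(math.ceil(n_sets * (1 - alpha / 2))))
    lb, ub = results[lbi], results[ubi]

    if extended_output:
        return (fhat, lb, ub, ub - lb, abs((ub - lb) / fhat),
                max(ub - fhat, fhat - lb))
    return fhat, lb, ub


data = [float(i) for i in range(1, 11)]
fhat, lb, ub = bootstrap_ci_sketch(statistics.mean, data, alpha=0.05,
                                   n_sets=500, rng=random.Random(0))
```

Passing a seeded random.Random instance makes the interval reproducible, which is convenient when comparing runs. The relative width in the extended output divides by fhat, so it is undefined when the estimate is exactly zero.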