# Miscellaneous Statistical Functions – catsim.stats

Miscellaneous statistical functions

catsim.stats.bincount(x)

Counts the number of occurrences of each integer in a list or 1-D numpy.ndarray. If there are gaps between the numbers, the integers in those gaps are given a count of 0.

>>> bincount(numpy.array([-4, 0, 1, 1, 3, 2, 1, 5]))
array([1, 0, 0, 0, 1, 3, 1, 1, 0, 1], dtype=int32)
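The behaviour shown in the example can be approximated with plain NumPy. The following is a minimal sketch, not the catsim implementation: `bincount_with_offset` is a hypothetical name, and it assumes a non-empty input whose values should be counted from the minimum to the maximum, inclusive.

```python
import numpy as np


def bincount_with_offset(x):
    """Count occurrences of every integer from min(x) to max(x), inclusive.

    Hypothetical helper for illustration; assumes x is non-empty.
    """
    x = np.asarray(x, dtype=int)
    # numpy.bincount rejects negative values, so shift the data
    # until the smallest value maps to index 0.
    return np.bincount(x - x.min())


counts = bincount_with_offset([-4, 0, 1, 1, 3, 2, 1, 5])
# Index 0 corresponds to -4, index 9 to 5.
print(counts.tolist())
```

Position `i` of the result holds the count for the integer `min(x) + i`, which is why the gaps at -3, -2, -1 and 4 appear as zeros.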

catsim.stats.coef_variation(x: numpy.ndarray, axis: int = 0) → numpy.ndarray

Calculates the coefficient of variation of the rows or columns of a matrix. The coefficient of variation is the standard deviation of a variable divided by its mean:

$\frac{\sigma}{\mu}$

Parameters: x (ndarray) – the data matrix; axis (int) – 0 to calculate for columns, 1 for rows

Returns: a vector containing the coefficients of variation along the chosen axis

Return type: ndarray

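The ratio $\frac{\sigma}{\mu}$ can be sketched directly with NumPy. This is an illustration under assumptions rather than the library's code: `coef_variation_sketch` is a hypothetical name, and the population standard deviation (`ddof=0`, NumPy's default) is assumed because the text above does not say which estimator is used.

```python
import numpy as np


def coef_variation_sketch(x, axis=0):
    """Standard deviation divided by mean along the given axis.

    Hypothetical helper; ddof=0 is an assumption, not documented behaviour.
    """
    x = np.asarray(x, dtype=float)
    return np.std(x, axis=axis) / np.mean(x, axis=axis)


data = np.array([[1.0, 10.0],
                 [3.0, 30.0]])

cv_columns = coef_variation_sketch(data, axis=0)  # one value per column
cv_rows = coef_variation_sketch(data, axis=1)     # one value per row
print(cv_columns, cv_rows)
```

In this toy matrix the second column is the first one scaled by 10, so both columns have the same coefficient of variation: the measure is invariant to scale.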
catsim.stats.covariance(x: numpy.ndarray, minus_one: bool = True)

Calculates the covariance matrix of a data matrix.

Parameters: x (ndarray) – a data matrix; minus_one (bool) – subtract one from the total number of observations when normalizing

>>> from sklearn.datasets import load_iris
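
The effect of `minus_one` can be sketched in plain NumPy. This is a minimal sketch under assumptions, not the catsim source: `covariance_sketch` is a hypothetical name, rows are assumed to be observations and columns variables, and `minus_one` is assumed to switch the denominator between n − 1 and n. A small hand-made matrix is used instead of the iris data so that the block only depends on NumPy.

```python
import numpy as np


def covariance_sketch(x, minus_one=True):
    """Covariance matrix of the columns of x.

    Hypothetical helper; assumes rows are observations, columns are variables.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    # Center every column on its mean.
    centered = x - x.mean(axis=0)
    # Assumption: minus_one selects the unbiased (n - 1) denominator.
    denom = n - 1 if minus_one else n
    return centered.T @ centered / denom


data = np.array([[2.0, 1.0, 0.5],
                 [4.0, 3.0, 0.1],
                 [6.0, 2.0, 0.9],
                 [8.0, 5.0, 0.3]])

cov_unbiased = covariance_sketch(data, minus_one=True)
cov_biased = covariance_sketch(data, minus_one=False)

# Cross-check against NumPy's own estimator.
print(np.allclose(cov_unbiased, np.cov(data, rowvar=False, ddof=1)))
print(np.allclose(cov_biased, np.cov(data, rowvar=False, ddof=0)))
```

Under these assumptions the two results differ only by the constant factor (n − 1)/n.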

catsim.stats.scatter_matrix(data: numpy.ndarray) → numpy.ndarray

Calculates the scatter matrix of a data matrix:

$S=\sum_{j=1}^{n}(\mathbf{x}_{j}-\overline{\mathbf{x}})(\mathbf{x}_{j}-\overline{\mathbf{x}})^{T}=\sum_{j=1}^{n}(\mathbf{x}_{j}-\overline{\mathbf{x}})\otimes(\mathbf{x}_{j}-\overline{\mathbf{x}})=\left(\sum_{j=1}^{n}\mathbf{x}_{j}\mathbf{x}_{j}^{T}\right)-n\,\overline{\mathbf{x}}\,\overline{\mathbf{x}}^{T}$

Parameters: data (ndarray) – the data matrix

Returns: the scatter matrix of the given data matrix

Return type: ndarray
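
The first and last forms of the identity above can be checked numerically. The sketch below is illustrative only: `scatter_matrix_sketch` is a hypothetical name, and each row of `data` is assumed to be one observation $\mathbf{x}_{j}$.

```python
import numpy as np


def scatter_matrix_sketch(data):
    """Sum of outer products of the mean-centered observations.

    Hypothetical helper; assumes each row of data is one observation.
    """
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    # centered.T @ centered equals the sum over j of the outer products.
    return centered.T @ centered


data = np.array([[2.0, 1.0],
                 [4.0, 3.0],
                 [6.0, 2.0],
                 [8.0, 5.0]])

S = scatter_matrix_sketch(data)

# Last form of the identity: sum of x_j x_j^T minus n * mean mean^T.
n = data.shape[0]
mean = data.mean(axis=0)
S_alt = data.T @ data - n * np.outer(mean, mean)

print(np.allclose(S, S_alt))
```

Dividing `S` by n − 1 gives the unbiased sample covariance matrix, so the scatter matrix can be read as an unnormalized covariance.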