Skip to content

Files

Latest commit

6867f86 · May 30, 2023

History

History
72 lines (56 loc) · 3.3 KB
·

File metadata and controls

72 lines (56 loc) · 3.3 KB
·

Statistical Methods Library 12.0.0

Release date: 2023-05-30

Synopsis

This release brings a fundamental rewrite of the imputation API as well as more complete (and differently formatted) API documentation. Support for Mean of Ratios Imputation has also been added. In addition, the library packaging has been altered to allow the overall statistical methods library, as well as subpackages, to be imported more easily.

Changes

Imputation

The imputation API has been changed to use a generic engine and ratio calculators. Ratio of Means has been rewritten as a ratio calculator with the construction calculation separated out into its own ratio calculator. In addition Mean of Ratios has been added as a ratio calculator. This new API also means that when passing forward and backward ratio (a.k.a. link) columns the method can perform construction ratio calculations and vice versa.

Individual ratio calculators can now output additional columns. In addition, filtering now outputs marker columns to reflect how responses are used in ratio calculations.

To remove confusion, the behaviour of observation count columns has changed. Instead of null and zero being used to differentiate reasons for ratio defaulting, the columns always reflect the actual number. A per ratio marker column is used to indicate whether ratios are defaulted.

Weighting functionality is now available for all ratio calculator types. In order to weight current ratios, back data for weighting now needs to contain unweighted ratios for corresponding previous periods.

Input validation previously required that a single identifier appear once in each period. This requirement has been changed to allow an identifier to appear once per grouping and period combination. This permits imputing multiple independent variables at once by using the grouping column to reflect both imputation grouping and variable.

General

The API documentation for other modules has been either added or edited (mostly reformatted). In addition the period calculation functions which were previously part of the imputation engine have been moved into a utility module and all categories of method are now subpackages with imports into the top level package so that one can import the statistical methods library and then reference individual parts without having to import them separately if required. This also allows easier examination of the documentation from within Python (e.g. via the help() function in the repl).

Notes

Imputation

The new ratio calculator API and the corresponding separation of forward and backward ratio calculation from construction ratio calculation do not allow the implicit disabling of usage of either type of ratio during imputation. This is to ensure that the output is fully imputed.

Although the API is incompatible and there are new output columns, the only existing output columns that have changed in terms of output values are the observation counts. This means that existing data can be used as back data as the observation count columns are not used for this purpose. In addition, although periods containing unweighted ratios may be required for weighting, it is still the case that the only target variable values used from back data are those from the period preceding the first period to be imputed with respect to the specified periodicity.