# ab.impute

Layers that impute missing data.

class aboleth.impute.ExtraCategoryImpute(datalayer, masklayer, ncategory_list)

Impute missing values from categorical data with an extra category.

Given categorical data, a missing-value mask, and the number of categories for each feature (last dimension), this layer assigns each missing value to an extra category whose index equals the number of categories for that feature. For example, with 2 categories (0 and 1), missing data is assigned the value 2.

Parameters:

- datalayer (callable) – A layer that returns a data tensor. Must be an InputLayer.
- masklayer (callable) – A layer that returns a boolean mask tensor where True values are masked. Must be an InputLayer.
- ncategory_list (list) – A list that provides the total number of categories for each feature (last dimension) of the input. The length of the list must equal the size of the last dimension of the data tensor.

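
The extra-category rule can be illustrated outside the library with a minimal pure-Python sketch. The function name `extra_category_impute` and the nested-list data representation are assumptions made for illustration; they are not part of aboleth, which operates on tensors.

```python
def extra_category_impute(data, mask, ncategory_list):
    """Replace masked entries with the extra category index of their column.

    Hypothetical helper for illustration only, not aboleth's implementation.
    """
    if any(len(row) != len(ncategory_list) for row in data):
        raise ValueError("ncategory_list needs one entry per feature")
    return [
        [ncat if missing else value
         for value, missing, ncat in zip(row, mask_row, ncategory_list)]
        for row, mask_row in zip(data, mask)
    ]


# Two features: the first has 2 categories (0, 1), the second has 3 (0, 1, 2).
data = [[0, 2], [1, 0], [1, 1]]
mask = [[False, True], [True, False], [False, False]]  # True marks missing

imputed = extra_category_impute(data, mask, ncategory_list=[2, 3])
print(imputed)
```

Because the fill value is the category count itself, it can never collide with a valid category index for that feature.
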
__call__(**kwargs)

Construct the subgraph for this layer.

Parameters:

- **kwargs – the inputs to this layer (Tensors)

Returns:

- Net (Tensor) – the output of this layer
- KL (float, Tensor) – the regularizer/Kullback-Leibler ‘cost’ of the parameters in this layer

class aboleth.impute.FixedNormalImpute(datalayer, masklayer, loc, scale)

Impute the missing values using marginal Gaussians over each column.

Takes two layers, one that returns a data tensor and another that returns a mask tensor. Creates a layer that returns a tensor in which the masked values have been imputed with random draws from the marginal Gaussians.

Parameters:

- datalayer (callable) – A layer that returns a data tensor. Must be of form f(**kwargs).
- masklayer (callable) – A layer that returns a boolean mask tensor where True values are masked. Must be of form f(**kwargs).
- loc (array-like) – A list of the global mean values of each data column.
- scale (array-like) – A list of the global standard deviations of each data column.

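
As a rough illustration of what imputing from fixed marginal Gaussians means, the sketch below fills each masked entry in column j with a draw from a normal distribution with mean `loc[j]` and standard deviation `scale[j]`. The helper `fixed_normal_impute`, its `rng` argument, and the nested-list data are hypothetical and are not aboleth's API.

```python
import random


def fixed_normal_impute(data, mask, loc, scale, rng=None):
    """Fill masked entries in column j with draws from N(loc[j], scale[j]**2).

    Hypothetical helper for illustration only, not aboleth's implementation.
    """
    if rng is None:
        rng = random.Random()
    return [
        [rng.gauss(mu, sigma) if missing else value
         for value, missing, mu, sigma in zip(row, mask_row, loc, scale)]
        for row, mask_row in zip(data, mask)
    ]


data = [[1.0, 0.0], [0.0, 4.0], [3.0, 5.0]]
mask = [[False, True], [True, False], [False, False]]  # True marks missing

# A seeded generator keeps the draws reproducible in this example.
imputed = fixed_normal_impute(data, mask, loc=[2.0, 4.5], scale=[1.0, 0.5],
                              rng=random.Random(0))
print(imputed)
```

Observed entries pass through unchanged; only the masked positions are resampled, so repeated calls give different fills for the same input.
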
__call__(**kwargs)

Construct the subgraph for this layer.

Parameters:

- **kwargs – the inputs to this layer (Tensors)

Returns:

- Net (Tensor) – the output of this layer
- KL (float, Tensor) – the regularizer/Kullback-Leibler ‘cost’ of the parameters in this layer

class aboleth.impute.ImputeColumnWise(datalayer, masklayer)

Abstract class for imputing column-wise from a vector or scalar.

This class implements _impute2D, which calls the _impute_columns method. _impute_columns returns a vector or scalar that is used to impute X column-wise (as opposed to element-wise). Subclasses must supply the _impute_columns method.
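
The division of labour described here resembles a template-method pattern, which the following pure-Python sketch illustrates. The classes `ColumnWiseImputer` and `ConstantImputer` and the public `impute` method are hypothetical stand-ins that work on nested lists; only the name `_impute_columns` is taken from the text above.

```python
from abc import ABC, abstractmethod


class ColumnWiseImputer(ABC):
    """Hypothetical stand-in for the pattern; not aboleth's class."""

    @abstractmethod
    def _impute_columns(self, data, mask):
        """Return one fill value per column, or a single scalar."""

    def impute(self, data, mask):
        fill = self._impute_columns(data, mask)
        # Broadcast a scalar fill value to every column.
        if not isinstance(fill, (list, tuple)):
            fill = [fill] * len(data[0])
        return [
            [f if missing else value
             for value, missing, f in zip(row, mask_row, fill)]
            for row, mask_row in zip(data, mask)
        ]


class ConstantImputer(ColumnWiseImputer):
    """Fills every missing entry with the same scalar."""

    def __init__(self, value):
        self.value = value

    def _impute_columns(self, data, mask):
        return self.value


data = [[1.0, 0.0], [0.0, 4.0]]
mask = [[False, True], [True, False]]  # True marks missing

imputed = ConstantImputer(-1.0).impute(data, mask)
print(imputed)
```

A subclass only decides which values to fill with; the shared base method handles where they go.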

__call__(**kwargs)

Construct the subgraph for this layer.

Parameters:

- **kwargs – the inputs to this layer (Tensors)

Returns:

- Net (Tensor) – the output of this layer
- KL (float, Tensor) – the regularizer/Kullback-Leibler ‘cost’ of the parameters in this layer

class aboleth.impute.ImputeOp(datalayer, masklayer)

Abstract base class for impute operations. These specialise MultiLayers.

They expect a data InputLayer and a mask InputLayer. They return layers in which the masked values have been imputed.

Parameters:

- datalayer (callable) – A layer that returns a data tensor. Must be of form f(**kwargs).
- masklayer (callable) – A layer that returns a boolean mask tensor where True values are masked. Must be of form f(**kwargs).

__call__(**kwargs)

Construct the subgraph for this layer.

Parameters:

- **kwargs – the inputs to this layer (Tensors)

Returns:

- Net (Tensor) – the output of this layer
- KL (float, Tensor) – the regularizer/Kullback-Leibler ‘cost’ of the parameters in this layer

class aboleth.impute.LearnedNormalImpute(datalayer, masklayer)

Impute the missing values with draws from learned normal distributions.

Takes two layers, one that returns a data tensor and another that returns a mask tensor. Creates a layer that learns marginal Gaussian parameters per column and fills in missing values using draws from these Gaussians.

Parameters:

- datalayer (callable) – A layer that returns a data tensor. Must be an InputLayer.
- masklayer (callable) – A layer that returns a boolean mask tensor where True values are masked. Must be an InputLayer.

__call__(**kwargs)

Construct the subgraph for this layer.

Parameters:

- **kwargs – the inputs to this layer (Tensors)

Returns:

- Net (Tensor) – the output of this layer
- KL (float, Tensor) – the regularizer/Kullback-Leibler ‘cost’ of the parameters in this layer

class aboleth.impute.LearnedScalarImpute(datalayer, masklayer)

Impute the missing values using a learned scalar for each column.

Takes two layers, one that returns a data tensor and another that returns a mask tensor. Creates a layer that returns a tensor in which the masked values have been imputed with a learned scalar value per column.

Parameters:

- datalayer (callable) – A layer that returns a data tensor. Must be an InputLayer.
- masklayer (callable) – A layer that returns a boolean mask tensor where True values are masked. Must be an InputLayer.

__call__(**kwargs)

Construct the subgraph for this layer.

Parameters:

- **kwargs – the inputs to this layer (Tensors)

Returns:

- Net (Tensor) – the output of this layer
- KL (float, Tensor) – the regularizer/Kullback-Leibler ‘cost’ of the parameters in this layer

class aboleth.impute.MaskInputLayer(name)

Create an input layer for a binary mask tensor.

This layer defines input kwargs so that a user may easily provide the right binary mask inputs to a complex set of layers to enable imputation.

Parameters: name (string) – The name of the input. Used as the argument name for input into the net.

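
The idea of routing inputs by keyword name can be sketched in plain Python. The class `NamedInput` below is a hypothetical stand-in, not aboleth's implementation; returning 0.0 as the second value is an assumption based on the (Net, KL) pair documented for `__call__`, on the reasoning that an input layer has no parameters to regularize.

```python
class NamedInput:
    """Hypothetical stand-in for a named input layer; not aboleth's class."""

    def __init__(self, name):
        self.name = name

    def __call__(self, **kwargs):
        # Pick this layer's input out of the keyword arguments by name.
        # The 0.0 is an assumed placeholder for the regularization cost.
        return kwargs[self.name], 0.0


data_input = NamedInput("X")
mask_input = NamedInput("M")

# Every layer receives the same keyword arguments and selects its own.
feed = {"X": [[1.0, 2.0]], "M": [[False, True]]}
data, data_kl = data_input(**feed)
mask, mask_kl = mask_input(**feed)
print(data, mask)
```

With this convention, a data layer and a mask layer can be handed to an impute layer separately and still be fed from a single call.
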
__call__(**kwargs)

Construct the subgraph for this layer.

Parameters:

- **kwargs – the inputs to this layer (Tensors)

Returns:

- Net (Tensor) – the output of this layer
- KL (float, Tensor) – the regularizer/Kullback-Leibler ‘cost’ of the parameters in this layer

class aboleth.impute.MeanImpute(datalayer, masklayer)

Impute the missing values using the stochastic mean of their column.

Takes two layers, one that returns a data tensor and another that returns a mask tensor. Creates a layer that returns a tensor in which the masked values have been imputed with the column means calculated from the batch.

Parameters:

- datalayer (callable) – A layer that returns a data tensor. Must be of form f(**kwargs).
- masklayer (callable) – A layer that returns a boolean mask tensor where True values are masked. Must be of form f(**kwargs).

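
A minimal pure-Python sketch of batch mean imputation follows. It assumes the column means are taken over the observed (unmasked) entries of the batch only, and it falls back to 0.0 for a fully masked column; both choices, along with the helper name `mean_impute`, are assumptions for illustration and not a statement of how aboleth behaves.

```python
def mean_impute(data, mask):
    """Fill masked entries with the mean of the observed values in their column.

    Hypothetical helper for illustration only, not aboleth's implementation.
    """
    means = []
    for j in range(len(data[0])):
        observed = [row[j] for row, mask_row in zip(data, mask)
                    if not mask_row[j]]
        # Assumed fallback: a fully masked column has no observed mean.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [mean if missing else value
         for value, missing, mean in zip(row, mask_row, means)]
        for row, mask_row in zip(data, mask)
    ]


data = [[1.0, 0.0], [0.0, 4.0], [3.0, 6.0]]
mask = [[False, True], [True, False], [False, False]]  # True marks missing

imputed = mean_impute(data, mask)
print(imputed)
```

Because the means come from the current batch, the fill values change from batch to batch, which is why the mean is described as stochastic.
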
__call__(**kwargs)

Construct the subgraph for this layer.

Parameters:

- **kwargs – the inputs to this layer (Tensors)

Returns:

- Net (Tensor) – the output of this layer
- KL (float, Tensor) – the regularizer/Kullback-Leibler ‘cost’ of the parameters in this layer