# ab.layers

Network layers and utilities.

class aboleth.layers.Activation(h=<function Activation.<lambda>>)

Activation function layer.

Parameters: h (callable) – the element-wise activation function.
__call__(X)

Construct the subgraph for this layer.

Parameters:
- X (Tensor) – the input to this layer.

Returns:
- Net (Tensor) – the output of this layer.
- KL (float, Tensor) – the regularizer/Kullback-Leibler ‘cost’ of the parameters in this layer.

class aboleth.layers.DenseMAP(output_dim, l1_reg=1.0, l2_reg=1.0, use_bias=True)

Dense (fully connected) linear layer, with MAP inference.

Parameters:
- output_dim (int) – the dimension of the output of this layer.
- l1_reg (float) – the value of the l1 weight regularizer, $$\text{l1_reg} \times \|\mathbf{W}\|_1$$
- l2_reg (float) – the value of the l2 weight regularizer, $$\frac{1}{2} \text{l2_reg} \times \|\mathbf{W}\|^2_2$$
- use_bias (bool) – if true, also learn a bias weight, i.e. a constant offset.

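
The two regularizer terms above can be evaluated by hand as a sanity check. The following is a minimal sketch, not aboleth code: the helper name `map_penalty` is hypothetical, the weights are a plain nested list, the 1-norm is assumed to be the entry-wise sum of absolute values, and the two terms are assumed to be simply added together.

```python
def map_penalty(W, l1_reg=1.0, l2_reg=1.0):
    """Return l1_reg * ||W||_1 + 0.5 * l2_reg * ||W||_2^2 for a nested-list W.

    Assumption: both norms are taken entry-wise over all weights.
    """
    flat = [w for row in W for w in row]
    l1_norm = sum(abs(w) for w in flat)      # ||W||_1
    l2_norm_sq = sum(w * w for w in flat)    # ||W||_2^2
    return l1_reg * l1_norm + 0.5 * l2_reg * l2_norm_sq


# A (2, 2) weight matrix: input_dim = 2, output_dim = 2.
W = [[1.0, -2.0], [0.5, 0.0]]
penalty = map_penalty(W, l1_reg=0.1, l2_reg=0.01)
```

Setting either regularizer to zero removes the corresponding term, so `l1_reg=0.0` leaves a pure ridge-style penalty.
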
class aboleth.layers.DenseVariational(output_dim, var=1.0, full=False, use_bias=True, prior_W=None, prior_b=None, post_W=None, post_b=None)

Dense (fully connected) linear layer, with variational inference.

Parameters:
- output_dim (int) – the dimension of the output of this layer.
- var (float) – the initial value of the weight prior variance, where the prior defaults to $$\mathbf{W} \sim \mathcal{N}(\mathbf{0}, \text{var} \mathbf{I})$$. This value is optimized (à la type-II maximum likelihood).
- full (bool) – if true, use a full covariance Gaussian posterior for each of the output weight columns, otherwise use an independent (diagonal) Normal posterior.
- use_bias (bool) – if true, also learn a bias weight, i.e. a constant offset.
- prior_W (distributions.Normal, distributions.Gaussian, optional) – the prior distribution object to use on the layer weights. It must have parameters compatible with (input_dim, output_dim) shaped weights. This ignores the var parameter.
- prior_b (distributions.Normal, distributions.Gaussian, optional) – the prior distribution object to use on the layer intercept. It must have parameters compatible with (output_dim,) shaped weights. This ignores the var and use_bias parameters.
- post_W (distributions.Normal, distributions.Gaussian, optional) – the posterior distribution object to use on the layer weights. It must have parameters compatible with (input_dim, output_dim) shaped weights. This ignores the full parameter. See also distributions.gaus_posterior.
- post_b (distributions.Normal, distributions.Gaussian, optional) – the posterior distribution object to use on the layer intercept. It must have parameters compatible with (output_dim,) shaped weights. This ignores the use_bias parameter. See also distributions.norm_posterior.

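
To give a feel for the KL ‘cost’ such a layer contributes when full is false, the sketch below evaluates the standard closed-form KL divergence between an independent Normal posterior and the default zero-mean prior with variance var. It is an illustration under stated assumptions, not aboleth's implementation: the function names `kl_normal` and `layer_kl` are hypothetical, and the posterior means and variances are plain nested lists.

```python
import math


def kl_normal(mu, s2, var=1.0):
    """KL[N(mu, s2) || N(0, var)] for a single scalar weight."""
    return 0.5 * (math.log(var / s2) + (s2 + mu * mu) / var - 1.0)


def layer_kl(mus, s2s, var=1.0):
    """Sum the per-weight KL over (input_dim, output_dim) nested lists."""
    return sum(
        kl_normal(m, s, var)
        for row_m, row_s in zip(mus, s2s)
        for m, s in zip(row_m, row_s)
    )


# Posterior means and variances for a (2, 2) weight matrix.
mus = [[0.0, 1.0], [0.0, 1.0]]
s2s = [[1.0, 1.0], [1.0, 1.0]]
total_kl = layer_kl(mus, s2s, var=1.0)
```

The cost is zero when the posterior matches the prior exactly and grows as the posterior mean moves away from zero or its variance departs from var.
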
class aboleth.layers.DropOut(keep_prob)

Dropout layer. Each input is kept (that is, not set to zero) with Bernoulli probability keep_prob.

This is just a thin wrapper around tf.nn.dropout.

Parameters:
- keep_prob (float, Tensor) – the probability of keeping an input. See tf.nn.dropout.

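
The meaning of keep_prob can be illustrated without TensorFlow. The sketch below is a pure-Python stand-in (the `dropout` helper is hypothetical) that keeps each element with probability keep_prob and rescales survivors by 1 / keep_prob, as TensorFlow's dropout op does, so that the expected output equals the input.

```python
import random


def dropout(x, keep_prob, rng):
    """Keep each element with probability keep_prob, scaled by 1 / keep_prob."""
    return [xi / keep_prob if rng.random() < keep_prob else 0.0 for xi in x]


rng = random.Random(0)          # seeded so the run is repeatable
x = [1.0] * 10000
y = dropout(x, keep_prob=0.8, rng=rng)

kept_fraction = sum(1 for yi in y if yi != 0.0) / len(y)   # close to 0.8
mean_output = sum(y) / len(y)                              # close to 1.0
```
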
class aboleth.layers.EmbedVariational(output_dim, n_categories, var=1.0, full=False, prior_W=None, post_W=None)

Dense (fully connected) embedding layer, with variational inference.

This layer works directly on shape (N, 1) inputs of category indices rather than one-hot representations, for efficiency.

Parameters:
- output_dim (int) – the dimension of the output (embedding) of this layer.
- n_categories (int) – the number of categories in the input variable.
- var (float) – the initial value of the weight prior variance, where the prior defaults to $$\mathbf{W} \sim \mathcal{N}(\mathbf{0}, \text{var} \mathbf{I})$$. This value is optimized (à la type-II maximum likelihood).
- full (bool) – if true, use a full covariance Gaussian posterior for each of the output weight columns, otherwise use an independent (diagonal) Normal posterior.
- prior_W (distributions.Normal, distributions.Gaussian, optional) – the prior distribution object to use on the layer weights. It must have parameters compatible with (input_dim, output_dim) shaped weights. This ignores the var parameter.
- post_W (distributions.Normal, distributions.Gaussian, optional) – the posterior distribution object to use on the layer weights. It must have parameters compatible with (input_dim, output_dim) shaped weights. This ignores the full parameter. See also distributions.gaus_posterior.

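
The efficiency argument for index inputs can be seen in a small sketch: selecting row k of the weight matrix gives the same result as multiplying a one-hot vector by that matrix, without building the one-hot vector. This is an illustration only. Both helper names are hypothetical, and the weights here are fixed numbers, whereas the layer itself draws them from a variational posterior.

```python
def embed_lookup(indices, W):
    """Select row W[k] for each category index k; indices has shape (N, 1)."""
    return [list(W[row[0]]) for row in indices]


def embed_one_hot(indices, W):
    """Same result via an explicit one-hot encoding and a matrix product."""
    n_categories, output_dim = len(W), len(W[0])
    out = []
    for row in indices:
        one_hot = [1.0 if k == row[0] else 0.0 for k in range(n_categories)]
        out.append(
            [sum(o * W[k][j] for k, o in enumerate(one_hot)) for j in range(output_dim)]
        )
    return out


W = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]   # n_categories = 3, output_dim = 2
X = [[2], [0], [2]]                        # shape (N, 1) category indices
embedded = embed_lookup(X, W)
```
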
class aboleth.layers.InputLayer(name, n_samples=None)

Create an input layer.

This layer defines input kwargs so that a user may easily provide the right inputs to a complex set of layers. It takes a 2D tensor of shape (N, D). If n_samples is specified, the input is tiled along a new first axis, creating an (n_samples, N, D) tensor for propagating samples through a variational deep net.

Parameters:
- name (string) – the name of the input. Used as the argument for input into the net.
- n_samples (int > 0) – the number of samples.

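
The tiling behaviour can be pictured with nested lists standing in for tensors. This is a minimal sketch of the shape change only, with hypothetical helper names `tile_samples` and `shape`; it does not use aboleth or TensorFlow.

```python
def tile_samples(X, n_samples):
    """Copy an (N, D) nested list n_samples times along a new first axis."""
    return [[list(row) for row in X] for _ in range(n_samples)]


def shape(a):
    """Return the shape of a regular nested list as a tuple."""
    dims = []
    while isinstance(a, list):
        dims.append(len(a))
        a = a[0]
    return tuple(dims)


X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # N = 2, D = 3
Xs = tile_samples(X, n_samples=4)        # shape (4, 2, 3)
```

Every slice `Xs[i]` is an identical copy of `X`; the samples only start to differ once they pass through a stochastic layer.
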
class aboleth.layers.MaxPool2D(pool_size, strides, padding='SAME')

Max pooling layer for 2D inputs (e.g. images).

This is just a thin wrapper around tf.nn.max_pool.

Parameters:
- pool_size (tuple or list of 2 ints) – the height and width of the pooling window.
- strides (tuple or list of 2 ints) – the strides of the pooling operation along the height and width.
- padding (str) – the type of padding, one of ‘SAME’ or ‘VALID’. Defaults to ‘SAME’.

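
The choice of padding determines the spatial size of the pooled output. The sketch below applies TensorFlow's documented output-size rules for ‘SAME’ and ‘VALID’ padding along one spatial dimension; the helper name `pooled_size` is hypothetical and the function only computes sizes, not the pooling itself.

```python
import math


def pooled_size(in_size, pool, stride, padding="SAME"):
    """Output length along one spatial dimension after pooling."""
    if padding == "SAME":
        # Input is padded so that every position is covered.
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # No padding: windows that would run off the edge are dropped.
        return math.ceil((in_size - pool + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


same_out = pooled_size(7, pool=2, stride=2, padding="SAME")
valid_out = pooled_size(7, pool=2, stride=2, padding="VALID")
```

For an even input size such as 28 with a 2 by 2 window and stride 2, both settings give 14; they only differ when the window does not tile the input exactly.
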
class aboleth.layers.RandomArcCosine(n_features, lenscale=1.0, p=1, variational=False, lenscale_posterior=None)

Random arc-cosine kernel layer.

NOTE: This should be followed by a dense layer to properly implement a kernel approximation.

Parameters:
- n_features (int) – the number of unique random features; the actual output dimension of this layer will be 2 * n_features.
- lenscale (float, ndarray, Tensor) – the length scales of the arc-cosine kernel. This can be a scalar for an isotropic kernel, or a vector for an automatic relevance determination (ARD) kernel.
- p (int) – the order of the arc-cosine kernel. This must be an integer greater than or equal to zero. 0 will lead to sigmoid-like kernels, 1 will lead to relu-like kernels, 2 to quadratic-relu kernels, etc.
- variational (bool) – use variational features instead of random features (i.e. VAR-FIXED in [2]).
- lenscale_posterior (float, ndarray, optional) – the initial value for the posterior length scale. This is only used if variational==True. This can be a scalar or a vector (a different initial value per input dimension). If this is left as None, it will be set to sqrt(1 / input_dim) (this is similar to the ‘auto’ setting for a scikit-learn SVM with an RBF kernel).

[1] Cho, Youngmin, and Lawrence K. Saul. “Analysis and extension of arc-cosine kernels for large margin classification.” arXiv preprint arXiv:1112.3712 (2011).
[2] Cutajar, K., Bonilla, E., Michiardi, P., Filippone, M. “Random Feature Expansions for Deep Gaussian Processes.” In ICML, 2017.

class aboleth.layers.RandomFourier(n_features, kernel)

Random Fourier feature (RFF) kernel approximation layer.

NOTE: This should be followed by a dense layer to properly implement a kernel approximation.

Parameters:
- n_features (int) – the number of unique random features; the actual output dimension of this layer will be 2 * n_features.
- kernel (kernels.ShiftInvariant) – the kernel object that yields the random samples from the Fourier spectrum of a particular kernel to approximate. See the ab.kernels module.

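
The reason the output dimension is 2 * n_features is that each random projection contributes both a cosine and a sine feature. The sketch below shows the standard random Fourier feature construction for an RBF kernel in pure Python. It is an illustration under assumptions, not aboleth's code: the helper `rff_features`, the 1 / sqrt(n_features) normalization, and the choice of an RBF kernel with length scale `lenscale` are all assumed here.

```python
import math
import random


def rff_features(X, W):
    """Map (N, D) inputs to (N, 2 * n_features): [cos(XW), sin(XW)] / sqrt(n_features)."""
    n_features = len(W[0])
    scale = 1.0 / math.sqrt(n_features)
    out = []
    for x in X:
        proj = [sum(xd * W[d][j] for d, xd in enumerate(x)) for j in range(n_features)]
        out.append([scale * math.cos(p) for p in proj] + [scale * math.sin(p) for p in proj])
    return out


rng = random.Random(0)
D, n_features, lenscale = 3, 2000, 1.0

# Samples from the RBF kernel's Fourier spectrum: N(0, 1 / lenscale^2).
W = [[rng.gauss(0.0, 1.0 / lenscale) for _ in range(n_features)] for _ in range(D)]

x, y = [0.1, -0.2, 0.3], [0.0, 0.4, -0.1]
phi_x, phi_y = rff_features([x, y], W)

# The inner product of the features approximates the exact RBF kernel value.
approx = sum(a * b for a, b in zip(phi_x, phi_y))
exact = math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)) / lenscale ** 2)
```

A following dense layer then learns a linear model on these features, which is what makes the pair of layers behave like an approximate kernel machine.
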
class aboleth.layers.Reshape(target_shape)

Reshape layer.

Reshape and output a tensor to a specified shape.

Parameters:
- target_shape (tuple of ints) – the shape to reshape the input to. Does not include the samples or batch axes.

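
Because target_shape leaves out the samples and batch axes, the full output shape is the two leading axes of the input followed by target_shape. The sketch below computes that shape; it assumes the input has shape (n_samples, N, ...) and the helper name `reshaped_shape` is hypothetical.

```python
from math import prod


def reshaped_shape(input_shape, target_shape):
    """Keep the leading (n_samples, N) axes and reshape the rest to target_shape."""
    leading, trailing = input_shape[:2], input_shape[2:]
    if prod(trailing) != prod(target_shape):
        raise ValueError("target_shape must hold the same number of elements")
    return tuple(leading) + tuple(target_shape)


# 5 samples of a batch of 100 flattened 28 x 28 single-channel images.
out_shape = reshaped_shape((5, 100, 784), (28, 28, 1))
```
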
class aboleth.layers.SampleLayer

Sample Layer base class.

This is the base class for layers that build upon stochastic (variational) nets. These expect rank >= 3 input Tensors, where the first dimension indexes the random samples of the stochastic net.

class aboleth.layers.SampleLayer3

Special case of SampleLayer restricted to rank == 3 input Tensors.

