Convolutional layers¶

class lasagne.layers.Conv1DLayer(incoming, num_filters, filter_size, stride=1, pad=0, untie_biases=False, W=lasagne.init.GlorotUniform(), b=lasagne.init.Constant(0.), nonlinearity=lasagne.nonlinearities.rectify, flip_filters=True, convolution=lasagne.theano_extensions.conv.conv1d_mc0, **kwargs)[source]¶

1D convolutional layer

Performs a 1D convolution on its input and optionally adds a bias and applies an elementwise nonlinearity.

Parameters:

incoming : a Layer instance or a tuple

The layer feeding into this layer, or the expected input shape. The output of this layer should be a 3D tensor, with shape (batch_size, num_input_channels, input_length).

num_filters : int

The number of learnable convolutional filters this layer has.

filter_size : int or iterable of int

An integer or a 1-element tuple specifying the size of the filters.

stride : int or iterable of int

An integer or a 1-element tuple specifying the stride of the convolution operation.

pad : int, iterable of int, ‘full’, ‘same’ or ‘valid’ (default: 0)

By default, the convolution is only computed where the input and the filter fully overlap (a valid convolution). When stride=1, this yields an output that is smaller than the input by filter_size - 1. The pad argument allows you to implicitly pad the input with zeros, extending the output size.

An integer or a 1-element tuple results in symmetric zero-padding of the given size on both borders.

'full' pads with one less than the filter size on both sides. This is equivalent to computing the convolution wherever the input and the filter overlap by at least one position.

'same' pads with half the filter size (rounded down) on both sides. When stride=1 this results in an output size equal to the input size. Even filter size is not supported.

'valid' is an alias for 0 (no padding / a valid convolution).

untie_biases : bool (default: False)

If False, the layer will have a bias parameter for each channel, which is shared across all positions in this channel. As a result, the b attribute will be a vector (1D).

If True, the layer will have separate bias parameters for each position in each channel. As a result, the b attribute will be a matrix (2D).

W : Theano shared variable, expression, numpy array or callable

Initial value, expression or initializer for the weights. These should be a 3D tensor with shape (num_filters, num_input_channels, filter_length). See lasagne.utils.create_param() for more information.

b : Theano shared variable, expression, numpy array, callable or None

Initial value, expression or initializer for the biases. If set to None, the layer will have no biases. Otherwise, biases should be a 1D array with shape (num_filters,) if untied_biases is set to False. If it is set to True, its shape should be (num_filters, input_length) instead. See lasagne.utils.create_param() for more information.

nonlinearity : callable or None

The nonlinearity that is applied to the layer activations. If None is provided, the layer will be linear.

flip_filters : bool (default: True)

Whether to flip the filters before sliding them over the input, performing a convolution (this is the default), or not to flip them and perform a correlation. Note that for some other convolutional layers in Lasagne, flipping incurs an overhead and is disabled by default – check the documentation when using learned weights from another layer.

num_groups : int (default: 1)

The number of groups to split the input channels and output channels into, such that data does not cross the group boundaries. Requires the number of channels to be divisible by the number of groups, and requires Theano 0.10 or later for more than one group.

convolution : callable

The convolution implementation to use. The lasagne.theano_extensions.conv module provides some alternative implementations for 1D convolutions, because the Theano API only features a 2D convolution implementation. Usually it should be fine to leave this at the default value. Note that not all implementations support all settings for pad, subsample and num_groups.

**kwargs

Any additional keyword arguments are passed to the Layer superclass.

Attributes:

W : Theano shared variable or expression: Variable or expression representing the filter weights.
b : Theano shared variable or expression: Variable or expression representing the biases.

class lasagne.layers.Conv2DLayer(incoming, num_filters, filter_size, stride=(1, 1), pad=0, untie_biases=False, W=lasagne.init.GlorotUniform(), b=lasagne.init.Constant(0.), nonlinearity=lasagne.nonlinearities.rectify, flip_filters=True, convolution=theano.tensor.nnet.conv2d, **kwargs)[source]¶

2D convolutional layer

Performs a 2D convolution on its input and optionally adds a bias and applies an elementwise nonlinearity.

Parameters:

incoming : a Layer instance or a tuple

The layer feeding into this layer, or the expected input shape. The output of this layer should be a 4D tensor, with shape (batch_size, num_input_channels, input_rows, input_columns).

num_filters : int

The number of learnable convolutional filters this layer has.

filter_size : int or iterable of int

An integer or a 2-element tuple specifying the size of the filters.

stride : int or iterable of int

An integer or a 2-element tuple specifying the stride of the convolution operation.

pad : int, iterable of int, ‘full’, ‘same’ or ‘valid’ (default: 0)

By default, the convolution is only computed where the input and the filter fully overlap (a valid convolution). When stride=1, this yields an output that is smaller than the input by filter_size - 1. The pad argument allows you to implicitly pad the input with zeros, extending the output size.

A single integer results in symmetric zero-padding of the given size on all borders, a tuple of two integers allows different symmetric padding per dimension.

'full' pads with one less than the filter size on both sides. This is equivalent to computing the convolution wherever the input and the filter overlap by at least one position.

'same' pads with half the filter size (rounded down) on both sides. When stride=1 this results in an output size equal to the input size. Even filter size is not supported.

'valid' is an alias for 0 (no padding / a valid convolution).

Note that 'full' and 'same' can be faster than equivalent integer values due to optimizations by Theano.

untie_biases : bool (default: False)

If False, the layer will have a bias parameter for each channel, which is shared across all positions in this channel. As a result, the b attribute will be a vector (1D).

If True, the layer will have separate bias parameters for each position in each channel. As a result, the b attribute will be a 3D tensor.

W : Theano shared variable, expression, numpy array or callable

Initial value, expression or initializer for the weights. These should be a 4D tensor with shape (num_filters, num_input_channels, filter_rows, filter_columns). See lasagne.utils.create_param() for more information.

b : Theano shared variable, expression, numpy array, callable or None

Initial value, expression or initializer for the biases. If set to None, the layer will have no biases. Otherwise, biases should be a 1D array with shape (num_filters,) if untied_biases is set to False. If it is set to True, its shape should be (num_filters, output_rows, output_columns) instead. See lasagne.utils.create_param() for more information.

nonlinearity : callable or None

The nonlinearity that is applied to the layer activations. If None is provided, the layer will be linear.

flip_filters : bool (default: True)

Whether to flip the filters before sliding them over the input, performing a convolution (this is the default), or not to flip them and perform a correlation. Note that for some other convolutional layers in Lasagne, flipping incurs an overhead and is disabled by default – check the documentation when using learned weights from another layer.

num_groups : int (default: 1)

The number of groups to split the input channels and output channels into, such that data does not cross the group boundaries. Requires the number of channels to be divisible by the number of groups, and requires Theano 0.10 or later for more than one group.

convolution : callable

The convolution implementation to use. Usually it should be fine to leave this at the default value.

**kwargs

Any additional keyword arguments are passed to the Layer superclass.

Attributes:

W : Theano shared variable or expression: Variable or expression representing the filter weights.
b : Theano shared variable or expression: Variable or expression representing the biases.

Note

For experts: Conv2DLayer will create a convolutional layer using T.nnet.conv2d, Theano’s default convolution. On compilation for GPU, Theano replaces this with a cuDNN-based implementation if available, otherwise falls back to a gemm-based implementation. For details on this, please see the Theano convolution documentation.

Lasagne also provides convolutional layers directly enforcing a specific implementation: lasagne.layers.dnn.Conv2DDNNLayer to enforce cuDNN, lasagne.layers.corrmm.Conv2DMMLayer to enforce the gemm-based one, lasagne.layers.cuda_convnet.Conv2DCCLayer for Krizhevsky’s cuda-convnet.

class lasagne.layers.Conv3DLayer(incoming, num_filters, filter_size, stride=(1, 1, 1), pad=0, untie_biases=False, W=lasagne.init.GlorotUniform(), b=lasagne.init.Constant(0.), nonlinearity=lasagne.nonlinearities.rectify, flip_filters=True, convolution=theano.tensor.nnet.conv3d, **kwargs)[source]¶

3D convolutional layer

Performs a 3D convolution on its input and optionally adds a bias and applies an elementwise nonlinearity.

Parameters:

incoming : a Layer instance or a tuple

The layer feeding into this layer, or the expected input shape. The output of this layer should be a 5D tensor, with shape (batch_size, num_input_channels, input_depth, input_rows, input_columns).

num_filters : int

The number of learnable convolutional filters this layer has.

filter_size : int or iterable of int

An integer or a 3-element tuple specifying the size of the filters.

stride : int or iterable of int

An integer or a 3-element tuple specifying the stride of the convolution operation.

pad : int, iterable of int, ‘full’, ‘same’ or ‘valid’ (default: 0)

By default, the convolution is only computed where the input and the filter fully overlap (a valid convolution). When stride=1, this yields an output that is smaller than the input by filter_size - 1. The pad argument allows you to implicitly pad the input with zeros, extending the output size.

A single integer results in symmetric zero-padding of the given size on all borders, a tuple of two integers allows different symmetric padding per dimension.

'full' pads with one less than the filter size on both sides. This is equivalent to computing the convolution wherever the input and the filter overlap by at least one position.

'same' pads with half the filter size (rounded down) on both sides. When stride=1 this results in an output size equal to the input size. Even filter size is not supported.

'valid' is an alias for 0 (no padding / a valid convolution).

Note that 'full' and 'same' can be faster than equivalent integer values due to optimizations by Theano.

untie_biases : bool (default: False)

If False, the layer will have a bias parameter for each channel, which is shared across all positions in this channel. As a result, the b attribute will be a vector (1D).

If True, the layer will have separate bias parameters for each position in each channel. As a result, the b attribute will be a 4D tensor.

W : Theano shared variable, expression, numpy array or callable

Initial value, expression or initializer for the weights. These should be a 5D tensor with shape (num_filters, num_input_channels, filter_depth, filter_rows, filter_columns). See lasagne.utils.create_param() for more information.

b : Theano shared variable, expression, numpy array, callable or None

Initial value, expression or initializer for the biases. If set to None, the layer will have no biases. Otherwise, biases should be a 1D array with shape (num_filters,) if untied_biases is set to False. If it is set to True, its shape should be (num_filters, output_depth, output_rows, output_columns) instead. See lasagne.utils.create_param() for more information.

nonlinearity : callable or None

The nonlinearity that is applied to the layer activations. If None is provided, the layer will be linear.

flip_filters : bool (default: True)

Whether to flip the filters before sliding them over the input, performing a convolution (this is the default), or not to flip them and perform a correlation. Note that for some other convolutional layers in Lasagne, flipping incurs an overhead and is disabled by default – check the documentation when using learned weights from another layer.

num_groups : int (default: 1)

The number of groups to split the input channels and output channels into, such that data does not cross the group boundaries. Requires the number of channels to be divisible by the number of groups, and requires Theano 0.10 or later for more than one group.

convolution : callable

The convolution implementation to use. Usually it should be fine to leave this at the default value.

**kwargs

Any additional keyword arguments are passed to the Layer superclass.

Attributes:

W : Theano shared variable or expression: Variable or expression representing the filter weights.
b : Theano shared variable or expression: Variable or expression representing the biases.

class lasagne.layers.TransposedConv2DLayer(incoming, num_filters, filter_size, stride=(1, 1), crop=0, untie_biases=False, W=lasagne.init.GlorotUniform(), b=lasagne.init.Constant(0.), nonlinearity=lasagne.nonlinearities.rectify, flip_filters=False, **kwargs)[source]¶

2D transposed convolution layer

Performs the backward pass of a 2D convolution (also called transposed convolution, fractionally-strided convolution or deconvolution in the literature) on its input and optionally adds a bias and applies an elementwise nonlinearity.

Parameters:

incoming : a Layer instance or a tuple

The layer feeding into this layer, or the expected input shape. The output of this layer should be a 4D tensor, with shape (batch_size, num_input_channels, input_rows, input_columns).

num_filters : int

The number of learnable convolutional filters this layer has.

filter_size : int or iterable of int

An integer or a 2-element tuple specifying the size of the filters.

stride : int or iterable of int

An integer or a 2-element tuple specifying the stride of the transposed convolution operation. For the transposed convolution, this gives the dilation factor for the input – increasing it increases the output size.

crop : int, iterable of int, ‘full’, ‘same’ or ‘valid’ (default: 0)

By default, the transposed convolution is computed where the input and the filter overlap by at least one position (a full convolution). When stride=1, this yields an output that is larger than the input by filter_size - 1. It can be thought of as a valid convolution padded with zeros. The crop argument allows you to decrease the amount of this zero-padding, reducing the output size. It is the counterpart to the pad argument in a non-transposed convolution.

A single integer results in symmetric cropping of the given size on all borders, a tuple of two integers allows different symmetric cropping per dimension.

'full' disables zero-padding. It is is equivalent to computing the convolution wherever the input and the filter fully overlap.

'same' pads with half the filter size (rounded down) on both sides. When stride=1 this results in an output size equal to the input size. Even filter size is not supported.

'valid' is an alias for 0 (no cropping / a full convolution).

Note that 'full' and 'same' can be faster than equivalent integer values due to optimizations by Theano.

untie_biases : bool (default: False)

If False, the layer will have a bias parameter for each channel, which is shared across all positions in this channel. As a result, the b attribute will be a vector (1D).

If True, the layer will have separate bias parameters for each position in each channel. As a result, the b attribute will be a 3D tensor.

W : Theano shared variable, expression, numpy array or callable

Initial value, expression or initializer for the weights. These should be a 4D tensor with shape (num_input_channels, num_filters, filter_rows, filter_columns). Note that the first two dimensions are swapped compared to a non-transposed convolution. See lasagne.utils.create_param() for more information.

b : Theano shared variable, expression, numpy array, callable or None

Initial value, expression or initializer for the biases. If set to None, the layer will have no biases. Otherwise, biases should be a 1D array with shape (num_filters,) if untied_biases is set to False. If it is set to True, its shape should be (num_filters, output_rows, output_columns) instead. See lasagne.utils.create_param() for more information.

nonlinearity : callable or None

The nonlinearity that is applied to the layer activations. If None is provided, the layer will be linear.

flip_filters : bool (default: False)

Whether to flip the filters before sliding them over the input, performing a convolution, or not to flip them and perform a correlation (this is the default). Note that this flag is inverted compared to a non-transposed convolution.

output_size : int or iterable of int or symbolic tuple of ints

The output size of the transposed convolution. Allows to specify which of the possible output shapes to return when stride > 1. If not specified, the smallest shape will be returned.

**kwargs

Any additional keyword arguments are passed to the Layer superclass.

Notes

The transposed convolution is implemented as the backward pass of a corresponding non-transposed convolution. It can be thought of as dilating the input (by adding stride - 1 zeros between adjacent input elements), padding it with filter_size - 1 - crop zeros, and cross-correlating it with the filters. See [1] for more background.

References

[1]	(1, 2) Vincent Dumoulin, Francesco Visin (2016): A guide to convolution arithmetic for deep learning. arXiv. http://arxiv.org/abs/1603.07285, https://github.com/vdumoulin/conv_arithmetic

Examples

To transpose an existing convolution, with tied filter weights:

>>> from lasagne.layers import Conv2DLayer, TransposedConv2DLayer
>>> conv = Conv2DLayer((None, 1, 32, 32), 16, 3, stride=2, pad=2)
>>> deconv = TransposedConv2DLayer(conv, conv.input_shape[1],
...         conv.filter_size, stride=conv.stride, crop=conv.pad,
...         W=conv.W, flip_filters=not conv.flip_filters)

Attributes:	W : Theano shared variable or expression Variable or expression representing the filter weights. b : Theano shared variable or expression Variable or expression representing the biases.

lasagne.layers.Deconv2DLayer[source]¶: alias of TransposedConv2DLayer

class lasagne.layers.DilatedConv2DLayer(incoming, num_filters, filter_size, dilation=(1, 1), pad=0, untie_biases=False, W=lasagne.init.GlorotUniform(), b=lasagne.init.Constant(0.), nonlinearity=lasagne.nonlinearities.rectify, flip_filters=False, **kwargs)[source]¶

2D dilated convolution layer

Performs a 2D convolution with dilated filters, then optionally adds a bias and applies an elementwise nonlinearity.

Parameters:

incoming : a Layer instance or a tuple

The layer feeding into this layer, or the expected input shape. The output of this layer should be a 4D tensor, with shape (batch_size, num_input_channels, input_rows, input_columns).

num_filters : int

The number of learnable convolutional filters this layer has.

filter_size : int or iterable of int

An integer or a 2-element tuple specifying the size of the filters.

dilation : int or iterable of int

An integer or a 2-element tuple specifying the dilation factor of the filters. A factor of \(x\) corresponds to \(x - 1\) zeros inserted between adjacent filter elements.

pad : int, iterable of int, or ‘valid’ (default: 0)

The amount of implicit zero padding of the input. This implementation does not support padding, the argument is provided for compatibility to other convolutional layers only.

untie_biases : bool (default: False)

If False, the layer will have a bias parameter for each channel, which is shared across all positions in this channel. As a result, the b attribute will be a vector (1D).

If True, the layer will have separate bias parameters for each position in each channel. As a result, the b attribute will be a 3D tensor.

W : Theano shared variable, expression, numpy array or callable

Initial value, expression or initializer for the weights. These should be a 4D tensor with shape (num_input_channels, num_filters, filter_rows, filter_columns). Note that the first two dimensions are swapped compared to a non-dilated convolution. See lasagne.utils.create_param() for more information.

b : Theano shared variable, expression, numpy array, callable or None

Initial value, expression or initializer for the biases. If set to None, the layer will have no biases. Otherwise, biases should be a 1D array with shape (num_filters,) if untied_biases is set to False. If it is set to True, its shape should be (num_filters, output_rows, output_columns) instead. See lasagne.utils.create_param() for more information.

nonlinearity : callable or None

The nonlinearity that is applied to the layer activations. If None is provided, the layer will be linear.

flip_filters : bool (default: False)

Whether to flip the filters before sliding them over the input, performing a convolution, or not to flip them and perform a correlation (this is the default). This implementation does not support flipped filters, the argument is provided for compatibility to other convolutional layers only.

**kwargs

Any additional keyword arguments are passed to the Layer superclass.

Notes

The dilated convolution is implemented as the backward pass of a convolution wrt. weights, passing the filters as the output gradient. It can be thought of as dilating the filters (by adding dilation - 1 zeros between adjacent filter elements) and cross-correlating them with the input. See [1] for more background.

References

[1]	(1, 2) Fisher Yu, Vladlen Koltun (2016), Multi-Scale Context Aggregation by Dilated Convolutions. ICLR 2016. http://arxiv.org/abs/1511.07122, https://github.com/fyu/dilation

Attributes:	W : Theano shared variable or expression Variable or expression representing the filter weights. b : Theano shared variable or expression Variable or expression representing the biases.