lasagne.init

Classes to create initializers for parameter variables.

Examples

>>> from lasagne.layers import DenseLayer
>>> from lasagne.init import Constant, GlorotUniform
>>> l1 = DenseLayer((100, 20), num_units=50, W=GlorotUniform(), b=Constant(0.0))
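
Initializers can also be used on their own; calling sample(shape) (documented under Initializer below) returns the initial parameter values directly:

>>> from lasagne.init import GlorotUniform
>>> W = GlorotUniform().sample((100, 50))
>>> W.shape
(100, 50)
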
class lasagne.init.Initializer[source]

Base class for parameter tensor initializers.

The Initializer class represents a weight initializer used to set the initial weight parameters in a neural network layer. It should be subclassed when implementing new types of weight initializers.

sample(shape)[source]

sample should return a numpy array of size shape and data type theano.config.floatX.

Parameters:

shape : tuple or int

Integer or tuple specifying the size of the returned matrix.

Returns:

numpy array

Array of size shape and dtype theano.config.floatX.
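
To implement a new initializer, subclass Initializer and override sample. A minimal sketch (the ScaledIdentity class and its scale parameter are hypothetical, not part of Lasagne):

>>> import numpy as np
>>> from lasagne.init import Initializer
>>> from lasagne.utils import floatX
>>> class ScaledIdentity(Initializer):
...     """Hypothetical initializer: a scaled identity matrix."""
...     def __init__(self, scale=1.0):
...         self.scale = scale
...     def sample(self, shape):
...         # Return an array of the requested shape, cast to theano.config.floatX
...         return floatX(self.scale * np.eye(*shape))
>>> W = ScaledIdentity(scale=0.5).sample((3, 3))
>>> float(W[0, 0])
0.5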

class lasagne.init.Constant(val=0.0)[source]

Initialize weights with a constant value.

Parameters:

val : float

Constant value for weights.
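
For example, sample fills the requested shape with val (a quick numpy check):

>>> import numpy as np
>>> from lasagne.init import Constant
>>> W = Constant(2.0).sample((2, 3))
>>> bool(np.all(W == 2.0))
True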

class lasagne.init.Normal(std=0.01, mean=0.0)[source]

Sample initial weights from the Gaussian distribution.

Initial weight parameters are sampled from N(mean, std).

Parameters:

std : float

Standard deviation of the initial parameters.

mean : float

Mean of the initial parameters.
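
A quick numeric check (a sketch; values are random, so loose tolerances are used):

>>> import numpy as np
>>> from lasagne.init import Normal
>>> W = Normal(std=0.02, mean=1.0).sample((500, 200))
>>> bool(abs(W.mean() - 1.0) < 0.01 and abs(W.std() - 0.02) < 0.01)
True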

class lasagne.init.Uniform(range=0.01, std=None, mean=0.0)[source]

Sample initial weights from the uniform distribution.

Parameters are sampled from U(a, b).

Parameters:

range : float or tuple

When std is None then range determines a, b. If range is a float the weights are sampled from U(-range, range). If range is a tuple the weights are sampled from U(range[0], range[1]).

std : float or None

If std is a float then the weights are sampled from U(mean - np.sqrt(3) * std, mean + np.sqrt(3) * std).

mean : float

See std for a description.
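
For example, the two ways of specifying the interval (a sketch; values are random):

>>> import numpy as np
>>> from lasagne.init import Uniform
>>> W1 = Uniform(range=(-1.0, 1.0)).sample((100, 50))  # explicit interval U(-1, 1)
>>> W2 = Uniform(std=0.05).sample((100, 50))           # interval derived from std
>>> a = np.sqrt(3) * 0.05                              # U(mean - a, mean + a) with mean=0
>>> bool(np.all(np.abs(W2) <= a))
True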

class lasagne.init.Glorot(initializer, gain=1.0, c01b=False)[source]

Glorot weight initialization [R1].

This is also known as Xavier initialization.

Parameters:

initializer : lasagne.init.Initializer

Initializer class used to sample the weights; it must accept std in its constructor to sample from a distribution with a given standard deviation.

gain : float or ‘relu’

Scaling factor for the weights. Set this to 1.0 for linear and sigmoid units, to ‘relu’ or sqrt(2) for rectified linear units. Other transfer functions may need different factors.

c01b : bool

For a lasagne.layers.cuda_convnet.Conv2DCCLayer constructed with dimshuffle=False, c01b must be set to True to compute the correct fan-in and fan-out.

See also

GlorotNormal
Shortcut with Gaussian initializer.
GlorotUniform
Shortcut with uniform initializer.

Notes

For a DenseLayer, if gain='relu' and initializer=Uniform, the weights are initialized as

\[\begin{split}a &= \sqrt{\frac{12}{fan_{in}+fan_{out}}}\\ W &\sim U[-a, a]\end{split}\]

If gain=1 and initializer=Normal, the weights are initialized as

\[\begin{split}\sigma &= \sqrt{\frac{2}{fan_{in}+fan_{out}}}\\ W &\sim N(0, \sigma)\end{split}\]

References

[R1] Xavier Glorot and Yoshua Bengio (2010): Understanding the difficulty of training deep feedforward neural networks. International conference on artificial intelligence and statistics.
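
As a quick sanity check of the Notes above (a sketch; values are random), Glorot(Uniform, gain='relu') is equivalent to GlorotUniform(gain='relu'), and its samples stay within the bound a:

>>> import numpy as np
>>> from lasagne.init import Glorot, Uniform
>>> W = Glorot(Uniform, gain='relu').sample((100, 200))  # fan_in=100, fan_out=200
>>> a = np.sqrt(12.0 / (100 + 200))                      # bound from the Notes above
>>> bool(np.all(np.abs(W) <= a))
True
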
class lasagne.init.GlorotNormal(gain=1.0, c01b=False)[source]

Glorot with weights sampled from the Normal distribution.

See Glorot for a description of the parameters.

class lasagne.init.GlorotUniform(gain=1.0, c01b=False)[source]

Glorot with weights sampled from the Uniform distribution.

See Glorot for a description of the parameters.

class lasagne.init.He(initializer, gain=1.0, c01b=False)[source]

He weight initialization [R2].

Weights are initialized with a standard deviation of \(\sigma = gain \sqrt{\frac{1}{fan_{in}}}\).

Parameters:

initializer : lasagne.init.Initializer

Initializer class used to sample the weights; it must accept std in its constructor to sample from a distribution with a given standard deviation.

gain : float or ‘relu’

Scaling factor for the weights. Set this to 1.0 for linear and sigmoid units, to ‘relu’ or sqrt(2) for rectified linear units. Other transfer functions may need different factors.

c01b : bool

For a lasagne.layers.cuda_convnet.Conv2DCCLayer constructed with dimshuffle=False, c01b must be set to True to compute the correct fan-in and fan-out.

See also

HeNormal
Shortcut with Gaussian initializer.
HeUniform
Shortcut with uniform initializer.

References

[R2] Kaiming He et al. (2015): Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. arXiv preprint arXiv:1502.01852.
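
A quick numeric check of the standard deviation above (a sketch; values are random, so a loose tolerance is used):

>>> import numpy as np
>>> from lasagne.init import HeNormal
>>> W = HeNormal(gain='relu').sample((100, 400))  # fan_in = 100
>>> sigma = np.sqrt(2.0 / 100)                    # gain * sqrt(1 / fan_in) with gain = sqrt(2)
>>> bool(abs(W.std() - sigma) < 0.01)
True
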
class lasagne.init.HeNormal(gain=1.0, c01b=False)[source]

He initializer with weights sampled from the Normal distribution.

See He for a description of the parameters.

class lasagne.init.HeUniform(gain=1.0, c01b=False)[source]

He initializer with weights sampled from the Uniform distribution.

See He for a description of the parameters.

class lasagne.init.Orthogonal(gain=1.0)[source]

Initialize weights as an orthogonal matrix.

For n-dimensional shapes where n > 2, the n-1 trailing axes are flattened. For convolutional layers, this corresponds to the fan-in, so this makes the initialization usable for both dense and convolutional layers.

Parameters:

gain : float or ‘relu’

Scaling factor for the weights. Set this to 1.0 for linear and sigmoid units, to ‘relu’ or sqrt(2) for rectified linear units.
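
A sketch verifying the orthogonality property: for a 2D shape with at least as many rows as columns, the sampled columns are orthonormal (up to the gain):

>>> import numpy as np
>>> from lasagne.init import Orthogonal
>>> W = Orthogonal(gain=1.0).sample((100, 50))
>>> bool(np.allclose(np.dot(W.T, W), np.eye(50), atol=1e-4))
True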

class lasagne.init.Sparse(sparsity=0.1, std=0.01)[source]

Initialize weights as a sparse matrix.

Parameters:

sparsity : float

Exact fraction of non-zero values per column. Larger values give less sparsity.

std : float

Non-zero weights are sampled from N(0, std).
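
A sketch checking the per-column sparsity: with sparsity=0.1 and a fan-in of 100, each column should have int(0.1 * 100) = 10 non-zero entries:

>>> import numpy as np
>>> from lasagne.init import Sparse
>>> W = Sparse(sparsity=0.1, std=0.01).sample((100, 50))
>>> int(np.count_nonzero(W[:, 0]))
10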