lasagne.init

Functions to create initializers for parameter variables.

Examples

>>> from lasagne.layers import DenseLayer
>>> from lasagne.init import Constant, GlorotUniform
>>> l1 = DenseLayer((100,20), num_units=50,
...                 W=GlorotUniform('relu'), b=Constant(0.0))

Initializers

Constant([val]) Initialize weights with constant value.
Normal([std, mean]) Sample initial weights from the Gaussian distribution.
Uniform([range, std, mean]) Sample initial weights from the uniform distribution.
Glorot(initializer[, gain, c01b]) Glorot weight initialization.
GlorotNormal([gain, c01b]) Glorot with weights sampled from the Normal distribution.
GlorotUniform([gain, c01b]) Glorot with weights sampled from the Uniform distribution.
He(initializer[, gain, c01b]) He weight initialization.
HeNormal([gain, c01b]) He initializer with weights sampled from the Normal distribution.
HeUniform([gain, c01b]) He initializer with weights sampled from the Uniform distribution.
Orthogonal([gain]) Initialize weights as an orthogonal matrix.
Sparse([sparsity, std]) Initialize weights as a sparse matrix.

Detailed description

class lasagne.init.Initializer[source]

Base class for parameter tensor initializers.

The Initializer class represents a weight initializer used to initialize weight parameters in a neural network layer. It should be subclassed when implementing new types of weight initializers.

sample(shape)[source]

sample() should return a theano.tensor of size shape and data type theano.config.floatX.

Parameters:
shape : tuple or int

Integer or tuple specifying the size of the returned matrix.

Returns:
theano.tensor

Matrix of size shape and dtype theano.config.floatX.
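
As a sketch of this subclassing contract, the hypothetical TruncatedNormal below clips Gaussian samples at two standard deviations; the class name and clipping rule are illustrative only, not part of Lasagne, and lasagne.utils.floatX is used to cast the result to theano.config.floatX:

>>> import numpy as np
>>> from lasagne.init import Initializer
>>> from lasagne.utils import floatX
>>> class TruncatedNormal(Initializer):
...     # Hypothetical initializer: N(mean, std) clipped to +/- 2*std.
...     def __init__(self, std=0.01, mean=0.0):
...         self.std, self.mean = std, mean
...     def sample(self, shape):
...         w = np.random.normal(self.mean, self.std, size=shape)
...         w = np.clip(w, self.mean - 2 * self.std, self.mean + 2 * self.std)
...         return floatX(w)  # cast to theano.config.floatX, per the contract above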

class lasagne.init.Constant(val=0.0)[source]

Initialize weights with constant value.

Parameters:
val : float

Constant value for weights.

class lasagne.init.Normal(std=0.01, mean=0.0)[source]

Sample initial weights from the Gaussian distribution.

Initial weight parameters are sampled from N(mean, std), where std is the standard deviation (not the variance).

Parameters:
std : float

Std of initial parameters.

mean : float

Mean of initial parameters.
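
A quick sanity check of the sampling behaviour, as a sketch; np.asarray treats the result as array-like, and the tolerances are loose to absorb sampling noise:

>>> import numpy as np
>>> from lasagne.init import Normal
>>> w = np.asarray(Normal(std=0.02, mean=0.5).sample((200, 100)))
>>> w.shape
(200, 100)
>>> bool(abs(w.mean() - 0.5) < 0.01 and abs(w.std() - 0.02) < 0.01)
True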

class lasagne.init.Uniform(range=0.01, std=None, mean=0.0)[source]

Sample initial weights from the uniform distribution.

Parameters are sampled from U(a, b).

Parameters:
range : float or tuple

When std is None then range determines a, b. If range is a float the weights are sampled from U(-range, range). If range is a tuple the weights are sampled from U(range[0], range[1]).

std : float or None

If std is a float, the weights are sampled from U(mean - np.sqrt(3) * std, mean + np.sqrt(3) * std), overriding range. The half-width of sqrt(3) * std makes the standard deviation of the resulting uniform distribution exactly std, since Var(U(a, b)) = (b - a)**2 / 12.

mean : float

See std for a description.
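
Both parameterizations side by side, as a doctest-style sketch:

>>> import numpy as np
>>> from lasagne.init import Uniform
>>> w1 = np.asarray(Uniform(range=(-0.05, 0.05)).sample((100, 100)))
>>> bool(np.abs(w1).max() <= 0.05)  # explicit bounds are respected
True
>>> w2 = np.asarray(Uniform(std=0.05).sample((100, 100)))
>>> bool(abs(w2.std() - 0.05) < 0.005)  # bounds derived from std
True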

class lasagne.init.Glorot(initializer, gain=1.0, c01b=False)[source]

Glorot weight initialization.

This is also known as Xavier initialization [1].

Parameters:
initializer : lasagne.init.Initializer

Initializer used to sample the weights, must accept std in its constructor to sample from a distribution with a given standard deviation.

gain : float or ‘relu’

Scaling factor for the weights. Set this to 1.0 for linear and sigmoid units, to ‘relu’ or sqrt(2) for rectified linear units, and to sqrt(2/(1+alpha**2)) for leaky rectified linear units with leakiness alpha. Other transfer functions may need different factors.

c01b : bool

For a lasagne.layers.cuda_convnet.Conv2DCCLayer constructed with dimshuffle=False, c01b must be set to True to compute the correct fan-in and fan-out.

See also

GlorotNormal
Shortcut with Gaussian initializer.
GlorotUniform
Shortcut with uniform initializer.

Notes

For a DenseLayer, if gain='relu' and initializer=Uniform, the weights are initialized as

\[a = \sqrt{\frac{12}{fan_{in}+fan_{out}}}, \quad W \sim U[-a, a]\]

If gain=1 and initializer=Normal, the weights are initialized as

\[\sigma = \sqrt{\frac{2}{fan_{in}+fan_{out}}}, \quad W \sim N(0, \sigma)\]
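
The Normal case can be checked numerically; a sketch, with a loose tolerance for sampling noise:

>>> import numpy as np
>>> from lasagne.init import GlorotNormal
>>> w = np.asarray(GlorotNormal(gain=1.0).sample((800, 1000)))
>>> sigma = np.sqrt(2.0 / (800 + 1000))  # the formula above
>>> bool(abs(w.std() - sigma) < 1e-3)
True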

References

[1] Xavier Glorot and Yoshua Bengio (2010): Understanding the difficulty of training deep feedforward neural networks. International Conference on Artificial Intelligence and Statistics.

class lasagne.init.GlorotNormal(gain=1.0, c01b=False)[source]

Glorot with weights sampled from the Normal distribution.

See Glorot for a description of the parameters.

class lasagne.init.GlorotUniform(gain=1.0, c01b=False)[source]

Glorot with weights sampled from the Uniform distribution.

See Glorot for a description of the parameters.

class lasagne.init.He(initializer, gain=1.0, c01b=False)[source]

He weight initialization.

Weights are initialized with a standard deviation of \(\sigma = gain \sqrt{\frac{1}{fan_{in}}}\) [1].

Parameters:
initializer : lasagne.init.Initializer

Initializer used to sample the weights, must accept std in its constructor to sample from a distribution with a given standard deviation.

gain : float or ‘relu’

Scaling factor for the weights. Set this to 1.0 for linear and sigmoid units, to ‘relu’ or sqrt(2) for rectified linear units, and to sqrt(2/(1+alpha**2)) for leaky rectified linear units with leakiness alpha. Other transfer functions may need different factors.

c01b : bool

For a lasagne.layers.cuda_convnet.Conv2DCCLayer constructed with dimshuffle=False, c01b must be set to True to compute the correct fan-in and fan-out.

See also

HeNormal
Shortcut with Gaussian initializer.
HeUniform
Shortcut with uniform initializer.
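
A corresponding numerical sketch for HeNormal, assuming fan-in is taken as the first dimension of a 2D shape:

>>> import numpy as np
>>> from lasagne.init import HeNormal
>>> w = np.asarray(HeNormal(gain='relu').sample((1000, 500)))
>>> sigma = np.sqrt(2.0) * np.sqrt(1.0 / 1000)  # gain * sqrt(1 / fan_in)
>>> bool(abs(w.std() - sigma) < 1e-3)
True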

References

[1] Kaiming He et al. (2015): Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. arXiv preprint arXiv:1502.01852.

class lasagne.init.HeNormal(gain=1.0, c01b=False)[source]

He initializer with weights sampled from the Normal distribution.

See He for a description of the parameters.

class lasagne.init.HeUniform(gain=1.0, c01b=False)[source]

He initializer with weights sampled from the Uniform distribution.

See He for a description of the parameters.

class lasagne.init.Orthogonal(gain=1.0)[source]

Initialize weights as an orthogonal matrix.

Orthogonal matrix initialization [1]. For n-dimensional shapes where n > 2, the n-1 trailing axes are flattened. For convolutional layers, this corresponds to the fan-in, so this makes the initialization usable for both dense and convolutional layers.

Parameters:
gain : float or ‘relu’

Scaling factor for the weights. Set this to 1.0 for linear and sigmoid units, to ‘relu’ or sqrt(2) for rectified linear units, and to sqrt(2/(1+alpha**2)) for leaky rectified linear units with leakiness alpha. Other transfer functions may need different factors.
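
With gain=1.0 and a square shape, the sampled matrix W should satisfy W.T.dot(W) close to the identity; a quick sketch, where the tolerance absorbs float32 rounding under the default theano.config.floatX:

>>> import numpy as np
>>> from lasagne.init import Orthogonal
>>> w = np.asarray(Orthogonal(gain=1.0).sample((100, 100)))
>>> bool(np.allclose(w.T.dot(w), np.eye(100), atol=1e-4))
True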

References

[1] Saxe, Andrew M., James L. McClelland, and Surya Ganguli (2013): Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. arXiv preprint arXiv:1312.6120.

class lasagne.init.Sparse(sparsity=0.1, std=0.01)[source]

Initialize weights as a sparse matrix.

Parameters:
sparsity : float

Exact fraction of non-zero values per column. Larger values give a denser, less sparse matrix.

std : float

Non-zero weights are sampled from N(0, std).
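
Because sparsity is an exact per-column fraction, the number of non-zeros per column is deterministic; a sketch (a Gaussian draw is non-zero with probability one):

>>> import numpy as np
>>> from lasagne.init import Sparse
>>> w = np.asarray(Sparse(sparsity=0.1, std=0.01).sample((100, 50)))
>>> int((w[:, 0] != 0).sum())  # 10% of the 100 rows in each column
10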