lasagne.init
Functions to create initializers for parameter variables.
Examples
>>> from lasagne.layers import DenseLayer
>>> from lasagne.init import Constant, GlorotUniform
>>> l1 = DenseLayer((100,20), num_units=50, W=GlorotUniform(), b=Constant(0.0))
- class lasagne.init.Initializer
Base class for parameter tensor initializers.
The Initializer class represents a weight initializer used to initialize weight parameters in a neural network layer. It should be subclassed when implementing new types of weight initializers.
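As a rough sketch of what subclassing might look like, the block below uses a local stand-in for the base class so that it runs without Lasagne or Theano installed. It assumes that calling an initializer with a shape delegates to a `sample(shape)` method that subclasses override. `SignedConstant`, its `seed` argument, and the nested-list return value are hypothetical and chosen only for illustration; a real initializer would normally return an array in Theano's floatX type.

```python
import random


class Initializer(object):
    """Local stand-in for lasagne.init.Initializer (assumed interface)."""

    def __call__(self, shape):
        # Calling the initializer with a shape delegates to sample().
        return self.sample(shape)

    def sample(self, shape):
        # Subclasses are expected to override this method.
        raise NotImplementedError()


class SignedConstant(Initializer):
    """Hypothetical initializer: every weight is +val or -val at random."""

    def __init__(self, val=0.1, seed=None):
        self.val = val
        self.rng = random.Random(seed)

    def sample(self, shape):
        rows, cols = shape  # 2D shapes only, to keep the sketch short
        return [[self.rng.choice((-self.val, self.val)) for _ in range(cols)]
                for _ in range(rows)]


# Create a 3 x 2 weight matrix by calling the initializer with a shape.
W = SignedConstant(val=0.1, seed=0)((3, 2))
```

With the real library, the subclass would derive from lasagne.init.Initializer instead of the stand-in and could then be passed as the W or b argument of a layer.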
- class lasagne.init.Constant(val=0.0)
Initialize weights with constant value.
Parameters: val : float
Constant value for weights.
- class lasagne.init.Normal(std=0.01, mean=0.0)
Sample initial weights from the Gaussian distribution.
Initial weight parameters are sampled from N(mean, std).
Parameters: std : float
Standard deviation of the initial parameters.
mean : float
Mean of initial parameters.
- class lasagne.init.Uniform(range=0.01, std=None, mean=0.0)
Sample initial weights from the uniform distribution.
Parameters are sampled from U(a, b).
Parameters: range : float or tuple
When std is None then range determines a, b. If range is a float the weights are sampled from U(-range, range). If range is a tuple the weights are sampled from U(range[0], range[1]).
std : float or None
If std is a float then the weights are sampled from U(mean - np.sqrt(3) * std, mean + np.sqrt(3) * std).
mean : float
Center of the sampling interval. Only used when std is given; see std for a description.
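The way range, std and mean combine into the interval bounds can be sketched in plain Python. The helper names `uniform_bounds` and `uniform_std` are hypothetical, and the first argument is spelled `range_` here only to avoid shadowing the Python builtin. The second helper checks that the sqrt(3) factor gives the requested standard deviation, since U(a, b) has standard deviation (b - a) / sqrt(12).

```python
import math


def uniform_bounds(range_=0.01, std=None, mean=0.0):
    """Return the interval (a, b) described by the parameters above."""
    if std is not None:
        # std takes precedence: the interval is centered on mean.
        half_width = math.sqrt(3) * std
        return (mean - half_width, mean + half_width)
    try:
        a, b = range_            # range given as a tuple (a, b)
    except TypeError:
        a, b = -range_, range_   # range given as a single float
    return (a, b)


def uniform_std(a, b):
    """Standard deviation of the uniform distribution U(a, b)."""
    return (b - a) / math.sqrt(12)


print(uniform_bounds(0.5))          # float range
print(uniform_bounds((0.1, 0.3)))   # tuple range
a, b = uniform_bounds(std=2.0, mean=1.0)
print(uniform_std(a, b))            # close to the requested std of 2.0
```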
- class lasagne.init.Glorot(initializer, gain=1.0, c01b=False)
Glorot weight initialization [R1].
This is also known as Xavier initialization.
Parameters: initializer : lasagne.init.Initializer
Initializer used to sample the weights, must accept std in its constructor to sample from a distribution with a given standard deviation.
gain : float or ‘relu’
Scaling factor for the weights. Set this to 1.0 for linear and sigmoid units, to ‘relu’ or sqrt(2) for rectified linear units. Other transfer functions may need different factors.
c01b : bool
For a lasagne.layers.cuda_convnet.Conv2DCCLayer constructed with dimshuffle=False, c01b must be set to True to compute the correct fan-in and fan-out.
See also
- GlorotNormal: Shortcut with Gaussian initializer.
- GlorotUniform: Shortcut with uniform initializer.
Notes
For a DenseLayer, if gain='relu' and initializer=Uniform, the weights are initialized as
\[\begin{split}a &= \sqrt{\frac{12}{fan_{in}+fan_{out}}}\\ W &\sim U[-a, a]\end{split}\]
If gain=1 and initializer=Normal, the weights are initialized as
\[\begin{split}\sigma &= \sqrt{\frac{2}{fan_{in}+fan_{out}}}\\ W &\sim N(0, \sigma)\end{split}\]
References
[R1] Xavier Glorot and Yoshua Bengio (2010): Understanding the difficulty of training deep feedforward neural networks. International conference on artificial intelligence and statistics.
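For a dense layer, the Glorot scaling can be worked through numerically. This sketch assumes the target standard deviation is gain * sqrt(2 / (fan_in + fan_out)) and that a uniform initializer turns a standard deviation into a bound by multiplying with sqrt(3), as described for Uniform. The helper names are hypothetical.

```python
import math


def glorot_std(fan_in, fan_out, gain=1.0):
    """Target standard deviation for a dense layer (assumed formula)."""
    if gain == 'relu':
        gain = math.sqrt(2)
    return gain * math.sqrt(2.0 / (fan_in + fan_out))


def glorot_uniform_bound(fan_in, fan_out, gain=1.0):
    """Bound a such that U[-a, a] has the Glorot standard deviation."""
    return math.sqrt(3) * glorot_std(fan_in, fan_out, gain)


# A dense layer with 100 inputs and 50 units.
print(glorot_std(100, 50))                     # sigma for a Normal initializer
print(glorot_uniform_bound(100, 50))           # a for gain=1
print(glorot_uniform_bound(100, 50, 'relu'))   # a scaled by sqrt(2)
```

With fan_in=100 and fan_out=50, the gain=1 bound is sqrt(6/150) = 0.2, and gain='relu' multiplies it by sqrt(2).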
- class lasagne.init.GlorotNormal(gain=1.0, c01b=False)
Glorot with weights sampled from the Normal distribution.
See Glorot for a description of the parameters.
- class lasagne.init.GlorotUniform(gain=1.0, c01b=False)
Glorot with weights sampled from the Uniform distribution.
See Glorot for a description of the parameters.
- class lasagne.init.He(initializer, gain=1.0, c01b=False)
He weight initialization [R2].
Weights are initialized with a standard deviation of \(\sigma = gain \sqrt{\frac{1}{fan_{in}}}\).
Parameters: initializer : lasagne.init.Initializer
Initializer used to sample the weights, must accept std in its constructor to sample from a distribution with a given standard deviation.
gain : float or ‘relu’
Scaling factor for the weights. Set this to 1.0 for linear and sigmoid units, to ‘relu’ or sqrt(2) for rectified linear units. Other transfer functions may need different factors.
c01b : bool
For a lasagne.layers.cuda_convnet.Conv2DCCLayer constructed with dimshuffle=False, c01b must be set to True to compute the correct fan-in and fan-out.
References
[R2] Kaiming He et al. (2015): Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. arXiv preprint arXiv:1502.01852.
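The He scaling can be illustrated with the standard library alone. The sketch computes sigma = gain * sqrt(1 / fan_in) for a dense layer, draws Gaussian weights with that standard deviation, and compares the empirical spread with the target. The helper `he_std`, the layer sizes, and the fixed seed are assumptions made for the example.

```python
import math
import random


def he_std(fan_in, gain=1.0):
    """Standard deviation gain * sqrt(1 / fan_in) from the formula above."""
    if gain == 'relu':
        gain = math.sqrt(2)
    return gain * math.sqrt(1.0 / fan_in)


rng = random.Random(0)        # fixed seed so the sketch is repeatable
fan_in, fan_out = 400, 50     # hypothetical dense layer: 400 inputs, 50 units
std = he_std(fan_in, gain='relu')

# Draw a fan_in x fan_out weight matrix from N(0, std).
W = [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]

# Root mean square of the weights; approximates std for a zero-mean sample.
flat = [w for row in W for w in row]
empirical_std = math.sqrt(sum(w * w for w in flat) / len(flat))
print(std, empirical_std)
```

Unlike the Glorot scaling, only fan_in enters the formula, so the number of units in the layer does not change the standard deviation.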
- class lasagne.init.HeNormal(gain=1.0, c01b=False)
He initializer with weights sampled from the Normal distribution.
See He for a description of the parameters.
- class lasagne.init.HeUniform(gain=1.0, c01b=False)
He initializer with weights sampled from the Uniform distribution.
See He for a description of the parameters.
- class lasagne.init.Orthogonal(gain=1.0)
Initialize weights as an orthogonal matrix.
Orthogonal matrix initialization. For n-dimensional shapes where n > 2, the n-1 trailing axes are flattened. For convolutional layers, this corresponds to the fan-in, so this makes the initialization usable for both dense and convolutional layers.
Parameters: gain : float or ‘relu’
Scaling factor for the weights. ‘relu’ gives a gain of sqrt(2).
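One common way to build such a matrix, sketched here with NumPy, is to take the singular value decomposition of a Gaussian random matrix and keep the factor whose shape matches the flattened weight shape. This is an illustrative sketch under that assumption, not necessarily the library's exact code; the function name `orthogonal` and the `seed` argument are hypothetical.

```python
import numpy as np


def orthogonal(shape, gain=1.0, seed=0):
    """Sketch of orthogonal initialization via SVD of a Gaussian matrix."""
    if len(shape) < 2:
        raise ValueError("orthogonal initialization needs at least 2 dimensions")
    if gain == 'relu':
        gain = np.sqrt(2)
    rng = np.random.RandomState(seed)
    # Flatten the trailing axes: (d0, d1, ..., dn) -> (d0, d1 * ... * dn).
    flat_shape = (shape[0], int(np.prod(shape[1:])))
    a = rng.normal(0.0, 1.0, flat_shape)
    u, _, vt = np.linalg.svd(a, full_matrices=False)
    # Exactly one of u, vt has the flattened shape (both do if it is square).
    q = u if u.shape == flat_shape else vt
    return gain * q.reshape(shape)


# Convolutional-style shape: 4 filters, 3 input channels, 2 x 2 receptive field.
W = orthogonal((4, 3, 2, 2))
flat = W.reshape(4, -1)
print(np.allclose(flat.dot(flat.T), np.eye(4)))  # rows are orthonormal
```

When the flattened matrix has more columns than rows, as here, its rows are orthonormal; when it has more rows than columns, its columns are.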