This class implements a long short-term memory (LSTM) layer that can be applied to a set of vector sequences.
The output is a sequence containing the same number of vectors, each of `GetHiddenSize()` size.
Hidden layer size
```c++
void SetHiddenSize(int size);
```
Sets the hidden layer size. It affects the output size and the size of the state vector inside the LSTM.
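For illustration, here is a minimal sketch of creating the layer and setting its hidden size. The class name `CLstmLayer`, the math engine setup, and `CDnn` are assumptions about the surrounding API; only `SetHiddenSize()` is taken from this page, and the value `64` is arbitrary.

```c++
#include <NeoML/NeoML.h>

using namespace NeoML;

void BuildLstmExample()
{
    // Assumed setup: a CPU math engine and a network to add the layer to.
    IMathEngine& mathEngine = GetDefaultCpuMathEngine();
    CRandom random( 0x1234 );
    CDnn dnn( random, mathEngine );

    // Create the LSTM layer and set its hidden size.
    CPtr<CLstmLayer> lstm = new CLstmLayer( mathEngine );
    lstm->SetName( "lstm" );
    lstm->SetHiddenSize( 64 ); // each output vector and the internal state will have 64 elements

    // ... the sketches below continue inside this function ...
}
```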
```c++
void SetDropoutRate(float newDropoutRate);
```
Sets the dropout probability. If this value is set, dropout will be applied to the input combined with the output of the previous step before the result is passed to the fully connected layer.
```c++
void SetRecurrentActivation( TActivationFunction newActivation );
```
Sets the activation function that is used in the `forget`, `reset`, and `input` gates. By default, `AF_Sigmoid` is used.
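Continuing the sketch above, a hedged example of these two settings. The values are arbitrary, and `AF_HardSigmoid` is assumed to be among the available `TActivationFunction` values; only `AF_Sigmoid` is confirmed by this page.

```c++
    // Optional settings (example values only).
    lstm->SetDropoutRate( 0.2f );                   // drop 20% of the combined input/recurrent vector
    lstm->SetRecurrentActivation( AF_HardSigmoid ); // replace the default AF_Sigmoid in the gates
```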
```c++
CPtr<CDnnBlob> GetWeightsData() const;
```
The weight matrix containing the weights for each gate. The matrix is represented by a blob of the following dimensions:
- `BatchLength * BatchWidth * ListSize` is equal to `4 * GetHiddenSize()`.
- `Height * Width * Depth * Channels` is equal to the sum of the same dimension of the input and `GetHiddenSize()`.
The `BatchLength * BatchWidth * ListSize` axis corresponds to the gate weights, in the following order:
```c++
G_Main = 0, // The main output data
G_Forget,   // Forget gate
G_Input,    // Input gate
G_Reset,    // Reset gate
```
The `Height * Width * Depth * Channels` axis corresponds to the weights:

- coordinates from `0` to the input size correspond to the weights that serve as coefficients for the vectors of the input sequence;
- the remaining `HiddenSize` coordinates correspond to the weights that serve as coefficients for the output of the previous step (see the sketch below).
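To make the layout concrete, here is a sketch that locates the forget-gate weights of one hidden unit inside this blob. The gate order and the input/recurrent split come from the description above; the `CDnnBlob` accessors (`GetObjectSize`, `GetDataSize`, `CopyTo`) and the `CArray` container are assumptions about the surrounding API.

```c++
    // Continuing the sketch: inspect the weight matrix returned by GetWeightsData().
    // (Assumes the network has already been built and run, so the weights are allocated.)
    CPtr<CDnnBlob> weights = lstm->GetWeightsData();
    const int hiddenSize = lstm->GetHiddenSize();
    const int rowSize = weights->GetObjectSize(); // Height * Width * Depth * Channels == inputSize + hiddenSize
    const int inputSize = rowSize - hiddenSize;

    CArray<float> buffer;
    buffer.SetSize( weights->GetDataSize() );     // 4 * hiddenSize rows of rowSize elements each
    weights->CopyTo( buffer.GetPtr() );

    const int forgetGate = 1; // G_Forget: the second block of hiddenSize rows, as listed above
    const int unit = 0;       // the first hidden unit of that gate
    const float* row = buffer.GetPtr() + ( forgetGate * hiddenSize + unit ) * rowSize;
    // row[0 .. inputSize - 1]       - coefficients applied to the input vector
    // row[inputSize .. rowSize - 1] - coefficients applied to the previous step's output
```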
```c++
CPtr<CDnnBlob> GetFreeTermData() const;
```
The free terms are represented by a blob of the total size `4 * GetHiddenSize()`. The order in which they correspond to the gates is the same as above.
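A similar hedged sketch for the free terms (again, `CopyTo` and `CArray` are assumed accessors):

```c++
    // Continuing the sketch: read the free term of the first forget-gate unit.
    CPtr<CDnnBlob> freeTerms = lstm->GetFreeTermData();
    CArray<float> terms;
    terms.SetSize( freeTerms->GetDataSize() );    // 4 * GetHiddenSize() values, in the same gate order
    freeTerms->CopyTo( terms.GetPtr() );
    const float forgetBias = terms[forgetGate * hiddenSize + unit];
```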
The layer may have 1 to 3 inputs:
- The set of vector sequences.
- [Optional] The initial state of the LSTM layer before the first step. If this input is not specified, the initial state is all zeros.
- [Optional] The initial value of the "previous output" to be used on the first step. If this input is not specified, all zeros are used.
The first input should have the following dimensions (see the sketch after this list):

- `BatchLength` - the length of one vector sequence.
- `BatchWidth` - the number of vector sequences in the input set.
- `ListSize` should be `1`.
- `Height * Width * Depth * Channels` - the size of each vector in the sequence.
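A sketch of building such an input blob and connecting it through a source layer. `CDnnBlob::CreateDataBlob`, `CSourceLayer`, and `CT_Float` are assumptions about the surrounding API; the dimension values are arbitrary examples.

```c++
    // Continuing the sketch: the first input - a batch of vector sequences.
    const int sequenceLength = 50; // BatchLength
    const int batchSize = 16;      // BatchWidth
    const int vectorSize = 8;      // Height * Width * Depth * Channels (ListSize is 1)

    CPtr<CDnnBlob> inputBlob = CDnnBlob::CreateDataBlob( mathEngine, CT_Float, sequenceLength, batchSize, vectorSize );
    // ... fill inputBlob with the sequence data ...

    CPtr<CSourceLayer> dataSource = new CSourceLayer( mathEngine );
    dataSource->SetName( "data" );
    dnn.AddLayer( *dataSource );
    dataSource->SetBlob( inputBlob );

    dnn.AddLayer( *lstm );
    lstm->Connect( 0, *dataSource ); // first input: the set of vector sequences
```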
The second and third inputs should have the following dimensions (see the sketch after this list):

- `BatchLength` and `ListSize` should be `1`.
- `BatchWidth` should be equal to the `BatchWidth` of the first input.
- `Height * Width * Depth * Channels` must be equal to `GetHiddenSize()`.
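When the optional initial state and initial "previous output" are needed, they can be fed through additional source layers with the blob sizes listed above. As before, `CreateDataBlob`, `CSourceLayer`, and `Clear` are assumed API names.

```c++
    // Continuing the sketch: optional second and third inputs.
    CPtr<CDnnBlob> initialState = CDnnBlob::CreateDataBlob( mathEngine, CT_Float, 1, batchSize, lstm->GetHiddenSize() );
    CPtr<CDnnBlob> initialOutput = CDnnBlob::CreateDataBlob( mathEngine, CT_Float, 1, batchSize, lstm->GetHiddenSize() );
    initialState->Clear();  // all zeros - equivalent to omitting this input
    initialOutput->Clear();

    CPtr<CSourceLayer> stateSource = new CSourceLayer( mathEngine );
    stateSource->SetName( "initialState" );
    dnn.AddLayer( *stateSource );
    stateSource->SetBlob( initialState );

    CPtr<CSourceLayer> outputSource = new CSourceLayer( mathEngine );
    outputSource->SetName( "initialOutput" );
    dnn.AddLayer( *outputSource );
    outputSource->SetBlob( initialOutput );

    lstm->Connect( 1, *stateSource );  // second input: the initial state
    lstm->Connect( 2, *outputSource ); // third input: the initial "previous output"
```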
The layer has two outputs:
- The result of the current step.
- The layer history.
Both outputs are of the following size:
- `BatchLength` and `BatchWidth` are equal to the same sizes of the first input.
- `ListSize`, `Height`, `Width`, and `Depth` equal `1`.
- `Channels` equals `GetHiddenSize()` (see the sketch below).
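Finally, a sketch of reading the first output and checking its dimensions. `CSinkLayer`, `GetBlob`, `RunOnce`, `DimSize`, and the `BD_*` dimension constants are assumptions about the surrounding API.

```c++
    // Continuing the sketch: attach a sink to the first output and run the network.
    CPtr<CSinkLayer> sink = new CSinkLayer( mathEngine );
    sink->SetName( "out" );
    dnn.AddLayer( *sink );
    sink->Connect( 0, *lstm, 0 ); // first output: the result of the current step
    // sink->Connect( 0, *lstm, 1 ) would read the second output: the layer history.

    dnn.RunOnce();

    CPtr<CDnnBlob> result = sink->GetBlob();
    // Expected, per the list above:
    //   BatchLength == sequenceLength, BatchWidth == batchSize,
    //   ListSize == Height == Width == Depth == 1, Channels == lstm->GetHiddenSize().
    const int outBatchLength = result->DimSize( BD_BatchLength ); // == sequenceLength
    const int outChannels = result->DimSize( BD_Channels );       // == lstm->GetHiddenSize()
```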