CIrnnLayer Class

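This class implements an identity recurrent neural network (IRNN), as described in the article "A Simple Way to Initialize Recurrent Networks of Rectified Linear Units" by Le, Jaitly, and Hinton.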

It is a simple recurrent unit defined by the following formula:

    Y_t = ReLU( FC_input( X_t ) + FC_recur( Y_{t-1} ) )

where FC_* are fully-connected layers.
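Written out with explicit parameters, the formula above can be expressed as follows. The symbols W_in, b_in, W_rec, and b_rec are notation introduced here for illustration; they stand for the weight matrices and free terms of FC_input and FC_recur, assuming each fully-connected layer multiplies by its weight matrix and adds its free terms.

```latex
Y_t = \mathrm{ReLU}\left( W_{\mathrm{in}} X_t + b_{\mathrm{in}} + W_{\mathrm{rec}} Y_{t-1} + b_{\mathrm{rec}} \right)
```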

The key feature of this layer is its weight initialization.

The weight matrix of FC_input is drawn from the normal distribution N(0, inputWeightStd), that is, with zero mean and standard deviation inputWeightStd, where inputWeightStd is a layer setting.

The weight matrix of FC_recur is initialized as the identity matrix multiplied by the identityScale setting.
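The initialization and the recurrent formula can be illustrated with a minimal, self-contained sketch in plain C++. It does not use NeoML: the `IrnnParams` struct and the `InitIrnnParams` and `IrnnStep` functions are hypothetical names introduced here, and the zero-initialized free terms are an assumption of the sketch rather than something this document states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for the layer's parameters; not a NeoML type.
struct IrnnParams {
    int inputSize = 0;
    int hiddenSize = 0;
    std::vector<float> inputWeights;   // hiddenSize x inputSize, row-major
    std::vector<float> inputFreeTerms; // hiddenSize
    std::vector<float> recurWeights;   // hiddenSize x hiddenSize, row-major
    std::vector<float> recurFreeTerms; // hiddenSize
};

// Input weights are drawn from N(0, inputWeightStd);
// recurrent weights are the identity matrix times identityScale.
// Assumption of this sketch: all free terms start at zero.
IrnnParams InitIrnnParams( int inputSize, int hiddenSize,
    float identityScale, float inputWeightStd, unsigned seed )
{
    assert( inputSize > 0 && hiddenSize > 0 && inputWeightStd > 0.f );
    IrnnParams p;
    p.inputSize = inputSize;
    p.hiddenSize = hiddenSize;
    p.inputWeights.resize( static_cast<std::size_t>( hiddenSize ) * inputSize );
    p.inputFreeTerms.assign( hiddenSize, 0.f );
    p.recurWeights.assign( static_cast<std::size_t>( hiddenSize ) * hiddenSize, 0.f );
    p.recurFreeTerms.assign( hiddenSize, 0.f );

    std::mt19937 generator( seed );
    std::normal_distribution<float> normal( 0.f, inputWeightStd );
    for( float& w : p.inputWeights ) {
        w = normal( generator );
    }
    for( int i = 0; i < hiddenSize; ++i ) {
        p.recurWeights[static_cast<std::size_t>( i ) * hiddenSize + i] = identityScale;
    }
    return p;
}

// One time step: Y_t = ReLU( FC_input( X_t ) + FC_recur( Y_{t-1} ) ).
std::vector<float> IrnnStep( const IrnnParams& p,
    const std::vector<float>& x, const std::vector<float>& prevY )
{
    assert( static_cast<int>( x.size() ) == p.inputSize );
    assert( static_cast<int>( prevY.size() ) == p.hiddenSize );
    std::vector<float> y( p.hiddenSize );
    for( int h = 0; h < p.hiddenSize; ++h ) {
        float sum = p.inputFreeTerms[h] + p.recurFreeTerms[h];
        for( int i = 0; i < p.inputSize; ++i ) {
            sum += p.inputWeights[static_cast<std::size_t>( h ) * p.inputSize + i] * x[i];
        }
        for( int j = 0; j < p.hiddenSize; ++j ) {
            sum += p.recurWeights[static_cast<std::size_t>( h ) * p.hiddenSize + j] * prevY[j];
        }
        y[h] = std::max( 0.f, sum ); // ReLU
    }
    return y;
}
```

With identityScale equal to 1 and a zero input vector, a step of this sketch simply applies ReLU to the previous state, which shows how the identity initialization carries the hidden state forward from one step to the next.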

Settings

Hidden layer size

void SetHiddenSize( int size );

Sets the hidden layer size, which determines the number of channels in the output.

Identity scale

void SetIdentityScale( float scale );

Sets the multiplier applied to the identity matrix that is used to initialize the recurrent weights.

Input weight standard deviation

void SetInputWeightStd( float var );

Sets the standard deviation of the normal distribution used to initialize the input weights.

Trainable parameters

Input weight matrix

CPtr<CDnnBlob> GetInputWeightsData() const;

The weight matrix of the FC_input from the formula.

It has the following shape:

  • BatchLength * BatchWidth * ListSize is equal to GetHiddenSize().
  • Height * Width * Depth * Channels is equal to the product of the same dimensions of the input.

Input free terms

CPtr<CDnnBlob> GetInputFreeTermData() const;

The free terms of the FC_input. They are stored in a blob of total size GetHiddenSize().

Recurrent weight matrix

CPtr<CDnnBlob> GetRecurWeightsData() const;

The weight matrix of the FC_recur from the formula.

It has the following shape:

  • BatchLength * BatchWidth * ListSize is equal to GetHiddenSize().
  • Height * Width * Depth * Channels is equal to GetHiddenSize().

Recurrent free terms

CPtr<CDnnBlob> GetRecurrentFreeTermData() const;

The free terms of the FC_recur. They are stored in a blob of total size GetHiddenSize().
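As a quick check of the shapes listed in this section, the number of elements in each trainable blob can be computed from the input vector size and the hidden size. This is a small sketch in plain C++; `IrnnParamSizes` and `CountIrnnParams` are hypothetical helpers introduced here and are not part of NeoML.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of NeoML:
// element counts of the four trainable blobs.
struct IrnnParamSizes {
    std::size_t inputWeights;   // GetHiddenSize() x input vector size
    std::size_t inputFreeTerms; // GetHiddenSize()
    std::size_t recurWeights;   // GetHiddenSize() x GetHiddenSize()
    std::size_t recurFreeTerms; // GetHiddenSize()

    std::size_t Total() const
    {
        return inputWeights + inputFreeTerms + recurWeights + recurFreeTerms;
    }
};

// inputVectorSize is Height * Width * Depth * Channels of the input.
IrnnParamSizes CountIrnnParams( std::size_t inputVectorSize, std::size_t hiddenSize )
{
    assert( inputVectorSize > 0 && hiddenSize > 0 );
    return { hiddenSize * inputVectorSize, hiddenSize,
        hiddenSize * hiddenSize, hiddenSize };
}
```

For example, with an input vector size of 10 and a hidden size of 4, the sketch counts 40 input weights, 16 recurrent weights, and 4 free terms for each of the two fully-connected layers.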

Inputs

The single input of this layer accepts a set of vector sequences with the following shape:

  • BatchLength - the length of one vector sequence.
  • BatchWidth * ListSize - the number of vector sequences in the input set.
  • Height * Width * Depth * Channels - the size of each vector in the sequence.

Outputs

The single output is a blob of the following size:

  • BatchLength, BatchWidth, and ListSize are equal to the corresponding sizes of the input.
  • Height, Width, and Depth equal 1.
  • Channels equals GetHiddenSize().
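The input and output shape rules above can be summarized in a short sketch. The `BlobShape` struct is a hypothetical plain struct that mirrors the seven dimension names used in this document; it is not NeoML's own blob type, and the function names are introduced here for illustration only.

```cpp
#include <cassert>

// Hypothetical plain struct with the seven blob dimensions; not a NeoML type.
struct BlobShape {
    int batchLength;
    int batchWidth;
    int listSize;
    int height;
    int width;
    int depth;
    int channels;
};

// Size of each vector in the input sequences.
int InputVectorSize( const BlobShape& shape )
{
    return shape.height * shape.width * shape.depth * shape.channels;
}

// Number of vector sequences in the input set.
int SequenceCount( const BlobShape& shape )
{
    return shape.batchWidth * shape.listSize;
}

// Output shape: sequence dimensions are kept, Height, Width, and Depth
// become 1, and Channels becomes the hidden size.
BlobShape IrnnOutputShape( const BlobShape& input, int hiddenSize )
{
    assert( hiddenSize > 0 );
    BlobShape output = input;
    output.height = 1;
    output.width = 1;
    output.depth = 1;
    output.channels = hiddenSize;
    return output;
}
```

For an input of 5 steps, 8 sequences, and vectors of shape 2 x 3 x 1 x 4, a hidden size of 16 gives an output with the same 5 steps and 8 sequences and 16 channels per step.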