Neural Networks

Choose the math engine

Before you start your work with neural networks, choose the device to be used for calculations. This can be a CPU or a GPU. Create a math engine for the required device and pass the reference to it when creating the network and the layers.

Data blobs

All data used in the network operation (inputs, outputs, trainable parameters) is stored in blobs. A blob is a 7-dimensional array, and each of its dimensions has a specific meaning:

  • BatchLength is a "time" axis, used to denote data sequences; it is mainly used in recurrent networks
  • BatchWidth corresponds to the batch, used to pass several independent objects together
  • ListSize is the dimension for objects that are connected (for example, pixels of one image) but do not form a sequence
  • Height is the height of a matrix or an image
  • Width is the width of a matrix or an image
  • Depth is the depth of a 3-dimensional image
  • Channels corresponds to channels for multi-channel image formats and is also used to work with one-dimensional vectors.

The blobs may contain one of the two types of data: float (CT_Float) and integer (CT_Int). Both data types are 32-bit.

Unless a data type is stated explicitly, this documentation assumes float.
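To make the seven dimensions concrete, here is a minimal standalone sketch of a blob shape. The names `BlobDim`, `BlobShape`, and `makeRgbImageShape` are hypothetical and are not part of the library; the flat layout (row-major over the dimensions in the order listed above) is an assumption made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Illustrative only: hypothetical names, not the library's blob API.
enum BlobDim { BatchLength, BatchWidth, ListSize, Height, Width, Depth, Channels, DimCount };

struct BlobShape {
    // Every dimension defaults to 1, so unused axes do not affect the size.
    std::array<std::size_t, DimCount> dims = { { 1, 1, 1, 1, 1, 1, 1 } };

    // Total number of 32-bit elements in the blob.
    std::size_t size() const {
        std::size_t total = 1;
        for( std::size_t d : dims ) {
            total *= d;
        }
        return total;
    }

    // Flat offset of an element, ASSUMING row-major order over the
    // dimensions as listed above (Channels varies fastest).
    std::size_t offset( const std::array<std::size_t, DimCount>& index ) const {
        std::size_t result = 0;
        for( int d = 0; d < DimCount; ++d ) {
            assert( index[d] < dims[d] );
            result = result * dims[d] + index[d];
        }
        return result;
    }
};

// Shape of a single 32x32 RGB image.
inline BlobShape makeRgbImageShape() {
    BlobShape shape;
    shape.dims[Height] = 32;
    shape.dims[Width] = 32;
    shape.dims[Channels] = 3;
    return shape;
}
```

With this sketch, `makeRgbImageShape().size()` is 32 * 32 * 3 elements, each occupying 32 bits whether the blob holds float or integer data.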

General principles

The layer concept

A layer is an element of the network that performs some operation: anything from reshaping the input data or computing a simple math function up to convolution or LSTM (long short-term memory).

If the operation needs input data, it will be taken from the layer input. Each layer input contains one data blob, and if several blobs are needed, the layer will have several inputs. Each layer input should be connected to another layer's output.

If the operation returns results that should be used by other layers, they will be passed to the layer outputs. Each layer output contains one data blob, so depending on the operation it performs the layer may have several outputs. Several other layer inputs may be connected to the same output, but you may not leave an output unconnected to any inputs.

In addition, the layer may have settings specified by the user before starting calculations, and trainable parameters that are optimized during network training.

The layers also have names that can be used to find a layer in the network. The name should be set at layer creation or before adding it to the network.

See below for the full list of available layers with links to the detailed descriptions.

CDnn class for the network

The neural network is implemented by a CDnn class. A neural network is a directed graph with the vertices corresponding to layers and the arcs corresponding to the connections along which the data is passed from one layer's output to another's input.

Each layer should be added to the network after you assign a unique name to it. A layer may not be connected to several networks at once.

Source layers are used to pass the data into the network. A source layer has no inputs and passes the data blob specified by the user to its only output.

Sink layers with no outputs are used to retrieve the result of the network operation. They provide a function that returns the blob with data.

After all the layers are added and connected the network may be set up for training.
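The connection rules described above (unique names, every input attached to an existing output, no output left unconnected) can be sketched as a small standalone check. `LayerNode` and `isWellConnected` are hypothetical names used only for this illustration; this is not how CDnn validates a network, and the sketch does not look for cycles.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Illustrative only: a hypothetical description of a layer, not the CDnn API.
struct LayerNode {
    std::string name;
    int outputCount;
    // One entry per input: the producing layer's name and its output index.
    std::vector<std::pair<std::string, int>> inputs;
};

// Returns true when names are unique, every input refers to an existing
// output, and every output is consumed by at least one input.
inline bool isWellConnected( const std::vector<LayerNode>& layers ) {
    std::map<std::string, int> outputCounts;
    std::size_t totalOutputs = 0;
    for( const LayerNode& layer : layers ) {
        assert( layer.outputCount >= 0 );
        if( !outputCounts.emplace( layer.name, layer.outputCount ).second ) {
            return false; // duplicate layer name
        }
        totalOutputs += static_cast<std::size_t>( layer.outputCount );
    }

    std::set<std::pair<std::string, int>> usedOutputs;
    for( const LayerNode& layer : layers ) {
        for( const auto& input : layer.inputs ) {
            const auto producer = outputCounts.find( input.first );
            if( producer == outputCounts.end()
                || input.second < 0 || input.second >= producer->second ) {
                return false; // input attached to a missing output
            }
            usedOutputs.insert( input );
        }
    }
    // Every declared output must feed at least one input.
    return usedOutputs.size() == totalOutputs;
}
```

A source layer is modelled here as a node with no inputs and one output, and a sink layer as a node with one input and no outputs.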

Training the network

To train the network you will need:

  • a layer (or several layers) that would calculate the loss function to be optimized
  • additional source layers that contain the correct labels for input data and the object weights
  • the initializer that would be used to assign the values to the weights before starting to optimize them
  • the optimizer mechanism that will be used for training

Weights initialization

Before the first training iteration the layers' weights (trainable parameters) are initialized using the CDnnInitializer object. There are three implementations for it:

  • CDnnUniformInitializer generates the weights using a uniform distribution over a segment from GetLowerBound to GetUpperBound.
  • CDnnXavierInitializer generates the weights using the normal distribution N(0, 1/n) where n is the input size.
  • CDnnXavierUniformInitializer generates the weights using the uniform distribution U(-sqrt(1/n), sqrt(1/n)) where n is the input size.

To select the preferred initializer, create an instance of one of these classes and pass it to the network using the CDnn::SetInitializer method. The default initialization method is Xavier.

The initializer is the same for all the network trainable weights, except for the free term vectors that are initialized with zeros.
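The distributions named above can be sketched with the standard `<random>` facilities. The helper functions below are hypothetical and only illustrate the formulas; they are not the CDnnInitializer classes, and the library's random number generator will produce different values.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Illustrative only: hypothetical helpers, not the CDnnInitializer classes.

// Xavier: samples from N(0, 1/n), so the standard deviation is sqrt(1/n).
inline std::vector<float> xavierNormal( std::size_t count, std::size_t inputSize, std::mt19937& rng ) {
    assert( inputSize > 0 );
    const float stddev = std::sqrt( 1.f / static_cast<float>( inputSize ) );
    std::normal_distribution<float> dist( 0.f, stddev );
    std::vector<float> weights( count );
    for( float& w : weights ) {
        w = dist( rng );
    }
    return weights;
}

// Xavier uniform: samples from U(-sqrt(1/n), sqrt(1/n)).
inline std::vector<float> xavierUniform( std::size_t count, std::size_t inputSize, std::mt19937& rng ) {
    assert( inputSize > 0 );
    const float bound = std::sqrt( 1.f / static_cast<float>( inputSize ) );
    std::uniform_real_distribution<float> dist( -bound, bound );
    std::vector<float> weights( count );
    for( float& w : weights ) {
        w = dist( rng );
    }
    return weights;
}

// Free term vectors start at zero regardless of the chosen initializer.
inline std::vector<float> zeroFreeTerms( std::size_t count ) {
    return std::vector<float>( count, 0.f );
}
```

For a layer with n = 100 inputs, both Xavier variants keep the weights on the scale of 0.1: the normal variant has a standard deviation of 0.1, and the uniform variant is bounded by 0.1 in absolute value.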

Optimizers

The optimizer sets the rules for updating the weights during training. It is represented by the CDnnSolver class, which has four implementations:

  • CDnnSimpleGradientSolver - gradient descent with momentum
  • CDnnAdaptiveGradientSolver - gradient descent with adaptive momentum (Adam)
  • CDnnNesterovGradientSolver - Adam with Nesterov momentum (Nadam)
  • CDnnLambGradientSolver - LAMB

To select the preferred optimizer, create an instance of one of these classes and pass it to the network using the CDnn::SetSolver method.
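For readers unfamiliar with the first two update rules, the sketch below shows their textbook forms. The structs and functions are hypothetical; the CDnnSolver implementations may differ in details such as bias correction, default hyperparameters, or how the momentum term is scaled.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative only: textbook update rules, not the CDnnSolver implementations.

struct MomentumState {
    std::vector<float> velocity;
};

// Gradient descent with momentum: v = momentum * v - lr * g; w += v.
inline void momentumStep( std::vector<float>& weights, const std::vector<float>& grad,
    MomentumState& state, float learningRate, float momentum )
{
    assert( weights.size() == grad.size() );
    state.velocity.resize( weights.size(), 0.f );
    for( std::size_t i = 0; i < weights.size(); ++i ) {
        state.velocity[i] = momentum * state.velocity[i] - learningRate * grad[i];
        weights[i] += state.velocity[i];
    }
}

struct AdamState {
    std::vector<float> m; // running mean of the gradients
    std::vector<float> v; // running mean of the squared gradients
    int step = 0;
};

// Adam: per-weight step size scaled by bias-corrected moment estimates.
inline void adamStep( std::vector<float>& weights, const std::vector<float>& grad,
    AdamState& state, float learningRate,
    float beta1 = 0.9f, float beta2 = 0.999f, float epsilon = 1e-8f )
{
    assert( weights.size() == grad.size() );
    state.m.resize( weights.size(), 0.f );
    state.v.resize( weights.size(), 0.f );
    ++state.step;
    const float correction1 = 1.f - std::pow( beta1, static_cast<float>( state.step ) );
    const float correction2 = 1.f - std::pow( beta2, static_cast<float>( state.step ) );
    for( std::size_t i = 0; i < weights.size(); ++i ) {
        state.m[i] = beta1 * state.m[i] + ( 1.f - beta1 ) * grad[i];
        state.v[i] = beta2 * state.v[i] + ( 1.f - beta2 ) * grad[i] * grad[i];
        const float mHat = state.m[i] / correction1;
        const float vHat = state.v[i] / correction2;
        weights[i] -= learningRate * mHat / ( std::sqrt( vHat ) + epsilon );
    }
}
```

In this formulation the very first Adam step moves each weight by roughly the learning rate, whatever the gradient magnitude, while the momentum step is proportional to the gradient.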

The additional settings for the optimizer are:

  • learning rate (CDnnSolver::SetLearningRate)
  • regularization factors (CDnnSolver::SetL2Regularization and CDnnSolver::SetL1Regularization)
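As a rough illustration of what the regularization factors do, the sketch below adds the usual L1 and L2 penalty gradients to a loss gradient before the learning rate is applied. `regularizedGradient` is a hypothetical helper, and this is the common textbook formulation (penalty `l1 * |w| + l2 / 2 * w^2`), which is an assumption here; the library may apply the factors differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative only: a common way to fold L1/L2 penalties into the gradient.
// Not taken from the CDnnSolver implementation.
inline std::vector<float> regularizedGradient( const std::vector<float>& weights,
    const std::vector<float>& grad, float l1, float l2 )
{
    assert( weights.size() == grad.size() );
    std::vector<float> result( grad.size() );
    for( std::size_t i = 0; i < grad.size(); ++i ) {
        const float w = weights[i];
        // sign(w) is -1, 0, or +1; it is the subgradient of |w|.
        const float sign = static_cast<float>( ( w > 0.f ) - ( w < 0.f ) );
        result[i] = grad[i] + l2 * w + l1 * sign;
    }
    return result;
}
```

The L2 term shrinks every weight in proportion to its value, while the L1 term pushes each weight toward zero by a constant amount, which tends to produce sparse weights.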

Training iteration

After the initializer and the optimizer have been set, you may start the learning process. To do that, set the input data blobs for all source layers and call the CDnn::RunAndLearnOnce method.

The method call will perform three internal operations:

  1. Reshape - calculates the size and allocates memory for the output blobs of every layer, using the source blobs' size.
  2. RunOnce - performs all calculations on the source blob data.
  3. BackwardAndLearnOnce - calculates the loss function gradient for all trainable weights and updates the trainable weights through backpropagation.

The learning process consists of many iterations, each calling CDnn::RunAndLearnOnce for new source data.
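The three internal steps can be seen in miniature in the following standalone sketch. `ToyNet` is a hypothetical one-weight model (y = w * x with a mean squared error loss); its method names mirror the steps above for readability, but it is not the CDnn implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative only: a hypothetical one-weight model, not the CDnn class.
class ToyNet {
public:
    explicit ToyNet( float rate ) : learningRate( rate ) {}

    // Plays the role of setting blobs on the source layers.
    void SetInput( const std::vector<float>& x, const std::vector<float>& labels ) {
        assert( x.size() == labels.size() && !x.empty() );
        input = x;
        target = labels;
    }

    // Runs the three steps in order and returns the loss before the update.
    float RunAndLearnOnce() {
        reshape();
        const float loss = runOnce();
        backwardAndLearnOnce();
        return loss;
    }

    float Weight() const { return weight; }

private:
    // 1. Size the output buffer from the input size.
    void reshape() { output.resize( input.size() ); }

    // 2. Forward pass: compute the outputs and the mean squared error.
    float runOnce() {
        float loss = 0.f;
        for( std::size_t i = 0; i < input.size(); ++i ) {
            output[i] = weight * input[i];
            const float diff = output[i] - target[i];
            loss += diff * diff;
        }
        return loss / static_cast<float>( input.size() );
    }

    // 3. Backward pass: d(loss)/d(weight), then a plain gradient step.
    void backwardAndLearnOnce() {
        float grad = 0.f;
        for( std::size_t i = 0; i < input.size(); ++i ) {
            grad += 2.f * ( output[i] - target[i] ) * input[i];
        }
        grad /= static_cast<float>( input.size() );
        weight -= learningRate * grad;
    }

    float learningRate;
    float weight = 0.f;
    std::vector<float> input, target, output;
};
```

Calling `RunAndLearnOnce` repeatedly on inputs `{1, 2, 3}` with labels `{2, 4, 6}` drives the weight toward 2, provided the learning rate is small enough for the iteration to converge.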

Running the network

Sometimes during learning you will need to get the network response without changing the current parameters, for example, on test data for validation. In this case, use the CDnn::RunOnce method, which, unlike CDnn::RunAndLearnOnce, neither calculates the gradients nor updates the trainable parameters. This method is also used for working with the trained network.
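A typical way to interleave the two calls is to validate every few training iterations. The sketch below shows only the loop structure: `trainWithValidation` is a hypothetical helper, `learnStep` stands in for a call to CDnn::RunAndLearnOnce on a training batch, and `evalStep` stands in for CDnn::RunOnce on validation data followed by reading a score.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Illustrative only: both callables are hypothetical stand-ins for the
// network calls named in the text.
inline std::vector<float> trainWithValidation( int iterations, int validateEvery,
    const std::function<void()>& learnStep, const std::function<float()>& evalStep )
{
    assert( iterations >= 0 && validateEvery > 0 );
    std::vector<float> validationScores;
    for( int i = 1; i <= iterations; ++i ) {
        learnStep(); // updates the trainable parameters
        if( i % validateEvery == 0 ) {
            // Forward pass only: the parameters must stay untouched here.
            validationScores.push_back( evalStep() );
        }
    }
    return validationScores;
}
```

Keeping the validation call free of side effects on the parameters is what makes the recorded scores comparable between iterations.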

Serialization

Two classes are defined for serializing the network:

  • CArchiveFile represents the file used for serialization
  • CArchive represents the archive used to write and read from CArchiveFile

The serializing direction is determined by the settings with which the file and the archive instances are created:

  • to save the network into a file, create a CArchiveFile with the CArchive::store flag and an archive over it with the CArchive::SD_Storing flag.
  • to read the network from a file, use the CArchive::load and CArchive::SD_Loading flags instead.

Once the archive has been created, call the CDnn::Serialize method to serialize the network. The direction will be chosen automatically.
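The idea of one Serialize method serving both directions can be sketched with a small in-memory archive. `ToyArchive` and `ToyModel` are hypothetical and exist only for this illustration; they are not the CArchive or CDnn classes and do not reproduce their file format.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <type_traits>
#include <vector>

// Illustrative only: a hypothetical in-memory archive, not the CArchive class.
class ToyArchive {
public:
    enum Direction { Storing, Loading };

    ToyArchive( std::stringstream& s, Direction d ) : stream( s ), direction( d ) {}

    bool IsStoring() const { return direction == Storing; }

    // Writes the value when storing and reads it back when loading.
    template<typename T>
    void Serialize( T& value ) {
        static_assert( std::is_trivially_copyable<T>::value, "raw byte copy only" );
        if( IsStoring() ) {
            stream.write( reinterpret_cast<const char*>( &value ), sizeof( value ) );
        } else {
            stream.read( reinterpret_cast<char*>( &value ), sizeof( value ) );
            assert( !stream.fail() );
        }
    }

private:
    std::stringstream& stream;
    Direction direction;
};

struct ToyModel {
    std::vector<float> weights;

    // The same code path saves and restores: the archive decides the direction.
    void Serialize( ToyArchive& archive ) {
        std::uint32_t count = static_cast<std::uint32_t>( weights.size() );
        archive.Serialize( count );  // stored size, or size read from the stream
        weights.resize( count );     // no-op when storing
        for( float& w : weights ) {
            archive.Serialize( w );
        }
    }
};
```

Because the model never branches on the direction itself, the stored and loaded layouts cannot drift apart, which is the main benefit of this pattern.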

See also more details about the classes used for serialization.

Sample code for saving the network

CRandom random( 0x123 );
CDnn net( random, GetDefaultCpuMathEngine() );

/*
... Build and train the network ...
*/

CArchiveFile file( "my_net.archive", CArchive::store );
CArchive archive( &file, CArchive::SD_Storing );
archive.Serialize( net );
archive.Close();
file.Close();

Using the network

// The math engine working on GPU that uses not more than 1GB GPU RAM
IMathEngine* gpuMathEngine = CreateGpuMathEngine( 1024 * 1024 * 1024, GetFmlExceptionHandler() );

{
    CRandom random( 0x123 );
    CDnn net( random, *gpuMathEngine );

    // Load the network
    {
      CArchiveFile file( "my_net.archive", CArchive::load );
      CArchive archive( &file, CArchive::SD_Loading );
      archive.Serialize( net );
      // file and archive will be closed in destructors
    }

    // The blob to store a single 32x32 RGB image
    CPtr<CDnnBlob> dataBlob = CDnnBlob::Create2DImageBlob( *gpuMathEngine, CT_Float, 1, 1, 32, 32, 3 );

    dataBlob->Fill( 0.5f ); // Filling with a constant value

    // Get the pointers to the source and the sink layers
    CPtr<CSourceLayer> src = CheckCast<CSourceLayer>( net.GetLayer( "source" ) );
    CPtr<CSinkLayer> sink = CheckCast<CSinkLayer>( net.GetLayer( "sink" ) );

    src->SetBlob( dataBlob ); // setting the input data
    net.RunOnce(); // running the network
    CPtr<CDnnBlob> resultBlob = sink->GetBlob(); // getting the response

    // Extract the data and put it in an array
    CArray<float> result;
    result.SetSize( resultBlob->GetDataSize() );
    resultBlob->CopyTo( result.GetPtr() );

    // Analyze the network response

    // Destroy all blobs and the network object
}

// Delete the engine after all blobs are deleted
delete gpuMathEngine;

The layers