
Tutorial

Reference configuration: example/MNIST.conf

Before you start

  • Download the MNIST dataset; keep the .gz files and don't decompress them (one way to fetch them is sketched below)
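A minimal download sketch, assuming wget is available and using the standard MNIST hosting URLs (the download location is not given on this page, so treat these URLs as an assumption); the target folder matches the paths used in the iterator configuration below:

mkdir -p ../data/mnist && cd ../data/mnist
wget http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz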

Setup data iterator configuration

CXXNET uses iterators to provide data batches to the network trainer. First we declare the section type (data for training or eval for evaluation). Then we specify the iterator type (mnist, cifar, image, imgbin, etc.). Finally we set attributes of the iterator, such as shuffle and the file paths. Here is an example for MNIST.

Setup training iterator

  • Change path_img to the path of the training image file, and path_label to the path of the training label file you downloaded just now
data = train
iter = mnist
    path_img = "../data/mnist/train-images-idx3-ubyte.gz"
    path_label = "../data/mnist/train-labels-idx1-ubyte.gz"
    shuffle = 1
iter = end

Setup test iterator

  • Change path_img to the path of the test image file, and path_label to the path of the test label file you downloaded just now
eval = test
iter = mnist
    path_img = "../data/mnist/t10k-images-idx3-ubyte.gz"
    path_label = "../data/mnist/t10k-labels-idx1-ubyte.gz"
iter = end

Setup network structure

The network structure starts with the declaration "netconfig=start" and ends with "netconfig=end". Each layer is declared in the format "layer[ from_num -> to_num ] = layer_type:nick", followed by the parameters of the layer. Here is an example for MNIST:

netconfig=start
layer[0->1] = fullc:fc1
  nhidden = 100
  init_sigma = 0.01
layer[1->2] = sigmoid:se1
layer[2->3] = fullc:fc2
  nhidden = 10
  init_sigma = 0.01
layer[3->3] = softmax
netconfig=end

Notice that some special layers, like softmax and dropout, use a self-loop: the source and destination node numbers are the same, as in layer[3->3] = softmax above.
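For illustration, a dropout layer would be declared with the same self-loop pattern. This is only a sketch: dropout is not part of the MNIST example, and the threshold parameter name is an assumption:

layer[2->2] = dropout:dp1
  threshold = 0.5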

Setup input size and batch size

In this section, we need to set the input shape and batch size. The input shape consists of 3 numbers separated by ','. The three numbers are channel, height and width for 3D input, or 1, 1, dim for 1D vector input. In this example it is

input_shape = 1,1,784
batch_size = 100
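Since each MNIST image is 28x28 pixels, the same data could also be described as a 3D input for a convolutional network, for example (an illustration only, not used in this example):

input_shape = 1,28,28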

Setup global parameters

Global parameters control the trainer's behavior. In this example, we use the following configuration.

dev = cpu
save_model = 15
max_round = 15
num_round = 15
train_eval = 1
random_type = gaussian

First set the working device dev (cpu or gpu); the frequency at which to save the model, save_model; the training rounds num_round and max_round; and whether to print the training set evaluation, train_eval. The random_type defines the weight initialization method; we provide two: gaussian and xavier.

Setup learning parameters

Learning parameters change the updater's behavior. eta is the learning rate and wd is the weight decay; momentum helps training converge faster.

eta = 0.1
momentum = 0.9
wd  = 0.0
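These three parameters combine in the conventional momentum SGD update, which looks roughly like the following (a sketch of the standard formula; the exact updater cxxnet uses is not spelled out in this tutorial):

v = momentum * v - eta * (gradient + wd * weight)
weight = weight + v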

Metric method

For classification, we use the error rate as the performance metric. So set

metric = error

Running experiment

Make a folder named "models" for saving the trained models, in the same folder from which you call cxxnet_learner.
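For example:

mkdir models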
Then just run

cxxnet_learner ./example/MNIST.conf

You should get nearly 98% accuracy in just a few seconds.
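For reference, here is the complete configuration file assembled from the snippets above:

data = train
iter = mnist
    path_img = "../data/mnist/train-images-idx3-ubyte.gz"
    path_label = "../data/mnist/train-labels-idx1-ubyte.gz"
    shuffle = 1
iter = end

eval = test
iter = mnist
    path_img = "../data/mnist/t10k-images-idx3-ubyte.gz"
    path_label = "../data/mnist/t10k-labels-idx1-ubyte.gz"
iter = end

netconfig=start
layer[0->1] = fullc:fc1
  nhidden = 100
  init_sigma = 0.01
layer[1->2] = sigmoid:se1
layer[2->3] = fullc:fc2
  nhidden = 10
  init_sigma = 0.01
layer[3->3] = softmax
netconfig=end

input_shape = 1,1,784
batch_size = 100

dev = cpu
save_model = 15
max_round = 15
num_round = 15
train_eval = 1
random_type = gaussian

eta = 0.1
momentum = 0.9
wd = 0.0

metric = error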
