NNHelferlein - ADo's Neural Networks Little Helpers
The package provides helpers and utilities mainly to be used with the Knet package to build artificial neural networks. The German word Helferlein means something like little helper; please pronounce it like hell-fur-line.
The package mainly follows the Knet style: all networks can be trained with the Knet iterators, all layers can be combined with quickly self-written Knet-style layers, all Knet networks can be trained with tb_train!(), all data providers can be used together, ...
The project is hosted here: https://github.com/KnetML/NNHelferlein.jl
Installation
NNHelferlein is a registered package and can be installed with the package manager as:
] add NNHelferlein
or
using Pkg
Pkg.add("NNHelferlein")
First Steps
NNHelferlein provides quick and easy definition, training and validation of neural network chains.
Symbolic API
The Keras-like symbolic API allows for building simple Chains, Classifiers and Regressors from predefined or self-written layers or functions.
A first example may be the famous MNIST handwriting recognition data. Let us assume the data is already loaded as minibatches into a dtrn iterator and an MLP shall do the job. The rest is as little as:
mlp = Classifier(Dense(784, 256),
                 Dense(256, 64),
                 Dense(64, 10, actf=identity))

mlp = tb_train!(mlp, Adam, dtrn, epochs=10, split=0.8,
                acc_fun=accuracy, eval_size=0.2)
Chains may be built of type Chain, Classifier or Regressor. Simple Chains provide only a signature model(x) to compute the forward pass for a single data sample, a minibatch, or many minibatches of data (the dataset - here: dtrn - must be an iterable object that provides one minibatch per call). Classifiers and Regressors in addition come with signatures for loss calculation on (x,y)-minibatches (model(x,y)), using crossentropy loss (i.e. negative log-likelihood) and square loss, respectively. This is why for both types the last layer must not have an activation function (the Helferlein Dense layer comes with a logistic/sigmoid activation by default; alternatively the Linear layer can be used, which has no default activation function).
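For a quick check, a plain Chain can be applied directly to a minibatch to obtain the raw network output (a minimal sketch, reusing the 784-feature MNIST minibatches from the dtrn iterator above):

chain = Chain(Dense(784, 64),
              Dense(64, 10, actf=identity))

x, y = first(dtrn)    # one minibatch
scores = chain(x)     # forward pass: raw scores, one column per sample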
The function tb_train!() updates the model and makes it possible to specify the optimiser, training and validation data, or an optional split ratio to perform a random training/validation split. The function offers a multitude of other options (see the API documentation for details) and writes TensorBoard log files that allow for online monitoring of the training progress via TensorBoard.
A second way to define a model is the add_layer!() syntax, here shown for a simple LeNet-like model for the same problem:
lenet = Classifier()
add_layer!(lenet, Conv(5,5,1,20))
add_layer!(lenet, Pool())
add_layer!(lenet, Conv(5,5,20,50))
add_layer!(lenet, Pool())
add_layer!(lenet, Flat())
add_layer!(lenet, Dense(800,512))
add_layer!(lenet, Dense(512,10, actf=identity))
lenet = tb_train!(lenet, Adam, dtrn, epochs=10, split=0.8,
                  acc_fun=accuracy, eval_size=0.2)
As an alternative, the +-operator is overloaded so that layers can simply be added to a network:
julia> mdl = Classifier() + Dense(2,5)
julia> mdl = mdl + Dense(5,5) + Dense(5,1, actf=identity)
julia> summary(mdl)
NNHelferlein neural network summary:
Classifier with 3 layers, 51 params
Details:
Dense layer 2 → 5 with sigm, 15 params
Dense layer 5 → 5 with sigm, 30 params
Dense layer 5 → 1 with identity, 6 params
Total number of layers: 3
Total number of parameters: 51
Of course, all possibilities can be combined as desired; the following code gives a similar model:
filters = Chain(Conv(5,5,1,20),
                Pool(),
                Conv(5,5,20,50),
                Pool())

classif = Chain(Dense(800,512),
                Dense(512,10, actf=identity))

lenet2 = Classifier(filters,
                    Flat())
add_layer!(lenet2, classif)

lenet2 = tb_train!(lenet2, Adam, dtrn, epochs=10, split=0.8,
                   acc_fun=accuracy, eval_size=0.2)
Models can be summarised with summary() or the print_network() helper:
julia> summary(lenet)
Neural network summary:
Classifier with 7 layers, 440812 params
Details:
Conv layer 1 → 20 (5,5) with relu, 520 params
Pool layer, 0 params
Conv layer 20 → 50 (5,5) with relu, 25050 params
Pool layer, 0 params
Flat layer, 0 params
Dense layer 800 → 512 with sigm, 410112 params
Dense layer 512 → 10 with identity, 5130 params
Total number of layers: 7
Total number of parameters: 440812
Free model definition
Another way of model definition gives full freedom to define the forward function as pure Julia code. In the Python world this type of definition is often referred to as the functional API - in the Julia world we hesitate to call it an API, because at the end of the day it is all just out-of-the-box Julia! Each model just needs a type able to store all parameters, a signature model(x) to compute a forward run and predict the result, and a signature model(x,y) to calculate the loss.
For the predefined Classifier and Regressor types the signatures are already defined; for own models this can easily be done in a few lines of code.
The LeNet-like example network for MNIST may be written as:
The type and constructor:
struct LeNet <: AbstractNN
    drop1
    conv1
    pool1
    conv2
    pool2
    flat
    drop2
    mlp
    predict

    function LeNet(; drop=0.2)
        return new(Dropout(drop),
                   Conv(5,5,1,20),
                   Pool(),
                   Conv(5,5,20,50),
                   Pool(),
                   Flat(),
                   Dropout(drop),
                   Dense(800, 512),
                   Dense(512, 10, actf=identity))
    end
end
Of course, the model may be configured by giving the constructor more parameters. Also, the code may be organised better by combining layers into Chains, as sketched below.
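A minimal sketch of such a variant (the type name LeNetChained and the grouping into two Chains are only illustrative):

struct LeNetChained <: AbstractNN
    filters
    classifier

    function LeNetChained(; drop=0.2)
        filters = Chain(Dropout(drop),
                        Conv(5,5,1,20), Pool(),
                        Conv(5,5,20,50), Pool(),
                        Flat())
        classifier = Chain(Dropout(drop),
                           Dense(800, 512),
                           Dense(512, 10, actf=identity))
        return new(filters, classifier)
    end
end

With this layout the forward signature reduces to two calls: x = nn.filters(x); x = nn.classifier(x).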
The forward signature:
Brute-force definition:
function (nn::LeNet)(x::AbstractArray)
    x = nn.drop1(x)
    x = nn.conv1(x)
    x = nn.pool1(x)
    x = nn.conv2(x)
    x = nn.pool2(x)
    x = nn.flat(x)
    x = nn.drop2(x)
    x = nn.mlp(x)
    x = nn.predict(x)
    return x
end
... or a little bit more elegant:
function (nn::LeNet)(x::AbstractArray)
    layers = (nn.drop1, nn.conv1, nn.pool1,
              nn.conv2, nn.pool2, nn.flat,
              nn.drop2, nn.mlp, nn.predict)
    for layer in layers
        x = layer(x)
    end
    return x
end
... or, even more compact, with foldl:
function (nn::LeNet)(x::AbstractArray)
    layers = (nn.drop1, nn.conv1, nn.pool1,
              nn.conv2, nn.pool2, nn.flat,
              nn.drop2, nn.mlp, nn.predict)
    return foldl((x,layer)->layer(x), layers, init=x)
end
... or a little more structured:
function (nn::LeNet)(x::AbstractArray)
    filters = Chain(nn.drop1, nn.conv1, nn.pool1,
                    nn.conv2, nn.pool2)
    classifier = Chain(nn.drop2, nn.mlp, nn.predict)

    x = filters(x)
    x = nn.flat(x)
    x = classifier(x)
    return x
end
The loss-signature:
function (nn::LeNet)(x, y)
    return nll(nn(x), y)
end
Here we use the Knet.nll() function to calculate the crossentropy.
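The call lenet(dtrn) with a whole iterator of minibatches, shown further below, is presumably provided by the AbstractNN base type; if it had to be written by hand, a sketch could look like this (assuming every element of the iterator is an (x,y) tuple):

using Statistics: mean

function (nn::LeNet)(data)
    return mean(nn(x, y) for (x, y) in data)
end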
That's it!
Believe it or not - that's all you need to leave the limitations of the Python world behind and playfully design any innovative neural network in just a couple of lines of Julia code.
Every object of type LeNet is now a fully functional model which can be trained with tb_train!().
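For example, the training call already used for the symbolic models above works unchanged (a sketch using only the options shown before):

lenet = LeNet(drop=0.2)
lenet = tb_train!(lenet, Adam, dtrn, epochs=10, split=0.8,
                  acc_fun=accuracy, eval_size=0.2)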
With the signatures defined above, the model can be executed with an array of data (i.e. one minibatch) to get the prediction:
julia> x,y = first(dtrn)
julia> lenet = LeNet()
julia> lenet(x) |> x->softmax(x, dims=1)
10×8 CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}:
0.211679 0.214012 0.215643 … 0.208102 0.215791 0.212072
0.141102 0.134484 0.132739 0.136913 0.135558 0.134883
0.0590838 0.0632321 0.0624464 0.0624035 0.0603432 0.0610424
0.221033 0.222141 0.222283 0.223187 0.216619 0.226215
0.0203605 0.0201211 0.0212645 0.0207431 0.0212106 0.0206721
0.0327132 0.0317656 0.0305331 … 0.0320621 0.033188 0.031767
0.181409 0.178959 0.180939 0.182545 0.183172 0.179674
0.0242452 0.0240787 0.0251508 0.0251202 0.0253443 0.0244217
0.0522174 0.0531308 0.0512095 0.0512213 0.05218 0.0517014
0.0561568 0.0580765 0.0577915 0.0577029 0.056594 0.0575527
... with a minibatch and the corresponding teaching input (i.e. labels) to get the loss:
julia> @show y
y = Int8[5, 10, 4, 1, 9, 2, 1, 3]

julia> lenet(x,y)
2.3798099f0
... or with an iterator of minibatches to get the mean loss for the dataset:
julia> lenet(dtrn)
2.6070921f0
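To turn the raw scores into predicted class labels, plain Julia is sufficient (a sketch; NNHelferlein also provides a predict() helper, see the API reference):

probs  = softmax(lenet(x), dims=1)                          # class probabilities per column
labels = vec(map(i -> i[1], argmax(Array(probs), dims=1)))  # row index of the most probable class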
The next step is to have a look at the examples in the GitHub repo.
Overview
Datasets
Some datasets are provided with the package as playground data. Maybe more will follow... A short loading sketch is given after the list.
- MIT Normal Sinus Rhythm Database: a modified version of the Physionet dataset, adapted for use in machine learning (see the docstring of dataset_mit_nsr() for details)
- the famous MNIST dataset
- R.A. Fisher's Iris dataset
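A minimal loading sketch (argument-free calls are assumed here; see the docstrings of the dataset_* functions for the actual options):

using NNHelferlein

iris  = dataset_iris()     # R.A. Fisher's Iris data
mnist = dataset_mnist()    # MNIST handwriting data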
API Reference
- Chains
- Layers
- Data providers
- Training
- Evaluation and accuracy
- ImageNet tools
- Other utils
- Pretrained networks
Index
NNHelferlein.AbstractChain
NNHelferlein.AbstractLayer
NNHelferlein.AbstractNN
NNHelferlein.Activation
NNHelferlein.AttentionMechanism
NNHelferlein.AttnBahdanau
NNHelferlein.AttnDot
NNHelferlein.AttnInFeed
NNHelferlein.AttnLocation
NNHelferlein.AttnLuong
NNHelferlein.BatchNorm
NNHelferlein.Chain
NNHelferlein.Classifier
NNHelferlein.Conv
NNHelferlein.DataLoader
NNHelferlein.DeConv
NNHelferlein.Dense
NNHelferlein.DepthwiseConv
NNHelferlein.Dropout
NNHelferlein.Embed
NNHelferlein.EmbedAminoAcids
NNHelferlein.FeatureSelection
NNHelferlein.Flat
NNHelferlein.GPUIterator
NNHelferlein.GaussianNoise
NNHelferlein.GlobalAveragePooling
NNHelferlein.ImageLoader
NNHelferlein.LayerNorm
NNHelferlein.Linear
NNHelferlein.Logistic
NNHelferlein.MBMasquerade
NNHelferlein.MBNoiser
NNHelferlein.MultiHeadAttn
NNHelferlein.Pad
NNHelferlein.PartialIterator
NNHelferlein.Pool
NNHelferlein.PositionalEncoding
NNHelferlein.PyFlat
NNHelferlein.Recurrent
NNHelferlein.RecurrentUnit
NNHelferlein.Regressor
NNHelferlein.ResNetBlock
NNHelferlein.SequenceData
NNHelferlein.Softmax
NNHelferlein.TFDecoder
NNHelferlein.TFDecoderLayer
NNHelferlein.TFEncoder
NNHelferlein.TFEncoderLayer
NNHelferlein.TokenTransformer
NNHelferlein.Transformer
NNHelferlein.UnPool
NNHelferlein.VAE
NNHelferlein.WordTokenizer
Base.:+
Base.summary
NNHelferlein.abs_error_acc
NNHelferlein.add_layer!
NNHelferlein.aminoacid_tokenizer
NNHelferlein.array2RGB
NNHelferlein.array2image
NNHelferlein.blowup_array
NNHelferlein.clean_sentence
NNHelferlein.confusion_matrix
NNHelferlein.convert2CuArray
NNHelferlein.convert2KnetArray
NNHelferlein.copy_network
NNHelferlein.crop_array
NNHelferlein.dataframe_minibatch
NNHelferlein.dataframe_read
NNHelferlein.dataframe_split
NNHelferlein.dataset_fashion_mnist
NNHelferlein.dataset_iris
NNHelferlein.dataset_mit_nsr
NNHelferlein.dataset_mnist
NNHelferlein.dataset_pfam
NNHelferlein.de_embed
NNHelferlein.dot_prod_attn
NNHelferlein.embed_blosum62
NNHelferlein.embed_vhse8
NNHelferlein.emptyCuArray
NNHelferlein.flatten
NNHelferlein.focal_bce
NNHelferlein.focal_nll
NNHelferlein.get_beta
NNHelferlein.get_cell_states
NNHelferlein.get_class_labels
NNHelferlein.get_hidden_states
NNHelferlein.get_imagenet_classes
NNHelferlein.get_resnet50v2
NNHelferlein.get_tatoeba_corpus
NNHelferlein.get_vgg16
NNHelferlein.global_average_pooling
NNHelferlein.hamming_dist
NNHelferlein.ifgpu
NNHelferlein.image2array
NNHelferlein.init0
NNHelferlein.load_network
NNHelferlein.merge_heads
NNHelferlein.minibatch_eval
NNHelferlein.mk_class_ids
NNHelferlein.mk_image_minibatch
NNHelferlein.mk_padding_mask
NNHelferlein.mk_peek_ahead_mask
NNHelferlein.pad_sequence
NNHelferlein.peak_finder_acc
NNHelferlein.positional_encoding_sincos
NNHelferlein.predict
NNHelferlein.predict_imagenet
NNHelferlein.predict_top5
NNHelferlein.preproc_imagenet_resnet
NNHelferlein.preproc_imagenet_resnetv2
NNHelferlein.preproc_imagenet_vgg
NNHelferlein.print_network
NNHelferlein.recycle_array
NNHelferlein.reset_cell_states!
NNHelferlein.reset_hidden_states!
NNHelferlein.save_network
NNHelferlein.separate_heads
NNHelferlein.sequence_minibatch
NNHelferlein.set_beta!
NNHelferlein.set_cell_states!
NNHelferlein.set_hidden_states!
NNHelferlein.split_minibatches
NNHelferlein.squared_error_acc
NNHelferlein.tb_train!
NNHelferlein.truncate_sequence
Changelog
The history can be found here: ChangeLog of NNHelferlein package