Skip to content

cslr/dinrhiw2

Repository files navigation

WHAT DINRHIW2 IS?
-----------------

Dinrhiw is a linear algebra library and machine learning library.
Dinrhiw implements linear algebra, PCA+ICA, complex value neural network, 
variational autoencoder and deep learning 
(residual neural network+leaky ReLU non-linearity) codes. 

Currently, the feedforward neural network code supports:

* popular ADAM gradient descent (backpropagation)
* hamiltonian monte carlo sampling (HMC) and simple bayesian neural network
* parallel computing (via OpenMP)
* residual neural network (deep learning)
* experimental: (RBM) GB-RBM and BB-RBM 2nd order
  LBFGS optimizer and HMC sampler => greedy deep belief net optimizer
* experimental: reinforcement learning code
  
Read wiki pages for further documentation: https://github.com/cslr/dinrhiw2/wiki
  
Dinrhiw2 has been developed by Tomas Ukkonen.

BUILDING IT
-----------

You need GNU GCC (www.gcc.org). The code compiles both on Linux and
Windows (requires *nix environment). It is recommended to try
to compile and use the library initially on Linux.

GCC 8.* has bug in random_device, you need to use GCC >= 9.0 version
compiler or newer to work around this compiler bug.

Library dependencies:

* OpenBLAS, Intel MKL or AMD BLIS (cblas.h interface),
* GMP (arbitrary precision mathematics)
* Python [not really used]


To build and install library execute: 


./build.sh
make install


commands at the top level.


For the working examples how to use dinrhiw look at the *tools* directory
below the root dictory.

Building tools (you need to install bison parser):

cd tools
make all
su
make install

It creates programs:

dstool and nntool - dataset management and neural network weight learning.

So the keyfiles to read for the (neural network) documentation are

tools/nntool.cpp
src/dataset.h
src/neuralnetwork/nnetwork.h

Expecially the first one shows how to use nnetwork<> class properly to 
learn from the data. Note that it only accepts aprox. [-1,1] valued data
as a default so proper preprocessing of the data using dataset class can
be very important in order to keep data within the correct range 
(+ PCA preprocessing can make the learning from the data exponentially 
   faster in some cases).

IMPORTANT: There is also (alternative) neural network implementation
           in neuralnetwork.h and related files that DO NOT work.

Additionally, there are 

tools/test_data.sh
tools/test_data2.sh
tools/test_data3.sh
tools/test_data3_bayes.sh


scripts that shows how the learning and data prediction from example data 
works in practice using dstool and nntool commands
(only bayes and feedforward training).

Use of RBM neural networks requires direct use the library
(C++ interface classes).


ADDITIONAL NOTES
----------------

The library contains needless implementation of various algorithms
that DO NOT belong to this library. They will be removed slowly
before 1.0 release. (cleanup the code)


PLAN
----

In development:

Stacked RBMs (GB-RBM + BB-RBMs) and implementation of DBN networks,
which can be transformed to nnetwork and further trained using output examples.

Finally add support of parallel DBN pretraining + nnetwork optimization to tools ("nntool").

TODO

Reinforcement learning.