The primary programs created by running "make" in this directory
(after "make all install" in the root directory one level higher,
which builds and installs the dinrhiw library) are nntool and dstool.
Running "make install" here then installs dinrhiw-tools.
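
For example (directory names are illustrative; the actual targets are
the ones described above):

    cd ..               # dinrhiw library root, one level higher
    make all install    # build and install the dinrhiw library
    cd tools
    make                # builds nntool and dstool
    make install        # installs dinrhiw-tools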

After this, preprocess the raw data text files using the perl scripts:

process_data2.pl
process_data3.pl

These create text files whose data can be imported into dataset
files using dstool. The dataset files are then read by nntool
(the neural network code), which can be used to learn relationships
from the data.
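
A rough sketch of the flow; the script arguments and the exact dstool
and nntool invocations are not documented here and are assumptions
(the test scripts below contain working commands):

    perl process_data2.pl    # preprocess raw data text files
    perl process_data3.pl    # (input/output conventions depend on the scripts)
    # import the generated text files into a dataset file with dstool,
    # then point nntool at that dataset file for training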

The test scripts for the neural network training code are:

test_data.sh
test_data_parallel.sh
test_data2.sh
test_data2_parallel.sh
test_data2_parallel_random.sh
test_data3.sh
test_data3_parallel.sh

These scripts use both dstool and nntool and print some results.
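
For example, one of them can be run from this directory with:

    sh test_data.sh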

The scripts apply mean/variance removal and PCA preprocessing, so
that the data fed to the neural network code is approximately
normally distributed with zero mean and unit covariance, roughly
Normal(0, I).
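
In sketch form (notation mine, not from the source): with mean \mu and
covariance eigendecomposition C = V \Lambda V^T estimated from the
training data, each input x is transformed as

    x' = \Lambda^{-1/2} V^T (x - \mu)

which gives x' approximately zero mean and unit covariance.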

Because of this normalization, the mean error of the training process
is often usable for judging learning results. Values below 0.01 mean
that the neural network's errors are close to the minimum and the
results are usable, while higher mean error rates mean that the
neural network does NOT converge (as with dataset 3) and cannot be
used to predict future outcomes.

The "best" gradient descent code in nntool is "lbfgs" which
starts multistart parallel L-BFGS searches with NUMCORES threads
where NUMCORES is number of cores or hyperthreading units in CPU.

It keeps doing Limited memory BFGS optimization with early stopping
again and again from semirandomly chosen starting points until
timeout or number of iterations has been reached.
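
A hypothetical invocation sketch; the option names, argument order
and the network architecture string below are assumptions, not taken
from this document (the test scripts above contain working nntool
commands):

    NUMCORES=$(nproc)   # number of cores / hyperthreading units on Linux
    # hypothetical: train on dataset.ds with the "lbfgs" method
    nntool --threads $NUMCORES dataset.ds 10-20-1 net.cfg lbfgs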

DATA
----

The machine learning datasets are from the UCI Machine Learning Repository:

http://archive.ics.uci.edu/ml/

Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository 
[http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, 
School of Information and Computer Science.