auto-nng is a program for the analysis and classification of data using artificial neural networks. You feed the program a number of datasets consisting of known input and output parameters, and it tries to find a mathematical correlation between them. Afterwards the program calculates the output parameters for datasets whose input parameters alone are known.
auto-nng is free software and can be used and enhanced under the terms of its MIT-style license. It was developed under the NetBSD operating system and also runs under Linux and other similar systems.
What kind of data can auto-nng process?
The data processed by auto-nng should be in binary form (e.g. answers to questions that can be answered with “yes” or “no”). This binary data is represented by real numbers between -1 (for “no”) and +1 (for “yes”). Values between these limits are possible too, for example 0 for “unknown” or 0.5 for “probably yes” or “mostly”. If the data is continuous rather than binary (e.g. measured quantities), auto-nng can transform those values into two fuzzy binary attributes (“smaller than average?” and “greater than average?”). The standard deviation is used as the scale for this transformation.
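The idea behind this fuzzification can be sketched as follows. This is an illustrative Python sketch, not auto-nng's actual transfer function (which differs and changed between versions, as the changelog below notes): each continuous value is expressed as a deviation from the mean in units of the standard deviation, then clamped into the [-1, +1] range of the two fuzzy attributes.

```python
import statistics

def fuzzify(values):
    """Turn one continuous attribute into two fuzzy binary attributes,
    'below average?' and 'above average?', scaled by the standard
    deviation and clamped to [-1, +1].  Illustrative sketch only."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    result = []
    for v in values:
        d = (v - mean) / sd                   # deviation in units of sigma
        below = max(-1.0, min(1.0, -d))       # +1 means clearly below average
        above = max(-1.0, min(1.0, d))        # +1 means clearly above average
        result.append((below, above))
    return result
```

A value exactly at the mean yields (0, 0), i.e. “unknown” for both questions, while values more than one standard deviation away saturate at -1/+1.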
It is very important to provide a large number of example datasets, as the program is based on learning and random processes, which do not work well with only a few datasets.
Which interfaces are available for auto-nng?
auto-nng accepts data as real numbers in the form of CSV files (comma-separated values). The results are presented as CSV files as well.
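As a rough illustration, such a file might be produced like this. The column layout shown (three input attributes followed by one output attribute, all in [-1, +1]) and all values are invented for this example; auto-nng's own documentation defines the exact file conventions it expects.

```python
import csv, io

# Hypothetical layout: three input attributes, then one output attribute
# per row, each a real number in [-1, +1].  Values are invented.
rows = [
    [ 1.0, -1.0,  0.5,  1.0],   # example with known output
    [-1.0,  1.0,  0.0, -1.0],
    [ 1.0,  1.0, -0.5,  0.0],   # 0.0 = "unknown"
]

buf = io.StringIO()
csv.writer(buf).writerows(rows)
print(buf.getvalue())
```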
How does auto-nng work?
As a first step, auto-nng converts the properties of datasets marked as “continuous” into two (fuzzy) binary attributes, which makes it easier for the neural networks to treat deviations differently depending on their direction (lower or higher). The standard deviation is used as the scale here. In the next step, all example datasets are randomly shuffled and split into two equally sized parts: one part is used for training the neural networks, the other is used only to test their ability to abstract. Then neural networks of different structures are constructed (depending on the number of input and output attributes). These can be seen as mathematical formulas of varying structure with multiple coefficients, which are supposed to transform the input attributes into the output attributes. Initially, all coefficients are zero (i.e. neutral) and always produce a neutral output (“unknown” or average). In the following process, all neural networks mutate slightly (through random changes of the coefficients). If a mutation improves the results on the training data, the change is kept; otherwise it is discarded. The mutation rate is regulated adaptively to keep the process as fast as possible.
Through this process, the neural networks try to discover a relation between the input and output values of the training data. To ensure that the ability to abstract is not lost, the networks are tested against the second part of the datasets, which was not used for training. The neural network with the best results on the test datasets is finally returned by the program.
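The search loop described above can be sketched in a few lines. This is a deliberately simplified Python model under stated assumptions: a single linear “network” (a weight vector) instead of auto-nng's multi-layer networks, a fixed mutation size instead of the adaptive rate, and no signal saturation. It keeps a mutation only if the training error drops, and returns the weights with the lowest test error seen along the way.

```python
import random

def train_by_mutation(train, test, n_inputs, steps=2000, seed=1):
    """Simplified model of auto-nng's search: random mutation of
    coefficients, keep on training improvement, select by test error."""
    rng = random.Random(seed)
    w = [0.0] * n_inputs                          # start neutral: all zero

    def err(ws, data):
        return sum((sum(x * wi for x, wi in zip(xs, ws)) - y) ** 2
                   for xs, y in data)

    train_err = err(w, train)
    best_w, best_test = list(w), err(w, test)
    for _ in range(steps):
        cand = [wi + rng.gauss(0, 0.1) for wi in w]   # random mutation
        e = err(cand, train)
        if e < train_err:                         # keep successful mutations
            w, train_err = cand, e
            t = err(w, test)
            if t < best_test:                     # track ability to abstract
                best_w, best_test = list(w), t
    return best_w
```

For data generated by y = x1 - x2, this loop drifts from the neutral starting point toward weights close to (1, -1), illustrating how accepted mutations accumulate into a fitted model.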
How accurate are the results of auto-nng?
auto-nng is experimental software. Its behaviour depends, among other things, on random processes, so the reliability of the results cannot be predicted. When using continuous output values, results are limited to a deviation of at most two times the standard deviation of the training/test data.
- 2011-08-14: Version 1.7 released
- Changed the calculation of layer sizes to reduce computational complexity
- Removed unused code
- Changed timeout to 90 seconds in test script
- 2011-07-31: Version 1.6 released
- Use the inverse of x → x / (1−x²) instead of the hyperbolic tangent for signal saturation
- Use two new distinct functions for fuzzification of continuous input and output values to avoid errors during defuzzification
- 2008-03-11: Version 1.5 released
- The source now includes "string.h" instead of "strings.h" to suppress compiler warnings on some platforms.
- 2008-03-09: Version 1.4 released
- The mutation rate is now specific for every weight, allowing mutation to be directed to those weights where mutation was most successful.
- 2007-05-04: Version 1.3 released
- Optimized the transformation function for continuous input and output values to cause a smaller error during defuzzification
- 2007-05-01: Version 1.2 released
- Changed the transformation function for continuous input and output values to be continuously differentiable
- 2007-04-28: Version 1.1 released
- Fixed a bug which caused auto-nng to crash under Linux