A simple Neural Network to learn the Iris data. The program
runs for 5000 epochs, although it should converge within a few hundred.
The number of samples misclassified in the current epoch and the smallest
number of misclassifications in the current run are plotted.