Introduction to Neural Networks

Sunday, 12 February 12, 12:24 pm
Artificial Intelligence begins with the study of intelligence in biological systems and takes the knowledge and theories gained from that to engineer systems which exhibit intelligence. Neural networks are a perfect example of this: abstract mathematical models of the neurons that make up the brain. In some cases, neural networks are used in research to model the brain, and in these cases the aim is to make the artificial neurons resemble their biological counterparts as closely as possible.

At the other end of the spectrum, neural nets are used in real-world engineering applications to address problems that are too complex or imprecise for a practical solution based on traditional models, and then it's not so important to simulate biological neurons accurately. The major difference between neural networks and traditional computational approaches is that a neural net features many processors working in parallel rather than a single processor working in series. This leads to greater fault tolerance, but one drawback with current networks is that each one has to be designed and configured for one specific task.

Biological Basis for Artificial Neurons

A real neuron in the brain consists of the main body - a blob with a large nucleus in the centre - with many fine tendrils sprouting off in all directions (dendrites) and one long thicker stem - the axon - snaking off, itself ending in a branching out of fine tendrils. The dendrites receive inputs from other neurons via synapses, where the conductivity of each synapse is regulated by the concentrations of various carrier/inhibitor chemicals in that synapse.

Whether or not a neuron fires is determined as a function of the incoming signals from the dendrites, and the output is carried down the axon to synapses of other neurons. Billions of these in a big glob are what we know as the brain. During learning, the concentrations of brain chemicals in different synapses change so that the system as a whole is more likely to produce the correct solution to a problem - i.e. the animal is more likely to react appropriately to a given stimulus.

Threshold Logic Unit

Using this biological system as a blueprint, the threshold logic unit has multiple inputs each having its own weighting which together determine the TLU's single output signal. The usual method is to multiply each input by its weight, add them all together, and fire the output if the result is above a set threshold. Thus the weight of each input corresponds to the concentration levels of a synapse in the brain.

The TLU makes for a very simple mathematical model, where the set of inputs can be treated as a vector, X = (x1, x2, ..., xn), and there is a corresponding set of weights which is also a vector, W = (w1, w2, ..., wn). The sum of each input times its weight is simply the vector dot product X · W, i.e. the summation Σ xi × wi for i = 1 to n. The threshold of a TLU is usually denoted by theta, θ.
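As a rough illustration, a TLU can be sketched in a few lines of Python. The function name tlu_output and the example weights and threshold are my own choices for demonstration, not part of any library; the values are picked so that a two-input unit behaves like a logical AND.

```python
def tlu_output(inputs, weights, theta):
    """Return 1 if the weighted sum X . W exceeds the threshold theta, else 0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Dot product: multiply each input by its weight and add them all together.
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation > theta else 0


# Illustrative (assumed) values: with both weights at 1.0 and theta at 1.5,
# the unit only fires when both inputs are on.
and_weights = [1.0, 1.0]
and_theta = 1.5
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, "->", tlu_output(x, and_weights, and_theta))
```

Changing only the threshold to 0.5 would turn the same unit into a logical OR, which shows how much of a TLU's behaviour is carried by its weights and threshold.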

Automated Learning by Error Propagation

So the basic principle of a neural network is that multiple inputs are weighted and fed into a number of TLUs, each of which feeds into further TLUs with their own sets of weights, and finally the network produces an output signal that determines an intelligent response to those particular inputs. Clearly the weights for each TLU are crucial, but how do we know what they should be?
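To make the idea of TLUs feeding into further TLUs concrete, here is a minimal sketch of a two-layer network with hand-picked weights. Everything here (the function names, the weights, the thresholds) is assumed for illustration: two hidden units act as OR and AND, and the output unit fires when OR is on but AND is off, which gives exclusive-or, something a single TLU cannot compute.

```python
def tlu_output(inputs, weights, theta):
    """Fire (return 1) if the weighted sum of the inputs exceeds theta."""
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation > theta else 0


def xor_network(x1, x2):
    """Two hidden TLUs feed one output TLU; all weights are chosen by hand."""
    inputs = [x1, x2]
    hidden_or = tlu_output(inputs, [1.0, 1.0], 0.5)    # on if either input is on
    hidden_and = tlu_output(inputs, [1.0, 1.0], 1.5)   # on only if both are on
    # The output unit weights the OR unit positively and the AND unit negatively.
    return tlu_output([hidden_or, hidden_and], [1.0, -1.0], 0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_network(a, b))
```

Picking weights by hand like this only works for toy problems, which is exactly why an automated way of finding them is needed.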

Ideally, the neural network would be like the brain, and teach itself, i.e. learn from experience. There are many algorithms to accomplish this, and Error Propagation is just one. It requires a training session, where the network is fed test inputs for which the correct output is known. For a single TLU, the principle is fairly straightforward and consists of feeding the test input values to the network and comparing its output with the correct output. For each set of test inputs, we will be able to determine the network's error, i.e. the difference between its actual output and the correct output. If the actual output is too low, we increase the weights by a small amount in proportion to their inputs, and decrease them if the output is too high, using the following formula:

new weight = old weight + α × (Out_target - Out_actual) × In

Here, α is the learning rate, typically a small value such as 0.1 or 0.05, and In is the input value arriving on the connection that the weight belongs to. The idea is that for each set of test data, we only change the weights a little bit, to help ensure that a single unusual example doesn't skew all the results excessively.
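The update rule can be tried out on a single TLU with a short Python sketch. The names train_tlu and AND_SAMPLES are hypothetical, and I am assuming a fixed threshold of 0.5, weights that start at zero, and a stopping rule (quit once a full pass over the data produces no errors); none of these details are prescribed by the formula itself.

```python
def tlu_output(inputs, weights, theta):
    """Fire (return 1) if the weighted sum of the inputs exceeds theta."""
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation > theta else 0


def train_tlu(samples, theta, alpha=0.1, max_epochs=100):
    """Adjust the weights of one TLU using
    new weight = old weight + alpha * (target - actual) * input."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs          # assumed starting point
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in samples:
            error = target - tlu_output(inputs, weights, theta)
            if error != 0:
                mistakes += 1
                weights = [w + alpha * error * x
                           for w, x in zip(weights, inputs)]
        if mistakes == 0:               # every sample classified correctly
            break
    return weights


# Training data for a logical AND: (inputs, correct output).
AND_SAMPLES = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

trained = train_tlu(AND_SAMPLES, theta=0.5)
for inputs, target in AND_SAMPLES:
    print(inputs, "target:", target,
          "output:", tlu_output(inputs, trained, theta=0.5))
```

Note that a weight whose input is 0 is left unchanged by the rule, since the correction is multiplied by the input; only the connections that actually contributed to the wrong answer get adjusted.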

Things do however become a bit more complicated when we want to train a full network rather than just a single TLU. The algorithm used then is known as backpropagation.
