Welcome back, developers!
Today, I am going to begin discussing Neural Network theory! In this post, I will cover what a neuron is, along with two different models of a neuron: the Perceptron and the Sigmoid Neuron.
Artificial Neural Networks are designed to mimic the networks of neurons found in our own brains. The neuron serves as the primary building block of the brain, as well as of the Artificial Neural Networks we’ll be discussing here.
Every second, billions of neurons fire in your brain. When a neuron fires, a small electrical signal travels down its axon and across synapses, where it passes information on to other neurons. This chain reaction continues on and on, and this is how we learn, talk, and recognize patterns. Everything we do stems from the way our brains process information.
So, how does this relate to Artificial Neural Networks? Well, as I stated, Artificial Neural Networks were designed using the brain as inspiration, so neurons play the same role in an Artificial Neural Network as they do in your mind. With that said, let us formally define the Artificial Neuron.
An Artificial Neuron is the building block of all Neural Networks. Conventionally, it takes one or more inputs, evaluates them, and then determines its output algorithmically. The rule it uses to determine that output is what we’ll come to know as the Activation Function.
To gain a better intuition for how a simple neuron works, imagine you have a decision to make. The decision serves as the neuron. For you to make this decision, enough conditions must be met; how much is enough is what we will call the neuron’s threshold. Some of these conditions may carry more value than others, so we can say that a particular condition “weighs” more. Once we know the conditions, we can evaluate them with respect to their weights. Finally, if together they meet the threshold, we make the decision; otherwise, we choose not to.
To put this in perspective, say it’s mid-afternoon, and you’re hungry. Whether or not you have a snack might depend on a number of conditions. Let’s take a look at the image below for a visualization.
We have 4 conditions, and each of them is given a value of 1 or 0, depending on whether the answer to the question is true. These conditions serve as inputs into the neuron, which is the final decision maker. But we are forgetting the weights: whether or not you are excited for dinner that day may have a bigger pull than whether you had a lighter lunch.
The weights for each of these conditions could change based upon whoever is making the decision, but for this example let’s assume they are 2, 2, 1, and 1, respectively. Then, let’s give the neuron a threshold of 5.
We then multiply each condition’s value by its weight and add up the products. If the sum is greater than or equal to the neuron’s threshold, you make the decision to have a snack; if it is below, you choose not to.
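The arithmetic of the snack decision can be sketched in a few lines of Python. The weights (2, 2, 1, 1) and the threshold (5) come from the example above; the function names and the sample answers to the four conditions are my own, made up purely for illustration.

```python
# Weights and threshold from the snack example; everything else is illustrative.
WEIGHTS = [2, 2, 1, 1]
THRESHOLD = 5

def weighted_sum(inputs, weights):
    """Multiply each 0/1 condition by its weight and add up the products."""
    return sum(x * w for x, w in zip(inputs, weights))

def have_snack(inputs):
    """Return True when the weighted sum reaches the threshold."""
    return weighted_sum(inputs, WEIGHTS) >= THRESHOLD

# Hypothetical answers to the four conditions (1 = true, 0 = false).
print(have_snack([1, 1, 1, 0]))  # 2 + 2 + 1 + 0 = 5, so this prints True
print(have_snack([1, 0, 1, 1]))  # 2 + 0 + 1 + 1 = 4, so this prints False
```

Notice that with these weights, both of the heavier conditions have to be true before a snack is even possible, because the two lighter ones only add up to 2.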
A simple decision like this demonstrates the use of such a simple neuron, and it gives us a basis for the first model of neuron that we’ll look at: the Perceptron.
Before we begin the next section, I feel I should explain the main difference between these two neuron models. As we’ll see in just a moment, every Artificial Neuron has what is known as an Activation Function. This Activation Function is called every time the neuron is given some sort of input. So, for example, in the case of the snack decision above, the Activation Function may be something like: if the criteria have been met, then output “I will have a snack,” and if the criteria haven’t been met, then output “I will not have a snack.” It is this function that makes such a big difference between the two neurons that we will examine here today.
The Perceptron is the simplest model of a neuron. It takes any number of inputs, weighs them, and then produces a binary output. The exact output values are not the same in every Neural Network: some Perceptrons output -1 and 1, some 0 and -1, and some 0 and 1. The concept remains the same throughout: the neuron returns either a yes or a no. For this reason, the Perceptron works very well in situations where a decision must be made; however, it doesn’t perform as well for pattern recognition.
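Here is a minimal sketch of that idea, assuming the same "weighted sum versus threshold" rule as the snack example. The function name, the `on`/`off` parameters, and the example weights are hypothetical choices of mine; they only illustrate that the pair of output values is a convention, while the yes-or-no behaviour stays the same.

```python
def perceptron(inputs, weights, threshold, on=1, off=0):
    """Weigh the inputs, then return one of exactly two values."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return on if total >= threshold else off

weights = [1, -2, 3]  # arbitrary example weights

# 0/1 convention: 1*1 + 0*(-2) + 1*3 = 4, which reaches the threshold of 3.
print(perceptron([1, 0, 1], weights, threshold=3))                  # prints 1

# -1/1 convention: 0*1 + 1*(-2) + 0*3 = -2, which is below the threshold.
print(perceptron([0, 1, 0], weights, threshold=3, on=1, off=-1))    # prints -1
```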
Since we already have an understanding of what an Artificial Neuron is, and the example I gave above illustrates the concept of a Perceptron well, I will simply explain what makes the Perceptron not very suitable for pattern recognition.
All learning done with a Perceptron involves manipulating the weights. In the examples here, the Perceptron’s weights are whole numbers, although in general they can be any real number. Because the output is all-or-nothing, a small change to a weight either does nothing or flips the output completely, so there is no gentle way for the network to adjust itself. As the network keeps changing its weights, it can become over-saturated, which is when the weights get too big to manage and the whole thing becomes too complex.
This can become a real problem when it comes to learning to recognize very detailed patterns, such as pictures or handwriting. So, the problem lies within the weights and the Activation Function of the neuron. This is where the even more common Sigmoid Neuron comes into play.
The Sigmoid Model
The setup of a Sigmoid Neuron is no different from the setup of a Perceptron. The only differences are in how the weights and the Activation Function are set up. With the Perceptron model, the Activation Function acts like a step function: it’s either 1 or 0, and as we explained before, this is not a good model for a Neural Network that is going to be recognizing patterns. Instead, we need to change this Activation Function so that there is some “wiggle” room for the Network to learn. Instead of being 1 or 0, the output needs to be somewhere between 0 and 1.
It turns out that we accomplish this by running our weighted sum of inputs through a function called the sigmoid, or logistic, function. Check it out:
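For reference, the sigmoid function is usually written as below. The symbol names are my own notation and are not taken from the post’s figures: z stands for the weighted sum, and x_i and w_i stand for the i-th input and its weight.

```latex
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad z = \sum_i w_i x_i
```

Large negative values of z give an output close to 0, large positive values give an output close to 1, and z = 0 gives exactly 0.5.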
Now when we feed our inputs into the Neural Network, a small change in the weights produces only a small change in the output, which makes gradual learning possible. These values are much more subtle, and they result in patterns that are much more like those found in nature than the binary values output by the Perceptron. Here is a graph comparing the output of the Sigmoid Function, on the bottom, with the Perceptron Activation Function, on the top.
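To put some numbers on that comparison, here is a small sketch that evaluates both activation functions over a handful of weighted sums. The function names and the sample values of z are my own, and I am assuming the step function’s threshold has been shifted to 0 so the two line up.

```python
import math

def step(z, threshold=0.0):
    """Perceptron-style activation: jumps straight from 0 to 1."""
    return 1 if z >= threshold else 0

def sigmoid(z):
    """Logistic function: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# z stands for the weighted sum of the inputs.
for z in [-4, -1, -0.1, 0, 0.1, 1, 4]:
    print(f"z = {z:5.1f}   step = {step(z)}   sigmoid = {sigmoid(z):.3f}")
```

Around z = 0 the step output flips from 0 to 1 all at once, while the sigmoid output only moves a little to either side of 0.5. That small, smooth movement is the “wiggle” room.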
Fun with Perceptrons
I found that you can emulate Logic Gates with these basic Perceptrons, and I decided that I would share them with you here. If you are unfamiliar with Logic Gates, then I will explain them as I go.
This first one demonstrates a simple ‘AND’ gate. This Logic Gate takes two bits as input, and if both of the bits are 1, the Neuron’s Activation Function will output a 1. For every other set of inputs, this Neuron will output a 0.
The next one emulates an ‘OR’ gate. This Logic Gate takes two bits as input, and if either of these bits is a 1, the Neuron evaluates to a 1. For every other set of inputs, this Neuron will output a 0.
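The two gates can be sketched as Perceptrons in code as well. The weights and thresholds below are one choice that works and are my own assumption, so they are not necessarily the values used in the diagrams: both gates use weights of 1 and 1, with a threshold of 2 for AND and 1 for OR.

```python
def perceptron(inputs, weights, threshold):
    """Output 1 if the weighted sum reaches the threshold, otherwise 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

def and_gate(a, b):
    # Only 1 + 1 = 2 reaches the threshold of 2.
    return perceptron([a, b], weights=[1, 1], threshold=2)

def or_gate(a, b):
    # A single 1 is already enough to reach the threshold of 1.
    return perceptron([a, b], weights=[1, 1], threshold=1)

# Print the full truth table for both gates.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", and_gate(a, b), "OR:", or_gate(a, b))
```

The only thing that changes between the two gates is the threshold, which is a nice reminder of how much of a Perceptron’s behaviour lives in that one number.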
Now that you’ve seen these two examples, have a look at the ‘XOR’ gate and try to create a diagram for yourself. A hint: a single Perceptron can’t do it, so you will need to combine a few of them. This will give you a better idea of how these Perceptrons work.
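If you get stuck on XOR, the sketch below may explain why. It tries every combination of small whole-number weights and thresholds (the range of -3 to 3 is an arbitrary choice of mine, and the helper names are hypothetical) and looks for a single Perceptron that reproduces each truth table. It finds one for AND and for OR, but none for XOR, which suggests your XOR diagram will need more than one Perceptron.

```python
from itertools import product

def perceptron(inputs, weights, threshold):
    """Output 1 if the weighted sum reaches the threshold, otherwise 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

def find_single_perceptron(truth_table, values=range(-3, 4)):
    """Search small whole-number settings for one Perceptron matching the table."""
    for w1, w2, threshold in product(values, repeat=3):
        if all(perceptron([a, b], [w1, w2], threshold) == out
               for (a, b), out in truth_table.items()):
            return (w1, w2, threshold)
    return None  # nothing in the searched range works

AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
OR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

for name, table in [("AND", AND), ("OR", OR), ("XOR", XOR)]:
    print(name, find_single_perceptron(table))  # XOR prints None
```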
If you enjoyed this post on Neural Network Theory, and would like to read more about this stuff, leave a comment and let me know what you liked about it. If you think that I missed something, or said something wrong, then let me know and I’ll fix it as soon as possible.
Until then, my friends, enjoy the rest of your weekend!