Neural Network Theory: Composition of a Neural Network

Welcome Developers!

In this post, I will be covering the basic composition of a Neural Network, as well as some of the basic functionality. I’ll also include a brief comparison between a couple of popular Machine Learning Techniques and Neural Networks, so you can get an idea of how awesome Neural Networks really are!

Neural Network Components

All Neural Networks are composed of a few basic parts. At its core, a Neural Network is really just a network of interconnected Neurons. Each Neuron takes an input, evaluates it, and outputs a value. However, there is more to a Neuron's input than meets the eye.

The Neuron

The Neuron plays such an important role within the Neural Network that I even wrote a whole blog post covering them; you can check it out here. For this post, however, I will only briefly explain their purpose.

Neurons serve as the workhorses of Neural Networks. Without them, the Network would be rendered useless. Neurons receive signals as input, run them through a function known as the Activation Function, and output the corresponding value. These values are then propagated through the rest of the Neural Network via the connections I mentioned above, running through the same process over and over again until they arrive at the output layer.

The Activation Function is the most important part of a Neuron, because without the Activation Function, the Neuron has no way to evaluate its inputs, and therefore has no output to feed the rest of the Neural Network.

A simple Activation Function for a Neuron follows the model of a Perceptron. A Perceptron uses a binary Activation Function, meaning it has only two possible outputs, True or False for example. There are other types of Activation Functions, such as the Sigmoid function, but they are more complex.
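To make that concrete, here is what those two Activation Functions might look like in Python. This is just a minimal sketch, and the function names are mine, purely for illustration:

import math

def perceptron(weighted_sum):
    # Binary step: only two possible outputs, like True or False.
    return 1 if weighted_sum > 0 else 0

def sigmoid(weighted_sum):
    # Smoothly squashes any input into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-weighted_sum))

print(perceptron(0.7), sigmoid(0.7))  # -> 1 0.668...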

Weights

Each Neuron is connected to other Neurons via a line.

[Image: a complex Neural Network of interconnected Neurons]

These lines carry a special value, known as a ‘weight.’ Each weight is multiplied by the input value received from the connected Neuron. It is by adjusting these weights that the Neural Network begins to learn. The weights are typically adjusted by a regression algorithm, through a process known as back-propagation, which we will discuss in a later section.
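To give you a rough idea, here is a tiny sketch of how a single Neuron might combine its weighted inputs. The numbers are made up for illustration:

def neuron_output(inputs, weights):
    # Multiply each incoming value by the weight on its connection,
    # then sum the results.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    # A Perceptron-style step: fire (1) if the sum is positive, else 0.
    return 1 if weighted_sum > 0 else 0

print(neuron_output([0.5, 0.3], [0.8, -0.2]))  # -> 1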

But why do we use weights? Why waste all that extra time multiplying input values, when you could just feed them directly into a Neuron and have it evaluate the raw input? The reason, as I hinted at above, is that the weights give the Network a way to adjust itself, allowing it to establish rules for itself.

For example, if you have a Neural Network that you intend to use to predict next week's weather, your algorithm will need a way to adjust itself to learn different weather patterns. Based on those patterns, the Network will be able to predict next week's weather.

If we were to just feed the raw input values into a Neural Network, the Network would not be able to learn to detect different weather patterns. Without the weights, the concept of a Neural Net is rendered useless; it would simply become a chain of if-else statements.

The weights in a Neural Network allow the algorithm to write the rules for itself, rather than requiring us to write the rules for it. This is what makes Neural Networks so powerful! Writing rules for a program by hand is tedious, and you probably won't manage to cover all of the edge cases anyway. By allowing the program to do it, not only do you get a more accurate answer, but you also spend much less time and mental energy on the tedious parts of predicting the weather.

Neural Network Layers

Neural Networks are typically laid out in a series of layers: the input layer, which serves as the entry point for all inputs; the hidden layers, which establish the rules and evaluate the inputs; and finally the output layer.

In every Neural Network, there is always an input and an output layer, and typically at least one hidden layer. The difference is that while there is only ever one input layer and one output layer, there can be many hidden layers, as many as you want in fact. However, you should note that it is possible to have too many hidden layers.

For example, if you write a Neural Network to classify handwritten digits, you can typically get good results with only 3 layers: Input, Hidden, Output. For a simple task like that, adding more layers just adds unneeded complexity, which should be avoided.
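For a rough idea of what that 3-layer setup looks like, here is a minimal sketch. The layer sizes (784 pixel inputs for a 28x28 image, 30 hidden Neurons, 10 outputs, one per digit) are just a common illustrative choice, and the weights here are random, untrained ones:

import numpy as np

rng = np.random.default_rng(0)
w_hidden = rng.standard_normal((30, 784))  # weights: input layer -> hidden layer
w_output = rng.standard_normal((10, 30))   # weights: hidden layer -> output layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feed_forward(pixels):
    hidden = sigmoid(w_hidden @ pixels)   # input layer -> hidden layer
    return sigmoid(w_output @ hidden)     # hidden layer -> output layer

print(feed_forward(rng.random(784)))  # 10 output values, one per digit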

It is also interesting to note that there is nothing mysterious about the word ‘hidden.’ It is simply a naming convention for any layer that is neither the input nor the output layer.

Back-propagation and Regression

Back-Propagation and Regression are both parts of training a Machine Learning Algorithm. In our case, back-propagation is responsible for working out, based on a cost function, how the weights in our Neural Network should change. Regression, on the other hand, is the method we use to actually make those changes and train the Network.

Back-Propagation

Although this term sounds intimidating, it is really simple to understand. So, let me explain.

Whenever you train your Neural Network, back-propagation always has a role in helping it learn. The back-propagation algorithm relies heavily on knowing how well the Neural Network is doing. Using our cost function, which simply tells us how badly the Neural Network is doing, it determines the error for each output neuron and works backwards through the Network, editing the weights.

For example, let’s use our weather prediction idea. We have set up our Neural Network for supervised learning, and we have our weather data set, which supplies both the data we need to train the Network and the correct label for each instance in the data set. But we still need a way to tell how poorly the Network is doing.

The Cost Function

The cost function determines the error of the Neural Network’s output. To compute it, we select the correct label for the data set instance we are working on and compare it against the output of our Network. We can then define the cost function simply as the difference between the target output and the Network’s output:

Cost (or error) = target - output
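In Python, that could be as simple as the sketch below. Note that real Networks more often use a squared difference, (target - output) ** 2, but the idea is the same:

def cost(target, output):
    # How far the Network's output landed from the correct answer.
    return target - output

print(cost(1.0, 0.73))  # -> an error of about 0.27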

We now have our cost function, but how do we use it to fine-tune every single weight in the Network? Simple! The output of a neuron is determined by the weighted sum of all of its inputs, and some inputs carry a higher weight than others. So we can deduce that a more heavily weighted input contributes more to the output of the neuron than the others. As a result, we divide the error accordingly.

Once we determine which weights need to be adjusted, we can begin to manipulate them accordingly. By dividing the error up according to the weight of the input values, we can then begin to adjust them using our regression algorithm.
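Here is a toy illustration of dividing the blame up by weight. Real back-propagation uses calculus to compute each weight’s exact contribution, but the intuition is the same:

def split_error(error, weights):
    # Give each incoming connection a share of the blame
    # in proportion to the size of its weight.
    total = sum(abs(w) for w in weights)
    return [error * abs(w) / total for w in weights]

print(split_error(0.27, [0.8, -0.2, 0.5]))  # biggest weight gets the biggest share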

Regression

Think back to your Algebra classes. One of the most repetitive things your teacher probably had you do was linear regression problems. These problems typically give you a y-intercept, or a slope and a point on a graph, and with this information you were supposed to find an equation in the form y = mx + b. This is basically what our regression algorithm will do for us, except our regression problems are much more complex.

The regression algorithm is what manipulates the weights of our Neural Network. One algorithm that is often used is ‘Stochastic Gradient Descent.’ It is typically explained by having the student imagine a valley, and then imagine what happens as a ball rolls down a hill into the valley. As the ball moves down the slope of the hill and enters the valley, it gets closer and closer to the lowest point. In our case, the valley represents our cost function, and the ball represents our current error.

Each time we train our algorithm on one of our data set instances and determine our output error, the regression algorithm works out which way the ball needs to move, and by how much, in order to approach the minimum error. In other words, it determines how to adjust the weights in our Network so that we achieve the minimum error.
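A single step of that process might look like this sketch, where the learning rate (how big a step the ball takes each time) is just an illustrative value:

LEARNING_RATE = 0.1

def sgd_step(weight, gradient):
    # Move the weight a small step 'downhill', against the slope
    # of the cost function, so the error shrinks.
    return weight - LEARNING_RATE * gradient

print(sgd_step(0.8, 0.27))  # -> 0.773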

Neural Network Comparisons

Neural Networks are one of many different Machine Learning Algorithms. Let us compare them with two other popular ones: Random Forests and Support Vector Machines.

Random Forests

Like Neural Networks, Random Forests are modeled on how we process information on a daily basis. Random Forests are built from decision trees, much like the one in the image below…

[Image: an example decision tree]

As Random Forests are trained, they ‘learn’ to classify different objects. They grow more limbs along their trees, allowing them to accurately identify a lot of different things.

Neural Networks are different from Random Forests in that they don’t rely on a long chain of binary decisions to come to a conclusion. That is, they don’t just make a lot of simple yes-or-no choices to arrive at an answer like a Random Forest does; rather, a Neural Network’s output is determined by the combined computation of all the Neurons in the Network.

These are two very interesting concepts in that they clearly demonstrate some of the more and less obvious ways that we, as humans, learn.
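If you would like to try a Random Forest yourself, scikit-learn makes it easy. This sketch uses the classic iris flower data set purely as a stand-in:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)                  # measurements and labels
forest = RandomForestClassifier(n_estimators=100)  # a forest of 100 decision trees
forest.fit(X, y)                                   # grow the trees
print(forest.predict(X[:3]))                       # classify the first three flowers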

Support Vector Machines

The Support Vector Machine is one of the most popular Machine Learning Techniques, and perhaps one of the most powerful. It is capable of both linear and non-linear classification!

Compared to Neural Networks, these are completely different beasts. Support Vector Machines are great for classification, and thanks to the ‘kernel trick’ they can cleanly separate data whether it is linearly separable or not. Neural Networks can also learn non-linear boundaries, but they get there differently, through their hidden layers and Activation Functions. Take the image below for instance:

[Image: a polynomial-kernel SVM decision boundary, from the book “Hands-On Machine Learning with Scikit-Learn & TensorFlow”]

I haven’t learned a lot in the way of Support Vector Machines yet, but from what I have seen, they can be very powerful when used the right way.
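If you want to experiment with one anyway, here is a minimal scikit-learn sketch that fits a polynomial-kernel SVM (like the one in the image) to data that no straight line can separate:

from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Two interleaving half-moons: a data set no straight line can split.
X, y = make_moons(n_samples=200, noise=0.15, random_state=42)
clf = SVC(kernel="poly", degree=3)  # polynomial kernel
clf.fit(X, y)
print(clf.score(X, y))  # accuracy on the training data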


If you enjoyed this first post, feel free to follow my blog; expect new posts every Saturday. And feel free to leave a comment letting me know what you enjoyed about it. I’d love to hear your thoughts!

Stay tuned for next Saturday to learn more about Tensorflow! Until then, enjoy the rest of your weekend!
