Tensorflow with Python: Learning to Read Handwritten Numbers

Welcome Developers!

Today we will dive into making our first Neural Network. We will be doing this using Python and Tensorflow. The data set that we will train our Neural Network on is known as the MNIST (Modified National Institute of Standards and Technology) handwritten digit data set. This data set is widely available from many different sources, but Google has prepackaged a copy for us in the Tensorflow library, so we will be using that one.

Creating a Plan

Just like all programs, Neural Networks have to be designed. Here is where we will plan the design of our Network.

We will be using a Machine Learning technique known as Supervised learning. Supervised learning involves training your Machine Learning Algorithm, in this case a Neural Network, by giving it data while also supplying it with the correct classification (a.k.a. the answers). These answers are commonly known as the data set’s labels.

MNIST

Here are some examples of what our data set actually looks like. Now, we’re not going to be using a dedicated Image Recognition algorithm in this tutorial; besides, that would probably be much slower. Instead, we will treat each of the images as a set of gray scale values.

So, each handwritten digit is 28 pixels by 28 pixels, leaving us 784 individual pixels. Each of those pixels has a corresponding gray scale value. We will use these gray scale values to train our Neural Network, because it is the different gray scale values that make up the handwritten digit.
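To make the 28 by 28 to 784 step concrete, here is a minimal sketch in plain Python. The image below is a made-up stand-in filled with zeros, and the helper name flatten is hypothetical; it is not part of Tensorflow or the MNIST loader.

```python
# Hypothetical 28x28 image: a list of 28 rows, each holding 28 gray scale values.
# Real MNIST pixels come from the data set; zeros are used here as a stand-in.
ROWS, COLS = 28, 28
image = [[0.0 for _ in range(COLS)] for _ in range(ROWS)]
image[14][14] = 1.0  # pretend one pixel in the middle has ink on it

def flatten(grid):
    """Concatenate the rows of a 2-D grid into one flat list."""
    return [value for row in grid for value in row]

pixels = flatten(image)
print(len(pixels))  # 784
```

Each of the 784 entries in pixels would then feed one input node of the Network.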

Each of these images is supplied with a corresponding “solution,” or “label,” telling us what the image actually represents. We will use these labels to train our Neural Network. Every time our Neural Network produces a classification of an image that doesn’t match its corresponding label, the Neural Network should adjust itself so that its answer moves closer to the label.

Since we know what our inputs are going to be, as well as our outputs, we can come up with an idea of how our Neural Network will look. We will need 784 different input values and 10 output values (one for each digit from 0 to 9), meaning 784 input nodes and 10 output nodes. Each individual gray scale value will serve as an input into the Network, which is why we have 784 inputs.

Implementation

Setup

In order to use the Tensorflow API, we must first import it. We will do this by adding this first line to our code.

import tensorflow as tf

Now, we have access to the Tensorflow Library. Let’s start setting up the variables that we are going to need. We know that we are going to need a place to store the current instance of data, as well as the labels. We also know that those input nodes are going to be connected to our output layer (this simple model has no hidden layer), where each input is multiplied by a weight and the results are then summed with a bias. So, let us create these…

input_layer = tf.placeholder(tf.float32, [None, 784])

input_weights = tf.Variable(tf.zeros([784, 10]))
input_biases = tf.Variable(tf.zeros([10]))

Alright, we now have a place to store the data that we feed into the Network, as well as a place to store the weights and biases for those inputs. What we need to do now is set up the Neural Network’s Model.

For this implementation, we will be utilizing a Softmax Regression algorithm to help our Network learn. This algorithm comes pre-packaged with Tensorflow, so all we have to do is call it.

output = tf.nn.softmax(tf.matmul(input_layer, input_weights) + input_biases)
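As an aside, here is a rough sketch of what that one line computes for a single image, written in plain Python rather than Tensorflow. The function names softmax and forward are hypothetical helpers, and the example uses 3 inputs and 2 outputs instead of 784 and 10 to keep it readable.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(pixels, weights, biases):
    """Compute softmax(pixels * weights + biases) for a single image.

    weights[i][j] connects input i to output j, matching the [784, 10] shape.
    """
    scores = [
        sum(pixels[i] * weights[i][j] for i in range(len(pixels))) + biases[j]
        for j in range(len(biases))
    ]
    return softmax(scores)

# Tiny made-up example: 3 inputs and 2 outputs instead of 784 and 10.
pixels = [1.0, 0.0, 0.5]
weights = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]  # all zeros, like tf.zeros
biases = [0.0, 0.0]
print(forward(pixels, weights, biases))  # [0.5, 0.5]
```

With every weight and bias at zero, the untrained model has no preference and gives each class the same probability.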

This is amazing: in just 5 lines of Tensorflow code we were able to set up a Neural Network to classify handwritten digits. But our job is not over yet. We still need some way to teach our Neural Network how to classify these handwritten digits. This is our next step.

Training The Network

In order to train our Neural Network, we are going to need a way to track its progress in terms of how well it is classifying these images. To do this, we will define what is called a cost function. This function, rather than telling us how well the Network is doing, tells us how badly the Network is doing.

The cost function that we will be using is known as the “cross-entropy” function. It’s a cool sounding name, and may be a little intimidating, but all you need to know is that it helps us calculate how badly our Neural Network is doing.

Now, the cost function will give us a single number that measures how badly the Neural Network is doing. It is not a percentage: it is never negative, but it can be larger than 1. The higher the number, the worse the Neural Network is doing, and the lower it is, the better it is doing. Our goal will be to minimize it, meaning we want the cost to be as low as possible.

In order to implement this cost function, we will need something to check the Network’s output against in order to measure its error. This is where the labels of the data set come in. We will first define a placeholder for these labels.

labels = tf.placeholder(tf.float32, [None, 10])
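The shape [None, 10] reflects that each label is stored as a “one-hot” list of 10 values rather than a single digit (this tutorial loads the data with one_hot=True). Here is a small sketch of that encoding in plain Python; the helper name one_hot is hypothetical and is not the loader’s own code.

```python
def one_hot(digit, num_classes=10):
    """Return a list of zeros with a single 1.0 at position `digit`."""
    if not 0 <= digit < num_classes:
        raise ValueError("digit out of range")
    encoded = [0.0] * num_classes
    encoded[digit] = 1.0
    return encoded

print(one_hot(3))  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

Storing the label this way gives it the same shape as the Network’s 10 outputs, so the two can be compared position by position.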

Then, once we have the placeholder for our labels, we can implement our cost function.

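logits = tf.matmul(input_layer, input_weights) + input_biases  # raw scores; the cost function applies softmax itself
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits))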
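If you are curious what cross-entropy actually calculates, here is a minimal sketch for a single image in plain Python. The function name cross_entropy and the three-class example values are made up for illustration; Tensorflow’s own implementation works on whole batches and is more careful numerically.

```python
import math

def cross_entropy(label, prediction):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    # Only the entry where the label is 1 contributes to the sum.
    return -sum(l * math.log(p) for l, p in zip(label, prediction) if l > 0)

label = [0.0, 0.0, 1.0]            # the correct class is index 2
good_guess = [0.05, 0.05, 0.90]    # confident and correct
bad_guess = [0.70, 0.20, 0.10]     # confident and wrong

print(cross_entropy(label, good_guess))  # small cost
print(cross_entropy(label, bad_guess))   # larger cost
```

The cost for the bad guess comes out above 2, so the value is a penalty score rather than a fraction between 0 and 1.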

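To minimize our cost function we use the Gradient Descent Algorithm. This algorithm is a pretty in-depth and complicated mathematical process, so I won’t be covering it in this post, but be sure to look out for it in a future post! For now, all you need to know is that the 0.5 in the line below is the learning rate, which controls how big each adjustment is.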

train = tf.train.GradientDescentOptimizer(0.5).minimize(cost)
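Without going into the math, the basic idea of Gradient Descent can be sketched on a toy problem. The function below is a made-up example, not the Network’s cost: it minimizes f(x) = (x - 3)^2 by repeatedly taking a small step downhill, with the step size set by a learning rate.

```python
def gradient_descent(start, learning_rate, steps):
    """Minimize f(x) = (x - 3)^2 by repeatedly stepping against its slope."""
    x = start
    for _ in range(steps):
        slope = 2 * (x - 3)          # derivative of (x - 3)^2
        x = x - learning_rate * slope
    return x

x_min = gradient_descent(start=0.0, learning_rate=0.1, steps=100)
print(round(x_min, 4))  # 3.0
```

Tensorflow’s optimizer does the same kind of stepping, except that it adjusts every weight and bias at once and works out the slopes for us.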

We have finished constructing the Neural Network, and it is time to begin training it. First, we need the data to train on. Add the following lines to the top of your code.

from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

This will import the MNIST data set, but in order to train the Network we need to feed it into the training algorithm. First, we have to create a Session in which our Computation Graph will execute. Then, we initialize all of our Tensorflow variables (“tf.Variable”) inside of that Session.

session = tf.Session()
session.run(tf.global_variables_initializer())

Now, we can begin feeding the data into the Network’s training algorithm. We will use a process known as batching to feed the data. This process splits the data into separate “batches” and feeds them into the Network one at a time using a for loop. Each batch will contain 100 different instances of data (“batch_instances”) and their corresponding labels (“batch_labels”).

for _ in range(1000):
    batch_instances, batch_labels = mnist.train.next_batch(100)
    session.run(train, feed_dict={input_layer: batch_instances, labels: batch_labels})
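To picture what batching means, here is a simplified sketch in plain Python. The generator name batches and the stand-in data are hypothetical, and this does not attempt to mimic the internals of mnist.train.next_batch; it only shows the idea of handing over the data in fixed-size chunks.

```python
def batches(instances, labels, batch_size):
    """Yield consecutive (instances, labels) slices of at most batch_size items."""
    for start in range(0, len(instances), batch_size):
        end = start + batch_size
        yield instances[start:end], labels[start:end]

# Made-up stand-in data: 250 "images" and their labels.
instances = list(range(250))
labels = [i % 10 for i in instances]

sizes = [len(batch) for batch, _ in batches(instances, labels, 100)]
print(sizes)  # [100, 100, 50]
```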

Our Neural Network is now fully trained and ready to classify other handwritten digits!

Testing

The Network’s output will look something like this:

[0.01, 0.02, 0.05, 0.69, 0.03, 0.1, 0.01, 0.02, 0.04, 0.03]

Each one of the elements in this array corresponds to one of the ten digits, and because the output comes from softmax, the values add up to 1. For example, index 1 (0.02) corresponds to the prediction that the instance of data has a 2% chance of being a 1, and at index 5 there is a 10% chance of the data being a 5. The index holding the highest of these values is what our Model thinks the instance of data is. In the case of this output, the highest value (0.69) is at index 3, so the model predicted the data to be representative of a 3.
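Picking the most likely digit out of an output like this can be sketched in a couple of lines of plain Python. The helper name argmax is hypothetical here (it mirrors what tf.argmax does for one row), and the output values are made up.

```python
def argmax(values):
    """Return the index of the largest value (the first one if there is a tie)."""
    return max(range(len(values)), key=lambda i: values[i])

# Made-up output for one image; the ten values sum to 1.
output = [0.01, 0.02, 0.05, 0.69, 0.03, 0.10, 0.01, 0.02, 0.04, 0.03]
print(argmax(output))  # 3
```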

We will begin by defining a variable that records whether each of our Neural Network’s predictions is correct.

nets_ans = tf.equal(tf.argmax(output, 1), tf.argmax(labels, 1))

This bit of code compares the index of the highest value in the Network’s output with the index of the highest value in the label. The result is a Boolean value for each image: if the model’s prediction is correct, the result is true, otherwise it is false.

Next, we need to find the accuracy of the Neural Network.

accuracy = tf.reduce_mean(tf.cast(nets_ans, tf.float32))
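The cast-then-average trick in that line can be sketched in plain Python as follows. The function name accuracy and the five example predictions are made up for illustration.

```python
def accuracy(predicted_digits, true_digits):
    """Fraction of predictions that match the true labels."""
    correct = [p == t for p, t in zip(predicted_digits, true_digits)]
    # True counts as 1.0 and False as 0.0, so the mean is the hit rate.
    return sum(1.0 if c else 0.0 for c in correct) / len(correct)

predicted = [7, 2, 1, 0, 4]
actual = [7, 2, 1, 0, 9]
print(accuracy(predicted, actual))  # 0.8
```

Four of the five made-up predictions match, so the average of the 1s and 0s is 0.8, or 80%.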

Finally, we output the accuracy by executing the Computation Graph with the test portion of the data set.

print(session.run(accuracy, feed_dict={input_layer: mnist.test.images, labels: mnist.test.labels}))

This rough model should produce an accuracy of about 92% if everything is done correctly.

Understanding

I encourage all of you to write this code yourself, line-by-line to get a deeper understanding of what’s really going on. If you’d like more information covering some of the code, visit Google’s tutorial on the MNIST data set.

Play around with Tensorboard with this Neural Network, and see if you can get a deeper intuition and understanding by seeing the Computation Graph visually. If you’re not sure how to use Tensorboard, follow the tutorial I mentioned in my article on an “Introduction to Tensorflow with Python.”

Also, if you’re interested in the theory behind Neural Networks, make sure to check out my article covering how an Artificial Neuron works. Here is my version of this code.

Finally, I’d love to hear your thoughts on this topic. Whether you liked what I said or you didn’t, feel free to leave a comment to let me know how I’m doing.

Until next time my friends, “May the Flow be with you…”
