You can find the tutorial here: https://www.tensorflow.org/versions/0.6.0/tutorials/mnist/pros/index.html
Now you are not a beginner 😀
Since you now know what a CNN is, lol, that’s amazing! Let’s raise our accuracy by applying a “professional” CNN. Well, it’s basic, but it still works!
It’s said to reach approximately 99.2%; here’s my result after training for 1000 steps (too few, but it still reaches 96.6%!):
View the code here, or:
# coding: utf-8

# In[1]:

import tensorflow.examples.tutorials.mnist.input_data as input_data
import tensorflow as tf
import os

# In[2]:

# import dataset
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# In[3]:

# --------------------------Init--------------------------
# build a softmax regression model
# input images (x), target output classes (y_)
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])

# init weights and biases (all zeros first)
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
init = tf.initialize_all_variables()

# define the softmax model:
# first multiply x and W, then add the b vector; apply softmax to get probabilities
y = tf.nn.softmax(tf.matmul(x, W) + b)

# launch the model in a Session, do NOT run the init operation here
sess = tf.Session()

# --------------------------Training--------------------------
# -------Build a Multilayer Convolutional Network--------------
# init functions: weight and bias generation
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

# In[13]:

# convolution and pooling functions
def conv2d(x, W):
    return tf.nn.conv2d(input=x, filter=W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')

# In[18]:

# First Convolutional Layer
x_image = tf.reshape(x, [-1, 28, 28, 1])
sess = tf.InteractiveSession()
print x_image

W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# In[6]:

# Second Convolutional Layer
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# In[7]:

# Densely Connected Layer
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# In[8]:

# Readout layer
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
sess.run(tf.initialize_all_variables())

# In[12]:

# Start training: train for 1000 steps
for i in range(1000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(session=sess, feed_dict={
            x: batch[0], y_: batch[1], keep_prob: 1.0})
        print "step %d, training accuracy %g" % (i, train_accuracy)
    train_step.run(session=sess, feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

# In[11]:

print "test accuracy %g" % accuracy.eval(session=sess, feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0})
x is our original input; it’s an m * 784 2-D tensor, i.e. a matrix with one flattened 28 * 28 image per row.
The reshape call turns each row of x into a 28 * 28 * 1 image. The -1 does not mean 1-D; it tells TensorFlow to infer that dimension (the batch size) from the total number of elements. So we get a 4-D input of shape [batch, in_height, in_width, in_channels].
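If the -1 looks odd, here is a tiny NumPy sketch of the same reshape, outside the TensorFlow graph (the names batch and images are just for illustration):

import numpy as np

batch = np.zeros((50, 784))              # 50 flattened MNIST images
images = batch.reshape((-1, 28, 28, 1))  # -1 is inferred as the batch size, 50
print(images.shape)                      # (50, 28, 28, 1)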
The filter is the weight tensor (W_conv1 here), with shape [filter_height, filter_width, in_channels, output_channels]. Internally it is flattened into a 2-D matrix of shape [filter_height * filter_width * in_channels, output_channels]; for each patch, conv2d right-multiplies the flattened image patch vector by this filter matrix.
Usually, strides = [1, stride, stride, 1]. strides is a 1-D list of 4 ints: the stride of the sliding window for each dimension of the input. Here [1, 1, 1, 1] moves the window one pixel at a time.
Padding is either 'SAME' or 'VALID'. With 'SAME', the input is zero-padded so that (at stride 1) the output keeps the same height and width; with 'VALID' there is no padding and the output shrinks.
So conv2d here returns a tensor with the same spatial shape as the input (x); only the channel dimension changes.
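To make the “patch right-multiplies the filter matrix” description concrete, here is a rough NumPy sketch of what happens at a single output position of the first layer. This is only an illustration of the shapes, not TensorFlow’s actual implementation:

import numpy as np

patch = np.random.rand(5, 5, 1)          # one 5*5 image patch, 1 input channel
W = np.random.rand(5, 5, 1, 32)          # filter: 5*5, 1 in channel, 32 out channels

patch_vec = patch.reshape(1, 5 * 5 * 1)  # 1 x 25 image patch vector
W_mat = W.reshape(5 * 5 * 1, 32)         # 25 x 32 filter matrix
out = patch_vec.dot(W_mat)               # 1 x 32: one output pixel with 32 channels
print(out.shape)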
tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
               strides=[1, 2, 2, 1], padding='SAME')
ksize: a list of ints with length >= 4, the size of the window for each dimension of the input tensor. Here we use a 2 * 2 patch as our window size.
The strides are also 2 * 2, so the pooling windows do not overlap and each max-pool halves the height and width.
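Because the convolutions use stride 1 with 'SAME' padding, only the pooling layers change the spatial size. A quick sanity check of the arithmetic in plain Python (with 'SAME' padding the output size is ceil(size / stride)):

size = 28
size = (size + 1) // 2   # after the first max_pool_2x2: 14
print(size)
size = (size + 1) // 2   # after the second max_pool_2x2: 7
print(size)
# this is why the fully connected layer later expects 7 * 7 * 64 inputs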
Lines 66-74: First Convolutional Layer
The first three lines set up the input shape, the weights and the biases.
Line 72: tf.nn.relu() applies the ReLU activation, which is max(feature, 0). It plays the same role as an activation like sigmoid (which squashes values into (0, 1)), but instead of mapping to probabilities it simply sets negative values to zero. Its output then goes into the pooling function.
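In plain NumPy, ReLU is just this (an illustration only; the graph uses tf.nn.relu):

import numpy as np

def relu(feature):
    # zero out the negatives, keep everything else unchanged
    return np.maximum(feature, 0)

print(relu(np.array([-2.0, -0.5, 0.0, 3.0])))  # [ 0.  0.  0.  3.]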
Lines 76-85: Second Convolutional Layer
This layer computes 64 features for each 5 * 5 patch.
Same structure as the first layer; only the dimensions change.
Lines 87-99: Fully (Densely) Connected Layer
Create a layer of 1024 neurons (the number can be anything), then reshape the pooled output from the second convolutional layer into a flat vector so it fits a standard NN. Apply a linear model (multiply by the weights and add the biases), then pass the result through ReLU.
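A quick shape check of this step in NumPy (the zeros are placeholders; only the shapes matter here):

import numpy as np

h_pool2 = np.zeros((50, 7, 7, 64))        # a batch of 50 pooled feature maps
h_flat = h_pool2.reshape(-1, 7 * 7 * 64)  # 50 x 3136, ready for a standard NN
W_fc1 = np.zeros((7 * 7 * 64, 1024))      # 3136 x 1024 weight matrix
h_fc1 = np.maximum(h_flat.dot(W_fc1), 0)  # linear model followed by ReLU
print(h_fc1.shape)                        # (50, 1024)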
Dropout
Dropout reduces overfitting. Think of a fully connected net: if we keep every connection between one layer and the next, the model can end up fitting noise in the training data. So one way to prevent this extra fitting is to randomly drop some of the neurons during training.
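Conceptually, dropout keeps each activation with probability keep_prob, scales the survivors by 1/keep_prob so the expected value stays the same, and at test time uses keep_prob = 1.0 so nothing is dropped. A rough NumPy sketch (not TensorFlow’s exact implementation):

import numpy as np

def dropout(activations, keep_prob):
    mask = np.random.rand(*activations.shape) < keep_prob
    return activations * mask / keep_prob

a = np.ones((2, 4))
print(dropout(a, 0.5))  # roughly half the entries become 0, the rest become 2
print(dropout(a, 1.0))  # unchanged: this is what evaluation uses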
Train
Different from the website’s code, session=sess is passed to the eval() and run() calls. Otherwise you will hit this error:
ValueError: Cannot evaluate tensor using eval(): No default session is registered. Use `with sess.as_default()` or pass an explicit session to eval(session=sess)
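Either of these two forms works (a sketch that reuses the tensors and session defined in the code above):

# 1) pass the session explicitly
train_accuracy = accuracy.eval(session=sess,
                               feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})

# 2) or register sess as the default session first
with sess.as_default():
    train_accuracy = accuracy.eval(
        feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})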
Run the code with patience, good luck!
Framework
C: convolutional layers; S: subsampling layers