
TensorFlow 05: Understanding Basic Usage

I only recently realized that I had skipped some basics of TF: when I started learning, I went straight to the MNIST example. I also asked a few people whether they knew of any nice tutorials for TF or for DL. It is not like other subjects, where you can easily find good courses such as Andrew’s ML. I did find a few resources (listed in the references section), though I have not gone through every one of them. If you are interested, check them out yourself, and feel free to share your own recommendations.

This post is about the basics of TF: what the “graphs” are and how to start a session, for example. Everything here can be found in the white paper and the official tutorial. The scripts below come from the official website, and you can run them directly.

Computation Graph and Session

The name TensorFlow already gives you the main idea. A tensor can be a high-dimensional array. In practice, most of what we do with tensors looks like matrix multiplication (one reason GPUs are popular here: as accelerators, they do this faster), and you can imagine the tensors flowing through the whole computation.
In TF, the computations are described as a directed graph. [*]
[Figure: an example computation graph]
As you can see, there are nodes and edges. Each node (like “MatMul”) can have zero or more input tensors and zero or more output tensors. In this graph, we simply compute xW + b and then apply a ReLU function, that is, max(0, x). C is the cost.
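To make the graph concrete, here is a minimal sketch of the same forward computation in plain Python, without TensorFlow. The values of x, W, and b are made up by me, and the helper names matmul, add_bias, and relu are only for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def add_bias(m, bias):
    """Add one bias value to each column of the matrix."""
    return [[v + bias[j] for j, v in enumerate(row)] for row in m]

def relu(m):
    """Apply max(0, v) to every element."""
    return [[max(0.0, v) for v in row] for row in m]

# Made-up example values (assumptions, not from the white paper).
x = [[1.0, -2.0]]        # 1x2 input
W = [[0.5, -1.0],        # 2x2 weights
     [0.25, 2.0]]
b = [0.1, 0.5]           # one bias per output column

pre_activation = add_bias(matmul(x, W), b)   # xW + b
activation = relu(pre_activation)            # ReLU(xW + b)
print(pre_activation)
print(activation)
```

The negative entry of xW + b is clipped to zero by the ReLU, while the positive entry passes through unchanged.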

To interact with TF, we need to create a session and launch the graph in it. We will become good friends with its run function.
Here is the example from the official website.

import tensorflow as tf

# define two matrices
matrix1 = tf.constant([[3., 3.]])
matrix2 = tf.constant([[2.],[2.]])
# do multiplication
product = tf.matmul(matrix1, matrix2)
# Let's get a new session
sess = tf.Session()

# run the graph, indicating that "product" is what we want to get
result = sess.run(product)

print(result)
# the output is : [[ 12.]]
sess.close()

Here matrix1 is the row (3, 3) and matrix2 is the column (2, 2), so their matrix product is 3*2 + 3*2 = 12. Up to and including the call to tf.matmul, no computation has started yet; we have only defined the “rule” for a computation. Then we create a session and call sess.run(product), and this is where the computation starts. The argument product is the tensor we want to fetch. We assign the fetched tensor to result, and finally we can print its values. Remember to close the session.
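The define-first, run-later idea can be imitated in a few lines of plain Python. This is only a toy sketch of mine, not how TensorFlow is implemented; the Node class and its run method are hypothetical names.

```python
class Node:
    """A toy graph node: it stores an operation and its inputs but computes nothing yet."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

    def run(self):
        """Evaluate this node by first evaluating its inputs (similar in spirit to a fetch)."""
        if self.op == "const":
            return self.value
        a, b = (node.run() for node in self.inputs)
        if self.op == "matmul":
            return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
                     for j in range(len(b[0]))]
                    for i in range(len(a))]
        raise ValueError("unknown op: " + self.op)

# Building the graph: only the "rule" is recorded here.
matrix1 = Node("const", value=[[3.0, 3.0]])
matrix2 = Node("const", value=[[2.0], [2.0]])
product = Node("matmul", inputs=(matrix1, matrix2))

# The multiplication happens only when we ask for the result.
result = product.run()
print(result)
```

Creating product does no arithmetic at all; the work is done inside run, which plays the role of sess.run in this sketch.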

Alternatively, we can use a with block, which closes the session for us automatically and releases its resources:

with tf.Session() as sess:
    result = sess.run([product])
    print(result)

If you have several devices, such as GPUs, you can put everything inside the with block and choose the device explicitly:


# pin the ops to a specific device
with tf.Session() as sess:
    with tf.device("/cpu:0"):
        matrix1 = tf.constant([[3., 3.]])
        matrix2 = tf.constant([[2.],[2.]])
        product = tf.matmul(matrix1, matrix2)
        result = sess.run([product])
        print(result)

# "/cpu:0": The CPU of your machine.
# "/gpu:0": The GPU of your machine, if you have one.
# "/gpu:1": The second GPU of your machine, etc

I do not think there is such a thing as “/cpu:1”. With the device function, you can easily assign parts of your computation graph to different devices. TF also supports using multiple GPUs:
[Figure: using multiple GPUs]

Feed, Fetch

To understand feed and fetch, let’s see this example from the white paper:
[Figure 6 from the white paper]

The computation graph can be huge, even like a net. An advantage of TF is that it can run just a part of the graph. In Figure 6, suppose you want the value of node f and you already know node a. We look at node f and find its parent, node c. Then we look at node c and find that its parents are nodes a and b. Since we already have node a, only node b is missing. So we only need to feed node b, and then we can fetch node f. Looking back, we see that this process never touches node d or e, so the tensors will not flow there.
Simply put, the fetch node is the node whose output you want to get, while the feed node is the node whose values you need to pass in.
We can pass both the fetch nodes and the feed nodes to sess.run(), as we will see in a moment.
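The walk from f back to its parents can be sketched in plain Python. This is a toy illustration, not TensorFlow code: the run function and the graph dictionary are hypothetical, and the arithmetic in each node as well as the inputs I gave to d and e are made up, since the text only tells us that d and e are not needed for f.

```python
# Each node maps to (function, names of its input nodes).
# The arithmetic and the edges of d and e are assumptions for illustration.
graph = {
    "c": (lambda a, b: a + b, ["a", "b"]),
    "d": (lambda b: b * 10, ["b"]),
    "e": (lambda d: d - 1, ["d"]),
    "f": (lambda c: c * 2, ["c"]),
}

def run(fetch, feed_dict, executed):
    """Compute the node `fetch`, visiting only the nodes it depends on."""
    if fetch in feed_dict:               # fed values are used as they are
        return feed_dict[fetch]
    func, input_names = graph[fetch]
    args = [run(name, feed_dict, executed) for name in input_names]
    executed.append(fetch)               # record that this node really ran
    return func(*args)

executed = []
# a is the value we already know, b is the value we feed.
result = run("f", {"a": 1, "b": 2}, executed)
print(result)     # value of f
print(executed)   # nodes that were actually computed
```

Only c and f end up in the executed list; d and e are defined in the graph but never run, because f does not depend on them.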

Variables, Placeholders

A tensor can be treated as an n-d array or list. In TF, it can be kept in a variable or a placeholder.
The following piece of code uses a variable as a counter:

# Create a Variable, that will be initialized to the scalar value 0.
state = tf.Variable(0, name="counter")

# Create an Op to add one to `state`.

one = tf.constant(1)
new_value = tf.add(state, one)
update = tf.assign(state, new_value)

# Variables must be initialized by running an `init` Op after having
# launched the graph.  We first have to add the `init` Op to the graph.
init_op = tf.global_variables_initializer()

# Launch the graph and run the ops.
with tf.Session() as sess:
    # Run the 'init' op
    sess.run(init_op)
    # Print the initial value of 'state'
    print(sess.run(state))
    # Run the op that updates 'state' three times.
    for _ in range(3):
        sess.run(update)
    # Fetch two tensors at once: 'state' is now 3 and 'one' is 1.
    print(sess.run([state,one]))

Remember to initialize all variables before you start your computation. We can define the variables and constant tensors before launching a session. With sess.run(init_op), we start running our computation by passing in the init operation. With print(sess.run(state)), we print the value of state, so we pass in state as the tensor to fetch.
Another example shows how to use placeholders:

input1 = tf.placeholder(tf.float32)
input2 = tf.placeholder(tf.float32)
output = tf.multiply(input1, input2)  # named tf.mul before TensorFlow 1.0

with tf.Session() as sess:
  print(sess.run([output], feed_dict={input1:[7.], input2:[2.]}))

We define two input placeholders, input1 and input2, and multiply them to get output. A placeholder tells the program that we need a place for a tensor of type float. It simply holds a place for a tensor that is not known yet, which is why you need to feed it! In the last line, we use the feed_dict dictionary to assign values to the two input placeholders. The first parameter of the run function is the fetch node; here we want to get the value of output. Instead of fetching one tensor at a time, we can put all the wanted tensors into a list, for example, [output1,output2,…].
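To show why a placeholder must be fed, here is a small plain-Python imitation. It is a sketch under my own assumptions, not the TensorFlow API: Placeholder, multiply, and run are hypothetical stand-ins that I wrote for illustration.

```python
class Placeholder:
    """A named slot with no value of its own; the value arrives through feed_dict."""

    def __init__(self, name):
        self.name = name

def multiply(x, y):
    """Return a deferred op: a function that needs a feed_dict to produce a value."""
    def op(feed_dict):
        missing = [p.name for p in (x, y) if p not in feed_dict]
        if missing:
            raise ValueError("must feed a value for: " + ", ".join(missing))
        return [u * v for u, v in zip(feed_dict[x], feed_dict[y])]
    return op

def run(fetches, feed_dict):
    """Evaluate a list of ops against one feed_dict, like fetching several tensors."""
    return [op(feed_dict) for op in fetches]

input1 = Placeholder("input1")
input2 = Placeholder("input2")
output = multiply(input1, input2)

result = run([output], feed_dict={input1: [7.0], input2: [2.0]})
print(result)
```

If one of the placeholders is left out of feed_dict, this toy run raises a ValueError, which mirrors the idea that a placeholder has nothing to compute with until it is fed.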

My thoughts

I have never tried other frameworks, like Caffe or Torch, but I plan to stick with TF in my future study. I cannot yet feel the power of its “large-scale” ability, but it is a point I care about. With good scalability, training larger models will not be a problem.


References

[*] White paper.
Official website (tutorials).
TensorFlow tutorials using Jupyter Notebook.
11 Deep Learning Articles, Tutorials and Resources.

