# TensorFlow 08: save and restore a subset of variables

TensorFlow provides save and restore functions for us to save and re-use model parameters. If you have a trained VGG model, for example, it is helpful to restore the first few layers and then reuse them in your own network. This raises a question: how do we restore only a subset of the parameters? You can always check the TF official documentation here. In this post, I will take some code from the documentation and add some practical points.
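As a quick preview, the key is to pass an explicit var_list when constructing tf.train.Saver. Here is a minimal sketch (assuming TF 1.x; the scope name vgg_16/conv1 and the checkpoint path are placeholder assumptions, not taken from the official document):

```
import tensorflow as tf

# Assumes the VGG graph has already been constructed in this process.
# Collect only the variables we want to restore, e.g. the first conv layers.
restore_vars = [v for v in tf.global_variables()
                if v.name.startswith('vgg_16/conv1')]
saver = tf.train.Saver(var_list=restore_vars)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initialize everything first
    saver.restore(sess, 'vgg_16.ckpt')           # then overwrite just the subset
```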

# Working with ROUGE 1.5.5 Evaluation Metric in Python

If you use the ROUGE evaluation metric for text summarization or machine translation systems, you must have noticed that there are many versions of it. So how do you get it to work with your own systems in Python? What packages are helpful? In this post, I will give some ideas from an engineering point of view (which means I am not going to introduce what ROUGE is). I also suffered from a few issues and finally got them solved. My methods might not be the best ways, but they worked.
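For example, the pyrouge package wraps the original ROUGE-1.5.5 Perl script; a minimal sketch of its documented usage looks like this (the directory paths and filename patterns are placeholders, and ROUGE-1.5.5 itself must be installed and configured separately):

```
from pyrouge import Rouge155  # pip install pyrouge

r = Rouge155()
r.system_dir = 'path/to/system_summaries'  # summaries produced by your system
r.model_dir = 'path/to/model_summaries'    # the reference (gold) summaries
r.system_filename_pattern = r'some_name.(\d+).txt'
r.model_filename_pattern = r'some_name.[A-Z].#ID#.txt'

output = r.convert_and_evaluate()  # runs the Perl script under the hood
print(output)                      # the full ROUGE report as text
scores = r.output_to_dict(output)  # the same scores as a Python dict
```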

For a brief introduction to Word2vec, please check this post. In this post, we try to load a pre-trained model of word vectors, which is a huge file containing all the word vectors trained on a large corpus.

Download here. I downloaded the GloVe one (glove.6B.50d): the vocabulary size is 400,000 and the dimension is 50. It is one of the smaller models, trained on a "global" corpus (Wikipedia plus Gigaword). There are models trained on Twitter as well on that page.

The model file has one entry per line, formatted as the word followed by its vector components, all separated by single spaces. Note that not only words but also punctuation marks such as commas are included in the model.
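For illustration, each line looks roughly like this (the numbers below are made up to show the format; a real line in the 50-d model has 50 components):

```
word 0.418 0.24968 -0.41242 ... (50 numbers in total)
```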

There is an easy way to load the model: just read the vector file line by line. Here I wrap the reading in a small helper function that separates the words from the vectors, because the words will later be fed into the vocabulary.

```
import numpy as np

def load_glove(filename):
    """Read a GloVe text file, returning the word list and the vector list."""
    vocab = []
    embd = []
    with open(filename, 'r') as file:
        for line in file:
            row = line.strip().split(' ')
            vocab.append(row[0])   # the word itself
            embd.append(row[1:])   # its vector components (still strings here)
    return vocab, embd

filename = 'glove.6B.50d.txt'
vocab, embd = load_glove(filename)
vocab_size = len(vocab)
embedding_dim = len(embd[0])
embedding = np.asarray(embd, dtype=np.float32)  # cast the strings to floats
```

The vocab is a list of words and punctuation marks. The embedding is the huge 2-D array holding all the word vectors. We set the embedding dimension to the number of columns of the embedding array.

### Embedding Layer

After loading the vectors, we need to use them to initialize the weight matrix W of the embedding layer in your network.

```
import tensorflow as tf

W = tf.Variable(tf.constant(0.0, shape=[vocab_size, embedding_dim]),
                trainable=False, name="W")
embedding_placeholder = tf.placeholder(tf.float32, [vocab_size, embedding_dim])
embedding_init = W.assign(embedding_placeholder)
```

Here W is first built as a Variable, initialized with constant zeros. Be careful with the shape [vocab_size, embedding_dim], which we only know after loading the model. If trainable is set to False, W will not be updated during training; change it to True if you want the embeddings to be fine-tuned. Then an embedding_placeholder is set up to receive the real values (fed through the feed_dict in sess.run()), and finally W is assigned by the embedding_init op.

After creating a session and initializing the global variables, run the embedding_init operation, feeding in the 2-D array embedding.

```
sess.run(embedding_init, feed_dict={embedding_placeholder: embedding})
```
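Putting the two steps together, a minimal sketch of the session code:

```
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # W starts out as zeros
    sess.run(embedding_init,
             feed_dict={embedding_placeholder: embedding})  # W now holds GloVe
```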

### Vocabulary

Suppose you have raw documents. The first thing you need to do is build a vocabulary, which maps each word to an id. TensorFlow then uses the following call to look up embeddings:

```
tf.nn.embedding_lookup(W, input_x)
```

where W is the huge embedding matrix and input_x is a tensor of ids. In other words, it looks up embeddings by the given ids.
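For example, a small sketch (input_x here is assumed to be a placeholder of word ids, padded to max_document_length):

```
# a batch of id sequences, each padded to max_document_length
input_x = tf.placeholder(tf.int32, [None, max_document_length])
# every id is replaced by the corresponding row of W, so the result
# has shape [batch_size, max_document_length, embedding_dim]
embedded_words = tf.nn.embedding_lookup(W, input_x)
```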

So we should use the pre-trained model's word list when we build the vocabulary, i.e. the word-id map.

```
from tensorflow.contrib import learn

# init the vocab processor; max_document_length must be chosen beforehand,
# e.g. the length of the longest sentence in your data
vocab_processor = learn.preprocessing.VocabularyProcessor(max_document_length)
# fit the vocab from GloVe (each word is treated as a one-word document)
pretrain = vocab_processor.fit(vocab)
# transform raw inputs into padded id sequences
x = np.array(list(vocab_processor.transform(your_raw_input)))
```

First initialize the vocab processor by passing in a max_document_length; by default, shorter sentences are padded with zeros. Then we fit the processor on the vocab list to build the word-id map. Finally, use the processor to transform the real raw documents.

Now you are ready to train your own network with pre-trained word vectors!

# Understanding SVM(2)

A brief introduction is here. (I wrote a blog post about it last year, but I do not think it was detailed enough.)

This blog post contains my learning notes from this video (English slides, but a Chinese speaker). First comes a quick introduction to SVM, then the magic of how to solve the max/min optimization. You will also find kernel SVM. Continue reading “Understanding SVM(2)”

# NLP 05: From Word2vec to Doc2vec: a simple example with Gensim

#### Introduction

First introduced by Mikolov 1 in 2013, word2vec learns distributed representations (word embeddings) with a neural network. It is based on the distributional hypothesis: words that occur in similar contexts (neighboring words) tend to have similar meanings. There are two models: cbow (continuous bag of words), where we use a bag of context words to predict a target word, and skip-gram, where we use one word to predict its neighbors. For more, although not highly recommended, have a look at the TensorFlow tutorial here. Continue reading “NLP 05: From Word2vec to Doc2vec: a simple example with Gensim”
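To get a taste of the Gensim API before diving in, here is a minimal sketch on a toy corpus (it assumes gensim 3.x, where the dimension parameter is called size; in gensim 4.x it is vector_size):

```
from gensim.models import Word2Vec

# toy corpus: a list of tokenized sentences
sentences = [['the', 'cat', 'sat', 'on', 'the', 'mat'],
             ['the', 'dog', 'barked', 'at', 'the', 'cat']]

# sg=0 selects cbow (a bag of context words predicts the target word);
# sg=1 selects skip-gram (the target word predicts its neighbors)
model = Word2Vec(sentences, size=50, window=2, min_count=1, sg=0)

print(model.wv['cat'])  # the learned 50-dimensional vector for 'cat'
```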

# NLP 04: Log-Linear Models for Tagging Task (Python)

We will focus on POS tagging in this blog.

##### Notations

While an HMM gives us a joint probability over tags and words, $p({t}_{[1:n]},{w}_{[1:n]})$, a log-linear model directly models the conditional probability of tags given words. Tags t and words w are in one-to-one correspondence, so the two sequences share the same length.
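For reference, the standard trigram HMM factorization of this joint probability (a textbook formulation, not taken from this post) is

$$p(t_{[1:n]}, w_{[1:n]}) = \prod_{i=1}^{n} q(t_i \mid t_{i-2}, t_{i-1}) \prod_{i=1}^{n} e(w_i \mid t_i),$$

where $q$ is the tag transition probability and $e$ is the word emission probability.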