Please check my notes for an introduction to transfer learning!
A brief introduction to unsupervised transfer learning methods.
The presentation focused on unsupervised transfer learning methods, introducing feature-based and model-based strategies and a few recent papers from ICML and ACL.
Comments are welcome!
TensorFlow provides save and restore functions that let us save and reuse model parameters. If you have a trained VGG model, for example, it can be helpful to restore its first few layers and then apply them in your own network. This raises a question: how do we restore only a subset of the parameters? You can always check the official TF documentation here. In this post, I will take some code from the documentation and add some practical points.
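As a rough illustration of restoring only a subset of parameters, here is a minimal sketch. It assumes the graph-mode `tf.train.Saver` API (reached through `tensorflow.compat.v1` on newer installs), and the two variable scopes `conv1` and `fc` are hypothetical stand-ins for the layers of a real model such as VGG. It is not the code from the official documentation.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

tf.disable_eager_execution()


def build_graph():
    # Hypothetical two-"layer" model: conv1 is the part we want to reuse.
    with tf.variable_scope("conv1"):
        w1 = tf.get_variable("weights", shape=[3],
                             initializer=tf.zeros_initializer())
    with tf.variable_scope("fc"):
        w2 = tf.get_variable("weights", shape=[2],
                             initializer=tf.zeros_initializer())
    return w1, w2


ckpt_path = os.path.join(tempfile.mkdtemp(), "model.ckpt")

# Step 1: pretend to train (here we just assign values) and save everything.
tf.reset_default_graph()
w1, w2 = build_graph()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run([w1.assign([1.0, 2.0, 3.0]), w2.assign([4.0, 5.0])])
    tf.train.Saver().save(sess, ckpt_path)

# Step 2: in a fresh graph, restore only the variables under "conv1".
tf.reset_default_graph()
w1, w2 = build_graph()
subset = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope="conv1")
restorer = tf.train.Saver(var_list=subset)
with tf.Session() as sess:
    # Initialize everything first, then overwrite the restored subset.
    sess.run(tf.global_variables_initializer())
    restorer.restore(sess, ckpt_path)
    restored_conv1, fresh_fc = sess.run([w1, w2])
```

The key point is the `var_list` argument: a `Saver` built with a list of variables only reads those variables from the checkpoint, so the remaining layers keep their fresh initial values.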
In daily life, we often repeat something mentioned earlier in a dialogue, such as the name of a person or an organization: “Hi, my name is Pikachu”, “Hi, Pikachu,…” There is a high probability that the word “Pikachu” will not be in the vocabulary extracted from the training data. In the paper Incorporating Copying Mechanism in Sequence-to-Sequence Learning, the authors propose CopyNet, which brings a copying mechanism to seq2seq models with an encoder-decoder structure. Read my old post to learn the prerequisite knowledge.
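To make the idea concrete, here is a simplified sketch of how a copy mode and a generate mode can share one softmax, so that an out-of-vocabulary source word such as “Pikachu” can still be produced. The scores are made-up numbers and the function name `copy_generate_distribution` is hypothetical; in the real model both sets of scores come from learned networks, so treat this only as an illustration of the mixing step.

```python
import numpy as np


def copy_generate_distribution(gen_scores, vocab, copy_scores, source_tokens):
    """Combine generate-mode and copy-mode scores under a shared softmax."""
    all_scores = np.concatenate([gen_scores, copy_scores])
    exp = np.exp(all_scores - all_scores.max())  # subtract max for stability
    probs = exp / exp.sum()

    dist = {}
    # Probability mass from generating a vocabulary word.
    for word, p in zip(vocab, probs[:len(vocab)]):
        dist[word] = dist.get(word, 0.0) + float(p)
    # Probability mass from copying a source position; a word that can be
    # both generated and copied receives the sum of the two.
    for word, p in zip(source_tokens, probs[len(vocab):]):
        dist[word] = dist.get(word, 0.0) + float(p)
    return dist


vocab = ["hi", "my", "name", "is", "<unk>"]
source_tokens = ["hi", "my", "name", "is", "Pikachu"]

# Made-up scores: the copy mode strongly prefers the last source position.
gen_scores = np.array([1.0, 0.0, 0.0, 0.0, 0.5])
copy_scores = np.array([0.0, 0.0, 0.0, 0.0, 3.0])

dist = copy_generate_distribution(gen_scores, vocab, copy_scores, source_tokens)
best = max(dist, key=dist.get)
print(best)
```

Because both modes are normalized together, the decoder does not need a separate switch to decide between copying and generating; the word with the largest combined mass wins.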
People tend to focus on only one part of an image, say a person in a photo. Similarly, for a given sequence of words, we should pay attention to a few keywords instead of treating every word equally. Take “this is an apple” as an example: when you read it aloud, I am sure you will stress “apple” more than “is” or “an”, because you naturally pay attention to the word that carries the meaning of the sentence. In seq2seq models (check this post if you forget), we learn a weight for each word, and important words get higher weights.
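The weighting described above can be sketched in a few lines of NumPy. This assumes simple dot-product scoring followed by a softmax, which is only one of several scoring functions used in practice, and the vectors below are made-up toy values rather than learned hidden states.

```python
import numpy as np


def attention(query, keys):
    """Return softmax attention weights and the weighted sum of the keys."""
    scores = keys @ query                  # one score per word
    scores = scores - scores.max()         # subtract max for stability
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ keys               # weighted average of word vectors
    return weights, context


words = ["this", "is", "an", "apple"]
# Toy encoder states: "apple" is deliberately the most aligned with the query.
keys = np.array([[0.1, 0.0],
                 [0.0, 0.1],
                 [0.1, 0.1],
                 [1.0, 1.0]])
query = np.array([1.0, 1.0])

weights, context = attention(query, keys)
for word, weight in zip(words, weights):
    print(f"{word}: {weight:.3f}")
```

The weights sum to one, so the context vector is a weighted average of the word representations, dominated here by “apple”.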