I was working on my research with sklearn when I realized that choosing the right evaluation metric has always been a problem for me. If someone asks, "Does your model perform well?", the first thing that comes to my mind is "accuracy". But besides accuracy there are many other metrics, and the right choice depends on your problem.
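As a minimal sketch of "metrics besides accuracy", here is precision and recall computed by hand on hypothetical binary labels (the label lists are made up for illustration; in practice sklearn's `accuracy_score`, `precision_score`, and `recall_score` do the same work):

```python
# Hypothetical binary labels (1 = positive class).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# Count true positives, false positives, false negatives.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0

print(accuracy, precision, recall)  # 0.8 1.0 ~0.33
```

Note how accuracy looks decent (0.8) even though the model misses two of the three positives: on imbalanced data, accuracy alone can be misleading.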

# Category Archives: Statistics

## NLP 01: Language Modeling Problems

Lecture notes from Natural Language Processing (by Michael Collins)

## PGM 02: Lots of Markov Family members: MC, PMN, CRF…

Markov Chain. A Markov process is a kind of random process. The main idea: given the current state of the system, its future states do not depend on its past states.
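The Markov property above can be sketched with a tiny two-state chain (the states and transition probabilities here are made up for illustration):

```python
import random

# Hypothetical transition matrix: row = current state, entries = next-state probabilities.
P = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def step(state, rng):
    """Sample the next state using only the current state's row of P."""
    r = rng.random()
    cumulative = 0.0
    for nxt, prob in P[state].items():
        cumulative += prob
        if r < cumulative:
            return nxt
    return nxt  # guard against floating-point rounding

rng = random.Random(0)
state = "sunny"
path = [state]
for _ in range(10):
    state = step(state, rng)  # depends only on `state`, never on `path`
    path.append(state)
print(path)
```

The key point is in the loop: `step` receives only the current state, so the history stored in `path` has no influence on the next draw.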

## Deep Learning 10: Sequence Modeling

Learning notes for Lecture 7, "Modeling sequences: a brief overview", by Geoffrey Hinton [1]

## Deep Learning 09: Small Tricks (2)

Let’s try some ways to speed up our learning!

## Lucky or not: Monte Carlo Method

AlphaGo! When you play any game, you probably have strategies or experience. But you cannot deny that sometimes you need luck, which a data scientist would call a “random choice”. The Monte Carlo method provides only an approximate optimum, thus giving you the luck to win a game.
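The classic sketch of the Monte Carlo idea is estimating π by random sampling: throw random points into the unit square and count how many land inside the quarter circle. The answer is only approximate, which is exactly the "luck" involved:

```python
import random

def estimate_pi(n_samples, seed=0):
    """Monte Carlo estimate of pi from uniform random points in the unit square."""
    rng = random.Random(seed)
    inside = sum(
        1
        for _ in range(n_samples)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    # Fraction inside the quarter circle approximates pi / 4.
    return 4.0 * inside / n_samples

print(estimate_pi(100_000))  # close to 3.14159, but not exact
```

More samples shrink the error (roughly as 1/sqrt(n)), but the result is never deterministic unless you fix the seed.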

## Gibbs Sampling: about Parallelization

About BN. A Belief Network, also called a directed acyclic graphical model (DAG). When the BN is huge: exact inference (variable elimination) vs. stochastic inference (MCMC).

## Loopy BP: an easy implementation on Pregel Model

Pregel: message passing. Focus on the message-passing process, regardless of what each vertex computes. Steps [1] (this part is referenced from a blog):
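The Pregel message-passing loop can be sketched in a few lines (this is a toy superstep loop, not the real Pregel API; the graph and values are made up). Here every active vertex sends its value to its neighbors each superstep, and the computation halts when no vertex changes, leaving all vertices with the global maximum:

```python
edges = {1: [2], 2: [1, 3], 3: [2]}  # hypothetical undirected graph
value = {1: 3, 2: 6, 3: 1}           # hypothetical initial vertex values
active = set(edges)                  # all vertices start active

while active:  # one iteration of this loop = one superstep
    # Phase 1: every active vertex sends messages along its edges.
    outbox = {v: [] for v in edges}
    for v in active:
        for nbr in edges[v]:
            outbox[nbr].append(value[v])
    # Phase 2: every vertex processes its inbox; unchanged vertices go inactive.
    active = set()
    for v, msgs in outbox.items():
        best = max(msgs, default=value[v])
        if best > value[v]:
            value[v] = best
            active.add(v)  # changed, so stay active for the next superstep

print(value)  # every vertex ends with the global max, 6
```

The same skeleton carries loopy belief propagation: replace the values with belief vectors and the max with the BP message-update rule, while the superstep structure stays identical.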

## MCMC: Gibbs Sampling

In importance sampling, all the samples are independent. But in MCMC, successive samples are dependent.
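A tiny Gibbs sampler makes that dependence visible (the target here, a bivariate normal with correlation `rho`, is chosen just for illustration). Each draw conditions on the most recent value of the other variable, so consecutive samples are correlated, unlike the independent draws of importance sampling:

```python
import random

def gibbs_bivariate_normal(rho, n_samples, seed=0):
    """Gibbs sampling for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    sd = (1.0 - rho * rho) ** 0.5
    x, y, samples = 0.0, 0.0, []
    for _ in range(n_samples):
        x = rng.gauss(rho * y, sd)  # x | y ~ N(rho * y, 1 - rho^2)
        y = rng.gauss(rho * x, sd)  # y | x ~ N(rho * x, 1 - rho^2)
        samples.append((x, y))
    return samples

chain = gibbs_bivariate_normal(rho=0.9, n_samples=20_000)
mean_x = sum(x for x, _ in chain) / len(chain)
print(round(mean_x, 2))  # should be near 0.0, the true marginal mean
```

With `rho=0.9` the chain moves slowly, so although the long-run averages are correct, far fewer than 20,000 "effective" independent samples are obtained.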