Margin-infused relaxed algorithm


Margin-infused relaxed algorithm (MIRA)[1] is an online machine learning algorithm for multiclass classification problems. It learns a set of parameters (a vector or matrix) by processing the training examples one at a time, updating the parameters after each example so that the example is classified correctly with a margin over every incorrect classification at least as large as that classification's loss.[2] Each update changes the parameters as little as possible.

A two-class version called binary MIRA[1] simplifies the algorithm by not requiring the solution of a quadratic programming problem (see below). When used in a one-vs-all configuration, binary MIRA can be extended to a multiclass learner that approximates full MIRA, but may be faster to train.
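
As an illustration, the following Python sketch shows what a single binary MIRA step might look like for labels y ∈ {−1, +1}. The hinge-style loss and the clipping of the step size to [0, 1] are assumptions of this sketch, not details fixed by the article; treat it as a minimal reading of the closed-form update, not the canonical implementation.

import numpy as np

def binary_mira_update(w, x, y):
    # One binary MIRA step for a label y in {-1, +1}: make the smallest
    # change to w that classifies (x, y) with a margin. For a single
    # example this has a closed form, so no quadratic program is needed.
    loss = max(0.0, 1.0 - y * np.dot(w, x))  # hinge-style loss (assumed)
    if loss == 0.0:
        return w                             # correct with margin: no change
    tau = min(1.0, loss / np.dot(x, x))      # clipped step size (assumed)
    return w + tau * y * x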

The flow of the algorithm[3][4] is as follows:

Algorithm MIRA
 Input: Training examples T = {x_i, y_i}
 Output: Set of parameters w

 i ← 0, w^(0) ← 0
 for n ← 1 to N
  for t ← 1 to |T|
   w^(i+1) ← update w^(i) according to {x_t, y_t}
   i ← i + 1
  end for
 end for
 return ( Σ_{j=1}^{N×|T|} w^(j) ) / ( N × |T| )
  • "←" denotes assignment. For instance, "largest ← item" means that the value of largest changes to the value of item.
  • "return" terminates the algorithm and outputs the following value.

The update step is then formalized as a quadratic programming problem:[2] find min ‖w^(i+1) − w^(i)‖ subject to score(x_t, y_t) − score(x_t, y′) ≥ L(y_t, y′) for all y′; i.e., the score of the correct output y_t must exceed the score of every other possible output y′ by at least the loss (number of errors) of y′ relative to y_t.
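
When the constraint set is restricted to the single highest-scoring incorrect output, as is common in the large-structure applications cited above, the quadratic program has a closed-form solution. The sketch below assumes a linear model with a joint feature map phi(x, y) and a user-supplied loss function L; both names are illustrative, not part of the original formulation.

import numpy as np

def mira_1best_update(w, x, y_true, labels, phi, L):
    # Solve min ||w' - w||^2 subject to
    #   score(x, y_true) - score(x, y_hat) >= L(y_true, y_hat)
    # for the single highest-scoring incorrect output y_hat.
    scores = {y: np.dot(w, phi(x, y)) for y in labels}
    y_hat = max((y for y in labels if y != y_true), key=lambda y: scores[y])
    violation = L(y_true, y_hat) - (scores[y_true] - scores[y_hat])
    if violation <= 0.0:
        return w                            # constraint already satisfied
    delta = phi(x, y_true) - phi(x, y_hat)
    tau = violation / np.dot(delta, delta)  # Lagrange multiplier of the QP
    return w + tau * delta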

References

  1. ^ a b Crammer, Koby; Singer, Yoram (2003). "Ultraconservative Online Algorithms for Multiclass Problems". Journal of Machine Learning Research. 3: 951–991.
  2. ^ a b McDonald, Ryan; Crammer, Koby; Pereira, Fernando (2005). "Online Large-Margin Training of Dependency Parsers" (PDF). Proceedings of the 43rd Annual Meeting of the ACL. Association for Computational Linguistics. pp. 91–98.
  3. ^ Watanabe, T.; et al. (2007). "Online Large-Margin Training for Statistical Machine Translation". Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. pp. 764–773.
  4. ^ Bohnet, B. (2009). "Efficient Parsing of Syntactic and Semantic Dependency Structures". Proceedings of the Conference on Computational Natural Language Learning (CoNLL). Boulder. pp. 67–72.