Search Results

Documents authored by Matsubara, Edson


Document
On classification, ranking, and probability estimation

Authors: Peter Flach and Edson Matsubara

Published in: Dagstuhl Seminar Proceedings, Volume 7161, Probabilistic, Logical and Relational Learning - A Further Synthesis (2008)


Abstract
Given a binary classification task, a ranker is an algorithm that can sort a set of instances from highest to lowest expectation that the instance is positive. In contrast to a classifier, a ranker does not output class predictions – although it can be turned into a classifier with help of an additional procedure to split the ranked list into two. A straightforward way to compute rankings is to train a scoring classifier to assign numerical scores to instances, for example the predicted odds that an instance is positive. However, rankings can be computed without scores, as we demonstrate in this paper. We propose a lexicographic ranker, LexRank , whose rankings are derived not from scores, but from a simple ranking of attribute values obtained from the training data. Although various metrics can be used, we show that by using the odds ratio to rank the attribute values we obtain a ranker that is conceptually close to the naive Bayes classifier, in the sense that for every instance of LexRank there exists an instance of naive Bayes that achieves the same ranking. However, the reverse is not true, which means that LexRank is more biased than naive Bayes. We systematically develop the relationships and differences between classification, ranking, and probability estimation, which leads to a novel connection between the Brier score and ROC curves. Combining LexRank with isotonic regression, which derives probability estimates from the ROC convex hull, results in the lexicographic probability estimator LexProb.

Cite as

Peter Flach and Edson Matsubara. On classification, ranking, and probability estimation. In Probabilistic, Logical and Relational Learning - A Further Synthesis. Dagstuhl Seminar Proceedings, Volume 7161, pp. 1-10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)


Copy BibTex To Clipboard

@InProceedings{flach_et_al:DagSemProc.07161.8,
  author =	{Flach, Peter and Matsubara, Edson},
  title =	{{On classification, ranking, and probability estimation}},
  booktitle =	{Probabilistic, Logical and Relational Learning - A Further Synthesis},
  pages =	{1--10},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2008},
  volume =	{7161},
  editor =	{Luc de Raedt and Thomas Dietterich and Lise Getoor and Kristian Kersting and Stephen H. Muggleton},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.07161.8},
  URN =		{urn:nbn:de:0030-drops-13828},
  doi =		{10.4230/DagSemProc.07161.8},
  annote =	{Keywords: Ranking, probability estimation, ROC analysis, calibration}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail