When citing this document, please refer to the following:
DOI: 10.4230/OASIcs.GCB.2013.90
URN: urn:nbn:de:0030-drops-42340
URL: http://drops.dagstuhl.de/opus/volltexte/2013/4234/


Leha, Andreas ; Jung, Klaus ; Beißbarth, Tim

Utilization of ordinal response structures in classification with high-dimensional expression data

pdf-format:
p090-leha.pdf (0.6 MB)


Abstract

Molecular diagnosis or prediction of clinical treatment outcome based on high-throughput genomics data is a modern application of machine learning techniques to clinical problems. In practice, clinical parameters, such as patient health status or toxic reaction to therapy, are often measured on an ordinal scale (e.g. good, fair, poor). Commonly, the prediction of ordinal end-points is treated as a multi-class classification problem, disregarding the ordering information contained in the response. This may result in a loss of prediction accuracy. Classical approaches that model an ordinal response directly, such as the cumulative logit model, are typically not applicable to high-dimensional data. We present hierarchical twoing (hi2), a novel algorithm for classification of high-dimensional data into ordered categories. hi2 combines the power of well-understood binary classification with ordinal response prediction. A comparison of several approaches for ordinal classification on real-world data as well as simulated data shows that classification algorithms specifically designed to handle ordered categories fail to improve upon state-of-the-art non-ordinal classification algorithms. In general, the classification performance of an algorithm is dominated by its ability to deal with the high dimensionality of the data. Only hi2 outperforms its competitors in the case of moderate effects.
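
To illustrate the general idea of predicting an ordered category through a hierarchy of binary decisions, the following is a minimal sketch in Python. It is not the authors' hi2 implementation: the abstract does not specify how hi2 chooses its splits or which binary classifier it uses. The fixed split at the middle of the ordered levels, the nearest-centroid binary classifier, the toy data, and the names `fit_tree` and `predict` are all assumptions made for illustration. The sketch also assumes every level has at least one training sample.

```python
from statistics import mean


def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_tree(X, y, levels):
    """Recursively split the ordered `levels` into a lower and an upper group.

    Each internal node stores one centroid per group, which acts as a
    binary nearest-centroid classifier (an illustrative choice, not hi2's).
    """
    if len(levels) == 1:
        return {"leaf": levels[0]}
    mid = len(levels) // 2  # assumed split rule: cut the ordering in the middle
    low, high = levels[:mid], levels[mid:]
    X_low = [x for x, lab in zip(X, y) if lab in low]
    X_high = [x for x, lab in zip(X, y) if lab in high]
    y_low = [lab for lab in y if lab in low]
    y_high = [lab for lab in y if lab in high]
    return {
        "c_low": centroid(X_low),
        "c_high": centroid(X_high),
        "low": fit_tree(X_low, y_low, low),
        "high": fit_tree(X_high, y_high, high),
    }


def predict(tree, x):
    """Walk down the tree, taking one binary decision per node."""
    while "leaf" not in tree:
        go_low = sq_dist(x, tree["c_low"]) <= sq_dist(x, tree["c_high"])
        tree = tree["low"] if go_low else tree["high"]
    return tree["leaf"]


# Toy data (made up): two features per sample, three ordered levels.
levels = ["good", "fair", "poor"]
X = [[0.0, 0.1], [0.2, 0.0], [1.0, 1.1], [1.2, 0.9], [2.0, 2.1], [2.2, 1.9]]
y = ["good", "good", "fair", "fair", "poor", "poor"]

tree = fit_tree(X, y, levels)
predictions = [predict(tree, x) for x in ([0.1, 0.0], [1.1, 1.0], [2.1, 2.0])]
print(predictions)
```

Because every node only separates neighbouring groups of levels, the ordering of the response is built into the structure of the classifier, whereas a flat multi-class classifier would treat "good", "fair" and "poor" as unrelated labels.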

BibTeX - Entry

@InProceedings{leha_et_al:OASIcs:2013:4234,
  author =	{Andreas Leha and Klaus Jung and Tim Bei{\ss}barth},
  title =	{{Utilization of ordinal response structures in classification with high-dimensional expression data}},
  booktitle =	{German Conference on Bioinformatics 2013},
  pages =	{90--100},
  series =	{OpenAccess Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-59-0},
  ISSN =	{2190-6807},
  year =	{2013},
  volume =	{34},
  editor =	{Tim Bei{\ss}barth and Martin Kollmar and Andreas Leha and Burkhard Morgenstern and Anne-Kathrin Schultz and Stephan Waack and Edgar Wingender},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2013/4234},
  URN =		{urn:nbn:de:0030-drops-42340},
  doi =		{10.4230/OASIcs.GCB.2013.90},
  annote =	{Keywords: Classification, High-Dimensional Data, Ordinal Response, Expression Data}
}

Keywords: Classification, High-Dimensional Data, Ordinal Response, Expression Data
Seminar: German Conference on Bioinformatics 2013
Issue Date: 2013
Date of publication: 29.08.2013

