l1-Penalised Ordinal Polytomous Regression Estimators with Application to Gene Expression Studies

Authors: Stéphane Chrétien, Christophe Guyeux, Serge Moulin


Author Details

Stéphane Chrétien
  • National Physical Laboratory, Hampton Road, Teddington, United Kingdom
Christophe Guyeux
  • Computer Science Department, FEMTO-ST Institute, UMR 6174 CNRS, Université de Bourgogne Franche-Comté, 16 route de Gray, 25030 Besançon, France
Serge Moulin
  • Computer Science Department, FEMTO-ST Institute, UMR 6174 CNRS, Université de Bourgogne Franche-Comté, 16 route de Gray, 25030 Besançon, France

Cite As

Stéphane Chrétien, Christophe Guyeux, and Serge Moulin. l1-Penalised Ordinal Polytomous Regression Estimators with Application to Gene Expression Studies. In 18th International Workshop on Algorithms in Bioinformatics (WABI 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 113, pp. 17:1-17:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Abstract

Qualitative but ordered random variables, such as the severity of a pathology, are of paramount importance in biostatistics and medicine. The conditional distribution of such a qualitative variable as a function of other explanatory variables can be studied using a specific regression model known as ordinal polytomous regression. Variable selection in the ordinal polytomous regression model is a computationally difficult combinatorial optimisation problem, which is however crucial when practitioners need to understand which covariates are physically related to the output and which are not. One easy way to circumvent the computational hardness of variable selection is to introduce a penalised maximum likelihood estimator based on a well-chosen non-smooth penalisation function such as the l_1-norm. In the case of the Gaussian linear model, the l_1-penalised least-squares estimator, also known as the LASSO estimator, has attracted a lot of attention in the last decade, from both the theoretical and the algorithmic viewpoints. However, even in the Gaussian linear model, accurate calibration of the relaxation parameter, i.e. the relative weight of the penalisation term in the estimation cost function, is still considered a difficult problem that has to be addressed with caution. In the present paper, we apply l_1-penalisation to the ordinal polytomous regression model and compare several hyper-parameter calibration strategies. Our main contributions are: (a) a useful and simple l_1-penalised estimator for ordinal polytomous regression, together with a thorough description of how to apply Nesterov's accelerated gradient and the online Frank-Wolfe methods to the problem of computing this estimator, (b) a new hyper-parameter calibration method for the proposed model, based on the QUT idea of Giacobino et al., and (c) freely available code implementing the proposed estimation procedure.
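To make the estimator concrete, the following is a minimal illustrative sketch, not the authors' released module: it fits a cumulative-logit (proportional odds) ordinal model P(y ≤ k | x) = sigmoid(theta_k - x·beta) by plain proximal gradient descent, soft-thresholding the regression coefficients to apply the l_1 penalty. The step size, penalty level lam, the fixed iteration count, and the simple sorting used to keep the cut-points ordered are all illustrative assumptions; the paper itself computes the estimator with Nesterov's accelerated gradient and the online Frank-Wolfe methods.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nll_grad(theta, beta, X, y):
    """Gradients of the mean negative log-likelihood of the cumulative-logit
    model P(y <= k | x) = sigmoid(theta_k - x @ beta)."""
    n = X.shape[0]
    eta = X @ beta
    cuts = np.concatenate(([-np.inf], theta, [np.inf]))
    Fu = sigmoid(cuts[y + 1] - eta)      # F(theta_y     - x.beta)
    Fl = sigmoid(cuts[y] - eta)          # F(theta_{y-1} - x.beta)
    fu = Fu * (1.0 - Fu)                 # logistic density (0 at +/- inf)
    fl = Fl * (1.0 - Fl)
    p = np.clip(Fu - Fl, 1e-12, None)    # P(y | x), clipped for stability
    g_beta = X.T @ ((fu - fl) / p) / n
    g_theta = np.array([
        -np.sum(fu[y == k] / p[y == k]) + np.sum(fl[y == k + 1] / p[y == k + 1])
        for k in range(len(theta))
    ]) / n
    return g_theta, g_beta

def fit_l1_ordinal(X, y, lam, step=0.1, iters=3000):
    """Proximal gradient (ISTA-style) sketch of the l1-penalised estimator."""
    K = int(y.max()) + 1
    theta = np.linspace(-1.0, 1.0, K - 1)
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        g_theta, g_beta = nll_grad(theta, beta, X, y)
        theta = np.sort(theta - step * g_theta)  # crude fix to keep cut-points ordered
        b = beta - step * g_beta
        beta = np.sign(b) * np.maximum(np.abs(b) - step * lam, 0.0)  # soft-threshold
    return theta, beta

# Toy check: one relevant covariate out of five, three ordered classes.
rng = np.random.default_rng(0)
X = rng.standard_normal((400, 5))
latent = 2.0 * X[:, 0] + rng.logistic(size=400)
y = np.digitize(latent, [-1.0, 1.0])             # classes 0, 1, 2
theta_hat, beta_hat = fit_l1_ordinal(X, y, lam=0.05)
```

On such data the soft-thresholding step drives the coefficients of the irrelevant covariates to (near-)exact zero while retaining the relevant one, which is precisely the variable-selection behaviour the l_1 penalty is introduced for; calibrating `lam` itself is the subject of the QUT-based procedure proposed in the paper.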

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Regression analysis

Keywords and Phrases
  • ordinal polytomous regression
  • Quantile Universal Threshold
  • Frank-Wolfe algorithm
  • Nesterov algorithm




References

  1. Our Python module. Accessed: 2018-05-11. URL: https://github.com/SergeMOULIN/l1-penalised-ordinal-polytomous-regression-estimators.
  2. Hirotugu Akaike. Information theory and an extension of the maximum likelihood principle. In Selected Papers of Hirotugu Akaike, pages 199-213. Springer, 1998.
  3. Sylvain Arlot and Alain Celisse. A survey of cross-validation procedures for model selection. Statistics Surveys, 4:40-79, 2010.
  4. Stephen Becker, Jérôme Bobin, and Emmanuel J. Candès. NESTA: a fast and accurate first-order method for sparse recovery. SIAM Journal on Imaging Sciences, 4(1):1-39, 2011.
  5. Alexandre Belloni, Victor Chernozhukov, and Lie Wang. Square-root lasso: pivotal recovery of sparse signals via conic programming. Biometrika, 98(4):791-806, 2011.
  6. Peter J. Bickel, Ya'acov Ritov, and Alexandre B. Tsybakov. Simultaneous analysis of lasso and Dantzig selector. The Annals of Statistics, 37(4):1705-1732, 2009.
  7. Emmanuel Candès and Terence Tao. The Dantzig selector: statistical estimation when p is much larger than n. The Annals of Statistics, 35(6):2313-2351, 2007.
  8. Emmanuel J. Candès and Yaniv Plan. Near-ideal model selection by 𝓁1 minimization. The Annals of Statistics, 37(5A):2145-2177, 2009.
  9. Stéphane Chrétien, Christophe Guyeux, and Serge Moulin. l1-penalised ordinal polytomous regression estimators. arXiv preprint, to be submitted, 2018.
  10. Stéphane Chrétien and Sébastien Darses. Sparse recovery with unknown variance: a lasso-type approach. IEEE Transactions on Information Theory, 60(7):3970-3988, 2014.
  11. Stéphane Chrétien, Alex Gibberd, and Sandipan Roy. Hedging hyperparameter selection for basis pursuit. arXiv preprint arXiv:1805.01870, 2018.
  12. Stéphane Chrétien, Christophe Guyeux, Michael Boyer-Guittaut, Régis Delage-Mouroux, and Françoise Descôtes. Investigating gene expression array with outliers and missing data in bladder cancer. In 2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pages 994-998. IEEE, 2015.
  13. Stéphane Chrétien, Christophe Guyeux, Michael Boyer-Guittaut, Régis Delage-Mouroux, and Françoise Descôtes. Using the lasso for gene selection in bladder cancer data. arXiv preprint arXiv:1504.05004, 2015.
  14. Stéphane Chrétien, Christophe Guyeux, Bastien Conesa, Régis Delage-Mouroux, Michèle Jouvenot, Philippe Huetz, and Françoise Descôtes. A Bregman-proximal point algorithm for robust non-negative matrix factorization with possible missing values and outliers: application to gene expression analysis. BMC Bioinformatics, 17(8):284, 2016.
  15. Marguerite Frank and Philip Wolfe. An algorithm for quadratic programming. Naval Research Logistics (NRL), 3(1-2):95-110, 1956.
  16. Yoav Freund and Robert E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119-139, 1997.
  17. Caroline Giacobino, Sylvain Sardy, Jairo Diaz-Rodriguez, and Nick Hengartner. Quantile universal threshold for model selection. arXiv preprint arXiv:1511.05433, 2015.
  18. Christopher Kennedy and Rachel Ward. Greedy variance estimation for the lasso. arXiv preprint arXiv:1803.10878, 2018.
  19. Alan Miller. Subset Selection in Regression. CRC Press, 2002.
  20. Yurii Nesterov. Smooth minimization of non-smooth functions. Mathematical Programming, 103(1):127-152, 2005.
  21. Yurii Nesterov. A method of solving a convex programming problem with convergence rate O(1/k^2). Soviet Mathematics Doklady, pages 372-376, 1983.
  22. Gideon Schwarz. Estimating the dimension of a model. The Annals of Statistics, 6(2):461-464, 1978.
  23. Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58(1):267-288, 1996.
  24. Robert Tibshirani. The lasso method for variable selection in the Cox model. Statistics in Medicine, 16(4):385-395, 1997.
  25. Sara A. van de Geer. High-dimensional generalized linear models and the lasso. The Annals of Statistics, 36(2):614-645, 2008.