On Correcting Inputs: Inverse Optimization for Online Structured Prediction

Authors Hal Daumé III, Samir Khuller, Manish Purohit, Gregory Sanders



  • Filesize: 0.49 MB
  • 14 pages



Cite As

Hal Daumé III, Samir Khuller, Manish Purohit, and Gregory Sanders. On Correcting Inputs: Inverse Optimization for Online Structured Prediction. In 35th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 45, pp. 38-51, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)


Algorithm designers typically assume that the input data is correct and then proceed to find "optimal" or "sub-optimal" solutions using this input data. However, this assumption of correct data does not always hold in practice, especially in the context of online learning systems, where the objective is to learn appropriate feature weights from training samples. Such scenarios necessitate the study of inverse optimization problems, in which one is given an input instance as well as a desired output, and the task is to adjust the input data so that the given output is indeed optimal. Motivated by learning structured prediction models, in this paper we consider inverse optimization with a margin, i.e., we require the given output to be better than all other feasible outputs by a desired margin. We study such inverse optimization problems for maximum weight matroid basis, matroid intersection, perfect matchings, minimum cost maximum flows, and shortest paths, and derive the first known results for these problems with a non-zero margin. The effectiveness of these algorithmic approaches to online learning for structured prediction is also discussed.
Keywords

  • Inverse Optimization
  • Structured Prediction
  • Online Learning
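To make the margin condition concrete, here is a minimal Python sketch of inverse optimization with a margin for shortest paths. This is not the paper's algorithm: the graph instance, function names, step size, and the brute-force enumeration of simple paths are illustrative assumptions. Perceptron-style additive corrections are applied to the edge weights until the desired path is cheaper than every alternative path by the required margin.

```python
def simple_paths(adj, s, t, path=None):
    """Enumerate all simple s-t paths (as tuples of edges) in a small directed graph."""
    path = path or [s]
    if s == t:
        yield tuple(zip(path, path[1:]))
        return
    for v in adj.get(s, []):
        if v not in path:
            yield from simple_paths(adj, v, t, path + [v])

def correct_weights(adj, w, desired, s, t, margin=1.0, step=0.5, max_iters=1000):
    """Adjust edge weights so `desired` beats every other s-t path by `margin`.

    Illustrative perceptron-style scheme: cheapen edges unique to the desired
    path and penalize edges unique to the current best rival, repeating until
    the margin condition cost(desired) + margin <= cost(rival) holds.
    """
    w = dict(w)
    cost = lambda p: sum(w[e] for e in p)
    for _ in range(max_iters):
        rivals = [p for p in simple_paths(adj, s, t) if p != desired]
        best = min(rivals, key=cost)
        if cost(desired) + margin <= cost(best):
            return w
        for e in desired:
            if e not in best:
                w[e] -= step
        for e in best:
            if e not in desired:
                w[e] += step
    raise RuntimeError("margin correction did not converge")

# Toy instance: make the (initially more expensive) path s -> b -> t
# optimal by a margin of 1 over the rival path s -> a -> t.
adj = {'s': ['a', 'b'], 'a': ['t'], 'b': ['t']}
w0 = {('s', 'a'): 1.0, ('a', 't'): 1.0, ('s', 'b'): 2.0, ('b', 't'): 2.0}
desired = (('s', 'b'), ('b', 't'))
w = correct_weights(adj, w0, desired, 's', 't', margin=1.0)
```

The brute-force path enumeration is only viable on toy graphs; the point of the paper's results is to achieve such margin-respecting corrections efficiently for structured problems like matroid bases, matchings, flows, and shortest paths.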



