Search Results

Documents authored by Chandar, Sarath


Document
Combining Reinforcement Learning and Constraint Programming for Sequence-Generation Tasks with Hard Constraints

Authors: Daphné Lafleur, Sarath Chandar, and Gilles Pesant

Published in: LIPIcs, Volume 235, 28th International Conference on Principles and Practice of Constraint Programming (CP 2022)


Abstract
While Machine Learning (ML) techniques are good at generating data similar to a dataset, they lack the capacity to enforce constraints. On the other hand, any solution to a Constraint Programming (CP) model satisfies its constraints but has no obligation to imitate a dataset. Yet, we sometimes need both. In this paper we borrow RL-Tuner, a Reinforcement Learning (RL) algorithm introduced to tune neural networks, as our enabling architecture to exploit the respective strengths of ML and CP. RL-Tuner maximizes the sum of a pretrained network’s learned probabilities and of manually-tuned penalties for each violated constraint. We replace the latter with outputs of a CP model representing the marginal probabilities of each value and the number of constraint violations. As was the case for the original RL-Tuner, we apply our algorithm to music generation since it is a highly-constrained domain for which CP is especially suited. We show that combining ML and CP, as opposed to using them individually, allows the agent to reflect the pretrained network while taking into account constraints, leading to melodic lines that respect both the corpus' style and the music theory constraints.

Cite as

Daphné Lafleur, Sarath Chandar, and Gilles Pesant. Combining Reinforcement Learning and Constraint Programming for Sequence-Generation Tasks with Hard Constraints. In 28th International Conference on Principles and Practice of Constraint Programming (CP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 235, pp. 30:1-30:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{lafleur_et_al:LIPIcs.CP.2022.30,
  author =	{Lafleur, Daphn\'{e} and Chandar, Sarath and Pesant, Gilles},
  title =	{{Combining Reinforcement Learning and Constraint Programming for Sequence-Generation Tasks with Hard Constraints}},
  booktitle =	{28th International Conference on Principles and Practice of Constraint Programming (CP 2022)},
  pages =	{30:1--30:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-240-2},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{235},
  editor =	{Solnon, Christine},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CP.2022.30},
  URN =		{urn:nbn:de:0030-drops-166594},
  doi =		{10.4230/LIPIcs.CP.2022.30},
  annote =	{Keywords: Constraint programming, reinforcement learning, RNN, music generation}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail