Multiple-Choice Knapsack for Assigning Partial Atomic Charges in Drug-Like Molecules

Authors: Martin S. Engler, Bertrand Caron, Lourens Veen, Daan P. Geerke, Alan E. Mark, Gunnar W. Klau


Author Details

Martin S. Engler
  • Life Sciences and Health Group, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
Bertrand Caron
  • School of Chemistry & Molecular Biosciences, The University of Queensland, St Lucia, Australia
Lourens Veen
  • Netherlands eScience Center, Amsterdam, The Netherlands
Daan P. Geerke
  • AIMMS Division of Molecular and Computational Toxicology, Vrije Universiteit Amsterdam, The Netherlands
Alan E. Mark
  • School of Chemistry & Molecular Biosciences, The University of Queensland, St Lucia, Australia
Gunnar W. Klau
  • Algorithmic Bioinformatics, Heinrich Heine University Düsseldorf, Germany

Cite As

Martin S. Engler, Bertrand Caron, Lourens Veen, Daan P. Geerke, Alan E. Mark, and Gunnar W. Klau. Multiple-Choice Knapsack for Assigning Partial Atomic Charges in Drug-Like Molecules. In 18th International Workshop on Algorithms in Bioinformatics (WABI 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 113, pp. 16:1-16:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


A key factor in computational drug design is the consistency and reliability with which intermolecular interactions between a wide variety of molecules can be described. Here we present a procedure to efficiently, reliably and automatically assign partial atomic charges to atoms based on known distributions. We formally introduce the molecular charge assignment problem, where the task is to select a charge from a set of candidate charges for every atom of a given query molecule. Each candidate charge is accompanied by a score that depends on its observed frequency in similar neighbourhoods (chemical environments) in a database of previously parameterised molecules. The aim is to assign the charges such that the total charge equals a known target charge within a margin of error while maximizing the sum of the charge scores. We show that the problem is a variant of the well-studied multiple-choice knapsack problem and thus weakly NP-complete. We propose solutions based on Integer Linear Programming and a pseudo-polynomial time Dynamic Programming algorithm. We show that the results obtained for novel molecules not included in the database are comparable to those obtained by performing explicit charge calculations, while the time to determine partial charges for a molecule decreases by several orders of magnitude, that is, from hours or even days to below a second. Our software is openly available at
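To make the problem statement concrete, the following is a minimal Python sketch of a pseudo-polynomial dynamic program for a multiple-choice knapsack of this kind. It is an illustration under stated assumptions, not the authors' implementation: the function name `assign_charges`, the `(charge, score)` candidate tuples, and the `decimals` rounding parameter are all hypothetical. Charges are assumed to be rounded to a fixed number of decimals so that partial sums can be indexed as integers.

```python
from typing import Dict, List, Optional, Tuple

Candidate = Tuple[float, float]  # (charge, score); hypothetical representation


def assign_charges(
    candidates: List[List[Candidate]],
    target: float,
    margin: float,
    decimals: int = 3,
) -> Optional[Tuple[float, List[int]]]:
    """Pick one candidate per atom, maximizing the total score, such that
    the total charge lies within `margin` of `target`.

    Returns (best_score, chosen_indices), or None if no assignment is feasible.
    """
    scale = 10 ** decimals

    def to_int(x: float) -> int:
        # Assumption: charges are meaningful only up to `decimals` places.
        return round(x * scale)

    # best[s] = (score, picks): best partial assignment with scaled charge sum s.
    best: Dict[int, Tuple[float, List[int]]] = {0: (0.0, [])}
    for options in candidates:
        nxt: Dict[int, Tuple[float, List[int]]] = {}
        for s, (score, picks) in best.items():
            for k, (charge, cand_score) in enumerate(options):
                t = s + to_int(charge)
                total = score + cand_score
                # Keep only the highest-scoring way to reach each sum.
                if t not in nxt or total > nxt[t][0]:
                    nxt[t] = (total, picks + [k])
        best = nxt

    lo = to_int(target) - to_int(margin)
    hi = to_int(target) + to_int(margin)
    feasible = [entry for s, entry in best.items() if lo <= s <= hi]
    if not feasible:
        return None
    return max(feasible, key=lambda entry: entry[0])


# Toy example with three atoms and two candidate charges each (made-up values).
toy = [
    [(-0.5, 0.6), (-0.4, 0.4)],
    [(0.2, 0.7), (0.3, 0.3)],
    [(0.2, 0.5), (0.1, 0.5)],
]
result = assign_charges(toy, target=0.0, margin=0.0, decimals=2)
```

The number of table entries per atom is bounded by the range of reachable scaled sums, which is why the running time depends on the numeric magnitude of the charges rather than only on the input length. Storing the full list of picks in every state keeps the sketch short; a backpointer table would use less memory.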

Subject Classification

ACM Subject Classification
  • Applied computing → Chemistry
Keywords
  • Multiple-choice knapsack
  • integer linear programming
  • pseudo-polynomial dynamic programming
  • partial charge assignment
  • molecular dynamics simulations


