Derivation of Constraints from Machine Learning Models and Applications to Security and Privacy

Falaschi, Moreno; Palamidessi, Catuscia; Romanelli, Marco

doi:10.4230/OASIcs.Gabbrielli.11

Abstract

This paper shows how we can combine the power of machine learning with the flexibility of constraints. More specifically, we show how machine learning models can be represented by first-order logic theories, and how to derive these theories. The advantage of this representation is that it can be augmented with additional formulae, representing constraints of some kind on the data domain. For instance, new knowledge, or potential attackers, or fairness desiderata. We consider various kinds of learning algorithms (neural networks, k-nearest-neighbours, decision trees, support vector machines) and for each of them we show how to infer the FOL formulae. Then we focus on one particular application domain, namely the field of security and privacy. The idea is to represent the potentialities and goals of the attacker as a set of constraints, then use a constraint solver (more precisely, a solver modulo theories) to verify the satisfiability. If a solution exists, then it means that an attack is possible, otherwise, the system is safe. We show various examples from different areas of security and privacy; specifically, we consider a side-channel attack on a password checker, a malware attack on smart health systems, and a model-inversion attack on a neural network.

Ulrich Aïvodji, Sébastien Gambs, and Timon Ther. GAMIN: an adversarial approach to black-box model inversion. CoRR, abs/1909.11835, 2019. URL: http://arxiv.org/abs/1909.11835.
N. S. Altman. An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician, 46(3):175-185, 1992. URL: http://www.jstor.org/stable/2685209.
Mário S. Alvim, Konstantinos Chatzikokolakis, Annabelle McIver, Carroll Morgan, Catuscia Palamidessi, and Geoffrey Smith. The Science of Quantitative Information Flow. Springer International Publishing, 2020. URL: https://doi.org/10.1007/978-3-319-96131-6.
Roberto Amadini, Maurizio Gabbrielli, and Jacopo Mauro. Portfolio approaches for constraint optimization problems. Ann. Math. Artif. Intell., 76(1-2):229-246, 2016. URL: https://doi.org/10.1007/s10472-015-9459-5.
Clark W. Barrett, Christopher L. Conway, Morgan Deters, Liana Hadarean, Dejan Jovanovic, Tim King, Andrew Reynolds, and Cesare Tinelli. CVC4. In Ganesh Gopalakrishnan and Shaz Qadeer, editors, Computer Aided Verification - 23rd International Conference, CAV 2011, Snowbird, UT, USA, July 14-20, 2011. Proceedings, volume 6806 of Lecture Notes in Computer Science, pages 171-177. Springer, 2011. URL: https://doi.org/10.1007/978-3-642-22110-1_14.
Nicolas Beldiceanu and Helmut Simonis. A model seeker: Extracting global constraint models from positive examples. In Michela Milano, editor, Principles and Practice of Constraint Programming - 18th International Conference, CP 2012, Québec City, QC, Canada, October 8-12, 2012. Proceedings, volume 7514 of Lecture Notes in Computer Science, pages 141-157. Springer, 2012. URL: https://doi.org/10.1007/978-3-642-33558-7_13.
Christopher M. Bishop. Pattern recognition and machine learning, 5th Edition. Information science and statistics. Springer, 2007. URL: https://www.worldcat.org/oclc/71008143.
Bernhard E. Boser, Isabelle Guyon, and Vladimir Vapnik. A training algorithm for optimal margin classifiers. In David Haussler, editor, Proceedings of the Fifth Annual ACM Conference on Computational Learning Theory, COLT 1992, Pittsburgh, PA, USA, July 27-29, 1992, pages 144-152. ACM, 1992. URL: https://doi.org/10.1145/130385.130401.
Will Bridewell, Pat Langley, Ljupco Todorovski, and Saso Dzeroski. Inductive process modeling. Mach. Learn., 71(1):1-32, 2008. URL: https://doi.org/10.1007/s10994-007-5042-6.
Giovanni Cherubin, Konstantinos Chatzikokolakis, and Catuscia Palamidessi. F-BLEAU: fast black-box leakage estimation. In 2019 IEEE Symposium on Security and Privacy, SP 2019, San Francisco, CA, USA, May 19-23, 2019, pages 835-852. IEEE, 2019. URL: https://doi.org/10.1109/SP.2019.00073.
Geoffrey Chu, Peter J. Stuckey, Andreas Schutt, Thorsten Ehlers, Graeme Gange, and Kathryn Francis. Chuffed, a lazy clause generation solver. URL: https://github.com/chuffed/chuffed.
Gabriele Ciravegna, Francesco Giannini, Marco Gori, Marco Maggini, and Stefano Melacci. Human-driven FOL explanations of deep learning. In Christian Bessiere, editor, Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI 2020, pages 2234-2240. ijcai.org, 2020. URL: https://doi.org/10.24963/ijcai.2020/309.
Gabriele Ciravegna, Francesco Giannini, Stefano Melacci, Marco Maggini, and Marco Gori. A constraint-based approach to learning and explanation. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, pages 3658-3665. AAAI Press, 2020. URL: https://aaai.org/ojs/index.php/AAAI/article/view/5774.
Leonardo de Moura and Nikolaj Bjørner. Z3: An efficient SMT solver. In Tools and Algorithms for the Construction and Analysis (TACAS), volume 4963 of Lecture Notes in Computer Science, pages 337-340, Berlin, 2008. Springer-Verlag. URL: https://doi.org/10.1007/978-3-540-78800-3_24.
Luc Devroye, László Györfi, and Gábor Lugosi. A Probabilistic Theory of Pattern Recognition, volume 31 of Stochastic Modelling and Applied Probability. Springer, 1996. URL: https://doi.org/10.1007/978-1-4612-0711-5.
Bruno Dutertre. Yices 2.2. In Armin Biere and Roderick Bloem, editors, Computer Aided Verification - 26th International Conference, CAV 2014, Held as Part of the Vienna Summer of Logic, VSL 2014, Vienna, Austria, July 18-22, 2014. Proceedings, volume 8559 of Lecture Notes in Computer Science, pages 737-744. Springer, 2014. URL: https://doi.org/10.1007/978-3-319-08867-9_49.
Matt Fredrikson, Somesh Jha, and Thomas Ristenpart. Model inversion attacks that exploit confidence information and basic countermeasures. In Indrajit Ray, Ninghui Li, and Christopher Kruegel, editors, Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, Denver, CO, USA, October 12-16, 2015, pages 1322-1333. ACM, 2015. URL: https://doi.org/10.1145/2810103.2813677.
Matthew Fredrikson, Eric Lantz, Somesh Jha, Simon M. Lin, David Page, and Thomas Ristenpart. Privacy in pharmacogenetics: An end-to-end case study of personalized warfarin dosing. In Kevin Fu and Jaeyeon Jung, editors, Proceedings of the 23rd USENIX Security Symposium, San Diego, CA, USA, August 20-22, 2014, pages 17-32. USENIX Association, 2014. URL: https://www.usenix.org/conference/usenixsecurity14/technical-sessions/presentation/fredrikson_matthew.
Giorgio Gnecco, Marco Gori, Stefano Melacci, and Marcello Sanguineti. Foundations of support constraint machines. Neural Comput., 27(2):388-480, 2015. URL: https://doi.org/10.1162/NECO_a_00686.
J. A. Goguen and J. Meseguer. Security policies and security models. In Proceedings of the 1982 Symposium on Security and Privacy (SSP '82), pages 11-20, Los Alamitos, Ca., USA, April 1990. IEEE Computer Society Press. URL: https://doi.org/10.1109/SP.1982.10014.
Ian J. Goodfellow, Yoshua Bengio, and Aaron C. Courville. Deep Learning. Adaptive computation and machine learning. MIT Press, 2016. URL: http://www.deeplearningbook.org/.
Trevor Hastie, Robert Tibshirani, and Jerome H. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition. Springer Series in Statistics. Springer, 2009. URL: https://doi.org/10.1007/978-0-387-84858-7.
Joxan Jaffar and Michael J. Maher. Constraint logic programming: A survey. J. Log. Program., 19/20:503-581, 1994. URL: https://doi.org/10.1016/0743-1066(94)90033-7.
Samuel Kolb, Stefano Teso, Andrea Passerini, and Luc De Raedt. Learning SMT(LRA) constraints using SMT solvers. In Jérôme Lang, editor, Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018, Stockholm, Sweden, pages 2333-2340. ijcai.org, 2018. URL: https://doi.org/10.24963/ijcai.2018/323.
Samuel M. Kolb. Learning constraints and optimization criteria. In Parisa Kordjamshidi, editor, Declarative Learning Based Programming, Papers from the 2016 AAAI Workshop, Phoenix, Arizona, USA, February 13, 2016, volume WS-16-07 of AAAI Workshops. AAAI Press, 2016. URL: http://www.aaai.org/ocs/index.php/WS/AAAIW16/paper/view/12651.
Boris Köpf and Markus Dürmuth. A provably secure and efficient countermeasure against timing attacks. In Proceedings of the 22nd IEEE Computer Security Foundations Symposium, CSF 2009, Port Jefferson, New York, USA, July 8-10, 2009, pages 324-335. IEEE Computer Society, 2009. URL: https://doi.org/10.1109/CSF.2009.21.
Michele Lombardi, Michela Milano, and Andrea Bartolini. Empirical decision model learning. Artif. Intell, 244:343-367, 2017. URL: https://doi.org/10.1016/j.artint.2016.01.005.
Stephen Muggleton. Inductive logic programming. New Gener. Comput., 8(4):295-318, 1991. URL: https://doi.org/10.1007/BF03037089.
Robert Nieuwenhuis, Albert Oliveras, and Cesare Tinelli. Solving SAT and SAT modulo theories: From an abstract davis-putnam-logemann-loveland procedure to dpll(T). J. ACM, 53(6):937-977, 2006. URL: https://doi.org/10.1145/1217856.1217859.
Chunki Park, Will Bridewell, and Pat Langley. Integrated systems for inducing spatio-temporal process models. In Maria Fox and David Poole, editors, Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2010, Atlanta, Georgia, USA, July 11-15, 2010. AAAI Press, 2010. URL: http://www.aaai.org/ocs/index.php/AAAI/AAAI10/paper/view/1742.
Tomasz P. Pawlak and Krzysztof Krawiec. Automatic synthesis of constraints from examples using mixed integer linear programming. Eur. J. Oper. Res., 261(3):1141-1157, 2017. URL: https://doi.org/10.1016/j.ejor.2017.02.034.
Laurent Perron and Vincent Furnon. Or-tools. URL: https://developers.google.com/optimization/.
J. Ross Quinlan. Induction of decision trees. Mach. Learn., 1(1):81-106, 1986. URL: https://doi.org/10.1023/A:1022643204877.
Luc De Raedt, Anton Dries, Tias Guns, and Christian Bessiere. Learning constraint satisfaction problems: An ILP perspective. In Christian Bessiere, Luc De Raedt, Lars Kotthoff, Siegfried Nijssen, Barry O'Sullivan, and Dino Pedreschi, editors, Data Mining and Constraint Programming - Foundations of a Cross-Disciplinary Approach, volume 10101 of Lecture Notes in Computer Science, pages 96-112. Springer, 2016. URL: https://doi.org/10.1007/978-3-319-50137-6_5.
Luc De Raedt, Andrea Passerini, and Stefano Teso. Learning constraints from examples. In Sheila A. McIlraith and Kilian Q. Weinberger, editors, Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018, pages 7965-7970. AAAI Press, 2018. URL: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/17229.
Marco Romanelli, Konstantinos Chatzikokolakis, and Catuscia Palamidessi. Optimal obfuscation mechanisms via machine learning. In 33rd IEEE Computer Security Foundations Symposium, CSF 2020, Boston, MA, USA, June 22-26, 2020, pages 153-168. IEEE, 2020. URL: https://doi.org/10.1109/CSF49147.2020.00019.
Marco Romanelli, Konstantinos Chatzikokolakis, Catuscia Palamidessi, and Pablo Piantanida. Estimating g-leakage via machine learning. In Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security (CCS '20). ACM, 2020. To appear. Preprint in CoRR abs/2005.04399. URL: http://arxiv.org/abs/2005.04399.
Francesca Rossi, Peter van Beek, and Toby Walsh, editors. Handbook of Constraint Programming, volume 2 of Foundations of Artificial Intelligence. Elsevier, 2006. URL: http://www.sciencedirect.com/science/bookseries/15746526/2.
Vijay Saraswat. Concurrent Constraint Programming. MIT Press, 1993. URL: https://doi.org/10.7551/mitpress/2086.001.0001.
Vijay A. Saraswat, Martin Rinard, and Prakash Panangaden. The semantic foundations of concurrent constraint programming. In Conference Record of the Eighteenth Annual ACM Symposium on Principles of Programming Languages, pages 333-352. ACM Press, 1991. URL: https://doi.org/10.1145/99583.99627.
Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning - From Theory to Algorithms. Cambridge University Press, 2014. URL: http://www.cambridge.org/de/academic/subjects/computer-science/pattern-recognition-and-machine-learning/understanding-machine-learning-theory-algorithms.
Darlene Storm. MEDJACK: Hackers hijacking medical devices to create backdoors in hospital networks. Computerworld, June 2015. URL: https://www.computerworld.com/article/2932371/medjack-hackers-hijacking-medical-devices-to-create-backdoors-in-hospital-networks.html.
Geoffrey G. Towell and Jude W. Shavlik. Extracting refined rules from knowledge-based neural networks. Mach. Learn., 13:71-101, 1993. URL: https://doi.org/10.1007/BF00993103.
Leslie G. Valiant. A theory of the learnable. In Richard A. DeMillo, editor, Proceedings of the 16th Annual ACM Symposium on Theory of Computing, April 30 - May 2, 1984, Washington, DC, USA, pages 436-445. ACM, 1984. URL: https://doi.org/10.1145/800057.808710.

Derivation of Constraints from Machine Learning Models and Applications to Security and Privacy

Authors Moreno Falaschi , Catuscia Palamidessi , Marco Romanelli

File

Document Identifiers

Author Details

Acknowledgements

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Derivation of Constraints from Machine Learning Models and Applications to Security and Privacy

Authors Moreno Falaschi , Catuscia Palamidessi , Marco Romanelli

File

Document Identifiers

Author Details

Funding

Acknowledgements

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References