Robust Mean Estimation by All Means (Short Paper)

Authors Reynald Affeldt , Clark Barrett , Alessandro Bruni , Ieva Daukantas , Harun Khan , Takafumi Saikawa , Carsten Schürmann



PDF
Thumbnail PDF

File

LIPIcs.ITP.2024.39.pdf
  • Filesize: 0.78 MB
  • 8 pages

Document Identifiers

Author Details

Reynald Affeldt
  • National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
Clark Barrett
  • Stanford University, CA, USA
Alessandro Bruni
  • IT University of Copenhagen, Denmark
Ieva Daukantas
  • IT University of Copenhagen, Denmark
Harun Khan
  • Stanford University, CA, USA
Takafumi Saikawa
  • Nagoya University, Japan
Carsten Schürmann
  • IT University of Copenhagen, Denmark

Acknowledgements

The authors would like to thank the anonymous referees for their thorough reviews.

Cite AsGet BibTex

Reynald Affeldt, Clark Barrett, Alessandro Bruni, Ieva Daukantas, Harun Khan, Takafumi Saikawa, and Carsten Schürmann. Robust Mean Estimation by All Means (Short Paper). In 15th International Conference on Interactive Theorem Proving (ITP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 309, pp. 39:1-39:8, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.ITP.2024.39

Abstract

We report the results of a verification experiment on an algorithm for robust mean estimation, i.e., an algorithm that computes a mean in the presence of outliers. We formalize the algorithm in the Coq proof assistant and devise a pragmatic approach for identifying and solving issues related to the choice of bounds. To keep our formalization succinct and generic, we recast the original argument using an existing library for finite probabilities that we extend with reusable lemmas. To formalize the original algorithm, which relies on a subtle convergence argument, we observe that by adding suitable termination checks, we can turn it into a well-founded recursion without losing its original properties. We also exploit a tactic for solving real-valued inequalities by approximation to heuristically fix inaccurate constant values in the original proof.

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Probabilistic inference problems
  • Theory of computation → Logic and verification
Keywords
  • robust statistics
  • probability theory
  • formal verification

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Reynald Affeldt, Alessandro Bruni, Ieva Daukantas, and Takafumi Saikawa. Robust mean estimators. Available as part of the InfoTheo library [Reynald Affeldt et al., 2018], directory robust, 2024. Google Scholar
  2. Reynald Affeldt, Jacques Garrigue, and Takafumi Saikawa. Reasoning with conditional probabilities and joint distributions in Coq. Computer Software, 37(3):79-95, 2020. URL: https://doi.org/10.11309/jssst.37.3_79.
  3. Reynald Affeldt, Manabu Hagiwara, and Jonas Sénizergues. Formalization of Shannon’s theorems. J. Autom. Reason., 53(1):63-103, 2014. URL: https://doi.org/10.1007/S10817-013-9298-1.
  4. Reynald Affeldt, Manabu Hagiwara, Jonas Sénizergues, Jacques Garrigue, Kazuhiko Sakaguchi, Taku Asai, Takafumi Saikawa, Naruomi Obata, and Alessandro Bruni. InfoTheo: A Coq formalization of information theory and linear error-correcting codes. https://github.com/affeldt-aist/infotheo, 2018. Last stable release: 0.7.2 (2024).
  5. Ieva Daukantas, Alessandro Bruni, and Carsten Schürmann. Trimming data sets: a verified algorithm for robust mean estimation. In 23rd International Symposium on Principles and Practice of Declarative Programming (PPDP 2021), Tallinn, Estonia, September 6-8, 2021, pages 17:1-17:9. ACM, 2021. URL: https://doi.org/10.1145/3479394.3479412.
  6. Ilias Diakonikolas, Gautam Kamath, Daniel Kane, Jerry Li, Ankur Moitra, and Alistair Stewart. Robust estimators in high-dimensions without the computational intractability. SIAM J. Comput., 48(2):742-864, 2019. URL: https://doi.org/10.1137/17M1126680.
  7. Peter J. Huber. Robust Estimation of a Location Parameter, pages 492-518. Springer New York, New York, NY, 1992. URL: https://doi.org/10.1007/978-1-4612-4380-9_35.
  8. MathComp. The mathematical components library. Available at https://github.com/math-comp/math-comp, 2007. Last stable version: Version 2.2.0 (2024).
  9. Guillaume Melquiond. https://coqinterval.gitlabpages.inria.fr/, 2008. Last stable version: 4.12.0 (2024).
  10. Guillaume Melquiond. Proving bounds on real-valued functions with computations. In 4th International Joint Conference on Automated Reasoning (IJCAR 2008), Sydney, Australia, August 12-15, 2008, volume 5195 of Lecture Notes in Computer Science, pages 2-17. Springer, 2008. URL: https://doi.org/10.1007/978-3-540-71070-7_2.
  11. Jacob Steinhardt. Robust Learning: Information Theory and Algorithms. PhD thesis, Stanford, 2018. Google Scholar
  12. The Coq Development Team. The Coq Proof Assistant Reference Manual. Inria, 2024. Available at https://coq.inria.fr. Version 8.19.0.
  13. Jean-Baptiste Tristan, Joseph Tassarotti, Koundinya Vajjha, Michael L. Wick, and Anindya Banerjee. Verification of ML systems via reparameterization, 2020. URL: https://arxiv.org/abs/2007.06776.
  14. J.W. Tukey and Princeton University. Department of Statistics. A Survey of Sampling from Contaminated Distributions. STRG Technical report. Princeton University, 1959. Google Scholar
  15. Koundinya Vajjha, Barry M. Trager, Avraham Shinnar, and Vasily Pestun. Formalization of a stochastic approximation theorem. In 13th International Conference on Interactive Theorem Proving (ITP 2022), August 7-10, 2022, Haifa, Israel, volume 237 of LIPIcs, pages 31:1-31:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022. URL: https://doi.org/10.4230/LIPICS.ITP.2022.31.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail