Robust Mean Estimation by All Means (Short Paper)

Affeldt, Reynald; Barrett, Clark; Bruni, Alessandro; Daukantas, Ieva; Khan, Harun; Saikawa, Takafumi; Schürmann, Carsten

doi:10.4230/LIPIcs.ITP.2024.39

Abstract

We report the results of a verification experiment on an algorithm for robust mean estimation, i.e., an algorithm that computes a mean in the presence of outliers. We formalize the algorithm in the Coq proof assistant and devise a pragmatic approach for identifying and solving issues related to the choice of bounds. To keep our formalization succinct and generic, we recast the original argument using an existing library for finite probabilities that we extend with reusable lemmas. To formalize the original algorithm, which relies on a subtle convergence argument, we observe that by adding suitable termination checks, we can turn it into a well-founded recursion without losing its original properties. We also exploit a tactic for solving real-valued inequalities by approximation to heuristically fix inaccurate constant values in the original proof.
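
As an illustration of the termination idea mentioned above, the following is a minimal Python sketch of a weight-based filter for one-dimensional data. It is not the Coq formalization and not necessarily the exact algorithm verified in the paper: the function names (`filter_mean`, `weighted_mean`, `weighted_variance`), the weight-update rule, and the `var_bound` parameter are assumptions chosen for illustration. The point it tries to convey is that an explicit check on the number of points with nonzero weight gives a measure that strictly decreases at every recursive call, so the recursion is well-founded.

```python
from typing import List, Optional, Sequence


def weighted_mean(xs: Sequence[float], ws: Sequence[float]) -> float:
    """Mean of xs under non-negative weights ws (sum of ws must be > 0)."""
    return sum(w * x for w, x in zip(ws, xs)) / sum(ws)


def weighted_variance(xs: Sequence[float], ws: Sequence[float]) -> float:
    """Variance of xs around the weighted mean, under weights ws."""
    mu = weighted_mean(xs, ws)
    return sum(w * (x - mu) ** 2 for w, x in zip(ws, xs)) / sum(ws)


def filter_mean(
    xs: Sequence[float],
    var_bound: float,
    ws: Optional[List[float]] = None,
) -> Optional[float]:
    """Hypothetical filter: down-weight points far from the current mean
    until the weighted variance drops to var_bound or below.

    Returns None when a termination check fires instead of an estimate.
    """
    if var_bound < 0:
        raise ValueError("var_bound must be non-negative")
    if ws is None:
        ws = [1.0] * len(xs)

    # Termination measure: number of points that still carry weight.
    support = sum(1 for w in ws if w > 0)
    if support == 0:
        return None  # termination check: nothing left to average

    mu = weighted_mean(xs, ws)
    if weighted_variance(xs, ws) <= var_bound:
        return mu  # variance is small enough: accept the current mean

    # Squared distances to the mean; the variance is positive here,
    # so the largest distance among supported points is positive too.
    taus = [(x - mu) ** 2 for x in xs]
    tau_max = max(t for t, w in zip(taus, ws) if w > 0)

    # Scale each weight down in proportion to its squared distance;
    # the farthest supported point gets weight exactly 0.
    new_ws = [w * (1 - t / tau_max) if w > 0 else 0.0
              for w, t in zip(ws, taus)]

    # Defensive termination check: recurse only if the support shrank.
    new_support = sum(1 for w in new_ws if w > 0)
    if new_support >= support:
        return None
    return filter_mean(xs, var_bound, new_ws)
```

For example, `filter_mean([1.0, 1.1, 0.9, 1.05, 0.95, 100.0], 1.0)` removes the weight of the point `100.0` in its first step and then returns a value close to 1, while `filter_mean([0.0, 10.0], 1.0)` returns `None` because both points lose their weight at once. Since the support shrinks by at least one point per call, the recursion depth is bounded by the number of inputs plus one.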

