On the Power of Learning from k-Wise Queries

Feldman, Vitaly; Ghazi, Badih

doi:10.4230/LIPIcs.ITCS.2017.41

Abstract

Several well-studied models of access to data samples, including statistical queries, local differential privacy, and low-communication algorithms, rely on queries that provide information about a function of a single sample. (For example, a statistical query (SQ) gives an estimate of E_{x ~ D}[q(x)] for any choice of the query function q mapping X to the reals, where D is an unknown data distribution over X.) Yet some data analysis algorithms rely on properties of functions that depend on multiple samples. Such algorithms would be naturally implemented using k-wise queries, each of which is specified by a function q mapping X^k to the reals. Hence it is natural to ask whether algorithms using k-wise queries can solve learning problems more efficiently and by how much. Blum, Kalai and Wasserman (2003) showed that for any weak PAC learning problem over a fixed distribution, the complexity of learning with k-wise SQs is smaller than the (unary) SQ complexity by a factor of at most 2^k. We show that for more general problems over distributions the picture is substantially richer. For every k, the complexity of distribution-independent PAC learning with k-wise queries can be exponentially larger than that of learning with (k+1)-wise queries. We then give two approaches for simulating a k-wise query using unary queries. The first approach exploits the structure of the problem that needs to be solved. It generalizes and strengthens (exponentially) the results of Blum et al. It allows us to derive strong lower bounds for learning DNF formulas and stochastic constraint satisfaction problems that hold against algorithms using k-wise queries. The second approach exploits the k-party communication complexity of the k-wise query function.
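To make the distinction between unary and k-wise queries concrete, the following is a minimal illustrative sketch and not the paper's construction. It assumes the oracle is simulated by averaging over i.i.d. samples, and the names `unary_sq` and `k_wise_sq` are hypothetical. A unary query estimates E_{x ~ D}[q(x)], while a k-wise query estimates the expectation of q over k independent draws from D, approximated here by splitting the sample into disjoint blocks of size k.

```python
import random


def unary_sq(samples, q):
    """Empirical estimate of E_{x ~ D}[q(x)] from a list of samples."""
    return sum(q(x) for x in samples) / len(samples)


def k_wise_sq(samples, q, k):
    """Empirical estimate of E[q(x_1, ..., x_k)] for x_i drawn i.i.d. from D.

    The samples are split into disjoint consecutive blocks of size k, so the
    blocks are independent; leftover samples at the end are ignored.
    """
    blocks = [samples[i:i + k] for i in range(0, len(samples) - k + 1, k)]
    return sum(q(*block) for block in blocks) / len(blocks)


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed toy distribution D: uniform over {0, 1, 2, 3}.
    data = [rng.randrange(4) for _ in range(40000)]
    # Unary query: probability that a single sample equals 0.
    print(unary_sq(data, lambda x: float(x == 0)))
    # 2-wise query: probability that two independent samples collide.
    print(k_wise_sq(data, lambda x, y: float(x == y), k=2))
```

For the uniform distribution over four points, both true expectations equal 1/4, so both printed estimates should be close to 0.25. The collision query depends on a pair of samples jointly, which is the kind of property a single unary query does not directly express.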

References

Javed A Aslam and Scott E Decatur. General bounds on statistical query learning and PAC learning with noise via hypothesis boosting. In Foundations of Computer Science, 1993. Proceedings., 34th Annual Symposium on, pages 282-291. IEEE, 1993.
Maria-Florina Balcan, Avrim Blum, Shai Fine, and Yishay Mansour. Distributed learning, communication complexity and privacy. In Shie Mannor, Nathan Srebro, and Robert C. Williamson, editors, COLT 2012 - The 25th Annual Conference on Learning Theory, June 25-27, 2012, Edinburgh, Scotland, volume 23 of JMLR Proceedings, pages 26.1-26.22. JMLR.org, 2012. URL: http://www.jmlr.org/proceedings/papers/v23/balcan12a/balcan12a.pdf.
Maria-Florina Balcan and Vitaly Feldman. Statistical active learning algorithms for noise tolerance and differential privacy. Algorithmica, 72(1):282-315, 2015. URL: http://dx.doi.org/10.1007/s00453-014-9954-9.
Shai Ben-David and Eli Dichterman. Learning with restricted focus of attention. J. Comput. Syst. Sci., 56(3):277-298, 1998. URL: http://dx.doi.org/10.1006/jcss.1998.1569.
Shai Ben-David, Alon Itai, and Eyal Kushilevitz. Learning by distances. In Mark A. Fulk and John Case, editors, Proceedings of the Third Annual Workshop on Computational Learning Theory, COLT 1990, University of Rochester, Rochester, NY, USA, August 6-8, 1990., pages 232-245. Morgan Kaufmann, 1990. URL: http://dl.acm.org/citation.cfm?id=92644.
A. Blum, M. Furst, J. Jackson, M. Kearns, Y. Mansour, and S. Rudich. Weakly learning DNF and characterizing statistical query learning using Fourier analysis. In Proceedings of STOC, pages 253-262, 1994.
Avrim Blum, Cynthia Dwork, Frank McSherry, and Kobbi Nissim. Practical privacy: the SuLQ framework. In Chen Li, editor, Proceedings of the Twenty-fourth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 13-15, 2005, Baltimore, Maryland, USA, pages 128-138. ACM, 2005. URL: http://dx.doi.org/10.1145/1065167.1065184.
Avrim Blum, Alan Frieze, Ravi Kannan, and Santosh Vempala. A polynomial-time algorithm for learning noisy linear threshold functions. Algorithmica, 22(1-2):35-52, 1998.
Avrim Blum, Adam Kalai, and Hal Wasserman. Noise-tolerant learning, the parity problem, and the statistical query model. Journal of the ACM (JACM), 50(4):506-519, 2003.
D. Boneh and R. Lipton. Amplification of weak learning over the uniform distribution. In Proceedings of the Sixth Annual Workshop on Computational Learning Theory, pages 347-351, 1993.
Cheng Chu, Sang Kyun Kim, Yi-An Lin, YuanYuan Yu, Gary Bradski, Andrew Y Ng, and Kunle Olukotun. Map-reduce for machine learning on multicore. Advances in neural information processing systems, 19:281, 2007.
Dana Dachman-Soled, Vitaly Feldman, Li-Yang Tan, Andrew Wan, and Karl Wimmer. Approximate resilience, monotonicity, and the complexity of agnostic learning. In Proceedings of SODA, 2015.
Amit Daniely and Shai Shalev-Shwartz. Complexity theoretic limitations on learning DNF's. In COLT, pages 815-830, 2016. URL: http://jmlr.org/proceedings/papers/v49/daniely16.html.
Anindya De, Ilias Diakonikolas, and Rocco A Servedio. Learning from satisfying assignments. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 478-497. SIAM, 2015.
Ilias Diakonikolas, Daniel M. Kane, and Alistair Stewart. Statistical query lower bounds for robust estimation of high-dimensional Gaussians and Gaussian mixtures. CoRR, abs/1611.03473, 2016. URL: http://arxiv.org/abs/1611.03473.
Irit Dinur and Kobbi Nissim. Revealing information while preserving privacy. In Frank Neven, Catriel Beeri, and Tova Milo, editors, Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 9-12, 2003, San Diego, CA, USA, pages 202-210. ACM, 2003. URL: http://dx.doi.org/10.1145/773153.773173.
John Dunagan and Santosh Vempala. A simple polynomial-time rescaling algorithm for solving linear programs. In László Babai, editor, Proceedings of the 36th Annual ACM Symposium on Theory of Computing, Chicago, IL, USA, June 13-16, 2004, pages 315-320. ACM, 2004. URL: http://dx.doi.org/10.1145/1007352.1007404.
Cynthia Dwork, Vitaly Feldman, Moritz Hardt, Toni Pitassi, Omer Reingold, and Aaron Roth. Generalization in adaptive data analysis and holdout reuse. In Advances in Neural Information Processing Systems, pages 2341-2349, 2015.
Cynthia Dwork, Vitaly Feldman, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Aaron Leon Roth. Preserving statistical validity in adaptive data analysis. In Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, pages 117-126. ACM, 2015.
Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith. Calibrating noise to sensitivity in private data analysis. In Shai Halevi and Tal Rabin, editors, Theory of Cryptography, Third Theory of Cryptography Conference, TCC 2006, New York, NY, USA, March 4-7, 2006, Proceedings, volume 3876 of Lecture Notes in Computer Science, pages 265-284. Springer, 2006. URL: http://dx.doi.org/10.1007/11681878_14.
Úlfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. RAPPOR: randomized aggregatable privacy-preserving ordinal response. In ACM SIGSAC Conference on Computer and Communications Security, pages 1054-1067, 2014.
V. Feldman, H. Lee, and R. Servedio. Lower bounds and hardness amplification for learning shallow monotone formulas. In COLT, volume 19, pages 273-292, 2011.
Vitaly Feldman. Evolvability from learning algorithms. In Proceedings of the fortieth annual ACM symposium on Theory of computing, pages 619-628. ACM, 2008.
Vitaly Feldman. Dealing with range anxiety in mean estimation via statistical queries. arXiv, abs/1611.06475, 2016. URL: http://arxiv.org/abs/1611.06475.
Vitaly Feldman. A general characterization of the statistical query complexity. CoRR, abs/1608.02198, 2016. URL: http://arxiv.org/abs/1608.02198.
Vitaly Feldman, Elena Grigorescu, Lev Reyzin, Santosh Vempala, and Ying Xiao. Statistical algorithms and a lower bound for detecting planted cliques. CoRR, abs/1201.1214, 2012. Extended abstract in STOC 2013.
Vitaly Feldman, Cristobal Guzman, and Santosh Vempala. Statistical query algorithms for mean vector estimation and stochastic convex optimization. CoRR, abs/1512.09170, 2015. Extended abstract in SODA 2017. URL: http://arxiv.org/abs/1512.09170.
Vitaly Feldman, Will Perkins, and Santosh Vempala. On the complexity of random satisfiability problems with planted solutions. CoRR, abs/1311.4821, 2013. Extended abstract in STOC 2015.
Oded Goldreich. Candidate one-way functions based on expander graphs. IACR Cryptology ePrint Archive, 2000:63, 2000.
Shiva Prasad Kasiviswanathan, Homin K Lee, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. What can we learn privately? SIAM Journal on Computing, 40(3):793-826, 2011.
Michael Kearns. Efficient noise-tolerant learning from statistical queries. Journal of the ACM (JACM), 45(6):983-1006, 1998.
Eyal Kushilevitz and Noam Nisan. Communication complexity. Cambridge University Press, 1997.
Yishay Mansour, Mehryar Mohri, and Afshin Rostamizadeh. Multiple source adaptation and the Rényi divergence. In UAI, pages 367-374, 2009. URL: https://dslpitt.org/uai/displayArticleDetails.jsp?mmnu=1&smnu=2&article_id=1600&proceeding_id=25.
Aaron Roth and Tim Roughgarden. Interactive privacy via the median mechanism. In Proceedings of the forty-second ACM symposium on Theory of computing, pages 765-774. ACM, 2010.
Indrajit Roy, Srinath TV Setty, Ann Kilzer, Vitaly Shmatikov, and Emmett Witchel. Airavat: Security and privacy for MapReduce. In NSDI, volume 10, pages 297-312, 2010.
J. Steinhardt, G. Valiant, and S. Wager. Memory, communication, and statistical queries. In COLT, pages 1490-1516, 2016.
Jacob Steinhardt and John C. Duchi. Minimax rates for memory-bounded sparse linear regression. In COLT, pages 1564-1587, 2015. URL: http://jmlr.org/proceedings/papers/v40/Steinhardt15.html.
Arvind Sujeeth, HyoukJoong Lee, Kevin Brown, Tiark Rompf, Hassan Chafi, Michael Wu, Anand Atreya, Martin Odersky, and Kunle Olukotun. OptiML: an implicitly parallel domain-specific language for machine learning. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), pages 609-616, 2011.
Leslie G Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134-1142, 1984.
Leslie G Valiant. Evolvability. Journal of the ACM (JACM), 56(1):3, 2009.
Larry Wasserman. All of statistics: a concise course in statistical inference. Springer Science & Business Media, 2013.
Andrew Yao. Probabilistic computations: Toward a unified measure of complexity. In FOCS, pages 222-227, 1977.
Yuchen Zhang, John C. Duchi, Michael I. Jordan, and Martin J. Wainwright. Information-theoretic lower bounds for distributed statistical estimation with communication constraints. In Proceedings of NIPS, pages 2328-2336, 2013.
