Learning and Testing Variable Partitions

Authors: Andrej Bogdanov, Baoxiang Wang



Author Details

Andrej Bogdanov
  • Department of Computer Science and Engineering, Institute of Theoretical Computer Science and Communications, The Chinese University of Hong Kong
Baoxiang Wang
  • Department of Computer Science and Engineering, The Chinese University of Hong Kong

Acknowledgements

We would like to thank Arnab Bhattacharyya and Guy Kindler for helpful discussions on our work and its connection to learning and testing juntas, and Jiajin Li for pointing out that variable partitioning for reinforcement learning in fact reduces the variance of its policy gradient estimator.

Cite As

Andrej Bogdanov and Baoxiang Wang. Learning and Testing Variable Partitions. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 151, pp. 37:1-37:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.ITCS.2020.37

Abstract

Let F be a multivariate function from a product set Σ^n to an Abelian group G. A k-partition of F with cost δ is a partition of the set of variables V into k non-empty subsets (X_1, …, X_k) such that F(V) is δ-close to F_1(X_1) + … + F_k(X_k) for some F_1, …, F_k with respect to a given error metric. We study algorithms for agnostically learning k-partitions and testing k-partitionability over various groups and error metrics given query access to F. In particular we show that 1) Given a function that has a k-partition of cost δ, a partition of cost O(k n^2)(δ + ε) can be learned in time Õ(n^2 poly(1/ε)) for any ε > 0. In contrast, for k = 2 and n = 3 learning a partition of cost δ + ε is NP-hard. 2) When F is real-valued and the error metric is the 2-norm, a 2-partition of cost √(δ^2 + ε) can be learned in time Õ(n^5/ε^2). 3) When F is Z_q-valued and the error metric is Hamming weight, k-partitionability is testable with one-sided error and O(kn^3/ε) non-adaptive queries. We also show that even two-sided testers require Ω(n) queries when k = 2. This work was motivated by reinforcement learning control tasks in which the set of control variables can be partitioned. The partitioning reduces the task to multiple lower-dimensional ones that are easier to learn. Our second algorithm empirically increases the scores attained over previous heuristic partitioning methods applied in this context.
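To make the definition concrete, the following Python sketch checks whether a Z_q-valued function admits an exact (cost 0) 2-partition across a given split (A, B) of the variables. It relies on a standard fact: F(x) = F_A(x_A) + F_B(x_B) holds for some F_A, F_B exactly when F(x) + F(y) = F(x_A, y_B) + F(y_A, x_B) for all x and y. This is an illustration only and is not claimed to be the tester or learner analyzed in the paper; the names `mix`, `rectangle_defect`, `splits_exactly`, `sampled_violation_rate`, and the example function `F_sep` are all hypothetical.

```python
import itertools
import random


def mix(x, y, part):
    """Take the coordinates indexed by `part` from x and the rest from y."""
    return tuple(x[i] if i in part else y[i] for i in range(len(x)))


def rectangle_defect(F, x, y, part, q):
    """F(x) + F(y) - F(x_A, y_B) - F(y_A, x_B) mod q, where A = part."""
    return (F(x) + F(y) - F(mix(x, y, part)) - F(mix(y, x, part))) % q


def splits_exactly(F, sigma, n, part, q):
    """Brute-force check of the identity over all pairs (x, y) in Sigma^n."""
    points = list(itertools.product(sigma, repeat=n))
    return all(
        rectangle_defect(F, x, y, part, q) == 0 for x in points for y in points
    )


def sampled_violation_rate(F, sigma, n, part, q, trials=2000, seed=0):
    """Fraction of uniformly random pairs (x, y) that violate the identity."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        x = tuple(rng.choice(sigma) for _ in range(n))
        y = tuple(rng.choice(sigma) for _ in range(n))
        if rectangle_defect(F, x, y, part, q) != 0:
            bad += 1
    return bad / trials


# Hypothetical example: a Z_5-valued function on 4 variables over Sigma = {0, 1, 2}.
Q = 5
SIGMA = (0, 1, 2)


def F_sep(x):
    # Variables 0 and 1 interact through a product; variables 2 and 3 do not
    # interact with them, so ({0, 1}, {2, 3}) is an exact 2-partition.
    return (x[0] * x[1] + 3 * x[2] + x[3] * x[3]) % Q


if __name__ == "__main__":
    print(splits_exactly(F_sep, SIGMA, 4, {0, 1}, Q))          # valid split
    print(splits_exactly(F_sep, SIGMA, 4, {0, 2}, Q))          # separates 0 from 1
    print(sampled_violation_rate(F_sep, SIGMA, 4, {0, 2}, Q))  # estimated rate
```

The exhaustive check costs |Σ|^(2n) evaluations, so it is only feasible for toy sizes. The sampled variant hints at why query access to F can suffice when only an approximate answer is needed.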

Subject Classification

ACM Subject Classification
  • Theory of computation → Streaming, sublinear and near linear time algorithms
  • Theory of computation → Approximation algorithms analysis
  • Theory of computation → Machine learning theory
  • Theory of computation → Reinforcement learning
  • Computing methodologies → Reinforcement learning
Keywords
  • partitioning
  • agnostic learning
  • property testing
  • sublinear-time algorithms
  • hypergraph cut
  • reinforcement learning
