6 Search Results for "Peng, Binghui"


Document
Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time

Authors: Zhao Song, Lichen Zhang, and Ruizhe Zhang

Published in: LIPIcs, Volume 287, 15th Innovations in Theoretical Computer Science Conference (ITCS 2024)


Abstract
We consider the problem of training a multi-layer over-parametrized neural network to minimize the empirical risk induced by a loss function. In the typical setting of over-parametrization, the network width m is much larger than the data dimension d and the number of training samples n (m = poly(n,d)), which induces a prohibitively large weight matrix W ∈ ℝ^{m× m} per layer. Naively, one has to pay O(m²) time to read the weight matrix and evaluate the neural network function in both forward and backward computation. In this work, we show how to reduce the training cost per iteration. Specifically, we propose a framework that uses m² cost only in the initialization phase and achieves a truly subquadratic cost per iteration in terms of m, i.e., m^{2-Ω(1)} per iteration. Our result has implications beyond standard over-parametrization theory, as it can be viewed as designing an efficient data structure on top of a pre-trained large model to further speed up the fine-tuning process, a core procedure in deploying large language models (LLMs).
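
To make the quadratic baseline concrete, the following minimal NumPy sketch (illustrative only, not the paper's data structure) runs one forward and backward pass through a dense m × m layer: every entry of W is read, so each iteration costs Θ(m²) in the width. The dimensions and the ReLU activation are arbitrary choices.

# Naive per-iteration cost for one over-parametrized layer: O(m^2).
import numpy as np

m, d = 4096, 64                                 # width m far exceeds data dimension d
rng = np.random.default_rng(0)
W = rng.standard_normal((m, m)) / np.sqrt(m)    # dense m x m weight matrix
A = rng.standard_normal((m, d)) / np.sqrt(d)    # illustrative input embedding to width m
x = rng.standard_normal(d)

h = np.maximum(A @ x, 0.0)                      # lift the input to width m
z = W @ h                                       # O(m^2): every entry of W is touched
out = np.maximum(z, 0.0)

g_out = np.ones(m)                              # placeholder upstream gradient
g_W = np.outer(g_out * (z > 0), h)              # backward pass again forms an m x m object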

Cite as

Zhao Song, Lichen Zhang, and Ruizhe Zhang. Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time. In 15th Innovations in Theoretical Computer Science Conference (ITCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 287, pp. 93:1-93:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{song_et_al:LIPIcs.ITCS.2024.93,
  author =	{Song, Zhao and Zhang, Lichen and Zhang, Ruizhe},
  title =	{{Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time}},
  booktitle =	{15th Innovations in Theoretical Computer Science Conference (ITCS 2024)},
  pages =	{93:1--93:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-309-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{287},
  editor =	{Guruswami, Venkatesan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2024.93},
  URN =		{urn:nbn:de:0030-drops-196212},
  doi =		{10.4230/LIPIcs.ITCS.2024.93},
  annote =	{Keywords: Deep learning theory, Nonconvex optimization}
}
Document
Primal-Dual Schemes for Online Matching in Bounded Degree Graphs

Authors: Ilan Reuven Cohen and Binghui Peng

Published in: LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)


Abstract
We explore various generalizations of the online matching problem in a bipartite graph G, such as the b-matching problem [Kalyanasundaram and Pruhs, 2000], the allocation problem [Buchbinder et al., 2007], and the AdWords problem [Mehta et al., 2007], in a beyond-worst-case setting. Specifically, we assume that G is a (k, d)-bounded degree graph, introduced by Naor and Wajc [Naor and Wajc, 2018]. Such graphs model natural properties of the degrees of advertisers and queries in the allocation and AdWords problems. While previous work only considers the scenario where k ≥ d, we consider the interesting intermediate regime k ≤ d and prove a tight competitive ratio, as a function of k and d (under the small-bid assumption), of τ(k,d) = 1 - (1-k/d)⋅(1-1/d)^{d - k} for the b-matching and allocation problems. We exploit primal-dual schemes [Buchbinder et al., 2009; Azar et al., 2017] to design and analyze the corresponding tight upper and lower bounds. Finally, we show a separation between the allocation and AdWords problems: we demonstrate that τ(k,d)-competitiveness is impossible for the AdWords problem even in (k,d)-bounded degree graphs.
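
As a quick sanity check on the stated ratio, the snippet below evaluates τ(k,d) = 1 - (1-k/d)⋅(1-1/d)^{d-k} for a few arbitrary (k,d) pairs: τ(1,d) tends to 1 - 1/e as d grows, the familiar constant from classical online matching, while τ(d,d) = 1.

# Evaluate the competitive ratio tau(k, d) from the abstract for illustrative values.
import math

def tau(k: int, d: int) -> float:
    return 1.0 - (1.0 - k / d) * (1.0 - 1.0 / d) ** (d - k)

for k, d in [(1, 10), (1, 1000), (5, 10), (10, 10)]:
    print(f"tau({k}, {d}) = {tau(k, d):.4f}")

print("1 - 1/e     =", round(1 - 1 / math.e, 4))   # limit of tau(1, d) as d grows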

Cite as

Ilan Reuven Cohen and Binghui Peng. Primal-Dual Schemes for Online Matching in Bounded Degree Graphs. In 31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 35:1-35:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


BibTeX

@InProceedings{cohen_et_al:LIPIcs.ESA.2023.35,
  author =	{Cohen, Ilan Reuven and Peng, Binghui},
  title =	{{Primal-Dual Schemes for Online Matching in Bounded Degree Graphs}},
  booktitle =	{31st Annual European Symposium on Algorithms (ESA 2023)},
  pages =	{35:1--35:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-295-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{274},
  editor =	{G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2023.35},
  URN =		{urn:nbn:de:0030-drops-186884},
  doi =		{10.4230/LIPIcs.ESA.2023.35},
  annote =	{Keywords: Online Matching, Primal-dual analysis, bounded-degree graph, the AdWords problem}
}
Document
Training (Overparametrized) Neural Networks in Near-Linear Time

Authors: Jan van den Brand, Binghui Peng, Zhao Song, and Omri Weinstein

Published in: LIPIcs, Volume 185, 12th Innovations in Theoretical Computer Science Conference (ITCS 2021)


Abstract
The slow convergence rate and pathological curvature issues of first-order gradient methods for training deep neural networks have initiated an ongoing effort to develop faster second-order optimization algorithms beyond SGD, without compromising the generalization error. Despite their remarkable convergence rate (independent of the training batch size n), second-order algorithms incur a daunting slowdown in the cost per iteration (inverting the Hessian matrix of the loss function), which renders them impractical. Very recently, this computational overhead was mitigated by the works of [Zhang et al., 2019; Cai et al., 2019], yielding an O(mn²)-time second-order algorithm for training two-layer overparametrized neural networks of polynomial width m. We show how to speed up the algorithm of [Cai et al., 2019], achieving an Õ(mn)-time backpropagation algorithm for training (mildly overparametrized) ReLU networks, which is near-linear in the dimension (mn) of the full gradient (Jacobian) matrix. The centerpiece of our algorithm is to reformulate the Gauss-Newton iteration as an 𝓁₂-regression problem, and then use a Fast-JL type dimension reduction to precondition the underlying Gram matrix in time independent of m, allowing us to find a sufficiently good approximate solution via first-order conjugate gradient. Our result provides a proof-of-concept that advanced machinery from randomized linear algebra - which led to recent breakthroughs in convex optimization (ERM, LPs, Regression) - can be carried over to the realm of deep learning as well.
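
The sketch below illustrates the generic sketch-and-precondition primitive the abstract alludes to, on a stand-alone 𝓁₂-regression min_x ‖Jx - r‖₂: a Gaussian sketch (standing in here for a Fast-JL transform) of J is QR-factorized, R serves as a right preconditioner, and conjugate gradient is run on the resulting well-conditioned normal equations. All matrices, sizes, and iteration counts are illustrative assumptions; this is not the paper's training algorithm.

# Sketch-and-precondition least squares: sketch, QR, then conjugate gradient.
import numpy as np

rng = np.random.default_rng(1)
n, p = 20000, 200                                        # tall regression problem
J = rng.standard_normal((n, p)) @ np.diag(np.logspace(0, 3, p))  # ill-conditioned columns
x_true = rng.standard_normal(p)
r = J @ x_true + 1e-3 * rng.standard_normal(n)

s = 4 * p                                                # O(p) sketch rows suffice whp
S = rng.standard_normal((s, n)) / np.sqrt(s)             # Gaussian sketch of the rows
_, R = np.linalg.qr(S @ J)                               # SJ = QR; R preconditions J
Rinv = np.linalg.inv(R)                                  # small p x p inverse

def cg_normal(A_mv, AT_mv, b, dim, iters=30):
    # Conjugate gradient on the normal equations A^T A y = A^T b (CGNR).
    y = np.zeros(dim)
    g = AT_mv(b)                                         # gradient at y = 0
    d = g.copy()
    for _ in range(iters):
        Ad = A_mv(d)
        alpha = (g @ g) / (Ad @ Ad)
        y += alpha * d
        g_new = g - alpha * AT_mv(Ad)
        d = g_new + ((g_new @ g_new) / (g @ g)) * d
        g = g_new
    return y

# Solve the preconditioned problem min_y ||(J R^{-1}) y - r||, then map back to x.
y = cg_normal(lambda v: J @ (Rinv @ v), lambda v: Rinv.T @ (J.T @ v), r, p)
x = Rinv @ y
print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))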

Cite as

Jan van den Brand, Binghui Peng, Zhao Song, and Omri Weinstein. Training (Overparametrized) Neural Networks in Near-Linear Time. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 63:1-63:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


BibTeX

@InProceedings{vandenbrand_et_al:LIPIcs.ITCS.2021.63,
  author =	{van den Brand, Jan and Peng, Binghui and Song, Zhao and Weinstein, Omri},
  title =	{{Training (Overparametrized) Neural Networks in Near-Linear Time}},
  booktitle =	{12th Innovations in Theoretical Computer Science Conference (ITCS 2021)},
  pages =	{63:1--63:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-177-1},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{185},
  editor =	{Lee, James R.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2021.63},
  URN =		{urn:nbn:de:0030-drops-136025},
  doi =		{10.4230/LIPIcs.ITCS.2021.63},
  annote =	{Keywords: Deep learning theory, Nonconvex optimization}
}
Document
Distributed Load Balancing: A New Framework and Improved Guarantees

Authors: Sara Ahmadian, Allen Liu, Binghui Peng, and Morteza Zadimoghaddam

Published in: LIPIcs, Volume 185, 12th Innovations in Theoretical Computer Science Conference (ITCS 2021)


Abstract
Inspired by applications in search engines and web servers, we consider a load balancing problem with a general convex objective function. In this problem, we are given a bipartite graph on a set of sources S and a set of workers W, and the goal is to distribute the load from each source among its neighboring workers so that the loads of the workers are as balanced as possible. We present a new distributed algorithm that works with any symmetric non-decreasing convex function for evaluating the balancedness of the workers' loads. Our algorithm computes a nearly optimal allocation of loads in O(log n log² d/ε³) rounds, where n is the number of nodes, d is the maximum degree, and ε is the desired precision. If the objective is to minimize the maximum load, we modify the algorithm to obtain a nearly optimal solution in O(log n log d/ε²) rounds. This improves upon a line of algorithms that require a polynomial number of rounds in n and d and appear to encounter a fundamental barrier that prevents them from obtaining poly-logarithmic runtime [Berenbrink et al., 2005; Berenbrink et al., 2009; Subramanian and Scherson, 1994; Rabani et al., 1998]. In our paper, we introduce a novel primal-dual approach with multiplicative weight updates that allows us to circumvent this barrier. Our algorithm is inspired by [Agrawal et al., 2018] and other distributed algorithms for optimizing linear objectives, but introduces several new twists to deal with general convex objectives.
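
The toy below only illustrates the problem setup together with a multiplicative-weights-flavored local update in which each source repeatedly shifts its fractional load toward its less-loaded neighboring workers. The graph, the step size eta, and the iteration count are arbitrary, and no claim is made about the paper's algorithm or its round complexity.

# Toy fractional load balancing on a random bipartite graph.
import numpy as np

rng = np.random.default_rng(2)
n_sources, n_workers, deg = 50, 30, 4
nbrs = [rng.choice(n_workers, size=deg, replace=False) for _ in range(n_sources)]
x = [np.full(deg, 1.0 / deg) for _ in range(n_sources)]   # start uniform per source

eta = 0.1
for _ in range(500):
    load = np.zeros(n_workers)
    for s in range(n_sources):
        np.add.at(load, nbrs[s], x[s])                    # current worker loads
    for s in range(n_sources):                            # down-weight loaded neighbors
        w = x[s] * np.exp(-eta * load[nbrs[s]])
        x[s] = w / w.sum()

load = np.zeros(n_workers)
for s in range(n_sources):
    np.add.at(load, nbrs[s], x[s])
print("max load:", round(load.max(), 3), "  average load:", round(n_sources / n_workers, 3))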

Cite as

Sara Ahmadian, Allen Liu, Binghui Peng, and Morteza Zadimoghaddam. Distributed Load Balancing: A New Framework and Improved Guarantees. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 79:1-79:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


BibTeX

@InProceedings{ahmadian_et_al:LIPIcs.ITCS.2021.79,
  author =	{Ahmadian, Sara and Liu, Allen and Peng, Binghui and Zadimoghaddam, Morteza},
  title =	{{Distributed Load Balancing: A New Framework and Improved Guarantees}},
  booktitle =	{12th Innovations in Theoretical Computer Science Conference (ITCS 2021)},
  pages =	{79:1--79:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-177-1},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{185},
  editor =	{Lee, James R.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2021.79},
  URN =		{urn:nbn:de:0030-drops-136186},
  doi =		{10.4230/LIPIcs.ITCS.2021.79},
  annote =	{Keywords: Load balancing, Distributed algorithms}
}
Document
On Adaptivity Gaps of Influence Maximization Under the Independent Cascade Model with Full-Adoption Feedback

Authors: Wei Chen and Binghui Peng

Published in: LIPIcs, Volume 149, 30th International Symposium on Algorithms and Computation (ISAAC 2019)


Abstract
In this paper, we study the adaptivity gap of the influence maximization problem under the independent cascade model when full-adoption feedback is available. Our main results are upper bounds for several families of well-studied influence graphs, including in-arborescences, out-arborescences, and bipartite graphs. In particular, we prove that the adaptivity gap for in-arborescences lies in [e/(e-1), 2e/(e-1)], and for out-arborescences the gap lies in [e/(e-1), 2]. These are the first constant upper bounds in the full-adoption feedback model. Our analysis provides several novel ideas for tackling the correlated feedback that appears in adaptive stochastic optimization, which may be of independent interest.
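
For reference, the constants in the stated bounds evaluate numerically as follows.

# Numerical values of the adaptivity-gap bounds quoted in the abstract.
import math

e = math.e
print("e/(e-1)  =", round(e / (e - 1), 4))       # ~1.582, common lower bound
print("2e/(e-1) =", round(2 * e / (e - 1), 4))   # ~3.164, in-arborescence upper bound
print("2        =", 2.0)                         # out-arborescence upper bound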

Cite as

Wei Chen and Binghui Peng. On Adaptivity Gaps of Influence Maximization Under the Independent Cascade Model with Full-Adoption Feedback. In 30th International Symposium on Algorithms and Computation (ISAAC 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 149, pp. 24:1-24:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


BibTeX

@InProceedings{chen_et_al:LIPIcs.ISAAC.2019.24,
  author =	{Chen, Wei and Peng, Binghui},
  title =	{{On Adaptivity Gaps of Influence Maximization Under the Independent Cascade Model with Full-Adoption Feedback}},
  booktitle =	{30th International Symposium on Algorithms and Computation (ISAAC 2019)},
  pages =	{24:1--24:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-130-6},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{149},
  editor =	{Lu, Pinyan and Zhang, Guochuan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2019.24},
  URN =		{urn:nbn:de:0030-drops-115208},
  doi =		{10.4230/LIPIcs.ISAAC.2019.24},
  annote =	{Keywords: Adaptive influence maximization, adaptivity gap, full-adoption feedback}
}
Document
Track A: Algorithms, Complexity and Games
Stochastic Online Metric Matching

Authors: Anupam Gupta, Guru Guruganesh, Binghui Peng, and David Wajc

Published in: LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)


Abstract
We study the minimum-cost metric perfect matching problem under online i.i.d. arrivals. We are given a fixed metric with a server at each of the points, and then requests arrive online, each drawn independently from a known probability distribution over the points. Each request has to be matched to a free server, with cost equal to the distance. The goal is to minimize the expected total cost of the matching. Such stochastic arrival models have been widely studied for the maximization variants of the online matching problem; however, the only known result for the minimization problem is a tight O(log n)-competitiveness for the random-order arrival model. This is in contrast with the adversarial model, where an optimal competitive ratio of O(log n) has long been conjectured and remains a tantalizing open question. In this paper, we show that the i.i.d. model admits substantially better algorithms: our main result is an O((log log log n)^2)-competitive algorithm in this model, implying a strict separation between the i.i.d. model and the adversarial and random-order models. Along the way, we give a 9-competitive algorithm for line and tree metrics - the first O(1)-competitive algorithm for any non-trivial arrival model for these much-studied metrics.
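
The snippet below is a toy simulation of the i.i.d. arrival model on the line metric: it compares a naive greedy baseline (match each arriving request to the nearest free server) against the offline optimum, which on the line is the sorted-order matching. It only illustrates the model; it is unrelated to the paper's O((log log log n)^2)-competitive algorithm, and the distribution and instance size are arbitrary.

# Toy i.i.d. online metric matching on the line: greedy vs. offline optimum.
import numpy as np

rng = np.random.default_rng(3)
n = 200
servers = np.sort(rng.uniform(0.0, 1.0, n))        # one server per point of the metric
dist = rng.dirichlet(np.ones(n))                   # known arrival distribution over points
requests = servers[rng.choice(n, size=n, p=dist)]  # i.i.d. draws, one request per server

free = np.ones(n, dtype=bool)
greedy_cost = 0.0
for r in requests:                                 # online: nearest free server
    idx = np.flatnonzero(free)
    j = idx[np.argmin(np.abs(servers[idx] - r))]
    greedy_cost += abs(servers[j] - r)
    free[j] = False

opt_cost = np.abs(np.sort(requests) - servers).sum()   # offline OPT on the line
print("greedy:", round(greedy_cost, 3), "  offline OPT:", round(opt_cost, 3))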

Cite as

Anupam Gupta, Guru Guruganesh, Binghui Peng, and David Wajc. Stochastic Online Metric Matching. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 67:1-67:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


BibTeX

@InProceedings{gupta_et_al:LIPIcs.ICALP.2019.67,
  author =	{Gupta, Anupam and Guruganesh, Guru and Peng, Binghui and Wajc, David},
  title =	{{Stochastic Online Metric Matching}},
  booktitle =	{46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)},
  pages =	{67:1--67:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-109-2},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{132},
  editor =	{Baier, Christel and Chatzigiannakis, Ioannis and Flocchini, Paola and Leonardi, Stefano},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2019.67},
  URN =		{urn:nbn:de:0030-drops-106430},
  doi =		{10.4230/LIPIcs.ICALP.2019.67},
  annote =	{Keywords: stochastic, online, online matching, metric matching}
}
  • Refine by Author
  • 5 Peng, Binghui
  • 2 Song, Zhao
  • 1 Ahmadian, Sara
  • 1 Chen, Wei
  • 1 Cohen, Ilan Reuven

  • Refine by Classification
  • 2 Theory of computation → Nonconvex optimization
  • 2 Theory of computation → Online algorithms
  • 1 Mathematics of computing → Matchings and factors
  • 1 Theory of computation → Distributed algorithms
  • 1 Theory of computation → Machine learning theory

  • Refine by Keyword
  • 2 Deep learning theory
  • 2 Nonconvex optimization
  • 1 Adaptive influence maximization
  • 1 Distributed algorithms
  • 1 Load balancing

  • Refine by Type
  • 6 document

  • Refine by Publication Year
  • 2 2019
  • 2 2021
  • 1 2023
  • 1 2024
