Distributed and Robust Support Vector Machine

Authors Yangwei Liu, Hu Ding, Ziyun Huang, Jinhui Xu



PDF
Thumbnail PDF

File

LIPIcs.ISAAC.2016.54.pdf
  • Filesize: 1.92 MB
  • 13 pages

Document Identifiers

Author Details

Yangwei Liu
Hu Ding
Ziyun Huang
Jinhui Xu

Cite As Get BibTex

Yangwei Liu, Hu Ding, Ziyun Huang, and Jinhui Xu. Distributed and Robust Support Vector Machine. In 27th International Symposium on Algorithms and Computation (ISAAC 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 64, pp. 54:1-54:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016) https://doi.org/10.4230/LIPIcs.ISAAC.2016.54

Abstract

In this paper, we consider the distributed version of Support Vector Machine (SVM) under the coordinator model, where all input data (i.e., points in R^d space) of SVM are arbitrarily distributed among k nodes in some network with a coordinator which can communicate with all nodes. We investigate two variants of this problem, with and without outliers. For distributed SVM without outliers, we prove a lower bound on the communication complexity and give a distributed (1-epsilon)-approximation algorithm to reach this lower bound, where epsilon is a user specified small constant. For distributed SVM with outliers, we present a (1-epsilon)-approximation algorithm to explicitly remove the influence of outliers. Our algorithm is based on a deterministic distributed top t selection algorithm with communication complexity of O(k log (t)) in the coordinator model. Experimental results on benchmark datasets confirm the theoretical guarantees of our algorithms.

Subject Classification

Keywords
  • Distributed Algorithm
  • Communication Complexity
  • Robust Algorithm
  • SVM

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Maria-Florina Balcan, Steven Ehrlich, and Yingyu Liang. Distributed clustering on graphs. CoRR, 2013. Google Scholar
  2. Aurélien Bellet, Yingyu Liang, Alireza Bagheri Garakani, Maria-Florina Balcan, and Fei Sha. Distributed frank-wolfe algorithm: A unified framework for communication-efficient sparse learning. CoRR, 2014. Google Scholar
  3. G. Cauwenberghs and T. Poggio. Incremental and decremental support vector machine learning. NIPS, 2000. Google Scholar
  4. E. Y. Chang, K. Zhu, H. Wang, H. Bai, H. Li, Z. Qiu, and H. Cui. Psvm: Parallelizing support vector machines on distributed computers. 21st Neural Info. Processing Systems Conf, 2007. Google Scholar
  5. C. Cortes and V. Vapnik. Support vector networks. Machine Learning, 1995. Google Scholar
  6. Hal Daumé, Jeff M. Phillips, Avishek Saha, and Suresh Venkatasubramanian. Efficient protocols for distributed classification and optimization. In Proceedings of the 23rd International Conference on Algorithmic Learning Theory, 2012. Google Scholar
  7. H. Ding and J. Xu. Random gradient descent tree: A combinatorial approach for svm with outliers. Association for the Advanced of Artificial Intelligence Conf., 2015. Google Scholar
  8. P. A. Forero, A. Cano, and G. B. Giannakis. Consensus-based distributed support vector machines. Journal of Machine Learning Research, 2010. Google Scholar
  9. G. Fung and O. L. Mangasarian. Incremental support vector machine classification. Proc. of the SIAM Intl. Conf. on Data Mining, 2002. Google Scholar
  10. G. Gartner and M. Jaggi. Coresets for polytope distance. SOCG, 2009. Google Scholar
  11. E. G. Gilbert. An iterative procedure for computing the minimum of a quadratic form on a convex set. SIAM Journal on Control, 1966. Google Scholar
  12. H. P. Graf, E. Cosatto, L. Bottou, I. Dourdanovic, and V. Vapnik. Parallel support vector machines: The cascade svm. Advances in Neural Information processing Systems, 2005. Google Scholar
  13. F. Kuhn, T. Locher, and R. Wattenhofer. Tight bounds for distributed selection. Symposium on Parallelism in Algorithms and Architectures, 2007. Google Scholar
  14. Y. Lu, V. Roychowdhury, and L. Vandenberghe. Distributed parallel support vector machines in strongly connected networks. IEEE Tran. on Neural Networks, 2008. Google Scholar
  15. N. M. Luscombe, D. Greenbaum, and M. Gerstein. Review: What is bioinformatics? an introduction and overview. Yearbook of Medical Informatics, 2001. Google Scholar
  16. A. Navia-Vazquez, D. Gutierrez-Gonzalez, E. Parrado-Hernandez, and J. J. Navarro-Abellan. Distributed support vector machines. IEEE Tran. on Neural Networks, 2006. Google Scholar
  17. A. Negro, N. Santoro, and J. Urrutia. Efficient distributed selection with bounded messages. IEEE Trans. Parallel and Distributed Systems, 1997. Google Scholar
  18. D. Pechyony, L. Shen, and R. Jones. Solving large scale linear svm with distributed block minimization. NIPS 2011 Workshop on Big Learning: Algorithms, Systems, and Tools for Learning at Scale, 2011. Google Scholar
  19. J. Phillips, E. Verbin, and Q. Zhang. Lower bounds for number in hand multiparty communication complexity, made easy. SODA, 2012. Google Scholar
  20. Benjamin Recht, Christopher Re, Stephen Wright, and Feng Niu. Hogwild: A lock-free approach to parallelizing stochastic gradient descent. In Advances in Neural Information Processing Systems 24. NIPS, 2011. Google Scholar
  21. N. Santoro, J. B. Sidney, and S. J. Sidney. A distributed selection algorithm and its expected communication complexity. Theoretical Computer Science, 1992. Google Scholar
  22. J. Yick, B. Mukherjee, and D. Ghosal. Wireless sensor network survey. Computer Networks, 2008. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail