Engineering Fused Lasso Solvers on Trees

Authors Elias Kuthe , Sven Rahmann



PDF
Thumbnail PDF

File

LIPIcs.SEA.2020.23.pdf
  • Filesize: 0.8 MB
  • 14 pages

Document Identifiers

Author Details

Elias Kuthe
  • Computer Science XI, TU Dortmund University, Germany
Sven Rahmann
  • Genome Informatics, Institute of Human Genetics, University of Duisburg-Essen, Essen, Germany

Cite AsGet BibTex

Elias Kuthe and Sven Rahmann. Engineering Fused Lasso Solvers on Trees. In 18th International Symposium on Experimental Algorithms (SEA 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 160, pp. 23:1-23:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.SEA.2020.23

Abstract

The graph fused lasso optimization problem seeks, for a given input signal y=(y_i) on nodes i∈ V of a graph G=(V,E), a reconstructed signal x=(x_i) that is both element-wise close to y in quadratic error and also has bounded total variation (sum of absolute differences across edges), thereby favoring regionally constant solutions. An important application is denoising of spatially correlated data, especially for medical images. Currently, fused lasso solvers for general graph input reduce the problem to an iteration over a series of "one-dimensional" problems (on paths or line graphs), which can be solved in linear time. Recently, a direct fused lasso algorithm for tree graphs has been presented, but no implementation of it appears to be available. We here present a simplified exact algorithm and additionally a fast approximation scheme for trees, together with engineered implementations for both. We empirically evaluate their performance on different kinds of trees with distinct degree distributions (simulated trees; spanning trees of road networks, grid graphs of images, social networks). The exact algorithm is very efficient on trees with low node degrees, which covers many naturally arising graphs, while the approximation scheme can perform better on trees with several higher-degree nodes when limiting the desired accuracy to values that are useful in practice.

Subject Classification

ACM Subject Classification
  • Theory of computation → Mathematical optimization
  • Theory of computation → Dynamic programming
  • Mathematics of computing → Trees
Keywords
  • fused lasso
  • regularization
  • tree traversal
  • cache effects

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Taylor Arnold, Veeranjaneyulu Sadhanala, and Ryan J. Tibshirani. GLMGEN: Fast generalized lasso solver, 2019. URL: https://github.com/glmgen/glmgen.
  2. David A. Bader, Henning Meyerhenke, Peter Sanders, and Dorothea Wagner, editors. Graph Partitioning and Graph Clustering-10th DIMACS Implementation Challenge Workshop, Georgia Institute of Technology, Atlanta, GA, USA, February 13-14, 2012, volume 588. American Mathematical Society, 2013. Google Scholar
  3. Amir Beck and Marc Teboulle. Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. Image Processing, IEEE Transactions on, 18(11):2419-2434, November 2009. URL: https://doi.org/10.1109/TIP.2009.2028250.
  4. Gerth S. Brodal. Worst-case efficient priority queues. In Proceedings of the Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '96, pages 52-58, Philadelphia, PA, USA, 1996. Society for Industrial and Applied Mathematics. Google Scholar
  5. Gerth S. Brodal. A survey on priority queues. In Andrej Brodnik, Alejandro López-Ortiz, Venkatesh Raman, and Alfredo Viola, editors, Space-Efficient Data Structures, Streams, and Algorithms, volume 8066 of Lecture Notes in Computer Science, pages 150-163. Springer Berlin Heidelberg, 2013. URL: https://doi.org/10.1007/978-3-642-40273-9_11.
  6. Laurent Condat. A direct algorithm for 1-d total variation denoising. Signal Processing Letters, IEEE, 20(11):1054-1057, November 2013. URL: https://doi.org/10.1109/LSP.2013.2278339.
  7. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms. MIT Press, 3rd edition, 2009. Google Scholar
  8. Amr Elmasry. Pairing heaps with costless meld. In Mark de Berg and Ulrich Meyer, editors, Algorithms – ESA 2010, volume 6347 of Lecture Notes in Computer Science, pages 183-193. Springer Berlin Heidelberg, 2010. URL: https://doi.org/10.1007/978-3-642-15781-3_16.
  9. Pedro F. Felzenszwalb and Daniel P. Huttenlocher. Distance transforms of sampled functions. Theory of computing, 8(1):415-428, 2012. Google Scholar
  10. Michael L. Fredman, Robert Sedgewick, Daniel D. Sleator, and Robert E. Tarjan. The pairing heap: A new form of self-adjusting heap. Algorithmica, 1(1-4):111-129, 1986. URL: https://doi.org/10.1007/BF01840439.
  11. Michael L. Fredman and Robert E. Tarjan. Fibonacci heaps and their uses in improved network optimization algorithms. Journal of the ACM (JACM), 34(3):596-615, 1987. Google Scholar
  12. Alexandre Gramfort, Bertrand Thirion, and Gaël Varoquaux. Identifying predictive regions from fmri with tv-l1 prior. In Pattern Recognition in Neuroimaging (PRNI), 2013 International Workshop on, pages 17-20. IEEE, 2013. Google Scholar
  13. Bernhard Haeupler, Siddhartha Sen, and Robert E. Tarjan. Rank-pairing heaps. SIAM Journal on Computing, 40(6):1463-1485, 2011. URL: https://doi.org/10.1137/100785351.
  14. Wenzel Jakob, Jason Rhinelander, and Dean Moldovan. pybind11 - Seamless operability between C++11 and Python, 2017. URL: https://github.com/pybind/pybind11.
  15. Nicholas Johnson. A dynamic programming algorithm for the fused lasso and L₀-segmentation. Journal of Computational and Graphical Statistics, 22(2):246-260, 2013. URL: https://doi.org/10.1080/10618600.2012.681238.
  16. Vladimir Kolmogorov, Thomas Pock, and Michal Rolinek. Total variation on a tree. SIAM J. Imaging Sciences, 9(2):605-636, 2016. Google Scholar
  17. Johannes Köster and Sven Rahmann. Snakemake—a scalable bioinformatics workflow engine. Bioinformatics, 28(19):2520-2522, 2012. Google Scholar
  18. Daniel H. Larkin, Siddhartha Sen, and Robert E. Tarjan. A back-to-basics empirical study of priority queues. In 2014 Proceedings of the Sixteenth Workshop on Algorithm Engineering and Experiments (ALENEX), pages 61-72. SIAM, 2014. Google Scholar
  19. Jure Leskovec and Andrej Krevl. SNAP Datasets: Stanford large network dataset collection, 2014. URL: http://snap.stanford.edu/data.
  20. Oscar H. M. Padilla, James Sharpnack, and James G Scott. The DFS fused lasso: Linear-time denoising over general graphs. The Journal of Machine Learning Research, 18(1):6410-6445, 2017. Google Scholar
  21. Heinz Prüfer. Neuer Beweis eines Satzes über Permutationen. Arch. Math. Phys., 27:742-744, 1918. Google Scholar
  22. Leonid I. Rudin. Images, numerical analysis of singularities and shock filters. Dissertation (Ph. D.), California Institute of Technology, 1987. Google Scholar
  23. Lawrence A. Shepp and Benjamin F. Logan. The Fourier reconstruction of a head section. IEEE Transactions on nuclear science, 21(3):21-43, 1974. Google Scholar
  24. John T. Stasko and Jeffrey S. Vitter. Pairing heaps: Experiments and analysis. Commun. ACM, 30(3):234-249, March 1987. URL: https://doi.org/10.1145/214748.214759.
  25. Wesley Tansey and Jeffrey G Scott. A fast and flexible algorithm for the graph-fused lasso, May 2015. URL: http://arxiv.org/abs/1505.06475.
  26. Wesley Tansey, Jesse Thomason, and James G. Scott. Maximum-variance total variation denoising for interpretable spatial smoothing. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, Louisiana, USA, February 2-7, 2018, 2018. Google Scholar
  27. Robert Tibshirani, Michael Saunders, Saharon Rosset, Ji Zhu, and Keith Knight. Sparsity and smoothness via the fused lasso. Journal of the Royal Statistics Society: Series B, 67(1):91-108, 2005. URL: https://doi.org/10.1111/j.1467-9868.2005.00490.x.
  28. Xiaodong Wang, Lei Wang, and Yingjie Wu. An optimal algorithm for Prufer codes. Journal of Software Engineering and Applications, 2(02):111, 2009. Google Scholar
  29. Álvaro Barbero and Suvrit Sra. Modular proximal optimization for multidimensional total-variation regularization, 2014. URL: http://arxiv.org/abs/1411.0589.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail