Engineering Fused Lasso Solvers on Trees

Kuthe, Elias; Rahmann, Sven

doi:10.4230/LIPIcs.SEA.2020.23

File

Subject Classification

ACM Subject Classification

Theory of computation → Mathematical optimization
Theory of computation → Dynamic programming
Mathematics of computing → Trees

Keywords

fused lasso
regularization
tree traversal
cache effects

Metrics

Access Statistics
Total Accesses (updated on a weekly basis)

0

Document

0

Metadata

Abstract

The graph fused lasso optimization problem seeks, for a given input signal y=(y_i) on nodes i∈ V of a graph G=(V,E), a reconstructed signal x=(x_i) that is both element-wise close to y in quadratic error and also has bounded total variation (sum of absolute differences across edges), thereby favoring regionally constant solutions. An important application is denoising of spatially correlated data, especially for medical images. Currently, fused lasso solvers for general graph input reduce the problem to an iteration over a series of "one-dimensional" problems (on paths or line graphs), which can be solved in linear time. Recently, a direct fused lasso algorithm for tree graphs has been presented, but no implementation of it appears to be available. We here present a simplified exact algorithm and additionally a fast approximation scheme for trees, together with engineered implementations for both. We empirically evaluate their performance on different kinds of trees with distinct degree distributions (simulated trees; spanning trees of road networks, grid graphs of images, social networks). The exact algorithm is very efficient on trees with low node degrees, which covers many naturally arising graphs, while the approximation scheme can perform better on trees with several higher-degree nodes when limiting the desired accuracy to values that are useful in practice.

Cite As Get BibTex

Elias Kuthe and Sven Rahmann. Engineering Fused Lasso Solvers on Trees. In 18th International Symposium on Experimental Algorithms (SEA 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 160, pp. 23:1-23:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020) https://doi.org/10.4230/LIPIcs.SEA.2020.23

Author Details

Elias Kuthe

Computer Science XI, TU Dortmund University, Germany

Sven Rahmann

Genome Informatics, Institute of Human Genetics, University of Duisburg-Essen, Essen, Germany

References

Taylor Arnold, Veeranjaneyulu Sadhanala, and Ryan J. Tibshirani. GLMGEN: Fast generalized lasso solver, 2019. URL: https://github.com/glmgen/glmgen.
David A. Bader, Henning Meyerhenke, Peter Sanders, and Dorothea Wagner, editors. Graph Partitioning and Graph Clustering-10th DIMACS Implementation Challenge Workshop, Georgia Institute of Technology, Atlanta, GA, USA, February 13-14, 2012, volume 588. American Mathematical Society, 2013.
Amir Beck and Marc Teboulle. Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. Image Processing, IEEE Transactions on, 18(11):2419-2434, November 2009. URL: https://doi.org/10.1109/TIP.2009.2028250.
Gerth S. Brodal. Worst-case efficient priority queues. In Proceedings of the Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '96, pages 52-58, Philadelphia, PA, USA, 1996. Society for Industrial and Applied Mathematics.
Gerth S. Brodal. A survey on priority queues. In Andrej Brodnik, Alejandro López-Ortiz, Venkatesh Raman, and Alfredo Viola, editors, Space-Efficient Data Structures, Streams, and Algorithms, volume 8066 of Lecture Notes in Computer Science, pages 150-163. Springer Berlin Heidelberg, 2013. URL: https://doi.org/10.1007/978-3-642-40273-9_11.
Laurent Condat. A direct algorithm for 1-d total variation denoising. Signal Processing Letters, IEEE, 20(11):1054-1057, November 2013. URL: https://doi.org/10.1109/LSP.2013.2278339.
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms. MIT Press, 3rd edition, 2009.
Amr Elmasry. Pairing heaps with costless meld. In Mark de Berg and Ulrich Meyer, editors, Algorithms – ESA 2010, volume 6347 of Lecture Notes in Computer Science, pages 183-193. Springer Berlin Heidelberg, 2010. URL: https://doi.org/10.1007/978-3-642-15781-3_16.
Pedro F. Felzenszwalb and Daniel P. Huttenlocher. Distance transforms of sampled functions. Theory of computing, 8(1):415-428, 2012.
Michael L. Fredman, Robert Sedgewick, Daniel D. Sleator, and Robert E. Tarjan. The pairing heap: A new form of self-adjusting heap. Algorithmica, 1(1-4):111-129, 1986. URL: https://doi.org/10.1007/BF01840439.
Michael L. Fredman and Robert E. Tarjan. Fibonacci heaps and their uses in improved network optimization algorithms. Journal of the ACM (JACM), 34(3):596-615, 1987.
Alexandre Gramfort, Bertrand Thirion, and Gaël Varoquaux. Identifying predictive regions from fmri with tv-l1 prior. In Pattern Recognition in Neuroimaging (PRNI), 2013 International Workshop on, pages 17-20. IEEE, 2013.
Bernhard Haeupler, Siddhartha Sen, and Robert E. Tarjan. Rank-pairing heaps. SIAM Journal on Computing, 40(6):1463-1485, 2011. URL: https://doi.org/10.1137/100785351.
Wenzel Jakob, Jason Rhinelander, and Dean Moldovan. pybind11 - Seamless operability between C++11 and Python, 2017. URL: https://github.com/pybind/pybind11.
Nicholas Johnson. A dynamic programming algorithm for the fused lasso and L₀-segmentation. Journal of Computational and Graphical Statistics, 22(2):246-260, 2013. URL: https://doi.org/10.1080/10618600.2012.681238.
Vladimir Kolmogorov, Thomas Pock, and Michal Rolinek. Total variation on a tree. SIAM J. Imaging Sciences, 9(2):605-636, 2016.
Johannes Köster and Sven Rahmann. Snakemake—a scalable bioinformatics workflow engine. Bioinformatics, 28(19):2520-2522, 2012.
Daniel H. Larkin, Siddhartha Sen, and Robert E. Tarjan. A back-to-basics empirical study of priority queues. In 2014 Proceedings of the Sixteenth Workshop on Algorithm Engineering and Experiments (ALENEX), pages 61-72. SIAM, 2014.
Jure Leskovec and Andrej Krevl. SNAP Datasets: Stanford large network dataset collection, 2014. URL: http://snap.stanford.edu/data.
Oscar H. M. Padilla, James Sharpnack, and James G Scott. The DFS fused lasso: Linear-time denoising over general graphs. The Journal of Machine Learning Research, 18(1):6410-6445, 2017.
Heinz Prüfer. Neuer Beweis eines Satzes über Permutationen. Arch. Math. Phys., 27:742-744, 1918.
Leonid I. Rudin. Images, numerical analysis of singularities and shock filters. Dissertation (Ph. D.), California Institute of Technology, 1987.
Lawrence A. Shepp and Benjamin F. Logan. The Fourier reconstruction of a head section. IEEE Transactions on nuclear science, 21(3):21-43, 1974.
John T. Stasko and Jeffrey S. Vitter. Pairing heaps: Experiments and analysis. Commun. ACM, 30(3):234-249, March 1987. URL: https://doi.org/10.1145/214748.214759.
Wesley Tansey and Jeffrey G Scott. A fast and flexible algorithm for the graph-fused lasso, May 2015. URL: http://arxiv.org/abs/1505.06475.
Wesley Tansey, Jesse Thomason, and James G. Scott. Maximum-variance total variation denoising for interpretable spatial smoothing. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, Louisiana, USA, February 2-7, 2018, 2018.
Robert Tibshirani, Michael Saunders, Saharon Rosset, Ji Zhu, and Keith Knight. Sparsity and smoothness via the fused lasso. Journal of the Royal Statistics Society: Series B, 67(1):91-108, 2005. URL: https://doi.org/10.1111/j.1467-9868.2005.00490.x.
Xiaodong Wang, Lei Wang, and Yingjie Wu. An optimal algorithm for Prufer codes. Journal of Software Engineering and Applications, 2(02):111, 2009.
Álvaro Barbero and Suvrit Sra. Modular proximal optimization for multidimensional total-variation regularization, 2014. URL: http://arxiv.org/abs/1411.0589.

Engineering Fused Lasso Solvers on Trees

Authors Elias Kuthe , Sven Rahmann

File

Document Identifiers

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

References

Thanks for your feedback!

Could not send message

Engineering Fused Lasso Solvers on Trees

Authors Elias Kuthe , Sven Rahmann

File

Document Identifiers

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

Funding

Supplementary Materials

References

Thanks for your feedback!

Could not send message