Jointly Embedding Multiple Single-Cell Omics Measurements

Authors Jie Liu, Yuanhao Huang, Ritambhara Singh, Jean-Philippe Vert, William Stafford Noble




File

LIPIcs.WABI.2019.10.pdf
  • Filesize: 2.86 MB
  • 13 pages


Author Details

Jie Liu
  • Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
Yuanhao Huang
  • Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
Ritambhara Singh
  • Department of Genome Sciences, University of Washington, Seattle, WA, USA
Jean-Philippe Vert
  • Google Brain, Paris, France
  • Centre for Computational Biology, MINES ParisTech, PSL University, Paris, France
William Stafford Noble
  • Department of Genome Sciences, University of Washington, Seattle, WA, USA
  • Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA

Cite As

Jie Liu, Yuanhao Huang, Ritambhara Singh, Jean-Philippe Vert, and William Stafford Noble. Jointly Embedding Multiple Single-Cell Omics Measurements. In 19th International Workshop on Algorithms in Bioinformatics (WABI 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 143, pp. 10:1-10:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.WABI.2019.10

Abstract

Many single-cell sequencing technologies are now available, but it is still difficult to apply multiple sequencing technologies to the same single cell. In this paper, we propose an unsupervised manifold alignment algorithm, MMD-MA, for integrating multiple measurements carried out on disjoint aliquots of a given population of cells. Effectively, MMD-MA performs an in silico co-assay by embedding cells measured in different ways into a learned latent space. In the MMD-MA algorithm, single-cell data points from multiple domains are aligned by optimizing an objective function with three components: (1) a maximum mean discrepancy (MMD) term to encourage the differently measured points to have similar distributions in the latent space, (2) a distortion term to preserve the structure of the data between the input space and the latent space, and (3) a penalty term to avoid collapse to a trivial solution. Notably, MMD-MA does not require any correspondence information across data modalities, either between the cells or between the features. Furthermore, MMD-MA’s weak distributional requirements for the domains to be aligned allow the algorithm to integrate heterogeneous types of single-cell measurements, such as gene expression, DNA accessibility, chromatin organization, methylation, and imaging data. We demonstrate the utility of MMD-MA in simulation experiments and on a real data set involving single-cell gene expression and methylation data.
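As a rough illustration of the three-component objective described in the abstract, the following Python sketch evaluates such an objective for two domains. Only the MMD term follows a standard definition (the biased kernel two-sample estimate with a Gaussian kernel). Everything else is an assumption made for illustration and is not taken from the abstract: the parameterization of each embedding as `K @ alpha`, the concrete forms of `distortion` and `penalty`, the linear input kernels, the weights `lam_dis` and `lam_pen`, and all function names.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def mmd_squared(z1, z2, sigma=1.0):
    """Biased estimate of the squared MMD between two embedded point sets."""
    return (
        gaussian_kernel(z1, z1, sigma).mean()
        + gaussian_kernel(z2, z2, sigma).mean()
        - 2.0 * gaussian_kernel(z1, z2, sigma).mean()
    )


def distortion(k, alpha):
    """ASSUMED form: gap between the input kernel and the latent Gram matrix."""
    z = k @ alpha
    return np.linalg.norm(k - z @ z.T) ** 2


def penalty(k, alpha):
    """ASSUMED form: pushes latent dimensions toward orthonormality,
    which discourages collapsing every cell onto a single point."""
    p = alpha.shape[1]
    return np.linalg.norm(alpha.T @ k @ alpha - np.eye(p)) ** 2


def objective(k1, k2, alpha1, alpha2, lam_dis=1e-3, lam_pen=1e-3, sigma=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    z1, z2 = k1 @ alpha1, k2 @ alpha2  # latent coordinates of each domain
    return (
        mmd_squared(z1, z2, sigma)
        + lam_dis * (distortion(k1, alpha1) + distortion(k2, alpha2))
        + lam_pen * (penalty(k1, alpha1) + penalty(k2, alpha2))
    )


# Toy usage: two unpaired domains with different cell and feature counts.
rng = np.random.default_rng(0)
x1 = rng.normal(size=(40, 100))  # e.g. 40 cells, 100 features
x2 = rng.normal(size=(30, 60))   # e.g. 30 cells, 60 features
k1, k2 = x1 @ x1.T, x2 @ x2.T    # linear kernels (an assumption)
alpha1 = rng.normal(scale=1e-2, size=(40, 2))
alpha2 = rng.normal(scale=1e-2, size=(30, 2))
print(objective(k1, k2, alpha1, alpha2))
```

In this sketch the two domains never share cells or features; they interact only through the MMD term on the latent coordinates, which is consistent with the abstract's statement that no correspondence information is required. In practice the coefficient matrices would be optimized by a gradient-based method rather than evaluated at random values as above.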

Subject Classification

ACM Subject Classification
  • Applied computing → Computational biology
  • Computing methodologies → Dimensionality reduction and manifold learning
  • Computing methodologies → Unsupervised learning
  • Computing methodologies → Machine learning algorithms
Keywords
  • Manifold alignment
  • single-cell sequencing

