Distance Estimation Between Unknown Matrices Using Sublinear Projections on Hamming Cube

Authors Arijit Bishnu, Arijit Ghosh, Gopinath Mishra



PDF
Thumbnail PDF

File

LIPIcs.APPROX-RANDOM.2021.44.pdf
  • Filesize: 0.99 MB
  • 22 pages

Document Identifiers

Author Details

Arijit Bishnu
  • Indian Statistical Institute, Kolkata, India
Arijit Ghosh
  • Indian Statistical Institute, Kolkata, India
Gopinath Mishra
  • Indian Statistical Institute, Kolkata, India

Acknowledgements

The authors wish to thank their colleague Ansuman Banerjee for helpful discussions on GPU architecture and CUDA.

Cite AsGet BibTex

Arijit Bishnu, Arijit Ghosh, and Gopinath Mishra. Distance Estimation Between Unknown Matrices Using Sublinear Projections on Hamming Cube. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 44:1-44:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.APPROX/RANDOM.2021.44

Abstract

Using geometric techniques like projection and dimensionality reduction, we show that there exists a randomized sub-linear time algorithm that can estimate the Hamming distance between two matrices. Consider two matrices A and B of size n × n whose dimensions are known to the algorithm but the entries are not. The entries of the matrix are real numbers. The access to any matrix is through an oracle that computes the projection of a row (or a column) of the matrix on a vector in {0,1}ⁿ. We call this query oracle to be an Inner Product oracle (shortened as IP). We show that our algorithm returns a (1± ε) approximation to {D}_M (A,B) with high probability by making O(n/(√{{D)_M (A,B)}}poly(log n, 1/(ε))) oracle queries, where {D}_M (A,B) denotes the Hamming distance (the number of corresponding entries in which A and B differ) between two matrices A and B of size n × n. We also show a matching lower bound on the number of such IP queries needed. Though our main result is on estimating {D}_M (A,B) using IP, we also compare our results with other query models.

Subject Classification

ACM Subject Classification
  • Theory of computation → Streaming, sublinear and near linear time algorithms
Keywords
  • Distance estimation
  • Property testing
  • Dimensionality reduction
  • Sub-linear algorithms

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. URL: https://www.mathworks.com/products/matlab.html.
  2. URL: https://www.mathworks.com/help/stats/pdist.html.
  3. URL: https://docs.scipy.org/doc/scipy-0.7.x/scipy-ref.pdf.
  4. URL: https://developer.download.nvidia.com/cg/dot.html.
  5. URL: https://software.intel.com/content/www/us/en/develop/documentation/cpp-compiler-developer-guide-and-reference/top/compiler-reference/intrinsics/intrinsics-for-intel-streaming-simd-extensions-4-intel-sse4/vectorizing-compiler-and-media-accelerators/floating-point-dot-product-intrinsics.html.
  6. Dimitris Achlioptas. Database-friendly random projections: Johnson-lindenstrauss with binary coins. J. Comput. Syst. Sci., 66(4):671-687, 2003. URL: https://doi.org/10.1016/S0022-0000(03)00025-4.
  7. Pankaj K. Agarwal, Rinat Ben Avraham, Haim Kaplan, and Micha Sharir. Computing the discrete fréchet distance in subquadratic time. SIAM J. Comput., 43(2):429-449, 2014. URL: https://doi.org/10.1137/130920526.
  8. Pankaj K. Agarwal, Kyle Fox, Abhinandan Nath, Anastasios Sidiropoulos, and Yusu Wang. Computing the gromov-hausdorff distance for metric trees. ACM Trans. Algorithms, 14(2), 2018. URL: https://doi.org/10.1145/3185466.
  9. Pankaj K. Agarwal, Sariel Har-Peled, Micha Sharir, and Yusu Wang. Hausdorff distance under translation for points and balls. 6(4), 2010. URL: https://doi.org/10.1145/1824777.1824791.
  10. Piotr Berman, Meiram Murzabulatov, and Sofya Raskhodnikova. Testing convexity of figures under the uniform distribution. In Sándor P. Fekete and Anna Lubiw, editors, 32nd International Symposium on Computational Geometry, SoCG 2016, June 14-18, 2016, Boston, MA, USA, volume 51 of LIPIcs, pages 17:1-17:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2016. URL: https://doi.org/10.4230/LIPIcs.SoCG.2016.17.
  11. Piotr Berman, Meiram Murzabulatov, and Sofya Raskhodnikova. Tolerant testers of image properties. In Ioannis Chatzigiannakis, Michael Mitzenmacher, Yuval Rabani, and Davide Sangiorgi, editors, 43rd International Colloquium on Automata, Languages, and Programming, ICALP 2016, July 11-15, 2016, Rome, Italy, volume 55 of LIPIcs, pages 90:1-90:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2016. URL: https://doi.org/10.4230/LIPIcs.ICALP.2016.90.
  12. Piotr Berman, Meiram Murzabulatov, and Sofya Raskhodnikova. Testing convexity of figures under the uniform distribution. Random Struct. Algorithms, 54(3):413-443, 2019. URL: https://doi.org/10.1002/rsa.20797.
  13. Arijit Bishnu, Arijit Ghosh, and Gopinath Mishra. Distance estimation between unknown matrices using sublinear projections on hamming cube. CoRR, abs/2107.02666, 2021. URL: http://arxiv.org/abs/2107.02666.
  14. Arijit Bishnu, Arijit Ghosh, Gopinath Mishra, and Manaswi Paraashar. Inner product oracle can estimate and sample. CoRR, abs/1906.07398, 2019. URL: http://arxiv.org/abs/1906.07398.
  15. Bernard Chazelle, Ding Liu, and Avner Magen. Sublinear geometric algorithms. SIAM J. Comput., 35(3):627-646, 2005. URL: https://doi.org/10.1137/S009753970444572X.
  16. Artur Czumaj, Funda Ergün, Lance Fortnow, Avner Magen, Ilan Newman, Ronitt Rubinfeld, and Christian Sohler. Sublinear-time approximation of euclidean minimum spanning tree. In Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, January 12-14, 2003, Baltimore, Maryland, USA, pages 813-822. ACM/SIAM, 2003. URL: http://dl.acm.org/citation.cfm?id=644108.644242.
  17. Artur Czumaj and Christian Sohler. Property testing with geometric queries. In Friedhelm Meyer auf der Heide, editor, Algorithms - ESA 2001, 9th Annual European Symposium, Aarhus, Denmark, August 28-31, 2001, Proceedings, volume 2161 of Lecture Notes in Computer Science, pages 266-277. Springer, 2001. URL: https://doi.org/10.1007/3-540-44676-1_22.
  18. Artur Czumaj, Christian Sohler, and Martin Ziegler. Property testing in computational geometry. In Mike Paterson, editor, Algorithms - ESA 2000, 8th Annual European Symposium, Saarbrücken, Germany, September 5-8, 2000, Proceedings, volume 1879 of Lecture Notes in Computer Science, pages 155-166. Springer, 2000. URL: https://doi.org/10.1007/3-540-45253-2_15.
  19. Anne Driemel and Sariel Har-Peled. Jaywalking your dog: Computing the fréchet distance with shortcuts. SIAM J. Comput., 42(5):1830-1866, 2013. URL: https://doi.org/10.1137/120865112.
  20. Anne Driemel, Sariel Har-Peled, and Carola Wenk. Approximating the fréchet distance for realistic curves in near linear time. Discret. Comput. Geom., 48(1):94-127, 2012. URL: https://doi.org/10.1007/s00454-012-9402-z.
  21. D. Dubhashi and A. Panconesi. Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, 1st edition, 2009. Google Scholar
  22. Oded Goldreich. Introduction to Property Testing. Cambridge University Press, 2017. URL: https://doi.org/10.1017/9781108135252.
  23. John L. Hennessy and David A. Patterson. Computer Architecture - A Quantitative Approach (5. ed.). Morgan Kaufmann, 2012. Google Scholar
  24. Igor Kleiner, Daniel Keren, Ilan Newman, and Oren Ben-Zwi. Applying property testing to an image partitioning problem. IEEE Trans. Pattern Anal. Mach. Intell., 33(2):256-265, 2011. URL: https://doi.org/10.1109/TPAMI.2010.165.
  25. Eyal Kushilevitz and Noam Nisan. Communication complexity. Cambridge University Press, 1997. Google Scholar
  26. Sofya Raskhodnikova. Approximate testing of visual properties. In Sanjeev Arora, Klaus Jansen, José D. P. Rolim, and Amit Sahai, editors, Approximation, Randomization, and Combinatorial Optimization: Algorithms and Techniques, 6th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2003 and 7th International Workshop on Randomization and Approximation Techniques in Computer Science, RANDOM 2003, Princeton, NJ, USA, August 24-26, 2003, Proceedings, volume 2764 of Lecture Notes in Computer Science, pages 370-381. Springer, 2003. URL: https://doi.org/10.1007/978-3-540-45198-3_31.
  27. Dana Ron and Gilad Tsur. Testing properties of sparse images. ACM Trans. Algorithms, 10(4):17:1-17:52, 2014. URL: https://doi.org/10.1145/2635806.
  28. Jason Sanders and Edward Kandrot. CUDA by Example: An Introduction to General-Purpose GPU Programming. Addison-Wesley, Upper Saddle River, NJ, 2010. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail