PEDroid: Automatically Extracting Patches from Android App Updates

Li, Hehao; Wang, Yizhuo; Zhang, Yiwei; Li, Juanru; Gu, Dawu

doi:10.4230/LIPIcs.ECOOP.2022.21

Abstract

Identifying and analyzing code patches is a common practice that not only helps developers understand existing bugs but also helps them find and fix similar bugs in new projects. Most patch analysis techniques target open-source projects, in which source code differences are easy to identify and extra information such as commit logs can be leveraged to find and locate patches. The task becomes challenging, however, when both source code and development logs are unavailable. A typical scenario is discovering patches in an updated Android app, which requires bytecode-level analysis. In this paper, we propose an approach that automatically identifies and extracts patches from updated Android apps by comparing each updated version with its predecessor. Given two Android apps (original and updated versions), our approach first identifies identical and modified methods through similarity comparison based on code features and app structure. It then compares each modified method with its implementation in the original app and determines whether a patch has been applied by analyzing the difference in internal semantics. We implemented PEDroid, a prototype patch extraction tool for Android apps, and evaluated it on a set of popular open-source apps and a set of real-world apps from different Android vendors. PEDroid identifies 28 of the 36 known patches in the former and successfully analyzes 568 real-world app updates in the latter, 94.37% of which are completed within 20 minutes.
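The first step described above, pairing methods across two versions and separating identical ones from modified ones, can be illustrated with a minimal sketch. This is not PEDroid's implementation: the representation of each method as a plain list of opcode names, the greedy best-match strategy, the `threshold` value, and the function names `similarity` and `match_methods` are all assumptions made for illustration. Methods are deliberately matched by body similarity rather than by name, since names may change between releases (for example, under obfuscation).

```python
from difflib import SequenceMatcher


def similarity(a, b):
    # Ratio in [0, 1] over two opcode sequences; 1.0 means the sequences are equal.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def match_methods(old_app, new_app, threshold=0.6):
    """Greedily pair each old method with its most similar unmatched new method.

    old_app / new_app: dict mapping a method name to a list of opcode names
    (a hypothetical, simplified stand-in for real bytecode features).
    """
    identical, modified = [], []
    unmatched_new = dict(new_app)
    for old_sig, old_body in old_app.items():
        best_sig, best_score = None, 0.0
        for new_sig, new_body in unmatched_new.items():
            score = similarity(old_body, new_body)
            if score > best_score:
                best_sig, best_score = new_sig, score
        if best_sig is None or best_score < threshold:
            continue  # no plausible counterpart in the updated version
        del unmatched_new[best_sig]
        if best_score == 1.0:
            identical.append((old_sig, best_sig))
        else:
            modified.append((old_sig, best_sig, round(best_score, 2)))
    return identical, modified


# Toy input: method names differ between versions, bodies mostly do not.
old = {
    "a.A.check": ["const", "invoke", "return"],
    "a.A.load": ["new", "invoke", "move", "return"],
}
new = {
    "b.B.x": ["const", "invoke", "return"],
    "b.B.y": ["new", "invoke", "if-eqz", "invoke", "move", "return"],
}
identical, modified = match_methods(old, new)
print(identical)
print(modified)
```

Only the pairs reported as modified would be passed on to the second step, the semantic comparison that decides whether a change is actually a patch; that step is not covered by this sketch.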

