A Hybrid Approach for Android Malware Detection and Family Classification.

Authors

M. Dhalaria and E. Gandotra

DOI:

https://doi.org/10.9781/ijimai.2020.09.001

Keywords:

Android, Malware, Machine Learning

Abstract

With the growing popularity of mobile devices, malicious applications targeting the Android platform have increased sharply. Malware is now crafted so carefully that it has become very difficult to identify, and the volume of new malware appearing every day has made manual approaches inadequate for detecting it. New malware is characterized by sophisticated and complex obfuscation techniques, so static malware analysis alone is not enough to detect it. Dynamic malware analysis can cope with evasion techniques, but it cannot explore all execution paths and is very time consuming. For better detection and classification of Android malware, we therefore propose a hybrid approach that integrates the features obtained from static and dynamic malware analysis. This approach tackles the problem of analyzing, detecting and classifying Android malware more efficiently. In this paper, we use a robust set of features from static and dynamic malware analysis to create two datasets, one for binary classification and one for multiclass (family) classification. These datasets are publicly available on GitHub and Kaggle, with the aim of helping researchers and anti-malware tool creators enhance existing techniques and tools, or develop new ones, for detecting and classifying Android malware. Various machine learning algorithms are employed to detect and classify malware using the features extracted from static and dynamic malware analysis. The experimental results indicate that the hybrid approach improves the accuracy of Android malware detection and classification compared with using static or dynamic features alone.
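
The idea of integrating static and dynamic features can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the feature names, the toy samples, and the 1-nearest-neighbour classifier are hypothetical stand-ins, not the authors' feature set, datasets, or algorithms. The sketch only shows the general pattern of concatenating a static feature vector and a dynamic feature vector into one hybrid vector before classification.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical feature names; the abstract does not list the actual features.
STATIC_FEATURES = ["perm_SEND_SMS", "perm_INTERNET", "perm_READ_CONTACTS"]
DYNAMIC_FEATURES = ["syscall_sendto", "syscall_open", "syscall_ptrace"]


def to_vector(observed: Dict[str, int], names: Sequence[str]) -> List[int]:
    """Map a sparse {feature: value} dict onto a fixed-order vector (missing -> 0)."""
    return [observed.get(name, 0) for name in names]


def hybrid_vector(static: Dict[str, int], dynamic: Dict[str, int]) -> List[int]:
    """Concatenate the static and dynamic vectors into one hybrid feature vector."""
    return to_vector(static, STATIC_FEATURES) + to_vector(dynamic, DYNAMIC_FEATURES)


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions where two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def predict_1nn(train: List[Tuple[List[int], str]], query: List[int]) -> str:
    """Return the label of the training vector closest to the query."""
    return min(train, key=lambda item: hamming(item[0], query))[1]


# Toy, made-up samples for illustration only.
train = [
    (hybrid_vector({"perm_SEND_SMS": 1, "perm_INTERNET": 1},
                   {"syscall_sendto": 1, "syscall_ptrace": 1}), "malware"),
    (hybrid_vector({"perm_INTERNET": 1},
                   {"syscall_open": 1}), "benign"),
]

query = hybrid_vector({"perm_SEND_SMS": 1, "perm_INTERNET": 1},
                      {"syscall_sendto": 1})
label = predict_1nn(train, query)
print(query, label)
```

The same hybrid vectors could be passed to any classifier. For family classification, the labels would be family names rather than a binary malware/benign label.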

Published

2021-06-01

How to Cite

Dhalaria, M. and Gandotra, E. (2021). A Hybrid Approach for Android Malware Detection and Family Classification. International Journal of Interactive Multimedia and Artificial Intelligence, 6(6), 174–188. https://doi.org/10.9781/ijimai.2020.09.001