A Decade of Mal-Activity Reporting: A Retrospective Analysis of Internet Malicious Activity Blacklists
Published in Asia Computer and Communication Security Conference (CCS), 2019
Abstract: This paper focuses on reporting of Internet malicious activity (or mal-activity in short) by public blacklists with the objective of providing a systematic characterization of what has been reported over the years, and more importantly, the evolution of reported activities. Using an initial seed of 22 blacklists, covering the period from January 2007 to June 2017, we collect more than 51 million mal-activity reports involving 662K unique IP addresses worldwide. Leveraging the Wayback Machine, antivirus (AV) tool reports and several additional public datasets (e.g., BGP Route Views and Internet registries) we enrich the data with historical meta-information including geo-locations (countries), autonomous system (AS) numbers and types of mal-activity. Furthermore, we use the initially labelled dataset of approx. 1.57 million mal-activities (obtained from public blacklists) to train a machine learning classifier to classify the remaining unlabeled dataset of approx. 44 million mal-activities obtained through additional sources. We make our unique collected dataset (and scripts used) publicly available for further research.
Recommended citation: Benjamin Zi Hao Zhao, Muhammad Ikram, Hassan Asghar, Mohamed Ali Kaafar, Abdelberi Chaabane, and Kanchana Thilakarathna, "A Decade of Mal-Activity Reporting: A Retrospective Analysis of Internet Malicious Activity Blacklists", In Asia Computer and Communication Security Conference (CCS), 2019. https://internetmaliciousactivity.github.io