Frequent Sequential Attack Patterns of Malware in Botnets Nur Rohman Rosyid.

Slides:



Advertisements
Similar presentations
PREFIXSPAN ALGORITHM Mining Sequential Patterns Efficiently by Prefix- Projected Pattern Growth
Advertisements

Rule Discovery from Time Series Presented by: Murali K. Kadimi.
Sequential Patterns & Process Mining Current State of Research Edgar de Graaf LIACS.
Zhou Zhao, Da Yan and Wilfred Ng
Honeypot 서울과학기술대학교 Jeilyn Molina Honeypot is the software or set of computers that are intended to attract attackers, pretending to be weak.
Edi Winarko, John F. Roddick
Rakesh Agrawal Ramakrishnan Srikant
Our New Progress on Frequent/Sequential Pattern Mining We develop new frequent/sequential pattern mining methods Performance study on both synthetic and.
Multi-dimensional Sequential Pattern Mining
Sequential Pattern Mining
Temporal Pattern Matching of Moving Objects for Location-Based Service GDM Ronald Treur14 October 2003.
Temporal Data Mining Claudio Bettini, X.Sean Wang and Sushil Jajodia Presented by Zhuang Liu.
Association Rule Mining (Some material adapted from: Mining Sequential Patterns by Karuna Pande Joshi)‏
2/8/00CSE 711 data mining: Apriori Algorithm by S. Cha 1 CSE 711 Seminar on Data Mining: Apriori Algorithm By Sung-Hyuk Cha.
School of Computer Science and Information Systems
Mining Association Rules between Sets of Items in Large Databases presented by Zhuang Wang.
A Short Introduction to Sequential Data Mining
Alert Correlation for Extracting Attack Strategies Authors: B. Zhu and A. A. Ghorbani Source: IJNS review paper Reporter: Chun-Ta Li ( 李俊達 )
What Is Sequential Pattern Mining?
Association Rules. 2 Customer buying habits by finding associations and correlations between the different items that customers place in their “shopping.
Fast Portscan Detection Using Sequential Hypothesis Testing Authors: Jaeyeon Jung, Vern Paxson, Arthur W. Berger, and Hari Balakrishnan Publication: IEEE.
Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University Investigation.
October 2, 2015 Data Mining: Concepts and Techniques 1 Data Mining: Concepts and Techniques — Chapter 8 — 8.3 Mining sequence patterns in transactional.
Efficient Data Mining for Calling Path Patterns in GSM Networks Information Systems, accepted 5 December 2002 SPEAKER: YAO-TE WANG ( 王耀德 )
Mining Multidimensional Sequential Patterns over Data Streams Chedy Raїssi and Marc Plantevit DaWak_2008.
Botnet behavior and detection October RONOG Silviu Sofronie – a Head of Forensics.
Mining Sequential Patterns Rakesh Agrawal Ramakrishnan Srikant Proc. of the Int ’ l Conference on Data Engineering (ICDE) March 1995 Presenter: Sam Brown.
Trojan Virus By Forbes and Mark. What is a Trojan virus Trojans are malicious programs that perform actions that have not been authorised by the user.
Course on Data Mining: Seminar Meetings Page 1/17 Course on Data Mining ( ): Seminar Meetings Ass. Rules EpisodesEpisodes Text Mining
Optimizing multi-pattern searches for compressed suffix arrays Kalle Karhu Department of Computer Science and Engineering Aalto University, School of Science,
Pattern-Growth Methods for Sequential Pattern Mining Iris Zhang
Sequential Pattern Mining
Automatically Generating Models for Botnet Detection Presenter: 葉倚任 Authors: Peter Wurzinger, Leyla Bilge, Thorsten Holz, Jan Goebel, Christopher Kruegel,
IEEE Communications Surveys & Tutorials 1st Quarter 2008.
Mining various kinds of Association Rules
Mining Frequent Patterns in unaligned Protein Sequences (An Implementation of The Teiresias Algorithm) Fei Shao 1.
Jian Pei Jiawei Han Behzad Mortazavi-Asl Helen Pinto ICDE’01
BotGraph: Large Scale Spamming Botnet Detection Yao Zhao, Yinglian Xie, Fang Yu, Qifa Ke, Yuan Yu, Yan Chen, and Eliot Gillum Speaker: 林佳宜.
Generalized Sequential Pattern Mining with Item Intervals Yu Hirate Hayato Yamana PAKDD2006.
Mining Approximate Frequent Itemsets in the Presence of Noise By- J. Liu, S. Paulsen, X. Sun, W. Wang, A. Nobel and J. Prins Presentation by- Apurv Awasthi.
Automating Analysis of Large-Scale Botnet Probing Events Zhichun Li, Anup Goyal, Yan Chen and Vern Paxson* Lab for Internet and Security Technology (LIST)
Exploiting Temporal Persistence to Detect Covert Botnet Channels Authors: Frederic Giroire, Jaideep Chandrashekar, Nina Taft… RAID 2009 Reporter: Jing.
CloSpan: Mining Closed Sequential Patterns in Large Datasets Xifeng Yan, Jiawei Han and Ramin Afshar Proceedings of 2003 SIAM International Conference.
Analysing Clickstream Data: From Anomaly Detection to Visitor Profiling Peter I. Hofgesang Wojtek Kowalczyk ECML/PKDD Discovery.
Kazuya Kuwabara, Hiroaki Kikuchi, Tokai University Masato Terada and Masashi Fujiwara, Hitachi Ltd.,
Temporal Database Paper Reading R 資工碩一 馬智釗 Efficient Mining Strategy for Frequent Serial Episodes in Temporal Database, K Huang, C Chang.
Web Mining Issues Size Size –>350 million pages –Grows at about 1 million pages a day Diverse types of data Diverse types of data.
Intelligent DataBase System Lab, NCKU, Taiwan Josh Jia-Ching Ying 1, Wang-Chien Lee 2, Tz-Chiao Weng 1 and Vincent S. Tseng 1 1 Department of Computer.
Masayuki Ohrui, Hiroaki Kikuchi, Tokai University Masato Terada, Hitachi Ltd.
Introduction to Data Mining by Yen-Hsien Lee Department of Information Management College of Management National Sun Yat-Sen University March 4, 2003.
Predicting the Location and Time of Mobile Phone Users by Using Sequential Pattern Mining Techniques Mert Özer, Ilkcan Keles, Ismail Hakki Toroslu, Pinar.
Special Topics in Educational Data Mining HUDK5199 Spring, 2013 April 24, 2012.
1 Mining Episode Rules in STULONG dataset N. Méger 1, C. Leschi 1, N. Lucas 2 & C. Rigotti 1 1 INSA Lyon - LIRIS FRE CNRS Université d’Orsay – LRI.
Mining Progressive Confident Rules M. Zhang, W. Hsu and M.L. Lee Int'l Conf on Data Mining (ICDM),2006 IEEE Advisor : Jia-Ling Koh Speaker : Tsui-Feng.
An Energy-Efficient Approach for Real-Time Tracking of Moving Objects in Multi-Level Sensor Networks Vincent S. Tseng, Eric H. C. Lu, & Kawuu W. Lin Institute.
PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth Jiawei Han, Jian Pei, Helen Pinto, Behzad Mortazavi-Asl, Qiming Chen,
© Prentice Hall1 DATA MINING Web Mining Margaret H. Dunham Department of Computer Science and Engineering Southern Methodist University Companion slides.
Fuzzy data mining for interesting generalized association rules Source : Fuzzy Sets and Systems ; Vol.138, No. 2, 2003, pp Author : Tzung-Pei,
REU 2009-Traffic Analysis of IP Networks Daniel S. Allen, Mentor: Dr. Rahul Tripathi Department of Computer Science & Engineering Data Streams Data streams.
Apriori–PrefixSpan Hybrid Approach for Automated Detection of Botnet Coordinated Attacks Masayuki Ohrui, Hiroaki Kikuchi, Tokai University Masato Terada,
Mining Dependent Patterns
Sequential Pattern Mining Using A Bitmap Representation
Association Rules.
Mining Access Pattrens Efficiently from Web Logs Jian Pei, Jiawei Han, Behzad Mortazavi-asl, and Hua Zhu 2000년 5월 26일 DE Lab. 윤지영.
A Fast Algorithm for Subspace Clustering by Pattern Similarity
Data Warehousing Mining & BI
Mining Sequential Patterns
Discovering Frequent Poly-Regions in DNA Sequences
Presentation transcript:

Frequent Sequential Attack Patterns of Malware in Botnets Nur Rohman Rosyid

Kikuchi Laboratory Tokai University Outlines 1. Botnet attack 2. PrefixSpan method 3. Results and Analysis 4. Conclusion

Kikuchi Laboratory Tokai University Botnet

Honeypots Kikuchi Laboratory Tokai University  CCC DATA set 2009 consist of the access log of attack to 94 honeypots in 1 year (may 1, 2008 – April ).  This research observes one of honeypot runs on Windows XP+SP1

Kikuchi Laboratory Tokai University Coordinated attack TROJ_QHOST.WT BKDR_POEBOT.AHP PE_VIRUT.AV TSPY_ONLINEG.OPJ TSPY_KOLABC.CHTROJ_AGENT.AGSB

Sequential pattern  It is difficult to find sequential pattern of attacks in the access log of attacks manually. Kikuchi Laboratory Tokai University Sequence_idSequence

Objective  Discover the frequent sequential attack pattern on CCC DATA set 2009 Kikuchi Laboratory Tokai University

Method  PrefixSpan data mining algorithm 1 to discover the frequent sub-sequences as patterns in a sequence database.  Example: Given a sequence database and minimum support threshold 2 Kikuchi Laboratory Tokai University 1) J. Pei, et al., ``PrefixSpan: Mining Sequential Patterns by Prefix-Projected Growth'‘, in Proc. of The 17th Int'l Conf. on Data Engineering, pp , Sequence_idSequence

Method (count) Kikuchi Laboratory Tokai University Seq. Database Projected Database Sequential Patterns :5 :2 :4 :5 :5, :5, :5, :2, and :2

Method (count) Kikuchi Laboratory Tokai University Seq. Database Projected Database Sequential Patterns :5 :2 :4 :5 :5, :5, :5, :2, and :2

Method (count) Sequential Patterns :5, :5, :2, :4 :2 :4 :2 Kikuchi Laboratory Tokai University

Pre-Processing Data Kikuchi Laboratory Tokai University

sequential 2-Pattern of malware attack Kikuchi Laboratory Tokai University length serial

sequential 3-Pattern of malware attack Kikuchi Laboratory Tokai University

Distribution of attacks of duplicate 3-pattern Kikuchi Laboratory Tokai University

Distribution of attacks of non-duplicate 3-pattern Kikuchi Laboratory Tokai University (P3.4) TROJ_QHOST.WT, WORM_HAMWEQ.AP, BKDR_POEBOT.AHP (P3.29) TSPY_ONLINEG.OPJ, TROJ_QHOST.WT, BKDR_POEBOT.AHP (P3.30) BKDR_RBOT.CZO, WORM_HAMWEQ.AP, TROJ_QHOST.WT A (P3.21) PE_VIRUT.AV BKDR_SDBOT.BU BKDR_VANBOT.HI (P3.27) BKDR_SCRYPT.ZHB BKDR_SDBOT.BU BKDR_VANBOT.HI (P3.49) BKDR_SCRYPT.ZHB PE_VIRUT.AV BKDR_SDBOT.BU B (P3.7) PE_VIRUT.AV WORM_SWTYMLAI.CD TSPY_KOLABC.CH (P3.10) PE_VIRUT.AV TSPY_KOLABC.CH WORM_SWTYMLAI.CD c (P3.37) PE_VIRUT.AV TSPY_KOLABC.CH TROJ_AGENT.AGSB D 20 days25 days26 days8 days

Distribution of time interval of the 3-pattern Kikuchi Laboratory Tokai University Time interval is a time difference between the first and last malware infections in the same sequential pattern at the honeypot.

Kikuchi Laboratory Tokai University Sequential attack pattern based on source IP address and timestamp Pattern based on IP Address IP pattern codeIP Pattern A1S1 A2S1 S2 A3S1S2S1 A4S1S2 A5S1S2S3 Pattern based on Timestamp Time pattern codeTime pattern E1T1 E2T1 T2 E3T1T2 E4T1T2T3

Kikuchi Laboratory Tokai University Sequential attack pattern by source IP address and timestamp (count) A4E1 : 10% A4E4 : 90% A5E4 : 20% A5E5 : 80% TROJ_QHOST.WT BKDR_POEBOT.AHP PE_VIRUT.AV TSPY_ONLINEG.OPJ TSPY_KOLABC.CHTROJ_AGENT.AGSB

Confidence of sequential attack pattern  How strong the n-pattern coordinated attack, if (n-1)-pattern, a subsequence of n-pattern occur  where n is the length of pattern and m is the length of subsequence of n-pattern Company Name for n > 1 and m = (n-1), Conf(n-pattern) = Supp(n-pattern) Supp(m-pattern)

Confidence of 3-pattern Kikuchi Laboratory Tokai University

Conclusion Kikuchi Laboratory Tokai University  PrefixSpan method sufficiently discover all sequential attack patterns.  Coordinated attacks are performed by multiple sequential attack patterns within certain short time interval.  The sequential pattern of coordinated attack tends to change all the time.  This result gives several behaviors useful for alerting threats of botnets attacks.