CISC 879 - Machine Learning for Solving Systems Problems Presented by: Akanksha Kaul Dept of Computer & Information Sciences University of Delaware SBMDS:

Presentation transcript:

CISC 879: Machine Learning for Solving Systems Problems. Presented by: Akanksha Kaul, Dept of Computer & Information Sciences, University of Delaware. SBMDS: an interpretable string based malware detection system using SVM ensemble with bagging. Yanfang Ye, Lifei Chen, Dingding Wang, Tao Li, Qingshan Jiang, Min Zhao.

Motivation: There is an urgent need to detect malicious executables. Major threats: 1. Metamorphic executables, which reprogram themselves and are capable of infecting two operating systems. 2. Polymorphic executables, which disguise themselves as non-malicious code. 3. Unseen executables.

Need of the Hour: SBMDS (String Based Malware Detection System). What is this system about? It performs interpretable string analysis. An interpretable string is a line of code in a program that contains both API execution calls and important semantic strings representing the intent and goal of the program writer.

What Is an Interpretable String? Example from the worm “Nimda”: “html script language = ‘javascript’ window.open(‘readme.eml’)”. Another example: “&gameid= %s&pass=%s; myparentthreadid=%d; myguid=%s”. Not all strings are interpretable, e.g. “!0&0h0m0o0t0y0” and “*3d%3dtgyhjij”.

Major Steps: 1. Construct the interpretable strings by developing a feature parser. 2. Perform feature selection to select informative strings. 3. Use an SVM ensemble with bagging to construct the classifier. 4. Run the malware detector, which also predicts the exact type of the malware.

Step 1: Develop the Feature Parser. 39,838 executables were collected from the Kingsoft Anti-Virus lab. All executables are PE files. The parser extracts static features: 1. API calls from the import table. 2. Strings carrying semantic interpretation.
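
The string-extraction half of such a parser can be sketched as follows. This is a minimal illustration, assuming that any run of printable ASCII characters above a minimum length counts as a candidate string (similar in spirit to the Unix `strings` utility). The function name `extract_strings`, the `min_len` threshold, and the sample bytes are assumptions made for this example, not the parser described in the paper. Reading API calls from the PE import table is not shown.

```python
import re

def extract_strings(data: bytes, min_len: int = 5) -> list[str]:
    """Return runs of printable ASCII that are at least min_len characters long."""
    # Bytes 0x20-0x7e cover printable ASCII, from space through tilde.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# Hypothetical byte blob standing in for the raw contents of a PE file.
sample = (
    b"\x00\x01MZ\x90\x00&gameid=%s&pass=%s\x00\xff\x10ab\x00"
    b"window.open('readme.eml')\x00"
)
candidates = extract_strings(sample)
print(candidates)
```

Short runs such as `MZ` and `ab` fall below the threshold and are dropped. Deciding which of the surviving candidates are actually interpretable is left to the feature selection step.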

Sample (Backdoor-Redgirl.exe): The string “‘%s’ goto delete” indicates that the malware may generate a “.bat” file to delete itself.

Step 2: Feature Selection. Select only the interpretable strings from the large set of strings obtained in the previous step, and use these strings as the signatures of the PE files.
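
The slide does not name the selection criterion, so the sketch below uses a stand-in: it scores each string by the mutual information between its presence in a file and the file's class label, then keeps the top `k`. The function names, the toy files, and the choice of mutual information are all assumptions made for illustration.

```python
import math

def mutual_information(present: list[bool], labels: list[int]) -> float:
    """Mutual information (in bits) between a presence flag and the class label."""
    n = len(labels)
    mi = 0.0
    for f in (True, False):
        for c in set(labels):
            joint = sum(1 for p, y in zip(present, labels) if p == f and y == c) / n
            if joint > 0:
                pf = sum(1 for p in present if p == f) / n
                pc = sum(1 for y in labels if y == c) / n
                mi += joint * math.log2(joint / (pf * pc))
    return mi

def select_strings(files: list[set[str]], labels: list[int], k: int) -> list[str]:
    """Keep the k strings whose presence is most informative about the label."""
    vocab = sorted(set().union(*files))
    scored = [(mutual_information([s in f for f in files], labels), s) for s in vocab]
    # Highest score first; ties are broken alphabetically for reproducibility.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [s for _, s in scored[:k]]

# Toy data: each file is the set of strings found in it; 1 = malware, 0 = benign.
files = [
    {"goto delete", "kernel32.dll"},
    {"goto delete", "kernel32.dll"},
    {"kernel32.dll", "Save As"},
    {"kernel32.dll", "Save As"},
]
labels = [1, 1, 0, 0]
selected = select_strings(files, labels, k=2)
print(selected)
```

A string that appears in every file, like `kernel32.dll` here, carries no information about the class and scores zero, so it is not selected.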

Step 3: Using SVM to Classify. Why SVM? SVMs have shown state-of-the-art results on classification problems. Problem: the training complexity of an SVM depends on the size of the data set.

Problem: Training accuracy becomes constant once the size of the data set reaches 3000.

Curse of Dimensionality? A problem caused by the exponential increase in the volume of the feature space as dimensions are added. How does SVM deal with the “curse of dimensionality”? Solution: by using an SVM ensemble with bagging. What are an SVM ensemble and bagging?

3.1 SVM Ensemble with Bagging. An ensemble is a set of classifiers whose individual decisions are combined in some way to classify new samples. The bagging technique (“BAGGING”, Bootstrap AGGregating) is applied to the training set: each classifier is trained on a uniform random sample of the training data set, drawn with replacement.
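
The bagging procedure can be sketched in plain Python as follows. To keep the example self-contained, the base learner is a nearest-centroid classifier, which is a stand-in and NOT an SVM; in the system described, an SVM would take its place. The function names, the toy feature vectors (1 means a selected string is present), and the number of models are assumptions made for illustration.

```python
import random

def bootstrap_sample(X, y, rng):
    """Draw len(X) training pairs uniformly at random, with replacement."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return [X[i] for i in idx], [y[i] for i in idx]

def train_centroid(X, y):
    """Stand-in base learner (not an SVM): predict the nearest class centroid."""
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(yi, []).append(xi)
    centroids = {
        c: [sum(col) / len(rows) for col in zip(*rows)]
        for c, rows in groups.items()
    }

    def predict(x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, centroids[c]))
        return min(sorted(centroids), key=sq_dist)

    return predict

def train_bagged(X, y, n_models, train, seed=0):
    """Train n_models classifiers, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [train(*bootstrap_sample(X, y, rng)) for _ in range(n_models)]

def ensemble_votes(models, x):
    """Collect the individual decision of every classifier for sample x."""
    return [m(x) for m in models]

# Toy training set: rows are string-presence vectors; 1 = malware, 0 = benign.
X = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1], [0, 0, 1]]
y = [1, 1, 1, 0, 0, 0]
models = train_bagged(X, y, n_models=11, train=train_centroid)
votes = ensemble_votes(models, [1, 1, 0])
print(votes)
```

Each model sees a different resampled view of the training set, so the ensemble's individual decisions can differ; how those decisions are combined is a separate design choice.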

3.2 Multi-Classification. There are various classes of malware. The classifiers' decisions are combined by “MAJORITY VOTING”; when two different classes receive identical vote counts, the class with the smallest index is chosen. Class indices: 0 = Benign files, 1 = Backdoors, 2 = Spyware, 3 = Trojans, 4 = Worms.
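
A minimal sketch of this voting rule, assuming each classifier in the ensemble outputs one of the class indices listed on the slide. The function name `majority_vote` and the example votes are illustrative.

```python
from collections import Counter

# Class indices as listed on the slide.
CLASS_NAMES = {0: "Benign files", 1: "Backdoors", 2: "Spyware", 3: "Trojans", 4: "Worms"}

def majority_vote(votes):
    """Return the most frequent class index; ties go to the smallest index."""
    counts = Counter(votes)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)

# Classes 1 and 3 both receive two votes, so the smaller index wins.
winner = majority_vote([3, 1, 3, 4, 1])
print(winner, CLASS_NAMES[winner])
```

The function expects at least one vote; an empty list raises `ValueError` from `max`.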

Step 4: Malware Detection. Unknown malware variants are used as input. The detector decides whether a file is malicious or not and which class the malware belongs to.

System Architecture: 1. Feature Parser 2. Feature Selection 3. SVM Ensemble Classifier 4. Malware Detector

Why I Chose This Paper: Comparisons with popular anti-virus software. Points of comparison: 1. Detecting known variants of malware. 2. Detecting unknown variants. 3. Efficiency (detection time). 4. Number of false positive detections.
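
Two of these comparison points reduce to simple counts over a labeled test set. The sketch below assumes ground-truth labels and scanner verdicts are both encoded as 1 (malicious) or 0 (benign). The function names and the toy values are made up for illustration and are not results from the paper.

```python
def detection_rate(truth, verdicts):
    """Fraction of truly malicious files (label 1) that the scanner flagged."""
    flagged = [v for t, v in zip(truth, verdicts) if t == 1]
    return sum(flagged) / len(flagged)

def false_positives(truth, verdicts):
    """Number of benign files (label 0) that the scanner wrongly flagged."""
    return sum(1 for t, v in zip(truth, verdicts) if t == 0 and v == 1)

# Toy labels and verdicts, not measurements from any real scanner.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
verdicts = [1, 1, 1, 0, 0, 1, 0, 0]
rate = detection_rate(truth, verdicts)
fp = false_positives(truth, verdicts)
print(rate, fp)
```

Running the same two functions once on known variants and once on unknown variants gives the first two comparison points; detection time would be measured separately.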

Detecting Known Variants

Detecting Unknown Variants

Efficiency (Detection Time)

Number of False Positives

Conclusion: This system has already been incorporated into the scanning tool of a commercial anti-virus product. The name of the anti-virus product is not disclosed.

Questions?

All's Well That Ends Well. Thank you.