CISC 879 - Machine Learning for Solving Systems Problems Presented by: Ashwani Rao Dept of Computer & Information Sciences University of Delaware Learning.

Presentation transcript:

Learning to Detect and Classify Malicious Executables in the Wild
J. Zico Kolter, Marcus A. Maloof
Presented by: Ashwani Rao, Dept. of Computer & Information Sciences, University of Delaware
CISC Machine Learning for Solving Systems Problems

Introduction
- Machine learning and data mining to identify malicious code
- What is malicious code?
- Why not antivirus suites?
- Training set: 1,971 benign and 1,651 malicious executables
- Features extracted: byte-code n-grams; malicious executables are also classified by the function of their payload
- Learning algorithms: naive Bayes, SVM, decision trees and boosting

Goals of the research paper
- Show how to use established methods to detect and classify malicious executables
- Present empirical results from an extensive study of inductive methods for detection and classification
- Show that the methods achieve high detection rates on new, unseen executables

Related Work
- Lo et al., 1995; Kephart et al., 1995; Tesauro et al., 1996; Schultz et al., 2001
- Lo et al., 1995: analysis of several programs
- Schultz et al., 2001: used data mining to detect malicious executables, with three feature types:
  - Binary profiling (RIPPER learning)
  - String sequences (naive Bayes)
  - Hex dumps (six naive Bayesian classifiers)

Data Collection and Classification methods
- 1,971 benign and 1,651 malicious executables in Windows PE format
- N-grams: combine each four-byte sequence into a single term. For example, for the bytes ff 00 ab 3e 12 b3 the corresponding n-grams are ff00ab3e, 00ab3e12 and ab3e12b3.
- Each n-gram is treated as an attribute
- The most relevant attributes (n-grams) are selected using information gain, also called average mutual information
- The 500 most relevant n-grams were kept
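The n-gram extraction and information-gain ranking described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, each n-gram is assumed to be a boolean attribute (present or absent in a file), and information gain is computed as the entropy of the class minus the conditional entropy given the attribute, which equals the mutual information between attribute and class.

```python
import math
from collections import Counter


def byte_ngrams(data: bytes, n: int = 4) -> list[str]:
    """Return every overlapping n-byte window of `data` as a hex string."""
    return [data[i:i + n].hex() for i in range(len(data) - n + 1)]


def entropy(labels) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(present: list[bool], labels: list[str]) -> float:
    """Gain of a boolean attribute (n-gram present or absent) w.r.t. the class."""
    total = len(labels)
    gain = entropy(labels)
    for value in (True, False):
        subset = [lab for p, lab in zip(present, labels) if p == value]
        if subset:
            gain -= (len(subset) / total) * entropy(subset)
    return gain


def top_ngrams(files: list[bytes], labels: list[str],
               k: int = 500, n: int = 4) -> list[str]:
    """Rank all n-grams seen in `files` by information gain and keep the top k."""
    sets = [set(byte_ngrams(f, n)) for f in files]
    vocab = set().union(*sets)
    scored = {g: information_gain([g in s for s in sets], labels) for g in vocab}
    # Ties are broken alphabetically (an arbitrary choice made for this sketch).
    return sorted(scored, key=lambda g: (-scored[g], g))[:k]


# The example from the slide: six bytes give three overlapping 4-grams.
print(byte_ngrams(bytes.fromhex("ff00ab3e12b3")))
```

Holding the whole vocabulary in memory, as this sketch does, would probably not work at the scale of hundreds of millions of distinct n-grams; it is meant only to show the two steps.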

Classification methods

Classification methods
- Instance-based learner: stores the collection of training examples
- Naive Bayes: probabilistic model based on the prior probability of each class, P(Ci), and the conditional probability of each attribute value given the class, P(vj | Ci)
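As a rough illustration of the naive Bayes model above, the sketch below estimates P(Ci) and P(vj | Ci) from boolean attribute vectors and predicts the class with the highest posterior. The function names are hypothetical, and the add-one (Laplace) smoothing is an assumption of this sketch, not something stated on the slide.

```python
import math


def train_naive_bayes(rows, labels):
    """rows: list of tuples of booleans (n-gram present?); labels: class per row."""
    classes = sorted(set(labels))
    n_attrs = len(rows[0])
    priors, cond = {}, {}
    for c in classes:
        members = [r for r, lab in zip(rows, labels) if lab == c]
        priors[c] = len(members) / len(rows)  # P(Ci)
        # P(vj = True | Ci), with add-one smoothing (assumed) to avoid zeros.
        cond[c] = [(sum(r[j] for r in members) + 1) / (len(members) + 2)
                   for j in range(n_attrs)]
    return priors, cond


def predict(priors, cond, row):
    """Return the class maximising log P(Ci) + sum_j log P(vj | Ci)."""
    def log_posterior(c):
        score = math.log(priors[c])
        for p_true, value in zip(cond[c], row):
            score += math.log(p_true if value else 1 - p_true)
        return score
    return max(priors, key=log_posterior)


rows = [(True, False), (True, True), (False, False), (False, True)]
labels = ["malicious", "malicious", "benign", "benign"]
priors, cond = train_naive_bayes(rows, labels)
print(predict(priors, cond, (True, False)))
```

Working in log space avoids numerical underflow when hundreds of attributes are multiplied together.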

Classification methods
- Support vector machines: a vector of weights w and a threshold b. Uses a kernel function to map the training data into a higher-dimensional space in which the problem is linearly separable.
- Decision trees: internal nodes correspond to attributes and leaf nodes correspond to class labels.
- Boosted classifiers: a method for combining multiple classifiers. Boosting produces a set of weighted models by iteratively learning a model from a weighted data set, evaluating it, and reweighting the data set based on the model's performance.
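The boosting loop described above (learn from a weighted data set, evaluate, reweight) can be sketched with an AdaBoost-style procedure. This is an illustrative sketch only: the use of one-attribute decision stumps as base learners, the +1/-1 label encoding, and all function names are assumptions, not details taken from the slides.

```python
import math


def stump_predict(row, j, polarity):
    """One-attribute decision stump: predict `polarity` if attribute j is set."""
    return polarity if row[j] else -polarity


def best_stump(rows, labels, weights):
    """Return (weighted error, attribute, polarity) of the best stump."""
    best = None
    for j in range(len(rows[0])):
        for polarity in (1, -1):
            err = sum(w for r, y, w in zip(rows, labels, weights)
                      if stump_predict(r, j, polarity) != y)
            if best is None or err < best[0]:
                best = (err, j, polarity)
    return best


def adaboost(rows, labels, rounds=10):
    """labels are +1 (malicious) or -1 (benign). Returns weighted stumps."""
    n = len(rows)
    weights = [1 / n] * n
    model = []
    for _ in range(rounds):
        err, j, polarity = best_stump(rows, labels, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        if err >= 0.5:  # no stump is better than chance: stop early
            break
        alpha = 0.5 * math.log((1 - err) / err)  # weight of this model
        model.append((alpha, j, polarity))
        # Increase the weight of misclassified examples, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(r, j, polarity))
                   for r, y, w in zip(rows, labels, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def boosted_predict(model, row):
    """Weighted vote of all stumps in the model."""
    score = sum(alpha * stump_predict(row, j, p) for alpha, j, p in model)
    return 1 if score >= 0 else -1


rows = [(True, False), (True, True), (False, True), (False, False)]
labels = [1, 1, -1, -1]
model = adaboost(rows, labels)
print([boosted_predict(model, r) for r in rows])
```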

Detecting malicious code using n-grams
- Used ten-fold cross-validation
- Pilot study: to determine the size of the n-grams and the number of relevant n-grams to use. Used n-grams with n = 4 and chose the best number of n-grams using information gain; 500 relevant n-grams produced the best result.
- Experiment with small collection: small collection of executables with a total of 68,744,909 n-grams
- Experiment with large collection: 255 million distinct n-grams of size 4
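Ten-fold cross-validation, as used in these experiments, can be sketched as below. The helper names, the fixed random seed, and the use of plain accuracy as the score are assumptions made for illustration; the folds here are not stratified by class, and the data set must contain at least k examples.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Shuffle the indices 0..n_items-1 and deal them into k disjoint folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(rows, labels, train_fn, predict_fn, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average accuracy."""
    accuracies = []
    for held_out in k_fold_indices(len(rows), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(rows)) if i not in held]
        model = train_fn([rows[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, rows[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k


# 1,971 benign + 1,651 malicious executables = 3,622 examples in total.
folds = k_fold_indices(1971 + 1651)
print([len(f) for f in folds])
```

Any learner can be plugged in through `train_fn` and `predict_fn`, so the same splits can be reused to compare several classifiers.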

Results of Small Collection
[Figure: ROC curve for detecting malicious executables in the small collection]
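For readers unfamiliar with ROC curves, the sketch below shows one common way to build a curve from classifier scores: sweep the decision threshold from high to low and record the false-positive and true-positive rates at each step. The function names and the toy scores are hypothetical, and the area under the curve is included only as a commonly used summary; both classes must be present in the input.

```python
def roc_points(scores, labels):
    """labels: True for malicious. Returns (FPR, TPR) points from (0,0) to (1,1)."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Examples with equal scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


scores = [0.9, 0.8, 0.7, 0.6]
labels = [True, False, True, False]
print(roc_points(scores, labels), auc(roc_points(scores, labels)))
```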

Results of Larger Collection
[Figure: ROC curve for detecting malicious executables in the larger collection]

Classifying executables by payload function
- Extent to which classification methods could determine whether a given malicious executable opened a backdoor, mass-mailed, or was an executable virus
- Identify and enumerate the functions of payloads
- Many executables fell into several categories
- Experimental design similar to the previous one, but for each function the data set is made from malicious executables only
- Used ten-fold cross-validation
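Because an executable can fall into several categories, one plausible way to set up these experiments is to build a separate yes/no data set per payload function from the malicious executables only. The sketch below assumes that reading; the function name, category names and sample data are hypothetical.

```python
def one_vs_rest_labels(payload_functions, target):
    """For each malicious sample, True if its payload includes `target`."""
    return [target in functions for functions in payload_functions]


# Hypothetical multi-label annotations for four malicious executables.
payload_functions = [
    {"backdoor"},
    {"mass-mailer", "backdoor"},
    {"virus"},
    {"mass-mailer"},
]

# One binary labelling per payload function, built from the same samples.
for target in ("backdoor", "mass-mailer", "virus"):
    print(target, one_vs_rest_labels(payload_functions, target))
```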

Experimental Results
[Figure: ROC curve for mass-mailing capabilities]

Experimental Results
[Figure: ROC curve for backdoor capabilities]

Evaluating Real World Online Performance
- Applied the method to 291 real-world malicious executables discovered after the original data were gathered
- Classifiers were built from the original data, using both benign and malicious code
- Boosted decision trees detected 98% of the new malicious code

Conclusion and Future Work
- Machine learning and data mining are useful and appropriate tools for detecting malware
- Boosted classifiers and support vector machines performed exceptionally well
- Boosting reduces bias and variance, and outperformed the other classifiers in the study
- This approach is scalable
- % of the code was obfuscated using compression and encryption
- For the payload-function experiments: remove obfuscation and rerun the experiments with a larger set

Conclusion and Future Work (continued)
- Study the similarity of malicious code and how such executables change over time; clustering can provide good insight into this
- This approach, combined with a search for known signatures and with executing and analyzing code in a virtual machine, will provide better computer security

Q&A