Detecting Web Attacks Using Multi-Stage Log Analysis

1 Detecting Web Attacks Using Multi-Stage Log Analysis
Presented by Akhil Katpally
Authors: Melody Moh, Santhosh Pininti, Sindhusha Doddapaneni, Teng-sheng Moh

2 Agenda
Goal of the paper. Introduction. Background and Related Studies. System Design and Implementation. Experiments and Results. Conclusion.

3 Goal of the paper
The authors propose a new Multi-Stage Log Analysis system that combines Pattern Matching with supervised Machine Learning. The system can effectively detect new SQL-injection attacks. The authors implemented a proof of concept of the proposed system on Amazon AWS, using Kibana for Pattern Matching and Bayes Net for Machine Learning.

4 Introduction
With so much of daily life depending on the web, web security has become extremely important. One of the major security issues of web applications is SQL injection, which ranks at the top of the OWASP Top 10 security issues. Because even emerging cloud technology is accessed through web interfaces, web security is a top priority for Internet and cloud service providers (ISP/CSP).

5 Introduction (continued)
Real-time log analysis is one major procedure for detecting and preventing SQL-injection attacks. It uses Pattern Matching and Machine Learning techniques. With Pattern Matching, only known injection patterns are recognized and patterns with small changes are recognized. Existing log analysis methods for SQL-injection detection are based on either Pattern Matching or Machine Learning. The proposed system uses both.

6 Contributions
Proposed a multi-stage architecture for detecting SQL-injection attacks.
Implemented a prototype based on the proposed architecture, using Bayes Net and Kibana.
Compared the pros and cons of Pattern Matching (Kibana) and Machine Learning (Bayes Net).
Evaluated the two-stage system through a series of experiments.

7 Background and Related Studies
SQL injection: An attacker supplies an SQL query as input, which modifies or damages the database connected to the target web application. Types: Order Wise, Blind, and Against Database.
Log Analysis (Log4j): Understanding logs and extracting useful information from them.
Pattern Matching: Checks whether a set of words is present in the given text.
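As a rough illustration of pattern matching over log lines, the sketch below checks each line against a few regular expressions. The signatures and the name `matches_known_pattern` are hypothetical and chosen only for this example; they are not the Kibana queries used in the presented work.

```python
import re

# Hypothetical signatures for illustration only; a real rule set would be
# larger and tuned to the application's log format.
SQLI_PATTERNS = [
    re.compile(r"\bunion\b.+\bselect\b", re.IGNORECASE),           # UNION ... SELECT
    re.compile(r"\bor\b\s+'?\d+'?\s*=\s*'?\d+'?", re.IGNORECASE),  # OR 1=1 tautology
    re.compile(r"'\s*--"),                                         # quote followed by a comment
    re.compile(r";\s*drop\s+table\b", re.IGNORECASE),              # stacked DROP TABLE
]


def matches_known_pattern(log_line: str) -> bool:
    """Return True if the log line matches any of the known signatures."""
    return any(pattern.search(log_line) for pattern in SQLI_PATTERNS)
```

With these signatures, `matches_known_pattern("id=1 OR 1=1")` is true, while a small variation such as `id=1 OR 'a'='a'` is missed. That kind of gap is what the presentation later cites as the reason a purely signature-based detector cannot detect new types of SQL injection.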

8 Background and Related Studies (continued)
Logstash: A data pipeline tool that connects to a variety of sources and receives different types of logs (system, web server, error, and application logs).
Elasticsearch: Search and data analysis software that gives deep insight into streaming data. It is built on Apache Lucene.
Kibana: A data visualization interface for real-time summarizing and charting of streaming data.

9 Background and Related Studies (continued)
Machine Learning: A way of making a computer learn and take action without being explicitly programmed.
Naïve Bayes Classification: A simple probabilistic classifier built on Bayes' theorem, which gives the probability of an event occurring based on given conditions that are related to the event.
Bayes Networks: Often used to relax the independent-attributes assumption of Naïve Bayes classification, which helps improve performance.
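For reference, the relationship between the two classifiers can be written out as follows. The symbols are assumptions introduced here: C denotes the class (for example, SQL-injection or regular log), x_1, ..., x_n denote the log attributes, and pa(v) denotes the parents of variable v in the network graph.

```latex
% Bayes' theorem for a class C given log attributes x_1, ..., x_n
P(C \mid x_1, \dots, x_n) = \frac{P(C)\, P(x_1, \dots, x_n \mid C)}{P(x_1, \dots, x_n)}

% Naive Bayes assumption: attributes are conditionally independent given the class
P(x_1, \dots, x_n \mid C) = \prod_{i=1}^{n} P(x_i \mid C)

% Bayes network: each variable depends only on its parents in the graph,
% so dependencies between attributes can be represented
P(C, x_1, \dots, x_n) = P(C \mid \mathrm{pa}(C)) \prod_{i=1}^{n} P(x_i \mid \mathrm{pa}(x_i))
```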

10 System Design and Implementation
Single-Stage Architecture: Application logs are generated using the log4j library. Either the Machine Learning method (Bayes Net) or the Pattern Matching method (ELK system) is used for SQL-injection detection.
[Architecture diagram. Components: web application users, web application logic, log4j, web application log files, Logstash, Elasticsearch, Kibana, preprocessing for WEKA, Bayes Net model, Bayes Net rank, analyst.]

11 System Design and Implementation (continued)
Multi-Stage Architecture: The proposed method combines both machine learning and pattern matching. WEKA is the machine learning tool used. A model is first trained with the training data. The generated Bayes Net model is tested using 5-fold cross-validation and has an accuracy of 78.8%.
[Architecture diagram. Components: web application users, web application logic, log4j, web application log files, preprocessing for Kibana, preprocessing for WEKA, Elasticsearch, Kibana, Bayes Net model, Bayes Net rank, analyst.]
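One way to picture a two-stage arrangement is as a cascade in which log entries not flagged by the first detector are handed to the second. The sketch below is only an assumption about the control flow: `two_stage_detect` and the `Detector` type are hypothetical names, and any callable (a pattern matcher, a trained classifier's predict function) can stand in for either stage. It does not reproduce the authors' Kibana or WEKA setup.

```python
from typing import Callable, Dict, Iterable, List

# A detector takes one log line and returns True if it looks like an attack.
Detector = Callable[[str], bool]


def two_stage_detect(
    logs: Iterable[str],
    first_stage: Detector,
    second_stage: Detector,
) -> Dict[str, List[str]]:
    """Run two detectors in sequence over the log lines.

    Lines flagged by the first stage are reported immediately; only the
    remaining lines are passed to the second stage.
    """
    result: Dict[str, List[str]] = {
        "flagged_by_first": [],
        "flagged_by_second": [],
        "clean": [],
    }
    for line in logs:
        if first_stage(line):
            result["flagged_by_first"].append(line)
        elif second_stage(line):
            result["flagged_by_second"].append(line)
        else:
            result["clean"].append(line)
    return result
```

Swapping the two arguments gives the other ordering, which makes it easy to compare "pattern matching first" against "classifier first" on the same data.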

12 System Design and Implementation (continued)
Log Generation: Either use parsers and filters to remove unnecessary information from the logs, or use a logging library (log4j) to create custom logs.
Preprocessing for WEKA (single stage): Attributes in the test set must match those in the training set.
Preprocessing for WEKA (multi stage): Logs not detected by Kibana are input to WEKA. A Unix script converts the CSV file to an ARFF file for WEKA input.
Preprocessing for Kibana (multi stage): The output of WEKA is used as input to Kibana. A Unix script converts the ARFF file to a text file.
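The CSV-to-ARFF step can be sketched as below. The presentation mentions a Unix script; this Python version is a stand-in written for illustration, and the function name `csv_to_arff`, the column names, and the choice to treat unlisted columns as string attributes are all assumptions.

```python
import csv
import io
from typing import Dict, List


def _quote(value: str) -> str:
    """Wrap a string value in single quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def csv_to_arff(csv_text: str, relation: str, nominal: Dict[str, List[str]]) -> str:
    """Convert CSV text with a header row into ARFF text.

    `nominal` maps a column name to its allowed values; every other
    column is written as a string attribute (an assumption of this sketch).
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, data = rows[0], rows[1:]

    lines = ["@relation " + relation, ""]
    for name in header:
        if name in nominal:
            lines.append("@attribute " + name + " {" + ",".join(nominal[name]) + "}")
        else:
            lines.append("@attribute " + name + " string")

    lines += ["", "@data"]
    for row in data:
        values = [
            value if name in nominal else _quote(value)
            for name, value in zip(header, row)
        ]
        lines.append(",".join(values))
    return "\n".join(lines) + "\n"
```

For example, calling `csv_to_arff(text, "weblogs", {"label": ["normal", "sqli"]})` on a CSV with columns `ip,request,label` declares `label` as a nominal class attribute and quotes the free-text request field, so that the attributes of a test file line up with those of the training file.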

13 Experiment Setup
Dataset: Web application logs generated using the Log4j framework.
Web Application: Developed using Java, Bootstrap, HTML, CSS, JavaScript, and MySQL, and hosted on an Amazon AWS Linux instance.
Methods: Kibana and Bayes Net.

Data          Total Logs  SQL Logs  Regular Logs
Training Set        2000       547          1453
Testing Set        10000      2812          7188

14 Kibana vs Bayes Net
Purpose
  Kibana: Used for detecting SQL injections and visualizing data.
  Bayes Net: Used for classifying logs into SQL-injection logs and other logs.
Mechanism
  Kibana: Uses pattern matching techniques for detection.
  Bayes Net: Uses supervised machine learning to learn and detect attacks.
Overhead
  Kibana: No file conversion is required; input is read directly from a text file. No preprocessing is required; filters can be used to extract the required data. No training is required; queries are written for detection.
  Bayes Net: Loads only ARFF files, so log files must be converted to ARFF. Preprocessing is required before logs are passed to the model for classification. Training is required, which involves manual classification.
Pros and Cons
  Kibana: A real-time system in which new queries may be issued. Can detect only specified patterns and cannot detect new types of SQL injection. Results are visualized and easy to analyze.
  Bayes Net: Not a real-time system, as it involves offline training. Can detect new patterns, since it considers attributes such as IP address while classifying. Results are in text form and difficult to analyze.

15 Experiment Results

Method                              Accuracy for SQL Detection (%)
Machine Learning: Naïve Bayes       61.7
Machine Learning: Bayes Net         80.0
Pattern Matching: Kibana            85.3
Kibana followed by Bayes Net        94.7
Bayes Net followed by Kibana        95.4

16 Conclusion
A multi-stage log analysis architecture has been proposed that uses both machine learning and pattern matching. Experimental results show that the two-stage architecture is more accurate than either single-stage method, particularly when the Bayes Net model precedes Kibana. With Kibana as the final stage, the output can also be visualized. Further improvements can be made to the Kibana queries, and unsupervised machine learning methods could be used, which may lead to real-time log analysis.