1 Weekly Progress (MAGGIE) Adnan Iqbal Supervisor Dr. Waqar Mahmood 22-11-05.

Presentation transcript:

1 Weekly Progress (MAGGIE) Adnan Iqbal Supervisor Dr. Waqar Mahmood

2 Tasks completed
- Discover a scheme that relates network-wide anomalies to single-route anomalies
- Implement the scheme
- Perform regularization of data
- Apply the scheme to suitable routes
- Analyze results
- Study of MIT Lincoln Lab intrusion detection data
- Study of new data set provided by SLAC
- Reading Time Analysis (SLAC data)
- Analysis of MIT Lincoln Lab intrusion detection data
- Trimming (removal of outliers)

3 Trimming
- Improvement on the results presented last time
- Output format has been changed
- Date and time are now included
[Sample output rows dated 4/3/2005; time and value columns not recoverable]
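The slides do not say which trimming rule was used. As a minimal sketch, assuming a symmetric trim that drops a fixed fraction of the smallest and largest observations, outlier removal could look like the following. The function name `trim` and the `fraction` parameter are hypothetical, chosen only for illustration.

```python
def trim(values, fraction=0.05):
    """Remove roughly `fraction` of the points from each tail.

    Assumption: symmetric trimming by rank. The original slides do not
    specify the rule, so this is only one plausible choice.
    The input order is preserved, which matters for time-stamped data.
    """
    if not 0 <= fraction < 0.5:
        raise ValueError("fraction must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * fraction)  # points to cut from each tail
    if k == 0:
        return list(values)
    low, high = ordered[k], ordered[-k - 1]  # inclusive cut-off values
    # Ties at the cut-off values are kept, so slightly more than
    # n - 2k points may survive.
    return [v for v in values if low <= v <= high]


samples = [10, 12, 11, 500, 13, 9, 12, 11, 10, -300]
trimmed = trim(samples, fraction=0.1)
```

With ten samples and `fraction=0.1`, one point is cut from each tail, so the extreme readings 500 and -300 are dropped and the remaining values keep their original order.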

4 Normalization
In multivariate data analysis, variables can have ranges that differ significantly, which introduces a bias into the analysis.
For example: Var 1: 0-1, Var 2: [range missing], Var 3: [range missing]
Normalization rescales each variable into a smaller, common range in a way that preserves the behavior of the data.

5 Normalization
Different methods:
- Min-Max Normalization
- Z-Score Normalization
- Decimal Scaling
- Many others
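Decimal scaling has no slide of its own, so here is a minimal sketch under the usual textbook definition: every value is divided by 10^j, where j is the smallest non-negative integer that brings the largest absolute value below 1. The function name `decimal_scale` is hypothetical.

```python
def decimal_scale(values):
    """Scale values by a power of ten so that max(|v'|) < 1.

    Returns the scaled values and the exponent j that was used.
    Assumption: j is restricted to non-negative integers, so data
    already inside (-1, 1) is returned unchanged.
    """
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values], j


scaled, exponent = decimal_scale([-991, 45, 120])
```

For this input the largest magnitude is 991, so the exponent is 3 and every value is divided by 1000.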

6 Min-Max Normalization
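The formula on this slide did not survive extraction. The standard min-max rule maps a value v to (v - min) / (max - min) * (new_max - new_min) + new_min. The sketch below assumes that rule and a default target range of [0, 1]; the function and parameter names are hypothetical.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values into [new_min, new_max].

    Assumption: standard min-max normalization. A constant input has
    no spread, so every value is mapped to new_min in that case.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [new_min for _ in values]
    span = hi - lo
    return [(v - lo) / span * (new_max - new_min) + new_min for v in values]


normalized = min_max_normalize([10, 20, 30])
```

Because the minimum and maximum come from the data itself, a single extreme value compresses all other points into a narrow band, which is one reason to trim outliers before normalizing.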

7 Z-Score Normalization
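The formula on this slide is also missing. Z-score normalization conventionally replaces each value x with z = (x - mean) / standard deviation, so the result has mean 0 and unit spread. The sketch below assumes the population standard deviation; the slides do not say whether the population or sample form was used, and the function name is hypothetical.

```python
import statistics


def z_score_normalize(values):
    """Standardize values to zero mean and unit standard deviation.

    Assumption: population standard deviation (statistics.pstdev).
    Swap in statistics.stdev for the sample form.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values, mu=mean)
    if std == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(v - mean) / std for v in values]


z_scores = z_score_normalize([2, 4, 4, 4, 5, 5, 7, 9])
```

Here the mean is 5 and the population standard deviation is 2, so the value 9 lies two standard deviations above the mean.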

8 Future Task
- Regularization code