Baselining PMU Data to Find Patterns and Anomalies

Presentation transcript:

Baselining PMU Data to Find Patterns and Anomalies
CIGRE US National Committee 2015 Grid of the Future Symposium
Brett Amidan, Jim Follum, Kimberly Freeman, Jeff Dagle
Pacific Northwest National Laboratory
November 15, 2018

“Big Picture” Objective
Input: power grid related data (PMUs, state estimators, load, etc.)
An analytical tool that provides:
  Real-time analytics that monitor the state of the grid
  The capability to look at historical trends and events
  Reliable predictions about the forthcoming state of the grid

Pre-Processing Steps
Read raw PMU data.
Develop and then use data quality filters to clean poor quality data.
Remove bad data.
[Figure: Frequency]
Information about the data we are processing:
  60 Hz PMU data for 50+ PMUs from BPA.
  We have processed 27+ months of data so far (>8 TB).
  We are able to read, clean, and analyze 1 minute of data in under 45 seconds.
  1 day of 60 Hz PMU data (54 PMUs) = 26 GB.
b.amidan@pnnl.gov
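A data quality filter of the kind described above could look roughly like the following. This is only a sketch: the slides do not state the actual filter rules, so the plausibility band around 60 Hz (`MAX_DEVIATION_HZ`) and the function name `clean_frequency` are assumptions made for illustration.

```python
import math

NOMINAL_HZ = 60.0
MAX_DEVIATION_HZ = 0.5  # assumed plausibility band; not taken from the slides


def clean_frequency(samples):
    """Return a copy of samples with poor-quality values replaced by None.

    A sample is treated as bad if it is missing, NaN, or implausibly far
    from the nominal system frequency (for example a 0.0 dropout value).
    """
    cleaned = []
    for value in samples:
        bad = (
            value is None
            or math.isnan(value)
            or abs(value - NOMINAL_HZ) > MAX_DEVIATION_HZ
        )
        cleaned.append(None if bad else value)
    return cleaned


raw = [60.01, float("nan"), 0.0, 59.98, None, 61.2, 60.0]
print(clean_frequency(raw))
```

Bad samples are replaced with None rather than dropped so that the time alignment of the remaining samples is preserved for later steps.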

Feature Extraction (Data Signatures)
Regression fits through the data calculate estimates of value, slope, curvature (acceleration), and noise.
These estimates can be calculated in the presence of missing or data-quality-flagged values.
Summaries of these features are used in the analyses.
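One way to obtain value, slope, curvature, and noise from a regression fit is a quadratic least-squares fit over a short window. The sketch below assumes that form; the slides do not say which regression model or window length is used. The names `extract_signature` and `solve_3x3` are hypothetical, and missing samples are represented as None.

```python
import math


def solve_3x3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [r] for row, r in zip(matrix, rhs)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n + 1):
                a[row][k] -= factor * a[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        known = sum(a[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (a[row][n] - known) / a[row][row]
    return x


def extract_signature(times, values):
    """Fit y = a + b*t + c*t^2 over one window, skipping missing samples.

    Time is centered on the window midpoint, so a is the fitted value and
    b the fitted slope at the middle of the window.
    """
    pairs = [(t, y) for t, y in zip(times, values) if y is not None]
    if len(pairs) < 4:
        return None  # too few good samples to estimate noise
    center = (times[0] + times[-1]) / 2.0
    ts = [t - center for t, _ in pairs]
    ys = [y for _, y in pairs]
    # Normal equations for the quadratic least-squares fit.
    s = [sum(t ** k for t in ts) for k in range(5)]
    matrix = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    rhs = [sum(y * t ** k for t, y in zip(ts, ys)) for k in range(3)]
    a, b, c = solve_3x3(matrix, rhs)
    residuals = [y - (a + b * t + c * t * t) for t, y in zip(ts, ys)]
    noise = math.sqrt(sum(r * r for r in residuals) / (len(ys) - 3))
    return {"value": a, "slope": b, "curvature": 2.0 * c, "noise": noise}


times = list(range(11))
values = [2.0 + 3.0 * t + 0.5 * t * t for t in times]
values[3] = None  # a missing or quality-flagged sample is simply skipped
print(extract_signature(times, values))
```

Because flagged samples are dropped before the fit, the four estimates remain available as long as enough good samples are left in the window.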

Baselining Grid Behavior
Univariate Approach
  Create a baseline of typical behavior for each individual variable.
  Determine abnormal behavior based on the baseline.
Multivariate Approach
  Create a baseline across many (hundreds or even thousands of) variables.
  The relationship between variables is considered when determining abnormal behavior.
[Figure: Static Baselining Limits]

Univariate Baselining Example
Date / Time Model: Time Series Based Model
Day of Week Model: Hours 0-23
[Figure: Phase Angle Difference over time, with labels for the Actual Value, the Predicted Phase Angle Pair Value at Midnight, the Dynamic Baselining Limits (Calculated Daily), and the Initial Training Period]
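A univariate baseline can be illustrated with a much simpler model than the one on the slide. The sketch below assumes a per-hour baseline (mean plus or minus k standard deviations of the training values for each hour of the day). The function names, the value of `k`, and the sample numbers are all hypothetical, and this is not the time series model used by the authors.

```python
import statistics
from collections import defaultdict


def build_hourly_baseline(training, k=3.0):
    """Build {hour: (lower, upper)} limits from (hour, value) training pairs."""
    by_hour = defaultdict(list)
    for hour, value in training:
        by_hour[hour].append(value)
    limits = {}
    for hour, values in by_hour.items():
        if len(values) < 2:
            continue  # not enough history to estimate the spread
        mean = statistics.fmean(values)
        spread = statistics.stdev(values)
        limits[hour] = (mean - k * spread, mean + k * spread)
    return limits


def is_abnormal(limits, hour, value):
    """Return True if value falls outside the baseline limits for that hour."""
    lower, upper = limits[hour]
    return value < lower or value > upper


# Hypothetical phase angle differences (degrees) seen at midnight and noon.
training = [(0, 10.0), (0, 10.4), (0, 9.6), (12, 20.0), (12, 21.0), (12, 19.0)]
limits = build_hourly_baseline(training)
print(is_abnormal(limits, 0, 18.0))   # 18.0 is outside the midnight limits
print(is_abnormal(limits, 12, 18.0))  # the same value is typical at noon
```

Recomputing the limits each day from a sliding training period would give limits that move with the data, in the spirit of the dynamic limits named on the slide.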

Multivariate Baselining
The baseline captures what normal behavior is expected to be.
Group similar behavior:
  Time periods that group together indicate normal grid behavior.
  Variables that group together indicate highly correlated variables and may be candidates for feature reduction.
Identify data that does not belong with the normal behavior:
  A time period contains data that is unusual (possible abnormal grid behavior).
  A variable is unlike other variables, or something has happened to indicate a behavioral change in the variable.

Creating a Baseline – Unsupervised Learning
[Diagram: Training Data (Historical PMU Data) feeds a Baselining Learning Algorithm, which produces a Model; Real Time PMU Data passed through the Model is assigned to one of Class 1 through Class 5]

Identifying Data-Driven Atypical Events
Using multivariate statistical techniques to establish baselines of typical behavior, atypical moments in time can be discovered and the variables responsible can be identified.
This slide shows how the atypicality score (left) increases due to atypical behavior in the system. The plots on the right show two different phase angle differences that were atypical during this same time period.
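The slides do not define how the atypicality score is computed. As an illustration of why a multivariate baseline can catch what univariate limits miss, the sketch below uses the squared Mahalanobis distance for two variables as a stand-in score. The function names and the data are hypothetical.

```python
import statistics


def fit_baseline(xs, ys):
    """Estimate means, variances, and covariance of two variables."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    n = len(xs)
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return mean_x, mean_y, var_x, var_y, cov


def atypicality(baseline, x, y):
    """Squared Mahalanobis distance of (x, y) from the baseline."""
    mean_x, mean_y, var_x, var_y, cov = baseline
    dx, dy = x - mean_x, y - mean_y
    # Closed-form inverse of the 2x2 covariance matrix.
    det = var_x * var_y - cov * cov
    return (var_y * dx * dx - 2.0 * cov * dx * dy + var_x * dy * dy) / det


# Two hypothetical phase angle differences that normally move together.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9, 7.2, 7.8]
baseline = fit_baseline(xs, ys)
print(atypicality(baseline, 7.0, 7.0))  # follows the usual relationship
print(atypicality(baseline, 2.0, 7.0))  # each value is in range, the pair is not
```

In the second case both values lie inside their individual historical ranges, yet the score is large because the usual relationship between the two variables is broken.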

Atypicality Detection: Lightning Related Anomaly
[Figure: Atypicality Score, with plots for Substation A and Substation B]

Atypicality Detection: Equipment Failure Related Anomaly
[Figure: Atypicality Score]
Other PMUs behaved similarly.

Phase Angle Pairs Clustering
Unsupervised learning (clustering) is used to determine which variables are most similar during Time Period A.
Proximity on the tree indicates similarity.

Phase Angle Pairs Clustering
Time Period B (two months later)
Phase Angle Pair #2 is no longer like Pair #1. Why?
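The change between Time Period A and Time Period B can be illustrated by comparing which variable each phase angle pair most resembles in each period. The sketch below assumes similarity is measured by Pearson correlation, which the slides do not state. It finds only the closest partner of one variable rather than building the full clustering tree, and the series values are made up for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b)


def closest_partner(series, name):
    """Return the variable with the smallest correlation distance to name."""
    others = [n for n in series if n != name]
    return min(others, key=lambda n: 1.0 - pearson(series[name], series[n]))


period_a = {
    "pair_1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "pair_2": [1.1, 2.1, 2.9, 4.2, 5.0],
    "pair_3": [5.0, 3.0, 4.0, 1.0, 2.0],
}
period_b = {
    "pair_1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "pair_2": [5.1, 2.9, 4.2, 0.9, 2.1],
    "pair_3": [5.0, 3.0, 4.0, 1.0, 2.0],
}
print(closest_partner(period_a, "pair_2"))
print(closest_partner(period_b, "pair_2"))
```

A variable whose closest partner changes between periods is a candidate for the "Why?" question above, since the change may reflect a real behavioral change or a measurement problem.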

Supervised Learning
[Diagram: Training Data (Historical PMU Data) with Labels (Class 1 Weather, Class 2 Normal, Class 3 Voltage Drop, Class 4 Surge, Class 5 Maintenance) feeds a Baselining Learning Algorithm, which produces a Predictive Model; Real Time PMU Data passed through the Predictive Model is classified as Weather, Normal, Voltage Drop, Surge, or Maintenance]

Understanding Precursors to Inform Prediction Models
[Diagram: Precursor activity ahead of a Known event yields Precursor Features (Signature); these Inform the Machine-Learning Model, which is used to Create a Classification to identify future precursors]

Future Step – Using Supervised Learning to Predict
[Diagram: Extract Signature from the Current State, then pass it to a Classification Based Prediction Model (Trained from Historical Data)]
Possible Patterns    Likelihood
Event 1              0.75
Normal 4             0.15
Precursor 3          0.05
Event 7              0.04
NOTE: Only events and precursors with distinct data characteristics will be identifiable.
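The prediction step above maps an extracted signature to a ranked list of patterns with likelihoods. The sketch below stands in for that model with a nearest-centroid classifier whose distances are turned into normalized scores. This is an assumed substitute and not the authors' model: the labels, signatures, and function names are hypothetical, and the scores are not calibrated probabilities.

```python
import math
from collections import defaultdict


def train_centroids(signatures, labels):
    """Average the training signatures of each label into one centroid."""
    grouped = defaultdict(list)
    for signature, label in zip(signatures, labels):
        grouped[label].append(signature)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict_likelihoods(centroids, signature):
    """Rank labels by a normalized score that decays with centroid distance."""
    weights = {
        label: math.exp(-math.dist(signature, centroid))
        for label, centroid in centroids.items()
    }
    total = sum(weights.values())
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return [(label, weight / total) for label, weight in ranked]


# Hypothetical two-feature signatures (for example slope and noise).
signatures = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [0.0, 5.0], [0.1, 5.2]]
labels = ["normal", "normal", "event", "event", "precursor", "precursor"]
centroids = train_centroids(signatures, labels)
for label, score in predict_likelihoods(centroids, [4.8, 5.0]):
    print(label, round(score, 3))
```

As the note on the slide says, this kind of model can only separate patterns whose signatures are distinct; overlapping centroids would produce nearly equal scores.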

Conclusions
Data-driven anomalies can be identified using multivariate analysis techniques.
Some of these anomalies correspond to actual events, but some do not.
Understanding precursors can inform prediction models, allowing for probability-based predictions of near-term future grid behavior.