| presented by Vasileios Zois CS at USC 09/20/2013 Introducing Scalability into Smart Grid 1
| Manage Data – Sparse Data – Heterogeneous Data – Semantic Representation Train Prediction Models – Data Intensive Application – On Demand Procedure Make Prediction & Update Models – Fast Access to Trained Models – Update with new values 2 Smart Grid Project Services
| Management of Data – Choose Underlying Technology – Evaluate provided services Training of Models – Design Training Tools – Take Advantage of Infrastructure – Give Efficient Solutions to Training Access & Update Training Models – Update: Change Invariants that Affect Prediction – Do it Efficiently 3 Steps to Scalability
| Requirements – Efficient Usage of Storage – Client Access to Data – Semantic Organization of Data Possible Solutions – Distributed File System (HDFS) » Raw Data » Work out a Structure (XML, Ontology Schemas) – Column Oriented NoSQL Systems (HBase, Cassandra) » Structure offered – Column Families » Implemented Operations » Still Needs Reasoning Operations 4 Managing Data
| Regression Tree – Support Features – Tree Building – Scalable Implementation: OpenPlanet ARIMA Model – Short Term Prediction – Does Not Support Features? – On Demand Training » Small Prediction Window 5 Prediction Models
| Brute Force – Efficient use of resources – Build a system from scratch Decrease Problem Size – Group Data and Pick Representatives – Clustering of Data with Similar Features – Introduce Features into ARIMA model » Use features to cluster Data » Execute Model on Clustered Data » Customer → SuperCustomer 6 Scalable Prediction
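The "Customer → SuperCustomer" idea above can be sketched as follows. This is a minimal illustration, not the project's implementation: it groups customers by a single shared feature value as a simple stand-in for feature-based clustering, and averages their load series into one representative series per group. The names `build_super_customers`, `key`, and the `"load"` field are hypothetical.

```python
from collections import defaultdict


def build_super_customers(customers, key):
    """Group customers by a feature and average their load series.

    `customers` is a list of dicts with a "load" list of equal length
    (hypothetical layout); `key` extracts the grouping feature.
    """
    groups = defaultdict(list)
    for customer in customers:
        groups[key(customer)].append(customer["load"])

    supers = {}
    for label, series in groups.items():
        n = len(series)
        # Element-wise mean across all customers in the group.
        supers[label] = [sum(values) / n for values in zip(*series)]
    return supers


customers = [
    {"type": "office", "load": [1.0, 3.0]},
    {"type": "office", "load": [3.0, 5.0]},
    {"type": "home", "load": [2.0, 2.0]},
]
super_customers = build_super_customers(customers, key=lambda c: c["type"])
```

A prediction model such as ARIMA would then be trained once per super-customer series instead of once per customer, which is where the reduction in problem size comes from.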
| Problem – Computationally Expensive – High Dimensional – Inevitable Parallelization Challenges to Parallelization – Partitioning of Data to achieve Load Balance – Reduction of the Communication Cost Approaches – Hierarchical Clustering: PBirch – Evolutionary Strategies Clustering – Density Based Clustering: PDBSCAN – Model Based Clustering: AutoClass System 7 Parallel Clustering
| PBirch – Single Program Multiple Data (SPMD) – Message Passing Interface (MPI) Steps – Distribute Data Equally – Build Tree on Each Processor – Execute Clustering on Leaf Nodes: Parallel K-means Results – Linear Speedup – Increased Communication Latency 8 Parallel Hierarchical Clustering
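The parallel k-means step can be sketched in an SPMD style. This is a simplified illustration under stated assumptions: the per-processor tree building is omitted, the "processors" are simulated by a plain loop over partitions rather than MPI ranks, and the summation of partial results stands in for an all-reduce. The function names are hypothetical.

```python
import math


def split_equally(data, workers):
    """Distribute points round-robin so each worker gets an equal share."""
    return [data[i::workers] for i in range(workers)]


def local_stats(chunk, centers):
    """Work done by one processor: per-center coordinate sums and counts."""
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for point in chunk:
        nearest = min(range(len(centers)),
                      key=lambda c: math.dist(point, centers[c]))
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += point[d]
    return sums, counts


def parallel_kmeans(data, centers, workers=2, iterations=10):
    chunks = split_equally(data, workers)
    centers = [tuple(c) for c in centers]
    dim = len(centers[0])
    for _ in range(iterations):
        # In an MPI program each rank would run local_stats concurrently.
        partials = [local_stats(chunk, centers) for chunk in chunks]
        new_centers = []
        for k in range(len(centers)):
            # Combining the partial results plays the role of an all-reduce.
            count = sum(counts[k] for _, counts in partials)
            if count == 0:
                new_centers.append(centers[k])  # keep an empty cluster's center
                continue
            new_centers.append(tuple(
                sum(sums[k][d] for sums, _ in partials) / count
                for d in range(dim)))
        if new_centers == centers:
            break
        centers = new_centers
    return centers


data = [(0, 0), (0, 2), (10, 10), (10, 12)]
final_centers = parallel_kmeans(data, [(0, 0), (10, 10)], workers=2)
```

Only the small per-center sums and counts cross processor boundaries in each iteration, never the points themselves, which is why communication cost depends on the number of centers and processors rather than on the data size.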
| Model – Stochastic Optimization – Biological Evolution Concepts – Recombination, Mutation – Motive: Huge Range of Possible Solutions Parallelization Techniques – Master-Slave Model » Master in charge of parent solutions » Slave in charge of recombination and mutation » Fits into MapReduce model Proposed Solution – thes.pdf 9 Clustering with Evolutionary Strategies
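The master-slave split can be illustrated with a small evolution strategy for clustering. This is a sketch under assumptions, not the proposed solution from the referenced document: an individual is taken to be a list of k cluster centers, fitness is the sum of squared distances to the nearest center, and a (μ+λ) selection scheme is used. `make_offspring` represents the work a slave would do; the loop in `evolve_clusters` represents the master. All names and parameter values are hypothetical.

```python
import math
import random


def sse(centers, data):
    """Fitness: sum of squared distances from each point to its nearest center."""
    return sum(min(math.dist(p, c) for c in centers) ** 2 for p in data)


def make_offspring(parents, rng, sigma):
    """Slave task: recombine two parents, then mutate the child."""
    a, b = rng.sample(parents, 2)
    child = []
    for center_a, center_b in zip(a, b):
        base = center_a if rng.random() < 0.5 else center_b  # discrete recombination
        child.append(tuple(x + rng.gauss(0.0, sigma) for x in base))  # Gaussian mutation
    return child


def evolve_clusters(data, k, mu=4, lam=12, generations=30, sigma=0.5, seed=0):
    """Master task: keep the parent population and select survivors."""
    rng = random.Random(seed)
    parents = [rng.sample(data, k) for _ in range(mu)]
    for _ in range(generations):
        # Each offspring is independent, so this list could be produced
        # by separate workers (for example, map tasks).
        offspring = [make_offspring(parents, rng, sigma) for _ in range(lam)]
        pool = parents + offspring
        pool.sort(key=lambda individual: sse(individual, data))
        parents = pool[:mu]  # (mu + lambda) selection keeps the best seen so far
    return parents[0]


data = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.4), (8.0, 8.0), (8.3, 7.9), (7.8, 8.4)]
best = evolve_clusters(data, k=2)
```

Because parents compete with their offspring, the best fitness never gets worse from one generation to the next; the trade-off is that the master must see every candidate, which can make it a bottleneck.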
| PDBSCAN – Based on original DBSCAN Algorithm – Shared Nothing Architecture Execution – Divide Input into Several Partitions – Concurrently Cluster Data Locally with DBSCAN – Merge Local Clusters into Global Clusters dR*-Tree Introduced – Decreased Communication Cost – Efficient Access of Data – Distributed Data Pages – Replicated Indices on all Machines Results – Near-Linear Speedup in the Number of Machines 10 Parallel Density Based Clustering
| AutoClass System – Bayesian Classification – Probability of an Instance belonging to a class Approach – Single Instruction Multiple Data (SIMD) – Divide Input among Processors – Update Parameters for Classification Locally – No Need for Load Balancing Results – Good Scaling – Beyond a certain threshold, communication starts to hinder performance 11 Parallel Model Based Clustering
| Main Idea – Potential Model – Derived from Gravitational Force Model in Euclidean Space – Parameters: » Gravitational Constant » Bandwidth Distance B (Max Distance from Center of Cluster) » δ Threshold Distance (avoids singularity problem) Execution – Calculate Potential at Each Point – Sort Points According to the Calculated Potential – Choose Cluster Centers by Iterating over the Sorted Array – If the distance between two points in the array > B, create a new cluster Results – Near-Optimal Solution 12 Clustering By Sorting Potential Values
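The execution steps can be sketched as below. Several details are assumptions, since the slide does not give formulas: the potential at a point is taken as the sum of -G / max(d, δ) over all other points, points are visited from lowest (densest) to highest potential, and a point starts a new cluster when its distance to the nearest already chosen center exceeds B. The function and parameter names are hypothetical.

```python
import math


def potential_clustering(points, bandwidth, delta=1e-3, g=1.0):
    """Cluster points by sorting gravitational-style potential values."""
    n = len(points)

    # Step 1: potential at each point (assumed form: -G / max(d, delta)).
    potentials = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i != j:
                d = math.dist(points[i], points[j])
                total -= g / max(d, delta)  # delta avoids division by ~0
        potentials.append(total)

    # Step 2: sort point indices, lowest potential (densest region) first.
    order = sorted(range(n), key=lambda i: potentials[i])

    # Step 3: walk the sorted array and choose cluster centers.
    centers = []          # indices of points chosen as centers
    labels = [None] * n   # cluster id per point
    for i in order:
        best, best_d = None, math.inf
        for cluster_id, c in enumerate(centers):
            d = math.dist(points[i], points[c])
            if d < best_d:
                best, best_d = cluster_id, d
        if best is None or best_d > bandwidth:
            centers.append(i)              # farther than B from every center
            labels[i] = len(centers) - 1
        else:
            labels[i] = best
    return [points[c] for c in centers], labels


points = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
centers, labels = potential_clustering(points, bandwidth=1.0)
```

The pairwise potential computation is quadratic in the number of points, so it is the natural part to distribute; the sort and the single pass over the sorted array are comparatively cheap.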
| Any Questions? 13
| Thank you for your attention! Vasilis Zois 14