Machine Learning in Simulation-Based Analysis
Li-C. Wang, Malgorzata Marek-Sadowska
University of California, Santa Barbara

Synopsis
 Simulation is a popular approach employed in many EDA applications
 In this work, we explore the potential of using machine learning to improve simulation efficiency
 While the work is developed in specific simulation contexts, the concepts and ideas should apply to generic simulation settings

Problem Setting

 Inputs to the simulation
– X: e.g. input vectors, waveforms, assembly programs, etc.
– C: e.g. device parameters to model statistical variations
 Output from the simulation
– Y: e.g. output vectors, waveforms, coverage points
 Goal of simulation analysis: to analyze the behavior of the mapping function f()
(Figure: the mapping function f() is the design under analysis; input random variables X and component random variables C feed it, and Y is the output behavior)

Practical View of The Problem
 For the analysis task, k essential outputs are enough
– k << n*m
 Fundamental problem: before simulation, how can we predict the inputs that will generate the essential outputs?
(Figure: inputs feed f(), whose outputs go to a checker that identifies the essential outputs; the question is how to predict the outcome of an input before its simulation)

First Idea: Iterative Learning
 Learning objective: to produce a learning model that predicts the "importance of an input"
(Figure: in each iteration, l input samples pass through learning & selection, which picks h potentially important input samples for simulation; a checker then classifies the outputs. The results include two types of information: (1) inputs that do not produce essential outputs, and (2) inputs that do)
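A minimal sketch of this loop in Python, assuming hypothetical `simulate` and `is_essential` callables that stand in for the simulator and checker, and using a random-forest classifier as one possible scoring model (the paper does not prescribe this particular model):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def iterative_learning(candidates, simulate, is_essential, h=10, iterations=5, seed=0):
    """Iteratively select the h un-simulated inputs judged most likely to be 'important'."""
    candidates = np.asarray(candidates, dtype=float)   # one feature vector per candidate input
    rng = np.random.default_rng(seed)
    unseen = np.ones(len(candidates), dtype=bool)
    selected = rng.choice(len(candidates), size=h, replace=False)   # bootstrap with random picks
    X_sim, y_sim = [], []
    for _ in range(iterations):
        for i in selected:
            unseen[i] = False
            out = simulate(candidates[i])               # expensive simulation call
            X_sim.append(candidates[i])
            y_sim.append(int(is_essential(out)))        # 1 = produced an essential output
        if len(set(y_sim)) < 2:                         # need both labels before training
            selected = rng.choice(np.flatnonzero(unseen), size=h, replace=False)
            continue
        model = RandomForestClassifier(random_state=seed).fit(X_sim, y_sim)
        scores = model.predict_proba(candidates)[:, 1]  # P(important) for every candidate
        scores[~unseen] = -1.0                          # never re-simulate an input
        selected = np.argsort(-scores)[:h]              # h potentially important inputs
    return X_sim, y_sim                                 # all simulated inputs and their labels
```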

Machine Learning Concepts

For More Information
 Tutorial on Data Mining in EDA & Test
– IEEE CEDA Austin chapter tutorial – April 2014 – April-2014.pdf
 Tutorial papers
– "Data Mining in EDA" – DAC 2014 (overview, including a list of references to our prior works)
– "Data Mining in Functional Debug" – ICCAD 2014
– "Data Mining in Functional Test Content Optimization" – ASP-DAC
 nVidia talk, Li-C. Wang, 3/27/15

How A Learning Tool Sees The Data
 A learning algorithm usually sees the dataset as a matrix
– Samples: examples to be reasoned on
– Features: aspects that describe a sample
– Vectors: the resulting vectors representing the samples
– Labels: the behavior to be learned (optional)
(Figure: a matrix whose rows are samples and whose columns are features, with one vector per sample and an optional label column)
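As a concrete illustration (made-up numbers, not data from this work), the matrix view is simply:

```python
import numpy as np

# Four samples (rows), three features (columns); each row is the vector for one sample.
X = np.array([[0.1, 1.2, 3.0],     # sample 1
              [0.4, 0.9, 2.7],     # sample 2
              [0.2, 1.1, 3.3],     # sample 3
              [0.5, 0.8, 2.5]])    # sample 4

# Optional labels: the behavior to be learned for each sample (e.g. +1 / -1 classes).
y = np.array([+1, -1, +1, -1])
```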

Supervised Learning
 Classification
– Labels represent classes (e.g. +1, -1: binary classes)
 Regression
– Labels are numerical values (e.g. frequencies)
(Figure: the sample-feature matrix together with a label column)
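A minimal illustration of both settings with scikit-learn on toy data (feature values and labels are made up):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LinearRegression

X = np.array([[0.1, 1.2], [0.4, 0.9], [0.2, 1.1], [0.5, 0.8]])  # toy feature vectors

# Classification: labels are classes (e.g. +1 / -1)
clf = SVC().fit(X, [+1, -1, +1, -1])
print(clf.predict(X))

# Regression: labels are numerical values (e.g. frequencies in GHz)
reg = LinearRegression().fit(X, [1.8, 2.1, 1.7, 2.3])
print(reg.predict(X))
```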

Unsupervised Learning
 Work on features
– Transformation
– Dimension reduction
 Work on samples
– Clustering
– Novelty detection
– Density estimation
(Figure: the same sample-feature matrix, but with no labels y)
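A corresponding toy sketch of the two kinds of unsupervised tasks (random data and illustrative parameters only):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

X = np.random.default_rng(0).normal(size=(100, 5))   # 100 unlabeled samples, 5 features

# Work on features: transformation / dimension reduction
X_2d = PCA(n_components=2).fit_transform(X)

# Work on samples: clustering
cluster_ids = KMeans(n_clusters=3, n_init=10).fit_predict(X_2d)
print(cluster_ids[:10])
```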

Semi-Supervised Learning
 Only have labels for i samples
– where i << m
 Can be solved as an unsupervised problem with supervised constraints
(Figure: the sample-feature matrix with labels available for only i of the m samples)
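One off-the-shelf way to exploit a few labels is label propagation over the unlabeled samples; this is only a sketch, since the "unsupervised with supervised constraints" formulation could be implemented in other ways:

```python
import numpy as np
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = np.full(50, -1)                 # -1 marks the unlabeled samples
y[:5] = [0, 1, 0, 1, 0]             # labels known for only i = 5 of the m = 50 samples

model = LabelSpreading().fit(X, y)  # spreads the few labels over the unlabeled data
print(model.transduction_[:10])     # inferred labels for the first 10 samples
```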

Fundamental Question
 A learning tool takes data as a matrix
 Suppose we want to analyze m samples
– waveforms, assembly programs, layout objects, etc.
 How do I feed the samples to the tool?
(Figure: Sample 1, Sample 2, …, Sample m feeding a learning tool that expects a matrix view)

Explicit Approach – Feature Encoding
 Need to develop two things:
– 1. Define a set of features
– 2. Develop a parsing and encoding method based on that set of features
 Does the learning result then depend on the features and the encoding method? (Yes!)
 That's why learning is all about "learning the features"
(Figure: samples pass through a parsing and encoding method defined by the chosen set of features)
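For example, a deliberately simple, hypothetical encoding of assembly-program samples might count instruction types; the feature set and parser below are illustrative, not the ones used in this work:

```python
from collections import Counter

FEATURES = ["add", "mul", "ld", "st", "branch"]   # 1. a hypothetical, hand-chosen feature set

def encode(program: str) -> list[int]:
    """2. Parse a sample and encode it as a vector over the chosen features."""
    ops = Counter(line.split()[0] for line in program.strip().splitlines() if line.strip())
    return [ops.get(f, 0) for f in FEATURES]

sample = "add r1, r2, r3\nld r4, 0(r1)\nadd r5, r4, r2\nbranch loop"
print(encode(sample))   # -> [2, 0, 1, 0, 1]
```

A different feature set or parser would produce different vectors, and therefore a different learning result.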

Implicit Approach – Kernel-Based Learning
 Define a similarity function (kernel function)
– It is a computer program that computes a similarity value between any two tests
 Most learning algorithms can work with such a similarity function directly
– No need for a matrix data input
(Figure: sample i and sample j go into the similarity function, which returns a similarity value)

Kernel-Based Learning
 A kernel-based learning algorithm does not operate on the samples directly
 As long as you have a kernel, the samples can be analyzed
– A vector form is no longer needed
 Does the learning result depend on the kernel? (Yes!)
 That's why learning is about learning a good kernel
(Figure: the learning algorithm queries the kernel function for each pair (x_i, x_j), receives a similarity measure, and produces the learned model)
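A minimal sketch of the idea: scikit-learn's SVC accepts a precomputed Gram (similarity) matrix, so any program that returns a similarity value for a pair of samples can drive the learner. The similarity function below is a placeholder, not the kernel used in this work:

```python
import numpy as np
from sklearn.svm import SVC

def similarity(a, b):
    """Placeholder kernel: any symmetric, positive semi-definite similarity works."""
    return float(np.exp(-np.sum((np.asarray(a) - np.asarray(b)) ** 2)))

samples = [[0.0, 0.1], [0.2, 0.0], [1.0, 1.1], [1.2, 0.9]]   # could be any objects the kernel understands
labels = [-1, -1, +1, +1]

# Gram matrix: pairwise similarity values queried from the kernel function
K = np.array([[similarity(a, b) for b in samples] for a in samples])
model = SVC(kernel="precomputed").fit(K, labels)

# To classify new samples, supply their similarities to the training samples
K_new = np.array([[similarity(x, b) for b in samples] for x in [[0.1, 0.1], [1.1, 1.0]]])
print(model.predict(K_new))
```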

Example: RTL Simulation Context

Recall: Iterative Learning
 Learning objective: to produce a learning model that predicts the "importance of an input"
(Figure: the same iterative flow as before – l input samples, learning & selection, h potentially important input samples, simulation, checker; results include inputs that do and do not produce essential outputs)

Iterative Learning
 Learning objective: to produce a learning model that predicts the "inputs likely to improve coverage"
(Figure: the flow instantiated for RTL – l assembly programs, learning & selection, h potentially important assembly programs, simulation, checker; results include (1) inputs that provide no new coverage and (2) inputs that provide new coverage)

Unsupervised: Novelty Detection
 Learning is used to model the simulated assembly programs
 Use the model to identify novel assembly programs
 A novel assembly program is likely to produce new coverage
(Figure: simulated, filtered, and novel assembly programs, with the boundary around the simulated ones captured by a one-class learning model)
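A sketch of this filtering step with a one-class SVM; the feature vectors, data distributions, and the nu parameter are illustrative, and the actual one-class model and encoding may differ:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
simulated = rng.normal(0.0, 1.0, size=(200, 8))      # encodings of already-simulated programs
candidates = rng.normal(0.5, 1.5, size=(50, 8))      # encodings of newly generated programs

# The one-class model captures a boundary around what has already been simulated
model = OneClassSVM(kernel="rbf", nu=0.1).fit(simulated)

# +1 = similar to what was simulated (filter out), -1 = novel (keep and simulate)
novel = candidates[model.predict(candidates) == -1]
print(f"{len(novel)} of {len(candidates)} candidates are novel")
```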

One Example
 Design: 64-bit dual-thread low-power processor (Power Architecture)
 Each test is an assembly program with 50 generated instructions
 With novelty detection, only 100 tests are needed; without novelty detection, 1690 tests are needed
 Rough saving: 94%

Another Example
 Each test is a 50-instruction assembly program
 Tests target the Complex FPU (33 instruction types)
– Simulation is carried out in parallel on a server farm
 With novelty detection, only 310 tests are required; without novelty detection, 6010 tests are required
 Rough saving: 95%

Example: SPICE Simulation Context (Including C Variations)

SPICE Simulation Context
 Mapping function f()
– SPICE simulation of a transistor netlist
 Inputs to the simulation
– X: input waveforms over a fixed period
– C: transistor size variations
 Output from the function
– Y: output waveforms
(Figure: the mapping function f() – the design under analysis – with input waveforms X and transistor size variations C producing output Y)

Recall: Iterative Learning
 In each iteration, we learn a model to predict the inputs likely to generate additional essential output waveforms
(Figure: l input waveforms, learning & selection, h potentially important waveforms, simulation, checker; results include inputs that do and do not produce essential outputs)

Illustration of Iterative Learning
 For an important input, continue the search in its neighboring region
 For an unimportant input, avoid the inputs in its neighboring region
(Figure: over iterations i = 0, 1, 2, sampled points s1…s6 in the X × C space map to outputs y1…y6 in the Y space; the search concentrates around inputs whose outputs proved important)

Idea: Adaptive Similarity Space
 In each iteration, similarity is measured in the space defined by the important inputs
 Instead of applying novelty detection, we apply clustering here to find "representative inputs"
(Figure: the space implicitly defined by the kernel k() adapts each iteration; three additional samples are selected around s1 and s2 in the adaptive similarity space)
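One way to pick "representative inputs" by clustering, assuming the candidate inputs have already been embedded as vectors in the adaptive similarity space (a sketch, not the exact procedure of the paper):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin

rng = np.random.default_rng(1)
inputs = rng.normal(size=(120, 6))        # candidate inputs, already embedded in the similarity space

k = 3                                     # number of additional samples to select this iteration
km = KMeans(n_clusters=k, n_init=10).fit(inputs)

# For each cluster, pick the actual input closest to the cluster center as its representative
representatives = pairwise_distances_argmin(km.cluster_centers_, inputs)
print(representatives)                    # indices of the k representative inputs to simulate next
```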

Initial Result – UWB-PLL
 We perform 4 sets of experiments – one set for each input-output pair
(Figure: the UWB-PLL with four input-output pairs I1-O1, I2-O2, I3-O3, I4-O4)

Initial Result
 Compared to random input selection
 For each case, the number of essential outputs is shown
 Learning enables simulating fewer inputs to obtain the same coverage of the essential outputs

Additional Result – Regulator
(Table: for input I and outputs O1 and O2, the number of simulated inputs (# IS's) and essential outputs (# EO's) under learning-based versus random selection)

Coverage Progress
(Plot: number of covered EI's versus number of applied tests, for the regulator I-O1 pair)
 With novelty detection, only 153 tests are required; without novelty detection, 388 tests are needed
 ~60% cost reduction

Additional Result – Low-Power, Low-Noise Amplifier
(Table: for the input-output pair I1-O1, the number of simulated inputs (# IS's) and essential outputs (# EO's) under learning-based versus random selection)

2nd Idea: Supervised Learning Approach
 In some applications, one may desire to predict the actual output (e.g. waveform) of an input, rather than just the importance of an input
 In this case, we need to apply a supervised learning approach (see the paper for more detail)
(Figure: input samples go to the learning model; if an input is judged predictable, the predictor produces the predicted output, otherwise the input is simulated to obtain the simulated output)
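One way to realize the "predictable?" decision is to use a model that reports its own uncertainty, e.g. a Gaussian-process regressor, and fall back to simulation when the predictive uncertainty is high. This is a sketch under that assumption; the actual predictor and criterion used in the paper may differ:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(2)
X_train = rng.uniform(-1, 1, size=(30, 3))            # inputs already simulated
y_train = np.sin(X_train).sum(axis=1)                 # stand-in for a scalar output feature

gp = GaussianProcessRegressor().fit(X_train, y_train)

def predict_or_simulate(x, simulate, threshold=0.1):
    """Return a predicted output if the model is confident, otherwise simulate."""
    mean, std = gp.predict(np.atleast_2d(x), return_std=True)
    if std[0] < threshold:                             # "predictable" -> use the predictor
        return mean[0], "predicted"
    return simulate(x), "simulated"                    # not predictable -> run the simulation
```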

Recall: Supervised Learning
 Fundamental challenge: each y is a complex object (e.g. a waveform)
 How do we build a supervised learning model in this case? (See the paper for discussion)
(Figure: the sample-feature matrix, now with waveforms in place of scalar labels)
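One common workaround, not necessarily the one taken in the paper, is to sample each waveform on a fixed time grid and treat the task as multi-output regression:

```python
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, size=(40, 3))                   # simulation inputs (feature vectors)
t = np.linspace(0.0, 1.0, 100)                         # fixed time grid for all waveforms
Y = np.array([np.sin(2 * np.pi * (t + x[0])) * x[1] for x in X])   # toy "waveforms", shape (40, 100)

# One regressor per time point predicts the sampled waveform as a 100-dimensional vector
model = MultiOutputRegressor(RandomForestRegressor(n_estimators=50)).fit(X, Y)
predicted_waveform = model.predict(X[:1])[0]           # predicted output waveform for one input
print(predicted_waveform.shape)                        # (100,)
```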

Conclusion
 Machine learning provides viable approaches for improving simulation efficiency in EDA applications
 Keep in mind: learning is about learning
– the features, or
– the kernel function
 The proposed learning approaches are generic and can be applied to diverse simulation contexts
 We are developing the theories and concepts
– (1) for learning the kernel
– (2) for predicting the complex output objects

Thank you. Questions?