Exploiting Input Features for Controlling Tunable Approximate Programs. Sherry Zhou, Department of Electronic Engineering, Tsinghua University.



Outline Introduction Input features for GEM and SGD Experiments on applying features to error model Experiments on applying new features to cost model Conclusion and future work

Overview of Control Method

Control Problem Formulation Given: – a tunable program, – a set of possible inputs I, and – a probability function p such that for any i ∈ I, p(i) is the probability of getting input i. For input i ∈ I, error bound ε > 0, and probability bound 0 ≤ π ≤ 1, find knob settings k1, k2 such that – Objective: minimize f_c(i, k1, k2) – Subject to: Pr[f_e(i, k1, k2) ≤ ε] ≥ π Feasible region – the set of (k1, k2) satisfying Pr[f_e(i, k1, k2) ≤ ε] ≥ π
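As a sketch of this formulation (the names `cost`, `err_prob`, and the grid of candidate knob values are illustrative, not from the talk), the controller can be viewed as a constrained search: keep the knob settings the error model deems feasible, then let the cost model pick the cheapest one.

```python
def choose_knobs(i, knob1_vals, knob2_vals, cost, err_prob, pi):
    """Pick the cheapest knob setting whose estimated probability of
    meeting the error bound is at least pi.

    cost(i, k1, k2)     -> predicted running time (cost model f_c)
    err_prob(i, k1, k2) -> estimated Pr[error <= eps] (error model)
    """
    feasible = [(k1, k2) for k1 in knob1_vals for k2 in knob2_vals
                if err_prob(i, k1, k2) >= pi]
    if not feasible:
        return None  # no setting satisfies the probability bound
    return min(feasible, key=lambda ks: cost(i, ks[0], ks[1]))
```

For example, with a toy cost model `k1 + k2` and error model `(k1 + k2) / 10`, the search returns the cheapest setting on the feasibility boundary.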

Error and Cost Model The error model is used to determine whether or not a knob setting is in the feasible region – currently input-agnostic. The cost model uses input features and knob settings to predict running time – currently input-specific, with simple features.

Problem We Focused On Find and exploit computationally cheap input features to: – make the error model input-aware and improve its accuracy – improve the cost model's accuracy

Two Benchmarks We Focused On (Diagram: input features feeding the GEM error model, SGD error model, GEM cost model, and SGD cost model)

INPUT FEATURES

Features for GEM Inputs are social networks. Simple features: – Number of nodes – Number of edges – Number of clusters

Features for GEM Sophisticated but still easy-to-compute features: – Leadership – Sub-graph edge ratio – Degree distribution

Feature: Leadership Small groups are usually created by one individual who then recruits others. L = ∑ (d_max − d_i) / ((n − 2)(n − 1)) Normalization: the maximum of the numerator, ∑((n − 1) − 1) = (n − 2)(n − 1), is attained by a star graph, so L = 1 for a perfect "leader" topology.
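The leadership score depends only on the degree sequence, so it is cheap to compute. A minimal sketch (assuming degrees are precomputed; the function name is illustrative):

```python
def leadership(degrees):
    """Centralization-style leadership score of a graph.

    L = sum(d_max - d_i) / ((n - 2)(n - 1)), normalized so that a
    star graph (one leader connected to everyone else) scores 1.0.
    """
    n = len(degrees)
    if n < 3:
        return 0.0  # formula is undefined for fewer than 3 nodes
    d_max = max(degrees)
    return sum(d_max - d for d in degrees) / ((n - 2) * (n - 1))
```

A star on 5 nodes (center degree 4, leaves degree 1) gives 1.0; a 5-cycle (all degrees 2) gives 0.0.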

Feature: Sub-Graph Edge Ratio The sub-graph is induced by the high-degree nodes. The feature is the ratio of the number of edges in the original graph to those in the sub-graph.
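A sketch of this feature, with two caveats: the talk does not say what fraction of nodes counts as "high-degree" (the `top_fraction` parameter is an assumption), and the sketch computes the bounded inverse ratio (sub-graph edges over original edges, always in [0, 1]) to avoid dividing by zero when the sub-graph has no edges; it carries the same information.

```python
from collections import Counter

def subgraph_edge_ratio(edges, top_fraction=0.1):
    """Edges among the top-degree nodes, as a fraction of all edges.

    top_fraction (assumed parameter): share of nodes treated as
    "high-degree" when inducing the sub-graph.
    """
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    k = max(1, int(len(deg) * top_fraction))
    top = {node for node, _ in deg.most_common(k)}
    sub_edges = sum(1 for u, v in edges if u in top and v in top)
    return sub_edges / len(edges)
```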

Feature: Degree Distribution Take the scaled degree distribution as a vector and use K-means to cluster the vectors into groups.
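One way to realize this, as a sketch only: the bin count, the scaling, and the hand-rolled Lloyd's iteration below are illustrative choices, not the talk's implementation.

```python
import random

def degree_histogram(degrees, bins=5):
    """Scaled degree distribution as a fixed-length feature vector."""
    d_max = max(degrees)
    hist = [0.0] * bins
    for d in degrees:
        hist[min(int(bins * d / (d_max + 1)), bins - 1)] += 1
    return [h / len(degrees) for h in hist]

def kmeans(vectors, k, iters=20, seed=0):
    """Minimal Lloyd's k-means; returns one cluster label per vector."""
    rng = random.Random(seed)
    centers = rng.sample(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iters):
        # assign each vector to its nearest center
        for i, v in enumerate(vectors):
            labels[i] = min(range(k), key=lambda c: sum(
                (a - b) ** 2 for a, b in zip(v, centers[c])))
        # recompute each center as the mean of its members
        for c in range(k):
            members = [vectors[i] for i in range(len(vectors))
                       if labels[i] == c]
            if members:
                centers[c] = [sum(col) / len(members)
                              for col in zip(*members)]
    return labels
```

Each input graph is summarized by `degree_histogram`, and `kmeans` groups the resulting vectors so that inputs in the same group can share a model.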

Other Features Explored (High Computational Cost) Hop count – defined as the minimal number of distinct links forming a path connecting two nodes. Clustering coefficient – defined as the ratio of the number of links y connecting the d_i neighbors of node i to the maximum possible, ½·d_i(d_i − 1). Bonding – measures triadic closure in a graph.
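The clustering-coefficient definition above can be sketched directly (an edge-list representation is assumed; computing it for every node is what makes the feature expensive on large graphs):

```python
from collections import defaultdict

def clustering_coefficient(edges, node):
    """C_i = y / (d_i * (d_i - 1) / 2), where y is the number of
    links among node i's d_i neighbors."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    nbrs = adj[node]
    d = len(nbrs)
    if d < 2:
        return 0.0  # coefficient undefined below 2 neighbors
    y = sum(1 for u, v in edges if u in nbrs and v in nbrs)
    return y / (d * (d - 1) / 2)
```

For a triangle, every node scores 1.0; for a path of two edges, the center scores 0.0.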

Features in SGD Inputs: training instances for an SVM classifier. Feature: ratio of the number of instances to the instance dimension. When the ratio < 1 – easier to find a solution. When the ratio > 1 – harder to find a solution.
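This feature is a one-liner over the training set (a row-major list-of-lists representation is assumed). Intuitively, with fewer instances than dimensions the problem is under-determined, so a separating hyperplane is easier to find.

```python
def instance_dim_ratio(X):
    """Ratio of the number of training instances to the feature
    dimension; < 1 suggests an easier (under-determined) problem."""
    return len(X) / len(X[0])
```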

APPLY FEATURES TO ERROR MODEL

Apply Features to Error Model Directly add features as nodes in the Bayes network – inference cost is high when the number of features is high. (Diagram: the network grows from knob1, knob2 → error to knob1, knob2, feature 1, feature 2 → error)

GEM: Add Features to Bayes Network Add features: – Number of nodes – Number of edges

GEM: Add Features to Bayes Network Add features: – leadership

GEM: Add Features to Bayes Network Add features: – Sub-graph edge ratio

GEM: Apply Features to Error Model Use features to classify inputs and, in each class, learn a different model. (Diagram: input → classifier → error model 1 … error model n)
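This classify-then-model scheme can be sketched as follows (the `classify` and `train_model` callables are placeholders for the feature-based classifier and the per-class error-model learner, which the talk does not spell out):

```python
def train_classwise_models(inputs, classify, train_model):
    """Group inputs by their feature-based class, then fit one
    error model per class."""
    groups = {}
    for inp in inputs:
        groups.setdefault(classify(inp), []).append(inp)
    return {cls: train_model(members) for cls, members in groups.items()}

def predict_error(models, classify, inp):
    """Route a new input to the model trained on its class."""
    return models[classify(inp)](inp)
```

The design choice here is that each per-class model sees only similar inputs, so an input-agnostic model within a class can still be accurate overall.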

Results: Only Use Number of Nodes The performance becomes worse.

Feature: Sub-Graph Edges

Features: Leadership The performance is slightly improved.

Clustering Inputs Based on Knob Errors

Future Work Features: – What kinds of features are most useful for improving the error model? Is there a principle? – Would a combination of features be better?

COST MODEL

Result: Add Leadership Feature to Cost Model in GEM (Chart: cost-model accuracy with features {number of nodes, number of edges} versus {leadership, number of nodes, number of edges})


SGD

Future Work What is the relationship between runtime and cache misses? Build an analytic model for runtime prediction. Analyze the difference between the analytic model and the M5-tree model. Provide some guidance to the M5 learning algorithm.

Conclusion Explored several features to improve the error and cost models. Experiments showed the error and cost models are improved.