Distillation of Performance-Related Characteristics


Introduction
- We want a synthetic workload to maintain certain realistic properties or attributes
  - We want representative behavior (performance)
- Research question: How do we identify the needed attributes?
- We have a method...

Goal
Given a workload and a storage system, automatically find a set of attributes so that synthetic workloads with the same attribute values will have performance (CDF of response time) similar to the original.
[Diagram: Original Workload (trace records such as (R,1024,120932,124), (W,8192,120834,126)) -> Attribute List -> Synthetic Workload]
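The trace records in the diagram can be read as (read/write type, size, location, interarrival time), the four request parameters listed later in the talk. A minimal sketch of parsing such records, assuming that field order and comma-separated formatting (the exact trace format is not specified in the slides):

```python
from typing import NamedTuple

class Request(NamedTuple):
    """One trace record: assumed to be (R/W type, size, location, interarrival time)."""
    op: str            # 'R' or 'W'
    size: int          # request size in bytes
    location: int      # starting address/block
    interarrival: int  # time since the previous request

def parse_trace(lines):
    """Parse records like 'R,1024,120932,124' into Request tuples."""
    requests = []
    for line in lines:
        op, size, loc, iat = line.strip().split(",")
        requests.append(Request(op, int(size), int(loc), int(iat)))
    return requests

trace = parse_trace(["R,1024,120932,124", "W,8192,120834,126"])
```

Both the original and the synthetic workload are sequences of such records; only the attribute values are required to match.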

Why?
- Predicting the performance of complex disk arrays is extremely difficult.
  - Many unknown interactions to account for.
- A list of attributes is much easier to analyze than a large, bulky workload trace.
- A list of attributes tells us:
  - which patterns in a workload affect performance
  - how those patterns affect performance
- Possible uses of attribute lists:
  - one possible basis of "similarity" for workloads
  - a starting point for a performance prediction model

Basic Idea
- The attribute list may be different for every workload/storage-system pair
  - We require a general method of finding attributes
  - The method must require little human intervention
- Basic idea: add attributes until the performance of the original and synthetic workloads is similar.
[Diagram: Original Workload -> Attribute List -> Synthetic Workload, with trace records shown on both sides]
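The loop above can be sketched in a few lines. This is a toy, runnable illustration, not the thesis's actual algorithm: the performance model (service time proportional to size squared), the two candidate attributes, and the order in which they are tried are all invented for demonstration.

```python
import statistics

def performance(sizes):
    """Toy performance model: mean service time, where a request of size s
    takes s*s time units. Convexity makes it sensitive to the size
    *distribution*, not just the mean size."""
    return statistics.mean(s * s for s in sizes)

def preserve_mean(sizes):
    """Synthetic trace that preserves only the mean request size."""
    m = round(statistics.mean(sizes))
    return [m] * len(sizes)

def preserve_distribution(sizes):
    """Synthetic trace that preserves the full size distribution."""
    return sorted(sizes)

CANDIDATES = [("mean size", preserve_mean),
              ("size distribution", preserve_distribution)]

def distill(sizes, tolerance=0.05):
    """Add attributes until the synthetic workload's performance is
    within `tolerance` (relative error) of the original's."""
    target = performance(sizes)
    chosen = []
    for name, synthesize in CANDIDATES:
        chosen.append(name)
        synthetic = synthesize(sizes)
        if abs(performance(synthetic) - target) / target <= tolerance:
            break
    return chosen, synthetic

trace = [1024, 8192, 2048, 8192, 1024, 4096]
attrs, synth = distill(trace)
# For this trace, matching the mean alone is not enough, so the
# distribution attribute is added as well.
```

The real method works on all four request parameters at once and chooses which attribute to add next by evaluating attribute groups, as the following slides describe.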

Choosing Attributes Wisely
[Slide background: a large cloud of candidate attributes, shown before and after grouping: mean arrival time, arrival time distribution, Hurst parameter, COV of arrival time, mean request size, request size distribution, distribution of locations, read/write ratio, mean run length, Markov read/write model, jump distance, proximity "munge", and per-read/per-write variants such as mean read size, read request size distribution, mean (R,W) sizes, (R,W) size distributions, distribution of (R,W) locations, mean R/W run length, and R/W jump distance.]
- Problem:
  - Not all attributes are useful
  - We can't test all attributes
- Our solution:
  - Group the attributes
  - Evaluate entire groups at once
- Questions: How are the attributes grouped? How are they evaluated?

- A workload is a series of requests
  - (read/write type, size, location, interarrival time)
- Attributes measure one or more of these parameters:
  - mean request size measures request size
  - distribution of location measures location
  - burstiness measures interarrival time
  - distribution of read size measures request size and read/write type
- Attributes are grouped by the parameter(s) they measure
  - Location group = {mean location, distribution of location, locality, mean jump distance, mean run length, ...}
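The grouping rule above is mechanical once each attribute is tagged with the parameters it measures. A minimal sketch, with an invented (and deliberately small) attribute table:

```python
from collections import defaultdict

# Each attribute is tagged with the workload parameters it measures.
# The table below is illustrative, not the thesis's full attribute library.
ATTRIBUTES = {
    "mean request size":         {"size"},
    "request size distribution": {"size"},
    "distribution of location":  {"location"},
    "mean jump distance":        {"location"},
    "burstiness":                {"interarrival"},
    "read/write ratio":          {"type"},
    "distribution of read size": {"type", "size"},
}

def group_attributes(attributes):
    """Group attributes by the exact set of parameters they measure."""
    groups = defaultdict(list)
    for name, params in attributes.items():
        groups[frozenset(params)].append(name)
    return groups

groups = group_attributes(ATTRIBUTES)
# "distribution of read size" lands in its own {type, size} group,
# separate from the single-parameter size group.
```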

Evaluate an Attribute Group
- Add "every" attribute in the group at once and observe the change in performance.
- The amount of change in performance estimates the impact of the group's most effective attribute.
[Slide shows the "all" attribute applied to the (size, R/W) group, the request size group, and the location group.]

The “All” Attribute
- The list of values for a set of parameters contains every attribute in that group.
  - Attributes in that group will have the same value for both the original and synthetic workloads.
- The list represents “perfect knowledge” of the group.
[Slide shows the “all” Location attribute and the “all” (Location, Request Size) attribute.]
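One way to realize the "all" attribute is to copy the group's parameter values verbatim from the original trace (so every attribute of that group trivially matches) while scrambling the remaining parameters. This is an illustrative stand-in; in the real method the other parameters are synthesized from whatever attributes are currently in the list, not merely shuffled.

```python
import random

PARAMS = ("type", "size", "location", "interarrival")

def synthesize_with_all(trace, group_params, seed=0):
    """Build a synthetic trace carrying the 'all' attribute for
    `group_params`: those columns are copied exactly from the original,
    and the other columns are shuffled to break their patterns.
    `trace` is a list of dicts keyed by PARAMS."""
    rng = random.Random(seed)
    synthetic = [dict(req) for req in trace]
    for key in (p for p in PARAMS if p not in group_params):
        column = [req[key] for req in trace]
        rng.shuffle(column)
        for req, value in zip(synthetic, column):
            req[key] = value
    return synthetic
```

If the performance of this synthetic workload already matches the original, the group holds the information that matters; how closely it matches estimates how much the group's best attribute can buy.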

[Results slide] RMS/Mean: Original: .1877, Current: .0918
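The slides do not define the "RMS/Mean" figure, but one plausible reading is a demerit figure: the root-mean-square distance between the original and synthetic response-time CDFs, normalized by the mean original response time. A sketch under that assumption:

```python
import math
import statistics

def rms_over_mean(original_latencies, synthetic_latencies, quantiles=100):
    """Hypothetical demerit figure: RMS difference between the two
    latency CDFs, sampled at evenly spaced quantiles, divided by the
    mean original latency."""
    orig = sorted(original_latencies)
    synth = sorted(synthetic_latencies)

    def quantile(data, q):
        idx = min(int(q * len(data)), len(data) - 1)
        return data[idx]

    diffs = [(quantile(orig, i / quantiles) - quantile(synth, i / quantiles)) ** 2
             for i in range(quantiles)]
    return math.sqrt(statistics.mean(diffs)) / statistics.mean(orig)

# Identical workloads give a demerit of 0.0; the slide's drop from
# .1877 to .0918 would indicate the synthetic CDF moving closer to
# the original as attributes are added.
```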

Main Ideas
- A new method of automatically finding performance-related attributes:
  - Measure the completeness of a list by comparing the performance of synthetic workloads
  - A useful method of grouping attributes
  - An effective method of evaluating entire groups of attributes
  - Avoids evaluation of useless attributes
- www.cc.gatech.edu/~kurmasz

END OF SHORT TALK
- The rest of the slides are for the full talk.
- Current as of 26 January, 6:44 pm

Implement the Improvement
- Add an attribute from the chosen group
- This is the most time-consuming part
  - Only a few attributes are known, so we must develop most attributes from scratch
- This should get easier as the technique is used and the "attribute library" grows
  - Future work: we will eventually need an intelligent method of searching the library

Main Research Focus
1. How to automatically choose and apply the "additive" or "subtractive" method
2. How to automatically evaluate results and choose a single attribute group
- In practice, there are subtleties that are easily addressed by hand but difficult to generalize into an algorithm.

Current Progress
- We have a working application
  - Ambiguous cases are still handled by hand
  - The application stops and asks for a hint
  - The algorithm is being improved incrementally so that it needs fewer hints
- The application has been used on the OpenMail workload
