“Compiler and Runtime Support for Enabling Generalized Reduction Computations on Heterogeneous Parallel Configurations” by Ravi, Ma, Chiu, & Agrawal

Presentation transcript:

“Compiler and Runtime Support for Enabling Generalized Reduction Computations on Heterogeneous Parallel Configurations” by Ravi, Ma, Chiu, & Agrawal. Presented by Julian Bui.

Outline
– Goals
– The focus problems
– Key ideas
– Results

Goals
– Programmability
– Performance
– Effective work distribution

The Focus Problems
– K-Means Clustering
– Principal Component Analysis (PCA)

[Figure slides: K-means clustering, steps 1 through 4]

K-Means Clustering
1. Randomly guess k cluster center locations.
2. Each data point determines which cluster center it is closest to.
3. Each center now "owns" a set of points.
4. Each cluster calculates the actual center of the points it owns.
5. The cluster's center point now becomes that computed center.
6. Repeat steps 2-5 until the cluster assignments no longer change between iterations (see the sketch below).
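A minimal sequential C++ sketch of the loop above, written in the "generalized reduction" style this paper targets: each data point independently folds its contribution into a small reduction object (per-cluster coordinate sums and counts). The 3-D point layout, the fixed iteration count, and all names are illustrative assumptions, not code from the paper.

```cpp
#include <cmath>
#include <vector>

struct Point { double x, y, z; };

// Step 2: find the index of the closest center to point p.
int nearestCenter(const Point& p, const std::vector<Point>& centers) {
    int best = 0;
    double bestDist = INFINITY;
    for (int k = 0; k < (int)centers.size(); ++k) {
        double dx = p.x - centers[k].x;
        double dy = p.y - centers[k].y;
        double dz = p.z - centers[k].z;
        double d = dx * dx + dy * dy + dz * dz;
        if (d < bestDist) { bestDist = d; best = k; }
    }
    return best;
}

// Step 6 is simplified here to a fixed iteration count.
void kmeans(const std::vector<Point>& pts, std::vector<Point>& centers, int iters) {
    const int K = (int)centers.size();
    for (int it = 0; it < iters; ++it) {
        // Reduction object: per-cluster coordinate sums and point counts.
        std::vector<Point> sum(K, Point{0.0, 0.0, 0.0});
        std::vector<long> count(K, 0);
        // Steps 2-3: each point contributes to the center that "owns" it.
        for (const Point& p : pts) {
            int k = nearestCenter(p, centers);
            sum[k].x += p.x; sum[k].y += p.y; sum[k].z += p.z;
            ++count[k];
        }
        // Steps 4-5: each center moves to the centroid of its points.
        for (int k = 0; k < K; ++k)
            if (count[k] > 0)
                centers[k] = { sum[k].x / count[k], sum[k].y / count[k],
                               sum[k].z / count[k] };
    }
}
```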

K-Means Clustering (update step). For every cluster k, the new center is the centroid of the cluster, i.e., the mean of the points it currently owns: c_k = (1/|C_k|) * Σ_{x ∈ C_k} x.

Principal Component Analysis (PCA)
Goals:
– Dimensionality reduction. INPUT: a set of M-dimensional points; OUTPUT: a set of D-dimensional points, where D << M.
– Extract patterns in the data (machine learning).
– Transform possibly correlated variables into a smaller number of uncorrelated variables.
– The principal components account for the greatest possible statistical variability.

PCA. The principal components are found by extracting the eigenvectors of the covariance matrix of the data.
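A compact C++ sketch of that pipeline, assuming a simple row-major layout (samples as rows, features as columns): center the data, form the covariance matrix, then extract the leading eigenvector. Power iteration is used here only for brevity (it recovers just the top component); this illustrates the math, not the paper's implementation, which would use a full symmetric eigensolver for all D components.

```cpp
#include <cmath>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Covariance of n samples (rows) over m features (columns).
Matrix covariance(const Matrix& data) {
    const int n = (int)data.size();
    const int m = (int)data[0].size();
    std::vector<double> mean(m, 0.0);
    for (const auto& row : data)
        for (int j = 0; j < m; ++j) mean[j] += row[j] / n;
    Matrix cov(m, std::vector<double>(m, 0.0));
    for (const auto& row : data)
        for (int i = 0; i < m; ++i)
            for (int j = 0; j < m; ++j)
                cov[i][j] += (row[i] - mean[i]) * (row[j] - mean[j]) / (n - 1);
    return cov;
}

// First principal component: the leading eigenvector of the covariance
// matrix, found by power iteration.
std::vector<double> topComponent(const Matrix& cov, int iters) {
    const int m = (int)cov.size();
    std::vector<double> v(m, 1.0);
    for (int it = 0; it < iters; ++it) {
        std::vector<double> w(m, 0.0);
        for (int i = 0; i < m; ++i)
            for (int j = 0; j < m; ++j) w[i] += cov[i][j] * v[j];
        double norm = 0.0;
        for (double x : w) norm += x * x;
        norm = std::sqrt(norm);
        for (int i = 0; i < m; ++i) v[i] = w[i] / norm;
    }
    return v;  // project each centered point onto v to reduce to 1-D
}
```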

[Figure slide: PCA – simple example]

[Figure slide: PCA – facial recognition]

So how does this work apply to those problems?
Code generator input:
– Reduction function (see the sketch below)
– Variable list (the data to be computed)
Code generator output:
– Host functions
– Kernel code
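A hypothetical illustration of the kind of sequential reduction function a user might supply for k-means; the generator would then emit the CUDA host functions and kernel around it. The signature and the variable-list comment below are invented for illustration and are not the paper's actual interface.

```cpp
// Hypothetical variable list (not the paper's syntax):
//   inputs  : point (one 3-D element), centers (k x 3, read-only), k
//   outputs : sum (k x 3 reduction object), count (k-element reduction object)
void reduce(const float* point, const float* centers, int k,
            float* sum, int* count) {
    // Find the closest center to this point.
    int best = 0;
    float bestDist = 1e30f;
    for (int c = 0; c < k; ++c) {
        float d = 0.0f;
        for (int j = 0; j < 3; ++j) {
            float diff = point[j] - centers[3 * c + j];
            d += diff * diff;
        }
        if (d < bestDist) { bestDist = d; best = c; }
    }
    // Fold this point into the reduction object; the generated kernel
    // would have to make these updates safe for concurrent threads.
    for (int j = 0; j < 3; ++j) sum[3 * best + j] += point[j];
    count[best] += 1;
}
```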

[Figure slide: system architecture]

Work Distribution
– Work sharing vs. work stealing
– Uniform vs. non-uniform chunk sizes (see the sketch below)
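A minimal C++ sketch of work sharing between a CPU worker and a GPU worker: both pull chunks from a shared index, and with non-uniform sizing the GPU claims larger chunks to amortize its transfer and launch overhead. The chunk sizes, the 10x ratio, and the worker bodies are illustrative assumptions, not the paper's tuned scheme.

```cpp
#include <algorithm>
#include <atomic>
#include <cstdio>
#include <thread>

std::atomic<long> nextIdx{0};        // shared queue head
const long TOTAL = 100000000L;       // total elements, e.g. 100M points

// Each worker repeatedly claims the next chunk until the work runs out.
void worker(const char* name, long chunkSize) {
    for (;;) {
        long begin = nextIdx.fetch_add(chunkSize);
        if (begin >= TOTAL) break;
        long end = std::min(begin + chunkSize, TOTAL);
        // ... process elements [begin, end) on this device ...
        std::printf("%s took [%ld, %ld)\n", name, begin, end);
    }
}

int main() {
    // Non-uniform chunking: the GPU claims chunks 10x the CPU's size
    // (the ratio here is an arbitrary illustrative choice).
    std::thread cpu(worker, "CPU", 1000000L);
    std::thread gpu(worker, "GPU", 10000000L);
    cpu.join();
    gpu.join();
}
```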

Experiments
– Machine: AMD Opteron 8350 with 8 cores and 16 GB of main memory; GeForce 9800 GTX with 512 MB of memory
– K-Means: K = 125, 6.4 GB file, 100M 3-D points
– PCA: 8.5 GB data set, covariance matrix width of 64

[Results slide: GPU chunk size and number of threads vs. performance]

[Results slide: K-Means, heterogeneous configuration, uniform vs. non-uniform chunk sizes]

[Results slide: PCA, heterogeneous configuration, uniform vs. non-uniform chunk sizes]

[Results slide: idle time]

[Results slide: work distribution – K-Means]

[Results slide: work distribution – PCA]