Lecture 22: Compressed Linear Algebra for Large Scale ML

Slides by Memar

Announcement
Thanks to those who dropped by on Friday. To those who didn't come: I hope you are cooking something awesome. If you still want to meet, email me.
Interesting read: https://lemire.me/blog/2018/04/17/iterating-in-batches-over-data-structures-can-be-much-faster/

Today
- Compression
- Column encoding
- Compressed LA

1. Compression

Motivation: iterative ML algorithms make many passes over the training data, so performance degrades sharply once the feature matrix no longer fits in memory.

Solution: Fit more data into memory

Compression techniques
- Heavyweight (e.g., Gzip): good compression ratios, but decompression is slow.
- Lightweight: moderate compression ratios, but fast decompression.
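
A minimal sketch of the tradeoff, not taken from the lecture: zlib stands in for a heavyweight codec like Gzip, and a toy run-length encoder stands in for a lightweight scheme. The last two lines preview this lecture's theme: a lightweight format lets some operations run without decompressing at all.

```python
import zlib

# A column with few distinct values and long runs, stored as raw bytes.
column = bytes([1] * 50_000 + [2] * 30_000 + [3] * 20_000)

# Heavyweight: strong ratio, but every byte must be materialized again on read.
heavy = zlib.compress(column, 9)

# Lightweight: a toy run-length encoder producing (value, run_length) pairs.
def rle_encode(data):
    runs, i = [], 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1                      # extend the current run
        runs.append((data[i], j - i))
        i = j
    return runs

light = rle_encode(column)
print(len(column), "bytes raw,", len(heavy), "bytes zlib,", len(light), "RLE runs")

# The payoff: some operations run directly on the compressed form.
total = sum(value * length for value, length in light)
assert total == sum(column)             # column sum without decompressing
```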

2. Column encoding

Column-wise compression: Motivation
Column-wise compression exploits characteristics that are common in ML datasets, chiefly few distinct values per column and high cross-column correlations:
- Low column cardinalities
- Non-uniform sparsity across columns
- Tall and skinny matrices (the common shape: many more rows than columns)

Column encoding formats
- Uncompressed Columns (UC)
- Offset-List Encoding (OLE)
- Run-Length Encoding (RLE)

Uncompressed Columns (UC)
Store the column's values as a plain array, with no encoding; this is the fallback for columns where compression does not pay off.

Offset-List Encoding (OLE)
For each distinct value in the column, store the sorted list of row offsets at which that value occurs.
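
A minimal sketch of the OLE idea, using a Python dict in place of SystemML's actual byte-packed layout (see the data-layout slide below):

```python
from collections import defaultdict

def ole_encode(column):
    """Map each distinct value to the sorted list of rows where it occurs."""
    offsets = defaultdict(list)
    for row, value in enumerate(column):
        offsets[value].append(row)      # rows arrive in order, so lists stay sorted
    return dict(offsets)

def ole_decode(encoded, n_rows):
    """Invert the encoding back to a plain column."""
    column = [0] * n_rows
    for value, rows in encoded.items():
        for row in rows:
            column[row] = value
    return column

col = [7, 7, 0, 9, 7, 9, 0, 7]
enc = ole_encode(col)                   # {7: [0, 1, 4, 7], 0: [2, 6], 9: [3, 5]}
assert ole_decode(enc, len(col)) == col
```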

Run-Length Encoding (RLE)
For each distinct value, store its occurrences as runs, i.e., (starting offset, run length) pairs over the sorted offsets.
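
The same kind of sketch for RLE: per distinct value, runs of consecutive occurrences collapse to (start, length) pairs.

```python
from collections import defaultdict

def rle_encode_column(column):
    """Map each distinct value to its runs as (start_offset, run_length) pairs."""
    runs = defaultdict(list)
    i = 0
    while i < len(column):
        j = i
        while j < len(column) and column[j] == column[i]:
            j += 1
        runs[column[i]].append((i, j - i))
        i = j
    return dict(runs)

col = [4, 4, 4, 4, 8, 8, 4, 4]
print(rle_encode_column(col))           # {4: [(0, 4), (6, 2)], 8: [(4, 2)]}
```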

It’s all about tradeoffs! OLE wins when a value's occurrences are frequent but scattered; RLE wins when occurrences cluster into long runs; UC wins when the column has little redundancy to exploit.

Column co-coding
Correlated columns are grouped and encoded together as a single unit: each distinct tuple of values across the group becomes one dictionary entry with one offset (or run) list.
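
A minimal sketch of co-coding two columns, again using plain dicts rather than the real byte layout; the `cocode` helper is illustrative:

```python
from collections import defaultdict

def cocode(col_a, col_b):
    """Encode two columns as one group: one offset list per distinct value pair."""
    offsets = defaultdict(list)
    for row, pair in enumerate(zip(col_a, col_b)):
        offsets[pair].append(row)
    return dict(offsets)

# Perfectly correlated columns: the group has only 2 distinct tuples, so we
# store 2 offset lists instead of the 4 the columns would need separately.
a = [1, 1, 2, 2, 1, 1]
b = [10, 10, 20, 20, 10, 10]
print(cocode(a, b))                     # {(1, 10): [0, 1, 4, 5], (2, 20): [2, 3]}
```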

Combining compression methods
Each column group is encoded independently, so one compressed matrix can mix UC, OLE, and RLE groups, using whichever format is cheapest for each group.

Data layout: OLE
Offset lists are partitioned into segments that each cover 2^16 offsets, so every offset fits in 2 bytes relative to the start of its segment.
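
A rough sketch of the segmentation step only; the data structure here is illustrative, not SystemML's exact byte format:

```python
SEG = 2 ** 16   # offsets per segment, as in the CLA paper

def segment_offsets(offsets):
    """Group sorted row offsets by segment; local offsets fit in 2 bytes."""
    segments = {}
    for off in offsets:
        seg_id, local = divmod(off, SEG)
        segments.setdefault(seg_id, []).append(local)
    return segments

print(segment_offsets([3, 70_000, 70_001, 140_000]))
# {0: [3], 1: [4464, 4465], 2: [8928]}
```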

Data layout: RLE
Runs are stored as sorted pairs of 2-byte values: the starting offset (encoded as a delta from the end of the previous run) and the run length; overly long deltas or runs are split across multiple entries.

3. Compressed LA

Matrix-vector multiplication
(HOP = high-level operator, LOP = low-level operator.)
Key idea: operate directly on the compressed groups, computing the contribution of each distinct value (or value tuple) once and scattering it to every row where it occurs.
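
A minimal sketch of q = Xv over co-coded offset lists (the dict-based group encoding from the co-coding sketch above); `compressed_matvec` is a hypothetical helper, not the SystemML kernel:

```python
def compressed_matvec(groups, v, n_rows):
    """q = X v. groups: list of (col_indices, {value_tuple: [row_offsets]})."""
    q = [0.0] * n_rows
    for cols, offsets_by_tuple in groups:
        for values, rows in offsets_by_tuple.items():
            # One dot product per distinct tuple, reused for every matching row.
            u = sum(val * v[j] for val, j in zip(values, cols))
            for row in rows:
                q[row] += u
    return q

# A 4x3 matrix: group 1 co-codes columns 0 and 2, group 2 holds column 1 alone.
groups = [
    ((0, 2), {(1.0, 10.0): [0, 1], (2.0, 20.0): [2, 3]}),
    ((1,),   {(5.0,): [0, 2], (0.0,): [1, 3]}),
]
v = [1.0, 1.0, 1.0]
print(compressed_matvec(groups, v, 4))  # [16.0, 11.0, 27.0, 22.0]
```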

Compression planning
Planning happens before compression, on a small sample of rows: (1) estimate each column's compression ratio, (2) partition the compressible columns into co-coding groups, (3) choose an encoding format for each group.

Estimating column compression ratios
Ratios are estimated from a small row sample, using estimators for the number of distinct values, the number of non-zeros, and the average run length of each column.
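
A rough sketch of the shape of this computation. The CLA paper uses principled statistical estimators; this version just scales up naive statistics from one contiguous sample block, and the byte-cost constants are illustrative:

```python
import random

def estimate_column_stats(column, sample_size=1000, seed=0):
    """Naive estimates from one contiguous block (contiguity keeps runs intact)."""
    n = len(column)
    start = random.Random(seed).randrange(max(1, n - sample_size + 1))
    sample = column[start:start + sample_size]
    runs = 1 + sum(1 for x, y in zip(sample, sample[1:]) if x != y)
    return {
        "distinct": len(set(sample)),
        "est_runs": runs * n / len(sample),   # scale run count to full column
    }

def estimate_sizes(n_rows, stats):
    """Rough byte counts: 8-byte doubles (UC), 2-byte offsets (OLE),
    two 2-byte fields per run (RLE), plus a dictionary of distinct values."""
    dict_bytes = 8 * stats["distinct"]
    return {
        "UC":  8.0 * n_rows,
        "OLE": dict_bytes + 2.0 * n_rows,
        "RLE": dict_bytes + 4.0 * stats["est_runs"],
    }

col = [1] * 6000 + [2] * 3000 + [3] * 1000
stats = estimate_column_stats(col)
print(stats, estimate_sizes(len(col), stats))
```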

Partitioning columns into groups
Greedy strategy: start with each compressible column in its own group, then repeatedly merge the pair of groups whose co-coding gives the largest estimated size reduction; stop when no merge helps.
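
A minimal sketch of the greedy merge loop with a toy size model standing in for the sample-based estimator; `est_size` is illustrative, not the paper's cost function:

```python
from itertools import combinations

def est_size(cols, data):
    """Toy size model: dictionary grows with distinct tuples and group width,
    offsets are paid once per group (this is what makes co-coding attractive)."""
    tuples = {tuple(row[c] for c in cols) for row in data}
    return 8 * len(tuples) * len(cols) + 2 * len(data)

def greedy_grouping(n_cols, data):
    groups = [(c,) for c in range(n_cols)]
    while True:
        best, best_saving = None, 0
        for g1, g2 in combinations(groups, 2):
            merged = g1 + g2
            saving = est_size(g1, data) + est_size(g2, data) - est_size(merged, data)
            if saving > best_saving:
                best, best_saving = (g1, g2, merged), saving
        if best is None:
            return groups
        g1, g2, merged = best
        groups = [g for g in groups if g not in (g1, g2)] + [merged]

# Columns 0 and 1 are perfectly correlated; column 2 is independent.
data = [(1, 10, 5), (1, 10, 7), (2, 20, 5), (2, 20, 9)]
print(greedy_grouping(3, data))         # [(2,), (0, 1)]
```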

Choosing the encoding format for each group
Compare each group's estimated size under UC, OLE, and RLE and keep the smallest, falling back to UC when compression does not pay off.
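
A minimal sketch of the final selection step; `min_ratio` is a made-up knob, not a SystemML parameter:

```python
def choose_format(sizes, min_ratio=1.0):
    """sizes: estimated bytes per format, e.g. {"UC": ..., "OLE": ..., "RLE": ...}."""
    best = min(sizes, key=sizes.get)
    # Keep the group uncompressed unless compression actually wins.
    return best if sizes["UC"] / sizes[best] > min_ratio else "UC"

print(choose_format({"UC": 8000, "OLE": 2600, "RLE": 900}))    # RLE
print(choose_format({"UC": 8000, "OLE": 9500, "RLE": 9900}))   # UC
```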