CSE 5539 - Social Media & Text Analytics: NumPy Tutorial


CSE 5539 - Social Media & Text Analytics: NumPy Tutorial

Numpy
Core library for scientific computing with Python. Provides easy and efficient implementations of vector, matrix, and tensor (N-dimensional array) operations.
Pros:
- Automatically parallelizes operations across multiple CPUs
- Matrix and vector operations implemented in C, abstracted away from the user
- Fast slicing and dicing
- Easy to learn; the APIs are quite intuitive
- Open source, maintained by a large and active community
Cons:
- Does not exploit GPUs
- Append, concatenate, and iteration over individual elements are slow
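To make the "implemented in C" point concrete, here is a minimal sketch comparing a pure-Python loop with the equivalent vectorized NumPy operation (the array size and any timings you observe are illustrative, not from the slides):

    import numpy as np
    import time

    n = 1000000
    a = list(range(n))
    b = list(range(n))

    # Pure Python: one interpreter-level add per element
    start = time.time()
    c_loop = [x + y for x, y in zip(a, b)]
    loop_time = time.time() - start

    # NumPy: the whole addition runs in compiled C code
    xa = np.arange(n)
    xb = np.arange(n)
    start = time.time()
    c_vec = xa + xb
    vec_time = time.time() - start

    print("loop: %.4fs, numpy: %.4fs" % (loop_time, vec_time))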

This Tutorial
- Explore the numpy package, the ndarray object, and its attributes and methods
- Introduce Linear Regression via Ordinary Least Squares
- Implement OLS using numpy
Prerequisites:
- Python programming experience
- Laptop with Python, NumPy, and Jupyter installed (install note below)
- Your undivided attention for an hour!!
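If Python is present but NumPy or Jupyter is missing, one common way to install them (assuming pip is available on your machine) is:

    pip install numpy jupyter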

Part I: Getting Your Hands Dirty with NumPy

ndarray Object
- A multidimensional container of items of the same type and size
- Operations allowed: indexing, slicing, broadcasting, transposing, ...
- Can be converted to and from Python lists
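A short sketch of these properties (values arbitrary):

    import numpy as np

    a = np.array([[1, 2, 3], [4, 5, 6]])  # built from a nested list
    print(a.dtype)      # all items share one type, e.g. int64
    print(a.shape)      # (2, 3)
    print(a.T.shape)    # transposing: (3, 2)
    print(a[0, 1:])     # indexing and slicing: [2 3]
    print(a.tolist())   # back to a plain nested Python list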

Creating an ndarray Object
Note: all elements of an ndarray object are of the same type.
Source: http://web.stanford.edu/~ermartin/Teaching/CME193-Winter15/slides/Presentation5.pdf
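A few common constructors, plus the upcasting behavior the note refers to (all values illustrative):

    import numpy as np

    np.array([1, 2, 3])            # from a list
    np.zeros((2, 3))               # 2x3 array of 0.0
    np.ones(4)                     # [1. 1. 1. 1.]
    np.arange(0, 10, 2)            # [0 2 4 6 8]
    np.linspace(0, 1, 5)           # 5 evenly spaced points in [0, 1]

    # Mixed input types are upcast so all elements share one dtype
    np.array([1, 2.5])             # dtype float64: [1.  2.5]
    np.array([1, 2], dtype=float)  # force a dtype explicitly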

Vectors
Vectors are just 1d arrays.
Source: http://nicolas.pecheux.fr/courses/python/intro_numpy.pdf
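A small illustrative example:

    import numpy as np

    v = np.array([1.0, 2.0, 3.0])   # a vector is just a 1d array
    print(v.ndim, v.shape)          # 1 (3,)
    print(2 * v)                    # elementwise scaling: [2. 4. 6.]
    print(v + v)                    # elementwise sum: [2. 4. 6.]
    print(np.dot(v, v))             # inner product: 14.0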

Matrices
Matrices are just 2d arrays.
Source: http://nicolas.pecheux.fr/courses/python/intro_numpy.pdf
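A small illustrative example:

    import numpy as np

    M = np.array([[1, 2], [3, 4]])  # a matrix is just a 2d array
    print(M.ndim, M.shape)          # 2 (2, 2)
    print(M[1, 0])                  # row 1, column 0 -> 3
    print(M[:, 1])                  # second column: [2 4]
    print(M.T)                      # transpose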

Playing with ndarray Shapes
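A few shape manipulations worth trying (sizes arbitrary):

    import numpy as np

    a = np.arange(12)        # shape (12,)
    b = a.reshape(3, 4)      # shape (3, 4); same data, new view
    c = a.reshape(2, -1)     # -1 lets NumPy infer the size: (2, 6)
    print(b.ravel().shape)   # flatten back to (12,)
    print(b.T.shape)         # (4, 3)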

Array Broadcasting
Source: http://web.stanford.edu/~ermartin/Teaching/CME193-Winter15/slides/Presentation5.pdf
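A sketch of three common broadcasting cases, scalar, row, and column (shapes chosen for illustration):

    import numpy as np

    a = np.ones((3, 3))
    print(a + 5)                        # scalar broadcast to every element

    row = np.array([1, 2, 3])           # shape (3,)
    print(a + row)                      # row added to every row of a

    col = np.array([[10], [20], [30]])  # shape (3, 1)
    print(a + col)                      # column added to every column of a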

Matrix Operations
- Sum
- Product
- Logical
- Transpose
Remember: the usual '*' operator corresponds to the element-wise product, not the matrix product as we know it. Use np.dot instead.
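A small sketch of these operations, including the '*' vs np.dot distinction (matrices are arbitrary):

    import numpy as np

    A = np.array([[1, 2], [3, 4]])
    B = np.array([[5, 6], [7, 8]])

    print(A + B)          # elementwise sum
    print(A * B)          # elementwise product -- NOT the matrix product!
    print(np.dot(A, B))   # true matrix product
    print(A > 2)          # logical (boolean) array
    print(A.T)            # transpose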

Indexing and Slicing
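Some representative indexing and slicing patterns (arrays are illustrative):

    import numpy as np

    a = np.arange(10)          # [0 1 2 ... 9]
    print(a[2:7])              # [2 3 4 5 6]
    print(a[::2])              # every other element
    print(a[a > 5])            # boolean mask: [6 7 8 9]

    M = np.arange(12).reshape(3, 4)
    print(M[0])                # first row
    print(M[:, -1])            # last column
    print(M[1:, :2])           # bottom-left 2x2 block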

Statistics
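A few illustrative reductions, including the per-axis variants (values arbitrary):

    import numpy as np

    x = np.array([[1, 2, 3], [4, 5, 6]])
    print(x.sum())            # 21, over all elements
    print(x.mean(axis=0))     # column means: [2.5 3.5 4.5]
    print(x.max(axis=1))      # row maxima: [3 6]
    print(x.std())            # standard deviation of all elements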

Random Arrays
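A short sketch of the np.random interface (the seed and shapes are arbitrary choices):

    import numpy as np

    np.random.seed(0)                        # make results reproducible
    print(np.random.rand(2, 3))              # uniform [0, 1), shape (2, 3)
    print(np.random.randn(3))                # standard normal samples
    print(np.random.randint(0, 10, size=5))  # random ints in [0, 10)
    print(np.random.permutation(5))          # shuffled [0..4]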

Linear Algebra
NumPy's numpy.linalg module provides standard linear-algebra routines; a few examples are sketched below.
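A minimal tour of numpy.linalg (the matrix and vector are arbitrary, chosen to be invertible):

    import numpy as np

    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    b = np.array([9.0, 8.0])

    print(np.linalg.inv(A))       # matrix inverse
    print(np.linalg.solve(A, b))  # solve Ax = b (preferred over inv)
    print(np.linalg.det(A))       # determinant
    print(np.linalg.norm(b))      # Euclidean norm
    w, v = np.linalg.eig(A)       # eigenvalues and eigenvectors
    print(w)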

Other Useful Functions
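The transcript does not preserve which functions this slide listed; as an assumption, here are a few helpers that NumPy tutorials commonly highlight:

    import numpy as np

    a = np.array([3, 1, 4, 1, 5])
    print(np.sort(a))              # [1 1 3 4 5]
    print(np.argmax(a))            # index of the largest value: 4
    print(np.unique(a))            # distinct values: [1 3 4 5]
    print(np.where(a > 2, a, 0))   # keep values > 2, zero the rest
    print(np.concatenate([a, a]))  # join arrays (slow inside hot loops)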

Some Useful Links
Documentation: https://docs.scipy.org/doc/numpy-dev/reference/
Issues: https://github.com/numpy/numpy/issues
Questions: https://stackoverflow.com/questions/tagged/numpy

Part II: Building a Simple Regression Model

Linear Regression
Regression: put simply, given Y and X, find F(X) such that Y ≈ F(X).
Linear: Y ≈ XW + b.
Note: Y and X may be multidimensional.

Regression is Useful
Establishing relationships between quantities:
- Alcohol consumed and blood alcohol content
- Market factors and the price of stocks
- Driving speed and mileage
Prediction:
- Accelerometer data in your phone and your running speed
- Impedance/resistance and heart rate
- Tomorrow's stock price, given EOD prices and market factors

Linear Regression: Analytical Solution
We are using a linear model to approximate F(X):
$\hat{Y} = XW + b$
The error due to this approximation (aka the loss, $L$) is:
$L = \lVert Y - (XW + b) \rVert_2^2$
Let's define $\tilde{X} = [X \;\; \mathbf{1}]$ and $\tilde{W} = [W;\, b]$, folding the bias into the weight matrix. The loss function can be rewritten as:
$L = \lVert Y - \tilde{X}\tilde{W} \rVert_2^2$

Linear Regression: Analytical Solution
To make our approximation as good as possible, we want to minimize the loss $L$ by appropriately changing $\tilde{W}$. This can be achieved by setting the gradient of the loss to zero:
$\frac{\partial L}{\partial \tilde{W}} = -2\,\tilde{X}^T (Y - \tilde{X}\tilde{W}) = 0$
Solving the above equation gives:
$\tilde{W} = (\tilde{X}^T \tilde{X})^{-1} \tilde{X}^T Y$
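A minimal NumPy sketch of this closed-form solution on synthetic data (the data-generating weights and noise level are illustrative assumptions; np.linalg.solve is used rather than forming an explicit inverse, which is faster and more numerically stable):

    import numpy as np

    np.random.seed(0)

    # Synthetic data: y = 2*x0 - 3*x1 + 1 + noise (illustrative values)
    n = 100
    X = np.random.randn(n, 2)
    y = np.dot(X, np.array([2.0, -3.0])) + 1.0 + 0.1 * np.random.randn(n)

    # Fold the bias into the weights: X_tilde = [X, 1]
    X_tilde = np.hstack([X, np.ones((n, 1))])

    # Normal equations: W_tilde = (X^T X)^{-1} X^T y,
    # solved as a linear system instead of inverting the matrix
    W_tilde = np.linalg.solve(np.dot(X_tilde.T, X_tilde),
                              np.dot(X_tilde.T, y))
    print(W_tilde)   # approximately [ 2. -3.  1.]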

Analytical Solution: Discussion
Pros:
- Easy to understand and implement
- Involves matrix operations, which are easy to parallelize
- Converges to the "true" (loss-minimizing) solution
Cons:
- Involves matrix inversion, which is slow and memory intensive
- Needs the entire dataset in memory
- Correlated features make $\tilde{X}^T\tilde{X}$ singular (or near-singular), so it cannot be reliably inverted