Published by Diana Burns. Modified over 9 years ago.
1
B.Ramamurthy
2
[Figure: Data Analytics (Data Science). Labels: EDA; data intuition/understanding; big-data analytics; statistics; algorithms; discoveries/intelligence; statistical inference; decisions/answers/results]
3
Pipelines to prepare data
Three types of algorithms:
1. Data preparation algorithms such as sorting, workflows
2. Optimization algorithms: stochastic gradient descent, least squares…
3. Machine learning algorithms…
4
Machine learning algorithms come from artificial intelligence.
No underlying generative process.
Built to predict or classify something.
Three basic algorithms: linear regression, k-NN, k-means.
We already looked at linear regression as a case study for R/RStudio.
We will start with k-means…
5
K-means is unsupervised: there is no prior knowledge of the “right answer”.
The goal of the algorithm is to determine the definition of the right answer by finding clusters in the data.
Kinds of data: satisfaction survey data, incident report data.
Assume data {age, gender, income, state, household, size}; your goal is to segment the users.
K-means is the simplest of the clustering algorithms.
Let's understand k-means using an example.
6
{Age, income range, education, skills, social, paid work}
Let's take just the age: {23, 25, 24, 23, 21, 31, 32, 30, 31, 30, 37, 35, 38, 37, 39, 42, 43, 45, 43, 45}
Cluster this data using k-means.
Let's assume K = 3, that is, 3 groups.
Can you give a guess for the centroids?
Let's assume the initial values of the centroids are {21, 30, 40}.
First let's calculate by hand, and then use RStudio.
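The hand calculation above can also be traced in code. What follows is a minimal sketch in plain Python, not the RStudio session the slides refer to. It assumes the standard k-means loop (assign each value to its nearest centroid, then move each centroid to the mean of its group, and repeat until nothing changes). The function name `kmeans_1d` and the `max_iter` cap are illustrative choices, not part of the original slides.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Cluster 1-D values with k-means, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: put each value in the group of its nearest centroid.
        # On an exact tie (e.g. 35 between 30 and 40) the lower-indexed
        # centroid wins, because min() returns the first minimum it finds.
        clusters = [[] for _ in centroids]
        for x in values:
            nearest = min(range(len(centroids)),
                          key=lambda j: abs(x - centroids[j]))
            clusters[nearest].append(x)

        # Update step: move each centroid to the mean of its group.
        # An empty group keeps its previous centroid.
        new_centroids = [
            sum(group) / len(group) if group else centroids[j]
            for j, group in enumerate(clusters)
        ]

        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


ages = [23, 25, 24, 23, 21, 31, 32, 30, 31, 30,
        37, 35, 38, 37, 39, 42, 43, 45, 43, 45]

centroids, clusters = kmeans_1d(ages, [21, 30, 40])
for center, group in zip(centroids, clusters):
    print(center, group)
```

With the initial centroids {21, 30, 40}, this sketch settles after one update: the three groups are the five ages in the low 20s, the six ages from 30 to 35, and the nine ages from 37 to 45, with centroids 23.2, 31.5, and 41.0.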
7
Supervised ML: you know the “right answers”, or at least have data that is “labeled” (the training set).
One set of objects has already been classified or labeled (training set).
Another set of objects is yet to be labeled or classified (test set).
Your goal is to automate the process of labeling the test set.
The intuition behind k-NN is to find the most similar items, with similarity defined by their attributes, look at their existing labels, and assign the new object a label accordingly.
8
Let's look at an example:

Age   Loan (x1000)   Default
25    40             N
35    60             N
45    80             N
20                   N
35    120            N
52    18             Y
23    95             Y
40    62             Y
60    100            Y
48    220            Y
33    150            Y
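To make the k-NN intuition concrete, here is a minimal Python sketch that labels a new applicant using the table above. Several things in it are assumptions rather than part of the slides: the query point (age 48, loan 142) is hypothetical, k = 3 and Euclidean distance on the raw, unscaled attributes are illustrative choices, and the row with age 20 is left out because its loan value is missing in the source. The names `TRAINING` and `knn_predict` are likewise made up for this example.

```python
import math
from collections import Counter

# Training set from the slide: (age, loan in thousands, default label).
# The age-20 row is omitted because its loan value is missing in the source.
TRAINING = [
    (25, 40, "N"),
    (35, 60, "N"),
    (45, 80, "N"),
    (35, 120, "N"),
    (52, 18, "Y"),
    (23, 95, "Y"),
    (40, 62, "Y"),
    (60, 100, "Y"),
    (48, 220, "Y"),
    (33, 150, "Y"),
]


def knn_predict(query, k=3, training=TRAINING):
    """Label query = (age, loan) by majority vote among its k nearest rows."""
    age, loan = query
    # Sort the labeled rows by Euclidean distance to the query.
    by_distance = sorted(
        training,
        key=lambda row: math.hypot(row[0] - age, row[1] - loan),
    )
    # Count the labels of the k closest rows and return the most common one.
    votes = Counter(label for _, _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical applicant: age 48, loan of 142 (x1000).
print(knn_predict((48, 142), k=3))
```

For this query the three nearest rows are (33, 150, Y), (35, 120, N), and (60, 100, Y), so the majority vote is Y. Because loan amounts span a much wider range than ages, they dominate the unscaled distance; rescaling the attributes before computing distances is a common remedy.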