1
Machine Learning & Deep Learning
2
AI vs. Human
3
The new rise of AI technology: Artificial Intelligence, Machine Learning, Deep Learning
4
Course plan: lectures, student presentations, and evaluation. Lectures cover Machine Learning & Deep Learning basics; presentations cover the Deep Learning Book.
Lecture topics: machine learning basic concepts, linear regression, logistic regression (classification), multivariable (vector) linear/logistic regression, neural networks. Presentations: Deep Learning Book. Evaluation: midterm exam 45%, final exam 45%, attendance 10%.
5
Course textbook (for presentations): http://www.deeplearningbook.org/
Authors: Ian Goodfellow, Yoshua Bengio, and Aaron Courville
6
Machine Learning Basics
7
Basic concepts: What is ML? What is learning (supervised learning vs. unsupervised learning)? What is regression? What is classification?
8
Supervised/Unsupervised learning
learning with labeled examples - training set
9
Supervised/Unsupervised learning
10
Supervised Learning
11
Supervised Learning Training data set
12
Supervised Learning AlphaGo
13
Types of Supervised Learning
14
Predicting final exam score: regression
15
Predicting pass/fail: binary classification
16
Predicting letter grades (A, B, …): multi-class classification
17
Linear Regression
18
Predicting exam score: regression
19
Regression data
20
Linear Hypothesis
21
Linear Hypothesis
22
Which hypothesis is better?
23
Cost function: how well the line fits our (training) data
24
Cost function: how well the line fits our (training) data
25
Cost function
26
ML goal: minimize the cost function, i.e. find $\operatorname{argmin}_{W,b} \; \mathrm{cost}(W, b)$
27
Hypothesis and Cost
28
Simplified hypothesis
29
What does cost(W) look like?
30
What does cost(W) look like?
31
What does cost(W) look like?
32
How to minimize the cost
33
Gradient descent algorithm
34
Gradient descent algorithm
How does it work?
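A minimal sketch (in Python) of how gradient descent updates W for the simplified hypothesis H(x) = Wx with a mean-squared-error cost; the toy data, starting weight, learning rate, and step count are illustrative assumptions, not values from the slides.

# Gradient descent on cost(W) = (1/m) * sum((W*x_i - y_i)^2) for H(x) = W*x
x_data = [1.0, 2.0, 3.0]   # assumed toy training inputs
y_data = [1.0, 2.0, 3.0]   # assumed toy training labels (the best W is 1.0)
W = 5.0                    # arbitrary starting weight
alpha = 0.1                # learning rate (assumed)

for step in range(100):
    m = len(x_data)
    # d cost / d W = (2/m) * sum((W*x - y) * x)
    grad = (2.0 / m) * sum((W * x - y) * x for x, y in zip(x_data, y_data))
    W -= alpha * grad      # move against the gradient
print(W)                   # converges toward 1.0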
35
Cost function: formal definition
36
Cost function: formal definition
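As a concrete reading of the formal definition, here is a small Python helper that evaluates the mean-squared-error cost for a candidate (W, b); the sample points are made up for illustration.

def cost(W, b, xs, ys):
    # cost(W, b) = (1/m) * sum over i of (W*x_i + b - y_i)^2
    m = len(xs)
    return sum((W * x + b - y) ** 2 for x, y in zip(xs, ys)) / m

print(cost(1.0, 0.0, [1, 2, 3], [1, 2, 3]))   # 0.0: a perfect fit
print(cost(2.0, 0.0, [1, 2, 3], [1, 2, 3]))   # about 4.67: a worse line costs more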
37
Cost function: convex function
38
Cost function: convex function
39
Multi-variable linear regression
40
Predicting exam score: regression using two inputs (x1, x2)
41
Hypothesis
42
Cost function
43
Matrix notation
44
Matrix notation Hypothesis without b
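A small sketch of the matrix-notation hypothesis H(X) = XW (bias folded away, as on the slide); numpy and the sample numbers are assumptions used only for illustration.

import numpy as np

# Each row of X is one example with features (x1, x2, x3); W is a column of weights.
X = np.array([[73.0, 80.0, 75.0],
              [93.0, 88.0, 93.0]])      # assumed example inputs
W = np.array([[0.3], [0.4], [0.3]])     # assumed weights, one per feature

H = X @ W                               # hypothesis H(X) = XW, shape (2, 1)
print(H)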
45
Logistic regression
46
Classification
47
Classification
48
Classification
49
Classification using Linear Regression (decision threshold: 0.5)
50
Classification using Linear Regression (decision threshold: 0.5)
51
Classification using Linear Regression
Linear regression hypothesis H(x) = Wx + b: its output can be greater than 1 or less than 0, so it needs to be scaled into a value between 0 and 1.
52
Logistic regression: uses the sigmoid function (also called the logistic function)
53
Logistic hypothesis
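A minimal Python sketch of the logistic hypothesis H(x) = sigmoid(Wx + b), which squashes the linear score into the range 0 ~ 1; the specific W, b, and inputs are arbitrary illustrative values.

import math

def sigmoid(z):
    # logistic (sigmoid) function: maps any real z into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def H(x, W=1.5, b=-3.0):        # assumed weight and bias
    return sigmoid(W * x + b)   # logistic hypothesis

print(H(1.0))   # small linear score -> output near 0
print(H(4.0))   # large linear score -> output near 1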
54
Cost function
55
New cost function for logistic regression
56
New cost function for logistic regression
When y = 1: predicting H(x) = 1 gives cost = 0, while predicting H(x) = 0 gives cost = ∞. When y = 0: predicting H(x) = 0 gives cost = 0, while predicting H(x) = 1 gives cost = ∞.
57
New cost function for logistic regression
$C(H(x), y) = -y \log H(x) - (1 - y) \log\bigl(1 - H(x)\bigr)$
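The cost above, written out directly as a Python function; this is a minimal sketch, with a small epsilon added so the log stays defined when H(x) is exactly 0 or 1 (the epsilon is an implementation assumption, not part of the slide's formula).

import math

def logistic_cost(h, y, eps=1e-12):
    # C(H(x), y) = -y * log(H(x)) - (1 - y) * log(1 - H(x))
    return -y * math.log(h + eps) - (1 - y) * math.log(1 - h + eps)

print(logistic_cost(0.99, 1))   # confident correct prediction -> cost near 0
print(logistic_cost(0.01, 1))   # confident wrong prediction -> large cost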
58
Gradient descent algorithm
59
Multinomial classification
60
Logistic regression: the linear part $H_L(X) = WX$ produces a raw score $z = H_L(X)$, and $g(z)$ squashes that score so it lies between 0 and 1. The full hypothesis is $H_R(X) = g(H_L(X))$, so the output $\hat{Y}$ is in the range 0 ~ 1.
61
Multinomial classification
62
Multinomial classification
Three separate binary classifiers, each mapping $X$ through its own weights $W$ to a score $z$ and an output $\hat{Y}$ in the range 0 ~ 1.
63
Multinomial classification
One binary classifier per class: $X \to W \to z \to \hat{Y}_A$ for class A, $\hat{Y}_B$ for class B, and $\hat{Y}_C$ for class C.
64
Multinomial classification
The three classifiers can be stacked into a single weight matrix:
$\begin{bmatrix} w_{A1} & w_{A2} & w_{A3} \\ w_{B1} & w_{B2} & w_{B3} \\ w_{C1} & w_{C2} & w_{C3} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} \hat{Y}_A \\ \hat{Y}_B \\ \hat{Y}_C \end{bmatrix}$
65
Softmax function: the raw scores $Y = WX$ are converted to probabilities by $S(Y)$; for example, $S([2.0, 1.0, 0.1]) = [0.7, 0.2, 0.1]$, so $S(Y) = \hat{Y}$ is a probability distribution over the classes.
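A small softmax sketch that reproduces the slide's example, where the scores [2.0, 1.0, 0.1] become probabilities of roughly [0.7, 0.2, 0.1]; the numpy implementation details (max subtraction for stability) are assumptions.

import numpy as np

def softmax(y):
    # subtract the max score for numerical stability, then normalize the exponentials
    e = np.exp(y - np.max(y))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])
print(softmax(scores))   # about [0.66, 0.24, 0.10], the slide's 0.7 / 0.2 / 0.1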
66
Cost function: cross-entropy between the softmax output $S(Y)$, e.g. $[0.7, 0.2, 0.1]$, and the one-hot label $L = Y$ (the ground truth), e.g. $[1, 0, 0]$:
$D(S, L) = -\sum_i L_i \log(S_i)$
67
Cross-entropy cost function
$D(S, L) = \sum_i L_i \cdot \bigl(-\log(S_i)\bigr)$. For example, suppose the ground truth is $Y = L = [0\ 1]^T$.
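A minimal sketch of the cross-entropy D(S, L), checked against the slide's one-hot label [1, 0, 0] and prediction [0.7, 0.2, 0.1]; numpy and the small epsilon are assumptions.

import numpy as np

def cross_entropy(S, L, eps=1e-12):
    # D(S, L) = -sum_i L_i * log(S_i); eps avoids log(0)
    return -np.sum(L * np.log(S + eps))

S = np.array([0.7, 0.2, 0.1])   # softmax prediction
L = np.array([1.0, 0.0, 0.0])   # one-hot ground-truth label
print(cross_entropy(S, L))                           # -log(0.7), about 0.36: good prediction, small cost
print(cross_entropy(np.array([0.1, 0.2, 0.7]), L))   # -log(0.1), about 2.30: bad prediction, large cost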
68
Logistic cost vs. Cross entropy cost
$C(H(x), y) = -y \log H(x) - (1 - y) \log(1 - H(x))$ vs. $D(S, L) = -\sum_i L_i \log(S_i)$: the logistic cost is the two-class special case of the cross-entropy.
69
Final cost (loss) function
$\mathrm{Loss}(W) = \frac{1}{N} \sum_i D(S_i, L_i)$, averaged over the training data
70
Gradient descent: $w_i \leftarrow w_i - \alpha \, \dfrac{\partial \mathrm{Loss}(W)}{\partial w_i}$
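A minimal sketch of this update rule applied to the softmax + cross-entropy loss from the previous slides; it uses the standard analytic gradient (softmax output minus one-hot label), and the toy inputs, labels, learning rate, and step count are assumptions.

import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

X = np.array([[1.0, 2.0], [2.0, 1.0], [0.5, 0.5]])       # assumed toy inputs (N=3, 2 features)
L = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], float)   # one-hot labels for 3 classes
W = np.zeros((2, 3))                                      # weight matrix: features x classes
alpha = 0.1                                               # learning rate (assumed)

for step in range(500):
    S = softmax(X @ W)                                       # predictions S(XW)
    loss = -np.mean(np.sum(L * np.log(S + 1e-12), axis=1))  # Loss(W) = (1/N) * sum_i D(S_i, L_i)
    grad = X.T @ (S - L) / len(X)                            # dLoss/dW for softmax + cross-entropy
    W -= alpha * grad                                        # w_i <- w_i - alpha * dLoss/dw_i

print(loss)   # the loss decreases as W is updated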
71
Learning rate
72
Large learning rate: overshooting
73
Small learning rate: takes too long
74
Optimal learning rates? Observe the cost function and check that it goes down at a reasonable rate.
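One way to "observe the cost function" as the slide suggests, sketched in Python: log the cost every few steps and check that it decreases at a reasonable rate; the toy data and learning rate are assumptions, and trying different alpha values shows overshooting vs. slow convergence.

# Monitor cost(W) during training so a bad learning rate becomes visible.
x_data, y_data = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]   # assumed toy data (best W is 2.0)
W, alpha = 0.0, 0.05                                 # try different alpha values here

for step in range(1, 101):
    grad = (2.0 / len(x_data)) * sum((W * x - y) * x for x, y in zip(x_data, y_data))
    W -= alpha * grad
    if step % 20 == 0:
        cost = sum((W * x - y) ** 2 for x, y in zip(x_data, y_data)) / len(x_data)
        print(step, cost)   # cost should fall steadily; if it grows, alpha is too large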
75
Data preprocessing
76
Data preprocessing for gradient descent
(cost contour plot over the weights w1 and w2)
77
Data preprocessing for gradient descent
(cost contour plot over the weights w1 and w2)
78
Data preprocessing for gradient descent
79
Standardization
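A minimal standardization sketch: shift each feature to zero mean and scale it to unit standard deviation so gradient descent sees comparably scaled weights; numpy and the sample matrix are assumptions.

import numpy as np

X = np.array([[1.0,  9000.0],
              [2.0,  8000.0],
              [3.0, 10000.0]])   # assumed raw features on very different scales

X_std = (X - X.mean(axis=0)) / X.std(axis=0)   # x' = (x - mean) / std, per feature (column)
print(X_std)                                   # each column now has mean 0 and std 1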
80
Overfitting
81
Overfitting: the ML model is very good only on the training data set (memorization) and is not good on the test set or in real use.
82
Overfitting
83
Solutions for overfitting
More training data; reduce the number of features; regularization
84
Regularization: try not to have too-large numbers in the weights (the problem case is when a particular w_i becomes very large).
85
Regularization: $\mathrm{Loss}(W) = \frac{1}{N} \sum_i D\bigl(S(WX_i + b), L_i\bigr) + \lambda \sum_i w_i^2$, where $\lambda$ is the regularization strength (range: 0 ~ 1).
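A minimal sketch of adding the L2 term from the formula above to a loss value; the base loss, weights, and lambda are placeholder assumptions.

import numpy as np

def regularized_loss(data_loss, W, lam=0.01):
    # Loss(W) = (1/N) * sum_i D(S(W*X_i + b), L_i) + lambda * sum_i w_i^2
    return data_loss + lam * np.sum(W ** 2)

W = np.array([0.5, -3.0, 0.2])              # assumed weights; the large w_i is penalized most
print(regularized_loss(0.35, W, lam=0.1))   # 0.35 + 0.1 * (0.25 + 9.0 + 0.04) = 1.279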