Hung-yi Lee (李宏毅) - Deep Learning Notes
Video link: 机器学习-李宏毅(2019) Machine Learning
Reference: 【李宏毅 深度学习19(完整版)国语】机器学习 深度学习; 李宏毅-深度学习Note (table of contents)
Contents
P1 ML Lecture 0 - Introduction of Machine Learning
P2 ML Lecture 1 - Regression - Case Study
P3 ML Lecture 1 - Regression - Demo
P4 ML Lecture 2 - Where does the error come from
P1 ML Lecture 0 - Introduction of Machine Learning
What is ML?
Input → ML (a program trained on a large amount of data) → Output
Learning Map
P2 ML Lecture 1 - Regression - Case Study
The three steps of Regression:
Model → Goodness of Function → Gradient Descent
1.1 Model
Define the prediction function yourself, i.e. pick a set of candidate functions (e.g. y = b + w·x).
1.2 Goodness of Function
A set of functions → Goodness of function f ← Training Data
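In the case study the goodness of a candidate f(x) = b + w·x is measured by the squared-error loss over the training pairs (xⁿ, ŷⁿ):

L(w, b) = \sum_n \left(\hat{y}^n - (b + w\,x^n)\right)^2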
1.3 Gradient Descent
Any differentiable loss function can be minimized with gradient descent (in general it may only reach a local minimum).
Note: Linear Regression has no local minima, only a global minimum, because the loss function L is convex.
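The update rule steps each parameter against its partial derivative with learning rate η; for the squared-error loss above:

w \leftarrow w - \eta\,\frac{\partial L}{\partial w}, \qquad b \leftarrow b - \eta\,\frac{\partial L}{\partial b}

\frac{\partial L}{\partial w} = \sum_n -2\left(\hat{y}^n - (b + w\,x^n)\right)x^n, \qquad \frac{\partial L}{\partial b} = \sum_n -2\left(\hat{y}^n - (b + w\,x^n)\right)

These are exactly the w_grad / b_grad sums accumulated in the demo code of P3 below.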
Q: Why does machine learning need separate training and testing phases? (Training fits the parameters; the held-out test set estimates how the model generalizes to unseen data.)
The mainstream paradigm is supervised learning.
The model can be improved by making the function more complex, e.g. adding higher-order terms (in the spirit of a Taylor expansion).
A more complex model has stronger representational power because it can express more cases: a higher-degree model contains the lower-degree one as a special case, as the sketch below illustrates.
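A minimal sketch of this point, using hypothetical data (not the lecture's dataset): the training error of a least-squares polynomial fit can only decrease as the degree grows, since a degree-k model is a degree-(k+1) model with the extra coefficient fixed to 0.

import numpy as np

# Hypothetical 1-D data, for illustration only.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 20)
y = 2.0 * x + 1.0 + rng.normal(0.0, 3.0, 20)

# Training error of polynomial fits of increasing degree.
for degree in (1, 2, 3, 5):
    coeffs = np.polyfit(x, y, degree)
    mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"degree {degree}: training MSE = {mse:.2f}")

Lower training error does not mean lower test error, which is exactly the bias/variance issue discussed in P4.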
P3 ML Lecture 1 - Regression - Demo
import numpy as np
import matplotlib.pyplot as plt

# Training data from the lecture demo.
x_data = [338., 333., 328., 207., 226., 25., 179., 60., 208., 606.]
y_data = [640., 633., 619., 393., 428., 27., 193., 66., 226., 1591.]

# Grid over the parameter space: x is the bias b, y is the weight w.
x = np.arange(-200, -100, 1)
y = np.arange(-5, 5, 0.1)
Z = np.zeros((len(x), len(y)))
X, Y = np.meshgrid(x, y)

# Average squared error of the model y = b + w*x at each grid point.
for i in range(len(x)):
    for j in range(len(y)):
        b = x[i]
        w = y[j]
        Z[j][i] = 0
        for n in range(len(x_data)):
            Z[j][i] = Z[j][i] + (y_data[n] - b - w * x_data[n]) ** 2
        Z[j][i] = Z[j][i] / len(x_data)
# Gradient descent from the starting point (b, w) = (-120, -4).
b = -120
w = -4
lr = 0.0000001  # this learning rate is very small, so progress is slow
iteration = 100

# Record the parameter trajectory for plotting.
b_history = [b]
w_history = [w]

for i in range(iteration):
    b_grad = 0.0
    w_grad = 0.0
    # dL/db and dL/dw summed over all training examples.
    for n in range(len(x_data)):
        b_grad = b_grad - 2.0 * (y_data[n] - b - w * x_data[n]) * 1.0
        w_grad = w_grad - 2.0 * (y_data[n] - b - w * x_data[n]) * x_data[n]
    # Step against the gradient.
    b = b - lr * b_grad
    w = w - lr * w_grad
    b_history.append(b)
    w_history.append(w)

# Loss surface plus the descent trajectory; the orange cross marks the
# optimum (b, w) ≈ (-188.4, 2.67).
plt.contourf(x, y, Z, 50, alpha=0.5, cmap=plt.get_cmap('jet'))
plt.plot([-188.4], [2.67], 'x', ms=12, markeredgewidth=3, color='orange')
plt.plot(b_history, w_history, 'o-', ms=3, lw=1.5, color='black')
plt.xlim(-200, -100)
plt.ylim(-5, 5)
plt.xlabel(r'$b$', fontsize=16)
plt.ylabel(r'$w$', fontsize=16)
plt.show()
# Same gradient descent, but with a much larger learning rate. With
# lr = 1 the updates overshoot and the trajectory diverges off the
# plot (overflow warnings are expected).
b = -120
w = -4
lr = 1
iteration = 100
b_history = [b]
w_history = [w]

for i in range(iteration):
    b_grad = 0.0
    w_grad = 0.0
    for n in range(len(x_data)):
        b_grad = b_grad - 2.0 * (y_data[n] - b - w * x_data[n]) * 1.0
        w_grad = w_grad - 2.0 * (y_data[n] - b - w * x_data[n]) * x_data[n]
    b = b - lr * b_grad
    w = w - lr * w_grad
    b_history.append(b)
    w_history.append(w)

plt.contourf(x, y, Z, 50, alpha=0.5, cmap=plt.get_cmap('jet'))
plt.plot([-188.4], [2.67], 'x', ms=12, markeredgewidth=3, color='orange')
plt.plot(b_history, w_history, 'o-', ms=3, lw=1.5, color='black')
plt.xlim(-200, -100)
plt.ylim(-5, 5)
plt.xlabel(r'$b$', fontsize=16)
plt.ylabel(r'$w$', fontsize=16)
plt.show()
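Neither run reaches the optimum: lr = 0.0000001 barely moves in 100 iterations, and lr = 1 diverges. One standard remedy (used in the original lecture demo, though not recorded in this note) is an AdaGrad-style per-parameter learning rate; a minimal sketch of just the update loop, reusing x_data / y_data from above:

import numpy as np  # x_data, y_data reused from the demo above

b, w = -120, -4
lr = 1
lr_b = 0.0  # accumulated squared gradient for b
lr_w = 0.0  # accumulated squared gradient for w
b_history, w_history = [b], [w]

for i in range(100000):  # more iterations; the adaptive steps shrink over time
    b_grad = 0.0
    w_grad = 0.0
    for n in range(len(x_data)):
        b_grad = b_grad - 2.0 * (y_data[n] - b - w * x_data[n]) * 1.0
        w_grad = w_grad - 2.0 * (y_data[n] - b - w * x_data[n]) * x_data[n]
    # Accumulate squared gradients and scale each step by their root.
    lr_b = lr_b + b_grad ** 2
    lr_w = lr_w + w_grad ** 2
    b = b - lr / np.sqrt(lr_b) * b_grad
    w = w - lr / np.sqrt(lr_w) * w_grad
    b_history.append(b)
    w_history.append(w)

# The trajectory now heads toward (b, w) ≈ (-188.4, 2.67) instead of diverging.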
P4 ML Lecture 2 - Where does the error come from?
Error comes from two sources: bias (deviation of the average prediction from the true value) plus variance (spread of the learned functions around their own mean).
Simple model → large bias + small variance
Complex model → small bias + large variance
Cannot even fit the training set + large bias → Underfitting
Fits the training set but has large error on the test set + large variance → Overfitting
For large bias:
Add more features as input
Use a more complex model
For large variance:
More data
Regularization (see the penalized loss below)
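Regularization, as defined in the lecture, adds a penalty λ Σᵢ wᵢ² to the squared-error loss so that functions with smaller weights (smoother, less sensitive to input noise) are preferred; the bias b is not penalized:

L = \sum_n \left(\hat{y}^n - \left(b + \sum_i w_i x_i^n\right)\right)^2 + \lambda \sum_i w_i^2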
Cross Validation:
Training set (split into a training part + a validation part) + public test set + private test set (sketch below)
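A minimal sketch of this split on hypothetical array data: the test portion is held out and never used while selecting the model, and the remaining data is rotated through training/validation folds.

import numpy as np

# Hypothetical dataset: 100 examples with 3 features.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 3))

# Hold out the test portion first (public/private in a competition setting).
idx = rng.permutation(len(data))
test, rest = data[idx[:20]], data[idx[20:]]

# 3-fold cross-validation on the rest: each fold serves once as the
# validation set while the remaining folds are used for training.
folds = np.array_split(rest, 3)
for k in range(3):
    val = folds[k]
    train = np.concatenate([folds[j] for j in range(3) if j != k])
    print(f"fold {k}: train={len(train)} val={len(val)} test={len(test)}")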