Hung-yi Lee (李宏毅) - Deep Learning Notes
Video link: 机器学习-李宏毅(2019) Machine Learning
Reference: 【李宏毅 深度学习19(完整版)国语】机器学习 深度学习; 李宏毅-深度学习Note (table of contents)
Contents
P1 ML Lecture 0 - Introduction of Machine Learning
P2 ML Lecture 1 - Regression - Case Study
P3 ML Lecture 1 - Regression - Demo
P4 ML Lecture 2 - Where does the error come from
P1 ML Lecture 0 - Introduction of Machine Learning
What is ML?
Input → ML (a program trained on a large amount of data) → Output
Learning Map
P2 ML Lecture 1 - Regression - Case Study
The three steps of Regression:
Model → Goodness of Function → Gradient Descent
1.1 Model
Define the prediction function yourself, i.e. pick a set of candidate functions (e.g. y = b + w·x).
1.2 Goodness of Function
A set of functions → Goodness of function f ← Training Data
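In the case study the goodness of a candidate f(x) = b + w·x is measured by the squared-error loss over the training pairs (xⁿ, ŷⁿ):

L(w, b) = \sum_n \left(\hat{y}^n - (b + w\,x^n)\right)^2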
1.3 Gradient Descent
Any differentiable loss function can be minimized with gradient descent (in general it may only reach a local minimum).
Note: Linear Regression has no local minima, only a global minimum, because the loss function L is convex.
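The update rule steps each parameter against its partial derivative with learning rate η; for the squared-error loss above:

w \leftarrow w - \eta\,\frac{\partial L}{\partial w}, \qquad b \leftarrow b - \eta\,\frac{\partial L}{\partial b}

\frac{\partial L}{\partial w} = \sum_n -2\left(\hat{y}^n - (b + w\,x^n)\right)x^n, \qquad \frac{\partial L}{\partial b} = \sum_n -2\left(\hat{y}^n - (b + w\,x^n)\right)

These are exactly the w_grad / b_grad sums accumulated in the demo code of P3 below.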
Q: Why does machine learning need separate training and testing phases? (Training fits the parameters; the held-out test set estimates how the model generalizes to unseen data.)
The mainstream paradigm is supervised learning.
The model can be improved by making the function more complex, e.g. adding higher-order terms (in the spirit of a Taylor expansion).
A more complex model has stronger representational power because it can express more cases: a higher-degree model contains the lower-degree one as a special case, as the sketch below illustrates.
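A minimal sketch of this point, using hypothetical data (not the lecture's dataset): the training error of a least-squares polynomial fit can only decrease as the degree grows, since a degree-k model is a degree-(k+1) model with the extra coefficient fixed to 0.

import numpy as np

# Hypothetical 1-D data, for illustration only.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 20)
y = 2.0 * x + 1.0 + rng.normal(0.0, 3.0, 20)

# Training error of polynomial fits of increasing degree.
for degree in (1, 2, 3, 5):
    coeffs = np.polyfit(x, y, degree)
    mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"degree {degree}: training MSE = {mse:.2f}")

Lower training error does not mean lower test error, which is exactly the bias/variance issue discussed in P4.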
P3 ML Lecture 1 - Regression - Demo
import numpy as np
import matplotlib.pyplot as plt

# Training data from the lecture demo.
x_data = [338., 333., 328., 207., 226., 25., 179., 60., 208., 606.]
y_data = [640., 633., 619., 393., 428., 27., 193., 66., 226., 1591.]

# Grid over the parameter space: x is the bias b, y is the weight w.
x = np.arange(-200, -100, 1)
y = np.arange(-5, 5, 0.1)
Z = np.zeros((len(x), len(y)))
X, Y = np.meshgrid(x, y)

# Average squared error of the model y = b + w*x at each grid point.
for i in range(len(x)):
    for j in range(len(y)):
        b = x[i]
        w = y[j]
        Z[j][i] = 0
        for n in range(len(x_data)):
            Z[j][i] = Z[j][i] + (y_data[n] - b - w * x_data[n]) ** 2
        Z[j][i] = Z[j][i] / len(x_data)
# Gradient descent from the starting point (b, w) = (-120, -4).
b = -120
w = -4
lr = 0.0000001  # this learning rate is very small, so progress is slow
iteration = 100

# Record the parameter trajectory for plotting.
b_history = [b]
w_history = [w]

for i in range(iteration):
    b_grad = 0.0
    w_grad = 0.0
    # dL/db and dL/dw summed over all training examples.
    for n in range(len(x_data)):
        b_grad = b_grad - 2.0 * (y_data[n] - b - w * x_data[n]) * 1.0
        w_grad = w_grad - 2.0 * (y_data[n] - b - w * x_data[n]) * x_data[n]
    # Step against the gradient.
    b = b - lr * b_grad
    w = w - lr * w_grad
    b_history.append(b)
    w_history.append(w)

# Loss surface plus the descent trajectory; the orange cross marks the
# optimum (b, w) ≈ (-188.4, 2.67).
plt.contourf(x, y, Z, 50, alpha=0.5, cmap=plt.get_cmap('jet'))
plt.plot([-188.4], [2.67], 'x', ms=12, markeredgewidth=3, color='orange')
plt.plot(b_history, w_history, 'o-', ms=3, lw=1.5, color='black')
plt.xlim(-200, -100)
plt.ylim(-5, 5)
plt.xlabel(r'$b$', fontsize=16)
plt.ylabel(r'$w$', fontsize=16)
plt.show()
# Same gradient descent, but with a much larger learning rate. With
# lr = 1 the updates overshoot and the trajectory diverges off the
# plot (overflow warnings are expected).
b = -120
w = -4
lr = 1
iteration = 100
b_history = [b]
w_history = [w]

for i in range(iteration):
    b_grad = 0.0
    w_grad = 0.0
    for n in range(len(x_data)):
        b_grad = b_grad - 2.0 * (y_data[n] - b - w * x_data[n]) * 1.0
        w_grad = w_grad - 2.0 * (y_data[n] - b - w * x_data[n]) * x_data[n]
    b = b - lr * b_grad
    w = w - lr * w_grad
    b_history.append(b)
    w_history.append(w)

plt.contourf(x, y, Z, 50, alpha=0.5, cmap=plt.get_cmap('jet'))
plt.plot([-188.4], [2.67], 'x', ms=12, markeredgewidth=3, color='orange')
plt.plot(b_history, w_history, 'o-', ms=3, lw=1.5, color='black')
plt.xlim(-200, -100)
plt.ylim(-5, 5)
plt.xlabel(r'$b$', fontsize=16)
plt.ylabel(r'$w$', fontsize=16)
plt.show()
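Neither run reaches the optimum: lr = 0.0000001 barely moves in 100 iterations, and lr = 1 diverges. One standard remedy (used in the original lecture demo, though not recorded in this note) is an AdaGrad-style per-parameter learning rate; a minimal sketch of just the update loop, reusing x_data / y_data from above:

import numpy as np  # x_data, y_data reused from the demo above

b, w = -120, -4
lr = 1
lr_b = 0.0  # accumulated squared gradient for b
lr_w = 0.0  # accumulated squared gradient for w
b_history, w_history = [b], [w]

for i in range(100000):  # more iterations; the adaptive steps shrink over time
    b_grad = 0.0
    w_grad = 0.0
    for n in range(len(x_data)):
        b_grad = b_grad - 2.0 * (y_data[n] - b - w * x_data[n]) * 1.0
        w_grad = w_grad - 2.0 * (y_data[n] - b - w * x_data[n]) * x_data[n]
    # Accumulate squared gradients and scale each step by their root.
    lr_b = lr_b + b_grad ** 2
    lr_w = lr_w + w_grad ** 2
    b = b - lr / np.sqrt(lr_b) * b_grad
    w = w - lr / np.sqrt(lr_w) * w_grad
    b_history.append(b)
    w_history.append(w)

# The trajectory now heads toward (b, w) ≈ (-188.4, 2.67) instead of diverging.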
P4 ML Lecture 2 - Where does the error come from?
Error comes from two sources: bias (deviation of the average prediction from the true value) plus variance (spread of the learned functions around their own mean).
Simple model → large bias + small variance
Complex model → small bias + large variance
Cannot even fit the training set + large bias → Underfitting
Fits the training set but has large error on the test set + large variance → Overfitting
For large bias:
Add more features as input
Use a more complex model
For large variance:
More data
Regularization (see the penalized loss below)
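Regularization, as defined in the lecture, adds a penalty λ Σᵢ wᵢ² to the squared-error loss so that functions with smaller weights (smoother, less sensitive to input noise) are preferred; the bias b is not penalized:

L = \sum_n \left(\hat{y}^n - \left(b + \sum_i w_i x_i^n\right)\right)^2 + \lambda \sum_i w_i^2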
Cross Validation:
Training set (split into a training part + a validation part) + public test set + private test set (sketch below)
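A minimal sketch of this split on hypothetical array data: the test portion is held out and never used while selecting the model, and the remaining data is rotated through training/validation folds.

import numpy as np

# Hypothetical dataset: 100 examples with 3 features.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 3))

# Hold out the test portion first (public/private in a competition setting).
idx = rng.permutation(len(data))
test, rest = data[idx[:20]], data[idx[20:]]

# 3-fold cross-validation on the rest: each fold serves once as the
# validation set while the remaining folds are used for training.
folds = np.array_split(rest, 3)
for k in range(3):
    val = folds[k]
    train = np.concatenate([folds[j] for j in range(3) if j != k])
    print(f"fold {k}: train={len(train)} val={len(val)} test={len(test)}")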