
Ex2_Machine Learning_Andrew Ng Course Assignment (Python): Logistic Regression


0. Prerequisites

This section imports the libraries used throughout the exercise.

# Programming exercise 2 for week 3

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import ex2_function as func

00. Self-created Functions

This section lists some of the self-created functions that later sections call through func.

# Sigmoid function (activation function)
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Feature mapping
def featureMapping(x1, x2, power):
    data = {}
    for i in np.arange(power + 1):
        for p in np.arange(i + 1):
            data["f{}{}".format(i - p, p)] = np.power(x1, i - p) * np.power(x2, p)
    return pd.DataFrame(data)
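
A quick way to see what `featureMapping` returns is to run it on a couple of toy points. The sketch below repeats the function as listed above; the input values are made up for illustration.

```python
import numpy as np
import pandas as pd

def featureMapping(x1, x2, power):
    # Build every monomial x1^(i-p) * x2^p with total degree i <= power
    data = {}
    for i in np.arange(power + 1):
        for p in np.arange(i + 1):
            data["f{}{}".format(i - p, p)] = np.power(x1, i - p) * np.power(x2, p)
    return pd.DataFrame(data)

# Two made-up points: (1, 3) and (2, 4)
x1 = np.array([1.0, 2.0])
x2 = np.array([3.0, 4.0])

mapped = featureMapping(x1, x2, power=2)
print(list(mapped.columns))
print(mapped["f11"].tolist())  # x1 * x2 for each point
```

Each column is named after the exponents of `x1` and `x2`, so `f11` holds `x1 * x2`. With `power=6` the same loop produces 28 columns, including the constant column `f00`.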

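Note that θ0 needs special handling: the bias term is not regularized.

# Gradient of regularized logistic regression
def gradientReg(theta, X, y, l=1):
    # Regularization term (l / m) * theta, with the theta[0] entry zeroed out
    addition = l * theta / len(X)
    addition[0] = 0
    return gradient(theta, X, y) + addition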
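
Later sections also call `cost`, `gradient`, and `costReg`, which are not listed in this article. A minimal sketch is given below, assuming the `(theta, X, y)` argument order implied by the optimizer calls and the standard cross-entropy cost from the course. The author's own `ex2_function` module may differ in detail, and the toy data is made up.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(theta, X, y):
    # Mean cross-entropy between labels y and predicted probabilities h
    h = sigmoid(X @ theta)
    return np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h))

def gradient(theta, X, y):
    # Partial derivatives of cost() with respect to each theta[j]
    return X.T @ (sigmoid(X @ theta) - y) / len(X)

def costReg(theta, X, y, l=1):
    # L2 penalty that skips theta[0], consistent with gradientReg
    penalty = l / (2 * len(X)) * np.sum(np.power(theta[1:], 2))
    return cost(theta, X, y) + penalty

def gradientReg(theta, X, y, l=1):
    addition = l * theta / len(X)
    addition[0] = 0
    return gradient(theta, X, y) + addition

# Toy data (made up): intercept column plus one feature
X_toy = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y_toy = np.array([0.0, 0.0, 1.0, 1.0])
theta_toy = np.array([0.5, -0.25])

print(cost(np.zeros(2), X_toy, y_toy))  # equals ln 2 because every h is 0.5
print(costReg(theta_toy, X_toy, y_toy, l=1) - cost(theta_toy, X_toy, y_toy))
```

The second printed value is just the penalty term, which depends only on `theta_toy[1]` because the bias entry is excluded.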

1. Logistic Regression

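Implement logistic regression.

  • Pay particular attention to the differences between np.array() and np.asmatrix() and to when each applies.
  • The helper functions called here are described in the "Self-created Functions" section at the top of the article.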
# 1. Logistic Regression

path_data1 = '../data/ex2data1.txt'
df1 = pd.read_csv(path_data1, names=['Exam1', 'Exam2', 'Admitted'])

1.1 Visualization

# 1.1 Visualizing the data

# Separate the admitted students and the other
# The label column is read as integers, so compare against 1 and 0 rather than the strings '1' and '0'
positive = df1[df1['Admitted'] == 1]
negative = df1[df1['Admitted'] == 0]

# Plot the figure
fig, fig_data = plt.subplots(figsize=[10, 6])
fig_data.scatter(positive['Exam1'], positive['Exam2'], c='b', label='Admitted')
fig_data.scatter(negative['Exam1'], negative['Exam2'], c='r', label='Not admitted', marker='x')
fig_data.set_xlabel('Exam1 score')
fig_data.set_ylabel('Exam2 score')
fig_data.set_title('Exams scores and admission results')

1.2 Sigmoid function

# 1.2 Implementation

# 1.2.1 Sigmoid function (activation function)
x1 = np.arange(-10, 10, 0.1)
plt.plot(x1, func.sigmoid(x1), c='r')

1.3 Cost function & Gradient

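# 1.2.2 Cost function and gradient
# Add the intercept column of ones (only once, so the code can be re-run safely)
if 'ONES' not in df1.columns:
    df1.insert(0, 'ONES', 1)
X = np.array(df1.iloc[:, : -1])  # note: np.array is used here rather than np.matrix
y = np.array(df1.iloc[:, -1])
theta = np.zeros(X.shape[1])
# theta goes first, matching the argument order used by the optimizer calls in section 1.4
print(func.cost(theta, X, y))
print(func.gradient(theta, X, y))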
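
For reference, the standard logistic regression cost and gradient from the course can be written as follows. `func.cost` and `func.gradient` are assumed to implement these formulas, since the functions themselves are not listed in this article.

```latex
h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}

J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right]

\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
```

With the initial θ = 0, every prediction is h = 0.5, so under these formulas the cost is ln 2 ≈ 0.693 regardless of the data, which makes a handy sanity check.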

1.4 Learning θ using “fminunc”




# 1.2.3 Learning parameters using "fminunc": learn the parameters with an optimization function
import scipy.optimize as opt

# opt.fmin_tnc()
res1 = opt.fmin_tnc(func=func.cost, x0=theta, args=(X, y), fprime=func.gradient)
# opt.minimize()
res2 = opt.minimize(fun=func.cost, x0=theta, args=(X, y), method='TNC', jac=func.gradient)

print(func.cost(res1[0], X, y))
print(func.cost(res2['x'], X, y))


# opt.fmin_tnc()
(array([-25.1613186 ,   0.20623159,   0.20147149]), 36, 0)

# opt.minimize()
fun: 0.20349770158947475
     jac: array([8.86249424e-09, 7.33646598e-08, 4.72732538e-07])
 message: 'Local minimum reached (|pg| ~= 0)'
    nfev: 36
     nit: 17
  status: 0
 success: True
       x: array([-25.1613186 ,   0.20623159,   0.20147149])
# cost
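
The heading refers to `fminunc` from the Octave/MATLAB version of the course; in Python the article uses SciPy's TNC solver instead. Both SciPy entry points call the objective as `f(theta, *args)`, so the parameter vector must be the first argument of the cost and gradient functions. The sketch below shows this on synthetic data. The data, the `true_theta` values, and the `cost`/`gradient` bodies (including the clipping guard) are assumptions for illustration, not the author's code.

```python
import numpy as np
import scipy.optimize as opt

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(theta, X, y):
    # Clip probabilities so log() never receives exactly 0
    h = np.clip(sigmoid(X @ theta), 1e-12, 1 - 1e-12)
    return np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h))

def gradient(theta, X, y):
    return X.T @ (sigmoid(X @ theta) - y) / len(X)

# Synthetic data: labels are sampled from a logistic model, so the classes overlap
rng = np.random.default_rng(0)
m = 200
true_theta = np.array([-0.5, 2.0, -1.0])
X_toy = np.column_stack([np.ones(m), rng.normal(size=(m, 2))])
y_toy = (rng.random(m) < sigmoid(X_toy @ true_theta)).astype(float)

theta0 = np.zeros(3)
# args supplies the extra arguments that follow theta in cost() and gradient()
res = opt.minimize(fun=cost, x0=theta0, args=(X_toy, y_toy), method='TNC', jac=gradient)
print(res.x)
print(res.fun)
```

`res.x` plays the same role as `res1[0]` and `res2['x']` in the listing above.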

1.5 Evaluation


# 1.2.4 Evaluating logistic regression
# Compare the predictions against the original labels
final_theta = res1[0]
predictions = func.predict(final_theta, X)
correct = [1 if a==b else 0 for (a, b) in zip(predictions, y)]
accuracy = sum(correct) / len(X)

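# Check with the sklearn package
# Note: classification_report expects (y_true, y_pred). With the arguments in this order the
# predictions are treated as the ground truth, so precision and recall are swapped in the report
# and the support column counts predicted labels. Accuracy is unaffected.
from sklearn.metrics import classification_report
print(classification_report(predictions, y))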



              precision    recall  f1-score   support

           0       0.85      0.87      0.86        39
           1       0.92      0.90      0.91        61

    accuracy                           0.89       100
   macro avg       0.88      0.89      0.88       100
weighted avg       0.89      0.89      0.89       100
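
`func.predict` is not listed in the article. A minimal sketch, assuming it thresholds the sigmoid output at 0.5, applied to the θ printed in section 1.4 and two hypothetical applicants:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict(theta, X, threshold=0.5):
    # Label a row 1 when its predicted probability reaches the threshold
    return (sigmoid(X @ theta) >= threshold).astype(int)

# Parameters copied from the optimizer output printed in section 1.4
theta_fit = np.array([-25.1613186, 0.20623159, 0.20147149])

# Hypothetical applicants, each row is [1, exam1 score, exam2 score]
X_new = np.array([[1.0, 45.0, 85.0],
                  [1.0, 30.0, 40.0]])

probabilities = sigmoid(X_new @ theta_fit)
labels = predict(theta_fit, X_new)
print(probabilities)
print(labels)
```

For exam scores of 45 and 85 the predicted admission probability is about 0.776, while the second applicant's probability is close to 0.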

1.6 Decision Boundary

# 1.2.5 Decision Boundary
# ( theta[0] + theta[1] * x1 + theta[2] * x2 = 0 )
x1 = np.arange(130, step=0.1)
x2 = -(final_theta[0] + x1 * final_theta[1]) / final_theta[2]

# Visualization
fig, fig_bound = plt.subplots(figsize=(10, 6))
fig_bound.scatter(positive['Exam1'], positive['Exam2'], c='b', label='Admitted')
fig_bound.scatter(negative['Exam1'], negative['Exam2'], c='r', label='Not Admitted', marker='x')
fig_bound.plot(x1, x2)
fig_bound.set_title('Decision Boundary')
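
The boundary line follows from setting θ0 + θ1·x1 + θ2·x2 = 0, which is exactly where the sigmoid outputs 0.5. A small check with the θ printed in section 1.4 (the three `x1` values are arbitrary):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Parameters copied from the optimizer output printed in section 1.4
theta_fit = np.array([-25.1613186, 0.20623159, 0.20147149])

# Solve theta0 + theta1 * x1 + theta2 * x2 = 0 for x2
x1_line = np.array([30.0, 60.0, 90.0])
x2_line = -(theta_fit[0] + theta_fit[1] * x1_line) / theta_fit[2]

# Points on the line should sit at probability 0.5, up to rounding
X_line = np.column_stack([np.ones_like(x1_line), x1_line, x2_line])
boundary_probs = sigmoid(X_line @ theta_fit)
print(boundary_probs)
```

Points above the line get a probability greater than 0.5 and are classified as admitted, because θ2 is positive.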

2. Regularized logistic regression

In this section, we improve the logistic regression algorithm by adding regularization.

In short, regularization adds a penalty term to the cost function that pushes the algorithm toward simpler models with smaller coefficients. This helps reduce over-fitting and improves the model's ability to generalize.
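
Written out, the regularized cost and gradient take the following form, assuming the usual course convention in which θ0 is left out of the penalty (consistent with `addition[0] = 0` in `gradientReg`). Here λ corresponds to the `l` argument and m to `len(X)`.

```latex
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right] + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^{2}

\frac{\partial J(\theta)}{\partial \theta_0} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_0^{(i)}

\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} + \frac{\lambda}{m} \theta_j \qquad (j \ge 1)
```

A larger λ penalizes large coefficients more strongly, which smooths the decision boundary at the risk of under-fitting.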

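  • Pay particular attention to the differences between np.array() and np.asmatrix() and to when each applies.
  • The helper functions called here are described in the "Self-created Functions" section at the top of the article.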
# 2. Regularized logistic regression
path_data2 = '../data/ex2data2.txt'
df2 = pd.read_csv(path_data2, names=['Microchip Test 1', 'Microchip Test 2', 'Acceptance'])

2.1 Visualization

# 2.1 Visualization
# The label column is read as integers, so compare against 1 and 0 rather than the strings '1' and '0'
positive = df2[df2['Acceptance'] == 1]
negative = df2[df2['Acceptance'] == 0]

# Plot the figure
fig, fig_data2 = plt.subplots(figsize=[10, 6])
fig_data2.scatter(positive['Microchip Test 1'], positive['Microchip Test 2'], c='b', label='Accepted')
fig_data2.scatter(negative['Microchip Test 1'], negative['Microchip Test 2'], c='r', label='Rejected', marker='x')
fig_data2.set_xlabel('Microchip Test 1')
fig_data2.set_ylabel('Microchip Test 2')
fig_data2.set_title('Tests and acceptance results')

2.2 Feature mapping

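The figure above shows that the positive and negative classes cannot be separated by a linear decision boundary. Applying logistic regression directly to this dataset therefore does not perform well, because on the raw features it can only find a linear decision boundary.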


# 2.2 Feature Mapping
x1 = np.array(df2['Microchip Test 1'])
x2 = np.array(df2['Microchip Test 2'])
power = 6
regularized_data = func.featureMapping(x1, x2, power)

2.3 Cost function & Gradient

# 2.3 Cost function and Gradient
X = regularized_data
y = np.array(df2['Acceptance'])
theta = np.zeros(X.shape[1])
print(func.costReg(theta, X, y, l=1))
print(func.gradientReg(theta, X, y, l=1))

2.4 Learning θ using “fminunc”

# 2.4 Learning parameters using "fminunc"
import scipy.optimize as opt

# Use opt.fmin_tnc()
res1 = opt.fmin_tnc(func=func.costReg, x0=theta, args=(X, y, 2), fprime=func.gradientReg)

# Use opt.minimize()
res2 = opt.minimize(fun=func.costReg, x0=theta, args=(X, y, 2), method='TNC', jac=func.gradientReg)

# Use "sklearn" lib
from sklearn import linear_model
model = linear_model.LogisticRegression(penalty='l2', C=1.0)
model.fit(X, y.ravel())
print(model.score(X, y))


(array([ 0.90267454,  0.33721089,  0.76006404, -1.39757946, -0.51417075,
       -0.91389985,  0.01516214, -0.21926017, -0.22677642, -0.16219637,
       -1.01270257, -0.04169398, -0.39984069, -0.14458017, -0.82296284,
       -0.20346048, -0.13186937, -0.04837714, -0.17183934, -0.17077936,
       -0.38820995, -0.72773035,  0.00607685, -0.19391899,  0.00314606,
       -0.21203169, -0.06947222, -0.69320886]), 25, 1)
     fun: 0.5733984516596513
     jac: array([ 1.67633708e-06, -2.90856336e-06,  8.96406772e-07, -8.32612468e-07,
        1.02439563e-07,  1.90899668e-06, -1.35242750e-06, -3.59840239e-07,
       -1.69067486e-07,  1.29963371e-06, -2.18602082e-06, -3.80209746e-07,
       -8.12873068e-07, -9.17531075e-07,  6.41429367e-07, -9.19276291e-07,
        2.91038210e-07, -4.42204968e-07,  1.89624550e-07, -4.26240933e-07,
        1.14951260e-06, -1.69610321e-06,  4.20801504e-08,  4.32306883e-07,
        8.44834719e-08,  9.34045111e-08, -9.53173077e-08,  1.04936691e-06])
 message: 'Converged (|f_n-f_(n-1)| ~= 0)'
    nfev: 25
     nit: 6
  status: 1
 success: True
       x: array([ 0.90267454,  0.33721089,  0.76006404, -1.39757946, -0.51417075,
       -0.91389985,  0.01516214, -0.21926017, -0.22677642, -0.16219637,
       -1.01270257, -0.04169398, -0.39984069, -0.14458017, -0.82296284,
       -0.20346048, -0.13186937, -0.04837714, -0.17183934, -0.17077936,
       -0.38820995, -0.72773035,  0.00607685, -0.19391899,  0.00314606,
       -0.21203169, -0.06947222, -0.69320886])

2.5 Evaluation

# 2.5 Evaluation
# Compare the predictions against the original labels
final_theta = res1[0]
predictions = func.predict(final_theta, X)
correct = [1 if a==b else 0 for (a, b) in zip(predictions, y)]
accuracy = sum(correct) / len(correct)

# Use "sklearn" lib
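# Note: classification_report expects (y_true, y_pred). With the arguments in this order the
# predictions are treated as the ground truth, so precision and recall are swapped in the report
# and the support column counts predicted labels. Accuracy is unaffected.
from sklearn.metrics import classification_report
print(classification_report(predictions, y))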


              precision    recall  f1-score   support

           0       0.73      0.92      0.81        48
           1       0.93      0.77      0.84        70

    accuracy                           0.83       118
   macro avg       0.83      0.84      0.83       118
weighted avg       0.85      0.83      0.83       118

2.6 Decision boundary

# 2.6 Decision boundary
x = np.linspace(-1.5, 1.5, 250)
xx, yy = np.meshgrid(x, x)
z = np.asmatrix(func.featureMapping(xx.ravel(), yy.ravel(), 6))
z = z @ final_theta
z = z.reshape(xx.shape)

# Plot the figure
fig, fig_bound2 = plt.subplots(figsize=[10, 6])
fig_bound2.scatter(positive['Microchip Test 1'], positive['Microchip Test 2'], c='b', label='Accepted')
fig_bound2.scatter(negative['Microchip Test 1'], negative['Microchip Test 2'], c='r', label='Rejected', marker='x')
fig_bound2.set_xlabel('Microchip Test 1')
fig_bound2.set_ylabel('Microchip Test 2')
fig_bound2.set_title('Decision boundary')
# Draw only the z = 0 contour, which is the decision boundary
plt.contour(xx, yy, np.asarray(z), levels=[0])
plt.ylim(-0.8, 1.2)
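
To see why the contour at level 0 traces the decision boundary, the sketch below evaluates the linear score on a coarse grid for a hand-picked θ. This θ is hypothetical, chosen so that z = 1 - x1² - x2² and the boundary is the unit circle. The sketch reuses `featureMapping` as listed at the top of the article, with `power=2` to keep θ short.

```python
import numpy as np
import pandas as pd

def featureMapping(x1, x2, power):
    data = {}
    for i in np.arange(power + 1):
        for p in np.arange(i + 1):
            data["f{}{}".format(i - p, p)] = np.power(x1, i - p) * np.power(x2, p)
    return pd.DataFrame(data)

# Hypothetical theta for the column order f00, f10, f01, f20, f11, f02
theta_circle = np.array([1.0, 0.0, 0.0, -1.0, 0.0, -1.0])

# Coarse 7 x 7 grid over the same range as the plot above
x = np.linspace(-1.5, 1.5, 7)
xx, yy = np.meshgrid(x, x)

# Map every grid point to polynomial features, then compute the score z
features = featureMapping(xx.ravel(), yy.ravel(), 2).to_numpy()
z = (features @ theta_circle).reshape(xx.shape)

# Positive z means probability above 0.5, i.e. predicted class 1
print((z > 0).astype(int))
```

The printed grid shows ones only for points strictly inside the unit circle; the z = 0 contour is the curve separating the ones from the zeros.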


