(Figure: left, Classification; right, Regression)
The earlier hypothesis $H(x) = Wx + b$ does not guarantee values between 0 and 1.
We need a function whose output always lies between 0 and 1.
⇒ Enter the **Sigmoid function (logistic function)**
$g(z) = \frac{1}{1 + e^{-z}}, \quad z = Wx, \quad H(x) = g(z)$
$\therefore \ H(X) = \frac 1 {1 + e^{-W^T X}}$
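As a quick check, here is a minimal NumPy sketch (illustrative only, separate from the TensorFlow model below) showing that the sigmoid squashes any real input into the open interval (0, 1):

```python
import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(np.array([-10., 0., 10.])))  # ~[4.54e-05, 0.5, 0.99995]
```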
Current hypothesis: $H(X) = \frac{1}{1 + e^{-W^T X}}$
Previous cost function (from linear regression):
$cost(W)=\frac{1}{m}\sum_{i=1}^{m}(Wx_i - y_i)^2$
Plugging the sigmoid hypothesis into this cost yields a bumpy, non-convex loss surface.
Local minima drag model performance down ⇒ reaching the global minimum is not guaranteed.
Not a convex function.
⇒ A new cost function is needed.
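To see the problem concretely, here is a rough sketch on hypothetical 1-D toy data: sweeping $W$ and plotting the squared-error cost of the sigmoid hypothesis gives a curve that flattens toward a constant on both tails, so it cannot be convex and gradient descent can stall on the plateaus.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([1., 2., 3., 4.])   # hypothetical toy inputs
y = np.array([0., 0., 1., 1.])   # hypothetical toy labels

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

ws = np.linspace(-10., 10., 400)
mse = [np.mean((sigmoid(w * x) - y) ** 2) for w in ws]

plt.plot(ws, mse)
plt.xlabel('W')
plt.ylabel('squared-error cost')
plt.show()  # flat tails with a dip in the middle: not a convex bowl
```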
$cost(W) = \frac{1}{m} \sum c(H(x),\ y)$, where $y$ is the true label
$c(H(x),\ y) = \begin{cases} -\log(H(x)) & (y = 1) \\ -\log(1-H(x)) & (y = 0) \end{cases}$
(Graphs: $-\log(H(x))$ for $y = 1$, $-\log(1 - H(x))$ for $y = 0$.)
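Those two curves can be reproduced with a short matplotlib sketch; each branch drives the cost toward infinity as the prediction becomes confidently wrong:

```python
import numpy as np
import matplotlib.pyplot as plt

h = np.linspace(0.001, 0.999, 500)  # H(x) ranges over (0, 1)
plt.plot(h, -np.log(h), label='y = 1: -log(H(x))')
plt.plot(h, -np.log(1 - h), label='y = 0: -log(1 - H(x))')
plt.xlabel('H(x)')
plt.ylabel('cost')
plt.legend()
plt.show()
```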
$\rightarrow c(H(x),\ y) = -y\log(H(x)) - (1-y)\log(1-H(x))$
$y = 1:\ c = -\log(H(x))$
$y = 0:\ c = -\log(1-H(x))$
$\therefore \ cost(W) = -\frac{1}{m}\sum \left[\, y\log(H(x)) + (1-y)\log(1-H(x)) \,\right]$
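A quick numeric check (values chosen only for illustration) that the merged formula reproduces both branches:

```python
import numpy as np

def c(h, y):
    return -y * np.log(h) - (1 - y) * np.log(1 - h)

print(c(0.9, 1))  # ~0.105: confident and correct -> low cost
print(c(0.9, 0))  # ~2.303: confident and wrong -> high cost
```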
$W:=W - \alpha\frac{ \partial } {\partial W } cost(W)$
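For this cost the gradient has the closed form $\frac{\partial}{\partial W}cost(W) = \frac{1}{m}\sum_{i=1}^{m}(H(x_i)-y_i)\,x_i$, so a single update can be sketched by hand in NumPy (function names here are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent_step(X, y, W, alpha=0.01):
    m = len(y)
    H = sigmoid(X @ W)          # predictions, shape (m,)
    grad = X.T @ (H - y) / m    # d(cost)/dW
    return W - alpha * grad     # W := W - alpha * gradient
```

The full TensorFlow implementation of all of the above follows.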
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
x_train = [[1., 2.], [2., 3.], [3., 1.], [4., 3.], [5., 3.], [6., 2.]]
y_train = [[0.], [0.], [0.], [1.], [1.], [1.]]
x_test = [[5., 2.]]
y_test = [[1.]]
x1 = [x[0] for x in x_train]
x2 = [x[1] for x in x_train]
colors = [int(y[0] % 3) for y in y_train]
plt.scatter(x1, x2, c = colors, marker='^')
plt.scatter(x_test[0][0], x_test[0][1], c='red')
plt.xlabel('x1')
plt.ylabel('x2')
plt.show()
# Feed the training data through the tf.data API (the batch size sets how many samples are trained at once)
# features, labels are the data actually used in training (keep dtypes consistent for the computation)
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)) # six pairs of ([x1, x2], [y])
W = tf.Variable(tf.zeros([2, 1]), name = 'weight')
b = tf.Variable(tf.zeros([1]), name = 'bias')
"""
X [0, 0] W 0 + b[0] = H(X)
0
"""
def logistic_regression(features):
    hypothesis = tf.divide(1., 1. + tf.exp(-(tf.matmul(features, W) + b)))
    # sigmoid: 1 / (1 + e^{-z}), with z = XW + b
    # (note the parentheses: the whole of XW + b is negated)
    return hypothesis
def loss_fn(hypothesis, labels):
cost = -tf.reduce_mean(labels * tf.math.log(hypothesis) + (1 - labels) * tf.math.log(1 - hypothesis))
# cost(h(x), y) = -ylog(h(x)) - (1 - y)log(1 - h(x))
return cost
# Gradient Descent
optimizer = tf.keras.optimizers.SGD(learning_rate = 0.01)
def accuracy_fn(hypothesis, labels):
    predicted = tf.cast(hypothesis > 0.5, dtype = tf.float32)
    accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, labels), dtype = tf.float32))
    # equal -> element-wise match between predicted and labels
    # cast -> True => 1.0, False => 0.0 (must be float so the mean is not truncated to an int)
    # reduce_mean -> average of those values
    return accuracy
def grad(features, labels):
    with tf.GradientTape() as tape: # record the forward pass for automatic differentiation
hypothesis = logistic_regression(features)
loss_value = loss_fn(hypothesis, labels)
return tape.gradient(loss_value, [W, b])
EPOCHS = 1001
for step in range(EPOCHS):
    for features, labels in dataset.batch(len(x_train)): # one batch covering the whole training set
        hypothesis = logistic_regression(features) # hypothesis value for every row
        grads = grad(features, labels) # gradients of the loss w.r.t. W, b
        optimizer.apply_gradients(grads_and_vars = zip(grads, [W, b])) # update W and b
        if step % 100 == 0:
            print("Iter: {}, Loss: {:.4f}".format(step, loss_fn(hypothesis, labels).numpy()))
test_acc = accuracy_fn(logistic_regression(x_test), y_test)
print("Test Result = {}".format(tf.cast(logistic_regression(x_test) > 0.5, dtype = tf.int32)))
print("Testset Accuracy : {:.4f}".format(test_acc))