(Figure: left, Classification; right, Regression)
The earlier hypothesis $H(x) = Wx + b$ does not guarantee values between 0 and 1.
We need a function whose output always lies between 0 and 1.
⇒ Enter the **Sigmoid function (logistic function)**
$g(z) = \frac{1}{1 + e^{-z}}, \quad z = Wx, \quad H(x) = g(z)$
$\therefore \ H(X) = \frac 1 {1 + e^{-W^T X}}$
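As a quick check, here is a minimal NumPy sketch (illustrative only, separate from the TensorFlow model below) showing that the sigmoid squashes any real input into the open interval (0, 1):

```python
import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(np.array([-10., 0., 10.])))  # ~[4.54e-05, 0.5, 0.99995]
```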
Current hypothesis: $H(X) = \frac{1}{1 + e^{-W^T X}}$
Previous cost function (from linear regression):
$cost(W)=\frac{1}{m}\sum_{i=1}^{m}(Wx_i - y_i)^2$
Plugging the sigmoid hypothesis into this cost yields a bumpy, non-convex loss surface.
Local minima drag model performance down ⇒ reaching the global minimum is not guaranteed.
Not a convex function.
⇒ A new cost function is needed.
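To see the problem concretely, here is a rough sketch on hypothetical 1-D toy data: sweeping $W$ and plotting the squared-error cost of the sigmoid hypothesis gives a curve that flattens toward a constant on both tails, so it cannot be convex and gradient descent can stall on the plateaus.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([1., 2., 3., 4.])   # hypothetical toy inputs
y = np.array([0., 0., 1., 1.])   # hypothetical toy labels

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

ws = np.linspace(-10., 10., 400)
mse = [np.mean((sigmoid(w * x) - y) ** 2) for w in ws]

plt.plot(ws, mse)
plt.xlabel('W')
plt.ylabel('squared-error cost')
plt.show()  # flat tails with a dip in the middle: not a convex bowl
```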
$cost(W) = \frac{1}{m} \sum c(H(x),\ y)$, where $y$ is the true label
$c(H(x),\ y) = \begin{cases} -\log(H(x)) & (y = 1) \\ -\log(1-H(x)) & (y = 0) \end{cases}$
(Graphs: $-\log(H(x))$ for $y = 1$, $-\log(1 - H(x))$ for $y = 0$.)
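Those two curves can be reproduced with a short matplotlib sketch; each branch drives the cost toward infinity as the prediction becomes confidently wrong:

```python
import numpy as np
import matplotlib.pyplot as plt

h = np.linspace(0.001, 0.999, 500)  # H(x) ranges over (0, 1)
plt.plot(h, -np.log(h), label='y = 1: -log(H(x))')
plt.plot(h, -np.log(1 - h), label='y = 0: -log(1 - H(x))')
plt.xlabel('H(x)')
plt.ylabel('cost')
plt.legend()
plt.show()
```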
$\rightarrow c(H(x),\ y) = -y\log(H(x)) - (1-y)\log(1-H(x))$
$y = 1:\ c = -\log(H(x))$
$y = 0:\ c = -\log(1-H(x))$
$\therefore \ cost(W) = -\frac{1}{m}\sum \left[\, y\log(H(x)) + (1-y)\log(1-H(x)) \,\right]$
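A quick numeric check (values chosen only for illustration) that the merged formula reproduces both branches:

```python
import numpy as np

def c(h, y):
    return -y * np.log(h) - (1 - y) * np.log(1 - h)

print(c(0.9, 1))  # ~0.105: confident and correct -> low cost
print(c(0.9, 0))  # ~2.303: confident and wrong -> high cost
```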
$W:=W - \alpha\frac{ \partial } {\partial W } cost(W)$
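For this cost the gradient has the closed form $\frac{\partial}{\partial W}cost(W) = \frac{1}{m}\sum_{i=1}^{m}(H(x_i)-y_i)\,x_i$, so a single update can be sketched by hand in NumPy (function names here are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent_step(X, y, W, alpha=0.01):
    m = len(y)
    H = sigmoid(X @ W)          # predictions, shape (m,)
    grad = X.T @ (H - y) / m    # d(cost)/dW
    return W - alpha * grad     # W := W - alpha * gradient
```

The full TensorFlow implementation of all of the above follows.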
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
x_train = [[1., 2.], [2., 3.], [3., 1.], [4., 3.], [5., 3.], [6., 2.]]
y_train = [[0.], [0.], [0.], [1.], [1.], [1.]]
x_test = [[5., 2.]]
y_test = [[1.]]
x1 = [x[0] for x in x_train]
x2 = [x[1] for x in x_train]
colors = [int(y[0] % 3) for y in y_train]
plt.scatter(x1, x2, c = colors, marker='^')
plt.scatter(x_test[0][0], x_test[0][1], c='red')
plt.xlabel('x1')
plt.ylabel('x2')
plt.show()
# Feed the training data through the tf.data API (the batch size sets how many samples are trained at once)
# features, labels are the data actually used in training (keep dtypes consistent for the computation)
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)) # six pairs of ([x1, x2], [y])
W = tf.Variable(tf.zeros([2, 1]), name = 'weight')
b = tf.Variable(tf.zeros([1]), name = 'bias')
"""
X [0, 0] W 0 + b[0] = H(X)
0
"""
def logistic_regression(features):
    hypothesis = tf.divide(1., 1. + tf.exp(-(tf.matmul(features, W) + b)))
    # sigmoid: 1 / (1 + e^{-z}), with z = XW + b
    # (note the parentheses: the whole of XW + b is negated)
    return hypothesis
def loss_fn(hypothesis, labels):
cost = -tf.reduce_mean(labels * tf.math.log(hypothesis) + (1 - labels) * tf.math.log(1 - hypothesis))
# cost(h(x), y) = -ylog(h(x)) - (1 - y)log(1 - h(x))
return cost
# Gradient Descent
optimizer = tf.keras.optimizers.SGD(learning_rate = 0.01)
def accuracy_fn(hypothesis, labels):
    predicted = tf.cast(hypothesis > 0.5, dtype = tf.float32)
    accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, labels), dtype = tf.float32))
    # equal -> element-wise match between predicted and labels
    # cast -> True => 1.0, False => 0.0 (must be float so the mean is not truncated to an int)
    # reduce_mean -> average of those values
    return accuracy
def grad(features, labels):
    with tf.GradientTape() as tape: # record the forward pass for automatic differentiation
hypothesis = logistic_regression(features)
loss_value = loss_fn(hypothesis, labels)
return tape.gradient(loss_value, [W, b])
EPOCHS = 1001
for step in range(EPOCHS):
    for features, labels in dataset.batch(len(x_train)): # one batch covering the whole training set
        hypothesis = logistic_regression(features) # hypothesis value for every row
        grads = grad(features, labels) # gradients of the loss w.r.t. W, b
        optimizer.apply_gradients(grads_and_vars = zip(grads, [W, b])) # update W and b
        if step % 100 == 0:
            print("Iter: {}, Loss: {:.4f}".format(step, loss_fn(hypothesis, labels).numpy()))
test_acc = accuracy_fn(logistic_regression(x_test), y_test)
print("Test Result = {}".format(tf.cast(logistic_regression(x_test) > 0.5, dtype = tf.int32)))
print("Testset Accuracy : {:.4f}".format(test_acc))