3. Multi-variable Linear Regression

Multiple Linear Regression (Multi-variable Linear Regression)

Linear regression that uses several input variables instead of just one.
The number of weights grows with the number of variables.
Hypothesis function (Hypothesis)

$H(x_1, x_2, x_3, \dots, x_n) = w_1 x_1 + w_2 x_2 + w_3 x_3 + \dots + w_n x_n$
Cost function

$cost(W)=\frac{1}{m}\sum_{i=1}^{m}\left(H(x_1, x_2, x_3, \dots, x_n)-y_i\right)^{2}$

Here $m$ is the number of examples, and $H$ is evaluated on the $i$-th example, whose target is $y_i$.
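
The two formulas above can be sketched in plain Python. This is only an illustration: the function names `hypothesis` and `cost`, and all the numbers, are made up here and do not come from the notes.

```python
def hypothesis(weights, features):
    """Weighted sum w_1*x_1 + ... + w_n*x_n for a single example."""
    return sum(w * x for w, x in zip(weights, features))


def cost(weights, examples, targets):
    """Mean of the squared differences between prediction and target."""
    m = len(examples)
    squared_errors = (
        (hypothesis(weights, x) - y) ** 2 for x, y in zip(examples, targets)
    )
    return sum(squared_errors) / m


# Two examples with three features each (made-up values).
examples = [[1.0, 2.0, 3.0], [2.0, 0.0, 1.0]]
targets = [14.0, 6.0]
weights = [1.0, 2.0, 3.0]

print(hypothesis(weights, examples[0]))   # prediction for the first example
print(cost(weights, examples, targets))   # mean squared error over both examples
```

Every extra variable adds one more term to the sum inside `hypothesis`, which is what makes the written-out form grow so quickly.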

Matrix

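As the number of variables grows, so does the number of weights, and writing out and evaluating every term becomes unwieldy.
Matrix multiplication keeps the expression compact.
Dot Product
- Given two matrices, the entry in row 1, column 1 of the product is obtained by multiplying each element of the first matrix's row 1 by the corresponding element of the second matrix's column 1 and adding the products...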

⇒ $H(x_1, x_2, x_3, \dots, x_n) = w_1 x_1 + w_2 x_2 + w_3 x_3 + \dots + w_n x_n$

$\rightarrow \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix} \times \begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix} = (x_1 w_1 + x_2 w_2 + x_3 w_3)$

$\therefore H(x) = XW$

Matrix multiplication combines the rows of the first matrix with the columns of the second, so X comes first and W comes second.
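
As a quick check of the identity above, the following NumPy sketch compares the matrix product with the term-by-term sum for a single example. The numbers are made up, and NumPy is used here only for brevity.

```python
import numpy as np

# One example with three features, written as a 1 x 3 row (made-up values).
X = np.array([[1.0, 2.0, 3.0]])
# Three weights written as a 3 x 1 column (made-up values).
W = np.array([[0.5], [1.0], [2.0]])

# Matrix form: (1 x 3) times (3 x 1) gives a 1 x 1 result.
matrix_form = X @ W

# Term-by-term form: x_1*w_1 + x_2*w_2 + x_3*w_3.
explicit_sum = X[0, 0] * W[0, 0] + X[0, 1] * W[1, 0] + X[0, 2] * W[2, 0]

print(matrix_form.shape)
print(matrix_form[0, 0], explicit_sum)

# Swapping the order gives (3 x 1) times (1 x 3), a 3 x 3 matrix
# rather than a single prediction, which is why X is written first.
print((W @ X).shape)
```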

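Advantages
- No matter how many variables there are, the hypothesis is still written simply as $XW$
- Independent of the number of data points
  - The expression is the same whatever the data values are
  - Only the shapes change: the rows and columns of the data determine the number of input features and the size of W
  $\therefore$ The same expression is used regardless of the number of instances or the number of variables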
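
A small NumPy sketch of the points above: one `predict` function handles different numbers of instances and variables without any change. The function name, the shapes, and the random values are all chosen here purely for illustration.

```python
import numpy as np


def predict(X, W):
    """Hypothesis H(X) = XW; this one line serves every shape below."""
    return X @ W


rng = np.random.default_rng(0)  # fixed seed; the values themselves are arbitrary

# (instances, variables) pairs picked only to show that the code is unchanged.
for n_instances, n_variables in [(5, 3), (100, 3), (5, 20)]:
    X = rng.normal(size=(n_instances, n_variables))
    W = rng.normal(size=(n_variables, 1))
    H = predict(X, W)
    print(X.shape, W.shape, "->", H.shape)
```

Only the shapes of `X` and `W` have to agree: the number of columns of `X` must equal the number of rows of `W`, and the result has one row per instance.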

Without a matrix

import tensorflow as tf

# data input
x1 = [ 73.,  93.,  89.,  96.,  73.]
x2 = [ 80.,  88.,  91.,  98.,  66.]
x3 = [ 75.,  93.,  90., 100.,  70.]
Y  = [152., 185., 180., 196., 142.]

# weights
w1 = tf.Variable(tf.random.normal([1]))
w2 = tf.Variable(tf.random.normal([1]))
w3 = tf.Variable(tf.random.normal([1]))
b = tf.Variable(tf.random.normal([1]))

learning_rate = 0.000001

for i in range(1000 + 1):
    # record the operations so the gradient of the cost can be computed
    with tf.GradientTape() as tape:
        hypothesis = w1 * x1 + w2 * x2 + w3 * x3 + b
        cost = tf.reduce_mean(tf.square(hypothesis - Y))
    
    # compute the gradient of the cost with respect to each parameter
    w1_grad, w2_grad, w3_grad, b_grad = tape.gradient(cost, [w1, w2, w3, b])
    
    # update each parameter by subtracting learning_rate * its gradient
    w1.assign_sub(learning_rate * w1_grad)
    w2.assign_sub(learning_rate * w2_grad)
    w3.assign_sub(learning_rate * w3_grad)
    b.assign_sub(learning_rate * b_grad)
    
    if i % 50 == 0:
        print("{:5} | {:12.4f}".format(i, cost.numpy()))

Using a matrix

import tensorflow as tf
import numpy as np
data = np.array([
    # X1,   X2,    X3,   y
    [ 73.,  80.,  75., 152. ],
    [ 93.,  88.,  93., 185. ],
    [ 89.,  91.,  90., 180. ],
    [ 96.,  98., 100., 196. ],
    [ 73.,  66.,  70., 142. ]
], dtype=np.float32)    # 5 x 4 matrix: three feature columns plus the target y

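# slice data
X = data[:, :-1]   # [rows, columns] slicing: all rows, every column except the last (5 x 3)
y = data[:, [-1]]  # last column only, kept two-dimensional (5 x 1)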

W = tf.Variable(tf.random.normal((3, 1))) # the weights form a 3 x 1 matrix
b = tf.Variable(tf.random.normal((1,)))

learning_rate = 0.000001

# hypothesis function
def predict(X):
    return tf.matmul(X, W) + b

print("epoch | cost")

n_epochs = 2000
for i in range(n_epochs+1):
    with tf.GradientTape() as tape:
        cost = tf.reduce_mean(tf.square(predict(X) - y))

    W_grad, b_grad = tape.gradient(cost, [W, b])

    W.assign_sub(learning_rate * W_grad)
    b.assign_sub(learning_rate * b_grad)
    
    if i % 100 == 0:
        print("{:5} | {:10.4f}".format(i, cost.numpy()))
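
To see what `tape.gradient` is computing for this cost, the same loop can be sketched in NumPy with the gradients of the mean squared error written out by hand: $\frac{2}{m}X^{T}(XW+b-y)$ for W and $\frac{2}{m}\sum_{i}(XW+b-y)_i$ for b. This is an illustrative re-derivation, not the method used in the notes; the helper names `cost_fn` and `gradients` and the zero initialization are assumptions made here (the code above starts from random values).

```python
import numpy as np

# Same table as above: three feature columns, then the target y.
data = np.array([
    [73., 80., 75., 152.],
    [93., 88., 93., 185.],
    [89., 91., 90., 180.],
    [96., 98., 100., 196.],
    [73., 66., 70., 142.],
])
X, y = data[:, :-1], data[:, [-1]]
m = X.shape[0]


def cost_fn(W, b):
    """Mean squared error of the hypothesis XW + b."""
    return np.mean((X @ W + b - y) ** 2)


def gradients(W, b):
    """Hand-derived gradients of the mean squared error."""
    error = X @ W + b - y               # shape (m, 1)
    W_grad = (2.0 / m) * X.T @ error    # shape (3, 1)
    b_grad = (2.0 / m) * error.sum()    # scalar
    return W_grad, b_grad


W = np.zeros((3, 1))   # assumption: zero start instead of random values
b = 0.0
learning_rate = 0.000001

initial_cost = cost_fn(W, b)
for _ in range(2000):
    W_grad, b_grad = gradients(W, b)
    W -= learning_rate * W_grad
    b -= learning_rate * b_grad

print(initial_cost, cost_fn(W, b))
```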