Algorithm Notes | What is the role of the activation function in a neural network? (Activation Functions)

Originally posted on Quora: What is the role of the activation function in a neural network?

By Sebastian Raschka,
Author of Python Machine Learning,
Researcher applying ML to computational bio.

Outline: Linear Regression → Logistic Regression → Neural Network

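Linear Regression

Linear regression tackles a problem (for example, predicting house prices) by fitting a linear model to the training samples and then using that model to predict the outcome for new data.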

Net Input Function:
net(x) = b + x1w1 + x2w2 + ... + xnwn = z
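
The net input is just a bias plus a weighted sum, which can be sketched in a few lines. The function name `net_input` and the numbers below are made up for illustration and do not come from the original answer.

```python
def net_input(x, w, b):
    """Return z = b + x1*w1 + ... + xn*wn for a single sample."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    return b + sum(xi * wi for xi, wi in zip(x, w))


# Hypothetical sample with two features and arbitrary weights.
z = net_input(x=[2.0, 3.0], w=[0.5, -1.0], b=1.0)
print(z)  # 1.0 + 2.0*0.5 + 3.0*(-1.0) = -1.0
```

In linear regression this value z is used directly as the prediction.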

Logistic Regression

The output z of the linear model is fed into a nonlinear activation function,
for example the sigmoid function: φ(z) = 1 / (1 + e^(-z))

This activation function returns a probability, namely the probability that a sample belongs to class 1: P(y=1|x)

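Next, a step function is applied after this activation function:

If the sigmoid output is greater than or equal to 0.5, the sample is predicted as class 1; otherwise it is predicted as class 0.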
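
The pipeline described above (net input, then sigmoid, then step function) might look roughly like this. It is a minimal sketch that assumes the standard logistic form of the sigmoid, 1 / (1 + e^(-z)); the names `predict_proba` and `predict` and the example weights are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic sigmoid 1 / (1 + exp(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(x, w, b):
    """Estimate P(y=1|x) by passing the net input through the sigmoid."""
    z = b + sum(xi * wi for xi, wi in zip(x, w))
    return sigmoid(z)


def predict(x, w, b, threshold=0.5):
    """Step function on top of the sigmoid: class 1 if the probability >= threshold."""
    return 1 if predict_proba(x, w, b) >= threshold else 0


# Hypothetical weights; a real model would learn them from training data.
print(predict_proba([2.0, 3.0], [0.5, -1.0], 1.0))
print(predict([2.0, 3.0], [0.5, -1.0], 1.0))
```

Here the net input is -1.0, so the probability falls below 0.5 and the sample is assigned to class 0.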

The logistic regression classifier can be depicted as in the figure below.

(For more details, see Sebastian Raschka's answer to "What is the probabilistic interpretation of regularized logistic regression?")

Logistic regression is a linear model: although the sigmoid function is nonlinear, its decision surface is linear, so it can still be regarded as a linear classifier.
For example:

Because the input samples are linearly separable, the logistic regression model works well.
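
A short derivation shows why the decision surface is linear even though the sigmoid is not. It assumes the standard logistic form of the sigmoid, written here as \phi, and the 0.5 threshold used by the step function:

```latex
\begin{aligned}
\phi(z) \ge 0.5
&\iff \frac{1}{1 + e^{-z}} \ge \frac{1}{2} \\
&\iff e^{-z} \le 1 \\
&\iff z = b + x_1 w_1 + \dots + x_n w_n \ge 0
\end{aligned}
```

The boundary z = 0 is a hyperplane in the input space (a straight line for two features), regardless of how nonlinear the sigmoid itself is.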

When the input samples are not linearly separable, logistic regression performs poorly, for example:

In that case, consider a non-linear classifier, such as a multi-layer neural network.

The example below uses a network with one hidden layer to perform the classification.

Neural Network

This example contains:

Input layer: three units (x0 is the bias unit; x1 and x2 are the coordinates)
Hidden layer: 200 sigmoid units
Output layer: one sigmoid unit followed by a step function that produces the output
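
A forward pass through a network of this shape could be sketched as follows. This is only an illustration: the weights are random and untrained, the helper names are hypothetical, and feeding a bias unit into the output layer as well is an assumption that the description above does not state.

```python
import math
import random


def sigmoid(z):
    """Logistic sigmoid, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def init_layer(n_in, n_out, rng):
    """Random weights: one row per unit, with the bias weight in position 0."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(weights, inputs):
    """Apply one sigmoid layer; the bias unit (fixed at 1) is prepended here."""
    with_bias = [1.0] + list(inputs)
    return [sigmoid(sum(w * x for w, x in zip(row, with_bias))) for row in weights]


def forward(hidden_w, output_w, x1, x2):
    hidden = layer_forward(hidden_w, [x1, x2])  # 200 sigmoid activations
    prob = layer_forward(output_w, hidden)[0]   # single sigmoid output unit
    label = 1 if prob >= 0.5 else 0             # step function
    return prob, label


rng = random.Random(0)               # fixed seed so the run is repeatable
hidden_w = init_layer(2, 200, rng)   # inputs x1, x2 (plus bias x0) -> 200 units
output_w = init_layer(200, 1, rng)   # 200 hidden activations (plus bias) -> 1 unit
prob, label = forward(hidden_w, output_w, 0.3, -0.7)
print(prob, label)
```

Each hidden unit is itself a small logistic-regression-like unit; the output unit then combines their 200 nonlinear activations, which is what allows the overall boundary to bend.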

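Conclusion:

The logistic regression classifier contains a nonlinear activation function, but its net input is still a linear combination of the inputs and the weight coefficients, so logistic regression is a "generalized" linear model.
In a neural network, the role of the activation function is to produce a non-linear decision boundary through non-linear combinations of the weighted inputs.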
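
One way to see why the nonlinearity matters is to check what happens without it: two stacked layers with no activation function collapse into a single linear layer. The sketch below uses small made-up weight matrices chosen purely for illustration.

```python
def linear_layer(W, b, x):
    """Compute W x + b for a weight matrix W (list of rows), bias b, and input x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


# Hypothetical weights for a 2 -> 2 -> 1 network with no activation function.
W1, b1 = [[1.0, 2.0], [0.0, -1.0]], [0.5, 1.0]
W2, b2 = [[3.0, 1.0]], [-2.0]


def two_linear_layers(x):
    return linear_layer(W2, b2, linear_layer(W1, b1, x))


# Equivalent single layer: W = W2 W1 and b = W2 b1 + b2.
W = [[sum(W2[i][k] * W1[k][j] for k in range(2)) for j in range(2)] for i in range(1)]
b = [sum(W2[i][k] * b1[k] for k in range(2)) + b2[i] for i in range(1)]

x = [2.0, -1.0]
print(two_linear_layers(x), linear_layer(W, b, x))  # both give the same value
```

Since the stacked version computes exactly the same function as one linear layer, its decision boundary would still be a hyperplane; inserting a nonlinear activation between the layers is what breaks this equivalence.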

Common activation functions


November 5, 2016
