Feedforward Neural Network / Multilayer Perceptron / Backpropagation / MLP for NLP Tasks / ReLU

1. Multilayer Perceptron (MLP)

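As covered earlier, a neural network formed by adding one or more hidden layers to a single-layer perceptron is called a multilayer perceptron (MLP). The MLP is the most basic form of feedforward neural network (FFNN). A feedforward neural network is one in which computation flows in only one direction, from the input layer to the output layer.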

Feedforward Neural Network

- A neural network, such as an MLP, in which computation proceeds from the input layer toward the output layer

- An extension of the perceptron formed by connecting layers of parallel perceptrons in series: input, hidden, and output layers

- The number of layers, the nodes at each layer, and non-linear functions are design parameters.

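(Non-linear function: a function whose graph cannot be drawn as a single straight line.)

** If a linear function is used as the activation function, stacking hidden layers adds nothing: a composition of linear layers is itself a single linear layer.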
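The note above can be checked numerically. The following is a minimal sketch, assuming NumPy and arbitrary illustrative layer sizes (4 -> 5 -> 3, not taken from the lecture): two stacked layers with a linear (identity) activation give the same output as one merged linear layer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration: 4 inputs -> 5 hidden units -> 3 outputs.
W1, b1 = rng.normal(size=(5, 4)), rng.normal(size=5)
W2, b2 = rng.normal(size=(3, 5)), rng.normal(size=3)
x = rng.normal(size=4)

# Two stacked layers whose activation is the identity (a linear function).
two_layers = W2 @ (W1 @ x + b1) + b2

# The same mapping written as ONE linear layer with merged weights.
W = W2 @ W1
b = W2 @ b1 + b2
one_layer = W @ x + b

collapses = bool(np.allclose(two_layers, one_layer))
print(collapses)  # True
```

Because the merged layer reproduces the stacked one exactly, extra depth only adds expressive power when the activation is non-linear.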

Multilayer Perceptron

- A node in a hidden layer receives inputs from all nodes in the previous layer, and its output is fed to all nodes in the next layer
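A rough sketch of this fully connected forward computation, assuming NumPy, ReLU hidden activations, and made-up layer sizes (the helper name `mlp_forward` is hypothetical, not from the lecture):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def mlp_forward(x, params):
    """Forward pass: every node sees all outputs of the previous layer."""
    h = x
    last = len(params) - 1
    for i, (W, b) in enumerate(params):
        z = W @ h + b                   # each row of W is one node's weights
        h = z if i == last else relu(z)  # non-linearity on hidden layers only
    return h

rng = np.random.default_rng(0)
sizes = [4, 8, 8, 3]  # input, two hidden layers, output (illustrative)
params = [
    (rng.normal(size=(n_out, n_in)), np.zeros(n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]

out = mlp_forward(rng.normal(size=4), params)
print(out.shape)  # (3,)
```

The number of layers, the nodes per layer, and the choice of `relu` are the design parameters here; changing `sizes` changes the architecture without touching the forward code.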

Multiclass Outputs

What if we have more than two output classes?

- We add one output for each class

- We use a "softmax layer" at the output to generate a probability distribution

- We use a proper loss function at the output (typically cross-entropy)
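The three points above can be sketched as follows, assuming NumPy, three classes, and cross-entropy as the loss (a common choice, assumed here rather than stated in the notes). The logit values are made up for illustration.

```python
import numpy as np

def softmax(z):
    """Turn one score per class into a probability distribution."""
    z = z - np.max(z)  # shift for numerical stability; result is unchanged
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(probs, true_class):
    """Negative log-probability assigned to the correct class."""
    return -np.log(probs[true_class])

logits = np.array([2.0, 1.0, 0.1])  # one output per class (illustrative)
probs = softmax(logits)
loss = cross_entropy(probs, true_class=0)

print(probs.sum())     # sums to 1 up to floating-point rounding
print(probs.argmax())  # 0
```

The loss shrinks toward 0 as the probability of the true class approaches 1, which is what makes it a sensible training signal for a softmax output.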

Universal Approximation Theorem

- MLP can represent a wide range of functions given appropriate values for the weights

- given sufficient layers -> deep nets

- given sufficient nodes -> wide nets

- Only an existential result: it merely states that approximating most given functions is possible, but it does not say how to find the weights that do so
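As a small illustration of "appropriate values for the weights", here is a sketch of a one-hidden-layer ReLU network whose weights are set by hand (not learned) so that it computes XOR, a function a single-layer perceptron cannot represent. The specific weights are one possible choice, picked for this example.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Hand-picked weights (an illustrative choice, not a learned solution):
# hidden unit 1 computes relu(x1 + x2), hidden unit 2 computes relu(x1 + x2 - 1).
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
w2 = np.array([1.0, -2.0])  # output = h1 - 2 * h2

def xor_net(x):
    h = relu(W1 @ x + b1)
    return w2 @ h

inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
outputs = [float(xor_net(x)) for x in inputs]
print(outputs)  # [0.0, 1.0, 1.0, 0.0]
```

The theorem only guarantees that such weights exist for a broad class of functions; finding them in general is the job of training.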

Training Feedforward Network

- Backpropagation algorithm : An iterative algorithm to learn network weights using annotated data

- For every training data point (x, y)

- Run forward computation to find model estimate y'

- Run backward computation to update weights (more difficult)

* For every output node

- Compute loss L between true y and the estimated y'

- For every weight w from hidden layer to the output layer

* update the weight using gradient descent
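The loop above can be sketched for a tiny network. This is a minimal sketch under several assumptions not in the notes: NumPy, one tanh hidden layer with 8 units, a sigmoid output with binary cross-entropy loss, a learning rate of 0.5, and XOR as the annotated data. It also updates the input-to-hidden weights, which the outline above does not spell out.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mean_loss(X, Y, W1, b1, w2, b2):
    """Average binary cross-entropy over the whole dataset."""
    H = np.tanh(X @ W1.T + b1)
    P = np.clip(sigmoid(H @ w2 + b2), 1e-12, 1.0 - 1e-12)
    return float(np.mean(-(Y * np.log(P) + (1.0 - Y) * np.log(1.0 - P))))

# Annotated data (x, y): the XOR problem, used here only as a toy example.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([0.0, 1.0, 1.0, 0.0])

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 2)), np.zeros(8)  # input -> hidden
w2, b2 = rng.normal(size=8), 0.0               # hidden -> output
lr = 0.5                                       # assumed learning rate

loss_before = mean_loss(X, Y, W1, b1, w2, b2)

for epoch in range(2000):
    for x, y in zip(X, Y):
        # Forward computation: model estimate y'.
        h = np.tanh(W1 @ x + b1)
        y_hat = sigmoid(w2 @ h + b2)

        # Backward computation: gradients of the loss via the chain rule.
        dz2 = y_hat - y            # dL/dz at the output (sigmoid + cross-entropy)
        dw2, db2 = dz2 * h, dz2    # hidden -> output weights
        dh = dz2 * w2              # error sent back to the hidden layer
        dz1 = dh * (1.0 - h ** 2)  # through the tanh derivative
        dW1, db1 = np.outer(dz1, x), dz1

        # Gradient descent update.
        w2 = w2 - lr * dw2
        b2 = b2 - lr * db2
        W1 = W1 - lr * dW1
        b1 = b1 - lr * db1

loss_after = mean_loss(X, Y, W1, b1, w2, b2)
print(loss_before, loss_after)
```

The backward pass is the harder part because each gradient reuses the error signal from the layer after it, which is why the computation runs from the output back toward the input.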

MLP for NLP Tasks -> an MLP needs a fixed input size, so it is set to the longest input length

Assume a fixed size length

1. Make the input the length of the longest input -> input size = longest input

- If shorter then pad with zero embeddings

- Truncate if longer inputs are observed at test time

2. Creating a single "sentence embedding"

- Take the mean of all the word embeddings

- Take the element-wise max of all the word embeddings
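Both options can be sketched with NumPy. The helper names (`pad_or_truncate`, `mean_pool`, `max_pool`), the 5-dimensional embeddings, and the maximum length of 6 are all hypothetical values chosen for this example.

```python
import numpy as np

def pad_or_truncate(embeddings, max_len):
    """Option 1: force a (max_len, dim) shape with zero padding or truncation."""
    n, dim = embeddings.shape
    if n >= max_len:
        return embeddings[:max_len]          # truncate longer inputs
    padding = np.zeros((max_len - n, dim))   # zero embeddings for short inputs
    return np.vstack([embeddings, padding])

def mean_pool(embeddings):
    """Option 2a: average the word embeddings into one sentence embedding."""
    return embeddings.mean(axis=0)

def max_pool(embeddings):
    """Option 2b: element-wise max over the word embeddings."""
    return embeddings.max(axis=0)

rng = np.random.default_rng(0)
short_sentence = rng.normal(size=(3, 5))  # 3 words, 5-dim embeddings
long_sentence = rng.normal(size=(9, 5))   # 9 words

fixed_short = pad_or_truncate(short_sentence, max_len=6).reshape(-1)
fixed_long = pad_or_truncate(long_sentence, max_len=6).reshape(-1)
mean_vec = mean_pool(long_sentence)
max_vec = max_pool(long_sentence)

print(fixed_short.shape, fixed_long.shape)  # (30,) (30,)
print(mean_vec.shape, max_vec.shape)        # (5,) (5,)
```

Option 1 keeps word order but ties the input size to `max_len * dim`; option 2 gives a vector of size `dim` for any sentence length at the cost of discarding word order.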

Neural Networks VS SVM Rivalry

Backpropagation (1986) -> SVM (1992) -> Deep Learning (2012)

- Three key reasons for reemergence of deep learning

1. Computational Power : NVIDIA CUDA (2007)

2. Annotated datasets: ImageNet (2009)

3. ReLU!!!

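ReLU Function

* Outputs 0 for a negative input and returns the input unchanged for a positive input

* It does not saturate at any particular positive value, so it works better than the sigmoid in deep networks

* For negative inputs the gradient is 0, so the neuron has a hard time recovering -> this is called the dying ReLU problem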
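A short sketch of these properties, assuming NumPy and the common convention of taking the gradient at exactly 0 to be 0 (ReLU is not differentiable there, so this is a choice, not a fact from the notes). The sigmoid gradient is included only for comparison.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    # Convention assumed here: the gradient at z == 0 is taken to be 0.
    return (z > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)

z = np.array([-2.0, -0.5, 0.0, 0.5, 10.0])

print(relu(z).tolist())       # [0.0, 0.0, 0.0, 0.5, 10.0]
print(relu_grad(z).tolist())  # [0.0, 0.0, 0.0, 1.0, 1.0]

# For a large positive input the sigmoid gradient is nearly 0 (saturation),
# while the ReLU gradient stays at 1.
large_sigmoid_grad = float(sigmoid_grad(np.array(10.0)))
print(large_sigmoid_grad < 1e-4)  # True
```

A unit whose pre-activation is negative for every training input gets a zero gradient from every example, so gradient descent never moves its weights: that is the dying ReLU case.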

Softmax Function

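- Like the sigmoid function, it is mainly used in the neurons of the output layer

- Mainly used for multiclass classification problems, where one of three or more (mutually exclusive) options is chosen

Sigmoid Function

* Mainly used for binary classification.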
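The two output functions are closely related: with two classes and the second score fixed at 0, the softmax probability of the first class equals the sigmoid of its score. A minimal NumPy check, using an arbitrary score chosen for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - np.max(z)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

z = 1.7  # arbitrary score for the positive class
two_class = softmax(np.array([z, 0.0]))

same = bool(np.isclose(two_class[0], sigmoid(z)))
print(same)  # True
```

This is why a single sigmoid output is enough for binary classification, while softmax is used once there are three or more classes.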


박휴지의 프로그래밍 일기
