Published 2022. 7. 24. 03:21

[4.1.] Convolutional Neural Networks(3)

Pooling Layers

[Max Pooling]

Takes only the largest value in each region.

Here the pooling filter size is 2 and the stride is 2.

"if the feature is detected anywhere in the filter, then keep a high number.

But if this feature is not detected, then the max of all those numbers is still itself quite small"

--> Honestly, it is not clear whether this is the fundamental reason max pooling works so well (Prof. Ng)

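The hyperparameters of pooling (filter size, stride) are not learned; they are fixed in advance and the operation is simply computed with them.

This time, let's do max pooling with filter size 3 and stride 1.

(The same output-size formula used for conv applies: $\lfloor (n + 2p - f)/s \rfloor + 1$.)

If the input has $n_{C}$ channels, the pooling output also has $n_{C}$ channels.

(Pooling is applied to each channel independently.)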
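The max pooling rule above can be sketched in NumPy. This is only an illustrative sketch: the function name `max_pool2d` and the 4 x 4 example values are assumptions made here, and no padding is used.

```python
import numpy as np

def max_pool2d(x, f=2, s=2):
    """Max pooling over an (n_H, n_W, n_C) array, one channel at a time."""
    n_h, n_w, n_c = x.shape
    out_h = (n_h - f) // s + 1  # same formula as conv, with padding p = 0
    out_w = (n_w - f) // s + 1
    out = np.empty((out_h, out_w, n_c), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * s:i * s + f, j * s:j * s + f, :]
            out[i, j, :] = window.max(axis=(0, 1))  # max within each channel
    return out

# Illustrative 4 x 4 input with a single channel.
x = np.array([[1, 3, 2, 1],
              [2, 9, 1, 1],
              [1, 3, 2, 3],
              [5, 6, 1, 2]])[:, :, None]

pooled = max_pool2d(x, f=2, s=2)
print(pooled[:, :, 0])  # rows: [9 2] and [6 3]

# f = 3, s = 1 on a 5 x 5 x 2 input gives a 3 x 3 x 2 output.
print(max_pool2d(np.zeros((5, 5, 2)), f=3, s=1).shape)  # (3, 3, 2)
```

Note that the function has only the hyperparameters `f` and `s`; there is no weight array anywhere for gradient descent to update.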

[Average Pooling]

Max pooling is used most of the time, but

- you might use average pooling to collapse your representation

Pooling almost never uses padding.

nothing to learn - there are no parameters that backprop will adapt through max pooling
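For comparison, a minimal average pooling sketch. The name `avg_pool2d` and the 7 x 7 x 4 input are hypothetical, chosen only to show how a filter as large as the input "collapses" each channel to a single number.

```python
import numpy as np

def avg_pool2d(x, f, s):
    """Average pooling over an (n_H, n_W, n_C) array, one channel at a time."""
    n_h, n_w, n_c = x.shape
    out_h = (n_h - f) // s + 1
    out_w = (n_w - f) // s + 1
    out = np.empty((out_h, out_w, n_c), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * s:i * s + f, j * s:j * s + f, :]
            out[i, j, :] = window.mean(axis=(0, 1))  # mean within each channel
    return out

# Hypothetical 7 x 7 x 4 activation volume.
x = np.arange(7 * 7 * 4, dtype=float).reshape(7, 7, 4)

# A 7 x 7 filter covers the whole map, so each channel collapses to one value.
collapsed = avg_pool2d(x, f=7, s=7)
print(collapsed.shape)  # (1, 1, 4)
```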

CNN Example

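- A conv + pool pair is counted together as a single layer

- The 400 values of the 5 x 5 x 16 output are unrolled into a single vector, and then

- passed to a fully connected layer (a neural network layer with $W^{[3]}$ of shape $(120, 400)$ and $b^{[3]}$ of shape $(120)$)

- Finally, the result goes through a softmax to obtain $\hat{y}$

- As the network gets deeper, $n_{H}$ and $n_{W}$ shrink while $n_{C}$ grows

- conv-pool-conv-pool-FC-FC-FC is a commonly seen pattern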
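The unroll, fully connected, softmax steps can be sketched as follows. The weights are random and untrained, ReLU is an assumed activation, the helper names `dense` and `softmax` are made up for this sketch, and the bias is stored as a (120, 1) column so it adds cleanly to a column vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the POOL2 activations (random values, not a real forward pass).
pool2_out = rng.standard_normal((5, 5, 16))
a2 = pool2_out.reshape(400, 1)  # unroll 5 * 5 * 16 = 400 values into a column

def dense(a_prev, n_out):
    """Fully connected layer with random (untrained) weights: returns W, b, z."""
    W = rng.standard_normal((n_out, a_prev.shape[0])) * 0.01
    b = np.zeros((n_out, 1))
    return W, b, W @ a_prev + b

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

W3, b3, z3 = dense(a2, 120)
a3 = np.maximum(z3, 0)       # ReLU (assumed activation)
W4, b4, z4 = dense(a3, 84)
a4 = np.maximum(z4, 0)
W5, b5, z5 = dense(a4, 10)
y_hat = softmax(z5)          # 10 class probabilities

print(W3.shape, b3.shape, y_hat.shape)  # (120, 400) (120, 1) (10, 1)
```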

Layer	Activation shape	Activation size	# parameters
Input	(32, 32, 3)	3072	0
CONV1 (f=5, s=1)	(28, 28, 8)	6272	608
POOL1	(14, 14, 8)	1568	0
CONV2 (f=5, s=1)	(10, 10, 16)	1600	3216
POOL2	(5, 5, 16)	400	0
FC3	(120, 1)	120	48120
FC4	(84, 1)	84	10164
Softmax	(10, 1)	10	850

- Pooling layers have no parameters to learn

- Conv layers have relatively few parameters

- The activation size decreases gradually (shrinking too abruptly is not good either)

*Remember to include the bias when counting parameters
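The shapes and parameter counts in the table can be checked with a few lines of Python. The helper names are made up for this sketch, and the pooling layers are assumed to use f = 2, s = 2, which is consistent with the halving seen in the table.

```python
def conv_out(n, f, s=1, p=0):
    """Output height/width of a conv or pooling layer."""
    return (n + 2 * p - f) // s + 1

def conv_params(f, n_c_prev, n_filters):
    """(f * f * n_c_prev weights + 1 bias) per filter."""
    return (f * f * n_c_prev + 1) * n_filters

def fc_params(n_in, n_out):
    """n_in * n_out weights plus n_out biases."""
    return n_in * n_out + n_out

n1 = conv_out(32, f=5)        # CONV1 -> 28
n2 = conv_out(n1, f=2, s=2)   # POOL1 -> 14 (assumed f = 2, s = 2)
n3 = conv_out(n2, f=5)        # CONV2 -> 10
n4 = conv_out(n3, f=2, s=2)   # POOL2 -> 5
flat = n4 * n4 * 16           # 400 values fed into FC3

params = {
    "CONV1": conv_params(5, 3, 8),
    "CONV2": conv_params(5, 8, 16),
    "FC3": fc_params(flat, 120),
    "FC4": fc_params(120, 84),
    "Softmax": fc_params(84, 10),
}
for name, count in params.items():
    print(name, count)  # 608, 3216, 48120, 10164, 850
```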

Why Convolutions?

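Consider mapping a 32 x 32 x 3 input to a 28 x 28 x 6 output. If we tried to implement this with an ordinary fully connected NN,

we would need about 14 million parameters (3072 * 4704 ≈ 14.45 million) to map the 3072 input features to the 4704 activation values.

With a conv layer, on the other hand, only (5 * 5 * 3 + 1) * 6 = 456 parameters are needed

- (filter size x filter size x number of channels + bias) x number of filters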
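The arithmetic behind this comparison, as a quick check. It assumes that 3072 and 4704 stand for a 32 x 32 x 3 input and a 28 x 28 x 6 output, which matches the six 5 x 5 x 3 filters in the conv formula.

```python
n_in = 32 * 32 * 3    # 3072 input features
n_out = 28 * 28 * 6   # 4704 output activations

fc_weights = n_in * n_out          # fully connected: one weight per (input, output) pair
conv_params = (5 * 5 * 3 + 1) * 6  # six 5 x 5 x 3 filters, each with one bias

print(fc_weights, conv_params)     # 14450688 456
print(fc_weights // conv_params)   # the dense layer needs roughly 31,000 times more
```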

[Two reasons a convnet has so few parameters]

(1) parameter sharing

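This seems to mean that a single feature detector (i.e., a filter) can be reused many times by sliding it to different positions on the same image.

Just as we do not need one filter for the upper-left corner and a separate one for the lower-right corner, there is no need to learn different parameters depending on the position in the image.

What if the upper-left and lower-right regions have different distributions? They are likely still similar enough that sharing works fine.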

(2) sparsity of connections

If the filter is laid over the upper-left corner of the image, all the other pixels have no effect at all on that output value.

"this output depends only on these nine input features and the other pixels just don't affect this output at all"

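--> Thanks to these two mechanisms, the network can be trained with relatively few parameters and is less prone to overfitting

"Convnets are good at capturing translation invariance"

ex. If the cat in a cat image is shifted slightly to the right, it is still a picture of a cat

A CNN should produce similar features for such a shifted image and classify it with the same label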
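Both properties can be seen in a tiny NumPy sketch. The 6 x 6 image values are arbitrary and `conv2d_valid` is a hypothetical helper written for this note; the same 3 x 3 kernel is reused at every position (parameter sharing), and each output only reads the nine pixels under it (sparsity of connections).

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Single-channel 'valid' cross-correlation with one shared kernel."""
    f = kernel.shape[0]
    out_h = img.shape[0] - f + 1
    out_w = img.shape[1] - f + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # The same 9 weights are applied at every (i, j).
            out[i, j] = np.sum(img[i:i + f, j:j + f] * kernel)
    return out

img = np.arange(36.0).reshape(6, 6)     # arbitrary example image
kernel = np.array([[1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0]])   # vertical edge detector

before = conv2d_valid(img, kernel)

changed = img.copy()
changed[5, 5] += 100.0                  # perturb a pixel far from the upper-left window
after = conv2d_valid(changed, kernel)

print(before[0, 0] == after[0, 0])      # True: output (0, 0) never sees pixel (5, 5)
print(before[3, 3], after[3, 3])        # -6.0 -106.0: this window does contain (5, 5)
print(kernel.size)                      # 9 weights produce all 16 outputs
```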

The fact that you are applying the same filter across all the positions of the image, both in the earlier layers and in the later layers, helps a neural network automatically learn to be more robust, or to better capture the desirable property of translation invariance
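A small sketch of what shifting does to a conv feature map. The 8 x 8 image with a 2 x 2 "object" is a toy example and `conv2d_valid` is a hypothetical helper. Strictly speaking, the conv output moves along with the object (it is shift-equivariant); taking a max over the whole map, added here only for illustration, then gives a summary that does not change under the shift.

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Single-channel 'valid' cross-correlation with one shared kernel."""
    f = kernel.shape[0]
    out_h = img.shape[0] - f + 1
    out_w = img.shape[1] - f + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(img[i:i + f, j:j + f] * kernel)
    return out

kernel = np.array([[1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0]])

img = np.zeros((8, 8))
img[2:4, 2:4] = 1.0                    # toy "object" on an empty background
shifted = np.roll(img, 1, axis=1)      # move the object one pixel to the right

fmap = conv2d_valid(img, kernel)
fmap_shifted = conv2d_valid(shifted, kernel)

# The feature map shifts by the same one pixel as the input.
print(np.array_equal(fmap_shifted[:, 1:], fmap[:, :-1]))  # True

# A global max over the map is unchanged by the shift.
print(fmap.max() == fmap_shifted.max())                   # True
```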

[training CNN]
