PR-087 "Spectral Normalization for Generative Adversarial Networks" Review (2018 ICLR)(GAN)

1. Citations & Abstract

Citations: 2,822 as of 2022.02.23

Authors

Takeru Miyato, Toshiki Kataoka - Preferred Networks, Inc.
Masanori Koyama - Ritsumeikan University
Yuichi Yoshida - National Institute of Informatics

Abstract

2. Presentation Summary

Official paper link

https://openreview.net/forum?id=B1QRgziT-

arXiv

https://arxiv.org/abs/1802.05957

Presentation Slide

https://www.slideshare.net/thinkingfactory/pr12-spectral-normalization-for-generative-adversarial-networks

https://youtu.be/iXSYqohGQhM

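GANs are hard to train and their training is unstable. The paper proposes a new normalization, spectral normalization, to make training more stable.

Unlike sample-wise normalization such as batch norm, this method normalizes in weight space.

1) The Lipschitz constant is the only hyperparameter to tune, and training is not sensitive to it

2) It is simple to implement and the extra computational cost is small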
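The low cost in point 2) comes from estimating the spectral norm with power iteration instead of a full SVD, reusing the vector u between training steps. Below is a minimal NumPy sketch of that idea, not the authors' code: the function name `power_iteration_sigma` and the toy matrix with spectrum (5, 3, 2, 1) are assumptions made for illustration.

```python
import numpy as np

def power_iteration_sigma(W, u, n_iters=1, eps=1e-12):
    """Estimate the largest singular value of W, starting from the vector u.

    Returns the estimate and the updated u, so u can be reused in the next call.
    """
    for _ in range(n_iters):
        v = W.T @ u
        v = v / (np.linalg.norm(v) + eps)
        u = W @ v
        u = u / (np.linalg.norm(u) + eps)
    sigma = float(u @ W @ v)
    return sigma, u

rng = np.random.default_rng(0)
# Toy weight matrix with a known spectrum (5, 3, 2, 1) so the estimate can be checked.
Q_left, _ = np.linalg.qr(rng.normal(size=(6, 4)))
Q_right, _ = np.linalg.qr(rng.normal(size=(4, 4)))
W = Q_left @ np.diag([5.0, 3.0, 2.0, 1.0]) @ Q_right.T

u = rng.normal(size=6)
sigma_one_step, u = power_iteration_sigma(W, u, n_iters=1)    # rough lower estimate
sigma_converged, u = power_iteration_sigma(W, u, n_iters=50)  # approaches 5.0
print(sigma_one_step, sigma_converged)
```

Each iteration costs two matrix-vector products, which is why one iteration per training step adds little overhead when u is carried over.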

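The standard GAN formulation places no constraint on the derivative of the discriminator. This raised the idea that adding such a constraint might give better results, and that idea developed into WGAN and WGAN-GP.

Constraint in function space -> Lipschitz constraint: the derivative is constrained to stay below a given value

WGAN - weight clipping to [-c, c] (c = 0.01 in the WGAN paper)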
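A rough sketch of WGAN-style weight clipping, assuming the critic weights are plain NumPy arrays and that clipping is applied after every update. The helper `clip_weights` and the layer shapes are hypothetical. In the WGAN paper the clipping range is [-c, c] with c = 0.01 as the default.

```python
import numpy as np

def clip_weights(weights, c=0.01):
    """Clamp every entry of every weight array to the interval [-c, c]."""
    return [np.clip(W, -c, c) for W in weights]

rng = np.random.default_rng(0)
# Hypothetical two-layer critic; in training this would run after each optimizer step.
critic_weights = [rng.normal(size=(8, 4)), rng.normal(size=(1, 8))]
clipped = clip_weights(critic_weights, c=0.01)
max_abs = max(float(np.abs(W).max()) for W in clipped)
print(max_abs)
```

Clipping bounds each entry separately, so it limits the Lipschitz constant only indirectly and tends to squash the weights toward the two boundary values.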

WGAN-GP - constraint imposed through gradient penalty regularization

(a single sample point is drawn, the gradient at that point is computed, and a regularizer is applied to it)
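To make the sampling step concrete, here is a small sketch of the gradient penalty on a toy critic. The critic f(x) = w2 . tanh(W1 x) and all function names are assumptions chosen so the input gradient can be written by hand; a real implementation would rely on automatic differentiation.

```python
import numpy as np

def critic(x, W1, w2):
    """Toy critic f(x) = w2 . tanh(W1 x); hypothetical, for illustration only."""
    return float(w2 @ np.tanh(W1 @ x))

def critic_input_grad(x, W1, w2):
    """Analytic gradient of the toy critic with respect to its input x."""
    a = np.tanh(W1 @ x)
    return W1.T @ (w2 * (1.0 - a ** 2))

def gradient_penalty(x_real, x_fake, W1, w2, rng):
    """(||grad f(x_hat)||_2 - 1)^2 at one random point between x_real and x_fake."""
    t = rng.uniform()
    x_hat = t * x_real + (1.0 - t) * x_fake
    grad = critic_input_grad(x_hat, W1, w2)
    return float((np.linalg.norm(grad) - 1.0) ** 2)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 3))
w2 = rng.normal(size=8)
x_real, x_fake = rng.normal(size=3), rng.normal(size=3)
penalty = gradient_penalty(x_real, x_fake, W1, w2, rng)
print(penalty)
```

The penalty only sees the gradient at the sampled x_hat, which is the point behind calling the approach sample-based and heuristic.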

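Both methods work heuristically on sampled points. Constraining the entire target function space this way is hard, so the paper is skeptical of both approaches.

For p = 2 the induced matrix norm is the spectral norm (the L2 matrix norm)

Eq. (6): matrix norm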
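As I read the paper, the matrix norm here is the spectral norm, i.e. the largest value of ||Ah||_2 / ||h||_2 over all nonzero h, which equals the largest singular value of A. A quick NumPy check of that equivalence follows (the matrix A and the sample count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))

# Largest singular value, computed two ways.
sigma_svd = np.linalg.svd(A, compute_uv=False)[0]
sigma_norm = np.linalg.norm(A, ord=2)

# Ratio ||A h|| / ||h|| over many random directions: never exceeds sigma.
H = rng.normal(size=(3, 10000))
ratios = np.linalg.norm(A @ H, axis=0) / np.linalg.norm(H, axis=0)
best_sampled_ratio = float(ratios.max())

# The maximum is attained exactly at the first right singular vector.
_, _, Vt = np.linalg.svd(A)
ratio_at_v1 = float(np.linalg.norm(A @ Vt[0]))
print(sigma_svd, best_sampled_ratio, ratio_at_v1)
```

Random directions only approach the bound from below, while the singular vector reaches it, so the norm is a statement about every h and not just sampled ones.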

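Dividing each layer's W by its spectral norm constrains the Lipschitz constant of f to be at most 1 (given 1-Lipschitz activations such as ReLU)

This gives a clean bound over the entire function space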
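A sketch of the bound in action, assuming a small bias-free ReLU network whose layers are each divided by their exact spectral norm. The names `spectral_normalize` and `discriminator` and the layer sizes are made up for this example.

```python
import numpy as np

def spectral_normalize(W):
    """Divide W by its largest singular value, so the result has spectral norm 1."""
    return W / np.linalg.norm(W, ord=2)

def relu(z):
    return np.maximum(z, 0.0)

def discriminator(x, weights):
    """Bias-free MLP: ReLU after every layer except the last; returns a scalar."""
    h = x
    for W in weights[:-1]:
        h = relu(W @ h)
    return (weights[-1] @ h).item()

rng = np.random.default_rng(0)
raw = [rng.normal(size=(16, 4)), rng.normal(size=(8, 16)), rng.normal(size=(1, 8))]
weights_sn = [spectral_normalize(W) for W in raw]

# |f(x) - f(y)| / ||x - y|| should stay at or below 1 for any pair of inputs.
worst_ratio = 0.0
for _ in range(1000):
    x, y = rng.normal(size=4), rng.normal(size=4)
    diff = abs(discriminator(x, weights_sn) - discriminator(y, weights_sn))
    worst_ratio = max(worst_ratio, diff / np.linalg.norm(x - y))
print(worst_ratio)
```

The random pairs only illustrate the bound. The guarantee itself comes from the product of per-layer spectral norms, so it holds for all inputs without sampling.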

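Eq. (11)

The first term is the existing one, the gradient without normalization

The second term is the added regularization

$\lambda$ is positive when $\delta$ and $W_{SN}h$ point in the same direction

It acts as a penalty that keeps the layer from looking in a single direction and lets it look in diverse directions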
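The two-term structure can be checked numerically. As I read the paper, the gradient has the form (1/sigma(W)) (delta h^T - lambda u1 v1^T) with lambda = delta^T (W_SN h), where u1 and v1 are the first left and right singular vectors of W. The sketch below assumes a single sample and a toy objective V = c . (W_SN h), so that delta is simply the fixed vector c; these choices are mine, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
h = rng.normal(size=3)
c = rng.normal(size=4)  # plays the role of delta, because V is linear in W_SN h

def V(W):
    """Toy objective: c . (W / sigma(W)) h."""
    return float(c @ (W / np.linalg.norm(W, ord=2)) @ h)

U, S, Vt = np.linalg.svd(W)
sigma, u1, v1 = S[0], U[:, 0], Vt[0]
W_sn = W / sigma
lam = float(c @ W_sn @ h)

# First term: gradient without normalization. Second term: the added penalty.
grad_formula = (np.outer(c, h) - lam * np.outer(u1, v1)) / sigma

# Central finite differences as an independent check.
eps = 1e-6
grad_numeric = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        E = np.zeros_like(W)
        E[i, j] = eps
        grad_numeric[i, j] = (V(W + E) - V(W - E)) / (2 * eps)

max_abs_error = float(np.abs(grad_formula - grad_numeric).max())
print(max_abs_error)
```

The second term always points along u1 v1^T, the current dominant direction, which is how it discourages the weights from concentrating on that single direction when lambda is positive.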

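Weight normalization

Divides each row of W by its L2 norm -> introduces an unintended constraint -> pushes the rank toward 1

With h fixed, take the SVD of W (rotation matrices + scaling) and maximize the projection. The maximum is reached when the scaling aligns with h: a single nonzero singular value, all others 0

Only one singular vector is left, so rank = 1

The discriminator judges from a single projected direction -> too strong a constraint

Spectral normalization should be used instead, to leave room for learning singular vectors in diverse directions.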
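A small numeric illustration of the rank-1 argument, assuming weight normalization without its learned scale (every row has unit L2 norm) and treating rows as output units. The sizes and variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in = 6, 4
W = rng.normal(size=(d_out, d_in))

# Row-wise normalization fixes the sum of squared singular values to d_out.
W_wn = W / np.linalg.norm(W, axis=1, keepdims=True)
sv_wn = np.linalg.svd(W_wn, compute_uv=False)
sum_sq = float(np.sum(sv_wn ** 2))

# Fixed unit input h.
h = rng.normal(size=d_in)
h = h / np.linalg.norm(h)
norm_generic = float(np.linalg.norm(W_wn @ h))

# Under the same constraint, ||W h|| is largest when every row equals h: a rank-one matrix.
W_rank1 = np.tile(h, (d_out, 1))
sv_rank1 = np.linalg.svd(W_rank1, compute_uv=False)
norm_rank1 = float(np.linalg.norm(W_rank1 @ h))
print(sum_sq, norm_generic, norm_rank1, sv_rank1)
```

Because the total squared spectrum is fixed, responding strongly to one input direction forces all other singular values toward zero. Spectral normalization bounds only the largest singular value and leaves the rest free.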

Orthonormal regularization: in the SVD, every singular value is 1

The whole spectrum is forced to 1 -> unimportant singular values are learned as well -> not good
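To compare the two spectra side by side, the sketch below uses the Q factor of a QR decomposition as a stand-in for a matrix that satisfies the orthonormal constraint exactly. That is an assumption for illustration: the regularizer itself is a penalty that pushes W^T W toward the identity rather than a projection.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 4))
sv_raw = np.linalg.svd(W, compute_uv=False)

# Orthonormal columns: the whole spectrum collapses to 1.
Q, _ = np.linalg.qr(W)
sv_orthonormal = np.linalg.svd(Q, compute_uv=False)

# Spectral normalization: only the largest singular value is set to 1,
# and the relative shape of the spectrum is unchanged.
W_sn = W / np.linalg.norm(W, ord=2)
sv_sn = np.linalg.svd(W_sn, compute_uv=False)
print(sv_orthonormal, sv_sn)
```

With the orthonormal constraint every direction gets the same gain, so the network cannot down-weight unimportant directions. Spectral normalization keeps that freedom.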

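Gradient penalty

Heuristic and computationally heavy

Because spectral normalization also takes the rest of the spectrum into account, it is insensitive to a range of hyperparameters such as lr, beta1, and beta2

Singular values $\sigma$ of layers 1 ~ 7 in Figure 3 (b): good values are obtained for the other singular vectors as well

SN-GAN trains stably even as the feature map size grows

The paper reports practical measurements across a variety of network architectures, hyperparameter settings, and datasets.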

References

GitHub - official implementation (Chainer)

https://github.com/pfnet-research/sngan_projection
