PR-131 "A Style-Based Generator Architecture for Generative Adversarial Networks" Review (2019 CVPR)(GAN)

1. Reading the Citations & Abstract

Citations: 4,004 as of 2022.05.04

Authors

Tero Karras, Samuli Laine, Timo Aila - NVIDIA

Abstract

2. Presentation Summary

Official paper link

https://openaccess.thecvf.com/content_CVPR_2019/html/Karras_A_Style-Based_Generator_Architecture_for_Generative_Adversarial_Networks_CVPR_2019_paper.html

Tero Karras, Samuli Laine, Timo Aila; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 4401-4410

Arxiv

https://arxiv.org/abs/1812.04948


Presentation Slide

None

https://youtu.be/TWzEbMrH59o

Summary

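- Introduces a technique that enables intuitive, scale-specific control of image synthesis.

- Proposes two new automated methods for quantifying interpolation quality and disentanglement.

- Introduces a high-resolution, highly varied dataset of human faces.

StyleGAN uses ProGAN as its backbone.

- Previous approaches fed the latent z directly into the generator. In that case z has to follow the probability distribution of the training data, which induces unavoidable entanglement. (drawback)

- Normalizing the latent z and then passing it through 8 FC layers produces an intermediate latent w that is free of this constraint.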
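The mapping step described above (normalize z, then 8 FC layers) can be sketched roughly as follows. This is a minimal NumPy illustration, not the official implementation: the 512-dimensional latents, the leaky ReLU slope, the weight scaling, and the function names `normalize_latent`, `init_mapping_weights`, and `mapping_network` are all assumptions made for the example, and the weights are random rather than trained.

```python
import numpy as np

def normalize_latent(z, eps=1e-8):
    """Scale each latent vector so its mean squared value is about 1."""
    return z / np.sqrt(np.mean(z ** 2, axis=1, keepdims=True) + eps)

def init_mapping_weights(rng, dim=512, num_layers=8):
    """Random (untrained) weights for illustration only."""
    return [(rng.standard_normal((dim, dim)) / np.sqrt(dim), np.zeros(dim))
            for _ in range(num_layers)]

def mapping_network(z, weights, negative_slope=0.2):
    """Map z to the intermediate latent w through a stack of FC layers."""
    x = normalize_latent(z)
    for W, b in weights:
        x = x @ W + b
        x = np.where(x > 0, x, negative_slope * x)  # leaky ReLU (assumed)
    return x

rng = np.random.default_rng(0)
weights = init_mapping_weights(rng)
z = rng.standard_normal((4, 512))   # batch of 4 latent codes
w = mapping_network(z, weights)
print(w.shape)
```

Because w is produced by a learned function rather than sampled from a fixed distribution, it does not have to match the shape of the sampling distribution of z.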

The first green box in the synthesis network g = learned constant tensor

Explained: A Style-Based Generator Architecture for GANs - Generating and Tuning Realistic Artificial Faces - Rani Horev

- Learned affine transformations turn w into styles y, which control the adaptive instance normalization (AdaIN) operations.

- The spatially invariant style y is computed from the vector w.
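In the paper, AdaIN normalizes each feature map x_i separately and then scales and biases it with the style: AdaIN(x_i, y) = y_{s,i} (x_i - mu(x_i)) / sigma(x_i) + y_{b,i}. A minimal NumPy sketch of this, assuming NCHW feature maps and a single affine matrix `A` with bias `b` (names and initialization are hypothetical, weights are random rather than learned):

```python
import numpy as np

def style_from_w(w, A, b):
    """Affine transform: w (N, w_dim) -> scale y_s and bias y_b, each (N, C)."""
    y = w @ A + b
    y_s, y_b = np.split(y, 2, axis=1)
    return y_s, y_b

def adain(x, y_s, y_b, eps=1e-8):
    """x: (N, C, H, W). Normalize each feature map, then apply the style."""
    mu = x.mean(axis=(2, 3), keepdims=True)
    sigma = x.std(axis=(2, 3), keepdims=True)
    x_norm = (x - mu) / (sigma + eps)
    return y_s[:, :, None, None] * x_norm + y_b[:, :, None, None]

rng = np.random.default_rng(0)
N, C, H, W_, w_dim = 2, 8, 4, 4, 512
x = rng.standard_normal((N, C, H, W_))
w = rng.standard_normal((N, w_dim))
A = rng.standard_normal((w_dim, 2 * C)) * 0.01
b = np.concatenate([np.ones(C), np.zeros(C)])  # start the scale near 1 (assumed)
y_s, y_b = style_from_w(w, A, b)
out = adain(x, y_s, y_b)
print(out.shape)
```

The style has one scale and one bias per channel and no spatial dimensions, which is what "spatially invariant" means here.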

Noise

- A single-channel image consisting of uncorrelated Gaussian noise

Two different loss functions

CelebA-HQ: WGAN-GP

FFHQ: non-saturating loss with R1 regularization
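For reference, the two objectives are usually written as below. This is the standard textbook form rather than something copied from the paper: D is taken to output an unbounded score (a logit in the first case), \sigma is the sigmoid, \hat{x} is a random interpolation between real and generated samples, and \gamma and \lambda are the regularization weights.

```latex
% Non-saturating logistic loss with R1 regularization on real samples
\mathcal{L}_D = -\mathbb{E}_{x}\big[\log \sigma(D(x))\big]
               - \mathbb{E}_{z}\big[\log\big(1 - \sigma(D(G(z)))\big)\big]
               + \frac{\gamma}{2}\,\mathbb{E}_{x}\big[\lVert \nabla_{x} D(x) \rVert_2^2\big]
\mathcal{L}_G = -\mathbb{E}_{z}\big[\log \sigma(D(G(z)))\big]

% WGAN-GP
\mathcal{L}_D = \mathbb{E}_{z}\big[D(G(z))\big] - \mathbb{E}_{x}\big[D(x)\big]
               + \lambda\,\mathbb{E}_{\hat{x}}\big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\big]
\mathcal{L}_G = -\mathbb{E}_{z}\big[D(G(z))\big]
```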

A trick for getting good-looking generated images: the truncation trick

(Unlike the usual approach, which applies it to z, here it is applied to w.)
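Truncation in W pulls a sampled w toward the average latent: w' = w_avg + psi * (w - w_avg), with psi < 1. A minimal sketch, where `w_samples` is a random stand-in for mapping-network outputs and psi = 0.7 is only an illustrative value:

```python
import numpy as np

def truncate_w(w, w_avg, psi=0.7):
    """Move w toward the mean latent w_avg; psi=1 keeps w, psi=0 gives w_avg."""
    return w_avg + psi * (w - w_avg)

rng = np.random.default_rng(0)
# Stand-in for f(z) outputs; a real w would come from the trained mapping network.
w_samples = rng.standard_normal((10000, 512)) + 3.0
w_avg = w_samples.mean(axis=0)      # estimate of the center of mass of W
w = w_samples[:4]
w_trunc = truncate_w(w, w_avg, psi=0.7)
print(np.linalg.norm(w_trunc - w_avg) / np.linalg.norm(w - w_avg))
```

Smaller psi trades variety for samples closer to the "average" face.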

Three techniques

1) Style mixing

2) Stochastic variation

3) Separation of global effects from stochasticity

1) Style mixing

Using two random latent codes during training prevents the network from assuming that adjacent styles are correlated.

This acts as a form of regularization.
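A rough sketch of style mixing: two latents w1 and w2 are used, with w1 feeding the layers before a randomly chosen crossover point and w2 feeding the rest. The helper `mix_styles` and the choice of 18 style inputs with 512 dimensions are assumptions for illustration.

```python
import numpy as np

def mix_styles(w1, w2, num_layers, crossover):
    """Per-layer styles: w1 for layers [0, crossover), w2 for the remaining layers."""
    layer_idx = np.arange(num_layers)[:, None]
    return np.where(layer_idx < crossover, w1[None, :], w2[None, :])

rng = np.random.default_rng(0)
num_layers, w_dim = 18, 512
w1 = rng.standard_normal(w_dim)
w2 = rng.standard_normal(w_dim)
crossover = int(rng.integers(1, num_layers))   # random switch point in 1..17
per_layer_w = mix_styles(w1, w2, num_layers, crossover)
print(per_layer_w.shape, crossover)
```

Choosing an early crossover takes coarse attributes from w1 and the rest from w2; a late crossover swaps only the fine styles.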

2) Stochastic variation

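- Details that vary stochastically, such as the exact placement of hairs, stubble, and freckles

- They do not affect our perception of the image and can be randomized.

- Per-pixel noise is added after each conv layer.

- Injecting noise this way keeps its effect localized.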
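The noise input can be sketched as a single-channel Gaussian image that is broadcast to every feature map with a per-channel scaling factor and added to the conv output. In this sketch the helper name `add_noise` and the fixed `scale` values are placeholders; in a trained network the scales are learned.

```python
import numpy as np

def add_noise(x, scale, rng):
    """x: (N, C, H, W) conv output; scale: (C,) per-channel noise strength."""
    n, _, h, w = x.shape
    noise = rng.standard_normal((n, 1, h, w))   # one noise image shared by all channels
    return x + scale[None, :, None, None] * noise

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8, 16, 16))
scale = np.full(8, 0.1)                         # placeholder for learned factors
out_a = add_noise(x, scale, rng)
out_b = add_noise(x, scale, rng)                # new noise, same features
print(np.abs(out_a - out_b).mean())
```

Calling it twice on the same features gives two slightly different outputs, which matches the idea of different realizations of hair and other fine detail for the same identity.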

Figure 4 (b): adding noise changes the hair slightly from one realization to the next.

Figure 5 (a): with noise applied to all layers, natural and delicate curls appear.

Figure 5 (b): without noise, natural-looking hair is hard to produce.

Figure 5 (c): with noise applied to the fine layers only, finer curls appear.

Figure 5 (d): with noise applied to the coarse layers only, large-scale curling appears.

3) Separation of global effects from stochasticity

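When a combination such as long-haired men does not exist in the training set, the mapping from Z becomes entangled. In contrast, with the mapping to W, the latent space can take on the training distribution we want.

Perceptual path length formula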
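As I recall it from the paper, the perceptual path length is defined for Z and for W as below, where G = g ∘ f is the full generator, f is the mapping network, g is the synthesis network, d is a perceptual distance between images, z_1, z_2 ~ P(z), t ~ U(0, 1), and epsilon is a small step (1e-4). Spherical interpolation is used in Z and linear interpolation in W.

```latex
l_{\mathcal{Z}} = \mathbb{E}\left[ \frac{1}{\epsilon^{2}}\,
    d\big( G(\mathrm{slerp}(z_1, z_2; t)),\; G(\mathrm{slerp}(z_1, z_2; t + \epsilon)) \big) \right]

l_{\mathcal{W}} = \mathbb{E}\left[ \frac{1}{\epsilon^{2}}\,
    d\big( g(\mathrm{lerp}(f(z_1), f(z_2); t)),\; g(\mathrm{lerp}(f(z_1), f(z_2); t + \epsilon)) \big) \right]
```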

Linear separability

FFHQ dataset

Truncation trick in W

TensorFlow / Adam / mirror augmentation

One week on 8 GPUs

Nearest-neighbor up/downsampling -> bilinear sampling

Trained on 25M images

Weights initialized using N(0,1)

40 classifiers

Does not use batch norm, spectral norm, attention mechanisms, dropout, or pixelwise feature vector normalization

As FID gets lower, path length tends to get higher, which is a problem -> Discussion

References

GitHub

https://github.com/NVlabs/stylegan


Blog

https://towardsdatascience.com/explained-a-style-based-generator-architecture-for-gans-generating-and-tuning-realistic-6cb2be0f431

