Citations : 38 as of Monday, 2024.10.28
Authors (affiliation) : ( ) [ Huazhong University of Science and Technology, Huawei Inc. ]
Paper & GitHub links : [ Official ] [ arXiv ] [ Official GitHub ] [ Project ]
This paper introduces a text-to-3D model that trains quickly by combining a 3D diffusion model, a 2D diffusion model, and Gaussian Splatting.
Our contributions can be summarized as follows.
• We propose a text-to-3D method, named as GaussianDreamer, which bridges the 3D and 2D diffusion models via Gaussian splatting, enjoying both 3D consistency and rich generation details.
• Noisy point growing and color perturbation are introduced to supplement the initialized 3D Gaussians for further content enrichment.
• The overall method is simple and quite effective. A 3D instance can be generated within 15 minutes on one GPU, much faster than previous methods, and can be directly rendered in real time.
DreamFusion : https://aigong.tistory.com/652
Score Distillation Sampling (SDS) loss : a loss that uses the score function of a 2D diffusion model to enable 3D generation
Uses MipNeRF
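For reference, the SDS gradient mentioned above can be written as in DreamFusion. Here x = g(θ) is the image rendered from the 3D parameters θ, z_t is its noised version at timestep t, y is the text prompt, ε̂_φ is the noise predicted by the 2D diffusion model, and w(t) is a timestep-dependent weight.

```latex
\nabla_{\theta} \mathcal{L}_{\mathrm{SDS}}\bigl(\phi, \mathbf{x} = g(\theta)\bigr)
  \triangleq \mathbb{E}_{t,\epsilon}\!\left[
    w(t)\,\bigl(\hat{\epsilon}_{\phi}(\mathbf{z}_t;\, y,\, t) - \epsilon\bigr)\,
    \frac{\partial \mathbf{x}}{\partial \theta}
  \right]
```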
3D Gaussian Splatting
A splatting-based image rendering method that runs in real time
Represents the scene as a set of anisotropic Gaussians
rendering process
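As a rough illustration of the rendering process, the color of one pixel in 3D Gaussian Splatting is obtained by front-to-back alpha blending of the depth-sorted Gaussians that cover it. The sketch below is a minimal pure-Python version for a single pixel; the function name `composite_pixel` is hypothetical, and the per-Gaussian `alphas` are assumed to already include the learned opacity multiplied by the projected 2D Gaussian's value at that pixel.

```python
def composite_pixel(colors, alphas):
    """Front-to-back alpha blending of depth-sorted Gaussians for one pixel.

    colors: list of (r, g, b) tuples, nearest Gaussian first.
    alphas: effective opacity of each Gaussian at this pixel, each in [0, 1].
    Returns the blended (r, g, b) and the remaining transmittance.
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet blocked by closer Gaussians
    for color, alpha in zip(colors, alphas):
        weight = alpha * transmittance
        for ch in range(3):
            pixel[ch] += weight * color[ch]
        transmittance *= 1.0 - alpha
    return tuple(pixel), transmittance
```

A real rasterizer does this per tile on the GPU and stops early once the transmittance is close to zero; this loop only shows the blending rule.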
(1) Initialization with a 3D diffusion model (text-to-3D & text-to-motion diffusion models)
(2) Noisy point growing and color perturbation
(3) Optimization with a 2D diffusion model (SDS)
(4) Real-time rendering with Gaussian Splatting
3.3.1 Text-to-3D Diffusion Model
Predict the SDF & texture colors with a 3D generative model
Generate triangle mesh m : query SDF values at vertices -> query texture colors at each vertex of m
ptm(pm, cm) : the vertices and colors of m converted to point clouds
pm : positions of the point clouds, identical to the vertex coordinates of m
cm : colors of the point clouds
Noisy Point Growing and Color Perturbation
The ptm generated above is not used directly as the initial 3D Gaussians
Noisy point growing and color perturbation are applied to improve the initialization
Compute the bounding box of the surface of ptm
Uniformly grow a point cloud ptr within it
A KDTree is used for this step
Look up nearest points and keep the points within a normalized distance of 0.01
Each color is the nearest point's color plus a perturbation (0~0.2)
Merge the positions and colors of ptm and ptr to obtain the final point clouds.
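The growing and perturbation steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name and parameters are hypothetical, a brute-force nearest-neighbor search stands in for the KDTree to keep the example dependency-free, distances are normalized by the longest bounding-box edge (an assumption about what "normalized" means), and colors are clamped to 1.0 (also an assumption).

```python
import math
import random


def grow_noisy_points(positions, colors, num_candidates=1000,
                      max_dist=0.01, max_perturb=0.2, seed=0):
    """Grow extra points near an initial point cloud and perturb their colors.

    positions: list of (x, y, z) tuples (ptm positions).
    colors: list of (r, g, b) tuples in [0, 1] (ptm colors).
    Returns the merged positions and colors (original points first).
    """
    rng = random.Random(seed)
    # Axis-aligned bounding box of the initial point cloud.
    lo = [min(p[k] for p in positions) for k in range(3)]
    hi = [max(p[k] for p in positions) for k in range(3)]
    # Longest box edge, used to normalize distances (assumption).
    extent = max(h - l for l, h in zip(lo, hi)) or 1.0

    grown_pos, grown_col = [], []
    for _ in range(num_candidates):
        # Sample a candidate uniformly inside the bounding box.
        cand = tuple(rng.uniform(l, h) for l, h in zip(lo, hi))
        # Brute-force nearest neighbor (a KDTree would replace this in practice).
        idx = min(range(len(positions)),
                  key=lambda i: math.dist(cand, positions[i]))
        if math.dist(cand, positions[idx]) / extent <= max_dist:
            # Nearest point's color plus a random perturbation, clamped to 1.0.
            col = tuple(min(1.0, c + rng.uniform(0.0, max_perturb))
                        for c in colors[idx])
            grown_pos.append(cand)
            grown_col.append(col)

    # Merge the original and grown points into the final point cloud.
    return list(positions) + grown_pos, list(colors) + grown_col
```

With a dense mesh-derived point cloud, a spatial index such as `scipy.spatial.cKDTree` would make the nearest-neighbor query practical; the brute-force loop here is only suitable for small inputs.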
(Optional) 3.3.2 Text-to-Motion Diffusion Model
This step generates a human body motion sequence from text
Convert the human pose keypoints to an SMPL model represented as a triangle mesh m
mesh -> point clouds ptm
Initialize the 3D Gaussians from the point cloud pt
opacity αb = 0.1
covariance Σb : computed from the distance between the two nearest points
Optimize the 3D Gaussians θb with the SDS (Score Distillation Sampling) loss from the 2D diffusion model F2D
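The initialization described above (opacity 0.1, covariance derived from nearest-point distance) might look like the sketch below. The function name and dictionary layout are hypothetical, and storing a single isotropic `scale` equal to the nearest-neighbor distance is an assumption; the notes only state that the covariance is computed from the distance between the two nearest points.

```python
import math


def init_gaussians(positions, colors, opacity=0.1):
    """Build one Gaussian per point of the final point cloud.

    positions: list of (x, y, z) tuples; colors: list of (r, g, b) tuples.
    Each Gaussian gets a fixed initial opacity and an isotropic scale equal
    to the distance to its nearest neighbor (assumption, see lead-in).
    """
    gaussians = []
    for i, p in enumerate(positions):
        nearest = min(
            (math.dist(p, q) for j, q in enumerate(positions) if j != i),
            default=1.0,  # fallback for a single-point cloud (hypothetical choice)
        )
        gaussians.append({
            "position": tuple(p),
            "color": tuple(colors[i]),
            "opacity": opacity,
            "scale": nearest,
        })
    return gaussians
```

These values are only the starting point; the SDS optimization then updates position, color, opacity, and covariance.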
ThreeStudio, PyTorch
3D diffusion model : Shap-E, MDM
2D diffusion model : stabilityai/stable-diffusion-2-1-base
3D Gaussian
- learning rates : opacity 10^-2, position 5×10^-5
- Other miscellaneous hyperparameters : look them up as needed
Training Iteration : 1200
15 min
single RTX 3090
512x512 -> 1024x1024
Results are not always good.
Not sharp
Multi-face problems
Difficulty generating large-scale scenes, e.g., indoor scenes