
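Abstract
We observe that despite their hierarchical convolutional nature, the synthesis process of typical generative adversarial networks (GANs) depends on absolute pixel coordinates in an unhealthy manner. This manifests itself as, for example, detail appearing to be glued to image coordinates instead of the surfaces of depicted objects. We trace the root cause to careless signal processing that causes aliasing in the generator network. Interpreting all signals in the network as continuous, we derive generally applicable, small architectural changes that guarantee that unwanted information cannot leak into the hierarchical synthesis process. The resulting networks match the FID of StyleGAN2 but differ dramatically in their internal representations, and they are fully equivariant to translation and rotation even at subpixel scales. Our results pave the way for generative models better suited for video and animation.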
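To make the aliasing mentioned in the abstract concrete, here is a minimal one-dimensional sketch. It is not the authors' generator or their filter design: the signal, the helper names `dominant_bin` and `ideal_lowpass`, and the FFT-based ideal low-pass filter are all assumptions chosen for illustration. It shows a frequency that is representable at 16 samples folding down to a spurious low frequency when the signal is decimated without filtering, and disappearing when a low-pass filter is applied first.

```python
import numpy as np

def dominant_bin(x):
    """Return the index of the strongest non-DC frequency bin of a real signal."""
    spectrum = np.abs(np.fft.rfft(x))
    spectrum[0] = 0.0  # ignore the DC component
    return int(np.argmax(spectrum))

def ideal_lowpass(x, cutoff_bin):
    """Zero every rfft bin at or above cutoff_bin (illustrative ideal filter)."""
    spectrum = np.fft.rfft(x)
    spectrum[cutoff_bin:] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

n = 16
t = np.arange(n) / n
signal = np.cos(2 * np.pi * 7 * t)  # 7 cycles; Nyquist limit at 16 samples is 8

# Careless downsampling: keep every second sample. The new Nyquist limit is 4,
# so the 7-cycle component cannot be represented and folds to 8 - 7 = 1 cycle.
naive = signal[::2]

# Careful downsampling: remove everything at or above the new Nyquist limit first.
filtered = ideal_lowpass(signal, cutoff_bin=4)[::2]

print("dominant bin before downsampling:", dominant_bin(signal))
print("dominant bin after naive downsampling:", dominant_bin(naive))
print("max amplitude after filtered downsampling:", np.max(np.abs(filtered)))
```

The naive branch produces a clean-looking low-frequency wave that was never in the original signal, which is the kind of unwanted information the abstract says must not leak into later layers.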
Code repositories
lzhbrian/alias-free-gan-explanation
pytorch
Mentioned in GitHub
NVlabs/stylegan3
Official
pytorch
jychoi118/toward_spatial_unbiased
pytorch
Mentioned in GitHub
duskvirkus/alias-free-gan
pytorch
Mentioned in GitHub
rosinality/alias-free-gan-pytorch
pytorch
Mentioned in GitHub
duskvirkus/alias-free-gan-pytorch-lightning
pytorch
Mentioned in GitHub
kunheek/style-aware-discriminator
pytorch
Mentioned in GitHub
Benchmarks
| Benchmark | Method | EQ-R | EQ-T | FID |
|---|---|---|---|---|
| image-generation-on-afhqv2 | Alias-Free-R | 40.34 | 64.89 | 4.40 |
| image-generation-on-afhqv2 | StyleGAN2 | 11.50 | 13.83 | 4.62 |
| image-generation-on-afhqv2 | Alias-Free-T | 13.51 | 60.15 | 4.04 |
| image-generation-on-ffhq-1024-x-1024 | StyleGAN3-T | - | - | 2.79 |
| image-generation-on-ffhq-1024-x-1024 | StyleGAN3-R | - | - | 3.07 |
| image-generation-on-ffhq-u | StyleGAN2 + Simplified generator | 10.41 | 19.47 | 5.21 |
| image-generation-on-ffhq-u | StyleGAN2 + Non-critical sampling | 10.84 | 43.90 | 4.78 |
| image-generation-on-ffhq-u | StyleGAN2 + No noise inputs | 10.84 | 15.81 | 4.54 |
| image-generation-on-ffhq-u | StyleGAN2 + Rotation equiv. (Alias-Free-R) | 40.48 | 66.65 | 4.50 |
| image-generation-on-ffhq-u | StyleGAN2 + Transformed Fourier features | 10.61 | 45.20 | 4.64 |
| image-generation-on-ffhq-u | StyleGAN2 + Flexible layers (Alias-Free-T) | 13.12 | 63.01 | 4.62 |
| image-generation-on-ffhq-u | Alias-Free-R | 47.64 | 64.78 | 3.66 |
| image-generation-on-ffhq-u | StyleGAN2 + Fourier features | 10.81 | 16.23 | 4.79 |
| image-generation-on-ffhq-u | StyleGAN2 + Boundaries & upsampling | 10.97 | 24.62 | 6.02 |
| image-generation-on-ffhq-u | StyleGAN2 | - | - | 5.14 |
| image-generation-on-ffhq-u | StyleGAN2 + Filtered nonlinearities | 10.81 | 30.60 | 6.35 |
| image-generation-on-ffhq-u | Alias-Free-T | 13.95 | 61.69 | 3.67 |
| image-generation-on-ffhq-u | StyleGAN2 (70000 img, 1024^2, train from scratch) | 10.79 | 15.89 | 3.79 |
| image-generation-on-metfaces-u | Alias-Free-T | 16.63 | 64.11 | 18.75 |
| image-generation-on-metfaces-u | Alias-Free-R | 48.57 | 66.34 | 18.75 |
| image-generation-on-metfaces-u | StyleGAN2 | 13.19 | 18.77 | 18.98 |
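The EQ-T and EQ-R values in the table are not defined on this page. Assuming they are peak signal-to-noise ratios in decibels, comparing the generator output for a transformed input against the transformed output for the original input, a score of this kind could be sketched as follows. The function name `equivariance_psnr_db`, the dynamic range of 2.0 (images in [-1, 1]), and the toy arrays are assumptions for illustration, not the official evaluation code.

```python
import numpy as np

def equivariance_psnr_db(reference, candidate, dynamic_range=2.0):
    """PSNR-style score in dB between two image arrays of equal shape.

    reference: output obtained by transforming the generated image (assumed).
    candidate: output obtained by transforming the generator input (assumed).
    dynamic_range: assumed peak-to-peak intensity range, 2.0 for [-1, 1].
    """
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    mse = np.mean((reference - candidate) ** 2)
    if mse == 0.0:
        return float("inf")  # identical outputs: perfectly equivariant
    return float(10.0 * np.log10(dynamic_range ** 2 / mse))

# Toy data: the two outputs differ by a constant 0.02 at every pixel.
reference = np.zeros((3, 8, 8))
candidate = reference + 0.02
score = equivariance_psnr_db(reference, candidate)
print(f"{score:.2f} dB")
```

Under this reading, higher values mean the two outputs agree more closely, so a larger EQ-T or EQ-R would indicate better equivariance, while FID remains a separate image-quality measure.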