¹The Hong Kong University of Science and Technology   ²Institute of Artificial Intelligence (TeleAI), China Telecom
*Corresponding author
We propose a three-stage pipeline named EVA-Gaussian for 3D human novel view synthesis across diverse camera settings. Specifically, we first introduce an Efficient cross-View Attention (EVA) module to accurately estimate the position of each 3D Gaussian from the source images. Then, we integrate the source images with the estimated Gaussian position map to predict the attributes and feature embeddings of the 3D Gaussians. Finally, we employ a recurrent feature refiner to correct artifacts caused by geometric errors in position estimation and enhance visual fidelity. To further improve synthesis quality, we incorporate a powerful anchor loss function for both 3D Gaussian attributes and human face landmarks. Experimental results on the THuman2.0 and THumansit datasets showcase the superiority of our EVA-Gaussian approach in rendering quality across diverse camera settings.
Overview of EVA-Gaussian. EVA-Gaussian takes sparse-view images captured around a human subject as input and proceeds in three stages: (1) estimating the positions of the 3D Gaussians, (2) inferring their remaining attributes (i.e., opacities, scales, quaternions, and features), and (3) refining the rendered image in a recurrent manner.
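To make the three stages concrete, below is a minimal PyTorch-style sketch of the forward pass. The module names (PositionEstimator, AttributePredictor, FeatureRefiner), their plain convolutional internals, and all tensor shapes are illustrative assumptions, not the released implementation; the 3D Gaussian splatting step between stages 2 and 3 is omitted.

```python
# Hedged sketch of the three-stage EVA-Gaussian pipeline.
# Plain conv stacks stand in for the actual networks; shapes are assumptions.
import torch
import torch.nn as nn


class PositionEstimator(nn.Module):
    """Stage 1: predict a per-pixel 3D Gaussian position map from the source views."""
    def __init__(self, in_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),          # xyz per pixel
        )

    def forward(self, views):                        # views: (B, V, 3, H, W)
        b, v, c, h, w = views.shape
        return self.net(views.flatten(0, 1)).view(b, v, 3, h, w)


class AttributePredictor(nn.Module):
    """Stage 2: predict opacity, scale, quaternion, and features from images + positions."""
    def __init__(self, feat_dim=16):
        super().__init__()
        out_ch = 1 + 3 + 4 + feat_dim                # opacity, scale, quaternion, feature
        self.net = nn.Sequential(
            nn.Conv2d(3 + 3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, out_ch, 3, padding=1),
        )

    def forward(self, views, positions):             # both (B, V, *, H, W)
        x = torch.cat([views, positions], dim=2)
        b, v, c, h, w = x.shape
        return self.net(x.flatten(0, 1)).view(b, v, -1, h, w)


class FeatureRefiner(nn.Module):
    """Stage 3: recurrently refine the rendered image to suppress geometric artifacts."""
    def __init__(self, feat_dim=16, steps=2):
        super().__init__()
        self.steps = steps
        self.net = nn.Sequential(
            nn.Conv2d(3 + feat_dim, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, rgb, feat):                    # rgb: (B, 3, H, W), feat: (B, F, H, W)
        for _ in range(self.steps):
            rgb = rgb + self.net(torch.cat([rgb, feat], dim=1))
        return rgb


if __name__ == "__main__":
    views = torch.rand(1, 4, 3, 256, 256)            # four sparse source views
    pos = PositionEstimator()(views)                 # stage 1
    attrs = AttributePredictor()(views, pos)         # stage 2
    # splatting (pos, attrs) into a novel-view RGB + feature map is omitted here
    coarse_rgb = torch.rand(1, 3, 256, 256)
    feat_map = torch.rand(1, 16, 256, 256)
    final = FeatureRefiner()(coarse_rgb, feat_map)   # stage 3
```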
Efficient cross-View Attention (EVA) module for 3D Gaussian position estimation. EVA takes multi-view image features as input, partitions them into window patches using a shifted-window scheme, and performs cross-view attention between the features from different views.
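The sketch below illustrates this idea under stated assumptions: multi-view feature maps are split into local windows (with an optional Swin-style cyclic shift), and tokens from all views inside the same window attend to one another. The window size, head count, LayerNorm placement, residual connection, and the use of nn.MultiheadAttention are illustrative choices, not the paper's exact module.

```python
# Hedged sketch of cross-view attention over (optionally shifted) windows.
import torch
import torch.nn as nn


class CrossViewWindowAttention(nn.Module):
    def __init__(self, dim=64, window=8, heads=4, shift=False):
        super().__init__()
        self.window, self.shift = window, shift
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feats):                        # feats: (B, V, C, H, W)
        b, v, c, h, w = feats.shape
        ws = self.window
        x = feats
        if self.shift:                               # cyclic shift before windowing
            x = torch.roll(x, shifts=(-ws // 2, -ws // 2), dims=(3, 4))
        # partition each view into (H/ws * W/ws) windows of ws*ws tokens
        x = x.view(b, v, c, h // ws, ws, w // ws, ws)
        x = x.permute(0, 3, 5, 1, 4, 6, 2)           # (B, nH, nW, V, ws, ws, C)
        tokens = x.reshape(-1, v * ws * ws, c)       # all views share one window
        y = self.norm(tokens)
        y, _ = self.attn(y, y, y)                    # attention across views and pixels
        tokens = tokens + y                          # residual connection
        # reverse the window partition
        x = tokens.view(b, h // ws, w // ws, v, ws, ws, c)
        x = x.permute(0, 3, 6, 1, 4, 2, 5).reshape(b, v, c, h, w)
        if self.shift:                               # undo the cyclic shift
            x = torch.roll(x, shifts=(ws // 2, ws // 2), dims=(3, 4))
        return x


if __name__ == "__main__":
    feats = torch.rand(1, 4, 64, 32, 32)             # 4 views of 64-channel features
    out = CrossViewWindowAttention(shift=True)(feats)
```

In a Swin-style design, blocks with shift=False and shift=True would typically be stacked in alternation so that information also flows across window boundaries.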
Attribute regularization. We regularize the opacities and scales of the Gaussians, as well as the position mismatches among the Gaussians in the landmark collection. The position mismatch is no longer penalized once it falls below a specified tolerance.
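A minimal sketch of such a regularizer is given below. The exact penalty forms, weights, tolerance value, and the direction of the opacity/scale penalties are assumptions for illustration, not the paper's precise loss.

```python
# Hedged sketch of attribute regularization with a tolerance-gated landmark term.
import torch


def attribute_regularization(opacity, scale, pred_landmarks, gt_landmarks,
                             tol=0.01, w_op=0.01, w_scale=0.01, w_lm=1.0):
    # opacity: (N,), scale: (N, 3), *_landmarks: (L, 3); all weights are assumptions
    loss_op = (1.0 - opacity).abs().mean()           # assumed: push opacities toward 1
    loss_scale = scale.abs().mean()                  # assumed: discourage oversized Gaussians
    mismatch = (pred_landmarks - gt_landmarks).norm(dim=-1)
    # the landmark term contributes no gradient once the mismatch is within the tolerance
    loss_lm = torch.clamp(mismatch - tol, min=0.0).mean()
    return w_op * loss_op + w_scale * loss_scale + w_lm * loss_lm
```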
@misc{hu2024evagaussian3dgaussianbasedrealtime,
title={EVA-Gaussian: 3D Gaussian-based Real-time Human Novel View Synthesis under Diverse Camera Settings},
author={Yingdong Hu and Zhening Liu and Jiawei Shao and Zehong Lin and Jun Zhang},
year={2024},
eprint={2410.01425},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2410.01425},
}