EVA-Gaussian: 3D Gaussian-Based Real-time Human Novel View Synthesis Under Diverse Camera Settings

Yingdong Hu¹, Zhening Liu¹, Jiawei Shao¹,², Zehong Lin¹*, Jun Zhang¹
¹The Hong Kong University of Science and Technology, ²Institute of Artificial Intelligence (TeleAI), China Telecom
*Corresponding author

Abstract

We propose a three-stage pipeline named EVA-Gaussian for 3D human novel view synthesis across diverse camera settings. Specifically, we first introduce an Efficient cross-View Attention (EVA) module to accurately estimate the position of each 3D Gaussian from the source images. Then, we integrate the source images with the estimated Gaussian position map to predict the attributes and feature embeddings of the 3D Gaussians. Finally, we employ a recurrent feature refiner to correct artifacts caused by geometric errors in position estimation and enhance visual fidelity. To further improve synthesis quality, we incorporate a powerful anchor loss function for both 3D Gaussian attributes and human face landmarks. Experimental results on the THuman2.0 and THumansit datasets showcase the superiority of our EVA-Gaussian approach in rendering quality across diverse camera settings.

Method Overview

Overview of EVA-Gaussian. EVA-Gaussian takes sparse-view images captured around a human subject as input and performs three key stages: (1) estimating the positions of the 3D Gaussians, (2) inferring the remaining attributes of these Gaussians (i.e., opacities, scales, quaternions, and features), and (3) refining the output image in a recurrent manner.
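The data flow of the three stages can be sketched as a simple composition of functions. This is only an illustrative skeleton, not the authors' implementation: the names `Gaussians`, `synthesize`, `estimate_positions`, `infer_attributes`, `render`, `refine`, and the parameter `num_refine_steps` are all hypothetical, and the exact form of the recurrent refinement is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass
class Gaussians:
    """Per-Gaussian attributes named in the overview (container is hypothetical)."""
    positions: Any
    opacities: Any
    scales: Any
    quaternions: Any
    features: Any


def synthesize(
    source_views: Sequence[Any],
    target_camera: Any,
    estimate_positions: Callable,  # stage 1 (assumed interface)
    infer_attributes: Callable,    # stage 2 (assumed interface)
    render: Callable,              # Gaussian rasterizer (assumed interface)
    refine: Callable,              # stage 3 refiner (assumed interface)
    num_refine_steps: int = 2,     # assumed hyperparameter
) -> Any:
    # Stage 1: estimate a position for each 3D Gaussian from the source views.
    positions = estimate_positions(source_views)
    # Stage 2: combine the source views with the positions to get the
    # remaining attributes (opacities, scales, quaternions, features).
    gaussians = infer_attributes(source_views, positions)
    # Render an image and a feature map at the target viewpoint.
    image, feature_map = render(gaussians, target_camera)
    # Stage 3: apply the refiner repeatedly (one reading of "recurrent").
    for _ in range(num_refine_steps):
        image = refine(image, feature_map)
    return image
```

Passing the stages in as callables keeps the sketch independent of any particular network architecture or rasterizer.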

EVA Module

Efficient cross-View Attention (EVA) module for 3D Gaussian position estimation. EVA takes multi-view image features as input, embeds them into window patches using a shifted-window scheme, and performs cross-view attention between the features from different views.
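To make the idea of windowed cross-view attention concrete, here is a minimal NumPy sketch for two views. It is not the EVA module itself: the square window shape, the cyclic shift via `np.roll` (without the boundary masking a full shifted-window design would normally use), the absence of learned query/key/value projections, and all function names are assumptions made for illustration.

```python
import numpy as np


def window_partition(x, w):
    """Split an (H, W, C) feature map into (num_windows, w*w, C) tokens."""
    H, W, C = x.shape
    x = x.reshape(H // w, w, W // w, w, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, w * w, C)


def window_merge(windows, w, H, W):
    """Inverse of window_partition: back to an (H, W, C) feature map."""
    C = windows.shape[-1]
    x = windows.reshape(H // w, W // w, w, w, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(H, W, C)


def cross_view_window_attention(feat_a, feat_b, w=4, shift=0):
    """Tokens of view A attend to tokens of view B inside matching windows."""
    H, W, C = feat_a.shape
    if shift:
        # Cyclically shift both views so window borders move between layers.
        feat_a = np.roll(feat_a, (-shift, -shift), axis=(0, 1))
        feat_b = np.roll(feat_b, (-shift, -shift), axis=(0, 1))
    q = window_partition(feat_a, w)    # queries from view A
    kv = window_partition(feat_b, w)   # keys and values from view B
    scores = q @ kv.transpose(0, 2, 1) / np.sqrt(C)
    scores = scores - scores.max(axis=-1, keepdims=True)  # stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    out = window_merge(weights @ kv, w, H, W)
    if shift:
        # Undo the shift so outputs line up with the original pixel grid.
        out = np.roll(out, (shift, shift), axis=(0, 1))
    return out
```

Restricting attention to windows keeps the cost per token proportional to the window size rather than the full image size, and alternating `shift=0` with `shift=w // 2` across layers lets information cross window borders.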

Regularization Loss

Attribute regularization. We regularize the opacities and scales of Gaussians, as well as the position mismatches among the Gaussians in the landmark collection. The optimization of the position mismatch stops once it falls below a specific tolerance.
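One way to read the tolerance behaviour is as a hinge on the position mismatch: pairs of landmark Gaussians that already agree to within the tolerance contribute no gradient. The sketch below illustrates that reading only. The specific opacity and scale terms, the weights, the default tolerance, and the function name `anchor_regularization` are assumptions, not the loss used by the authors.

```python
import numpy as np


def anchor_regularization(pos_a, pos_b, opacities, scales,
                          tol=0.01, w_pos=1.0, w_opa=0.1, w_scale=0.1):
    """Hypothetical regularizer over Gaussians tied to the same landmarks.

    pos_a, pos_b: (N, 3) positions of Gaussians that two source views
                  assign to the same N landmarks.
    opacities:    (N,) opacities in [0, 1].
    scales:       (N, 3) non-negative scales.
    """
    # Distance between the two position estimates of each landmark.
    mismatch = np.linalg.norm(pos_a - pos_b, axis=-1)
    # Hinge: mismatches below the tolerance are no longer penalized.
    pos_term = np.maximum(mismatch - tol, 0.0).mean()
    # Assumed form: encourage landmark Gaussians to be opaque.
    opa_term = (1.0 - opacities).mean()
    # Assumed form: encourage landmark Gaussians to stay small.
    scale_term = scales.mean()
    return w_pos * pos_term + w_opa * opa_term + w_scale * scale_term
```

With a tolerance of 0.01, a pair of estimates 0.005 apart adds nothing to the position term, while a pair 0.03 apart is penalized only for the 0.02 excess.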
