Recent developments in immersive imaging technologies have enabled improved telepresence applications. Now commercially mature, omnidirectional (360-degree) content provides a full view around the camera with three degrees of freedom (3DoF). With real-time immersive telepresence in mind, this paper investigates how a single omnidirectional image (ODI) can be used to extend 3DoF to 6DoF. To achieve this, we propose a fully learning-based method for spherical light field reconstruction from a single omnidirectional image. The proposed LFSphereNet uses two networks. The first learns to reconstruct the light field in cubemap projection (CMP) format, given the six cube faces of an omnidirectional image and the corresponding cube face positions as input. The cubemap format implies a linear re-projection, which is more appropriate for a neural network. The second network refines the reconstructed cubemaps in equirectangular projection (ERP) format by removing cubemap border artifacts. The network learns the geometric features implicitly for both translation and zooming when an appropriate cost function is employed. Furthermore, its very low inference time enables real-time applications. We demonstrate that LFSphereNet outperforms state-of-the-art approaches in terms of quality and speed when tested on different synthetic and real-world scenes. The proposed method represents a significant step towards achieving real-time immersive remote telepresence experiences.
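As a rough illustration of the relationship between the CMP and ERP formats mentioned above, the sketch below maps a point on a cube face to equirectangular pixel coordinates. It is not taken from the LFSphereNet code: the function name `cube_face_to_erp`, the face names, the axis convention (x right, y up, z forward), and the ERP pixel layout are all assumptions made for this example. Each cube face is a plain perspective view, whereas the ERP mapping needs the trigonometric step shown here.

```python
import math

# Hypothetical face layout (an assumption, not the paper's convention):
# x points right, y points up, z points forward; u and v lie in [-1, 1].
FACE_DIRECTIONS = {
    "front": lambda u, v: (u, v, 1.0),
    "right": lambda u, v: (1.0, v, -u),
    "back":  lambda u, v: (-u, v, -1.0),
    "left":  lambda u, v: (-1.0, v, u),
    "up":    lambda u, v: (u, 1.0, -v),
    "down":  lambda u, v: (u, -1.0, v),
}


def cube_face_to_erp(face, u, v, width, height):
    """Map a point (u, v) on a cube face to (column, row) in an ERP image."""
    # Viewing direction through the face point (not normalized; only the
    # angles matter below).
    x, y, z = FACE_DIRECTIONS[face](u, v)

    # Spherical angles: longitude in [-pi, pi], latitude in [-pi/2, pi/2].
    lon = math.atan2(x, z)
    lat = math.atan2(y, math.hypot(x, z))

    # Assumed ERP layout: longitude 0 at the image center, top row = +90 deg.
    col = ((lon / (2.0 * math.pi) + 0.5) * width) % width
    row = (0.5 - lat / math.pi) * height
    return col, row
```

Under these assumptions, the center of the front face lands at the center of the ERP image, and the center of the up face lands on the top row, where ERP stretches a single direction across the full image width.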
@inproceedings{gond2023_lfspherenet,
author = {Gond, Manu and Zerman, Emin and Knorr, Sebastian and Sj\"{o}str\"{o}m, M\r{a}rten},
title = {LFSphereNet: Real Time Spherical Light Field Reconstruction from a Single Omnidirectional Image},
year = {2023},
doi = {10.1145/3626495.3626500},
booktitle = {Proceedings of the 20th ACM SIGGRAPH European Conference on Visual Media Production},
keywords = {View Synthesis, Omnidirectional image, Light Field, Immersive Imaging, Deep Learning, 6DoF, 360 Degree Image},
location = {London, United Kingdom},
series = {CVMP '23}
}