We show how to build a model that enables realistic, free-viewpoint renderings of a scene under novel
lighting conditions from video. Our method, UrbanIR (Urban Scene Inverse Rendering),
computes an inverse graphics representation from the video:
it jointly infers shape, albedo, visibility, and sun and sky
illumination from a single video of an unbounded outdoor scene with unknown lighting.
UrbanIR works with videos from cameras mounted on cars, which observe any given scene point from only a few views (in contrast to the many views per point in typical NeRF-style estimation).
As a result, standard methods produce poor geometry estimates (for example, of roofs) and numerous "floaters",
and errors in inverse graphics inference can cause strong rendering artifacts.
UrbanIR uses novel losses to control these and other sources of error, including a loss that yields very good
estimates of shadow volumes in the original scene. The resulting representations support controllable
editing, delivering photorealistic free-viewpoint renderings of relit scenes and of scenes with inserted objects. Qualitative
evaluation demonstrates strong improvements over the state of the art.
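The inferred decomposition (albedo, normals, visibility, sun and sky light) can be pictured with a minimal Lambertian shading sketch. This is an illustrative assumption of how such components combine under a sun-plus-uniform-sky model, not the paper's exact formulation; all names here are hypothetical:

```python
import numpy as np

def shade(albedo, normals, sun_dir, sun_visibility, sun_rgb, sky_rgb):
    """Toy outdoor shading: per-pixel albedo * (sun + sky) irradiance.

    albedo:         (H, W, 3) base color from the inverse-rendering decomposition
    normals:        (H, W, 3) unit surface normals
    sun_dir:        (3,) unit vector pointing toward the sun
    sun_visibility: (H, W) in [0, 1]; 0 where the sun is occluded (shadow)
    sun_rgb, sky_rgb: (3,) light intensities (illustrative sun + uniform sky)
    """
    n_dot_l = np.clip(normals @ sun_dir, 0.0, None)               # (H, W) Lambert term
    sun_term = sun_visibility[..., None] * n_dot_l[..., None] * sun_rgb
    sky_term = sky_rgb * np.ones_like(albedo)                     # uniform ambient sky
    return albedo * (sun_term + sky_term)
```

Relighting then amounts to re-rendering with a new `sun_dir` (and recomputed visibility), leaving the albedo untouched.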
* Please select different intrinsic components and compare them with the reconstruction (left); all are rendered from novel views.
* By changing the sunlight direction explicitly, UrbanIR simulates sharp, geometry-aware shadows.
* With the proposed visibility optimization (right), UrbanIR generates a high-quality shadow volume and enables realistic object insertion.
* Please note that the inserted object casts shadows on the neural scene, and the scene also casts shadows on the object.
* Left: without visibility optimization. Right: with visibility optimization.
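One common way to obtain sun visibility from a learned density field is to march shadow rays from each surface point toward the sun and accumulate Beer-Lambert transmittance. The sketch below shows this standard technique under stated assumptions (the function names and the simple fixed-step sampler are illustrative, not UrbanIR's actual optimization):

```python
import numpy as np

def sun_visibility(density_fn, points, sun_dir, num_steps=64, t_max=20.0):
    """Estimate sun visibility as transmittance along rays toward the sun.

    density_fn: maps (N, 3) points to (N,) volume densities (e.g. a NeRF MLP)
    points:     (N, 3) surface points
    sun_dir:    (3,) unit direction toward the sun
    Returns (N,) transmittance in [0, 1]; low values mean the point is shadowed.
    """
    ts = np.linspace(1e-2, t_max, num_steps)                     # sample depths along the ray
    dt = ts[1] - ts[0]
    samples = points[:, None, :] + ts[None, :, None] * sun_dir   # (N, S, 3)
    sigma = density_fn(samples.reshape(-1, 3)).reshape(len(points), num_steps)
    optical_depth = (sigma * dt).sum(axis=1)
    return np.exp(-optical_depth)                                # Beer-Lambert transmittance
```

Optimizing visibility against observed shadows (rather than relying on raw density alone) is what suppresses the shadow artifacts caused by floaters and poor geometry.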
Relighting: Night Simulation
* Using the intrinsic scene components, UrbanIR adds new light sources (e.g., headlights, streetlights) and simulates the scene at night.
* Left: day (original). Right: night (simulated).
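Because the decomposition exposes albedo and normals, adding a new light source reduces to accumulating its contribution on top of the rendered image. A hedged sketch of a point light with Lambertian shading and inverse-square falloff (all names are hypothetical; the paper's light models may differ):

```python
import numpy as np

def add_point_light(image, positions, normals, light_pos, light_rgb, albedo):
    """Add a point light's Lambertian contribution to a rendered image.

    image, positions, normals, albedo: (H, W, 3) buffers from the decomposition
    light_pos: (3,) light position; light_rgb: (3,) light intensity
    """
    to_light = light_pos - positions                              # (H, W, 3)
    dist2 = np.sum(to_light ** 2, axis=-1, keepdims=True)         # squared distance
    l = to_light / np.sqrt(dist2)                                 # unit direction to light
    n_dot_l = np.clip(np.sum(normals * l, axis=-1, keepdims=True), 0.0, None)
    return image + albedo * light_rgb * n_dot_l / np.maximum(dist2, 1e-6)
```

A night simulation along these lines would dim the sun and sky terms, then call this once per headlight or streetlight.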