• ¹University of Illinois at Urbana-Champaign
• ²University of Maryland, College Park
Reconstruction, Normal, Albedo, Shading, Visibility, Semantics


We show how to build a model that allows realistic, free-viewpoint renderings of a scene under novel lighting conditions from video. Our method, UrbanIR (Urban Scene Inverse Rendering), computes an inverse graphics representation from the video. UrbanIR jointly infers shape, albedo, visibility, and sun and sky illumination from a single video of unbounded outdoor scenes with unknown lighting. UrbanIR uses videos from cameras mounted on cars (in contrast to the many views of the same points in typical NeRF-style estimation). As a result, standard methods produce poor geometry estimates (for example, of roofs), and there are numerous "floaters". Errors in inverse graphics inference can result in strong rendering artifacts. UrbanIR uses novel losses to control these and other sources of error, including a novel loss that yields very good estimates of shadow volumes in the original scene. The resulting representations facilitate controllable editing, delivering photorealistic free-viewpoint renderings of relit scenes and inserted objects. Qualitative evaluation demonstrates strong improvements over the state of the art.
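To make the roles of the inferred quantities concrete, here is a minimal sketch of how albedo, surface normal, sun visibility, and sun and sky illumination could combine into a pixel color. It assumes a generic Lambertian sun-plus-ambient-sky shading model, which is not necessarily the formulation UrbanIR uses; the function names `sun_direction` and `shade_pixel` and all parameters are hypothetical and chosen for illustration only.

```python
import math


def sun_direction(elevation_deg, azimuth_deg):
    """Unit vector pointing toward the sun, with z as the up axis (assumed convention)."""
    el = math.radians(elevation_deg)
    az = math.radians(azimuth_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))


def shade_pixel(albedo, normal, visibility, sun_dir, sun_rgb, sky_rgb):
    """Shade one surface point under an assumed Lambertian sun + ambient sky model.

    albedo, sun_rgb, sky_rgb: RGB triples.
    normal, sun_dir: unit 3-vectors.
    visibility: 1.0 if the sun is unoccluded at this point, 0.0 if in shadow.
    """
    # Cosine foreshortening term, clamped so back-facing surfaces get no sun.
    cos_term = max(0.0, sum(n * l for n, l in zip(normal, sun_dir)))
    # Direct sun (gated by visibility) plus a constant ambient sky term.
    return tuple(a * (visibility * cos_term * s + k)
                 for a, s, k in zip(albedo, sun_rgb, sky_rgb))


if __name__ == "__main__":
    albedo = (0.5, 0.5, 0.5)
    up = (0.0, 0.0, 1.0)
    sun, sky = (1.0, 1.0, 1.0), (0.2, 0.2, 0.2)
    # Sweep the sun elevation to mimic a simple timelapse on an unshadowed point.
    for elevation in (10, 45, 90):
        print(elevation, shade_pixel(albedo, up, 1.0, sun_direction(elevation, 0), sun, sky))
```

Under this simplified model, relighting amounts to choosing a new sun direction and color and re-evaluating the shading. In a real scene the visibility term would also have to be recomputed for the new sun direction, which is why accurate shadow volume estimates matter.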

Intrinsic Decomposition




Relighting: Timelapse Simulation


Object Insertion

* Note that the inserted object casts shadows on the neural scene, and the scene also casts shadows on the object.

Relighting: Night Simulation

* Left: day (original), Right: night (simulation)


