* ¹University of Illinois at Urbana-Champaign
* ²University of Maryland, College Park
Reconstruction, Albedo, Normal, Visibility, Shading, Semantics


We present UrbanIR (Urban Scene Inverse Rendering), a new inverse graphics model that enables realistic, free-viewpoint renderings of scenes under various lighting conditions from a single video. It accurately infers shape, albedo, visibility, and sun and sky illumination from wide-baseline videos, such as those from car-mounted cameras, a setting that differs from the dense views NeRF typically assumes. In this setting, standard methods often yield poor geometry and material estimates, such as inaccurate roof representations and numerous 'floaters'. UrbanIR addresses these issues with novel losses that reduce both errors in inverse graphics inference and rendering artifacts. These techniques allow precise estimation of shadow volumes in the original scene. The model's outputs support controllable editing, enabling photorealistic free-viewpoint renderings of night simulations, relit scenes, and inserted objects, a significant improvement over existing state-of-the-art methods.

Intrinsic Decomposition

* Select different intrinsic components and compare them with the reconstruction (left); all are rendered from novel views.



Nighttime Simulation

* By editing the original illumination and inserting new light sources (e.g. streetlights), UrbanIR simulates nighttime videos.
* Left: Reconstruction, Right: Nighttime simulation.
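To make the idea of inserting light sources concrete, here is a minimal sketch, not UrbanIR's actual renderer. It assumes Lambertian surfaces, point lights with inverse-square falloff, and a small constant ambient term standing in for the dimmed sky. The function names (`point_light_shading`, `simulate_night`) and the `ambient` parameter are hypothetical, and occlusion of the inserted lights is ignored.

```python
import numpy as np

def point_light_shading(points, normals, light_pos, intensity):
    """Lambertian irradiance from one point light with inverse-square falloff.

    points, normals: (N, 3) surface positions and unit normals.
    light_pos, intensity: (3,) light position and RGB intensity.
    Returns an (N, 3) array. Assumes no point coincides with the light.
    """
    to_light = light_pos - points
    dist2 = np.sum(to_light ** 2, axis=-1, keepdims=True)
    dirs = to_light / np.sqrt(dist2)
    cos_term = np.clip(np.sum(normals * dirs, axis=-1, keepdims=True), 0.0, None)
    return intensity * cos_term / dist2

def simulate_night(albedo, points, normals, lights, ambient=0.02):
    """Replace daytime shading with a dim ambient term plus inserted lights.

    albedo: (N, 3) float array; lights: iterable of (position, intensity) pairs.
    """
    shading = np.full_like(albedo, ambient)
    for pos, intensity in lights:
        shading = shading + point_light_shading(
            points, normals, np.asarray(pos, float), np.asarray(intensity, float)
        )
    return albedo * shading
```

In this sketch the albedo comes from the intrinsic decomposition, so only the shading term changes between the daytime and nighttime renderings.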


Relighting: Timelapse Simulation

* By explicitly changing the sunlight direction, UrbanIR simulates sharp, geometry-aware shadows.
* Left: Input image, Right: Timelapse simulation.
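As an illustration of how a new sun direction interacts with the intrinsic components, the sketch below recomposes an image from albedo, normals, and sun visibility under a simple Lambertian sun-plus-sky model. This is an assumed shading model for illustration only, not necessarily the one UrbanIR uses; `sun_direction`, `relight`, and the elevation/azimuth parameterization are hypothetical names and choices.

```python
import numpy as np

def sun_direction(elevation_deg, azimuth_deg):
    """Unit vector pointing toward the sun (z is up), from angles in degrees."""
    el, az = np.radians(elevation_deg), np.radians(azimuth_deg)
    return np.array([np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.sin(el)])

def relight(albedo, normal, visibility, sun_dir, sun_color, sky_color):
    """Per-pixel Lambertian relighting with a directional sun and uniform sky.

    albedo: (H, W, 3); normal: (H, W, 3) unit normals;
    visibility: (H, W) in [0, 1], recomputed for the new sun direction;
    sun_color, sky_color: (3,) RGB radiance.
    """
    cos_term = np.clip(normal @ sun_dir, 0.0, None)            # (H, W)
    sun = (visibility * cos_term)[..., None] * sun_color       # (H, W, 3)
    return albedo * (sun + sky_color)
```

A timelapse then amounts to sweeping `elevation_deg` and `azimuth_deg` over the day, recomputing visibility for each direction, and calling `relight` per frame.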

Object Insertion

* With the proposed visibility optimization (right), UrbanIR generates a high-quality shadow volume and enables realistic object insertion.
* Note that the object casts shadows on the neural scene, and the scene also casts shadows on the object.
* Left: without visibility optimization, Right: with visibility optimization
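The mutual shadowing described above can be sketched with a shadow ray marched toward the sun through a volume density, accumulating transmittance as in standard volume rendering. This is a simplified stand-in, not UrbanIR's visibility optimization: the `density_fn` callable, the `sphere_density` helper, and all numeric defaults are assumptions made for illustration.

```python
import numpy as np

def visibility(origin, sun_dir, density_fn, near=0.05, far=10.0, n_samples=128):
    """Transmittance along a shadow ray from `origin` toward the sun.

    density_fn maps (n_samples, 3) points to (n_samples,) non-negative densities.
    `near` offsets the ray start to avoid self-shadowing.
    Returns a float in [0, 1]: 1 is fully lit, 0 is fully shadowed.
    """
    t = np.linspace(near, far, n_samples)
    pts = origin + t[:, None] * sun_dir
    delta = (far - near) / (n_samples - 1)
    return float(np.exp(-np.sum(density_fn(pts) * delta)))

def sphere_density(center, radius, sigma=50.0):
    """Hypothetical density field: constant `sigma` inside a sphere, zero outside."""
    center = np.asarray(center, float)
    return lambda p: sigma * (np.linalg.norm(p - center, axis=-1) < radius)

# Summing the scene and object densities lets shadows fall both ways:
# the inserted object occludes the scene, and the scene occludes the object.
scene = sphere_density([0.0, 0.0, 3.0], 1.0)        # stand-in for scene geometry
inserted = sphere_density([5.0, 0.0, 3.0], 1.0)     # stand-in for an inserted object
combined = lambda p: scene(p) + inserted(p)
```

Querying `visibility` with `combined` at a ground point under the inserted object returns a value near zero, while a point with a clear path to the sun returns one.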


The website template was borrowed from Michaël Gharbi, ClimateNeRF, RefNeRF, Nerfies, and Semantic View Synthesis.