Called “DIB-R,” the AI takes a picture of any 2D object – an image of a bird, for example – and predicts what it would look like in three dimensions. This prediction includes lighting, texture, and depth.

Credit: Nvidia

DIB-R stands for “differentiable interpolation-based renderer,” meaning it takes what it “sees,” a 2D image, and makes inferences based on a 3D “understanding” of the world. This is strikingly similar to how humans translate the 2D input from our eyes into a 3D mental image.

According to Nvidia, this research has numerous implications for the field of robotics. With further development, the researchers hope to expand DIB-R to include functionality that would essentially make it a virtual reality renderer. One day, the team hopes, such a system will make it possible for the AI to create fully immersive 3D worlds in milliseconds using only photographs.

Sanja Fidler, Nvidia’s director of AI and a coauthor on the team’s paper, discussed the work with VentureBeat’s Khari Johnson.

The ability to render the world from photographs could lead to amazing content creation pipelines. Technology such as Google Maps could become more immersive than ever. And, possibly, creatives more skilled at photography or painting than at coding and development could leave all the heavy development to the machines. Imagine if making huge open-world games such as Skyrim and Grand Theft Auto, the kind traditionally relegated to companies with hundreds of staff members, were something a handful of creatives and an AI could handle on their own.
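The “differentiable” part is the key idea: because the renderer’s output image changes smoothly as the 3D parameters change, gradients can flow from a 2D image loss back to the shape itself. The toy sketch below is not Nvidia’s actual method, just a minimal illustration of the principle: it recovers a single shape parameter (a circle’s radius) from a target silhouette by gradient descent through a soft, differentiable rasterizer. All names here (`render`, `fit_radius`) are hypothetical.

```python
import numpy as np

def render(radius, size=32):
    """Toy differentiable 'renderer': a soft circle silhouette.

    A hard inside/outside test would have zero gradient almost everywhere;
    the sigmoid edge keeps the image differentiable in the radius, which is
    the core trick behind differentiable renderers such as DIB-R.
    """
    ys, xs = np.mgrid[0:size, 0:size]
    d = np.sqrt((xs - size / 2) ** 2 + (ys - size / 2) ** 2)
    return 1.0 / (1.0 + np.exp(d - radius))  # soft inside/outside mask

def fit_radius(target_img, steps=500, lr=10.0, r0=4.0):
    """Recover the radius from a target silhouette by gradient descent."""
    size = target_img.shape[0]
    ys, xs = np.mgrid[0:size, 0:size]
    d = np.sqrt((xs - size / 2) ** 2 + (ys - size / 2) ** 2)
    r = r0
    for _ in range(steps):
        img = 1.0 / (1.0 + np.exp(d - r))
        dimg_dr = img * (1.0 - img)                          # sigmoid derivative w.r.t. r
        grad = np.mean(2.0 * (img - target_img) * dimg_dr)   # d(MSE loss)/dr
        r -= lr * grad                                       # plain gradient descent
    return r
```

Fitting against `render(10.0)` starting from a radius of 4 converges close to 10. The real system works the same way in spirit, but optimizes a full textured mesh (vertices, colors, lighting) instead of one scalar.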

Nvidia built an AI that creates 3D models from 2D images