Nvidia’s Omniverse: The Metaverse is a network not a destination


“People will start to see the ability to experience locations,” as a first experience of the Metaverse, says Nvidia executive Richard Kerris, things like walking around a prospective hotel room before booking. Probably, though, the Metaverse will creep up on us like the Web did, so that “it almost happened when you didn't know it.”

Tiernan Ray for ZDNet

When it was first introduced by Meta's Mark Zuckerberg last fall, there was skepticism in some corners about The Metaverse, the systems of avatars and virtual worlds that Zuckerberg is building and that he says will be the next version of the Internet. 

Richard Kerris, who runs a team of a hundred people at chip giant Nvidia that is building technology for The Metaverse, known as Omniverse, is not at all skeptical about that future world. 

He is skeptical about one thing, though. 

“The only thing I'm skeptical about is how people tend to talk about it,” Kerris told ZDNet, on a recent trip through New York City to meet with developers.

“People are misinterpreting Metaverse as a destination, a virtual world, a this or that,” Kerris observed. “The Metaverse is not a place, it's the network for the next version of the Web.

“Just replace the word ‘Metaverse’ with the word ‘network,’ it'll start to sink in.”

The “network,” in the sense that Kerris uses it, is a kind of sinewy technology that will bind together rich media on many Web sites, especially 3-D content. 


“In much the same way the Web unified so many things […] the next generation of that Web, the core underlying principles of that will be 3-D, and with that comes the challenge of making that ubiquitous between virtual worlds.

“The end result would be, in much the same way you can go from any device to any Web site without having to load something in — remember the old days, What browser do you have, what extension, etc. — all that went away with HTML being ratified — when we can do that with 3-D, it's going to be transformative.”

No surprise, being from Nvidia, which sells the vast majority of graphics chips (GPUs) to render 3-D, Kerris makes the point that “we live in a 3-D world, we think in 3-D” but the Web is a 2-D reality. “It's limited,” he said, with islands of 3-D rendering capabilities that never interconnect.


“The consistency of the connected worlds is what is the magic that's taking place,” he said. “I can teleport from one world to another, and I don't have to describe it each time that I build it.”

The analog to HTML for this new 3-D ecosystem is something called “USD,” Universal Scene Description. As ZDNet's Stephanie Condon has written, USD is an interchange framework invented by Pixar in 2012 and released as open-source software in 2016, providing a common language for defining, packaging, assembling and editing 3-D data. 

(Kerris, an Apple veteran, has something of a spiritual tie to Pixar, if not an actual one, having worked at Lucasfilm for several years in the early aughts. See more in his LinkedIn profile.)

USD is capable of describing numerous things in a 3-D environment, from lighting to physics behavior of falling objects. 
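For a rough sense of what such a description looks like, here is a minimal sketch of a scene in USD's human-readable text form (.usda), held in a Python string. The prim names "World," "Sun," and "Ball" are invented for illustration, as is the small `prim_names` helper, which is a naive line scan and not a real USD parser. The rigid-body tag assumes the physics schema that ships with recent USD releases.

```python
# Hand-written sketch of a USD scene in its text form (.usda).
# All prim names here ("World", "Sun", "Ball") are made up for illustration.
USDA_SCENE = '''#usda 1.0
(
    defaultPrim = "World"
)

def Xform "World"
{
    def DistantLight "Sun"
    {
        float inputs:intensity = 1000
    }

    def Sphere "Ball" (
        prepend apiSchemas = ["PhysicsRigidBodyAPI"]
    )
    {
        double radius = 0.5
    }
}
'''


def prim_names(usda_text):
    """List prims declared with `def` (hypothetical helper, naive line scan)."""
    names = []
    for line in usda_text.splitlines():
        parts = line.strip().split()
        # A declaration looks like: def <Type> "<Name>"
        if len(parts) >= 3 and parts[0] == "def":
            names.append(parts[2].strip('"'))
    return names


if __name__ == "__main__":
    print(prim_names(USDA_SCENE))
```

The point of the sketch is that lighting, geometry, and physical behavior sit side by side in one plain description that any USD-aware tool could, in principle, open.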

In practice, Kerris imagines the Omniverse-enabled, USD-defined Metaverse as a road trip where people hop from one 3-D world to the next as effortlessly as browsing traditional sites. “I can go from a virtual factory to a virtual resort to a virtual conference room to a virtual design center, to whatever,” says Kerris.


Within those environments, 3-D rendering will allow people to move past the cumbersome sneaker-net of file sharing. “And it allows a lot more capability in what I do,” he said, offering the example of product designers. 

“With Metaverse, and ubiquitous plumbing for 3-D, we'll be in that 3-D environment, at the same time, and rather than sharing a Web page, we can move around, you can look at something on this side of the product, I can be looking at something else, but it's like we're in the same room at the same time.”

Nvidia, says Kerris, started down the path on USD six or seven years ago “because we simulate everything we build [at Nvidia] before we build it in the physical world.” Nvidia has peers in industry working on realizing the technology, including Ericsson, which wants to simulate antennae. “They all want a reality simulation,” he says of companies in the USD fold. 

Using the technology, said Kerris, one can go much deeper into the realm of digital twins, simulations of products and structures that allow for intervention, experimentation, and observation.

“Until the advent of consistent plumbing, it was done in a representative mode,” such as an illustration of a building in Autodesk. “It wasn't true to reality, I couldn't show you exactly how it would be in a windstorm,” which isn't good because, as he put it, “I want to be damn straight about stuff I'm building in the physical world.”

The “core base of situation that's true to reality,” using USD, will allow designers to more accurately simulate, backward and forward, including things such as tensile strength. 

“I'd love to have a house that's structurally sound before I design the marble finish,” he observed. “If I'm building a digital twin of a house I'm building, it's layers of stuff on there, things for structural engineers, and polish that others are going to come in and finish.” The important thing is knowing it's “true to reality” for materials and things holding the structure together, he said.

By making possible those richer interactions in 3-D, said Kerris, “In the same way that the Web transformed businesses, and experiences, and communication, so will the Metaverse do that, and in a more familiar environment, because we all work in 3-D.”

Different companies are contributing to USD in different ways. For example, Nvidia is working with Apple on what's called “rigid body definition.” 

“And there's more to come,” he said.

Nvidia has been developing the Omniverse tools as a “platform,” what Kerris calls “the operating system for the Metaverse.”

“People can plug into it, they can build on top of it, they can connect to it, they can customize it — it's really at their disposal, much the same way an operating system is today.”

The USD standard has come “quite far” in terms of adoption, said Kerris, with most 3-D companies using it. “Every company in entertainment has a USD strategy today,” he observed. “Then in CAD [computer-aided design] and mechanical engineering, it's coming, they either have plans or they are participating in helping to define what's necessary.”

“HTML was the same way in early days,” he said. It lacked support for video at first, with third-party plugins such as Adobe Flash dominating before standards evolved.

Will digital twins ignite the world's imagination about the Metaverse? It seems somewhat too industrial-focused, ZDNet observed.     

Ordinary people will take an interest, Kerris argued, as they realize the Metaverse is about connectedness, not a single destination. “As they realize it's the next generation of the Web, I can visit a remote location without the need of a headset, or installing specific browsers, that's one aspect,” said Kerris. “In their everyday life, as we share photos today, you'll be able to share objects; you know, your kid comes home, and they made something and they'll be able to share it with the grandparents.”

“It'll just become part of what you do, whether you're buying a piece of furniture for your house, and you'll go into your phone, you'll sync with the home, you'll drop the furniture in, you'll walk around it — that's the thing people will take for granted, but it's the seamless connection.” 

The same for designing one's custom car finish, he offered. “You'll actually be connected to the factory making that car” to check out all the aspects of it. 

“It's going to change everything,” he said. 

There will be multiplier effects, said Kerris, as digital twins allow for trialing multiple scenarios, such as with training robots. 

“Today, they would plug a computer into that robot, and input it with information” to train the robot in one physical space, he said. 

In a digital twin environment, with a robot in the simulated room, “you can train not only one robot but hundreds,” using “hundreds of scenarios the robot could encounter.”

“Now, that robot is going to be thousands of times smarter than it would have been if you'd only trained it one time.” Nvidia has, in fact, been pursuing that particular approach for many years by training machine learning models for autonomous driving in simulated road environments.

Although autonomous driving hasn't reached its promised development, Kerris believes the approach is still sound. “I can build a digital twin of Palo Alto,” the Silicon Valley town. “And I can have thousands of cars in that simulation, driving around, and I can use AI to apply every kind of simulation I can think of — a windstorm, a kid running out, chasing a ball, an oil slick, a dog — so that these cars in simulation are learning many thousands of times more scenarios than a physical car would.”
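The arithmetic behind that claim can be sketched in a few lines of Python. This is only an illustration, not Nvidia's pipeline: the scenario lists, the function names, and the uniform random sampling are all assumptions made here.

```python
import itertools
import random

# Hypothetical scenario ingredients, loosely based on Kerris's examples.
WEATHER = ["clear", "windstorm", "rain"]
HAZARD = ["none", "child chasing ball", "oil slick", "dog"]
TIME_OF_DAY = ["day", "dusk", "night"]


def all_scenarios():
    """Every combination of weather, hazard, and time of day."""
    return list(itertools.product(WEATHER, HAZARD, TIME_OF_DAY))


def sample_scenarios(num_cars, runs_per_car, seed=0):
    """Distinct scenarios seen when each car does some runs, one random scenario per run."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    pool = all_scenarios()
    seen = set()
    for _ in range(num_cars * runs_per_car):
        seen.add(rng.choice(pool))
    return seen


if __name__ == "__main__":
    total = len(all_scenarios())
    one_drive = len(sample_scenarios(num_cars=1, runs_per_car=1))
    fleet = len(sample_scenarios(num_cars=1000, runs_per_car=10))
    print(f"{total} possible, 1 physical drive saw {one_drive}, simulated fleet saw {fleet}")
```

Even with three short lists, the combinations multiply quickly. A single drive on a real road encounters one of them, while a simulated fleet can be pushed through many.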

Nvidia has been doing work combining the simulated trials with real-world driving with car maker Mercedes for Level 5 autonomous driving, the most demanding level. 

“The efficiency,” meaning how well the autonomous software handles the road scenarios, “is pretty amazing,” he said. “By using synthetic data to train these cars, you have a higher degree of efficiency,” by combining scenarios.

“I would much rather trust myself riding in a car trained in a simulated environment than one trained in a physical environment.” There will still be a role for the real-world data that comes from cars on the road. 

As for the time frame for the vision, “We are seeing it already in warehouses,” noted Kerris, which are rapidly adopting the robot-training regime. That includes Amazon, where a developer downloaded Omniverse and evangelized it within Amazon. The enterprise version of Omniverse, which is a subscription-based product, was taken up by Amazon for more extensive robot training.

Amazon is currently in production with the software for its pick-and-place robots. 

“The beauty is they discovered, by using synthetic data generation, they were able to be more efficient with stuff rather than just rely on the camera” on the robot for object detection. Those cameras often would get tripped up, noted Kerris, by reflective packing tape on packages. Using synthetic Omniverse-generated data got around that limitation. That's one example of being more efficient in robotics, he said.

Consumers will probably feel the effects of such simulations in the end results.

“There are a hundred thousand warehouses on the planet,” noted Kerris. “They are all looking at using robotics to be safer, more efficient, and to better utilize the space.” People “may not be aware that's taking place, but they'll reap the benefits of it.”

Moreover, consumers will “know because they're getting things a lot faster than in the past,” he said. “Behind the curtain, things will be much more efficient than they were six months ago.” The same goes for retailers such as Kroger, which is using Omniverse tools to generate synthetic data to plan how to get produce to consumers faster. 

As for self-driving cars, “The presumption that all these cars will be autonomous today, it's a bit, it's not there yet,” he conceded. “But will we have autonomous taxis, and things that will take us from here to there? Oh, yeah, that's easy.” 

For a car that goes the full trip, “For a car that drives up to you, and it will drive you to New Jersey, autonomously, we have a little ways to go.”

As far as direct consumer experiences, “people will start to see the ability to experience locations,” said Kerris. Leisure industry executives are interested, for example, in how to showroom a hotel room to consumers in advance of a trip in a way better than photos. “I'm going to allow you to teleport into the room, experience it, so your decision will be based on an immersive experience, look at the window, see what my view is going to be,” Kerris described. 

The impact on education “is going to be huge,” said Kerris. Today, physical location means some inner-city schools might not experience lavish field trips. “An inner-city school is not exactly going to have a field trip to do a safari in Africa,” he mused. “I think that virtual worlds that are seamlessly connected can bring new opportunities by allowing everybody to have the same experience no matter what school they're in.” 

An avatar of researcher Jane Goodall could “inspire learning,” he offered. “Think about what that does for a student.”

While emphasizing 3-D, Kerris is not pushing virtual reality or augmented reality, the two technologies people tend to focus on. Those things are part of the picture, but 3-D doesn't have to be with a headset on, he contends.

For one thing, today's VR tools, such as VR videos on YouTube, using conventional VR headsets, have been quite limited, Kerris noted. “It's not seamless, it's not easy, it's not like a Web site,” he observed.

In addition to stints at Apple, Amazon and Lucasfilm, Kerris briefly ran marketing for headset developer Avegant. Those headsets were not VR; they were made to be private, immersive movie screens attached to your face, using Texas Instruments DLP projection chips. The quality of the product, Kerris reflected, “was phenomenal,” but it was too expensive to make, costing $800 at retail. And the fact of a laser projecting onto the retina “scared everyone,” he said. (Avegant is still in business, developing a technology called liquid crystal-on-silicon.)

What needs to happen is for today's disparate virtual environments to receive that sinewy tissue of USD and related technology. “They're all disconnected,” said Kerris of today's proto-Metaverse, such as Oculus Rift. “If they were just simple Web sites, where you could bop around, and go experience it, the opportunity would be much greater.” 

Rather than having to have an Oculus headset, “If I could experience it with this being a window into that world,” he said, holding up his smartphone, “chances are a lot higher I would go check it out.”

Will USD make that happen?

“Yes, that's absolutely the goal of USD: to unify 3-D virtual worlds.”

Still, showrooming hotel rooms doesn't sound like it will jump-start things. When is the Tim Berners-Lee event that will make it all happen for consumers in a grassroots way?

“When did the Web become something that became ubiquitous with consumers?” he asked, rhetorically. “Well, it started with email, then I could send a picture, then, all of a sudden I could do video — it kind of evolved as it went along.”

Kerris alluded to the early days of mobile Web sites on the iPhone, which Steve Jobs first unveiled in January of 2007, onstage at Macworld, when Kerris was with Apple, and later to video chat à la FaceTime.

“What was the transformative thing that allowed the Web to be in everybody's pocket? It's kind of like that,” he said. “It almost happened when you didn't know it, and then people take it for granted.”
