Last week, NVIDIA GTC blew audiences away with new AI, spatial computing software, and hardware. The leading computing firm seeks to scale emerging technology to boost industrial and enterprise workflows.
During the event, NVIDIA demoed CloudXR, which streams enterprise-ready AR/VR/MR content, such as digital twins, from the cloud to less powerful standalone devices like a Quest headset or smartphone.
On the expo floor, NVIDIA showcased how BAC uses its CloudXR platform with Autodesk VRED to design bespoke cars. It also showed how its AI systems power Maxine, an immersive video-calling service and developer platform for boosting engagement in hybrid working environments.
Greg Jones, NVIDIA’s Director of XR Global Business Development and Product Manager for the Maxine Developer Platform, spoke to XR Today during GTC 2024 to highlight how NVIDIA plans to scale spatial computing solutions for the workplace.
NVIDIA Omniverse to Create Accessible 3D Workflows
NVIDIA’s CloudXR solution allows BAC car designers to share visualisations of a prototype car as an AR experience accessible via devices such as an Apple iPad, displaying an Autodesk VRED 3D rendering that a customer or designer can update in real time.
Jones added:
We’re repurposing VRED to some degree; instead of a design review with just on-site engineers, we’re putting a CloudXR version up, and a customer can do a design review at home. We’re using the full dataset, so we have to run it in the cloud. It can’t run on an iPad alone because the datasets are large. The renderings take a tonne of computing power.
Moreover, Jones noted that streaming services can make tools like VRED more accessible by introducing large language models and voice commands. As a result, a user who may not understand VRED’s sophisticated UI and operations can alter the AR rendering of a product, such as the BAC car, without deep software knowledge. “You don’t have to use the interface, it’s hands-free, it’s just voice,” Jones said.
Jones also added:
We built AI experiences just before large language models came out, and it was a relatively complex pipeline. Then, large language models came out, and the pipelines were simplified to a huge degree. This is a technology that’s here now, and everybody can use it.
Jones said it is “phenomenal” how LLMs have appeared in the last year and are “relatively straightforward for developers to put into their systems.”
Democratising Enterprise XR
“Democratising VR isn’t just about buying everybody headsets,” remarked Jones.
Jones noted:
When we first started on this path of streaming heavy-duty VR to standalone headsets, we thought that was a great way to democratise VR. Now, an Oculus Quest, which couldn’t compute this [Omniverse] model, can now compute this model by streaming it from the cloud, and that’s great. We thought that was a great first step. Now we realise that in the manufacturing and enterprise domains, the software is actually quite complex for workers to use.
Jones also explained that, while the hardware is becoming more sophisticated, human understanding of the technology is still a hurdle: “give them a headset, they put it on, they go, I have no idea what to do with this now, so making this experience as simple to get into and do functional work with is really where we’re driving this.”
Leveraging AI and XR for Video Conferencing
On the GTC expo floor, next to its BAC demo, NVIDIA showed off its Maxine solution, a video conferencing service that leverages spatial data, AI, and eye tracking to create simple 3D business avatars and backgrounds.
The solutions create real-time 3D (RT3D) digitisations of a caller and, using AI and eye tracking, align the 3D image with the direction the user is facing, which drives engagement and interaction.
Jones explained that the Maxine solutions followed increasing demand for video conferencing in hybrid working environments. Even after lockdowns and the return to the office, “we’re seeing more and more video conferencing solutions.”
Jones added:
We thought what you would assume: a decrease after lockdowns. It’s actually increasing because people want a flexible workplace. One of the issues with video conferencing is that it puts people at a distance; it’s like I’m watching a movie. I’m not in the scene with the person; I’m not interacting.
According to Jones, Maxine takes a 2D video stream and lifts it to 3D, “allowing the people to have a deeper engagement with that person across the video conference.”
As with CloudXR, NVIDIA encourages developers to embed the Maxine service in pre-existing applications, a core challenge in distributing emerging technology solutions. “Fortunately, NVIDIA plays a really long game, so we’ll overcome those hurdles, and this technology will roll out. It’s just about prioritisation,” Jones remarked.