Autumn’s GTC shows who Nvidia really cares about • The Register

GTC This week’s GPU Technology Conference saw Nvidia do something we haven’t seen much of from the chip designer lately: refresh a consumer product.

For the increasingly enterprise-obsessed tech giant, GTC has become less and less about GPUs for gamers and more and more about capitalizing on new and emerging markets, such as AI, robotics, autonomous vehicles, and the ever buzz-wordy metaverse. By metaverse, in this context, we mean 3D virtual reality worlds in which you can interact and collaborate with simulations, applications, and each other.

Nvidia CEO Jensen Huang, clad in his signature leather jacket, took to the stage — or is it holodeck? we’re not sure — to unveil a trio of RTX 40-series graphics cards powered by his engineers’ Ada Lovelace architecture.

For many tuning into Huang’s keynote, which ran nearly an hour and 45 minutes, that reveal may have been the only solid, relevant announcement at this fall’s event.

Using a set of cherry-picked benchmarks, Huang bragged about the performance gains of the RTX 4090 and 4080 graphics cards over their predecessors. The chip designer said the RTX 4090 will deliver 2x-4x higher performance compared to the company’s previous flagship 3090 Ti, which launched this spring.

Then there’s the price of these new RTX units. The cards are some of Nvidia’s most expensive to date. At $899 for the 12GB 4080 and $1,199 for the 16GB version, the cards are $200-$500 more expensive than the 3080 when it launched two years earlier. The price creep on the 4090 isn’t as severe. At $1,599, it’s about $100 more than when the 3090 made its debut in 2020.

Huang, speaking during a press conference Wednesday, defended the increase, arguing that the performance gains and feature set more than made up for the higher price. He claimed the price hike was further justified by higher manufacturing and material costs.

“A 12-inch wafer is a lot more expensive today than it was yesterday, and it’s not a little bit more expensive, it’s a ton more expensive,” he said, adding that “our performance with Ada Lovelace is monumentally better.”

But beyond the new cards, which Huang spent less than two minutes detailing, it was back to business as usual. Here’s a recap of Nvidia’s biggest announcements at GTC.

Back to a dual-architecture model

The roughly 15 minutes leading up to the RTX announcement were spent on Nvidia’s new Ada Lovelace architecture, which sees the chip designer return once again to a dual-architecture model.

Nvidia’s previously announced Hopper architecture will power the company’s HPC and AI-focused processors, such as the H100, while the Ada Lovelace architecture will power Nvidia’s graphics-centric chips.

Named after the 19th-century mathematician, the Ada Lovelace architecture is built on TSMC’s 4N process and features Nvidia’s third-gen real-time ray-tracing cores and fourth-gen Tensor cores.

So there’s the split: Hopper aimed mainly at high-performance computing and big AI workloads, and Lovelace aimed mainly at everything else, from cloud server GPUs to gaming cards.

This is hardly the first time Nvidia has employed a dual-architecture model. Going back two generations, Nvidia’s datacenter chips, like the V100, used its Volta architecture. Meanwhile, its consumer and graphics-focused chips, the RTX 2000-series and Quadro RTX family for example, used the Turing microarchitecture.

In addition to Nvidia’s RTX 40-series parts, Ada Lovelace will also power Nvidia’s RTX 6000-series workstation cards and its L40 datacenter GPUs. However, unlike Hopper, Huang says the new architecture is designed to address a new generation of graphics-centric challenges, including the rise of cloud gaming and the metaverse. Both will need graphics chips somewhere to render those real-time environments. In cloud gaming, the game is rendered mainly in a backend datacenter and piped live over the internet to a screen in front of the user, such as a laptop or phone. That spares players from buying and upgrading gaming rigs, and/or carrying them around everywhere.

“In China, cloud gaming is going to be very large and the reason for that is because there’s a billion phones that game developers don’t know how to serve anymore,” he said. “The best way to solve that is cloud gaming. You can reach integrated graphics, and you can reach mobile devices.”

The metaverse but as-a-service

However, Ada Lovelace isn’t limited to cloud gaming applications. Nvidia is positioning the architecture as the workhorse of its first software-as-a-service offering, which the company says will allow customers to access its Omniverse hardware and software stack from the cloud.

Omniverse Cloud offers the remote computing and software resources needed to run metaverse applications on demand, from the cloud. The idea is that not every enterprise wants, or even has the budget, to spend millions of dollars on one of Nvidia’s OVX SuperPods to provide that level of simulation and rendering on the off chance that the metaverse actually goes somewhere. Instead, they can build their metaverses in the Omniverse Cloud.

For now, Nvidia appears to be courting large logistics, manufacturing, and other industrial partners, promising to help them build and visualize digital twins. These twins are large-scale simulations – each twinned with the real world using real data and modeling – presented as a way to test and validate designs, processes, and systems in a virtual world before they’re rolled out in the real world.

Yes, it’s all more fancy modeling and simulation, but with new silicon, interactivity, virtual reality, and invoices.

While Omniverse Cloud is Nvidia’s first foray into managed cloud services, it won’t be its last, according to Huang, who signaled that his company is evaluating a similar model for its other software platforms.

Smarter cars, robots

Nvidia doesn’t just want to power digital twins of customers’ warehouses and manufacturing plants. During the keynote, Huang also detailed a slew of hardware designed to power everything from autonomous robots to automobiles.

Huang talked up Drive Thor, Nvidia’s all-in-one computing platform that’s designed to replace the multiplicity of computer systems used in vehicles today.

The tech will make its debut in China, where Nvidia says it will power the Zeekr and Xpeng 2025 vehicle lineup, and QCraft’s autonomous taxi service. That’s, of course, if US export restrictions don’t tighten to the point where Nvidia can no longer deliver — a prospect that Huang downplayed during the press conference Wednesday.

Meanwhile, to power the robotic minions scurrying about alongside human workers, Nvidia showed off its IGX and Orin Nano platforms.

IGX is based on Nvidia’s previously announced Orin AGX Industrial system-on-modules but adds high-speed networking. According to Nvidia, one of the first uses of the board will be in surgical robots. Meanwhile, Nvidia’s Jetson Orin Nano modules are designed to tackle less demanding applications.

Large-language models for the masses

As with previous GTCs, software dominated a considerable portion of the keynote. Two of the bigger releases for this fall’s event were Nvidia’s large-language model (LLM) services called NeMo and BioNeMo.

The services aim to make employing LLMs easier for AI researchers and biologists seeking to glean insights from complex datasets. The services let customers plug their existing data into customizable foundational models with minimal effort. For example, BioNeMo might be used to accelerate protein-folding research, it was suggested.


However, looking beyond the medical field, Huang expects LLMs will have broad applicability for the vast majority of enterprises. “My sense is that every single company, in every single country, speaking every single language probably has tens of different skills that their company could adapt our large-language model to perform,” he said.

“I’m not exactly sure how big that opportunity is, but it’s potentially one of the largest software opportunities ever.”

Hopper in production

Finally, Nvidia provided an update on the availability of its long-awaited Hopper H100 GPUs, which it says have entered volume production and will begin shipping to OEM system builders next month.

Announced at Nvidia’s GTC spring event, the 700W GPUs promised 6x higher AI performance compared to the outgoing A100 thanks to support for 8-bit floating point calculation. Meanwhile, for HPC applications, Nvidia says the chip will deliver 3x the performance in double-precision FP64 calculations.

However, those hoping to get their hands on Nvidia’s in-house DGX H100 servers, complete with their custom interconnect tech, will have to wait until some time in Q1 2023, a full quarter later than expected.

While Nvidia blamed the greater complexity of the DGX system, a likely culprit is Intel’s Sapphire Rapids processors used in the systems, which have reportedly been delayed until late in Q1. ®