Jensen Huang Q&A: Why Moore’s Law is dead, but the metaverse will still happen


Asked why Nvidia’s latest 40 Series graphics cards cost as much as $1,600, Nvidia CEO Jensen Huang said that Moore’s Law is dead. He explained that the days of constantly falling costs are over, as technology advances in manufacturing have slowed and the pandemic shortage messed things up further.

But don’t worry too much. The advances in both AI and gaming are going to work together to propel the ambitious dreams of humanity, like the metaverse.

Huang spoke at a press Q&A at Nvidia’s online GTC22 conference last week.

Moore’s Law, posited by Intel chairman emeritus Gordon Moore in 1965, stated that the number of components on a chip would double every couple of years. It was a metronome that signaled that every couple of years chip performance would either double or costs would halve.
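As a rough illustration of that doubling cadence, the sketch below projects a component count forward in time. The function name, the starting count, and the fixed two-year period are illustrative assumptions, not figures from Moore or Nvidia.

```python
def projected_components(initial: float, years: float,
                         doubling_period_years: float = 2.0) -> float:
    """Component count after `years`, assuming one doubling per period."""
    return initial * 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    # Hypothetical starting point of 1,000 components.
    for years in (2, 10, 20):
        print(years, projected_components(1_000, years))
```

Ten doublings over 20 years multiply the count by 1,024, which is why even a modest slowdown in the cadence compounds quickly.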

And it held true for decades, based mostly on manufacturing advances. But with miniaturization approaching the limits of physics, those advances can no longer be taken for granted. Intel is investing heavily to make the law hold up. But Huang said that smart chip design has to take over, which is why Nvidia shifted to a new architecture for its latest generation of graphics chips. The result for the 40 Series graphics chips is some outstanding performance for PC games, arriving just as we head into a global downturn.

Nvidia Omniverse Cloud

Huang believes it’s more important than ever to keep the advances in performance and power efficiency going, as we’re on the cusp of building the metaverse, the 3D universe of virtual worlds that are all interconnected, like in novels such as Snow Crash and Ready Player One. Nvidia has built the Omniverse suite of standardized development and simulation tools to enable that metaverse to happen.

But it won’t be a real metaverse unless it’s real-time and can accommodate lots more people than can access 3D spaces today. Nvidia plans to use the Omniverse to create a digital twin of the Earth, in a supercomputing simulation dubbed Earth 2, so it can predict climate change for decades to come.

With that, we should get the metaverse for free, and we’ll need all the chip processing power available. And he noted that AI, made possible by the graphics chips driven forward by gaming, will enable developers to auto-populate their metaverse worlds with interesting 3D content. In other words, gaming and AI will be helping each other, driving both chips and the metaverse forward. To me, that sounds like a new law is in the making there.

Here’s an edited transcript of the press Q&A, which I attended along with a number of other members of the press.

Jensen Huang of Nvidia says Moore’s Law is dead.

Q: How big can the SaaS business be?

Huang: Well, it’s hard to say. That’s really the answer. It depends on what software we offer as a service. Maybe another way to take it is just a couple at a time. This GTC, we announced new chips, new SDKs, and new cloud services. I highlighted two of them. One of them is large language models. If you haven’t had a chance to look into the effectiveness of large language models and the implications on AI, please do so. It’s important stuff.

Large language models are hard to train. The applications are quite diverse. A model has been trained on a large amount of human knowledge, and so it has the ability to recognize patterns, but it also has within it a large amount of encoded human knowledge. It has human memory, if you will. In a way it’s encoded a lot of our knowledge and skills. Suppose you wanted to adapt it to something that it was never trained to do. For example, it was never trained to answer questions or to summarize a story or to paraphrase a breaking news release. It was never trained to do these things. With a few additional shots of learning, it can learn these skills.
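One common way to supply those "few additional shots" is to place a handful of worked examples in the prompt itself. The sketch below only assembles such a prompt as text; the function name and the Input/Output layout are illustrative assumptions, and no particular model or Nvidia service is implied.

```python
from typing import Sequence, Tuple


def build_few_shot_prompt(task: str,
                          examples: Sequence[Tuple[str, str]],
                          query: str) -> str:
    """Assemble an instruction, worked examples, and a new input into one prompt."""
    lines = [task.strip(), ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The model is expected to continue the text after the final "Output:".
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_few_shot_prompt(
        "Summarize each story in one sentence.",
        [("A long story about a chip launch.", "A company launched a new chip.")],
        "A long story about a climate simulation.",
    )
    print(prompt)
```

Passing an empty list of examples gives the zero-shot case: the instruction and the new input, with nothing in between.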

This basic idea of fine tuning, adapting for new skills, or what’s called zero-shot or few-shot learning, it has great implications in a large number of fields. Which is the reason why you see such a large amount of funding in digital biology. Large language models have learned the language of the structure of proteins, the language of chemistry. And so we put that model up. How large can that opportunity be? My sense is that every single company in every single country speaking every single language has probably tens of different skills that their company could adapt, that our large language models could go perform. I’m not exactly sure how big that opportunity is, but it’s potentially one of the largest software opportunities ever. The reason for that is because the automation of intelligence is one of the largest opportunities ever.

The other opportunity we spoke about was Omniverse Cloud. Remember what Omniverse is. Omniverse has several characteristics. The first characteristic is that it ingests. It can store. It can composite physical information, 3D information, across multiple layers or what’s called schemas. It can describe geometry, textures and materials. Properties like mass and weight and such. Connectivity. Who is the supplier? What’s the cost? What is it related to? What is the supply chain? It can also describe behaviors: kinematic behaviors, or AI behaviors.

Nvidia Omniverse Avatar Cloud Engine.

The first thing Omniverse does is it stores data. The second thing it does is it connects multiple agents. The agents can be people. They can be robots. They can be autonomous systems. The third thing it does is it gives you a viewport into this other world, another way of saying a simulation engine. Omniverse is basically three things. It’s a new type of storage platform. It’s a new type of connecting platform. And it’s a new type of computing platform. You can write an application on top of Omniverse. You can connect other applications through Omniverse. For example, we showed many examples of Adobe applications being connected to Autodesk applications, which were connected to various other applications. We’re connecting things. You could be connecting people. You could be connecting worlds. You could be connecting robots. You could be connecting agents.

The best way to think about what we’ve done with OmniVerse — think of it almost like — the easiest way to monetize that is probably like a database. It’s a modern database in the cloud. Except this database is in 3D. This database connects multiple people. Those are two SaaS applications we put up. One is the large language model, and the other is OmniVerse, basically a database engine that will be in the cloud. I think these two announcements — I’m happy that you asked. I’ll get plenty of opportunities to talk about it over and over again. But these two SaaS platforms are going to be very long-term platforms for our company. We’ll make them run in multiple clouds and so forth.

Q: Nvidia has said that it would reduce GPU sell-through into Q4. Do you mean fiscal Q4 or calendar Q4? Can you confirm that the reduced selling will last several more quarters?

Huang: Actually, it depends on — our fiscal Q4 ends in January. It’s off by a month. I can tell you that — because we only guide one quarter at a time, we are very specifically selling into the market a lot lower than what’s selling out of the market. A significant amount lower than what’s selling out of the market. I hope that by that Q4 time frame, some time in Q4, the channel will normalize and make room for a great launch for Ada. We’ll start shipping Ada starting this quarter in some amount, but the vast majority of Ada will be launched next quarter. I can’t predict the future very far these days, but our expectation and our current thinking is that what we see in the marketplace, what we know to be in the channel and the marketing actions we’ve taken, we should have a pretty terrific Q4 for Ada.

Q: What do you think about the progress of the metaverse, especially a real-time metaverse that would be more responsive than the internet we have right now? If it’s coming along maybe slower than some people would like, what are some things that could make it happen faster, and would Nvidia itself consider investing to make that come faster?

Huang: There are several things we have to do to make the metaverse, the real-time metaverse, be realized. First of all, as you know, the metaverse is created by users. It’s either created by us by hand, or it’s created by us with the help of AI. And in the future it’s very likely that we’ll describe some characteristics of a house or of a city or something like that — it’s like this city, like Toronto or New York City, and it creates a new city for us. If we don’t like it we can give it additional prompts, or we can just keep hitting enter until it automatically generates one we’d like to start from. And then from that world we’ll modify it.

The AI for creating virtual worlds is being realized as we speak. You know that at the core of that is precisely the technology I was talking about just a second ago called large language models. To be able to learn from all of the creations of humanity, and to be able to imagine a 3D world. And so from words, through a large language model, will come out, someday, triangles, geometry, textures and materials. From that we would modify it. Because none of it is pre-baked or pre-rendered, all of this simulation of physics and simulation of light has to be done in real time. That’s the reason why the latest technologies that we’re creating with respect to RTX neural rendering are so important. We can’t do it [by] brute force. We’ll need the help of AI to do that. We just demonstrated Ada with DLSS 3, and the results are pretty insanely amazing.

The first part is generating worlds. The second is simulating the worlds. And then the third part is to be able to put that, the thing you were mentioning earlier about interactivity — we have to deal with the speed of light. We have to put a new type of data center around the world. I spoke about it at GTC and called it a GDN. Whereas Akamai came up with CDN, I think there’s a new world for this thing called GDN, a graphics distribution network. We demonstrated the effectiveness of it through augmenting our GeForce Now network. We have that in 100 regions around the world. By doing that we can have computer graphics, that interactivity that is essentially instantaneous. We’ve demonstrated that on a planetary scale, we can have interactive graphics down to tens of milliseconds, which is basically interactive.
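The speed-of-light constraint can be made concrete with a back-of-the-envelope estimate of propagation delay. This is a minimal sketch that assumes signals travel through optical fiber at roughly two-thirds of the vacuum speed of light and ignores routing, encoding, and rendering time. The function name and example distances are illustrative.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in a vacuum
FIBER_FRACTION = 2 / 3             # assumption: typical propagation speed in fiber


def round_trip_ms(distance_km: float,
                  fraction_of_c: float = FIBER_FRACTION) -> float:
    """Round-trip propagation delay in milliseconds over `distance_km`."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * fraction_of_c)
    return 2 * one_way_s * 1000


if __name__ == "__main__":
    # Hypothetical distances between a player and a rendering data center.
    for km in (100, 1_000, 10_000):
        print(f"{km:>6} km: {round_trip_ms(km):.1f} ms round trip")
```

Under these assumptions, a server about 1,000 km away costs roughly 10 ms of round-trip delay before any rendering happens, while one 10,000 km away costs roughly 100 ms. That gap is the argument for placing graphics data centers in many regions.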

Using the Magic Leap 2 headset in a Lowe’s store

And then the last part of it is how to do raytracing in an augmented way, an AR or VR way. Recently we’ve demonstrated that as well. The pieces are coming together. The engine itself, the database engine called Omniverse Nucleus, the worlds that are either built by humans or augmented by AI, all the way to the simulation and rendering using AI, and then the GDNs around the world: all the pieces we’re putting together are coming together. At GTC this time you saw that we worked with a really cool company called Rimac. Their CEO worked with us to go from their design studio to publishing a car configurator all the way out to the world, literally with the press of a button. We published an interactive raytraced simulation of cars in every corner of the world instantly. I think the pieces are coming together. Now that Ada is in production, we just have to get Ada stood up in the public clouds of the world, stood up in companies around the world, and continue to build out our distributed GDNs. The software is going to be there. The computing infrastructure is going to be there. We’re pretty close.

Q: Given the inventory issues and physical supply chain issues — we’ve seen that with OmniVerse cloud you’re moving into SaaS. You already have GeForce Now. Do you foresee a point where you’re supplying the card as a service, rather than distributing the physical card anymore?

Huang: I don’t think so. There are customers who like to own. There are customers who like to rent. There are some things that I rent or subscribe to and some things I prefer to own. Businesses are that way. It depends on whether you like things capex or opex. Startups would rather have things in opex. Large established companies would rather have capex. It just depends on — if you use things sporadically you’d rather rent. If you’re fully loaded and using it all the time you’d rather just own it and operate it. Some people would rather outsource the factory.

Remember, AI is going to be a factory. It’s going to be the most important factory of the future. You know that because a factory has raw materials come in and something comes out. In the future the factories will have data come in, and what will come out is intelligence, models. The transformation of it is going to be energy. Just like factories today, some people would rather outsource their factory, and some people would rather own the factory. It depends on what business model you’re in.

It’s likely that we continue to build computers with HP and Dell and the OEMs around the world. We’ll continue to provide cloud infrastructure through the CSPs. But remember, Nvidia is a full stack accelerated computing company. Another way of saying it, I kind of said the same thing twice, but an accelerated computing company needs to be full stack. The reason for that is because there isn’t a magical thing you put into a computer and it doesn’t matter what application it is, it just runs 100 times faster. Accelerated computing is about understanding the application, the domain of the application, and re-factoring the entire stack so that it runs a lot faster.

And so accelerated computing, over the course of the last 25 years — we started with computer graphics, went into scientific computing and AI, and then into data analytics. Recently you’ve seen us in graph analytics. Over the years we’ve taken it across so many domains that it seems like the Nvidia architecture accelerates everything, but that’s not true. We accelerate. We just happen to accelerate 3,000 things. These 3,000 things are all accelerated under one architecture, so it seems like, if you put the Nvidia chip into your system, things get faster. But it’s because we did them one at a time, one domain at a time. It took us 25 years.

We had the discipline to stay with one architecture so that the entire software stack we’ve accelerated over time is accelerated by the new chips we build, for example Hopper. If you develop new software on top of our architecture, it runs on our entire installed base of 300, 400 million chips. It’s because of this discipline that’s lasted more than a couple of decades that what it appears to be is this magical chip that accelerates computing. What we’ll continue to do is put this platform out in every possible way into the world, so that people can develop applications for it. Maybe there’s some new quantum algorithms that we can develop for it so it’s prepared for cryptography in 10 or 20 years. Discovering new optimizations for search. New cybersecurity, digital fingerprinting algorithms. We want the platform to be out there so people can use it.

However there are three different domains where you’ll see us do more. The reason why we’ll do more is because it’s so hard to do that if I did it once myself, not only would I understand how to do it, but we can open up the pieces so other people can understand how to do it. Let me give you an example. Obviously you’ve seen us now take computer graphics all the way to the OmniVerse. We’ve built our own engine, our own systems. We took it all the way to the end. The reason for that is because we wanted to discover how best to do real-time raytracing on a very large data scale, fusing AI and brute force path tracing. Without OmniVerse we would have never developed that skill. No game developer would want to do it. We pushed in that frontier for that reason, and now we can open up RTX, and RTX DI and RTX GI and DLSS and we can put that into everyone else’s applications.

Nvidia’s Earth 2 simulation will model climate change.

The second area you saw us do this was Drive. We built an end-to-end autonomous car system so I can understand how to build robotics from end to end, and what it means for us to be a data-driven company, an ML ops company in how you build robotics systems. Now we’ve built Drive. We’ve opened up all the pieces. People can use our synthetic data generation. They can use our simulators and so on. They can use our computing stack.

The third area is large language models. We built one of the world’s largest models, earliest, almost before anyone else did. It’s called Megatron 530B. It’s still one of the most sophisticated language models in the world, and we’ll put that up as a service, so we can understand ourselves what it means. 

And then of course in order to really understand how to build a planetary-scale platform for metaverse applications — in particular we’ll focus on industrial metaverse applications. You have to build a database engine. We built OmniVerse Nucleus and we’ll put that in the cloud. There are a few applications where we think we can make a unique contribution, where it’s really hard. You have to think across the planet at data center scale, full stack scale. But otherwise we’ll keep the platforms completely open.

Q: I wanted to ask you a bit more about the China export control restrictions. Based on what you know about the criteria for the licenses at this point, do you anticipate all your future products beyond Hopper being affected by those, based on the performance and interconnect standards? And if so, do you have plans for China market specific products that will still comply with the rules, but that would incorporate new features as you develop them?

Huang: First of all, Hopper is not a product. Hopper is an architecture. Ampere isn’t a product. Ampere is an architecture. Notice that Ampere has A10, A10G, A100, A40, A30, and so on. Within Ampere there are, gosh, how many versions of products? Probably 15 or 20. Hopper is the same way. There will be many versions of Hopper products. The restrictions specify a particular combination of computing capability and chip to chip interconnection. It specifies that very clearly. Within that specification, under the envelope of that specification is a large space for us, for customers. In fact the vast majority of our customers are not affected by the specification.

Our expectation is that for the US and for China, we’ll have a large number of products that are architecturally compatible, that are within the limits, that require no licensing at all. However, if a customer would specifically like to have the limits that are specified by the restrictions or beyond, we have to go get a license for that. You could surmise that the goal is not to reduce or hamper our business. The goal is to know who it is that would need the capabilities at this limit, and give the US the opportunity to make a decision about whether that level of technology should be available to others.

Q: I had a recent talk with someone from a big British software developer diving into AI and the metaverse in general. We talked a bit about how AI can help with developing games and virtual worlds. Obviously there’s asset creation, but also pathfinding for NPCs and stuff like that. Regarding automotive, these technologies might be somewhat related to one another. You have situational awareness, something like that. Can you give us insight into how you think this might develop in the future?

Huang: When you saw the keynote, you’ll notice there were several different areas where we demonstrated pathfinding very specifically. When you watch our self-driving car, basically three things are happening. There are the sensors, and the sensors come into the computer. Using deep learning we can perceive the environment. We can perceive and then reconstruct the environment. The reconstruction doesn’t have to be exactly to the fidelity that we see, but it has to know its surroundings, the important features, where obstacles are, and where those obstacles will likely be in the near future. There’s the perception part of it, and then the second part, which is the world model creation. Within the world model creation you have to know where everything else is around it, what the map tells you, where you are within the world, and reconstructing that relative to the map and relative to everyone else. Some people call it localization and mapping for robotics.

Isaac-based robots in the Omniverse-based warehouse

The third part is path planning, planning and control. Planning and control has route planning, which has some AI, and then path planning, which is about wayfinding. The wayfinding has to do with where you want to go and where the obstacles are around you and how you want to navigate around it. You saw in the demo something called PathNet. You saw a whole bunch of lines that came out of the front of the cars. Those lines are essentially options that we are grading to see which one of those paths is the best path, the most safe and then the most comfortable, that takes you to your final destination. You’re doing wayfinding all the time. But second is ISAAC for robots. The wayfinding system there is a little bit more, if you will, unstructured in the sense that you don’t have lanes to follow. The factories are unstructured. There are a lot of people everywhere. Things are often not marked. You just have to go from waypoint to waypoint. Between the waypoints, again, you have to avoid obstacles, find the most efficient path, not block yourself in. You can navigate yourself into a dead end, and you don’t want that. There are all kinds of different algorithms to do path planning there.
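The grading of candidate paths described above ("the most safe and then the most comfortable") can be sketched as a filter followed by a lexicographic ranking. This is an illustrative sketch only, not Nvidia's PathNet: the class, the two scores, and the clearance threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CandidatePath:
    name: str
    reaches_goal: bool
    min_clearance_m: float     # hypothetical safety score: closest approach to any obstacle
    peak_lateral_accel: float  # hypothetical comfort score: lower means a smoother ride


def pick_best_path(candidates: Sequence[CandidatePath],
                   min_safe_clearance_m: float = 0.5) -> Optional[CandidatePath]:
    """Return the safest feasible path, breaking ties by comfort."""
    feasible = [
        p for p in candidates
        if p.reaches_goal and p.min_clearance_m >= min_safe_clearance_m
    ]
    if not feasible:
        return None  # no acceptable option; a real planner would fall back to stopping
    # Safety first (larger clearance wins), then comfort (smaller acceleration wins).
    return min(feasible, key=lambda p: (-p.min_clearance_m, p.peak_lateral_accel))


if __name__ == "__main__":
    options = [
        CandidatePath("keep_lane", True, 1.0, 2.0),
        CandidatePath("gentle_left", True, 1.0, 1.0),
        CandidatePath("dead_end", False, 3.0, 0.5),
        CandidatePath("squeeze_by", True, 0.2, 0.1),
    ]
    best = pick_best_path(options)
    print(best.name if best else "no feasible path")
```

Ranking on a tuple keeps the priority order explicit: comfort only matters between paths that are equally safe, and paths that dead-end or pass too close to an obstacle are never considered.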

The ISAAC path planning system, you could see that inside a game. There you could say, soldier, go from point A to point B, and those points are very far apart. In between point A and point B the character has to navigate across rocks and boulders and bushes, step around a river, those kinds of things. And so we would articulate, in a very human way. You saw ISAAC do that, and there’s another piece of AI technology you might have seen in the demo that’s called ASE. Basically it’s Adversarial Skill Embedding. It’s an AI that learned, by watching a whole bunch of humans, how to articulate in a human way from the prompts of words. You could say, walk forward to that stone, or walk forward to waypoint B. Climb the tree. Swing the sword. Kick the ball. From the phrases you can describe a human animation. I’ve just given you basically the pieces of AI models that allow us to take multiplayer games and have AI characters that are very realistic and easy to control. And so the future metaverse will have some people that are real, some people that are AI agents, and some that are avatars that you’ve entered into using VR or other methods. These pieces of technology are already here.

Q: How do you see the future of the autonomous driving business, since you’ve introduced your new chip for autonomous cars? Do you think it’s still in the early stage for this kind of business, or do you see some kind of wave coming up and sweeping the industry? Can you tell us about your strategic thinking in this area?

Huang: First of all, the autonomous car has two computers. There’s the computer in the data center for processing the data that’s captured in cars, turning that data into trained models, developing the application, simulating the application, regressing or replaying against all of your history, building the map, generating the map, reconstructing the map if you will, and then doing CI/CD and then OTA. That first computer is essentially a self-driving car, except it’s in the data center. It does everything that the self-driving car does, except it’s very large, because it collects data from the entire fleet. That data center is the first part of the self-driving car system. It has data processing, AI learning, AI training, simulation and mapping.

And then the second part is you take that whole thing and put it into the car, a small version of it. That small version is what we call in our company — Orin is the name of the chip. The next version is called Thor. That chip has to do data processing, which is called perception or inference. It has to build a world model. It has to do mapping. It has to do path planning and control.

And both of these systems are running continuously, two computers. Nvidia’s business is on both sides. In fact, you could probably say that our data center business for autonomous driving is even larger, definitely larger, and frankly, long-term, the larger of the two parts. The reason for that is because the software development for autonomous vehicles, no matter how many, will never be finished. Every company will be running their own stack. That part of the business is quite significant.

GeForce Now is available via Nvidia Drive.

We created Omniverse, and the first customer for Omniverse is DRIVE Sim, a digital twin of the fleet, of the car. DRIVE Sim is going to be a very significant part of our autonomous driving business. We use it internally. We’ll make it available for other people to use. And then in the car, there are several things philosophically that we believe. If you look at the way that people were building ADAS systems in the past, and you look at the way Nvidia built it, we invented a chip called Xavier, which is really the world’s first software programmable robotics chip. It was designed for high-speed sensors. It has lots of deep learning processors. It has CUDA in it for localization, mapping, and path planning and control. A lot of people, when I first introduced Xavier, said why would anybody need such a large SoC? It turns out that Xavier wasn’t enough. We needed more.

Orin is a home run. If you look at our robotics business right now, which includes self-driving cars and shuttles and trucks and autonomous systems of all kinds, our entire robotics business is running already larger than $1 billion a year. Orin is on its way — the pipeline is $11 billion now. My sense is that our robotics business is on its way to doubling in a year, and it’s going to be a very big part of our business. Our philosophy, which is very different from people in this area in the past, is that there are several different technologies that come together to make robotics possible. One of them, of course, is deep learning. We were the first to bring deep learning to autonomous driving. Before us it was really based on lidars. It was based on hand-tuned computer vision algorithms that were developed by engineers. We used deep learning because we felt that was the most scalable way of doing it.

Second, everything that we did was software-defined. You could update the software very easily, because there are two computers. There’s the computer in the data center developing the software, and then we deploy the software into the car. If you want to do that on a large fleet and move fast and improve software on the basis of software engineering, then you need a really programmable chip. Our philosophy around using deep learning and a fully software-defined platform was really a good decision. It took a little longer because it cost more. People had to learn how to develop the software for it. But I think at this point, it’s a foregone conclusion that everybody will use this approach. It’s the right way going forward. Our robotics business is on track to be a very large business. It already is a very large business, and it’s going to be much bigger.

Q: On the AI generation you mentioned for Ada, which is not just generating new pixels, but now whole new frames, with the different sources that we have for AI-generated images, we see DALL-E and all these different algorithms blowing up on the internet. For video games, it may not be the best use case for that. But how can any other side of creation — you have technologies like broadcast and things focused on creators. How can other users besides game developers make use of that AI technology to generate new images, to export new frames, to stream at new framerates? Have you been studying that approach to making more use of that AI technology?

Huang: First of all, the ability to synthesize computer graphics at very high framerates using path tracing — not offline lighting, not pre-baked lighting, but everything synthesized in real time — is very important. The reason for that is it enables user-generated content. Remember, I mentioned in the keynote that nine of the world’s top 10 video games today were mods at one time. It was because somebody took the original game and modified it into an even more fun game, into a MOBA, into a five-on-five, into a PUBG. That required fans and enthusiasts to modify a particular game. That took a lot of effort.

I think that in the future, we’re going to have a lot more user-generated content. When you have user-generated content, they simply don’t have the large army of artists to put up another wall or tear down this other wall or modify the castle or modify the forest or do whatever they want to do. Whenever you modify those things, these structures, the world, then the lighting system is no longer accurate. Using Nvidia’s path tracing system and doing everything in real time, we made it possible for every lighting environment to be right, because we’re simulating light. No pre-baking is necessary. That’s a very big deal. In fact, if you combine RTX and DLSS 3 with OmniVerse — we’ve made a version of OmniVerse called RTX Remix for mods. If you combine these ideas, I believe user-generated content is going to flourish.

OmniVerse designs can use actual car datasets.

When you say user-generated worlds, what is that? People will say that’s the metaverse, and it is. The metaverse is about user-generated, user-created worlds. And so I think that everybody is going to be a creator someday. You’ll take OmniVerse and RTX and this neural rendering technology and generate new worlds. Once you can do that, once you can simulate the real world, the question is, can you use your own hands to create the whole world? The answer is no. The reason for that is because we have the benefit in our world of mother nature to help us. In virtual worlds we don’t have that. But we have AI. We’ll simply say, give me an ocean. Give me a river. Give me a pond. Give me a forest. Give me a grove of palm trees. You describe whatever you want to describe and AI will synthesize, right in front of you, the 3D world. Which you can then modify.

This world that I’m describing requires a new way of doing computer graphics. We call it neural rendering. The computing platform behind it we call RTX. It’s really about, number one, making video games, today’s video games, a lot better. Making the framerate higher. Many of the games today, because the worlds are so big, they’ve become CPU limited. Using frame generation in DLSS 3 we can improve the framerates still, which is pretty amazing. On the other hand this whole world of user-generated content is the second. And then the third is the environment that we’re in today.

This video conference that we’re in today is rather archaic. In the 1960s video conferencing was really created. In the future, video conferencing will not be encode and decode. In the future it will be perception and generation. Perception and generation. Your camera will be on your side to perceive you, and then on my side it will be generating. You can control how that generation is done. As a result everybody’s framerate will be better. Everybody’s visual quality will be better. The amount of bandwidth used will be tiny, just a little tiny bit of bandwidth, maybe in kilobits per second, not megabits. The ability for us to use neural rendering for video conferencing is going to be a very exciting future. It’s another way of saying telepresence. There are a whole lot of different applications for it.
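To see why "kilobits per second, not megabits" is plausible, consider sending only a compact description of a face each frame and letting the receiver generate the image. The arithmetic below is a rough sketch: the 68-keypoint face description, 16-bit coordinates, 30 frames per second, and the 2 Mbps figure for a conventional encoded stream are all illustrative assumptions, not details of any Nvidia product.

```python
def keypoint_stream_kbps(num_keypoints: int,
                         values_per_keypoint: int = 2,
                         bits_per_value: int = 16,
                         fps: int = 30) -> float:
    """Uncompressed bitrate, in kilobits per second, of a per-frame keypoint stream."""
    bits_per_second = num_keypoints * values_per_keypoint * bits_per_value * fps
    return bits_per_second / 1000


if __name__ == "__main__":
    ASSUMED_VIDEO_KBPS = 2_000  # assumption: a conventional encoded video call stream
    keypoints_kbps = keypoint_stream_kbps(68)  # assumption: 68 2D facial keypoints
    print(f"keypoints: {keypoints_kbps:.2f} kbps")
    print(f"ratio vs. assumed video stream: {ASSUMED_VIDEO_KBPS / keypoints_kbps:.1f}x")
```

Under these assumptions the keypoint stream needs about 65 kbps before any compression, well over an order of magnitude less than the assumed video stream.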

Q: I noticed in the presentation that there was no NVLink connector on the cards. Is that completely gone for Ada?

Huang: There is no NVLink on Ada. The reason why we took it out is because we needed the I/Os for something else. We used the I/Os and the area to cram in as much AI processing as we could. And also, because Ada is based on PCIe Gen 5, we now have the ability to do peer-to-peer across Gen 5 that’s sufficiently fast that it was a better tradeoff. That’s the reason.

Q: Back to the trade issue, do you have a big-picture philosophy about trade restrictions and their potential for disrupting innovation? 

Huang: Well, first of all, there needs to be fair trade. That’s questionable. There needs to be national security. That’s always a concern. There are a lot of things that maybe somebody knows that we don’t know. However, nothing could be absolute. There just have to be degrees. You can’t have open, completely open unfair trade. You can’t have completely unfettered access to technology without concern for national security. But you can’t have no trade. And you can’t have no business. It’s just a matter of degrees. The limitations and the licensing restrictions that we’re affected by give us plenty of room to continue to conduct business in China with our partners. It gives us plenty of room to innovate and continue to serve our customers there. In the event that the most extreme examples and use of our technology is needed, we can go seek a license.

From my perspective, the restriction is no different than any other technology restriction that’s been placed on export control. Many other technology restrictions exist on CPUs. CPUs have had restrictions for a very long time, and yet CPUs are widely used around the world, freely used around the world. The reason why we had to disclose this is because it came in the middle of the quarter, and it came suddenly. Because we’re in the middle of the quarter we thought it was material to investors. It’s a significant part of our business. To others that were affected, it wasn’t a significant part of their business, because accelerated computing is still rather small outside of Nvidia. But to us it was a very significant part of our business, and so we had to disclose. But the restrictions themselves, with respect to serving customers based on the Ampere and Hopper architectures, we have a very large envelope to innovate and to serve our customers. From that perspective, I’m not at all concerned. 

Microsoft Flight Simulator doubles its frame rate using DLSS3 on a new Nvidia GPU.

Q: 4000 is finally here, which for you I’m sure feels like a huge launch. The reaction universally I am seeing out there is, oh my God, it costs so much money. Is there anything you would like to say to the community regarding pricing on the new generation of parts? Can they expect to see better pricing at some point? Basically, can you address the loud screams I’m seeing everywhere?

Huang: First of all, a 12” wafer is a lot more expensive today than it was yesterday. It’s not a little bit more expensive. It is a ton more expensive. Moore’s Law is dead. The ability for Moore’s Law to deliver twice the performance at the same cost, or the same performance [for] half the cost in every year and a half, it’s over. It’s completely over. The idea that the chip is going to go down in cost over time, unfortunately, is a story of the past. The future is about accelerated full stack. You have to come up with new architectures, come up with as good a chip design as you can, and then of course computing is not a chip problem. Computing is a software and a chip problem. We call it a full stack challenge. We innovate across the full stack.

For all of our gamers out there, here’s what I’d like you to remember and to hopefully notice. At the same price point, based on what I just said earlier, even though our costs, our materials costs are greater than they used to be, the performance of Nvidia’s $899 GPU or $1599 GPU a year ago, two years ago — our performance with Ada Lovelace is monumentally better. Off the charts better. That’s really the basis to look at it. Of course, the numbering system is just a numbering system. If you go back, 3080 compared to 1080 compared to 980 compared to 680 compared to 280, all the way back to the 280 — a 280, obviously, was a lot lower price in the past. 

Over time, we have to create in order to pursue advances in computer graphics on the one hand, deliver more value at the same price point on the other hand, expand deeper into the market as well with lower and lower priced solutions — if you look at our track record, we’re doing all three all the time. We’re pushing the new frontiers of computer graphics further into new applications. Look at all the great things that have happened as a result of advancing GeForce. But at the same price point, our value delivered generationally is off the charts, and it remains off the charts this time. If they could just remember the price point, compare price point to price point, they’ll find that they’ll love Ada.
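Huang's description of the old Moore's Law expectation (the same performance for half the cost every year and a half) can be written as a simple exponential. The sketch below only illustrates that stated rule; the function name and the 1.5-year default are taken from his phrasing, not from any Nvidia cost model.

```python
def relative_cost(years: float, halving_period_years: float = 1.5) -> float:
    """Cost of a fixed amount of performance, relative to today (1.0 = today)."""
    return 0.5 ** (years / halving_period_years)


for years in (1.5, 3.0, 6.0):
    print(f"after {years} years: {relative_cost(years):.4f}x today's cost")
```

If that curve no longer holds, as Huang argues, the cost of the silicon stops shrinking on its own and any gains at a given price point have to come from architecture and software instead.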

Q: You talked about everything you’re planning, the big expectations you have from the robotics business. Are there any things that keep you up at night business-wise, that could endanger your business and how it is going at the moment? Are there things you see as challenges you have to cope with?

Huang: This year, I would say that the number of external environmental challenges to the world’s industries is extraordinary. It started with COVID. Then there were supply chain challenges. Then there are entire supply chain shutdowns in China. Entire cities being locked down week to week. More supply chain challenges. All of a sudden, a war in Europe. Energy costs going up. Inflation going sky high. I don’t know. Anything else that can go wrong? However, those things don’t keep me up at night, because they’re out of our control. We try to be as agile as we can, make good decisions.

Three or four months ago we made some very good decisions as we saw the PC market start to slow down overall. When we saw the sell-through, because of inflation, starting to cause the consumer market to slow down, we realized that we were going to have too much inventory coming to us. Our inventory and our supply chain started at the later part of last year. Those wafers and those products are coming at us. When I realized that the sell-through was going to be limited, instead of continuing to ship, we shut ourselves down. We took two quarters of hard medicine. We sold into our customers, into the world, a lot lower than what was selling out of the channel. The channel, just the desktop gaming channel, call it $2.5 billion a quarter. We sold in a lot less than that in Q2 and Q3. We got ourselves prepared, got our channel prepared and our partners prepared, for the Ada launch.

I would say the things we can do something about, we try to make good decisions. The rest of it is continuing to innovate. During this incredible time we built Hopper. We invented DLSS 3. We invented neural rendering. We built Omniverse. Grace is being built. Orin is being ramped. In the midst of all this we’re working on helping the world’s companies reduce their computing costs by accelerating them. If you can accelerate Hopper, Hopper can accelerate computing by a factor of five times for large language models. Even though you have to add Hopper to the system, the TCO is still improved by a factor of three. How do you improve TCO by a factor of three at the end of Moore’s Law? It’s pretty amazing, incredible results, helping customers save money while we invent new ideas and new opportunities for our customers to reinvent themselves. We’re focused on the right things. I’m certain that all of these challenges, environmental challenges, will pass, and then we’ll go back to doing amazing things. None of that keeps me up at night.
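The two figures Huang cites, a fivefold speedup and a threefold TCO improvement, imply how much more the accelerated system can cost. The sketch below assumes that "TCO improved by a factor of three" means the cost per unit of work falls to one third; that reading, and the function name, are assumptions made for illustration.

```python
def implied_cost_multiplier(speedup: float, tco_improvement: float) -> float:
    """How much more the accelerated system costs in total, given that it does
    `speedup` times the work and the cost per unit of work drops by
    `tco_improvement` times."""
    return speedup / tco_improvement


multiplier = implied_cost_multiplier(speedup=5.0, tco_improvement=3.0)
print(f"accelerated system costs about {multiplier:.2f}x the baseline")
```

Under that reading, the system with Hopper added would cost roughly two thirds more than the baseline while doing five times the work.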

Hopper GPU

Q: You have started shipping H100. That’s great news for you. The big ramp from the spring. But with Lovelace now out, I’m curious. Are we going to see an L100? Can you provide any guidance on how you’re going to divvy up those two architectures this time around?

Huang: If you look at our graphics business, let’s go all the way back to Turing. During the Turing time — this is only two generations ago, or about four or five years ago — our core graphics business was basically two segments. One of them is desktop PCs, desktop gaming, and the other was workstations. Those were really the two. Desktop workstations and desktop gaming systems. The Ampere generation, because of its incredible energy efficiency, opened up a whole bunch of notebook business. Thin and light gaming systems, thin and light workstations became a real major driving force. In fact, our notebook business is quite large, almost proportionally very similar to our desktop business, or close to it. During the Ampere generation, we were also quite successful at taking it into the cloud, into the data center. It’s used in the data center because it’s ideal for inference. The Ampere generation saw great success for inference GPUs.

This generation you’re going to see several things. There are some new dynamics happening, long-term trends that are very clear. One of them has to do with cloud graphics. Cloud gaming is, of course, a very real thing now around the world. In China cloud gaming is going to be very large. There are a billion phones that game developers don’t know how to serve anymore. They make perfectly good connections, but the graphics are so poor that they don’t know how to take a game built for a modern iPhone 14 and have it run on a phone that’s five years old, because the technology has moved forward so fast. There’s a billion phones installed in just China. In the rest of the world I would think there’s a similar number of phones. Game developers don’t know how to serve those anymore with modern games. The best way to solve that is cloud gaming. You can reach integrated graphics. You can reach mobile devices and so on.

If you could do that for cloud gaming, then you can obviously do that for streaming applications that are graphics-intensive. For example, what used to be workstation applications that would run on PCs, in the future they’ll just be SaaS that streams from the cloud. The GPU will be one of the ... currently it’s A4s, A40s, A10s. Those Ampere GPUs will be streaming graphics-intensive applications. And then there’s the new one that’s quite important, and that’s augmented reality streaming to your phone. Short-form videos, image enhancement of videos, maybe re-posing, so that your eyes are making eye contact with everybody. Maybe it’s just a perfectly beautiful photograph and you’re animating the face. Those kinds of augmented reality applications are going to use GPUs in the cloud. In the Ada generation, we’re going to see probably the largest installation using graphics-intensive GPUs in the cloud for AI, graphics, computer vision, streaming. It’s going to be the universal accelerator. That’s definitely going to come. In fact, I didn’t call it L100, I called it L40. L40 is going to be our high-end Ada GPU. It’s going to be used for Omniverse, for augmented reality, for cloud graphics, for inference, for training, for all of it. L40 is going to be a phenomenal cloud graphics GPU.

Q: It seems like a big part of the stuff you’re releasing, the car side, the medical side — it feels like very few people are in AI safety. It seems like it’s more hardware accelerated. Can you talk about the importance of AI safety?

Huang: It’s a large question. Let me break it down into a few parts, just as a starting point. There’s trustworthy AI questions in general. But even if you developed an AI model that you believe you trust, that you trained with properly curated data, that you don’t believe is overly biased or unnecessarily biased or undesirably biased — even if you came up with that model, in the context of safety, you want to have several things. The first thing is you want to have diversity and redundancy. One example would be in the context of a self-driving car. You want to observe where there are obstacles, but you also want to observe where there is the absence of obstacles, what we call a free space. Obstacles to avoid, free space that you can drive through. These two models, if overlaid on top of each other, give you diversity and redundancy.

TSMC makes chips for Nvidia

We do that in companies. We do that in the medical field. It’s called multimodality and so forth. We have diversity in algorithms. We have diversity in compute, so that we do processing in two different ways. We do diversity using sensors. Some of it comes from cameras. Some of it comes from radar. Some of it comes from structure from motion. Some of it comes from lidar. You have different sensors and different algorithms, and then different compute. These are layers of safety.
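The idea of overlaying an obstacle model and a free-space model can be sketched as a cross-check on a small grid. This is a toy illustration, not Nvidia's driving stack: the `Cell` labels, the `fuse` rule and the example grids are all hypothetical.

```python
from enum import Enum


class Cell(Enum):
    FREE = "free"          # free-space model vouches for it, no obstacle seen
    BLOCKED = "blocked"    # obstacle model reports something here
    CONFLICT = "conflict"  # both models claim the cell: they disagree
    UNKNOWN = "unknown"    # neither model vouches for the cell


def fuse(obstacle: bool, free_space: bool) -> Cell:
    """Combine two independent detections for one grid cell."""
    if obstacle and free_space:
        return Cell.CONFLICT
    if obstacle:
        return Cell.BLOCKED
    if free_space:
        return Cell.FREE
    return Cell.UNKNOWN


def fuse_grids(obstacles, free_space):
    """Overlay two boolean grids of the same shape, cell by cell."""
    return [[fuse(o, f) for o, f in zip(o_row, f_row)]
            for o_row, f_row in zip(obstacles, free_space)]


# Hypothetical 2x3 outputs from the two models.
obstacle_grid = [[False, True, False],
                 [False, True, False]]
free_space_grid = [[True, False, False],
                   [True, True, True]]

fused = fuse_grids(obstacle_grid, free_space_grid)
for row in fused:
    print([cell.value for cell in row])
```

Only cells labelled `FREE` would be treated as drivable in this sketch; a `CONFLICT` or `UNKNOWN` cell is exactly the case where having two different models adds safety over having one.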

And then the next part is, let’s suppose you design a system that you know to be active safety capable. You believe it’s resilient in that way. How do you know that it’s not tampered with? You designed it properly, but somebody came in and tampered with it and caused it to not be safe. We have to make sure that we have a technology called confidential computing. Everything from booting up the system, so that we measure at boot that nobody tampered with it, to encrypting the model and making sure it wasn’t tampered with, to processing the software in a way that you can’t probe it and change it. Even that is affected. And then all the way back to the methodology of developing software.

Once you certify and validate a full stack to be safe, you want to make sure that all the engineers in the company and everybody contributing to it are contributing to the software and improving the software in a way that retains its ability to remain certified and remain safe. There’s the culture. There’s the tools used. There are methodologies. There are standards for documentation and coding. Everything from — I just mentioned tamper-proof in the car. The data center is tamper-proof. Otherwise somebody could tamper with the model in the data center just before we OTA the model to the car. Anyway, active safety, safety design into software, and safety design into AI is a very large topic. We dedicate ourselves to doing this right. 
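One small piece of the tamper-proofing Huang describes, checking that a model has not changed between certification and an over-the-air update, can be sketched with a cryptographic digest. This is only an integrity check using Python's standard library; it is not confidential computing or measured boot, which rely on hardware support, and the function names and sample bytes are hypothetical.

```python
import hashlib
import hmac


def digest(model_bytes: bytes) -> str:
    """SHA-256 fingerprint of a serialized model."""
    return hashlib.sha256(model_bytes).hexdigest()


def is_untampered(model_bytes: bytes, expected_digest: str) -> bool:
    """Compare against the digest recorded when the model was certified.
    compare_digest avoids leaking information through timing."""
    return hmac.compare_digest(digest(model_bytes), expected_digest)


# Hypothetical stand-in for a certified model file.
certified_model = b"weights-v1"
recorded_digest = digest(certified_model)

print(is_untampered(certified_model, recorded_digest))
print(is_untampered(certified_model + b"!", recorded_digest))
```

In practice the recorded digest itself would need protecting, for example by signing it, which is outside the scope of this sketch.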

Q: Nvidia had pre-ordered production capacity from TSMC further in advance than normal due to the shortages we were experiencing. Do AIBs also have to pre-order GPU supply that far in advance? With the reduction you’ve seen in prices, like the 3080 Ti and 3090 Ti, are there rebates or incentives with any of those prices that AIBs can take advantage of?

Huang: Last year the supply chain was so challenged. Two things happened. One thing is the lead times extended. Lead times used to be about four months from placing a PO on the wafer starts to the time you would ship the products. Maybe slightly longer. Sixteen weeks? It extended all the way to a year and a half. It’s not just the wafer starts. You have substrates to deal with, voltage regulators, all kinds of things in order for us to ship a product. It includes a whole bunch of system components. Our cycle time extended tremendously, number one. Number two, because everything was so scarce, you had to secure your allocation in advance, which then causes you to further secure allocation by probably about a year. Somewhere between normal operating conditions of four months to all of a sudden about two years or so of having to arrange for this. And we were growing so fast. Our data center business was growing nearly 100 percent each year. That’s a multi-billion-dollar business. You can just imagine, between our growth rate and the additional cycle time, how much commitment we had to place. That’s the reason why we had to make the hard decision as demand slowed down, particularly among consumers, to really dramatically slow down shipments and let the channel inventory take care of itself.

With respect to AIBs, the AIBs don’t have to place lead time orders. We ordered the components no matter what. Our AIBs are agile. We carried the vast majority of the inventory. When the market was really hot, the channel, our selling price was all exactly the same. It never moved a dollar. Our component costs kept going up, as people knew last year, but we absorbed all the increases in cost. We passed zero dollars forward to the market. We kept all of our product prices exactly at the MSRP we launched at. Our AIBs had the benefit of creating different SKUs that allowed them to capture more value. The channel, of course, the distributors and retailers, benefited during the time when the product was hot.

When the demand slowed, we took the action to create marketing, what we call marketing programs. But basically discount programs, rebate programs, that allowed the pricing in the market to come back to a price point that we felt, or the market felt, would ultimately sell through. The combination of the commitments that we made, which led to you — you guys saw that we wrote down about a billion dollars worth of inventory. Secondarily, we put a few hundred million dollars into marketing programs to help the channel reset its price. Between these two actions that we took a few months ago, we should be in a good spot in Q4 as Ada ramps hard. I’m looking forward to that. Those decisions were painful, but they were necessary. It’s six months of hardship, and hopefully after that we can move on.

Q: I was wondering if you could address why there wasn’t an RTX 4070, and if a 4070 will arrive. Are you telling consumers to buy a 3000 series card instead?

Huang: We don’t have everything ready to roll everything out at one time. What we have ready is 4090 and 4080. Over time we’ll get other products in the lower end of the stack out to the market. But it’s not any more complicated than — we usually start at the high end, because that’s where the enthusiasts want to refresh first. We’ve found that 4080 and 4090 is a good place to start. As soon as we can we’ll move further down the stack. But this is a great place to start.

Nvidia GeForce RTX 4090 graphics card

Q: What are your thoughts on EVGA halting its production of graphics cards from the RTX 40 series onward? Was Nvidia in close discussion with EVGA as they came to this decision?

Huang: Andrew wanted to wind down the business. He’s wanted to do that for a couple of years. Andrew and EVGA are great partners and I’m sad to see them leave the market. But he has other plans and he’s been thinking about it for several years. I guess that’s about it. The market has a lot of great players. It will be served well after EVGA. But I’ll always miss them. They’re an important part of our history. Andrew is a great friend. It was just time for him to go do something else.

Q: What would you say to the Jensen of 30 years ago?

Huang: I would say to follow your dreams, your vision, your heart, just as we did. It was very scary in the beginning, because as you probably know from our history, we invented the GPU. At the time that we invented the GPU, there was no application for GPUs. Nobody cared about GPUs. At the time we came into the world to build a platform for video games, the video game market was tiny. It barely existed. We spoke about video games completely in 3D, and there weren’t even 3D design tools. You had to create 3D games practically by hand. We talked about a new computing model, accelerated computing, which was the foundation of our company in 1993. That new method of computing was so much work, nobody believed in it. Now, of course, I had no choice but to believe in it. It was our company and we wanted to make it successful. We pursued it with all of our might.

Along the way, slowly but surely, one customer after another, one partner after another, and one developer after another, the GPU became a very important platform. Nvidia invented programmable shading, which now defines modern computer graphics. It led us to invent RTX, to invent Cuda, to develop modern accelerated computing. It led us to AI. It led us to all the things we’re talking about today. All of it, every step of the way, without exception, nobody believed in it. GPU, programmable shading, Cuda, even deep learning. When I brought deep learning to the automotive industry everyone thought it was silly. In fact, one of the CEOs said, “You can’t even detect a German dog. How can you detect pedestrians?” They wrote us off. Deep learning at the time was not perfect, but today it’s of course reached superhuman capabilities.

The advice I would give a young Jensen is to stick with it. You’re doing the right thing. You have to pursue what you believe. You’re going to have a lot of people who don’t believe in it in the beginning, but not because they don’t believe you. It’s just because it’s hard to believe sometimes. How would anybody believe that the same processor that was used for playing Quake would be the processor that modernized computer science and brought AI to the world? The same processor we’re using for Portal turned out to be the same one that led to self-driving cars. Nobody would have believed it. First, you have to believe it, and then you have to help other people believe it. It could be a very long journey, but that’s okay. 

GamesBeat’s creed when covering the game industry is “where passion meets business.” What does this mean? We want to tell you how the news matters to you — not just as a decision-maker at a game studio, but also as a fan of games. Whether you read our articles, listen to our podcasts, or watch our videos, GamesBeat will help you learn about the industry and enjoy engaging with it. Discover our Briefings.


The DeanBeat: Nvidia CEO Jensen Huang says AI will auto-populate the 3D imagery of the metaverse



It takes AI kinds to make a virtual world. Nvidia CEO Jensen Huang said this week during a Q&A at the GTC22 online event that AI will auto-populate the 3D imagery of the metaverse.

He believes that AI will make the first pass at creating the 3D objects that populate the vast virtual worlds of the metaverse — and then human creators will take over and refine them to their liking. And while that is a very big claim about how smart AI will be, Nvidia has research to back it up.

Nvidia Research is announcing this morning a new AI model that can help populate the massive virtual worlds created by growing numbers of companies and creators with a diverse array of 3D buildings, vehicles, characters and more.

This kind of mundane imagery represents an enormous amount of tedious work. Nvidia said the real world is full of variety: streets are lined with unique buildings, with different vehicles whizzing by and diverse crowds passing through. Manually modeling a 3D virtual world that reflects this is incredibly time consuming, making it difficult to fill out a detailed digital environment.

This kind of task is what Nvidia wants to make easier with its Omniverse tools and cloud service. It hopes to make developers’ lives easier when it comes to creating metaverse applications. And auto-generating art — as we’ve seen happening with the likes of DALL-E and other AI models this year — is one way to alleviate the burden of building a universe of virtual worlds like in Snow Crash or Ready Player One.

Jensen Huang, CEO of Nvidia, speaking at the GTC22 keynote.

I asked Huang in a press Q&A earlier this week what could make the metaverse come faster. He alluded to the Nvidia Research work, though the company didn’t spill the beans until today.

“First of all, as you know, the metaverse is created by users. And it’s either created by us by hand, or it’s created by us with the help of AI,” Huang said. “And in the future, it’s very likely that we’ll describe some characteristic of a house or characteristic of a city or something like that. And it’s like this city, or it’s like Toronto, or it’s like New York City, and it creates a new city for us. And maybe we don’t like it. We can give it additional prompts. Or we can just keep hitting ‘enter’ until it automatically generates one that we would like to start from. And then from that, from that world, we will modify it. And so I think the AI for creating virtual worlds is being realized as we speak.”

GET3D details

Trained using only 2D images, Nvidia GET3D generates 3D shapes with high-fidelity textures and complex geometric details. These 3D objects are created in the same format used by popular graphics software applications, allowing users to immediately import their shapes into 3D renderers and game engines for further editing.

The generated objects could be used in 3D representations of buildings, outdoor spaces or entire cities, designed for industries including gaming, robotics, architecture and social media.

GET3D can generate a virtually unlimited number of 3D shapes based on the data it’s trained on. Like an artist who turns a lump of clay into a detailed sculpture, the model transforms numbers into complex 3D shapes.

“At the core of that is precisely the technology I was talking about just a second ago called large language models,” Huang said. “To be able to learn from all of the creations of humanity, and to be able to imagine a 3D world. And so from words, through a large language model, will come out, someday, triangles, geometry, textures, and materials. And then from that, we would modify it. And because none of it is pre-baked, and none of it is pre-rendered, all of this simulation of physics and all the simulation of light has to be done in real time. And that’s the reason why the latest technologies that we’re creating with respect to RTX neural rendering are so important. Because we can’t do it brute force. We need the help of artificial intelligence for us to do that.”

With a training dataset of 2D car images, for example, it creates a collection of sedans, trucks, race cars and vans. When trained on animal images, it comes up with creatures such as foxes, rhinos, horses and bears. Given chairs, the model generates assorted swivel chairs, dining chairs and cozy recliners.

“GET3D brings us a step closer to democratizing AI-powered 3D content creation,” said Sanja Fidler, vice president of AI research at Nvidia and a leader of the Toronto-based AI lab that created the tool. “Its ability to instantly generate textured 3D shapes could be a game-changer for developers, helping them rapidly populate virtual worlds with varied and interesting objects.”

GET3D is one of more than 20 Nvidia-authored papers and workshops accepted to the NeurIPS AI conference, taking place in New Orleans and virtually, Nov. 26-Dec. 4.

Nvidia said that, though quicker than manual methods, prior 3D generative AI models were limited in the level of detail they could produce. Even recent inverse rendering methods can only generate 3D objects based on 2D images taken from various angles, requiring developers to build one 3D shape at a time.

GET3D can instead churn out some 20 shapes a second when running inference on a single Nvidia graphics processing unit (GPU), working like a generative adversarial network for 2D images, while generating 3D objects. The larger, more diverse the training dataset it’s learned from, the more varied and detailed the output.

Nvidia researchers trained GET3D on synthetic data consisting of 2D images of 3D shapes captured from different camera angles. It took the team just two days to train the model on around a million images using Nvidia A100 Tensor Core GPUs.

GET3D gets its name from its ability to Generate Explicit Textured 3D meshes — meaning that the shapes it creates are in the form of a triangle mesh, like a papier-mâché model, covered with a textured material. This lets users easily import the objects into game engines, 3D modelers and film renderers — and edit them.
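To make "explicit textured mesh" concrete, here is a minimal sketch of a triangle mesh with texture coordinates, serialized as Wavefront OBJ text. The `TexturedMesh` class is hypothetical and is not GET3D's code, and OBJ is used only because it is a widely supported plain-text format; the article does not say which file format GET3D emits.

```python
from dataclasses import dataclass


@dataclass
class TexturedMesh:
    vertices: list[tuple[float, float, float]]  # 3D positions
    uvs: list[tuple[float, float]]              # one texture coordinate per vertex
    faces: list[tuple[int, int, int]]           # triangles as zero-based vertex indices

    def to_obj(self) -> str:
        """Serialize to OBJ text; OBJ indices are one-based, written as v/vt."""
        lines = [f"v {x} {y} {z}" for x, y, z in self.vertices]
        lines += [f"vt {u} {v}" for u, v in self.uvs]
        lines += ["f " + " ".join(f"{i + 1}/{i + 1}" for i in face)
                  for face in self.faces]
        return "\n".join(lines) + "\n"


# A unit square built from two triangles, with the texture stretched across it.
quad = TexturedMesh(
    vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)],
    uvs=[(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    faces=[(0, 1, 2), (0, 2, 3)],
)
print(quad.to_obj())
```

Because the geometry is stored as explicit vertices and triangles rather than as an implicit function, a game engine or modeling tool can load and edit it directly, which is the property the paragraph above highlights.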

Once creators export GET3D-generated shapes to a graphics application, they can apply realistic lighting effects as the object moves or rotates in a scene. By incorporating another AI tool from NVIDIA Research, StyleGAN-NADA, developers can use text prompts to add a specific style to an image, such as modifying a rendered car to become a burned car or a taxi, or turning a regular house into a haunted one.

The researchers note that a future version of GET3D could use camera pose estimation techniques to allow developers to train the model on real-world data instead of synthetic datasets. It could also be improved to support universal generation — meaning developers could train GET3D on all kinds of 3D shapes at once, rather than needing to train it on one object category at a time.

Prologue is Brendan Greene’s next project.

So AI will generate worlds, Huang said. Those worlds will be simulations, not just animations. And to run all of this, Huang foresees the need to create a “new type of datacenter around the world.” It’s called a GDN, not a CDN. It’s a graphics delivery network, battle tested through Nvidia’s GeForce Now cloud gaming service. Nvidia has taken that service and used it to create Omniverse Cloud, a suite of tools that can be used to create Omniverse applications, any time and anywhere. The GDN will host cloud games as well as the metaverse tools of Omniverse Cloud.

This type of network could deliver real-time computing that is necessary for the metaverse.

“That is interactivity that is essentially instantaneous,” Huang said.

Are any game developers asking for this? Well, in fact, I know one who is. Brendan Greene, creator of battle royale game PlayerUnknown’s Battlegrounds, asked for this kind of technology this year when he announced Prologue and then revealed Project Artemis, an attempt to create a virtual world the size of the Earth. He said it could only be built with a combination of game design, user-generated content, and AI.

Well, holy shit.


What will Jensen Huang cook up in his GTC keynote?

This article is part of the VB Lab / Nvidia GTC insight series.


NVIDIA is hosting another of its trademark GTC events next week, and many are wondering what CEO Jensen Huang might be cooking up in his November 9th keynote.

Judging by a few of his past keynotes, we think it’s safe to assume that he’ll focus on five major areas.

Accelerated computing: This has long been NVIDIA’s stock-in-trade, the combination of GPUs and specialized software delivering outsized performance in a variety of domains. Gaming is an obvious example, with recent advances like RTX graphics and DLSS boosting performance and realism. But the same principle applies, as well, to dozens of other fields from data analytics to molecular biology to machine learning. We’re not expecting a new chip architecture, but NVIDIA likes to use these events to unveil new systems and software.

Data center: Jensen typically highlights new developments in cloud, data center and high-performance computing. The company has been moving more deeply into networking since closing its acquisition of Mellanox 18 months ago, and there could be more to hear there.

Omniverse: At recent GTCs, Jensen has talked a lot about Omniverse, the company’s platform for virtual collaboration and simulation, and the ability to create digital twins of structures from the physical world. We expect to hear how more companies are using it, and maybe see some cool demos.

Artificial intelligence: It’s a safe bet that the company will reveal new software and research to advance AI from the cloud to the edge. NVIDIA’s been doing a lot of work recently in natural language processing, so we’re hoping to hear about potential breakthroughs in conversational AI.

Robotics and self-driving cars: It’s been over a year since the company announced a partnership with Mercedes-Benz to help the German automaker build autonomous vehicles. We’re interested in seeing any updates on this, and whether the company has any more news on robotics beyond its recent announcement of a robotics developer toolbox.

The GTC keynote will premiere on Tuesday, November 9th, at 12 am PST and will be rebroadcast at 8 am PST for viewers in the Americas. No registration is required to view it.


VB Lab Insights content is created in collaboration with a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. Content produced by our editorial team is never influenced by advertisers or sponsors in any way. For more information, contact sales@venturebeat.com.

Nvidia’s Jensen Huang to get semiconductor industry’s highest honor


Nvidia CEO Jensen Huang will receive the chip industry’s highest honor, the Robert N. Noyce Award.

Huang will receive the honor from his peers at the Semiconductor Industry Association (SIA) annual awards dinner on November 18. The award is named after Intel cofounder Robert Noyce, who is credited with numerous pioneering achievements at the dawn of the chip industry. He was nicknamed the “mayor of Silicon Valley” and known for aphorisms like, “Don’t be encumbered by the past. Go out and do something wonderful.” Noyce passed away in 1990.

The award recognizes a leader who has made outstanding contributions to the semiconductor industry in technology or public policy.

SIA president and CEO John Neuffer said in a statement that Huang’s extraordinary vision and tireless execution have greatly strengthened the chip industry, revolutionized computing, and advanced artificial intelligence. He said Huang’s accomplishments have fueled countless innovations — from gaming to scientific computing to self-driving cars — and he continues to advance technologies that will transform the industry and the world.

Above: CEO Jensen Huang shows off GeForce RTX 3000 series graphics cards.

Image Credit: Nvidia

Huang founded Nvidia in 1993 and has served as CEO since its inception. The company started out in 3D graphics, and Huang showed me a demo of its graphics chip and its “Windows accelerator” application in 1995, when I was at the San Jose Mercury News. It was Huang’s first interview with the press.

Nvidia went on to help build the 3D gaming market into the world’s largest entertainment industry. More recently, Nvidia tapped the parallel processing it used for its graphics processing units (GPUs) to do non-graphics compute tasks. That turned into a huge application in AI, where Nvidia’s chips are becoming the brains of computers, robots, and self-driving cars.

In the more than 25 years since the company’s first chip, scene complexity in computer graphics has increased around 500 million times, Huang said. Moore’s Law, which predicts chip performance will double every couple of years, would have delivered only a 100,000-fold increase in the same period if unaided by better chip design.
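
To make that comparison concrete, here is a rough sketch of the compounding arithmetic behind a Moore's Law estimate. The 25-year span comes from the text above; the doubling periods (24 and 18 months) are assumptions, since "every couple of years" is not precise, and the function name is hypothetical.

```python
def moores_law_factor(years: float, doubling_period_years: float) -> float:
    """Growth factor from repeated doubling over a span of years."""
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    span_years = 25  # "more than 25 years", per the article
    # Assumed doubling periods: a strict two years, and the commonly
    # quoted 18 months. Neither is stated precisely in the article.
    for period in (2.0, 1.5):
        factor = moores_law_factor(span_years, period)
        print(f"doubling every {period} years: ~{factor:,.0f}x")
```

Under these assumptions, the 100,000-fold figure lines up more closely with an 18-month doubling period than with a strict two-year one. Either way, it falls far short of the 500-million-fold increase Huang cited.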

That relentless innovation has paid off. Nvidia is now worth $490 billion on the stock market and employs 20,000 people.

On to the metaverse

Above: Jensen Huang is CEO of Nvidia. He gave a virtual keynote at the recent GTC event.

Image Credit: Nvidia

Huang is also a fan of the intersection between science fiction and technology and has recently been talking more about the metaverse, the universe of virtual worlds that are all interconnected, like in novels such as Snow Crash and Ready Player One.

Huang is a recipient of the IEEE Founder’s Medal; the Dr. Morris Chang Exemplary Leadership Award; and honorary doctorate degrees from Taiwan’s National Chiao Tung University, National Taiwan University, and Oregon State University. In 2019, Harvard Business Review ranked him No. 1 on its list of the world’s 100 best-performing CEOs over the lifetime of their tenure. In 2017, he was named Fortune’s Businessperson of the Year.

Prior to founding Nvidia, Huang worked at LSI Logic and Advanced Micro Devices. He holds a BSEE degree from Oregon State University and an MSEE degree from Stanford University.

Last year, the Noyce award went to Lisa Su, CEO of rival Advanced Micro Devices. She mentioned to me once that Huang is actually a distant relative of hers.

Above: Jensen Huang in his early years as an engineer.

Image Credit: Nvidia/CIE

“I am honored to receive the 2021 Noyce Award and do so on behalf of my colleagues at Nvidia, whose body of work this award recognizes,” Huang said. “It has been the greatest joy and privilege to have grown up with the semiconductor and computer industries, two that so profoundly impact the world. As we enter the era of AI, robotics, digital biology, and the metaverse, we will see super-exponential technology advances. There’s never been a more exciting or important time to be in the semiconductor and computer industries.”

He recently received a distinguished lifetime achievement award as part of the Asian American Engineer of the Year honors from the Chinese Institute of Engineers (CIE) group. Huang pointed out he was “destined to be an engineer,” as his father was an engineer in Taiwan. His brothers were engineers, and his wife, Lori, whom he met as a sophomore at Oregon State University, is also an engineer.

In his acceptance speech for the CIE award, Huang made a rare comment beyond Nvidia’s business matters, noting the scourge of recent anti-Asian violence: “Racism is one flywheel we must stop.”

Nvidia CEO Jensen Huang weighs in on the metaverse, blockchain, and chip shortage


Conversations with Nvidia CEO Jensen Huang are always blunt and illuminating because he still likes to have freewheeling chats with the press. During the recent online-only Computex event, he held a briefing with the press where he talked about the company’s recent announcements and then took a lot of questions.

I asked him about the metaverse, the universe of virtual worlds that are all interconnected, like in novels such as Snow Crash and Ready Player One. And he gave a detailed answer. Huang addressed a wide range of issues. He talked about Nvidia’s pending bid to buy Arm for $40 billion, as well as Nvidia’s effort to create Grace, an Arm-based CPU.

He also addressed progress on Nvidia’s own Omniverse, dubbed a “metaverse for engineers.” Huang talked about Nvidia’s presence in the Chinese market, the company’s efforts to discourage miners from buying all of its GPUs, Nvidia’s data processing units (DPUs), the future of Moore’s Law and building fabs, competition from Advanced Micro Devices in graphics processing units (GPUs), and Nvidia’s reaction to the global semiconductor shortage.

I was part of a group of journalists who quizzed Huang. Here’s an edited transcript of the group interview.

Above: The GeForce RTX 3080 Ti is Nvidia’s new card.

Image Credit: GamesBeat

Jensen Huang: Today I’m coming to you from Nvidia’s new building, called Voyager. This is our new facility. It was started about two and a half years ago. For the last year and a half, I’ve not seen it. Today’s my first day on campus. Literally, for our event today, this is my first day on campus. It’s beautiful here. This facility is going to be the home of 3,500 Nvidians. It’s designed as a city inside a building. If you look behind me, it’s a sprawling city, and it’s a very large open space. It’s largely naturally lit. In fact, right now, as we speak, there’s a light in front of me, but everything behind us is barely lit. The reason is that there are all these panels in the sky that let light in.

We simulated this entire building using raytracing on our supercomputer DGX. The reason we did that is so we can balance the amount of light that comes in and the amount of energy, or otherwise heat, that we have to remove with air conditioning. The more light you bring in, the more AC you have to use. The less light you bring in, the more lighting you have to use. We have to simulate that fine balance.

The roof of this building is angled in just the right way such that the morning sun doesn’t come straight in, and the afternoon sun doesn’t come straight in. The slope of the roof line, the slope of the windows along the side, you’ll see everything was designed in such a way as to balance between natural light, which is comfortable for the eyes, and not having to use as much air conditioning as otherwise necessary. At the moment, no AC at all. This is the first day we’ve been in here. It’s incredibly comfortable.

Using a supercomputer to simulate architecture, I think this is going to happen for all buildings in the future. You’re going to design a building completely in virtual reality. The building is also designed to accommodate many robots. You’ll notice the hallways are very wide. In the future we imagine robots roaming the hallways carrying things to people, but also for telepresence, virtual presence. You can upload yourself into a robot and sit at your desk in your VR or AR headset and roam around the campus.

You’re the first in the world to be here. Welcome all of you, and I thank you for joining me today. I also want to send my thoughts and recognize that in Taiwan, COVID cases are growing again. I’m very sorry about that. I hope all of you are safe. I know that Taiwan was so rigorous in keeping the infection rates down, and so I’m terribly sorry to see it go up now. I know they can get it under control, and soon all of us will be able to see each other in person.

GeForce ecosystem

Let me say a couple of words about the announcement. We announced two basic things. In GeForce gaming, where Taiwan is the central hub of where our add-in card partners and many of our leading laptop partners are based, and the home of, the epicenter if you will, the GeForce ecosystem. It all starts there. It’s manufactured and assembled and integrated and it goes to the market through our add-in card partners and laptop builders.

Above: Nvidia’s RTX is used in more than 130 games.

Image Credit: Nvidia

The GeForce business is doing incredibly well. The invention of RTX has been a home run. It has reset and redefined computer graphics, completely reinvented modern computer graphics. It’s a journey that started more than 10 years ago, and a dream that started 35 years ago. It took that long for us to invent the possibility of doing realtime raytracing, which is really hard to do. It wasn’t until we were able to fuse our hardware accelerated raytracing core with the Tensor core GPU, AI processing, and a bunch of new rendering algorithms, that we were able to bring realtime raytracing to reality. RTX has reinvented computer graphics in the marketplace. RTX 30, the 30 family, the Ampere architecture family, has been fantastic.

We announced several things. We announced that we upgraded the RTX 30 family with the 3080Ti and the 3070Ti. It’s our regularly planned once per year upgrade to our high end GPUs. We also, with the partnership with all of our laptop partners, our AICs, launched 140 different laptops. Our laptop business is one of the fastest growing businesses in our company. This year we have twice as many notebooks going into the marketplace as we did with Turing, our last generation, RTX 20. This is one of the fastest growing businesses. The laptop business is the fastest growing segment of PCs. Nvidia laptops are growing at seven times the rate of the overall laptop business. It gives a sense of how fast RTX laptops are growing.

If you think about RTX laptops as a game console, it’s the largest game console in the world. There are more RTX laptops shipped each year than game consoles. If you were to compare the performance of a game console to an RTX, even an RTX 3060 would be 30-50 percent faster than a PlayStation 5. We have a game console, literally, in this little thin notebook, which is one of the reasons it’s selling so well. The same laptop also brings with it all of the software stacks and rendering stacks necessary for design applications, like Adobe and Autodesk and all of these wonderful design and creative tools. The RTX laptop, RTX 3080Ti, RTX 3070Ti, and a whole bunch of new games, that was one major announcement.

Nvidia in the enterprise

The second thrust is enterprise, data centers. As you know, AI is software that can write software. Using machines you can write software that no human possibly can. It can learn from an enormous amount of data using an algorithm in an approach called deep learning. Deep learning isn’t just one algorithm. Deep learning is a whole bunch of algorithms. Some for image recognition, some for recognizing 2D to 3D, some for recognizing sequences, some for reinforcement learning in robotics. There’s a whole bunch of different algorithms that are associated with deep learning. But there’s no question that we can now write software that we’ve not been able to write before. We can automate a bunch of things that we never thought would be possible in our generation.

One of the most important things is natural language understanding. It’s now so good that you can summarize an entire chapter of a book, or the whole book. Pretty soon you can summarize a movie. Watch the movie, listen to the words, and summarize it in a wonderful way. You can have questions and answers with an NLU model.

AI has made tremendous breakthroughs, but has largely been used by the internet companies, the cloud service providers and internet services. What we announced at GTC initially a few weeks ago, and then what we announced at Computex, is a brand new platform that’s called Nvidia Certified AI for Enterprise. Nvidia Certified systems running a software stack we call Nvidia AI Enterprise. The software stack makes it possible to achieve world class capabilities in AI with a bunch of tools and pre-trained AI models. A pre-trained AI model is like a new college grad. They got a bunch of education. They’re trained. But you have to adapt them into your job and to your profession, your industry. But they’re pre-trained and really smart. They’re smart at image recognition, at language understanding, and so on.

We have this Nvidia AI Enterprise that sits on top of a body of work that we collaborated on with VMware. That sits on top of Nvidia Certified servers from the world’s leading computer makers, many of them in Taiwan, all over the world, and these are high-volume servers that incorporate our Ampere generation data center GPUs and our Mellanox BlueField DPUs. This whole stack gives you a cloud native–it’s like having an AI cloud, but it’s in your company. It comes with a bunch of tools and capabilities for you to be able to adapt it.

How would you use it? Health care would use it for image recognition in radiology, for example. Retail will use it for automatic checkout. Warehouses and logistics, moving products, tracking inventory automatically. Cities would use these to monitor traffic. Airports would use it in case someone lost baggage, it could instantly find it. There are all kinds of applications for AI in enterprises. I expect enterprise AI, what some people call the industrial edge, will be the largest opportunity of all. It’ll be the largest AI opportunity.

With the overall trend, what all of these announcements show is that Nvidia accelerated computing is gaining momentum. We had our company grow a lot last year, as many of you know. This last quarter we had a record quarter across all our product lines. We expect the next quarter to be another great quarter, and the second half also to be a great growth second half. It’s very clear that the world of computing is changing, that accelerated computing is making a contribution, and one of the most important applications is AI.

The metaverse

Above: BMW Group is using Nvidia’s Omniverse to build a digital factory that will mirror a real-world place.

Image Credit: Nvidia

Question: I wonder about your latest thoughts on the metaverse and how we’re making progress toward that. Do you see steps happening in the process of creating the metaverse?

Huang: You’ve been talking about the metaverse for some time, and you’ve had interest in this area for a long time. I believe we’re right on the cusp of it. The metaverse, as you know, for all of you who are learning about it and hearing about it, it’s a virtual world that connects to the world that we live in. It’s a virtual world that is shared by a lot of people. It has real design. It has a real economy. You have a real avatar. That avatar belongs to you and is you. It could be a photoreal avatar of you, or a character.

In these metaverses, you’ll spend time with your friends. You’ll communicate, for example. We could be, in the future, in a metaverse right now. It will be a communications metaverse. It won’t be flat. It’ll be 3D. We’ll be able to almost feel like we’re there with each other. It’s how we do time travel. It’s how we travel to far places at the speed of light. It could simulate the future. There will be many types of metaverses, and video games are one of them, for example. Fortnite will eventually evolve into a form of metaverse, or some derivative of it. World of Warcraft, you can imagine, will someday evolve into a form of metaverse. There will be video game versions.

There will be AR versions, where the art that you have is a digital art. You own it using NFT. You’ll display that beautiful art, that’s one of a kind, and it’s completely digital. You’ll have our glasses on or your phone. You can see that it’s sitting right there, perfectly lit, and it belongs to you. We’ll see this overlay, a metaverse overlay if you will, into our physical world.

In the world of industry, the example I was giving earlier, this building exists fully in virtual reality. This building completely exists in VR. We designed it completely digitally. We’re going to build it out so that there will be a digital twin of this very physical building in VR. We’ll be able to simulate everything, train our robots in it. We can simulate how best to distribute the air conditioning to reduce the energy consumption. Design certain shapeshifting mechanisms that block sunlight while letting in as much light as possible. We can simulate all of that in our digital twin, our building metaverse, before we deploy anything here in the physical world. We’ll be able to go in and out of it using VR and AR.

Those are all pieces that have to come together. One of the most important technologies that we have to build, for several of them–in the case of consumers, one of the important technologies is AR, and it’s coming along. AR is important. VR is becoming more accessible and easier to use. It’s coming along. In the case of the industrial metaverse, one of the most important technologies is physically based, physically simulated VR environments. An object that you design in the metaverse, if you drop it to the ground, it’ll fall to the ground, because it obeys the laws of physics. The lighting condition will be exactly as we see. Materials will be simulated physically.

These things are essential components of it, and that’s the reason why we invented the Nvidia Omniverse. If you haven’t had a chance to look at it, it’s so important. It’s one of our most important bodies of work. It combines almost everything that Nvidia has ever built. Omniverse is now in open beta. It’s being tested by 400 companies around the world. It’s used at BMW to create a digital factory. It’s used by WPP, the world’s largest advertising agency. It’s used by large simulation architects. Bentley, the world’s largest designer of large infrastructure, they just announced that they’ll use Omniverse to create digital twins. Omniverse is very important work, and it’s worth taking a look at.

Chinese market

Above: Nvidia GeForce RTX 3080 Ti graphics card.

Image Credit: Nvidia

Question: You mentioned the opportunities ahead of Nvidia. The recent trend is that China has seen a lot of GPU startups emerge in the last one or two years. They’ve received billions in funding from VCs. China has a lot of reasons to develop its own Nvidia in the next few years. Are you concerned that your Chinese customers are hoping to develop a rival for you in this market?

Huang: We’ve had competition, intense competition, from companies that are gigantic, since the founding of our company. What we need to do is we need to make sure we continue to run very fast. Our company is able to invest, in a couple of years, which is one generation, $10 billion to do one thing. After investing in it for 30 years. We have a great deal of expertise and scale. We have the ability to invest greatly. We care deeply about this marketplace. We’re going to continue to run very fast. Our company’s position, of course, is not certain. We have to take all of the competition, respect them, and take them seriously, and recognize that there are many places where you could contribute to AI. We just have to keep on running hard.

However, here’s my prediction. Every data center and every server will be accelerated. The GPU is the ideal accelerator for these general purpose applications. There will be hundreds of millions of data centers. Not just 100 data centers or 1,000 data centers, but 100 million. The data centers will be in retail stores, in 5G base stations, in warehouses, in schools and banks and airports. They’ll be everywhere. Street corners. They will all be data centers. The market opportunity is quite large. This is the largest market opportunity the IT industry has ever seen. I can understand why it inspires so many competitors. We just need to continue to do our best work and run as fast as we can.

Question: Are you also worried about the government interfering in this space?

Huang: I believe that we add value to the marketplace. Nvidia’s position in China, and our contribution to China, is good. It has helped the internet companies, helped many startups, helped researchers developing AI. It’s wonderful for the gaming business and the design business. We make a lot of contributions to the IT ecosystem in China. I think the government recognizes that. My sense is that we’re welcome in China and we’ll continue to work hard to deserve to be welcome in China, and every other country for that matter. We’ll do that.

China’s game makers

Above: Nvidia’s GeForce RTX 3050 will power new laptops.

Image Credit: Nvidia

Question: We’ve seen a few keynotes about games, and we’ve seen more and more Chinese games, games developed by Chinese companies. How do you position or commend Chinese developers? What does Nvidia plan to do to support the Chinese gaming ecosystem?

Huang: We do several things that developers love. The first thing is our installed base is very big. If you’re a developer and you develop on Nvidia’s platform, because all of our platform, all of our GeForce, are compatible–we work so hard to make sure that all of the software is high quality. We maintain and continue to update the software, to keep tuning every single GPU for every game. Every GPU, every game, we’re constantly tuning. We have a large group of engineers constantly studying and looking for ways to improve. We use our platform called GeForce Experience to update the software for the gamer.

The first thing is our installed base is very large, then. Our software quality is very good. But very important, one of the things that content developers, game developers love is our expertise in computer graphics, working with them to bring beautiful graphics to their games is excellent. We’ve invented so many algorithms. We invented programmable shading, as you know. This is almost 20 years ago, we invented the programmable pixel and vertex shaders in the GPU. We invented RTX. We teach people how to use programmable shading to create special effects, how to use RTX to create raytracing and ambient occlusion and global illumination, really beautiful computer graphics. We have a lot of expertise and a lot of technology that we can use to work with gamers to incorporate that into their games so that they’re as beautiful as possible.

When it’s done, we have fantastic marketing. We have such a large reach, we can help the developers promote their games all over the world. Many of the Chinese developers would like to reach the rest of the world, because their games are now triple-A quality, and they should be able to go all over the world. There are several reasons why game developers enjoy working with us, and those are the reasons.

Nvidia’s Grace Arm CPU

Above: Nvidia’s Grace CPU for datacenters is named after Grace Hopper.

Image Credit: Nvidia

Question: At GTC you announced Grace, which seems like a big project. An ARM CPU is hard to implement. Do you think ARM can overtake the x86 processor in the server market in the future?

Huang: First of all, I think the future world is very diversified. It will be x86. It will be ARM. It will be big CPUs, small CPUs, edge CPUs, data center CPUs, supercomputing CPUs, enterprise computing CPUs, lots of CPUs. I think the world is very diversified. There is no one answer.

Our strategy is one where we’ll continue to support the x86 CPUs in the markets we serve. We don’t serve every market. We serve high-performance computing. We serve AI. We serve computer graphics. We serve the markets that we serve. For the markets that we serve, not every CPU is perfect, but some CPUs are quite ideal. Depending on the market, and depending on the application, the computing requirements, we will use the right CPU.

Sometimes the right CPU is Intel x86. For example, we have 140 laptops. The vast majority of them are Intel CPUs. We have DGX systems. We need a lot of PCI Express. It was great to use the AMD CPU. In the case of 5G base stations, Marvell’s CPU is ideal. They’re based on ARM. Cloud hyperscale, Ampere Computing’s Altra CPU is excellent. Graviton 2 is excellent. It’s fantastic. We support those. In Japan, Fujitsu’s CPU is incredible for supercomputing. We’ll support that. Different types of CPUs are designed for different applications.

The CPU we designed has never been designed before. No CPU has ever been able to achieve the level of memory bandwidth and memory capacity that we have designed for. It is designed for big data analytics. It’s designed for the state of the art in AI. There are two primary models, or AI models, that we are very interested in advancing, because they’re so important. The first one is the recommender system. It’s the most valuable piece of software, approach of software, that the world has ever known. It drives all the internet companies, all the internet services. The recommender system is very important, incredibly important science. It’s designed for that. The second is natural language understanding, which requires a lot of memory, a lot of data, to train a very smart AI for having conversational AI, answering questions, making recommendations, and so on.

These two models are probably, my estimation, the most valuable software in the world today. It requires a very large machine. We decided that we would design something just for those types of applications, where big AI is necessary. Meanwhile, there are so many different markets and edges and enterprises and this and that. We’ll support the CPUs that are right for them. I believe the future is about diversity. I believe the future is about variability and customization and those kinds of things. ARM is a great strategy for us, and x86 will remain a great strategy for us.

Arm deal

Above: Simon Segars is CEO of Arm.

Image Credit: Arm

Question: You recently had the earnings call where you talked a bit about the ARM deal, and Simon Segars’ keynote mentioned it as well, that he’s looking forward to the deal, combining their ecosystem plus all the AI capabilities of Nvidia. Is there any update about the next steps for you guys?

Huang: We’re going through the regulatory approval. It takes about 18 months. The process typically goes U.S., then the EC, and then China last. That’s the typical journey. Mellanox took about 18 months, or close to it. I expect this one to take about 18 months. That makes it early next year, or late this year.

I’m confident about the transaction. The regulators are looking for, is this good for competition? Is it pro-competitive? Does it bring innovation to the market? Does it give customers more choice? Does it give customers more offerings and more choice? You can see that on first principles, because our companies are completely complementary–they build CPUs, we build GPUs and DPUs. They don’t build GPUs. Our companies are complementary, and so by nature we’ll bring innovations that come as a result of coming together offering complementary things. It’s like ketchup and mustard coming together. It’s good for innovation.

Question: You mentioned that the acquisition will increase competition. Can you explain which areas you see for future competition? We see that AMD and also other players are starting to compete in GPUs, CPUs, and data centers.

Huang: First of all, it’s pro-competitive because it brings customers more choice. If we combine Nvidia and ARM, ARM’s R&D scale will be much larger. As you know, ARM is a big company. It’s not a small company. But Nvidia is much bigger. Our R&D budget is many times larger than ARM’s. Our combination will give them more R&D scale. It will give them technology that they don’t have the ability to build themselves, or the scale to build themselves, like all of the AI expertise that we have. We can bring those capabilities to ARM and to its market.

As a result of that, we will offer ARM customers more technology choice, better technology, more advanced technology. That ultimately is great for competition, because it allows ARM’s licensees to create even better products, more vibrant products, better leading-edge technology, which in the end market will give the end market more choice. That’s ultimately the fundamental reason for competition. It’s customer choice. More vibrant innovation, more R&D scale, more R&D expertise brings customers more choice. That, I think, is at the core of it.

For us, it brings a very large ecosystem of developers, and because we’re an accelerated computing company, developers drive our business. And so with 15 million more developers (we have more than 2 million developers today), those 15 million developers will develop new software that ultimately will create value for our company. Our technology, through their channel, creates value for their company. The combination is a win-win.

Semiconductor shortage

Above: Jensen Huang of Nvidia stands in a virtual environment.

Image Credit: Nvidia

Question: I’m interested in your personal thoughts on the–we’ve had all the supply chain constraints on one hand, and then on the other hand a demand surplus when it comes to the crypto world. What’s your feeling? Is it like you’re making Ferraris and people are just parking them in the garage revving the engine for the sake of revving it? Do you see an end to proof of work blockchain in the future that might help resolve that issue? What are your thoughts on the push-pull in that space?

Huang: The reason why Ethereum chose our GPUs is because it’s the largest network of distributed supercomputers in the world. It’s programmable. When Bitcoin first came out, it used our GPU. When Ethereum came out it used our GPU. When other cryptocurrencies came out in the beginning, they established their credibility and their viability and integrity with proof of work using algorithms that run on our GPUs. It’s ideal. It’s the most energy efficient method, the most performant method, the fastest method, and has the benefit of very large distributed networks. That’s the origins of it.

Am I excited about proof of stake? The answer’s yes. I believe that the demand for Ethereum has reached such a high level that it would be nice for either somebody to come up with an ASIC that does it, or for there to be another method. Ethereum has established itself. It has the opportunity now to implement a second generation that carries on from the platform approach and all of the services that are built on top of it. It’s legitimate. It’s established. There’s a lot of credibility. It works well. A lot of people depend on it for DeFi and other things. This is a great time for proof of stake to come.

Now, as we go toward that transition, it’s now established that Ethereum is going to be quite valuable. There’s a future where the processing of these transactions can be a lot faster, and because there are so many people built on top of it now, Ethereum is going to be valuable. In the meantime there will be a lot of coins mined. That’s why we created this new product called CMP. CMP is right here. It looks like this. This is what a CMP looks like. It has no display connectors, as you can probably see.

The CMP is something we learned from the last generation. What we learned is that, first of all–CMP does not yield to GeForce. It’s not a GeForce put into a different box. It does not yield to our data center. It does not yield to our workstations. It doesn’t yield to any of our product lines. It has enough functionality that you can use it for crypto mining.

The $150 million we sold last quarter and the $400 million we’re projecting to sell this quarter essentially increased supply of our company by half a billion dollars. They were supply that we otherwise couldn’t use, and we diverted good yielding supply to GeForce gamers, to workstations and such. The first thing is that CMP effectively increases our supply. CMP also has the after benefit of not being able to be resold secondhand to GeForce customers because it doesn’t play games. These things we learned from the last cycle, and hopefully we can take some pressure off of the GeForce gaming side, getting more GeForce supply to gamers.

Above: Perlmutter, the largest NVIDIA A100-powered system in the world.

Image Credit: Nvidia

Question: There’s a shortage problem in the semiconductor market as a whole. The price of GPU products is getting higher. What do you think it will take to stabilize that price?

Huang: Our situation is very different than other people’s situations, as you can imagine. Nvidia doesn’t make commodity components. We’re not in the DRAM business or the flash business or the CPU business. Our products are not commodity-oriented. It’s very specific, for specific applications. In the case of GeForce, for example, we haven’t raised our price. Our price is basically the same. We have an MSRP. The channel end market prices are higher because demand is so strong.

Our strategy is to alleviate, to reduce the high demand that is caused by crypto mining, and create a special product, the CMP, directly for the crypto miners. If the crypto miners can buy, directly from us, a large volume of GPUs, and they don’t yield to GeForce, so they cannot be used for GeForce, but they can be used for crypto mining, it will discourage them from buying from the open market.

The second reason is we introduced new GeForce configurations that reduce the hash rate for crypto mining. We reduced the performance of our GPU on purpose so that if you would like to buy a GPU for gaming, you can. If you’d like to buy a GPU for crypto mining, either you can buy the CMP version, or if you really would like to use the GeForce to do it, unfortunately the performance will be reduced. This allows us to save our GPUs for the gamers, and hopefully, as a result, the pricing will slowly come down.

In terms of supply, it’s the case that the world’s technology industry has reshaped itself. As you know, cloud computing is growing very fast. In the cloud, the data centers are so big. The chips can be very powerful. That’s why die size, chip size continues to grow. The amount of leading-edge process it consumes is growing. Also, smartphones are using state of the art technology. The leading-edge process consumption used to see some distribution, but now the distribution is heavily skewed toward the leading edge. Technology is moving faster and faster.

The shape of the semiconductor industry changed because of these dynamics. In our case, we have demand that exceeds our supply. That’s for sure. However, as you saw from our last quarter’s performance, we have enough supply to grow significantly year over year. We have enough supply to grow in Q2 as we guided. We have enough supply to grow in the second half. However, I do wish we had more supply. We have enough supply to grow and grow very nicely. We’re very thankful for all of our supply chain and our partners supporting us. But the world is going to be reshaped because of cloud computing, because of the way that computing is going.

Question: When do you think the ongoing chip shortage problem could be solved?

Huang: It just depends on degree and for whom. As you know, we grew tremendously year over year. We announced a great quarter last year. Record quarter for GeForce, for workstations, for data centers. Although demand was even higher than that, we had enough supply to grow quite nicely year over year. We’ll grow in Q2. We’ll grow in the second half. We have supply to do that.

However, there are several dynamics that I think are foundational to our growth. RTX has reset computer graphics. Everyone who has a GTX is looking to upgrade to RTX. RTX is going to reset workstation graphics. There are 45 million designers and creators in the world, and growing. They used to use GTX, but now obviously everyone wants to move to RTX so they can do raytracing in real time. We have this pent-up demand because we reset and reinvented computer graphics. That’s going to drive our demand for some time. It will be several years of pent-up demand that needs to re-upgrade.

In the data center it’s because of AI, because of accelerated computing. You need it for AI and deep learning. We now add to it what I believe will be the long term biggest AI market, which is enterprise industries. Health care is going to be large. Manufacturing, transportation. These are the largest industries in the world. Even agriculture. Retail. Warehouses and logistics. These are giant industries, and they will all be based on AI to achieve productivity and capabilities for their customers.

Now we have that new platform that we just announced at Computex. We have many years of very exciting growth ahead of us. We’ll just keep working with our supply chain to inform them about the changing world of IT, so that they can be better prepared for the demand that’s coming in the future. But I believe that the areas that we’re in, the markets that we’re in, because we have very specific reasons, will have rich demand for some time to come.

AMD competition

Above: AI algorithms were developed on NVIDIA DGX servers at a U.S. Postal Service Engineering facility.

Image Credit: Nvidia

Question: I see that AMD just announced bringing their RDNA 2 to ARM-based SOCs, collaborating with Samsung to bring raytracing and VR features to Android-based devices. Will there be some further plan from Nvidia to bring RTX technology to consumer devices with ARM-based CPUs?

Huang: Maybe. You know that we build lots of ARM SOCs. We build ARM SOCs for robotics, for the Nintendo Switch, for our self-driving cars. We’re very good at building ARM SOCs. The ARM consumer market, I believe, especially for PCs and raytracing games–raytracing games are quite large, to be honest. The data set is quite large. There will be a time for it. When the time is right we might consider it. But in the meantime we use our SOCs for autonomous vehicles, autonomous machines, robots, and for Android devices we bring the best games using GeForce Now.

As you know, GeForce Now has more than 10 million gamers on it now. It’s in 70 countries. We’re about to bring it to the southern hemisphere. I’m excited about that. It has 1,000 games, 300 publishers, and it streams in Taiwan. I hope you’re using it in Taiwan. That’s how we’d like to reach Android devices, Chrome devices, iOS devices, MacOS devices, Linux devices, all kinds of devices, whether it’s on TV or a mobile device. For us, right now, that’s the best strategy.

Moore’s Law and die size

Above: Jensen Huang of Nvidia holds the world’s largest graphics card.

Image Credit: Nvidia

Question: I wanted to ask you about die size. Obviously with Moore’s Law, it seems we have the choice of using Moore’s Law to either shrink the die size or pack more transistors in. In the next few generations, the next three years or so, do you see die sizes shrinking, or do you think they’ll stay stable, or even rise again?

Huang: Since the beginning of time, transistor time, die sizes have grown and grown. There’s no question die sizes are increasing. Because technology cycles are increasing in pace, new products are being introduced every year. There’s no time to cost reduce into smaller die sizes. If you look at the trend, it’s unquestionably to the upper right. If you look at the application space that we see, talking very specifically about us, if you look at our die sizes, they’re always at the reticle limit now. The reticle limits are pretty spectacular. We can’t fit another transistor. That’s why we have to use multi-chip packaging, of course. We created NVLink to put a bunch of them together. There’s all kinds of strategies to increase the effective die size.

One of the important things is that cloud data centers–so much of the computing experience you have on your phone is because of computers in the cloud. The cloud is a much bigger place. The data centers are larger. The electricity is more abundant. The cooling system is better. The die size can be very large. Die size is going to continue to grow, even as transistors continue to shrink.

Building fabs?

Question: It’s expensive to spin up fabs, but in light of the prolonged silicon crunch, is that on the horizon for Nvidia to consider, spinning up a fab for yourself?

Huang: No. Boy, that’s the shortest answer I’ve had all night. It’s the only answer I know, completely. The reason for that, you know there’s a difference between a kitchen and a restaurant. There’s a difference between a fab and a foundry. I can spin up a fab, no doubt, just like I can spin up a kitchen, but it won’t be a good restaurant. You can spin up a fab, but it won’t be a good foundry.

A foundry is a service-oriented business that combines service, agility, technology, capacity, courage, intuition about the future. It’s a lot of stuff. The business is not easy. What TSMC does for a living is not easy. It’s not going to get any easier, and it’s not getting easier. It’s getting harder. There are so many people who are so good at what they do. There’s no reason for us to go repeating that. We should encourage them to develop the necessary capacity for our platform’s benefit.

Meanwhile, they now realize that the leading-edge consumption, leading-edge wafer consumption, the shape has changed because of the way the computing industry is evolving. They see the opportunity in front of them. They’re racing as fast as they can to increase capacity. I don’t think there’s anything I can do, that a fabless semiconductor company can do, that can possibly catch up to any of them. So the answer is no.

Lightspeed Studio

Above: Nvidia’s Clara AI for COVID-19 diagnosis from CT scans

Image Credit: Nvidia

Question: I wanted to ask a process question about Lightspeed Studio. Nvidia, a couple of years ago, spun up an internal development house to work on remastering older titles to help promote raytracing and the expansion of raytracing, but it’s been a couple of years since we heard about that studio. Do you have any updates about their future pipeline?

Huang: I love that question. Thank you for that. Lightspeed Studio is an Nvidia studio where we work on remastering classics, or we develop demo art that is really ground-breaking. The Lightspeed Studio guys did RTX Quake, of course. They did RTX Minecraft. If not for Lightspeed Studio, Minecraft RTX wouldn’t have happened. Recently they created Marbles, Marbles RTX, which has been downloaded and re-crafted into a whole bunch of marble games. They’ve been working on Omniverse. Lightspeed Studio has been working on Omniverse and the technologies associated with that, creating demos for that. Whenever you see our self-driving car simulating in a photorealistic, physically based city, that work is also Lightspeed Studio.

Lightspeed Studio is almost like Nvidia’s special forces. They go off and work on amazing things the world has never seen before. That’s their mission, to do what has been impossible before. They’re the Industrial Light and Magic, if you will, of realtime computer graphics.

DPUs

Above: The Nvidia BlueField-2 DPU.

Image Credit: Nvidia

Question: On the DPU side, could you give a quick narrative–now that you’ve announced BlueField 2 and you can buy these things in the market, people are starting to get them a bit more. A lot of the announcements, especially the Red Hat and IBM announcements with Morpheus, and the firewall announcements before, have been focused on the network side of DPUs. We know that DPUs and GPUs will combine in the future. But what is the road map looking like right now with market interest in DPUs?

Huang: BlueField is going to be a home run. This year BlueField 2 is being tested, and software developers are integrating it and developing software all over the place. Cloud service providers, we announced a bunch of computer makers that are taking BlueField to the market. We’ve announced a bunch of IT companies and software companies developing on BlueField.

There’s a fundamental reason why BlueField needs to exist. Because of security, because of software-defined data centers, you have to take the application plane, the application itself, and separate it from the operating system. You have to separate it from the software-defined network and storage. You have to separate it from the security services and the virtualization. You have to air gap them, because otherwise–every single data center in the future is going to be cloud native. You can’t protect it from the perimeter anymore. All of the intrusion software is coming in right from the cloud and entering into the middle of the data center, into every single computer. You have to make sure that every single server is completely secure. The way to do that is to separate the application, which could be malware, could be intrusion, from the control plane, so it doesn’t wander through the rest of the data center.

Now, once you separate it, you have a whole bunch of software you have to accelerate. Once you’ve separated the networking software down to BlueField, the storage software, the security service, and all the virtualization stack, that air gapping is going to cause a lot of computation to show up on BlueField. That’s why BlueField has to be so powerful. It has to be so good at processing the operating system of the world’s data center infrastructures.

Why are we going to start incorporating more AI into BlueField, into the GPU, and why do we want to put BlueField connected to our GPUs? The reason for that is because, if I can go backward, our GPUs will be in the data center, and every single data center node will be CPU plus a GPU for compute, and then it will be a BlueField with Tensor core processing, basically GPU, for AI necessary for realtime cybersecurity. Every single packet, every single application, will be monitored in real time in the future. Every data center will be in real time using AI to study everything. You’re not just going to secure a firewall at the edge of the data center. That’s way yesterday. The future is about zero trust, cloud native, high-performance computing data centers.

All the way out on the edge, you’ll have a very powerful, but it’s going to be on one chip–essentially an edge data center on one chip. Imagine a BlueField 4 which is really strong in security and networking and such. It has powerful ARM CPUs, data center scale CPUs, and of course our GPUs. That’s essentially a data center on one chip. We’ll put that on the edge. Retail stores, hospitals, banks, 5G base stations, you name it. That’s going to be what’s called the industrial edge AI.

However you want to think about it, the combination of BlueField and GPUs is going to be quite important, and as a result, you’ll see–where today, we have tens of millions of servers in data centers, in the future you’ll see hundreds of millions of server-class computers spread out all over the world. That’s the future. It’ll be cloud native and secure. It’ll be accelerated.

Limiting hash rates to thwart miners

Above: Nvidia’s RTX 3060 Ti is excellent.

Image Credit: GamesBeat

Question: Do you plan to limit hash rates in the future, and do you plan to release multiple versions of your products in the future, with and without reduced hash rates?

Huang: That second question, I actually don’t know the answer. I can’t tell you that I know the future. There’s a reason why we reduced hash rates. We want to steer. We want to protect the GeForce supply for gamers. Meanwhile, we created CMP for the crypto community. The combination of the two will make it possible for the price of GeForce to come down to more affordable levels. All of our gamers that want to have RTX can get access to it.

In the future, I believe–crypto mining will not go away. I believe that cryptocurrency is here to stay. It’s a legitimate way that people want to exchange value. You can argue about whether it has value store, but you can’t argue about value exchange. More important, Ethereum and other forms like it in the future are excellent distributed blockchain methods for securing transactions. You need that blockchain to have some fundamental value, and that fundamental value could be mined. Cryptocurrency is going to be here to stay. Ethereum might not be as hot as it is now. In a year’s time it may cool down some. But I think crypto mining is here to stay.

My intuition is that we will have CMPs and we’ll have GeForce. Hopefully we can serve the crypto miners with CMP. I also hope that crypto miners can buy–when mining becomes quite large, then they can create special ASICs. Or when it becomes super large, like Ethereum, they can move to proof of stake. It will be up and down, up and down, but hopefully never too big.

We’ll see how it turns out. But I think our current strategy is a good one. It’s very well-received. For us it increases, effectively, the capacity of our company, which we welcome. I’ll keep that question in mind. When I have a better answer I’ll let you know.

The Omniverse

Above: WPP is using Omniverse to build ads remotely.

Image Credit: Nvidia

Question: Omniverse feels like it could become the basis of future digital twin technology. Currently Nvidia is applying Omniverse mainly in the graphics and simulation fields. But how far can Omniverse technology expand the concept, for example to chemical technology or sound waves?

Huang: It’s hard to say about chemical technology. With sonic waves, sonic waves are propagation-based like raytracing, and we can use similar techniques to that. Of course there’s a lot more refraction, and sound can reverberate around corners. But that’s very similar to global illumination as well. Raytracing technology could be an excellent accelerator for sonic wave propagation. Surely we can use raytracing for microwave propagation, or even millimeter wave propagation, such as 5G.

We could, in the future, use raytracing to simulate, using Omniverse, traffic going through a city, and adapt the 5G radio, in real time, using AI to optimize the strength of the millimeter wave radios to the right antennas, with cars and people moving around them. Simulate the whole geometry of the city. Incredible energy savings, incredible data rate throughput improvement.

In the case of Omniverse, back to that again, let me make a couple of predictions. This is very important. I believe that there will be a larger market, a larger industry, more designers and creators, designing digital things in virtual reality and metaverses than there will be designing things in the physical world. Today, most of the designers are designing cars and buildings and things like that. Purses and shoes. All of those things will be many times larger, maybe 100 times larger, in the metaverse than in our universe. Number two, the economy in the metaverse, the economy of Omniverse, will be larger than the economy in the physical world. Digital currency, cryptocurrency, could be used in the world of metaverses.

The question is, how do we create such a thing? How do you create a world, a virtual world, that is so realistic that you’re willing to build something for that virtual world? If it looks like a cartoon, why try to bother? If it looks beautiful and it’s exquisite and it’s worthy of an artist to dedicate a lot of time to create a beautiful building, because it looks so beautiful, or you build a beautiful product that looks so beautiful, only available in the digital world–you build a car that’s only available in the digital world. You can only buy it and drive it in the digital world. A piece of art you can only buy and enjoy in the digital world.

Above: Nvidia Omniverse

Image Credit: Nvidia

I believe that several things have to happen. Number one, there needs to be an engine, and this is what Omniverse is created to do, for the metaverse that is photorealistic. It has the ability to render images that are very high fidelity. Number two, it has to obey the laws of physics. It has to obey the laws of particle physics, of gravity, of electromagnetism, of electromagnetic waves, such as light, radio waves. It has to obey the laws of pressure and sound. All of those things have to be obeyed. If we can create such an engine, where the laws of physics are obeyed and it’s photorealistic, then people are willing to create something very beautiful and put it into Omniverse.

Last, it has to be completely open. That’s why we selected Universal Scene Description (USD), the language that Pixar invented. We dedicated a lot of resources to make it so that it has the ability to be dynamic, so that physics can happen through the USD, so that AI agents can go inside and out, so that these AI agents can come out through AR. We can go into Omniverse using VR, like a wormhole. And finally, Omniverse has to be scalable and in the cloud.

We have created an engine that is photoreal, obeys the laws of physics, rendering physically based materials, supports AI, and has wormholes that can go in and out using open standards. That’s Omniverse. It’s a giant body of work. We have some of the world’s best engineers and scientists working on it. We’ve been working on it for three years. This is going to be one of our most important bodies of work.

Some final thoughts. The computer industry is in the process of being completely reshaped. AI is one of the most powerful forces the computer industry has ever known. Imagine a computer that can write software by itself. What kind of software could it write? Accelerated computing is the path that people have recognized is a wonderful path forward as Moore’s Law in CPUs by itself has come to an end.

In the future, computers are going to continue to be small. PCs will do great. Phones will continue to be better. However, one of the most important areas in computing is going to be data centers. Not only is it big, but the way we program a data center has fundamentally changed. Can you imagine that one engineer could write a piece of software that runs across the entire data center and every computer is busy? And it’s supporting and serving millions of people at the same time. Data center scale computing has arrived, and it’s now the unit of computing. Not just the PC, but the entire data center.

Last, I believe that the confluence, the convergence of cloud native computing, AI, accelerated computing, and now finally the last piece of the puzzle, private 5G or industrial 5G, is going to make it possible for us to put computers everywhere. They’ll be in far-flung places. Broom closets and attics at retail stores. They’ll be everywhere, and they’ll be managed by one pane of glass. That one pane of glass will orchestrate all of these computers while they process data and process AI applications and make the right decisions on the spot.

Several of these dynamics are very important to the future of computing. We’re doing our best to contribute to that.

Nvidia CEO Jensen Huang interview: From the Grace CPU to engineer’s metaverse

Nvidia CEO Jensen Huang delivered a keynote speech this week to 180,000 attendees registered for the GTC 21 online-only conference. And Huang dropped a bunch of news across multiple industries that show just how powerful Nvidia has become.

In his talk, Huang described Nvidia’s work on the Omniverse, a version of the metaverse for engineers. The company is starting out with a focus on the enterprise market, and hundreds of enterprises are already supporting and using it. Nvidia has spent hundreds of millions of dollars on the project, which is based on 3D data-sharing standard Universal Scene Description, originally created by Pixar and later open-sourced. The Omniverse is a place where Nvidia can test self-driving cars that use its AI chips and where all sorts of industries will be able to test and design products before they’re built in the physical world.

Nvidia also unveiled its Grace central processing unit (CPU), an AI processor for datacenters based on the Arm architecture. Huang announced new DGX Station mini-supercomputers and said customers will be free to rent them as needed for smaller computing projects. And Nvidia unveiled its BlueField 3 data processing units (DPUs) for datacenter computing alongside new Atlan chips for self-driving cars.

Here’s an edited transcript of Huang’s group interview with the press this week. I asked the first question, and other members of the press asked the rest. Huang talked about everything from what the Omniverse means for the game industry to Nvidia’s plans to acquire Arm for $40 billion.

Above: Nvidia CEO Jensen Huang at GTC 21.

Image Credit: Nvidia

Jensen Huang: We had a great GTC. I hope you enjoyed the keynote and some of the talks. We had more than 180,000 registered attendees, 3 times larger than our largest-ever GTC. We had 1,600 talks from some amazing speakers and researchers and scientists. The talks covered a broad range of important topics, from AI [to] 5G, quantum computing, natural language understanding, recommender systems, the most important AI algorithm of our time, self-driving cars, health care, cybersecurity, robotics, edge IOT — the spectrum of topics was stunning. It was very exciting.

Question: I know that the first version of Omniverse is for enterprise, but I’m curious about how you would get game developers to embrace this. Are you hoping or expecting that game developers will build their own versions of a metaverse in Omniverse and eventually try to host consumer metaverses inside Omniverse? Or do you see a different purpose when it’s specifically related to game developers?

Huang: Game development is one of the most complex design pipelines in the world today. I predict that more things will be designed in the virtual world, many of them for games, than there will be designed in the physical world. They will be every bit as high quality and high fidelity, every bit as exquisite, but there will be more buildings, more cars, more boats, more coins, and all of them — there will be so much stuff designed in there. And it’s not designed to be a game prop. It’s designed to be a real product. For a lot of people, they’ll feel that it’s as real to them in the digital world as it is in the physical world.

Above: Omniverse lets artists design hotels in a 3D space.

Image Credit: Leeza SOHO, Beijing by ZAHA HADID ARCHITECTS

Omniverse enables game developers working across this complicated pipeline, first of all, to be able to connect. Someone doing rigging for the animation or someone doing textures or someone designing geometry or someone doing lighting, all of these different parts of the design pipeline are complicated. Now they have Omniverse to connect into. Everyone can see what everyone else is doing, rendering in a fidelity that is at the level of what everyone sees. Once the game is developed, they can run it in the Unreal engine that gets exported out. These worlds get run on all kinds of devices. Or Unity. But if someone wants to stream it right out of the cloud, they could do that with Omniverse, because it needs multiple GPUs, a fair amount of computation.

That’s how I see it evolving. But within Omniverse, just the concept of designing virtual worlds for the game developers, it’s going to be a huge benefit to their work flow.

Question: You announced that your current processors target high-performance computing with a special focus on AI. Do you see expanding this offering, developing this CPU line into other segments for computing on a larger scale in the market of datacenters?

Huang: Grace is designed for applications, software that is data-driven. AI is software that writes software. To write that software, you need a lot of experience. It’s just like human intelligence. We need experience. The best way to get that experience is through a lot of data. You can also get it through simulation. For example, the Omniverse simulation system will run on Grace incredibly well. You could simulate — simulation is a form of imagination. You could learn from data. That’s a form of experience. Studying data to infer, to generalize that understanding and turn it into knowledge. That’s what Grace is designed for, these large systems for very important new forms of software, data-driven software.

As a policy, or not a policy, but as a philosophy, we tend not to do anything unless the world needs us to do it and it doesn’t exist. When you look at the Grace architecture, it’s unique. It doesn’t look like anything out there. It solves a problem that didn’t used to exist. It’s an opportunity and a market, a way of doing computing that didn’t exist 20 years ago. It’s sensible to imagine that CPUs that were architected and system architectures that were designed 20 years ago wouldn’t address this new application space. We’ll tend to focus on areas where it didn’t exist before. It’s a new class of problem, and the world needs to do it. We’ll focus on that.

Otherwise, we have excellent partnerships with Intel and AMD. We work very closely with them in the PC industry, in the datacenter, in hyperscale, in supercomputing. We work closely with some exciting new partners. Ampere Computing is doing a great ARM CPU. Marvell is incredible at the edge, 5G systems and I/O systems and storage systems. They’re fantastic there, and we’ll partner with them. We partner with Mediatek, the largest SOC company in the world. These are all companies who have brought great products. Our strategy is to support them. Our philosophy is to support them. By connecting our platform, Nvidia AI or Nvidia RTX, our raytracing platform, with Omniverse and all of our platform technologies to their CPUs, we can expand the overall market. That’s our basic approach. We only focus on building things that the world doesn’t have.

Above: Nvidia’s Grace CPU for datacenters is named after Grace Hopper.

Image Credit: Nvidia

Question: I wanted to follow up on the last question regarding Grace and its use. Does this perhaps signal Nvidia’s ambitions in the CPU space beyond the datacenter? I know you said you’re looking for things that the world doesn’t have yet. Obviously, working with ARM chips in the datacenter space leads to the question of whether we’ll see a commercial version of an Nvidia CPU in the future.

Huang: Our platforms are open. When we build our platforms, we create one version of it. For example, DGX. DGX is fully integrated. It’s bespoke. It has an architecture that’s very specifically Nvidia. It was designed — the first customer was Nvidia researchers. We have a couple billion dollars’ worth of infrastructure our AI researchers are using to develop products and pretrain models and do AI research and self-driving cars. We built DGX primarily to solve a problem we had. Therefore it’s completely bespoke.

We take all of the building blocks, and we open it. We open our computing platform in three layers: the hardware layer, chips and systems; the middleware layer, which is Nvidia AI, Nvidia Omniverse, and it’s open; and the top layer, which is pretrained models, AI skills, like driving skills, speaking skills, recommendation skills, pick and place skills, and so on. We create it vertically, but we architect it and think about it and build it in a way that’s intended for the entire industry to be able to use however they see fit. Grace will be commercial in the same way, just like Nvidia GPUs are commercial.

With respect to its future, our primary preference is that we don’t build something. Our primary preference is that if somebody else is building it, we’re delighted to use it. That allows us to spare our critical resources in the company and focus on advancing the industry in a way that’s rather unique. Advancing the industry in a way that nobody else does. We try to get a sense of where people are going, and if they’re doing a fantastic job at it, we’d rather work with them to bring Nvidia technology to new markets or expand our combined markets together.

The ARM license, as you mentioned — acquiring ARM is a very similar approach to the way we think about all of computing. It’s an open platform. We sell our chips. We license our software. We put everything out there for the ecosystem to be able to build bespoke, their own versions of it, differentiated versions of it. We love the open platform approach.

Question: Can you explain what made Nvidia decide that this datacenter chip was needed right now? Everybody else has datacenter chips out there. You’ve never done this before. How is it different from Intel, AMD, and other datacenter CPUs? Could this cause problems for Nvidia partnerships with those companies, because this puts you in direct competition?

Huang: I’ll start with the last part and work my way to the beginning of your question. I don’t believe so. Companies have leadership that is a lot more mature than maybe given credit for. We compete with AMD’s GPUs. On the other hand, we use their CPUs in DGX. Literally, our own product. We buy their CPUs to integrate into our own product, arguably our most important product. We work with the whole semiconductor industry to design their chips into our reference platforms. We work hand in hand with Intel on RTX gaming notebooks. There are almost 80 notebooks we worked on together this season. We advance industry standards together. A lot of collaboration.

Back to why we designed the datacenter CPU, we didn’t think about it that way. The way Nvidia tends to think is we say, “What is a problem that is worthwhile to solve, that nobody in the world is solving and we’re suited to go solve that problem and if we solve that problem it would be a benefit to the industry and the world?” We ask questions literally like that. The philosophy of the company, in leading through that set of questions, finds us solving problems only we will, or only we can, that have never been solved before. The outcome of trying to create a system that can train AI models, language models, that are gigantic, learn from multi-modal data, that would take less than three months — right now, even on a giant supercomputer, it takes months to train 1 trillion parameters. The world would like to train 100 trillion parameters on multi-modal data, looking at video and text at the same time.

The journey there is not going to happen by using today’s architecture and making it bigger. It’s just too inefficient. We created something that is designed from the ground up to solve this class of interesting problems. Now this class of interesting problems didn’t exist 20 years ago, as I mentioned, or even 10 or five years ago. And yet this class of problems is important to the future. AI that’s conversational, that understands language, that can be adapted and pretrained to different domains, what could be more important? It could be the ultimate AI. We came to the conclusion that hundreds of companies are going to need giant systems to pretrain these models and adapt them. It could be thousands of companies. But it wasn’t solvable before. When you have to do computing for three years to find a solution, you’ll never have that solution. If you can do that in weeks, that changes everything.

That’s how we think about these things. Grace is designed for giant-scale data-driven software development, whether it’s for science or AI or just data processing.

Above: Nvidia DGX SuperPod

Image Credit: Nvidia

Question: You’re proposing a software library for quantum computing. Are you working on hardware components as well?

Huang: We’re not building a quantum computer. We’re building an SDK for quantum circuit simulation. We’re doing that because in order to invent, to research the future of computing, you need the fastest computer in the world to do that. Quantum computers, as you know, are able to simulate exponential complexity problems, which means that you’re going to need a really large computer very quickly. You need simulations of that size to verify the results of the research you’re doing, to develop algorithms so you can run them on a quantum computer someday, and to discover algorithms. At the moment, there aren’t that many algorithms you can run on a quantum computer that prove to be useful. Grover’s is one of them. Shor’s is another. There are some examples in quantum chemistry.

We give the industry a platform by which to do quantum computing research in systems, in circuits, in algorithms, and in the meantime, in the next 15-20 years, while all of this research is happening, we have the benefit of taking the same SDKs, the same computers, to help quantum chemists do simulations much more quickly. We could put the algorithms to use even today.

And then last, quantum computers, as you know, have incredible exponential complexity computational capability. However, they have extreme I/O limitations. You communicate with them through microwaves, through lasers. The amount of data you can move in and out of that computer is very limited. There needs to be a classical computer that sits next to a quantum computer, the quantum accelerator if you can call it that, that pre-processes the data and does the post-processing of the data in chunks, in such a way that the classical computer sitting next to the quantum computer is going to be super fast. The answer is fairly sensible, that the classical computer will likely be a GPU-accelerated computer.

There are lots of reasons we’re doing this. There are 60 research institutes around the world. We can work with every one of them through our approach. We intend to. We can help every one of them advance their research.
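Huang's point that simulating a quantum computer requires "a really large computer very quickly" can be illustrated with a rough sketch. It assumes a full state-vector simulation that stores one double-precision complex amplitude (16 bytes) per basis state, which is one common simulation approach and not necessarily how Nvidia's SDK works. The function names are illustrative, and the Grover example uses real-valued amplitudes on a handful of qubits only.

```python
import math

BYTES_PER_AMPLITUDE = 16  # assumption: one complex128 value per basis state


def statevector_bytes(num_qubits: int) -> int:
    """Memory needed to hold the full state vector of num_qubits qubits."""
    return (2 ** num_qubits) * BYTES_PER_AMPLITUDE


def grover_success_probability(num_qubits: int, marked: int) -> float:
    """Simulate Grover's search classically and return P(measuring `marked`)."""
    dim = 2 ** num_qubits
    amps = [1 / math.sqrt(dim)] * dim  # uniform superposition over all states
    iterations = math.floor(math.pi / 4 * math.sqrt(dim))
    for _ in range(iterations):
        amps[marked] = -amps[marked]         # oracle: flip the marked amplitude
        mean = sum(amps) / dim
        amps = [2 * mean - a for a in amps]  # diffusion: reflect about the mean
    return amps[marked] ** 2


if __name__ == "__main__":
    for n in (10, 20, 30, 40):
        print(f"{n} qubits -> {statevector_bytes(n) / 2**30:.6g} GiB")
    print(grover_success_probability(3, marked=5))
```

Each added qubit doubles the memory: under these assumptions 30 qubits already need 16 GiB, and 40 qubits need 16 TiB, which is why the simulation side scales into supercomputer territory so quickly.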

Question: So many workers have moved to work from home, and we’ve seen a huge increase in cybercrime. Has that changed the way AI is used by companies like yours to provide defenses? Are you worried about these technologies in the hands of bad actors who can commit more sophisticated and damaging crimes? Also, I’d love to hear your thoughts broadly on what it will take to solve the chip shortage problem on a lasting global basis.

Huang: The best way is to democratize the technology, in order to enable all of society, which is vastly good, and to put great technology in their hands so that they can use the same technology, and ideally superior technology, to stay safe. You’re right that security is a real concern today. The reason for that is because of virtualization and cloud computing. Security has become a real challenge for companies because every computer inside your datacenter is now exposed to the outside. In the past, the doors to the datacenter were exposed, but once you came into the company, you were an employee, or you could only get in through VPN. Now, with cloud computing, everything is exposed.

The other reason why the datacenter is exposed is because the applications are now disaggregated. It used to be that the applications would run monolithically in a container, in one computer. Now the applications for scaled out architectures, for good reasons, have been turned into micro-services that scale out across the whole datacenter. The micro-services are communicating with each other through network protocols. Wherever there’s network traffic, there’s an opportunity to intercept. Now the datacenter has billions of ports, billions of virtual active ports. They’re all attack surfaces.

The answer is you have to do security at the node. You have to start it at the node. That’s one of the reasons why our work with BlueField is so exciting to us. Because it’s a network chip, it’s already in the computer node, and because we invented a way to put high-speed AI processing in an enterprise datacenter — it’s called EGX — with BlueField on one end and EGX on the other, that’s a framework for security companies to build AI. Whether it’s a Check Point or a Fortinet or Palo Alto Networks, and the list goes on, they can now develop software that runs on the chips we build, the computers we build. As a result, every single packet in the datacenter can be monitored. You would inspect every packet, break it down, turn it into tokens or words, read it using natural language understanding, which we talked about a second ago — the natural language understanding would determine whether there’s a particular action that’s needed, a security action needed, and send the security action request back to BlueField.

This is all happening in real time, continuously, and there’s just no way to do this in the cloud because you would have to move way too much data to the cloud. There’s no way to do this on the CPU because it takes too much energy, too much compute load. People don’t do it. I don’t think people are confused about what needs to be done. They just don’t do it because it’s not practical. But now, with BlueField and EGX, it’s practical and doable. The technology exists.
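The inspection loop Huang describes (take a packet, break it into tokens, have a language model read it, send a security action back) can be sketched in a few lines. This is a toy illustration only: the keyword scorer below is a stand-in for a real natural language understanding model, and every name in it (`SecurityAction`, `tokenize`, `classify`, `inspect`, the token list) is hypothetical and unrelated to any BlueField, EGX, or vendor API.

```python
import re
from dataclasses import dataclass

# Hypothetical watch list; a real system would use a trained model instead.
SUSPICIOUS_TOKENS = {"passwd", "union", "select", "cmd.exe"}


@dataclass
class SecurityAction:
    packet_id: int
    verdict: str  # "allow" or "block"


def tokenize(payload: bytes) -> list[str]:
    """Turn a raw packet payload into lowercase word-like tokens."""
    text = payload.decode("utf-8", errors="replace").lower()
    return re.findall(r"[a-z0-9_.]+", text)


def classify(tokens: list[str]) -> str:
    """Stand-in for the NLU step: block if two or more tokens look suspicious."""
    hits = sum(1 for token in tokens if token in SUSPICIOUS_TOKENS)
    return "block" if hits >= 2 else "allow"


def inspect(packet_id: int, payload: bytes) -> SecurityAction:
    """Tokenize one packet, classify it, and return the action to enforce."""
    return SecurityAction(packet_id, classify(tokenize(payload)))


if __name__ == "__main__":
    print(inspect(1, b"GET /index.html HTTP/1.1"))
    print(inspect(2, b"id=1 UNION SELECT passwd FROM users"))
```

The sketch shows why the work is expensive: the tokenize and classify steps run once per packet, so the cost grows directly with east-west traffic volume.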

Above: Nvidia’s Inception AI startups over the years.

Image Credit: Nvidia

The second question has to do with chip supply. The industry is caught by a couple of dynamics. Of course one of the dynamics is COVID exposing, if you will, a weakness in the supply chain of the automotive industry, which has two main components it builds into cars. Those main components go through various supply chains, so their supply chain is super complicated. When it shut down abruptly because of COVID, the recovery process was far more complicated, the restart process, than anybody expected. You could imagine it, because the supply chain is so complicated. It’s very clear that cars could be rearchitected, and instead of thousands of components, it wants to be a few centralized components. You can keep your eyes on four things a lot better than a thousand things in different places. That’s one factor.

The other factor is a technology dynamic. It’s been expressed in a lot of different ways, but the technology dynamic is basically that we’re aggregating computing into the cloud, and into datacenters. What used to be a whole bunch of electronic devices — we can now virtualize it, put it in the cloud, and remotely do computing. All the dynamics we were just talking about that have created a security challenge for datacenters, that’s also the reason why these chips are so large. When you can put computing in the datacenter, the chips can be as large as you want. The datacenter is big, a lot bigger than your pocket. Because it can be aggregated and shared with so many people, it’s driving the adoption, driving the pendulum toward very large chips that are very advanced, versus a lot of small chips that are less advanced. All of a sudden, the world’s balance of semiconductor consumption tipped toward the most advanced of computing.

The industry now recognizes this, and surely the world’s largest semiconductor companies recognize this. They’ll build out the necessary capacity. I doubt it will be a real issue in two years because smart people now understand what the problems are and how to address them.

Question: I’d like to know more about what clients and industries Nvidia expects to reach with Grace, and what you think is the size of the market for high-performance datacenter CPUs for AI and advanced computing.

Huang: I’m going to start with I don’t know. But I can give you my intuition. 30 years ago, my investors asked me how big the 3D graphics market was going to be. I told them I didn’t know. However, my intuition was that the killer app would be video games, and the PC would become … at the time the PC didn’t even have sound. You didn’t have LCDs. There was no CD-ROM. There was no internet. I said, “The PC is going to become a consumer product. It’s very likely that the new application that will be made possible, that wasn’t possible before, is going to be a consumer product like video games.” They said, “How big is that market going to be?” I said, “I think every human is going to be a gamer.” I said that about 30 years ago. I’m working toward being right. It’s surely happening.

Ten years ago someone asked me, “Why are you doing all this stuff in deep learning? Who cares about detecting cats?” But it’s not about detecting cats. At the time I was trying to detect red Ferraris, as well. It did it fairly well. But anyway, it wasn’t about detecting things. This was a fundamentally new way of developing software. By developing software this way, using networks that are deep, which allows you to capture very high dimensionality, it’s the universal function approximator. If you gave me that, I could use it to predict Newton’s law. I could use it to predict anything you wanted to predict, given enough data. We invested tens of billions behind that intuition, and I think that intuition has proven right.

I believe that there’s a new scale of computer that needs to be built, that needs to learn from basically Earth-scale amounts of data. You’ll have sensors that will be connected to everywhere on the planet, and we’ll use them to predict climate, to create a digital twin of Earth. It’ll be able to predict weather everywhere, anywhere, down to a square meter, because it’s learned the physics and all the geometry of the Earth. It’s learned all of these algorithms. We could do that for natural language understanding, which is extremely complex and changing all the time. The thing people don’t realize about language is it’s evolving continuously. Therefore, whatever AI model you use to understand language is obsolete tomorrow, because of decay, what people call model drift. You’re continuously learning and drifting, if you will, with society.

There’s some very large data-driven science that needs to be done. How many people need language models? Language is thought. Thought is humanity’s ultimate technology. There are so many different versions of it, different cultures and languages and technology domains. How people talk in retail, in fashion, in insurance, in financial services, in law, in the chip industry, in the software industry. They’re all different. We have to train and adapt models for every one of those. How many versions of those? Let’s see. Take 70 languages, multiply by 100 industries that need to use giant systems to train on data forever. That’s maybe an intuition, just to give a sense of my intuition about it. My sense is that it will be a very large new market, just as GPUs were once a zero billion dollar market. That’s Nvidia’s style. We tend to go after zero billion dollar markets, because that’s how we make a contribution to the industry. That’s how we invent the future.
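The back-of-the-envelope estimate in this answer is a simple product. A minimal sketch, using only the two figures Huang quotes (70 languages, 100 industries) and a hypothetical function name:

```python
def model_variants(languages: int, industries: int) -> int:
    """Number of language/industry combinations needing an adapted model."""
    return languages * industries


if __name__ == "__main__":
    # Figures quoted in the interview: 70 languages, 100 industries.
    print(model_variants(70, 100))  # 7000
```

That is on the order of 7,000 domain-specific models, each retrained continuously, which is the intuition behind the claim that the market will be very large.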

Above: Arm’s campus in Cambridge, United Kingdom.

Image Credit: Arm

Question: Are you still confident that the ARM deal will gain approval by close? With the announcement of Grace and all the other ARM-relevant partnerships you have in development, how important is the ARM acquisition to the company’s goals, and what do you get from owning ARM that you don’t get from licensing?

Huang: ARM and Nvidia are independently and separately excellent businesses, as you know well. We will continue to have excellent separate businesses as we go through this process. However, together we can do many things, and I’ll come back to that. To the beginning of your question, I’m very confident that the regulators will see the wisdom of the transaction. It will provide a surge of innovation. It will create new options for the marketplace. It will allow ARM to be expanded into markets that otherwise are difficult for them to reach themselves. Like many of the partnerships I announced, those are all things bringing AI to the ARM ecosystem, bringing Nvidia’s accelerated computing platform to the ARM ecosystem — it’s something only we and a bunch of computing companies working together can do. The regulators will see the wisdom of it, and our discussions with them are as expected and constructive. I’m confident that we’ll still get the deal done in 2022, which is when we expected it in the first place, about 18 months.

With respect to what we can do together, I demonstrated one example, an early example, at GTC. We announced partnerships with Amazon to combine the Graviton architecture with Nvidia’s GPU architecture to bring modern AI and modern cloud computing to the cloud for ARM. We did that for Ampere Computing, for scientific computing, AI in scientific computing. We announced it for Marvell, for edge and cloud platforms and 5G platforms. And then we announced it for Mediatek. These are things that will take a long time to do, and as one company we’ll be able to do it a lot better. The combination will enhance both of our businesses. On the one hand, it expands ARM into new computing platforms that otherwise would be difficult. On the other hand, it expands Nvidia’s AI platform into the ARM ecosystem, which is underexposed to Nvidia’s AI and accelerated computing platform.

Question: I covered Atlan a little more than the other pieces you announced. We don’t really know the node side, but the node side below 10nm is being made in Asia. Will it be something that other countries adopt around the world, in the West? It raises a question for me about the long-term chip supply and the trade issues between China and the United States. Because Atlan seems to be so important to Nvidia, how do you project that down the road, in 2025 and beyond? Are things going to be handled, or not?

Huang: I have every confidence that it will not be an issue. The reason for that is because Nvidia qualifies and works with all of the major foundries. Whatever is necessary to do, we’ll do it when the time comes. A company of our scale and our resources, we can surely adapt our supply chain to make our technology available to customers that use it.

Question: In reference to BlueField 3, and BlueField 2 for that matter, you presented a strong proposition in terms of offloading workloads, but could you provide some context into what markets you expect this to take off in, both right now and going into the future? On top of that, what barriers to adoption remain in the market?

Huang: I’m going to go out on a limb and make a prediction and work backward. Number one, every single datacenter in the world will have an infrastructure computing platform that is isolated from the application platform in five years. Whether it’s five or 10, hard to say, but anyway, it’s going to be complete, and for very logical reasons. The application, that’s where the intruder is. You don’t want the intruder to be in a control mode. You want the two to be isolated. By doing this, by creating something like BlueField, we have the ability to isolate.

Second, the processing necessary for the infrastructure stack that is software-defined — the networking, as I mentioned, the east-west traffic in the datacenter, is off the charts. You’re going to have to inspect every single packet now. The east-west traffic in the data center, the packet inspection, is going to be off the charts. You can’t put that on the CPU because it’s been isolated onto a BlueField. You want to do that on BlueField. The amount of computation you’ll have to accelerate onto an infrastructure computing platform is quite significant, and it’s going to get done. It’s going to get done because it’s the best way to achieve zero trust. It’s the best way that we know of, that the industry knows of, to move to the future where the attack surface is basically zero, and yet every datacenter is virtualized in the cloud. That journey requires a reinvention of the datacenter, and that’s what BlueField does. Every datacenter will be outfitted with something like BlueField.

I believe that every single edge device will be a datacenter. For example, the 5G edge will be a datacenter. Every cell tower will be a datacenter. It’ll run applications, AI applications. These AI applications could be hosting a service for a client or they could be doing AI processing to optimize radio beams and strength as the geometry in the environment changes. When traffic changes and the beam changes, the beam focus changes, all of that optimization, incredibly complex algorithms, wants to be done with AI. Every base station is going to be a cloud native, orchestrated, self-optimizing sensor. Software developers will be programming it all the time.

Every single car will be a datacenter. Every car, truck, shuttle will be a datacenter. Every one of those datacenters, the application plane, which is the self-driving car plane, and the control plane, that will be isolated. It’ll be secure. It’ll be functionally safe. You need something like BlueField. I believe that every single edge instance of computing, whether it’s in a warehouse, a factory — how could you have a several-billion-dollar factory with robots moving around and that factory is literally sitting there and not have it be completely tamper-proof? Out of the question, absolutely. That factory will be built like a secure datacenter. Again, BlueField will be there.

Everywhere on the edge, including autonomous machines and robotics, every datacenter, enterprise or cloud, the control plane and the application plane will be isolated. I promise you that. Now the question is, “How do you go about doing it? What’s the obstacle?” Software. We have to port the software. There’s two pieces of software, really, that need to get done. It’s a heavy lift, but we’ve been lifting it for years. One piece is for 80% of the world’s enterprise. They all run VMware vSphere software-defined datacenter. You saw our partnership with VMware, where we’re going to take vSphere stack — we have this, and it’s in the process of going into production now, going to market now … taking vSphere and offloading it, accelerating it, isolating it from the application plane.

Above: Nvidia has eight new RTX GPU cards.

Image Credit: Nvidia

Number two, for everybody else out at the edge, the telco edge, with Red Hat, we announced a partnership with them, and they’re doing the same thing. Third, for all the cloud service providers who have bespoke software, we created an SDK called DOCA 1.0. It’s released to production, announced at GTC. With this SDK, everyone can program the BlueField, and by using DOCA 1.0, everything they do on BlueField 2 runs on BlueField 3 and BlueField 4. I announced the architecture for all three of those will be compatible with DOCA. Now the software developers know the work they do will be leveraged across a very large footprint, and it will be protected for decades to come.

We had a great GTC. At the highest level, the way to think about that is the work we’re doing is all focused on driving some of the fundamental dynamics happening in the industry. Your questions centered around that, and that’s fantastic. There are five dynamics highlighted during GTC. One of them is accelerated computing as a path forward. It’s the approach we pioneered three decades ago, the approach we strongly believe in. It’s able to solve some challenges for computing that are now front of mind for everyone. The limits of CPUs and their ability to scale to reach some of the problems we’d like to address are facing us. Accelerated computing is the path forward.

Second, to be mindful about the power of AI that we all are excited about. We have to realize that it’s software that is writing software. The computing method is different. On the other hand, it creates incredible new opportunities. Thinking about the datacenter not just as a big room with computers and network and security appliances, but thinking of the entire datacenter as one computing unit. The datacenter is the new computing unit.

Above: Bentley’s tools used to create a digital twin of a location in the Omniverse.

Image Credit: Nvidia

5G is super exciting to me. Commercial 5G, consumer 5G is exciting. However, it’s incredibly exciting to look at private 5G, for all the applications we just looked at. AI on 5G is going to bring the smartphone moment to agriculture, to logistics, to manufacturing. You can see how excited BMW is about the technologies we’ve put together that allow them to revolutionize the way they do manufacturing, to become much more of a technology company going forward.

Last, the era of robotics is here. We’re going to see some very rapid advances in robotics. One of the critical needs in developing and training robots, because they can’t be trained in the physical world while they’re still clumsy, is a virtual world where they can learn how to be robots. These virtual worlds will be so realistic that they’ll become the digital twins of where the robot goes into production. We spoke about the digital twin vision. PTC is a great example of a company that also sees the vision of this. This is going to be a realization of a vision that’s been talked about for some time. The digital twin idea will be made possible because of technologies that have emerged out of gaming. Gaming and scientific computing have fused together into what we call Omniverse.
