Generative AI is having far-reaching impacts on datacenters, and new data is shedding light on the nature and extent of those impacts. Analysts Melissa Incera and Dan Thompson return to the podcast to dig into the data and discuss the ramifications for enterprises, datacenter builders and operators, and those putting AI to work, with host Eric Hanselman. The computational requirements of generative AI are unique, taking demands similar to high-performance computing or cryptomining and spreading them across a vast new community of users. The demand for GPU-based capacity is consuming more power, sometimes with order-of-magnitude increases, and it's driving retrofitting of existing datacenters. The need for higher capacity and innovations like liquid cooling are driving a new surge in upgrades. We're at the beginning of the shift from a focus on model training to greater amounts of inferencing, which changes where capacity is needed.
Looking at money flows in this market offers additional nuance. Investments in AI startups have been massive. Capital investments from hyperscalers in their own infrastructure are also massive, but it remains to be seen whether they'll pay off in equivalent amounts of revenue. Is all of this sustainable? It's a complicated question in every sense of the word.
Host: Eric Hanselman
Guests: Melissa Incera and Dan Thompson
Links to show content:
Webinar replay (registration required)
For 451 Research clients:
https://clients.451research.com/reports/204284
https://clients.451research.com/reports/204175
451 Research commissioned report highlights:
Subscribe to Next in Tech
Presenters
ATTENDEES
Dan Thompson
Eric Hanselman
Melissa Incera
Presentation
Eric Hanselman
Welcome to Next in Tech, an S&P Global Market Intelligence podcast where the world of emerging tech lives. I'm your host, Eric Hanselman, Chief Analyst for Technology, Media and Telecom at S&P Global Market Intelligence. And today, we're going to be talking about the intersection of AI and datacenters. And joining me to talk about it, fresh off a series of webinars around this, are Melissa Incera and Dan Thompson, returning guests, both of you. So welcome back.
Dan Thompson
Yes. Thanks a lot.
Melissa Incera
Thanks, Eric. Great to be here.
Eric Hanselman
And it's great to have you back. Both of you have been on before talking about AI, datacenters and AI datacenters, but this is where the conversations we've been having really start to dig a little deeper into the machinations and impact of AI on datacenters and datacenter capabilities, and where all of this fits. I wanted to dive into this in a little more detail.
And I guess, I think one of the things that's worth digging into is what are we really talking about when we say AI. AI has become this moniker for all things machine learning, neural networking, generative AI, and it winds up just being applied to anything that's got any sort of reasoning or automation or all sorts of things. So I guess before we go too far down the road, what are we talking about when you start thinking about this?
Melissa Incera
I think that's an important thing to flesh out early on because broadly, yes, the trends that we're going to be talking about today and their impact on the datacenter and energy market encompass a lot of different things, right? We're talking about traditional AI. We're talking about neural networks. We're talking about cloud computing. But for the purposes of the conversation today, we're going to be putting heavy emphasis on generative AI, specifically because the hype around generative AI has been incredibly high.
It's being adopted at rates we've never seen before. And right now, it is kind of the pointy tip of the spear related to investment in all of those categories, right? It's the driver of renewed enthusiasm around traditional AI, around conversations about renovating and rearchitecting clouds. So that's one aspect. And then the second aspect is that generative AI is so uniquely computationally demanding. Generally, the thinking is that one generative AI query can require anywhere from 10 to 30x as much energy as a traditional one because these models are so complex. They often involve billions of parameters.
Training them requires not only vast amounts of processing power but very specific hardware, which I'm sure we've talked about on the podcast before. This translates into a very significant and specialized increase in datacenter infrastructure needs and, consequently, energy needs. So really, it's those two aspects that make generative AI the pointy tip of the spear here, right? It's driving investment in all of these categories broadly, and it is so unique in terms of what it needs to get into production.
Eric Hanselman
I mean, hype driving technology. I think that's never happened before. But I think that's the big issue we're facing: traditional AI, legacy AI -- these are approaches that haven't had, to your point, the same set of significant computational demands. And I'll extend that to talk not only about the computational demands, but infrastructure demands in general. Because of course, you're dealing with much larger volumes of data, so you have to have the storage environments and you've got to have the interconnect capability to move that data around. It's really having a much bigger impact on infrastructure even more broadly.
Now the thing that's really easy to measure is the computational impact, in that it needs a lot more horsepower. It needs some level of computational acceleration: GPUs, Tensor Cores, whatever it is. You've got to throw some hardware at it so that it's at least slightly more energy- and time-efficient. But there's a lot that goes into this.
And Dan, we've talked about some of those impacts and some of the challenges that starts to place on datacenter infrastructure -- this starts to really create a big push. But as you pointed out on a previous episode, it's creating that push for a resource that takes a long time to build.
Dan Thompson
Yes. And I think if I could take a swing at the question that Melissa just answered as well, from a datacenter perspective, we don't necessarily care what flavor of AI it is. What is impacting our community is really this need for GPU-based compute. And so whether it's generative AI or whether it's crypto mining or whether it's some other thing that leans on GPU-based compute, which is very, very power-hungry -- to get into the specifics you were talking about a minute ago -- that's what it is that we're addressing.
And so as long as folks are trying to shovel in tons of GPU-based, compute-type systems, this is going to be a challenge for our industry. But I think, to the point of where you were headed there, what's interesting for us is that all that extra computational power you guys are talking about results in power-hungry datacenters. These can need on the order of as much as 10x more power than what we've seen in the past on a rack-by-rack basis. Any time you say 10x anything, it's probably going to create a problem.
Eric Hanselman
Orders of magnitude have a tendency to do that.
Dan Thompson
Exactly. Exactly. And then to your point, I mean, these buildings don't just fly out of the ground. Some of the companies that are really, really good at this can pull one out of the ground in about six months. That's if they've already got plans in place and the paperwork is all done. But increasingly, it's that paperwork.
Eric Hanselman
Power is available in the local grid.
Dan Thompson
Yes. Exactly what I was about to say. So I mean, increasingly, that's the key challenge: the paperwork can take longer now. Paperwork meaning getting the surveys done from the power company to say, yes, I can deliver the many hundreds of megawatts or even the gigawatt scale that we're seeing now, and I can do that in a timely manner that's satisfactory to you, customer. Things like that are just taking much longer. And then throw in supply chain issues, which continue to be a problem for everybody, and this can drag out much longer. So yes, it's definitely kind of interesting times that we find ourselves in, and fascinating.
Eric Hanselman
Well, it relates back to your point, Melissa, about what really is different here. You're bringing up things like crypto mining, where we've had concerns about energy consumption in the past. But crypto mining was something that was the realm of a relatively small set of folks. It wasn't everybody and their friends all doing crypto mining everywhere.
When we get to generative AI, the use cases are broad and the implementations span a much larger swath of the technology landscape that's out there. There are just a lot more people laying hands on this, leveraging it and trying to put it to work and, as a result, creating this computational load. That's not inconsequential. And of course, that starts to open the question of how we're going to leverage this capability, both from the energy side, considering sustainability, and in terms of what role technology players have in this shift as well.
Melissa Incera
So one thing that emerged from the webinar that was a really interesting question: Dan was talking about the complexity of investing in these datacenters, and as you guys mentioned, they don't spring out of the ground in a day. So how do you invest in a cycle where we don't have a great picture of where demand will be, right? One of our webinar participants asked us, where is the inflection point going to be? Where do you predict it will be, and how far away from it do you think we are?
It's a really interesting question, and adoption within generative AI sort of provides some indication. It's a really relevant question because as adoption matures, the infrastructure demands -- and when I say infrastructure, I mean everything from networking and storage to computing power and energy -- shift in scale pretty dramatically. So thus far, we've seen a lot of focus on model training.
A lot of the AI workload, specific to generative AI, has been on the training side of the equation. And this is the most computationally demanding aspect of generative AI by far, right? This is where the specialized AI hardware offerings like GPUs and TPUs really make a difference, because training requires a massive amount of data to be processed and these chips are specifically designed for parallel processing. And this is where these very specialized datacenter build-outs are going to be incredibly valuable.
But as organizations actually start putting these things into production, it shifts the load slightly more in the direction of inferencing, which is where your end users are actually using the models and producing outputs. The good thing is that when it comes to inferencing workloads, these models can then be deployed on less demanding hardware. There's more room to optimize away inefficiency and generally focus on reducing operational costs and energy consumption.
Eric Hanselman
So once all the training is done, we're all set, right?
Melissa Incera
Well, not exactly.
Eric Hanselman
Not so much.
Melissa Incera
I mean we have a lot of data on training frequency, for instance. And most companies, most organizations, are training their models on a weekly basis or more frequently than that. So that workload is not going to decrease by any means. And we're adding this inferencing workload that is less computationally demanding.
It's easier to optimize, but it's incredibly high frequency. I actually saw this morning that back in February, they said ChatGPT was seeing more than 10 million queries per day. So as we're adding more inferencing workload, there's so much unfulfilled demand that will ultimately come online.
And it's really hard to say because we're at such early stages -- a lot of the top use cases have yet to be discovered or put into practice, and a lot of the world has yet to come online. It's hard to translate the demand we're seeing now into what ultimate demand will be and, from that, project out what we'll need from a datacenter perspective.
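To put some rough numbers on that point, here is a minimal back-of-envelope sketch combining the 10x to 30x per-query energy multiple mentioned earlier with the 10 million queries-per-day figure cited above. The 0.3 Wh baseline for a traditional search query is an assumed, commonly cited estimate, not a figure from this discussion, so treat the output as purely illustrative.

```python
# Illustrative only: daily energy for generative AI inference vs. traditional search.
# The 0.3 Wh baseline is an assumption; the 10x-30x multiple and the 10 million
# queries/day volume come from the discussion above.

BASELINE_WH_PER_QUERY = 0.3        # assumed energy of a traditional search query (Wh)
GENAI_MULTIPLE_LOW, GENAI_MULTIPLE_HIGH = 10, 30
QUERIES_PER_DAY = 10_000_000       # query volume cited in the conversation

def daily_energy_mwh(multiple: float) -> float:
    """Daily inference energy in megawatt-hours for the assumed per-query figures."""
    wh = BASELINE_WH_PER_QUERY * multiple * QUERIES_PER_DAY
    return wh / 1_000_000          # Wh -> MWh

low = daily_energy_mwh(GENAI_MULTIPLE_LOW)
high = daily_energy_mwh(GENAI_MULTIPLE_HIGH)
print(f"Estimated inference load: {low:.0f} to {high:.0f} MWh per day")
# Roughly 30 to 90 MWh per day at this volume -- before the unfulfilled demand
# the guests expect to come online.
```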
Dan Thompson
It's interesting, too, if you think about this. So ChatGPT -- if we just use that as the current use case. As Melissa was saying, 10 million queries, or whatever number you just threw out there, that is still with ChatGPT having guardrails on it. They are still limiting usage -- you can still get the "Come back later, we're busy" thing.
What happens when they take the guardrails off? What I'm saying is that it's currently artificially limited in its power consumption. So what happens when they take those guardrails off? Once they have all the infrastructure in place that could support, say, trillions of queries instead of billions, what happens then in terms of power consumption? You have to wonder whether some amount of queries just goes away because it was busy -- and then I forgot what I was going to ask ChatGPT, or I'll just come back later. So anyway, these are kind of interesting questions.
And then the follow-on to that, to Melissa's point where she was headed there, is the next ChatGPT -- the next thing that we haven't even thought of or heard of yet. That could be popular at the same time that ChatGPT is popular, in the same way that TikTok is super popular and Instagram is also still popular, right? All these things can happen in tandem and simultaneously, and folks are just using multiple platforms that are consuming loads of energy.
Eric Hanselman
And all of this continues to grow and expand. Well, and to your point, training is not one and done. It is an ongoing process. There is this shift to inferencing that's going to happen. And yes, maybe we can optimize better when we're doing inferencing and be able to manage at least some of that growth. But inferencing is also the point at which these models get distributed all over the place out to edges, however the edge winds up being defined, and there's a lot more of that. What I'm taking away from what you're both saying is that we're early enough that this is really hard to tell, but the one thing we can say is that there will be growth. We just don't know the extent, speed and location. Other than that, we're all set.
Dan Thompson
Let me jump on that, too, real quick, Eric. I want to make sure what Melissa said just a second ago was heard, and that is that we don't see training stopping. We just see the amount of power, or the amount of compute, required for inferencing eventually becoming greater than what is required for training. And so what that means, if we think about this from a datacenter perspective and from a power consumption perspective, is that what we've seen so far has mostly been training, as she was saying.
And so all of these new asks from datacenter companies and the hyperscalers for additional datacenter space, and all these crazy datacenters that have been coming online that are hundreds and hundreds of megawatts -- that has so far mostly been for training. If we assume that what she's saying is correct, that doesn't go away; people continue to train their models on a daily and weekly basis.
But now we start doubling down on inference, and we start implementing these systems. It's true that that can be more distributed, exactly as Melissa was saying, but NVIDIA has done a wonderful job making sure that the exact same systems that can do training can also do inference. You just need fewer of them. But they can still consume a lot of power. And so if we consider that you can maybe fill up one rack that is 100 kilowatts -- or maybe we spread that across a couple of other racks, but we're talking 100-kilowatt chunks -- if we start dropping those in all over the place, it looks a little bit different than what we've seen so far.
So, so far, you've seen giant asks in very specific locations. What this could equal out to for inference is many asks across the nation that are increasingly bigger than they have been historically. In some ways, this is good because it kind of spreads the love around. In some ways, this may create more challenges. Let's say, for instance, everybody gets excited about inference and they say, "Yes, we need more power in New York City," which is already constrained, or "We need more power in downtown LA," because it's servicing those populations, right? It's the ChatGPT version of whatever the closest node is to the end user, and that now needs more power.
Well, 100 kilowatts there, 100 kilowatts here -- maybe that doesn't add up to much. But if these applications get super, super popular, and now we're talking about needing megawatts or tens of megawatts or hundreds of megawatts, that can become an issue in these downtown areas or close-to-people areas, whereas the conversation of late has been that these things can really be anywhere. And so put it in North Dakota, which is what we're seeing right now. Okay, great. That's not a problem. But when you get to the inference side of things, at least the current thinking is it's not going to live in North Dakota. It's going to need to be in Manhattan or downtown LA to really service that population.
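To make that scaling point concrete, here is a minimal sketch of the arithmetic. Only the 100 kW-per-rack figure comes from the conversation; the rack counts and site counts below are invented purely for illustration.

```python
# Hypothetical illustration of how 100 kW inference "chunks" add up in one metro.
# The number of racks per site and sites per metro are assumptions, not data
# from the discussion.

KW_PER_RACK = 100  # per-rack power figure cited in the conversation

def metro_demand_mw(racks_per_site: int, sites: int) -> float:
    """Total inference power demand for one metro, in megawatts."""
    return KW_PER_RACK * racks_per_site * sites / 1000  # kW -> MW

# A single 100 kW rack dropped into an existing facility is a modest ask...
print(metro_demand_mw(racks_per_site=1, sites=1))    # 0.1 MW

# ...but a popular service replicated across, say, 20 sites with 10 racks each
# becomes a utility-scale request in a grid-constrained downtown.
print(metro_demand_mw(racks_per_site=10, sites=20))  # 20.0 MW
```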
Eric Hanselman
So if we think about this increasing level of demand, the big question, of course, is: are all the people who are building all this stuff actually getting paid for it? Is it worthwhile? And realistically, who is going to pay for this? Because right now, you talk about the guardrails around OpenAI -- on ChatGPT. They're doing that to manage utilization. But do we get to a point at which there actually is sufficient money flow to make this make sense? And as we've talked about previously, in terms of where the investments are in all of this, there's a lot of money pouring into the various odds and ends of the first-order AI organizations that are out there. Is this sustainable? Is it making sense? So where does this all shake out?
Dan Thompson
If we start at the very base level, people are getting paid. So if you come to me and you want me to build you a datacenter several hundred megawatts' worth, those contracts are all pre-signed. I guess what I'm trying to say is we're not seeing loads of datacenters being built speculatively. They're being built with customers in tow. And not only is this facility spoken for, the next three after it are also spoken for. So these are all locked up in contracts, at least 10-year contracts.
We're talking about hundreds of megawatts' worth of energy, and we're increasingly hearing now about 15-year contracts. So at the level that we're seeing folks operate, people are getting paid. And it's companies like Microsoft, Google and Amazon who are the ones paying these bills. And so from our perspective, life is good. Money is flowing, stuff is coming out of the ground, people are getting paid. So on the infrastructure side of things, things are good.
I guess there's the common saying -- in the real estate world anyway -- that the land seller always gets paid and the realtor always gets paid. And that's kind of where we are in terms of who's paying and whether people are getting paid. How it's being financed is a separate question, and it's an interesting one because these are really big numbers that are getting thrown around. And so you have folks like Microsoft, Google, Amazon, Facebook and so on, who are really leaning on their revenues, their credit, things like this.
And similarly, all the way down to the datacenter world, where you've got large-scale companies like CyrusOne, Digital Realty, Vantage -- a whole slew of companies that are doing this -- and they're approaching it from a finance perspective in many different ways. And of course, the financiers are coming to them and saying, okay, well, this is a huge amount of money, how are you going to pay for this? And their answer is simply, well, we've got Microsoft as a customer, or we've got Facebook as a customer. And today, that's getting it done. So that's how it's happening in my world. Melissa?
Melissa Incera
Yes. I mean, similarly -- you touched on Microsoft and all these hyperscalers. So we did an analysis of the hyperscalers' earnings calls last year, because there's been so much hype and such high expectations for what gen AI is going to translate into for them. Specifically, we were interested in whether they are generating returns from both hardware and software.
So I think, high level, these guys are paying astronomical amounts in capital expenditures right now to build out, as Dan alluded to, the critical infrastructure for AI and cloud, with datacenter expansion being perhaps the main component of that. Microsoft had the most detail in their earnings calls, where they said that they are already coming up against cloud capacity limitations, so that helps to validate some of the spending. But I mean, these numbers are astronomical.
Alphabet is projected to spend $50 billion this year, up from $32 billion last year. Microsoft spent $14 billion in CapEx in Q1 alone, which is a full quarter of its quarterly revenues. So we were seeking to understand where the return is happening. The really clear trend is that if there's ROI being delivered, it is on the infrastructure side, in the AI-enabling segments of this market. There was some cloud lift for Microsoft, there was some cloud lift for Google, but right now it's just not mapping to the degree of expenditure that we're seeing them lay out.
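For readers who want to check the arithmetic behind those figures, here is a minimal sketch that only rearranges the numbers quoted above; the implied quarterly revenue is derived from the "quarter of quarterly revenues" comment, not a reported figure.

```python
# Back-of-envelope on the CapEx figures quoted in the conversation -- no outside
# data, just rearranging the numbers cited above.

alphabet_capex_2024 = 50.0       # $B, projected, per the discussion
alphabet_capex_2023 = 32.0       # $B, prior year, per the discussion
msft_q1_capex = 14.0             # $B, Q1 CapEx, per the discussion
msft_capex_revenue_share = 0.25  # "a full quarter of its quarterly revenues"

yoy_growth = (alphabet_capex_2024 - alphabet_capex_2023) / alphabet_capex_2023
implied_msft_q1_revenue = msft_q1_capex / msft_capex_revenue_share

print(f"Alphabet CapEx growth, year over year: {yoy_growth:.0%}")                # ~56%
print(f"Implied Microsoft quarterly revenue: ~${implied_msft_q1_revenue:.0f}B")  # ~$56B
```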
Eric Hanselman
Yes, because those are eye-watering numbers, just like...
Melissa Incera
Yes, it's insane. And then, of course, generally with generative AI, we've talked on the podcast before about the astronomical fund raises that we're seeing these start-ups announce, and most of that is going to all these cloud computing and infrastructure costs. So...
Eric Hanselman
You're not a real gen AI company if you don't have a few billion in your pocket. That's like...
Melissa Incera
Yes, it's interesting times. So actually, Dan and I were having this separate conversation, and you and I had this conversation at an event recently, talking about CoreWeave, right, and how this company -- they focus on building out GPU clouds specifically for AI, and they're getting into some really interesting asset-backed financing. Basically, they're collateralizing their NVIDIA chips to buy more NVIDIA chips, to the tune of several billion. I think their last figure was something like $5 billion. Yes.
Eric Hanselman
Oh, yes. And there is that whole chunk of the market, the specialized facility providers -- alternatives to the hyperscalers: CoreWeave, Vultr, a whole set of folks who are out there doing that. But there are some interesting financing structures being built, shall we say, to make this possible.
Melissa Incera
Exactly. Yes.
Eric Hanselman
Wow. And it kind of gets back to this question of, okay, we've talked about the hyperscalers, we've talked about the start-ups that are leveraging this. But there's also a big piece of this that is getting out into the tech industry in general because, of course, all of the big technology hardware manufacturers out there are looking to ride on some of these coattails.
They're looking at being able to stack up private datacenters with a lot of this stuff. I guess, where does this shake out? Where do you see some of this focus happening? And to what extent do organizations need to totally revamp the infrastructure that they've got -- are we heading toward a shift that's going to require that sort of massive change to existing IT infrastructure?
Dan Thompson
Yes. I mean, I can take that one first. Interestingly, if we look at what it takes to run these GPU-based systems today, folks are able to take existing datacenters and maybe not use the physical space quite as efficiently -- for the sake of not being able to cool these things very well, you can use a larger amount of space to help with some of that. As we move into the future, though, with this next generation of systems from NVIDIA, that's going to stop being a thing. Meaning you're eventually going to get to a place where you have to have liquid cooling in order to cool these things, which is a totally different approach to cooling a datacenter than what we've done for however many decades datacenters have been a thing.
And so there is a coming shift where you're going to have to make some changes within your datacenter to make all this work. Interestingly, we've asked folks, are you planning to retrofit your datacenters to improve efficiency, to improve sustainability? And you have to imagine, even though we didn't ask this, that they're also doing it to accommodate new technologies. I got back such a crazy number of folks who said that they were doing this that I was actually a bit skeptical.
I was actually kind of scared to say this number out loud, until I started talking to some of our peers and some of our friends at what would probably be considered competing organizations, just to ask, are you guys also seeing this? Or did something weird happen with my data there? And across the board, everyone said, yes, we're seeing loads of people saying they're retrofitting and upgrading their existing datacenters.
Our surveys show that just more than 50% of companies who responded said they're either currently working to retrofit their datacenters or they are making plans to do so in the next two years. Now again, this question was asked in the context of whether you are doing this for the sake of efficiency and sustainability, but we have to imagine they're also thinking about the net new IT infrastructure that's going in, and that the datacenter needs to be retrofitted for that as well.
That's going to be the next iteration of our survey, now that we've seen these crazy results. By the way, we've asked this question for years, and folks just haven't been modifying their datacenters -- retrofitting just hasn't had a good ROI in previous years. And then suddenly, there's been a renewed interest in doing this. So that's one side of the coin.
The other side of the coin is you've got folks like the CEO of Dell and the CEO of NVIDIA together saying everybody's going to retrofit their datacenters -- you need to throw everything away and start over -- which, of course, it behooves them to say because they get to sell more things. But if you think about...
Eric Hanselman
Hey, if you can make the case. And they're not the only ones. It's not just Dell. Jensen Huang is out there everywhere saying you need more GPUs. And AMD is doing the same thing, and will be in the same situation -- they hope -- in the not-too-distant future. But it's Dell, it's HPE, it's everybody who's got a server to sell.
Dan Thompson
The interesting thing about it is, in some sense, maybe that's true. If your plan is to roll out large-scale, AI-type tools within your datacenter and you're going to host them yourself rather than relying on a hosted-type platform, then yes, that's true. You absolutely do need to buy new gear.
There are things like refresh cycles that have been happening since the dawn of IT, where companies are constantly throwing out their existing stuff for the sake of new things. But there is a bit of reality here in that your directory service is not going to run on GPUs, right? Your mail servers, if you still host those internally -- those are not going to run on GPU-based compute. So there is a whole swath of workloads that are going to continue to run on CPU-based compute.
They're going to continue to work, just as they always have, on infrastructure that looks like what you have today. You may be changing it, maybe upgrading it, but at no time in the future do we really see all that other infrastructure requiring the kind of cooling and power that we see for GPU-based compute. So keep that in mind. There is definitely some change happening, and change that we see coming -- maybe just not as crazy as throwing everything away.
Melissa Incera
Dan, we have some data that corroborates what your survey was saying, and I'll share it specifically. This is from our AI and machine learning infrastructure survey from last year. Overwhelmingly, the data shows that infrastructure limitations were a huge bottleneck for companies pursuing AI projects. In fact, infrastructure limitations came out as the biggest reason for AI project abandonment in last year's survey. So I will make the point that organizations are facing real problems here, and it's translating.
There's a direct correlation between that and spend. The survey also showed that 9 in 10 of our respondents plan to increase spending on AI infrastructure in the next 12 months. So I agree, right, not everything AI-related requires a GPU, but I do think there's certainly a significant infrastructure upgrade cycle happening as a result of businesses needing to accommodate AI.
Eric Hanselman
So -- but you just go to the cloud, right? No problem. Well, of course, the problem winds up being that there is also this concern about where your data is hosted and who has access to it. You've now got things like the U.S. federal government starting to weigh in, in terms of who should be wary. You've got a whole set of vendors who, of course, are piling on with ideas around private AI -- you should make sure that you own your AI and don't expose it to any of the other folks out there. I mean, holy cow, the back and forth in all of this is not inconsequential.
Melissa Incera
Yes, absolutely. I was going to say, Eric, that our recent infrastructure survey actually reveals that if you ask organizations where they're looking to add net new AI workloads, most of those net new venues are not public cloud. There's a lot of exploration happening within private clouds and within other types of datacenters that offer proximity, which helps with the latency side of the equation, but also the privacy and data security that you're mentioning, which is becoming more...
Eric Hanselman
And we see this also in another commissioned study that we've done with a client just recently. Yes. So maybe, in fact, you do have to replace a whole bunch of your servers. Wow, a lot of things to consider in this entire question, and I guess more conversations to be had around this. Thank you both for all the perspectives. We are, of course, at time at this point, but man, so much more to talk about.
Dan Thompson
I feel like we just get warmed up and then it's time to go.
Eric Hanselman
Well, we'll have to follow up with an extended discussion on this. So hopefully, I'll get a chance to get you both back to carry this discussion further. But as I said, we are at time. Thank you both for being on today.
Melissa Incera
Thanks for having us.
Eric Hanselman
All right. Thank you. That is it for this episode of Next in Tech. Thanks for staying with us. And thanks to our production team, including Sophie Carr, [ Brenmae Atashian ], Gary Susman on the Marketing and Events teams and our agency partner, the 199. I hope you'll join us for our next episode where we're going to be talking about a different aspect of cloud, some perspectives about actually measuring cloud capabilities, our Cloud Pricing Index. Hope you'll join me then because there is always something Next in Tech.