Does more data give you better decisions? That’s open for debate, but the quality of the data you use can have an impact on getting to decisions faster. Justine Iverson, head of S&P’s Global Marketplace, explores the different aspects of working with data sets with host Eric Hanselman. Tools with higher levels of abstraction can put data-driven decision-making in the hands of more people and power a digitized business. Will the fridge of the future be adorned with data plots from school? Maybe.
Transcript provided by Kensho.
Eric Hanselman
Welcome to Next in Tech, an S&P Global Market Intelligence podcast where the world of emerging tech lives.
I'm your host, Eric Hanselman, principal research analyst for the 451 Research arm of S&P Global Market Intelligence.
And today, we will be discussing data and its distribution, the latest in our data theme series of podcasts. If you caught our last episode, we were talking about data security with Justin Lam. And joining us today is Justine Iverson, the head of S&P's Global Marketplace.
Justine, welcome to the podcast.
Justine Iverson
Thanks for having me, Eric. Super excited to be here.
Eric Hanselman
Well, I'm excited to have you here because we've talked about a number of different aspects of data, and it's something that, on the 451 Research side, in our most recent artificial intelligence and machine learning study, the greatest impediment, the [indiscernible] nontechnical impediment after things like budget and skills gaps, was data access and integration. And it's one of those challenges that I think we keep seeing over and over again in terms of effectively moving better decision-making projects forward and really taking advantage of the benefits of AI and ML. And I'm guessing that this is probably something that you've seen in decision-making environments as well.
Justine Iverson
Absolutely, Eric. I think you're spot on there. A stat you often hear is that 95% of the world's data was created in just the past 5 years, so you can imagine the explosion of data that's now available. And that's super exciting, right? People think more data equals more insight and more information, but to your point, there's a huge challenge in being able to connect that data and bring it into the systems or portals or technology tools they're using to actually uncover insights from it.
A really interesting comparison we think about is, when you buy a house or you buy a car, you don't just look at one data point, right? You look at the school district and the taxes maybe and the number of bedrooms and the yard size, and that's how you make your decision. And it's the same thing in the market for our clients. They need various data points to be able to make those decisions. And without all of that connected and integrated in a single way, it's really, really challenging to do that, so that's definitely a big pain point we see.
And I think I can even speak to an example we've seen here at S&P Global. I think you've had the Kensho team on the podcast before and...
Eric Hanselman
We did, yes.
Justine Iverson
Yes. They're an amazing group. We acquired them in 2018. And they are some of the most skilled AI and machine learning engineers that I've ever worked with, but one of their challenges was they didn't have the data to train their models, right? But bringing them together, marrying them with S&P Global, we've been able to create a ton of solutions that solve our internal challenges but also challenges for our clients in this space.
Eric Hanselman
Well, you raise a really important point, which is that it's not only access to individual data sets. It's the breadth of data that you've got to be able to bring together to ensure that you've actually got the scope of all the different factors that may be influencing the eventual outcome you're trying to gain advantage from, the model you're trying to build and the sources you're trying to integrate.
Justine Iverson
100%. And I think, as we think about that, the cloud has changed this even more, right? Cloud has made it easier to access data, but it's also created a greater dispersion in data quality and in what's available to clients to make those decisions.
Eric Hanselman
Yes. This is something that we talk a lot about on the 451 side, in a lot of the analytics work. Nick Patience has spent a lot of time talking about how you really make sense of data and a lot of that ingest and data prep, ensuring that you've got data sets in good shape, able to represent what you're actually trying to extract from the data itself. There's so much upfront work that that in itself is a challenge.
I mean, especially when you think about all of the different end consumers of that data, the different teams that are all looking to put it to work, there's a big challenge in simply dealing with all the different types of tools teams are using. You've got some people who want to write Python scripts, you've got other people working in R, all these different aspects. I mean, what can organizations do to ensure that they're getting the right data in the right forms and on the right platforms?
Justine Iverson
Yes. It's a great question, and I think there are a few things. One, understanding what you're getting upfront from a data perspective is really important: that quality, that completeness, the ability to link it and use it with other data sets. But understanding what your team needs to be able to make sense of that data is also important, right? And as you said, there's a ton of different tools. And I think what's been really fascinating to see over the past couple of years is the interconnectivity of these tools, right?
A lot of the cloud providers today have direct connections to Python and R and Tableau and all of these tools, so I think it's about looking at your tech stack from end to end and understanding what makes sense based on your users and what problem you're trying to solve, right? Are you trying to back-test a model, or are you trying to build a dashboard for your internal executive team? What are you trying to do? I think understanding that and then backing into your tech stack and equipping your team with the skills for that will really set you up for success.
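To make that workflow concrete, here is a minimal sketch, not from the episode, of pulling a slice of curated data from a cloud warehouse into Python via SQLAlchemy; the connection string, schema, table and column names are hypothetical placeholders, and the same source could just as easily feed R or Tableau.

```python
# A minimal sketch, assuming a cloud warehouse reachable through SQLAlchemy.
# The connection string, table and column names are hypothetical.
import pandas as pd
from sqlalchemy import create_engine

# In practice this would be a real Snowflake/BigQuery/Redshift/Postgres URL.
engine = create_engine("postgresql://analyst:secret@warehouse.example.com:5432/markets")

# Ask only for the columns and date range the question needs, rather than
# ingesting everything and filtering locally.
query = """
    SELECT trade_date, ticker, close_price
    FROM curated.daily_prices
    WHERE trade_date >= '2020-01-01'
"""
prices = pd.read_sql(query, engine)

# The same frame can then feed a back-test or a dashboard tool.
daily_returns = (
    prices.sort_values("trade_date")
          .groupby("ticker")["close_price"]
          .pct_change()
)
```

Whether the output goes into a back-test or an executive dashboard, the ingest step stays the same; only the tooling on top changes.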
Eric Hanselman
Well -- and that's an interesting angle when you think about what you're actually trying to deliver. I think there's often so much focus on the data itself, but really ensuring that you're fit for purpose in terms of what you're trying to achieve matters. I mean, is this something where you're trying to get real-time data, [ so you're ] identifying, like, for a dashboard? Is it decision support? There are so many different ways in which we can put data to work. It's important to consider that you've got the means to support [ what are ] a lot of different end products for this as well.
Justine Iverson
Definitely. As you think about data, I think sometimes people say data is going to be the magic bullet, right? "That's going to just give me that edge to get ahead." And while we are firm believers in that, the most important point is to start with what you're trying to solve. What problem are you trying to solve? What question are you trying to answer? Then approach it that way versus trying to make your data work for you. I think, yes, data does work for you, but understanding where you're trying to go with it makes a huge difference in the effectiveness of this.
I think a common misconception we hear, and yes, we are a data provider, is that more data equals more insights. And I actually challenge that view a bit. I think more data can equal greater insights if you can efficiently and effectively garner insights from that data, so I think being able to, again, connect it and tell that bigger story is where you really eke out that differentiation.
Eric Hanselman
So it's important to expand your perspective and understand what the full set of possibilities are, but don't go overboard because you need to actually ensure that you can integrate these various data sources in ways that are actually going to be useful.
Justine Iverson
Exactly, and give you that speed and time to that decision, right? You can spend all this time trying to gain insight from 1,000 data sets. Or you can know you're using the curated, quality data of 5 to 10 data sets and get faster and actually greater insight out of that.
Eric Hanselman
Well, that's actually something we've got an upcoming episode on with the credit and risk analytics team. And that's one of those areas where we're going to dig into some of the practical pieces: How many data sets actually make sense? How do you really build a system?
I mean I think practically, if we look at what's been going on for a lot of our key data users, the one that keeps coming to mind is a lot of the decision analytics that are taking place around dealmaking and a lot of the valuations of all of this frothy market that exists today. I mean [indiscernible] volumes have been at crazy levels. What should people be doing for practical things like this to be able to up their game?
Justine Iverson
From my view, I think it again falls back to what problem you are trying to solve or what you are trying to uncover. And from there, it's finding those data sets that will save you that time on the back end, right? So are you using data providers or data that is clean, that is structured, that is linked, that is delivered in a way that you can bring it into your systems [ and portals ] without spending all of your resources simply ingesting data?
I actually think it was one of the 451 Research studies done at the end of last year that stated 50% to 60% of data analysts' or data scientists' time is simply spent searching for and ingesting data and prepping data before they even get to the analysis. So if you think about that, imagine having a whole team of highly skilled data scientists. And half their time is just spent trying to get it in and figure out what to do with it, so you can imagine, if you can equip those teams properly, I think you will really see an edge on that front.
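As an illustration of where that 50% to 60% tends to go, here is a minimal, hypothetical sketch of the routine cleanup a raw vendor file often needs before any analysis can start; the file name and column names are invented for the example.

```python
# A hypothetical example of pre-analysis cleanup on a raw vendor CSV.
import pandas as pd

raw = pd.read_csv("vendor_extract.csv")  # hypothetical raw delivery

# Typical upfront chores before any analysis can start:
raw.columns = [c.strip().lower().replace(" ", "_") for c in raw.columns]  # normalize headers
raw["report_date"] = pd.to_datetime(raw["report_date"], errors="coerce")  # parse mixed date formats
raw["revenue"] = pd.to_numeric(raw["revenue"], errors="coerce")           # coerce bad numeric values
clean = (
    raw.dropna(subset=["report_date", "entity_id"])  # drop unusable rows
       .drop_duplicates(subset="entity_id")          # de-duplicate entities
)

# Only after all of this does the actual analysis begin; curated, pre-linked
# data pushes most of these steps back onto the provider.
```

The point of the sketch is simply that every one of those lines is time spent before a single insight appears.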
Eric Hanselman
That is one of Nick Patience's favorite statistics: you've got to invest to build infrastructure and to integrate data sources to ensure that the upfront work isn't the thing that takes 50%, 60% or a greater percentage of that time before you actually get to real results. And that's the aspect to think about when you ask where you need to invest. Where are those efforts really best placed? It's upfront.
It's like the old Abraham Lincoln line: if I had 4 hours to cut down a tree, I'd spend the first 3 hours sharpening my axe. It's making sure that your tools are in good shape, that the environment in which you're trying to get work done is at the ready when you actually come to the point where you can get work done. And by working with capabilities and data sources that have done that work for you, hopefully, you can get the answers all that much faster.
Justine Iverson
Couldn't agree with you more on that one, Eric.
Eric Hanselman
So we've heard about lots of different data sets, lots of different perspectives. We've talked about expanding your perspective, but with all of these various data sets that are available, how should organizations really think about managing the process of selection and integration?
Justine Iverson
Yes, I'm going to be a little bit biased in my answer here. As you said at the start, I am the head of the S&P Global Marketplace, and I think marketplaces and tools like that have completely changed the game. Historically, data purchasing, especially in a bulk delivery way, right, via FTP, an API or the cloud, was a very hands-on, intensive commercial process: a lot of back and forth on questions, understanding history, how the data could link, how you could use it and the geographical coverage. All of these things that are really, really important in the data-buying process are now at your fingertips. And I think that completely changes the game as we think about speed to market and speed to being able to make that purchasing decision, so I think, one, that is a great tool.
I also think, as we think about integration, it goes back to what you were saying: investing in the upfront part is really important, right? There's been a huge migration from on premises to the cloud and all of these things, so making sure you are set up to manage that is really important for your team.
So something we've seen: there's a historical bias, or historical reality, of data silos within organizations. In certain organizations, that makes sense; there have to be. But in a lot of organizations, there don't have to be, so how can you break those down internally? How can you make it easier for everyone across your organization to access that data when you are bringing it in?
And last but not least, again, I think it's really important where you're getting your data from. You have your own internal data, but for any external data, what's the source? What's the support model for that data, right? As you're using all these data sets, you're not going to understand every nuance of them, so is there a strong support model? Is there a quality program to ensure that the data is up to snuff and can meet your needs? I think those are all considerations that can really help with some of the challenges associated with utilizing a lot of different data sets.
Eric Hanselman
You have touched on one of those things that -- I know podcast listeners keep hearing me say this over and over and over again, but one of the most powerful things about tech is that we've got the ability to raise the level of abstraction with which we deal with problems. And to get away from the very low level of "I'm just going to get a blob of data and expect that I have to do all of the upfront cleaning, rationalization and processing, all those pieces," and move to an advantage where you actually have a platform where you can do the ingest at a high level and you know where that data is coming from.
You have some confidence that it's already in a condition where you can put it directly to work. It's the power of that abstraction that lets you raise your game and, more importantly, deal with scale. And when we think about the levels of market activity that are taking place now and the speed with which all sorts of decisions have to get made, organizations have to get to a point at which they're able to operate at greater scale without having to scale up their organizations. There are options here that they can put to work and really ensure that they can raise their game in ways that make a lot of sense for their teams as well.
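One way to picture that higher level of abstraction: instead of downloading raw files and cleaning them yourself, the analysis becomes a single declarative query against a data set that has already been cleaned and linked upstream. This is a minimal sketch only; DuckDB is used here purely as an example of such a tool (it is not mentioned in the episode), and the file and column names are hypothetical.

```python
# A hypothetical sketch of working at a higher level of abstraction: the data
# set is assumed to be already cleaned, de-duplicated and linked upstream.
import duckdb

top_sectors = duckdb.sql("""
    SELECT sector, AVG(revenue_growth) AS avg_growth
    FROM 'curated_fundamentals.parquet'   -- hypothetical curated extract
    WHERE fiscal_year = 2021
    GROUP BY sector
    ORDER BY avg_growth DESC
    LIMIT 5
""").df()

print(top_sectors)
```

The cleanup chores from the earlier sketch simply are not there; they have moved upstream to whoever curates the data.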
Justine Iverson
You bring up a good point. I was at a conference last week, and there was a presentation on the fintech space. And there was a stat that there are 31,000 fintechs globally, so think about that. There are 31,000 companies that can help you make this easier, right? And they cover a big range, right?
There are some that are providing very technical tools for very niche workflows, and there are some that are providing solutions in the low-code, no-code area. They really cover the whole gamut and the whole spectrum there, so I think being smart in your partnerships, being smart with your talent and being smart with your tech stack can really set you ahead in all of these arenas.
Eric Hanselman
And low code and no code are yet another abstraction that allows us to move those decision-making tools out to a much broader audience. So in fact, you can start to put those sorts of decision-making tools directly into the hands of the people who are actually wrestling with the problems, as opposed to having to find the appropriate data teams, which are already in short supply, and, once again, raise your game by raising the level of abstraction.
Justine Iverson
Yes, 100%.
Eric Hanselman
So what should organizations be planning for? We've come from a world in which it was a novel thing to be able to [ suck ] a bunch of data into a Jupyter Notebook, with a lot of sort of cool, whizbang pieces, but now we've got a world in which there are platforms and capabilities, markets of data, that can help them understand what they can do in new and better ways. How should they be thinking about putting this together, with an eye to the future? Where do you think this is going?
Justine Iverson
Yes. I think the first area is really planning for that next generation of employees. I think, as we look at the current workforce and its technical aptitude, there's a range. There's been a lot of upskilling happening and an ability to kind of adopt these tools, use these tools and use this data, but that next generation, they are all going to know how to do it, right? I know middle school and high school kids that are probably more proficient coders than myself. And so that is very, very interesting as you think about your organization and how you're preparing it for the next generation, because everyone is going to have access to that talent.
So I think that's one thing: preparing for it. And again, that goes back to your tech stack, to your data quality, to your data silos or, hopefully, the lack of those data silos where you can, to make access available. I also again think that fintech space is really interesting.
I think there's a ton of opportunity for partnerships or collaboration and using these tools to set your business forward because, again at the end of the day, competitive pressures are going to continue. No matter what space you work in, markets are moving faster. Decisions are being made faster. There's more information to make better decisions, so any way that you can differentiate and utilize data and these tools to do it, you're going to be a step ahead.
Eric Hanselman
It's making ourselves efficient enough to be able to deal with scale and maybe counting on the idea that kids start coming home from first grade with some really cool-looking Tableau plots.
Justine Iverson
It's amazing, what I see happening at a younger and younger age. It is amazing, yes. It will be a different image hanging on the [ fridge ] of the future, I think.
Eric Hanselman
Well, we'll see whether or not that pans out, but you raised some really important points about being able to put the investments in place to be able to do decision-making at scale. And it's all of those pieces that have to come together in order to really make this possible, so very important points.
Well, thank you, Justine. This has been great.
Justine Iverson
Yes. Thank you so much for having me. I enjoyed it. I love talking about data and bringing it all together: linking it, formatting it, structuring it and delivering it in a way that's easy. It's near and dear to my heart and to that of my team, so thanks so much for having me.
Eric Hanselman
Well -- and critical to being a data-based business. We talk in so many different areas about digitization. And hey, guess what, it's all about the data.
Many thanks.
Well, that is it for this episode of Next in Tech. Thanks to our audience for staying with us. And lest anyone think that I'm just out here making all this stuff sound great by myself, I want to thank our production team. It includes [ Caroline Wright; Caterina Iacoviello; Ethan Zimman ] on the marketing events teams; and our studio lead, [ Kyle Cangialosi ]. Thanks to all of you for making the podcast what it is.
I hope that you will join us for our next episode, where we will be talking about consumer ESG with Sheryl Kingstone. Hope you'll join us then because there is always something next in tech.
No content (including ratings, credit-related analyses and data, valuations, model, software or other application or output therefrom) or any part thereof (Content) may be modified, reverse engineered, reproduced or distributed in any form by any means, or stored in a database or retrieval system, without the prior written permission of Standard & Poor's Financial Services LLC or its affiliates (collectively, S&P).