Interview: Charles Sansbury, CEO, Cloudera

insideAI News caught up with Charles Sansbury, CEO of Cloudera, at the company’s EVOLVE24 event on October 10, 2024 in New York City to gain some insights into how Cloudera is embracing machine learning and data science, as well as the company’s work environment and employees. Also discussed was news announced at the event, such as the partnership with Snowflake.

insideAI News: I know you have a number of partnerships that you’ve announced today, but Snowflake was the one that caught our eye. Congratulations! You’re bringing on some great partners, and it’s just very impressive. Can you speak a little bit about the core capability of this partnership, and what it’s going to provide for your customers?

Charles Sansbury: Sure. So what we found is a lot of our customers run Snowflake for their web-based analytics. They like the user interface. They like the compute engine, the storage. The way storage works in Snowflake has proven to be, economically, not the best option. So we had a number of customers come to us and come to them and say, “Hey, can you guys work together where we’d like to use information stored within Cloudera?”

As I said in my keynote, we probably have more business-critical information stored in Cloudera than any other system. And I would leave my information stored in Cloudera. I don’t want to use the web and pay web-based storage costs, but I’d like it available to the Snowflake compute engine because I’ve got a bunch of business users, not really even data scientists, but business users who love the interface, who are proficient.

I’d like to give them more access to data, but I don’t want to have the surprise of storage costs that I’ve seen in the past, and so we looked at that and from their perspective it had to be a win-win. So from their perspective, they’re a little bit concerned, because although storage revenue is a very small component of their overall revenue, it’s not zero. But they’ve said publicly that they are embracing structures like this, and embracing an open table format called Iceberg, which allows people to store data in places other than the cloud, but then use the cloud-based Snowflake analytics engine and reporting capabilities. And so we said, “Well, look, that makes a ton of sense.”

So driven by customer needs, we started working together and developed kind of a first-class integration. Basically, the products work together without having to do a bunch of integration work each time and make it custom. It allows us to maintain data stored in the Cloudera data platform, with all the scalability, security and cost of ownership … allows us to offer to our customers, many of whom are Snowflake customers, access to that user interface and that cloud-based analytics … allows Snowflake to expose more of the data that’s in Cloudera to its analytics engine, to its user interface. So the goal, if it works out, is we have more customers buying this integration from us. We are having data that’s resident in Cloudera be processed in the Snowflake engine. From Snowflake’s perspective, they were already seeing the storage-based revenue go away, which is a low single-digit part of their revenue, and that’s replaced with a lot more access to data now stored in Cloudera upon which they can run analytics.

So when viewed from that perspective, our offering and their offering are complementary to one another, and so that’s kind of how it came to be. It was driven originally by some customers asking us specifically if we could do this. Then we reached out to them, and got to a point where we agreed that it mutually made sense. So again, it was customer driven, and it could be exciting for both companies.

insideAI News: I love that it is customer driven, that you go out to the clients and ask … tell me your pain points. It sounds like this opportunity really opened some floodgates for both of you.

Charles Sansbury: Yes, it’s pretty brand new. We had built some custom integrations beforehand for a couple of the early customers that were requesting this. So we know that it’s a real need, and we know that it provides real value to end customers.

insideAI News: So on to machine learning. Our Editor-in-Chief & Resident Data Scientist Daniel Gutierrez was curious about Cloudera’s work in machine learning, how it’s useful for data scientists, and how it stays ahead of the rapidly evolving demands of AI?

Charles Sansbury: AI is effectively machine learning with better models and faster processing capability. The Cloudera data platform is effectively an enterprise data platform that brings together data from a whole bunch of different silos, different places, different systems, and through some data engineering, establishes kind of a common format and look-and-feel across the data, then makes that data available to feed and train models, historically ML and increasingly AI-based models.

Then we provide companies with both that data lakehouse as well as the operational tools, increasingly, to manage their selection, deployment, and evaluation of LLMs running on their data in their lakehouse, and allow the data scientists to develop hypotheses about business data, pull down the model of their choice, whether that’s an open source model that exists in a repository, like a Hugging Face or one of the proprietary models from … I think today we have Anthropic, Cohere, and some others. I don’t know the specifics of why some models are better with quantitative versus word-based problems, but we give you the tools to both set up that lakehouse, and also manage the deployment, evaluate the performance of different models on your data set, and allow you to put models into production that have the highest results, and allow you to evaluate performance and drift of those models as they’re in production.

So what we’re trying to do is to provide people with the operational infrastructure, to provide the tools to allow them to go pursue AI, to enable the data scientists with a combination of the creation of this data repository, and the management tools for managing all these models to empower them.

We are not building large language models. We are providing the infrastructure to pull them down, deploy them, and put them back, and we try to focus a lot on usability, ease of use, more graphical user interfaces, and increasingly, actually, voice-based prompts through AI within our product that allow the data scientists to ask questions of the system, have those translated into queries, and have the queries run against the data. You can then evaluate the performance and either say, yes this makes sense or this is a bad plan.

The other thing we’ve done is we’ve pre-packaged certain language models and integrations in kind of quick start packages. The whole goal has been to enable the data scientists to more quickly get to a point where they’re asking questions and getting business value. They don’t have to spend all their time setting up environments, writing long SQL queries, and iterating in a way that is not very effective. The other thing it does is it allows someone who’s maybe a little bit less technical to still be effective in that analytics role, because it abstracts away some of the inherent technical complexity.

insideAI News: Interesting. So it definitely saves a lot of time, a tremendous amount of time. Maybe, you know, it’s a crutch; maybe it’s good for a data scientist to go out on a limb.

Charles Sansbury: I don’t think it’s a crutch. I think it’s more the idea that it reduces the amount of platform technical expertise needed, and I think what becomes more important is the understanding of analytics. I think our experience is with the folks who run data science teams at our big customers. They’re pushing their data scientists to better understand the business parameters they’re dealing with, and I think it allows them to spend less time on the plumbing and the technology, and more time thinking about how this business problem can be solved.

insideAI News: Can you speak to the ROI here with maybe a quick example?

Charles Sansbury: One of the customers that wanted this is a global airline, and they wanted to give every customer service focus. They wanted to make sure that every customer could have their choice of meal on a plane. You have 500 people with three choices. You don’t want to carry 1,500 meals. They had years of modeling and quantitative analysis over which specific meals, and so they started doing some predictive analytics, and ultimately did some AI-based analysis. What they found was their data was kind of siloed, some in North America, some in the Middle East. Based on where they ran the data, it predicted differently what food people would want. It makes sense when you think about it. But the point is, if you’re running a data set, you don’t think about the business issue. Then someone said we have to incorporate the demographics of our passenger set, because that’s going to impact how many people want option A versus option B.

It’s a simple use case because you needed to have human insight and interaction. If the person who was responsible for the project was focused on the speeds and feeds and platforms and all that, he or she might not have had it. You need to step back for a forest-versus-the-trees moment. We need to use the appropriate data set based on the flight, and just that insight made the project successful. If you didn’t have that insight, you wouldn’t make much progress in solving the problem exactly. My guess is there are hundreds of those types of application “aha moments” that exist within all of our customers, but it is dependent upon someone who has a combination of analytical skills and business understanding to be able to pull those together, to make those insights, and we’re trying to have people spend a little more time on the business insights by making the technologies a little bit easier and more accessible.

insideAI News: Do you think it’s easier? I was just trying to think about data scientists, and I know a couple of them, and they’re very, very analytical. They have a hard time articulating what they’re producing. So do you think this technology is going to help them?

Charles Sansbury: I think we also have a generational issue within the data science community. The folks who are newer to the technology are familiar with some of the web-based technologies which allow people to think a little bit more in common terms about how they might structure a query. I think people who have been doing it for a long time literally may think in ones and zeros, and the English language, as you and I use it, is not how they think or construct ideas. So I think there’s still a language barrier that exists, but I think this is designed to make it a little bit easier to translate.

insideAI News: Last but not least, about your work environment. I’ve met a couple of your employees. Everybody is very professional, very nice, and goes out of their way. But I’m just curious about the Cloudera work environment. Where’s your headquarters?

Charles Sansbury: Our headquarters are in Santa Clara, but we are kind of global and not site specific. I think that makes it very hard to build culture. Whereas, in the old days, we’d all come into the office, maybe have a happy hour on Friday. The last couple of companies I’ve worked for, we thought a lot about what is achievable in terms of building culture. If one of your options is to have people come in four or five days a week, you sit in the office, meet with their managers face to face, and it’s hard.

A couple of the values that I think we project across the company are transparency, so we try to tell folks as much of what’s going on and why we’re doing it as possible. We do very frequent all customer town halls with all the managers.

We all travel quite a bit, and when you’re in one of the various locations, you’ll do an all hands to give people an update. That means message discipline is very important. So from a leadership team perspective, we have to have a common set of themes we talk about, and those are themes like transparency, prioritization of what we’re doing and also what we’re not doing, because that’s probably the hardest part of prioritization, a set of shared results and accountability, and then also alignment, both in terms of making sure that the operational goals of the company are well articulated.

Everyone has milestones tied to that. And the last piece is from an ownership perspective. Our financial sponsors have actually mandated that every employee also be an equity owner. So everybody across the company has equity and that equity is equal to the rights of the financial sponsors.

So I think the ideas of transparency around goals, alignment, accountability are important. I’ve also tried to instill a sense of urgency and that we’re in a very competitive market. We have to move fast, and then these shared goals and alignments are super important. Although we’re not a work-from-home company, we all do have offices. We’re also not a “you have to come in every day” company. So we’re still trying to find the right balance with respect to time in the office.

We have this interesting phenomenon I’ve observed, which is people like me who’ve been in the workforce a long time. We grew up going to the office, being trained and mentored, and we valued that time quite a bit. Younger people kind of came of age maybe during COVID, and started their first job working from home. They’ve realized they’re missing out on professional connections as well as a social connection to the workplace. So a lot of the younger folks would prefer to come into the office, and plus, their apartments are such that they don’t really have workspace at home. It’s the folks in the middle, with school-aged children, who value the flexibility a lot, and they’re the ones that if you try to drive a return to office, you’d ask them to drive that change, because they’re the middle management layer, and they don’t want to. Because of that, I think the balance is trying to encourage people with two days out of three, maybe two days or two or three days out of three or four days, leaving aside Friday. I try to make sure I do conference calls with people on Friday, just to make sure we’re still keeping to that discipline, but trying to strike that balance.

We have another couple of things that started during COVID which have persisted. They’re called “unplugged days.” Think about when COVID started, you had Zooms from 8am to 8pm and you’re sitting around with a pair of shorts and you basically realize your day is gone. What started during COVID was trying to have one day a month. It was a day with no Zoom meetings. But that evolved to one day a month where the company is unplugged. I would call it logical days when people aren’t coming in anyway. So the Friday after Thanksgiving, or if Christmas is on a Wednesday, then that week can potentially be an unplugged week, trying to give people guaranteed time off.

With the technology right now, even if you’re on vacation, you know you’re getting phone calls and texts and things you should respond to, so I think we’ve done a pretty good job of work-life balance. I do worry that no company ever got great by having work-life balance as its primary goal. But the reality of the technology environment we live in is you have to create spaces where you could take time off, otherwise, literally, your work can basically fill whatever container you pour it into. Arguably, I should be doing something 24 hours a day, seven days a week, and when I take time to go to the gym or sleep, I’m prioritizing that over something work related which might be very important. So it is very tough, and I think very tough given how connected we all are and how global our jobs are. So these unplugged days are highly valued by our team. That’s probably the most highly valued perk that we have.

insideAI News: Great idea! The reason I asked this question, and I know it’s a little bit of a softball question, is that my readers and my clients (e.g. Dell and NVIDIA and Lenovo), every time I go to a conference, are talking about recruitment. They need help. So what makes Cloudera attractive to a top data scientist who’s making crazy money at Meta, to come to Cloudera? So that’s why I asked the question. It’s kind of like a personal type of thing, but it humanizes the company too. You really care about your employees, about the transparency.
