TechArena Podcast Transcript: Vast Data

Transcription

Welcome to the Tech Arena, featuring authentic discussions between tech's leading innovators and our host, Allyson Klein. Now, let's step into the arena.

Allyson

Welcome to the Tech Arena. My name is Allyson Klein and we are here at Supercomputing bringing you stories about innovation in the scientific realm. I'm so pleased to be joined by Jeff Denworth, co-founder and CMO of Vast Data. Welcome to the program, Jeff.

Jeff

Thank you, Allyson. Awesome to be on here.

Allyson

So, Jeff, why don't you just start by talking about Vast Data? Introduce it to the audience and tell me why you decided, after many years in the storage arena, to found a new company focused on universal storage.

Jeff

We founded the company because we saw a variety of tradeoffs in how people managed and processed their data. And we realized that, for the first time, a number of those tradeoffs could be broken, such that customers could start to interact with and work with their data in entirely different ways than they were able to previously. And so, as you mentioned, we created this product that we call Universal Storage, which is kind of a generic marketing name. It's kind of like a really good toothbrush that's great at cleaning teeth and doesn't really tell you much. And the reason for the name is that we realized that if you can build a system designed to give customers awesome scalable performance, while at the same time not incurring the classic high-performance storage tax that customers have always had to pay, then it defies the classic categorization of storage, right? If you think about the last 30 years, customers have been buying all sorts of different types of storage systems within their data centers for different use cases and workloads: backup systems, file systems, high-performance systems, and everything in between. And we realized that if you could break these tradeoffs, then you could truly buy one product, a universal tier of flash, that could be applied to all of your data.

This product is essentially something that defies classic categorization. And I'm a big believer that you can't just bring a cool product to market and expect customers to buy it. There also has to be a market disruption that drives the need for IT buyers and IT consumers to consider new solutions. And here we timed it right: people were just starting to talk about machine learning, just starting to talk about deep learning. As they started to digest what these workloads meant within their data centers, not only for the new data they were acquiring but also for the legacy data they wanted to open up access to, the realization was that you needed a new system, a new data platform, that could remove the constraints on these new workloads being able to train and infer on their data.

And so what happened is the market opened up to us and said, we need something new, because the pyramid of data storage that people have been managing for the last 20 to 30 years has had its dynamics turned upside down. These customers now need the fastest access to the largest amounts of data, and that's not how things have been done up until now.

Allyson

You've been getting a lot of attention in the media of late, including a callout by Gartner as part of their Magic Quadrant. Storage companies historically have not been treated with this kind of celebrity status in the industry. Tell me why so many people are paying attention to this new approach to storage. You touched a little bit on machine learning and AI, and I can understand that it's unlocking some new opportunity, but give me some more detail about why customers are responding so well.

Jeff

So I think first and foremost, it's a story that sounds too good to be true. If I come into a customer environment and say, you can just have flash for all your data, it scales, and you never have to worry about doing anything ever again, the reaction is, okay, I've heard these types of stories from vendors up and down, and there's always a big asterisk. So what's the asterisk? And we basically say there's no catch. And as a new, let's just say emerging, storage vendor, the next thing that customers ask is, well, does it work? Because most storage products are pretty crappy. No offense, but companies that just get into the space have to prove themselves. They have to build out their capability as they build out their technology. And then you go into the HPC space and you go to a customer and say, hey, I've got this product that you can tie ten or 200 compute nodes to, and it should probably just work, right?

These are things that a lot of storage companies aspire to but never actually achieve. And so we realized in the earliest days that we had to QA the hell out of a product architecture that is just simpler to develop upon. And I think what we've done is bring down the notional time horizon to build an enterprise storage system: conventional wisdom has always been that it takes about ten years, and we brought that down to five. At the same time, if you can actually deliver on this promise of flash for the cost of disk, customers just start buying and buying and buying. We just announced last week that we now have three customers that have each committed over $100 million to our product since we started selling just a few years ago. That's the same amount of money that whole storage companies get over their first couple of years, and we've gotten it from single customers. And so the bookend to our story is that it's not just a cool technology at the right time; it's a company that has broken out of the classic, let's call it freshman, class of new storage players. We're now being considered among the top-tier infrastructure providers in a very short time because we're selling like crazy. And I think that's what Gartner called out. We had the highest ranking of any company that's ever entered the file and object storage Magic Quadrant, and that's simply a by-product of the market share we've captured over the last couple of years.

Allyson

So where are you seeing market traction? Is it with the hyperscalers? Is it enterprise? Is it HPC? Where is the market taking off?

Jeff

If you think about the hyperscalers, the first thing to consider is that none of them really has a good file system. They've all outsourced it, either to open-source software (a lot of the large cloud service providers resell Lustre) or to something like NetApp. Nobody has really built their own first-party file system in a great way. And a lot of times it takes them years to roll a new technology through their service catalog, because they have the weight of thousands of customers sitting on them and have to make really smart, thoughtful decisions as they move through different technology generations and concepts. Then there are the enterprise customers that we work with, in particular in HPC, and I very much go to lengths not to characterize Universal Storage as an HPC product.

But in the early days, we realized it was a kick-ass product for HPC customers. My background is in parallel file systems, so I've got a long history in the space, and we recognized early on that there was a real connotation around parallel file systems within the market. They were thought of as fragile and complex, and you kind of have to hire a PhD to manage infrastructure for your PhDs. And we basically said, let's take the computer science out of this for the customer. In doing so, we found an enterprise customer base that was really receptive to the idea, because we're not selling parallel file systems. It's just a parallel NAS. We've solved some of the most fundamental scaling challenges of classic scale-out NAS systems. And we basically say to customers, well, if you can solve the storage-side bottleneck, it turns out you don't have the same client-side bottlenecks that you thought you did. Right? Everybody in the HPC space loves to pick on NFS as not being scalable. And now we're showing the world that you can run your hardest codes at the highest levels of scale with NFS by explaining to them, and it takes a long time, that the server was always the problem, not the client. And if you can solve the server-side problem, you can do anything.

Allyson

When you think about high performance computing systems and supercomputing, one of the biggest conversations at the show is going to be the intersect between AI and HPC and underlying infrastructure changes around composable infrastructure to serve the coming needs of an Exascale era. When you think of those opportunities, what do you get excited about as the person who is providing the data storage for some of the world's largest challenges?

Jeff

There's a gentleman who, I'm not sure if he still works at HPE, but he previously did; he came in through the SGI acquisition. The CTO of SGI, Dr. Goh. And Dr. Goh made a slide forever ago, and I use it in almost every one of my presentations, which basically shows the dichotomy between classic simulation-based HPC I/O and AI I/O. In the classic HPC era, you had a little bit of data in, in the form of input directories, and then a ton of data out, in the form of simulation data and checkpoints and things like that. Well, in the era of AI, that dynamic completely gets turned upside down: now you have a ton of data that needs to go into training these models and a very small amount of data that comes out. And so what we realized is that a lot of organizations have built infrastructure for that first class of workload and haven't thought at all about the read problem, and the random read problem, that is pervasive with AI. I have this axiom that pretty much every HPC center is in the process of trying to evolve into, or expand into, also being an AI center of excellence or competence. But not every AI customer wants to become an HPC customer, if you know what I mean. And so those organizations with strong backgrounds in HPC, well, they know about GPUs, they know about the programming languages used for these AI accelerators, they know about RDMA networks, and they know about scalable storage. They're well suited to deploy AI infrastructure if they move to a different application area. And here they all need all-flash for their infrastructure. And they're realizing that none of the parallel file systems and none of the scale-out NAS platforms were ever designed to make scalable flash affordable. And that's the saving grace that Vast has here.
But on the flip side, we made a conscious decision to build a NAS as opposed to a parallel file system, because parallel file systems are really tricky.

I've got something like 15 years of experience with Lustre; I was on the original Lustre team. And the NAS market has sold circles around parallel file systems. Even though, from a scale and performance perspective, parallel file systems have always been better, they've always been more difficult to deploy. And the customers are telling you, through the market share capture: if you don't need that performance and you can get suitable levels of capability from a NAS, customers will always choose the NAS. And so we basically looked at it and said, how can we unlock the performance of NAS such that you can use it for everything? That's why we think we're really well poised. That's why Nvidia, for example, is an investor in Vast. That's why we're proud to say we just won the HPCwire Editors' Choice Award: because we're democratizing this easy system for any class of scale and readying every organization for this movement that is about to hit us around artificial intelligence.

Allyson

Congratulations on the award. That's wonderful news. You just described incredible capability that Vast is delivering to the HPC arena. As you look forward into 2023 and the current demands the scientific community faces in solving some of the world's biggest problems, what are the key breakthroughs you're expecting from the industry as a whole to further high-performance compute platforms? And what is Vast's role within that?

Jeff

You're starting to see evidence of some really spectacular science becoming possible, mostly on the back of people figuring out how to incorporate neural networks, machine learning, and deep learning into the classic applications they've been deploying. I think about a presentation from NOAA at the Hyperion user group, where they basically said almost all of their codes are moving towards machine learning so that they can get to much better model accuracy. We work with oil and gas companies, and a lot of them are now stopping exploration, but they're not stopping computing, and they're not stopping accelerating their computing investments. They've realized that AI can help them with everything from getting more efficiency out of the reservoirs they've already discovered to finding new applications for alternative energy that can ultimately allow them to diversify their business.

But probably my favorite story is a code that was released by Google earlier this year called AlphaFold 2. Basically, up until now, you had all these customized processors designed for protein folding and simulation. And what Google showed the world is that if you don't need to be absolutely accurate in this determination, as you look for ways that proteins fold and how they can bind with certain biological structures, well, you can just use a GPU, infer on that, and get something like 99% accuracy. And then if you find something that looks good, you go and actually calculate it properly. This has been a huge scientific breakthrough, almost a grand-challenge problem solved, where now you've got universities and research labs around the world doing protein folding a hundred times faster than they used to, which will lead to just so much faster drug discovery. And I think that's an example of what's going to start happening at a more and more frequent pace, as you realize that the models now being trained are moving into the trillions of parameters, and you're going to have a cascading amount of innovation come from it, at a pace we've never seen as a society. So honestly, I don't know what's going to come over the next year or so, but what I do know is that the market is relentless in pushing the envelope in ways we never saw before.

Allyson

Jeff, thank you so much for being on the program. You've shared some incredible thoughts about high performance computing, the innovation of storage, and Vast’s role. I appreciate you being on. One final question for you. Where can folks engage with the Vast team and continue the dialogue with you?

Jeff

Well, we're at SC22 this week in Dallas, so if you're curious, just stop by our big old booth in the middle of the trade show floor and you can definitely have a conversation with some of our engineers there. If not, you can find us at www.vastdata.com and we can take it from there.

Allyson

Fantastic. Thank you so much for being on the show today.

Jeff

Thank you, Allyson.

Thanks for joining the Tech Arena. Subscribe and engage at our website, thetecharena.net. All content is copyright by The Tech Arena.
