This week Chris and Martin are joined by Erik Kaulberg, Vice President at Infinidat. Erik has appeared on the podcast before and this time is here to talk about how we build storage systems of the future. You can tell from the introductions that we recorded this episode towards the end of 2019 – we haven’t been transported into the future!
With so many new media types available, system builders have a wide choice of persistent storage from which to build new architectures. There’s NAND flash, traditional hard drives and a range of emerging technologies such as storage class memory (SCM) and MRAM.
In an ideal world, we’d simply build systems from the best performing media, but cost and resiliency factor into the equation. Building future systems will continue to be a balancing act of choosing the right media for the right data type and matching the two efficiently.
Will there be one platform to rule them all? Probably not, but we may see differences in core and edge designs.
Elapsed Time: 00:36:15
- 00:00:00 – Intros
- 00:02:54 – What are the future challenges?
- 00:05:05 – Is any data truly unstructured?
- 00:06:07 – What new media types are we seeing?
- 00:08:40 – Application rewrites may be stalling new technology
- 00:09:50 – All media sits on a graph combining cost, performance, endurance & capacity
- 00:12:50 – Is it practical to put all data onto the fastest media?
- 00:15:30 – Will we see more “media defined” storage systems?
- 00:17:47 – Is more data moving to all-flash systems over time?
- 00:20:26 – Hyper-scalers worked out their storage architectures many years ago
- 00:23:31 – How will encryption be built into future storage designs?
- 00:27:40 – Should we build one “platform to rule them all”?
- 00:31:53 – We’re not going to see an end to new platforms & solutions
- 00:33:23 – Will all future storage systems be based on disk or flash?
- 00:35:12 – Wrap Up
Speaker 1: This is storage unpacked, subscribe at storageunpacked.com.
Chris Evans: Hi, this is Chris Evans recording a storage unpacked podcast and I’m here with Martin this week. How are you doing Martin?
Martin G: Not too bad Chris, how are you?
Chris Evans: I think I’m pretty good because I don’t have any more traveling before the end of the year and I’m just focusing on, I guess, wrapping things up for 2019. It’s been a busy year.
Martin G: Hopefully wrapping up my present you mean?
Chris Evans: Oh absolutely. Yeah. Yeah, I’ve already got that. I just haven’t bought the paper yet. Anyway, enough of our frivolities. We have a guest this week, and a returning guest no less in terms of our conversation. Erik Kaulberg, hi Erik.
Erik K: Hey, good afternoon gentlemen.
Martin G: How are you?
Erik K: I’m great. Unfortunately I have a lot more travel for the rest of the year with AWS re:Invent coming up and such, so I can’t quite say that I’m heading into the glide path, but at least the weather is still nice here in California.
Chris Evans: Yeah, you do have one benefit I guess. And you’re right, I’m not attending that this year. And based on the number of people that have been at the event the last two years, I think in some respects I’m quite happy about that.
Erik K: Yeah, it’s the craziest show that I’ve ever been to, to be honest, and it really is evolving into a mainstream enterprise infrastructure event, and obviously there are all the startups out there as well. It’s amazing how many participants, in such a broad ecosystem, that event is able to attract.
Chris Evans: Before we go any further, I obviously introduced you without saying exactly who you were, what you did. So although you have been on a previous podcast with me, might just be worth you introducing yourself very briefly and then we’ll dive into the discussion.
Erik K: Sure. So I run Infinidat’s cloud business. Infinidat operates in the large scale storage systems and services space and we’ll talk more about that I’m sure. But in addition to running our cloud business, I play a couple of smaller roles related to our alliance partners and other activities, and have a hand in the strategic direction of how we’re thinking about next generation architectures and approaches to the storage market.
Chris Evans: Okay. So today we thought we’d talk a little bit about something that I guess does touch on cloud, does touch on premises, and that’s really how we’re going to go forward with all these new media types and build storage systems of the future, and what we should expect them to look like. Now, Erik, you and I talked a bit about the media types previously, but we’ll briefly go over that. In general, what we want to talk about on this podcast is at least some of the challenges that exist in terms of how vendors are going to build systems, and obviously, as a company that builds enterprise storage systems, you’re in a great position to give us your opinion. But let’s start by me just framing this and then we’ll dive into something in a bit more detail.
Chris Evans: Now there’s a few things that we’ve seen get discussed in the industry, and we’ll talk about them because I think they’re relevant. Hopefully most people know these. The first one is that we see data growth that’s exponential, continuing to grow rapidly, mostly in unstructured, and some people are at the petabyte level, I would say, nowadays. And as part of that, people say, well, let’s just put it on the cheapest, lowest cost technology. Now from your perspective, Martin, as an end user, how do you see this? Does that sort of resonate with you in your part of the industry?
Martin G: Yeah, so data growth continues to be ridiculous, out of control. Most of our growth, as you know, in the business I work in is unstructured because it’s generally all media. So it’s slightly different to a lot of people. And as you say, received wisdom says put capacity on the cheapest storage medium, but that really depends on what you want to do with it. Do you just want to put it in there, is it going to be a cockroach hotel, or do you actually want to do something useful with it? So it’s getting that balance right.
Chris Evans: Exactly. Erik, from your perspective, what do you see in terms of the way customers are going? Do you think it’s true that the majority of data’s being produced in the unstructured side of things or is there a sort of a bit of a rebound in the block industry?
Erik K: Well, I think it’s not necessarily the case that block is always used for structured data and unstructured data goes to some other category of storage protocol. We actually see a lot of customers put unstructured or semi-structured data on, for example, our block protocols, because we can support those multiple interfaces. But more broadly I think there’s huge growth, more than anything else, in terms of semi-structured data. And when I say semi-structured data, I’m referring to machine-to-machine type data. I’m referring to the kind of stuff like log files, for example, which you dump into Splunk. I’m not saying that it’s totally uncategorized or just a completely random combination of multimedia and text files or what have you. I think that semi-structured space, where you have some structure to certain elements and you want to get it into a more usable fashion, is a critical part of the overall data growth picture looking forward for the next several years.
Chris Evans: Do you think that sort of thing is really, truly unstructured, Martin? I mean, Erik just highlighted something that was definitely structured, but even media files, even though we class them as unstructured, are not really; they’re actually highly structured.
Martin G: They are all highly structured. I’d say they’ve all got headers, they’ve all got IDs, so yeah, really there’s no such thing as unstructured data. Almost every form of data you go to has some form of structure. I mean, I think the log file thing which Erik pointed out, that’s very true. We’re seeing more of that. More analytics data, more data coming back from devices, whatever that device might be, and they all have a structure of some sort. It may be a very loose structure or semi-structured, but I think the thing is, often when we start talking about structured data, most people are thinking about traditional databases, I suspect.
Chris Evans: I think that’s sort of the assumption, isn’t it, when we think of structured, because that’s sort of the historical view, but without a doubt that’s changing a little bit.
Erik K: Yeah, the growth is definitely in things that are not in that traditional database kind of space.
Chris Evans: Yeah. I’d like to go on and talk about media then because as we start to see how these systems are developed, clearly there has been an evolution in the technology we use. We started with tape, we’ve gone onto disk, we have flash. That’s really expanding out Erik, isn’t it? There’s a significant amount of media out there now.
Erik K: Absolutely. It’s not even just a traditional kind of tiering mindset where you might have, say, flash and then fast disk and then slow disk. At one point that kind of collapsed to just flash and disk. But now we have this whole Cambrian explosion of different flash types and emerging storage class memories, and you still have a couple of different flavors of hard drives out there. And then obviously tape is a whole different world, but one with a strong ecosystem, particularly for media and content production houses.
Chris Evans: So Martin, I’m interested in the new stuff. We see a lot of Optane, and there’s obviously MRAM and other things, which I don’t think necessarily are coming as devices in their own right, although certainly the Optane side is. Have you decided to look at that or do you think that’s still early days?
Martin G: We’re looking at it. It is an area which we look at for some specific workloads, some specific performance workloads, but in general it’s still very early days. And I think, talking to my colleagues out there, even in a general purpose enterprise, people are talking about Optane, but very few people are really doing it at the moment. I think a lot of people are still playing around with NVMe and traditional flash as opposed to going to Optane. So yeah, it’s been talked about. I think for most people it’s probably still 18 months, two years off before people really start deploying it in anger. There are some corner cases where I suspect you’re going to see people deploy it, but I don’t quite see it happening just yet.
Chris Evans: Erik, any comments on that one?
Erik K: Yeah, I tend to agree with Martin. It’s an interesting technology, Optane specifically, and Intel’s gotten further down the productization path than other competitive storage class memory technologies, but I still think it has a long way to go till it hits anywhere near mainstream adoption in typical enterprise workloads. There are certain niche use cases, like caching scenarios for example, or certain SAP HANA core databases. Cisco did some cool stuff with that recently where you can actually take advantage of the persistent memory semantics. But by and large this points to a broader challenge, which is that applications are usually written such that the information you have in RAM is ephemeral and the information you put out to storage is persistent. Trying to blur those lines and truly take advantage of Optane or similar technologies would require wholesale application rewrites, which enterprises are loath to pursue.
Chris Evans: Yeah, there’s a bit of a trade-off there, isn’t there, because on the one hand you think that this would be great media to deploy in the server rather than in a storage platform or storage system. But as you said, taking advantage of the persistent nature does mean potentially application rewrites, unless you can use a technology that gets around that. So maybe, as you say, it will be a little while before we see that.
Erik K: And then the question becomes, okay, let’s assume that it’s going to take a while to hit servers en masse. Even though there is a rise of various caching companies and things like that, for about the seventh time since I’ve been in the industry, I still have doubts about whether any external software layer on the server side is really going to drive a massive shift. But setting that aside, if you assume that it’s going to have to make its way towards centralized or shared storage systems, then what’s the most effective way for it to do so?
Erik K: And that’s where I think there’s a great conversation to have about the effectiveness of various mixed media type architectures.
Chris Evans: Let’s touch on that very slightly then. Before we get to that, let’s just talk about the media themselves and then we can see how that mixing might work. If you think about the way that people have typically looked at different media types, we could look at, say, hard drives and say these days that they are very much placed on the graph in terms of their capacity versus cost. Whereas we look at, say, flash and we might say that there’s a trade-off there between performance and cost, and these new technologies like Optane are coming in and they are very much hitting the high end of that graph. And of course tied into that is a certain degree of resiliency, so we could be buying a higher-endurance flash compared to the cheaper type of flash that we’re seeing come into the market lately. So all of these media types are effectively hitting different metrics, if you like, or different characteristics, which means that one doesn’t really fit all. And therefore, as you said, we potentially are in a position where we are mixing and matching.
Martin G: Yes, Chris, I would agree. But one does fit all: your high end does fit all, it’s just too expensive. So if we could put everything on flash, we probably would, or if we could put everything on Optane, we probably would. So the reason we tier to disk is purely pricing. I mean, preferentially we’d like to fill our data centers with flash. Our budgets aren’t unlimited, but flash and other such devices have massive advantages in the data center for us: for instance power, space, heat, cooling, all those good things. So eventually we would like to stick it all on flash and have the rest, all our archive, on tape, on the grounds that tape itself is also very environmentally friendly. So the hard drive is an interesting compromise point at the moment.
Erik K: Yeah, I would say that you absolutely have to recognize the value that each of these media types brings to the table. And the fact is that despite all of the excitement around this Cambrian explosion that I described earlier, there’s still very much a place for hard drives as a capacity-oriented repository of information. Just looking at the media types independently, without tying them back together at the system layer or considering other IP that might do that, it’s very clear that the fast data ends up on flash. At this point the economics make sense for the vast majority of the fast data to end up on flash. The capacity-optimized but still active data needs to be on disk, and then for anything that is long-term archive, that’s when you start thinking about tape, if you have the right operational model around it. Optane and storage class memory address certain edge cases today, and as they grow in capabilities, and as the resiliency model, the scalability and the economic factors become more feasible, then we’ll see them start to eat into the top of the flash pyramid.
Chris Evans: I’m going to challenge you slightly, Martin, and raise another issue that I think relates to what you’ve just said there. I agree with you in one respect, that if we had the option, we’d put everything on the highest performing, best media, as in flash. But actually things like endurance get in the way, and we can’t necessarily rely on the fact that flash devices will continuously operate for five years without fault, without a lot of additional work that goes on in the background. By that I mean dealing with the media through things like wear leveling, and from a cost perspective introducing things like compression and deduplication. So none of these media are used literally as they are; the vendors put a lot of additional features in, don’t they?
Martin G: Oh they do, but hard disks break. So people get very excited when they start talking about wear leveling and the lifespan of an SSD. We have a lot more engineers turning up to replace hard disks than we ever had turning up to replace flash. It’s just because hard disks mechanically fail; they spin. So I’m not convinced by that argument. And I think with the models which we’ve seen vendors put in place to allow you to keep your flash for longer, and as people understand the way flash works, the endurance seems to be a lot better than a lot of people thought.
Erik K: Now Martin, how long have you had flash deployed at scale within your environments?
Martin G: We don’t have it deployed at massive scale. We’ve only had it for a couple of years, but we’re not seeing it fail any more often, or with significantly different failure rates, than hard drives. And we see hard drives fail, and as we have a lot more hard drives, we’re going to see more failures in the hard drives.
Erik K: Yeah.
Martin G: But we’ll see.
Erik K: Makes sense, and I think that’s the fundamental idea for flash versus hard drives. I think there was a ton of uncertainty and FUD about flash failure rates versus hard drives, but the practical use cases, and the reality that most people demand a lot less performance than they think they need, so they’re putting through a lot fewer program/erase cycles than they expect, mean that flash and hard drives have roughly similar maintenance expectations, particularly toward end of life. That’s in our experience anyway, and in my experience in past lives in the all-flash array market as well.
Chris Evans: Data’s a lot less volatile than people seem to think. Things don’t change as often, and certainly not once we get into the unstructured space. Yeah, you might have high erase cycles if you’re only keeping a small amount of it on your SSD or your flash tier, but generally we don’t see things changing as much. You don’t very often update every column in a database, for instance. It doesn’t happen very often.
Erik K: Yeah. I view unstructured data as almost a write-once, read-many type scenario. It’s not quite to that extent, but you’re not going to go and update your log files, for example. You’re just going to keep that long tail around until you decide to get rid of it.
Martin G: So a media file is a media file. You never change, you never edit a media file, you write a new copy.
Chris Evans: So how does that affect the way that we’re designing systems? Because clearly the first point there is, oh, okay, brilliant. You’ve just highlighted two things there. QLC flash, which doesn’t have the same level of endurance, could be a great solution for holding media or logs or other things, allowing you to process them with high performance. So does that mean we’re going to be building systems that are based on use case, and never mind the type of media that’s out there?
Erik K: Well, I think you’re always going to see vendors provide media-defined systems based on whatever the sexiest flavor of the month happens to be. That’s just the nature of the startup ecosystem and the quest for differentiation. But I would say that you have to consider more than just the failure rates due to writes. What we’re seeing with QLC in particular is a question over reliability in the long term, in terms of data retention. If you’re not refreshing periodically, for example, sometimes you can see silent data corruption in long-term deployments, and we think that many environments would not be okay with that trade-off at a very large scale. But more fundamental than the resiliency statements around QLC is the fact that there’s still a large economic gap on a price per bit basis for QLC SSDs versus, say, nearline SAS or SATA drives on the hard drive side.
Erik K: And that’s what’s driving decisions in my experience anyway, more than any factors like resiliency, which could be supplanted by software.
Martin G: Yeah, I pretty much agree with that. I mean, that’s what we do. We have a very high performance tier, which is flash, and then we use hard drives for a lot of our bulk storage, and then we start looking at cost. But the cost differential is massive, even today, between QLC and hard drives. Hard drive based storage has come right down in price, and as we see larger spindles coming out, we’re expecting to see that decline continue. You just have to make a choice about what you’re going to do with the data which sits on that device.
Chris Evans: Okay. Here’s a question for you both then. Are you seeing a gradual redistribution of data onto faster media over time? Based on what you just said, because of the cost profile changing, because of the media becoming cheaper and because of the techniques that the vendors are able to use becoming better, are we seeing that trend, shall we say, towards the all-flash data center? Now Erik, I will give you a chance to step outside of that boundary if you want, because obviously I know you sell systems that are not purely all-flash-based. So you may say, of course we’re not seeing that, because you sell systems that are using disk. But just generally, both of you, what’s your opinion?
Martin G: Okay, so my experience at the moment is that a lot of our traditional structured workloads are going on to flash. Some of our more unstructured workloads, things like some of the log processing, and certainly media, we put onto hard disk at the moment. It’s an economics argument really. And that’s where we’re seeing the explosion. So we’re exploding in capacity, certainly in the hard drive space; in flash, we have not exploded in capacity. What we’re really seeing is that we’re ripping out the old 15K or 10K based arrays and replacing those with flash-based devices.
Erik K: Yeah, and I think one of your earlier comments, Martin, is very apropos to this thought. If you have the economic ability to put everything on flash or other faster storage media, I think you would absolutely do it, if you’re a mainstream enterprise. But the fact is that the economic gaps are still very much there. So, and Chris, to your original question, even though I obviously work for a vendor who incorporates different technologies into our storage systems, I would say absolutely the trend is to put data onto things that feel like flash. And the question is, can you get that flash feeling from something that’s more than just flash?
Erik K: Could you use software to blend other media types to get some elements of that experience, or even the full experience? That’s where we differentiate ourselves a bit. More broadly, I think the vast majority of the capacity growth is on the hard drive side of the equation. It’s where all that unstructured, semi-structured data is going by default, and in intelligent large-scale architectures it’s also where a lot of active data ends up after it goes through maybe multiple other storage media.
Chris Evans: But in that conversation then, are you implying a certain degree of public cloud and hyperscaler type activity?
Erik K: The hyperscalers figured this out a long time ago, well ahead of many mainstream enterprises. I mean, the vast majority of bits stored on Facebook, for example, are on hard drives, and it’s a similar storyline for pretty much anyone who counts their server quantities in units of 10,000 or higher. There are certainly certain pools where you incorporate flash, and you’ve got to be smart about caching algorithms in general to also blend in DRAM, but for the vast majority of your bits, if you’re an ordinary consumer interacting with cloud services, you’re interacting with hard drives all the time.
Chris Evans: Martin, do you think that’s really fair? I mean, I don’t know the percentage of EBS compared to S3, for instance, and I’m not sure that the hyperscalers in that sense are typical of the enterprises that we would see.
Martin G: Hyperscalers are very different to a typical enterprise. They do things on a very different model, and I think even some of the users of the hyperscalers are less sophisticated than they might be a lot of the time, because it’s developer driven. So [inaudible 00:21:27] any developers listen, they’ll just stick things where they think they possibly can. It’s either going to be the fastest or the cheapest, but they’re not really going to give it a lot of thought until a workload goes wrong.
Chris Evans: I can see exactly what you’re saying. I think that in the hyperscale environments there’s definitely an assumption that the majority of data might be unstructured and basically placed there because it’s a great long-term storage solution. So I wonder whether, because they have limited numbers of apps compared to, say, an enterprise, their mix is slightly different.
Martin G: Most people who are using hyperscale clouds, as they get more sophisticated, will make more intelligent decisions. For enterprises who are keeping stuff on premises at the moment, there is a lot more cooperation between the infrastructure team and the developer team than there has ever been, and it does help everybody to make better decisions for the business. If you take the operations or the storage guys out of the picture, you may find yourself making some interesting decisions which don’t necessarily fit the workload. We often ask questions about workloads which developers haven’t really thought about until those questions are asked.
Erik K: Yeah, and I think you must have higher than typical cohesion between your infrastructure teams and your dev teams, Martin, compared with, I’d say, the majority.
Martin G: No, I don’t think we do. I don’t think we do. At times it can be a very conflict-driven discussion, but we can certainly work as gatekeepers to a certain extent, because we actually have to go and order the storage and actually put it in place. It means that sometimes that friction, the fact that it’s not completely frictionless, does work. It allows decisions to be made. We’re often tied into a purchasing cycle, which means that people need to understand a bit more; they can’t just suddenly change their mind.
Erik K: Well it’s certainly in everyone’s best interest to have more information flowing to the people who ultimately manage the infrastructure.
Chris Evans: Okay. Before we disappear off and think about use cases and challenges and go into that in a bit more detail, because I do have a few questions around that, one thing I think we haven’t touched on, in terms of the cost perspective and the efficiencies that the vendors can put in there, is the whole idea of encryption. It strikes me that a lot of the planning and ideas you might have around optimization and cost per gigabyte and TCO could be completely thrown out the window by somebody who decides to encrypt all their data at source.
Martin G: Yeah. We’ve seen that requirement coming in more and more at the moment. So I think we’re heading into some very interesting discussions with some of the flash vendors about how we handle this better, where we’re writing encrypted data and encrypting data at source, sometimes even with operating system plugins. It’s starting to break some of the models and some of the assumptions which are made. Encryption is becoming almost a default requirement for most enterprises, and it’s going to get to a stage where for most data you’re just going to encrypt. You’re not even going to think about it, because the legal implications are just painful if you make a mistake. So take the decision out and make it the default option to write encrypted. So it is about what the vendors can do for us to actually help us with that now.
Erik K: Yeah. And to that point, I think managing encrypted data that comes from higher levels in the stack, handling that at the infrastructure layer, actually ends up feeling a lot like handling multimedia content at the infrastructure layer, because you get zero value from data reduction in most scenarios. And I’m certainly familiar with various approaches, like Thales’s Vormetric solution, which has different plugins that try to intercept that encryption within the stack such that you can potentially decrypt, data-reduce and then re-encrypt. But that has its own set of security implications and complexity implications as well. So at least for the kinds of customers that I talk to, it’s a serious challenge that affects the economic assumptions that they’re getting from all-flash-centric vendors.
Chris Evans: It’s funny, isn’t it, really? Because that’s something you’d think would be fundamental, and we did have it to a certain degree in platforms, to encrypt the media to protect against media theft or not destroying devices properly or whatever it happens to be. But the fact that we can’t somehow have that further up the stack and have it optimized, so that the application and the storage can actually talk to each other to handle that together, seems a bit of a gap in the market.
Martin G: You also have a number of legacy applications out there, Chris, where people are trying to bolt encryption in, and they don’t necessarily want to be reliant on the storage vendor’s encryption, or they want to make sure it’s completely transparent. So we’re seeing increasing use of operating system plugins; things like Guardium are beginning to come in to write sensitive data. It means it’s encrypted all through the data path, down to writing it onto the drive as well. People are worried enough, and we’ve seen this, Chris, because I remember when we were talking to Emulex or [crosstalk 00:26:29] about how we encrypted HBAs, and we said, well, that’s never going to be a requirement. It turns out it’s becoming a requirement.
Chris Evans: Funny, that was quite a few years ago now. At the time it seemed like it wasn’t really necessary from where it was being done, but actually now it seems a bit more likely.
Erik K: Yeah. One more comment to cap off the discussion around encryption. I think it comes back to economics. If you could have all of your data encrypted, you would do it if it had the right economic profile and it didn’t compromise any of the other aspects of the system. So that’s the holy grail, I think, in many environments.
Martin G: Yeah. And if you had a properly scalable key management system as well, because everybody’s also terrified of losing their keys.
Erik K: Yep.
Chris Evans: Yeah, makes sense. Okay, well let’s go on and try and think then. If we were sitting together thinking about how people are going to build multi-petabyte systems in the future, what exactly do we think is going to be the right approach? We’ve got so much of a choice of different media to choose from. We’ve got challenges around things like encryption, but we have the benefits of things like compression and we know that they apply differently to different media types. We have to cope with things like endurance. There’s a lot to think about in terms of building a platform. In lots of respects, there are different use cases, so should we really be trying to build one platform to rule them all or is there room in this market for different solutions for different problems?
Erik K: I think for mainstream enterprises there’s at least room for a core-type platform versus an edge-type platform and, at least in my view, you look at the core as kind of a larger centralized building block. You’re talking potentially petabyte or multi-petabyte scale as a typical increment. Whereas at the edge you’re talking in the terabytes, but you’re talking potentially much higher velocity of data, much lower latency. That’s where that starts feeling more like all-flash or Optane-based, DRAM-based type solutions. I think those are two broad buckets for primary data and then there’s a whole secondary storage and cloud angle that we can talk about as well.
Martin G: Yeah, I tend to agree. I think in most general enterprises that’s a pretty good operational model. I think we will see some specialist types of storage. I’m surprised we’ve not seen anything come out specifically targeted at Splunk as far as I’m aware, so things like that, Splunk and Elasticsearch, where people are looking at how we optimize for those workloads. Those workloads are generating so much data and people do want the data analyzed and they want to be able to respond to it very quickly. Things like security data, if you want to know when something’s going on in your environment. So I do wonder if at some point somebody’s going to come up with some branded storage for that kind of workload. Whether it’s really that special underneath the covers, that will be interesting to see, whether it’s an actual proper architecture as opposed to a marketecture.
Erik K: Do you think that justifies a whole company unto its own or do you think that’s just an extension of an existing product line?
Martin G: I think it’s just an extension of an existing product line. I could see you getting funding to do it, so I could see it being a product and I could see some VC throwing money at somebody if they thought they had some special solution. I would tend to agree, and I think that’s what you’re implying, that it is just an extension. But I could see somebody actually trying to do it.
Erik K: Well, in today’s VC climate, if you can get the money you should go for it. But it just dovetails with how we’re thinking about that space in particular, since we are pretty much the only company doing a large-scale mixed media architecture. It turns out some of the large consultancies that we work with are working to bake in an architecture just for that use case, actually.
Chris Evans: But surely, if you look at the way that products get developed and then they get translated into companies or they get absorbed into existing companies, we have ideas that come along for new market segments. Object storage solutions can be one. All-flash systems could be another, scale-out could be another, etc. But those products tend to get absorbed into or bought by a company, and then added to a portfolio. And not to pick on any one person or company, but you can look at, say, HPE, who now have some products internally but have a whole partnership of platforms that they can choose from when they’re building solutions for customers. And that gives the customer lots of choice but actually could also create a bit of confusion because they’re all very different but also the same.
Erik K: Yeah, that’s the tradeoff, right? You can be a jack of all trades, master of none, which is the approach that many of the incumbent companies had traditionally taken prior to this latest wave of acquisitions, or you can be a kind of specialist limited to niche use cases. And the holy grail is landing right in the middle of that spectrum and addressing the broadest possible market.
Martin G: I was going to say, Erik, or you can be a hyperscaler and you can just write a service which pretends to be super special but isn’t really behind the scenes, and people think it’s special because it’s easy to consume.
Erik K: That was exactly where I was going. Scale is one way to transform quote-unquote niche problems into general use cases and that’s one of the areas where we think there’s room for sustainable differentiation in the larger-scale end of the market. Most of the storage architectures out there today were designed with roughly hundred-terabyte design points. There are only a couple of exceptions that I can think of. So there’s a lot of room for innovation at the high end of the storage spectrum.
Chris Evans: So it sounds, from what we’ve just been discussing, it’s really interesting that it’s unlikely we’re going to see the end of new products and new solutions coming to market because, as we see the requirements, they change slightly. We have new media available. We can build things slightly differently, and we’re always building for newer challenges like scale and so on. So in actual fact it’s unlikely we’re going to see the end of new solutions coming to market because there’s always an opportunity.
Martin G: Yeah, I think so. I think Erik made a good point, actually: many of the existing systems out there were designed to scale to a hundred terabytes. And even that was unthinkable when some of these systems were being designed. So as we see scale become more and more of a thing, I would expect to see more and more new products come to market. And with the explosion in the number of media types, yeah, it’s going to continue. It’s going to continue on, and we will see lots and lots of new products. Some which will be great, some which will just be based on existing designs but re-wrapped. So I think it’s still going to be a very interesting market.
Erik K: Couldn’t agree more. I think the cycle of innovation in data storage is never-ending, honestly, with data growth maintaining the massive rate that it is. And I don’t see that abating anytime soon. So I think there’s lots of opportunities for innovation and the winners will be the ones who can best harness the different storage technologies to achieve the right economic portfolio for the broadest amount of workloads.
Chris Evans: Okay. Final question then, but it’s a futurist question, which may be almost impossible to answer. Do we see all future systems then being based on either disk or flash or can we not predict that because we don’t know what might be coming around the corner?
Erik K: So I’m going to put a timeframe on this, just because it’s impossible to predict things 50 years down the road. So I really appreciate some of the research that Gartner actually provides. And I think it was Joe Unsworth, their main flash guy, who basically stated that there’s going to be a three X or higher cost per bit gap between hard drives and flash-based systems, on the capacity-optimized end of the spectrum, through at least 2028. So I think based on that economic fact alone, your mixed media approaches are going to be the reality for anyone operating at significant scale. Whether they get that from a single vendor or whether they choose to assemble different solutions into their overall storage picture would depend on the philosophy and engineering resources of the end user, but I think, just based on those economics, you’re going to see mixed media systems continue to dominate.
Martin G: Yeah, I would agree. I think that’s going to be the case through 2028, with the way data keeps exploding at the moment. Whether this is a good thing or a bad thing, unless somebody comes up with a way of actually slowing that data growth, hard drives or magnetic media of some sort are going to be around for a long time. That said, at some point somebody might work out that there is no value in keeping all this data, and then that could be it for the hard drive, because we don’t need quite so much data.
Chris Evans: Yeah, I think that’s unlikely to happen the way we’re going at the moment. We’d have to have some sort of really, really clever process for filtering and managing our data. So maybe that will never happen.
Martin G: But we need somebody smart enough to call it the emperor’s new clothes and say there is not as much value in this data as you think.
Chris Evans: Sure, that sounds like a conversation for another time, but for now I think we can leave with the message that we’re going to see complex systems being built with many different types of media, both disk drives and flash and Optane.
Martin G: Yeah. Nice. Easy, neutral answer. Good.
Chris Evans: Okay. Well any final thoughts, Erik?
Erik K: I’m good. It’s been a great discussion and looking forward to further opportunities to differentiate on that scale side of things.
Chris Evans: So Martin, Erik, thanks for your time. Great conversation, as usual. Erik, if anybody wants to follow up with you online and stalk you or find out where you are, where should they head?
Erik K: LinkedIn or infinidat.com.
Chris Evans: Perfect. Okay guys, well thanks so much and look forward to catching up again in the future. Cheers.
Martin G: Cheers, Chris. Thanks.
Related Podcasts & Blogs
- #113 – The Expanding Storage Hierarchy with Erik Kaulberg
- #130 – Making Money in the Storage Business
- #138 – Storage Predictions for 2020 (Part I)
- Building a Golden Data Repository
Copyright (c) 2016-2020 Storage Unpacked. No reproduction or re-use without permission. Podcast episode #I4FE.