Why We Need Open, Data-Centric Computing Architectures


So I'm going to go through and talk about RISC-V, but before I do that I want to spend a little time giving you some context, because everything I'm going to go through today is not new news. I know that normally when we have a press audience we want to say "here's the news, here's the big bang," but the reason we wanted to do this is that this is a long game for us. The work we're doing around RISC-V is quite foundational to where we're going over the next five to seven years, and we thought it was worthwhile for you to understand why we're doing this and how it fits into the overall strategy, so that as we start to introduce new products and new capabilities that take advantage of our position on RISC-V, you have some sense of why.

Now, the other thing that might be interesting is that, historically, we have largely been a storage company. As Dave just said, and as you're hearing in the videos, we think the future is about data, and that puts us in a very unique position where we can do more with the data and make the devices more intelligent.

It might also seem a little strange that a data company, or a storage company, depending on where you want to place us on the timeline, is talking about processors. Isn't that somebody else's game? Are we getting into the processor business? The key point to understand is that processing, for us, is a means to an end. It is not the target. So as I go through this, one of the things I want you to internalize is that we are not going into the processor business. We do not want to be in the processor business. We do not expect to make money from processors; that is not the goal. But we believe we need to bring the processing closer to the data in order to fully realize the possibilities of data, and, to use our tagline, to create an environment for data to thrive. That's our motivation.

Now let me add another element. If you pay attention to the compute space, you know we hit the gigahertz limits on processing quite a long time ago. Think of your favorite Intel or ARM processor: we're capped out at around three to four gigahertz, so getting more out of your software by simply turning up the gigahertz crank is a game that's long gone. In fact that game was over by the early 2000s, when we turned to parallelism. Parallelism continues, but we're also reaching the limits of how much you can accomplish through parallelism alone. So we think there's another transition coming, and that's going to be the core of what you're going to see: from general purpose to special purpose.

To give you a preview of what that means: think of general purpose as the processors we see in our servers, desktops, laptops, phones, and so on. The processors of today. They do general-purpose processing for general-purpose activities. Special-purpose processors, by contrast, are tuned and optimized for a specific activity and a specific task.
Today that would usually be a custom ASIC, but ASICs have very long development cycles, they're not programmable, and they're a little too specific to one task. So we need to get to the in-between, where we can mix and match the power requirements, the memory requirements, the I/O requirements, and so on to the task at hand. That's really what we're trying to accomplish here: optimizing the processing to the task at hand.

And the reality is you're already seeing this. It's already happening, so I'm not promoting some radical new concept nobody has ever thought of. Let me give you some examples of how it shows up today. GPGPU: why aren't we using the x86 CPU to do a lot of the machine learning work? Why are we using a GPGPU? Because it is more tuned, more apt to deal with that problem. Why did Google create the Tensor Processing Unit, or TPU? TensorFlow is a Python package; you can download it and run it on anything. Why not just do that? Because it is not optimized and cannot do as good a job as when it runs on something optimized for it. And there are other examples: Microsoft is doing a lot of work with FPGAs to optimize various parts of the data center. So this move from the general-purpose world to the special-purpose world is not really that new. But because so much of the world's data lands on our devices, we think we sit in a place where we can really make a difference and change a big part of the data center architecture.

As I said, the world today is largely built around this general-purpose, CPU-centric architecture. Everything revolves around the CPU, and when you buy a server or a PC today, it is essentially architected around the limits and constraints the processor has imposed on you. Let me give you one of my favorite examples. Anybody here good at math? Two to the forty-sixth power: 64 terabytes. That is the maximum amount of memory you can connect to an Intel x86 processor today, because it has 46 physical address bits. (If you want to check that arithmetic, there's a quick sketch right after this paragraph.) Is that necessarily a bad thing? For a lot of workloads it's plenty. But if you want to make a different choice, you can't; that option is simply not available to you. And the theme continues, whether it's I/O lanes or how much memory you can connect. If you've ever opened up a server and looked at where the DRAM is and how much DRAM is in there, it's because the processor has dictated how many DRAM sticks you can use and the total amount of DRAM that server can hold.

It's all very much dictated by what the processor can do, and we want to turn that on its head. There's still a need for that model, so I don't want this to be us versus them, but we think there are places where we need to start with what the data needs, and then optimize around the bandwidth, the throughput, the latency, and the amount of memory and storage we need to solve the problem. The general-purpose, CPU-centric world continues to exist and will continue to exist for a long time. We'll just say forever.
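Before I move on, here's that quick arithmetic check as a couple of lines of Python. It's nothing vendor-specific, just the math behind the 64-terabyte figure:

```python
# 46 physical address bits -> 2**46 directly addressable bytes.
max_bytes = 2 ** 46
print(max_bytes)                   # 70368744177664 bytes
print(max_bytes / 2 ** 40, "TiB")  # 64.0, the "64 terabytes" ceiling
```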
Anyway, we just think there's an expansion: another set of workloads around big data and fast data that gives us an opportunity to think about the problems very, very differently. The general-purpose architecture is a really good example of how everything is tied to, and bound by, what the processor vendor (and I'm not trying to pick on anybody) has decided is most optimal for that PC, desktop, or server. It usually revolves around what you can do with the CPU, and it's not really designed around the data problem you're trying to solve. And if you have something that doesn't fit that model, which is how we see all of the workloads of the future, the model doesn't work so well. As soon as you start thinking about machine learning, AI, and analytics at scale, it breaks down, and that's why some of the big players have been saying they're going to start doing something different, whether it's GPGPUs or TPUs or FPGAs.

The analogy we use to help people understand this is the four-door sedan as the general-purpose case. It's a great car; there's nothing wrong with it. But the reality is that there are problems we need to solve as humans for which the four-door sedan is not the answer, so we need to bring other tools to the job. That can mean something like a nice cruise ship. Anybody been on a cruise? It's kind of cool, a great way to visit a lot of places in a short amount of time, and you don't need to unpack... anyway. On the fast data side, you might want to process things in a radically different way: your personal jet, your speedboat, whatever. The reality is that not every problem can be solved with a four-door sedan, and what we're saying here is that not all of us want to solve all of our problems with a general-purpose processor. That's our opportunity.

So now we want to start thinking about purpose-built, data-centric architectures. How do we start from the data we're storing, and then optimize for that particular workload? Because the reality now is that with the amount of data out there (Steve, I can't remember how many zettabytes you said, 140, 150 zettabytes), the data is getting so huge that you get categories of workloads that are unbelievably large all by themselves.

One of my favorite examples is very simple, and a little contrived, but you'll get the point. We have a very large surveillance business. In our surveillance business you capture lots of video, you store all of it on hard drives, and then you try to figure out what happened from a surveillance perspective: were there any events you need to respond to? The general process right now is to take all of the video data you've collected and send it off to some central server that does general-purpose processing to find out what happened. And as you know, in most surveillance applications you have a camera pointed at a hallway or something like that, and 99 percent of the video frames don't change. They're actually identical.
Nothing ever changes. If you got robbed, there are only a couple of frames where the bad person, or the event, whatever it might be, showed up. So why are we moving gigabytes and terabytes of data from a video surveillance system to a central server to do all that? Why don't we do all of that processing out at the edge, and why don't we create a processing element tuned specifically to find out whether or not an interesting event occurred? Over time, as machine learning algorithms get more and more capable, we can start putting a lot of intelligence there. And if you want to take that example even further: how do I process this stuff in the camera proper, before it ever lands on the server, and only send the interesting stuff out? (There's a toy sketch of this idea a little further down.) That's special-purpose processing right at the edge. These are all opportunities we have to start thinking about how we compress the data, how we look at the data, and how we bring intelligence to the data.

Notice on this chart we have storage centric versus memory centric. On the storage-centric side, the goal is to move more compute closer to the data. For those of you who do simple MapReduce kinds of things: how do you move the processing for the map phase right next to the hard drive where all the data is, do all of your mapping very close to the data, and only have the reduce phase happen centrally, at the server level or at a central network level? Doing all the mapping right on the device itself would be an example of storage-centric compute.

The memory-centric compute side is a personal passion of mine, because I believe memory is one of the most constrained resources we have in the data center, and one for which there still exist huge opportunities. For those of you who know that I worked at HP on The Machine project: we built a single system image with 380 terabytes of main memory, and that was just the physical machine we built. Architecturally there was, quote unquote, "no limit to the amount of main memory you can put in." The entire programming paradigm changed, and a lot of stuff came out of that work, but we were seeing numbers in excess of 8,000x performance improvement by moving a workload from being I/O-bound to being completely in-memory. As long as we're constrained to the DRAM DDR model of today, though, we won't see these kinds of capabilities. Getting to memory fabrics, and being able to address very large pools of memory while still dealing with the realities of the laws of physics, distance, and memory locality, gives us huge opportunities to look at memory and memory-semantic data flow in a radically different way. That happens when we start thinking about the data first, versus "well, my processor only lets me have two DIMM sticks," which is what you're dealing with today.
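Going back to the surveillance example, here's that toy sketch: a minimal, hypothetical change-detection filter of the kind you could run right at the edge. The names and the threshold are made up for illustration, and a real system would use proper motion detection or a machine learning model rather than a raw pixel delta:

```python
import numpy as np

# Hypothetical edge-side filter: only forward frames that differ
# meaningfully from their predecessor. Threshold and names are invented.
CHANGE_THRESHOLD = 2.0  # mean absolute pixel delta considered "interesting"

def is_interesting(prev: np.ndarray, frame: np.ndarray) -> bool:
    delta = np.abs(frame.astype(np.int16) - prev.astype(np.int16))
    return float(delta.mean()) > CHANGE_THRESHOLD

def filter_stream(frames):
    """Yield the first frame, then only frames worth sending upstream."""
    prev = None
    for frame in frames:
        if prev is None or is_interesting(prev, frame):
            yield frame  # ship this one to the central server
        prev = frame     # identical hallway frames get dropped here

# 99 identical "hallway" frames plus one frame with an event in it:
hallway = np.zeros((480, 640), dtype=np.uint8)
event = hallway.copy()
event[200:280, 300:380] = 255  # something moved into view
stream = [hallway] * 99 + [event]
print(sum(1 for _ in filter_stream(stream)), "of", len(stream), "frames sent")
```

The detector itself is beside the point; the point is that the gigabytes of unchanged hallway never have to leave the device.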
And before I go on, let me give you an example, because I see Phil in the back of the room. Just to give you an idea of the conversations I have had with Phil's team: you saw him go through the ActiveScale architecture, a fantastic object store. I went through it with the team in Belgium, and it's very impressive stuff. But the reality is that from a hardware perspective we're using x86 servers there, with a bunch of hard drives connected internally through PCI channels, all that good stuff. So the brainstorming conversation with the team was: what if you didn't have any of that? What if you have a network pipe, a network cable that comes off, and from there I give you 100% flexibility in how you get from there to the device, or to the data more specifically? You don't have to go through a general-purpose processor, you don't have to ask permission to get on the PCI lanes, and you don't have to restrict yourself on the amount of memory; maybe you want to use some non-volatile memory instead. You completely change the architecture of, and I'm picking on Phil here, what his object store would look like. That's the opportunity we see: to completely change the architecture. It changes everything, from form factor, to how data is stored, to the performance levels we deal with, to how we handle resiliency, even taking things like Reed-Solomon encoding and embedding it right into the processor. (There's a deliberately simplified sketch of that resiliency idea after this paragraph.) That's where the opportunities become available. I'm just trying to give you a sense of the kinds of brainstorming conversations we have internally, and the opportunities just keep coming at us.

I talked a little bit about the big data side: massive storage, and pushing a lot of the processing to the device. When you get to the fast data side, there's also a correlation with the edge. We spent the last decade bringing everything into the cloud, and then people figured out there's a whole bunch of applications where "send everything to the cloud" doesn't quite work anymore, and we now need to do the data analytics and data processing where the data is generated. Things like: how do I process all the data that comes off an aircraft engine, and do it in real time? So this edge compute, done out there close to the data, is where we see the IoT opportunity, and more industrial IoT types of applications. And in the memory-centric compute world, think of things like security detection and event correlation. If I went to Steve Philpott, our CIO, and said, "hey, you can take your entire event stream, fit it in main memory, and start doing event correlation to find zero-day events," that's a big, big deal. Those are the kinds of applications we're going after.
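On that resiliency point, embedding something like Reed-Solomon in the device, here's a deliberately oversimplified stand-in: RAID-5-style single-parity XOR. Real Reed-Solomon is a far stronger code that can rebuild multiple lost shards; this toy version rebuilds exactly one, but it shows the shape of the idea:

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def parity_of(shards):
    """Compute a single parity shard over equal-length data shards."""
    return reduce(xor_bytes, shards)

def rebuild(surviving, parity):
    """Recover the one missing shard from the survivors plus parity."""
    return reduce(xor_bytes, surviving, parity)

data = [b"shard-A1", b"shard-B2", b"shard-C3"]
parity = parity_of(data)
lost = data.pop(1)                     # pretend a device died
assert rebuild(data, parity) == lost   # ...and the data comes back
print("recovered:", rebuild(data, parity))
```

Doing this kind of coding in a special-purpose engine next to the media, instead of hauling everything through a host CPU, is exactly the sort of thing that brainstorm was about.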
So the other element of RISC-V that I haven't talked about so much yet is this notion of openness. For those of you who don't know my history, I spent five years at HP being the Linux and open-source guy, and I wrote a book on open source. So this open source thing was very comfortable for me; it was something I understood well. And when I started to understand RISC-V as open source, this notion of a community development model got me very excited about the possibilities. What it meant was that we don't have to do this all ourselves. Rather than take somebody else's architecture and somebody else's IP and maybe try to bring it in-house, or develop our own IP from scratch and then own all of that, or count on a single vendor as a dependency, we can apply in many ways the same model as the Linux ecosystem, and get a whole ecosystem of silicon developers to collaborate on a CPU architecture.

Just to give you a little bit of an update: we have been designing two small cores, so we have two cores under development, and we're about to start our third. The two cores are at the smallish end; for those of you who keep track of this stuff, think ARM M0- and M1-class cores. We went very quickly, because we wanted to learn how to do this. This will be used inside a storage controller for things like USB sticks or client devices, but our intent is to open-source that core. We're working now so that, hopefully by the end of the year, we can take all of this core development work and put it into the open source community, so people can start with that design and make it better, expand it, customize it, create libraries, what have you, and use it in their own environments. We're really taking that spirit of openness, the spirit of open source, and bringing it to the processor world. And again, because we're not in the processor business and we're not trying to sell processors, this is all good for us, because our business model is all about helping our customers monetize their data.

I mentioned the data-centric applications at the edge already; we see a whole new class of applications and opportunities there. And one of the other advantages of this open source nature is, again, that we don't have to do it all ourselves. We can work with partners. We can say: start with our core, and how do we help you evolve the core we've built for your specific robotics application, or your surveillance system? How do we build that relationship with the customer, and how do we provide integration with our world-class storage devices, so they can optimize for things like weight and power consumption? In fact, for the first set of cores we're doing for our own devices, the key value proposition is power consumption. People say, "okay, you took out an existing core and put this new core in; who cares?" If I'm selling a client SSD and I go to one of our customers like Dell or HP or whatever and say, "hey, buy my drive, it's got RISC-V in it," they're going to look at me and go, "yeah, sure, why would I care?" The reason we want them to care is that it delivers a new value proposition, in this case lower power, and power is a big, big deal for us.

As we went through this process, we looked at every architecture out there, we looked at every option out there, and what won us over was this combination of being able to customize the environment.
We can customize the instruction set, which you can't do with the big guys. You can do some instruction set customization with things like Tensilica or ARC, but you can't with, say, ARM or x86. There's the ability to tune it for either a big data or a fast data environment, and the fact that we can extend it with custom instructions: we are already looking at custom instructions we can embed in our cores to do special-purpose processing for our devices. And there's the ability for this thing to scale from the very small to the very large. Just in case we have geeks in the room: the "16-bit" is a little bit of a misnomer. It's actually a 32-bit architecture, but with instruction compression you can get individual instructions down to 16 bits. (There's a quick bit of arithmetic on that at the end of this section.) Just in case somebody was going to call me on it. Anyway, we saw RISC-V as a great opportunity to solve the next generation of problems around both big data and fast data; it ticked so many boxes as we went through this.

But then the kicker, or let's call it the downside, because there are always upsides and downsides, is that this was a fairly new environment, not a mass deployment. So we basically had to take it upon ourselves to build out the ecosystem around RISC-V, and that's the reason we announced early. We announced last November, I think it was, and told the world what we were doing, because we wanted to help kick-start that ecosystem. The second reason is what I'm doing here today: establishing this foundation so that as we make progress over the years, you have some of this context for what we're doing. And the third thing, quite frankly, again learning from the open source background, is that it's really hard to do open source development in secret. That's an oxymoron; those things don't fit together. So I said no, we're going all-in on this open source thing, and that means we're going to do the open source work in public. We're going to let everybody know what we're doing, and hopefully spend some time getting people to understand it. We still get lots of questions: why didn't you do this, why didn't you do that, isn't this hard? And it's all good. In fact I appreciate the questions; they help us fine-tune our message so that people can understand. So we're very excited about this.

Now, a little bit of a repeat here: really, we're focused on the two end parts. And here's what I'll call the weird piece. RISC-V can be a general-purpose processor. We are not in that business, and that's not our intent, so that's not how we're going to use it. There will be plenty of people out there who develop RISC-V processors and potentially go after the general-purpose marketplace. It's all good; it's just not what we're focused on as a company. We're focused on the edges of this. It's good to see some of the applications we're focused on: the ideas around machine learning, and genomics especially. With these kinds of applications, we are out of gigahertz. We can continue to lean on parallelism to some degree, but to really get that next step function in performance on these problems, we need to start building processing capabilities that only do that one thing really, really well.
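For the geeks, here's the promised arithmetic on the instruction-compression aside. RV32 base instructions are four bytes, and the compressed extension adds two-byte encodings for common instructions. The 60% figure below is an assumption for illustration, roughly in the ballpark of what the RISC-V community reports:

```python
# If a fraction f of instructions have 16-bit compressed encodings,
# average instruction size falls from 4 bytes to 4*(1-f) + 2*f.
def avg_instr_bytes(f: float) -> float:
    return 4 * (1 - f) + 2 * f

print(avg_instr_bytes(0.6))  # 2.8 bytes/instruction, ~30% smaller code
```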
We've already got evidence that special-purpose processing works, so we want to keep doing it, and we can tune it to the environment; in some cases it'll be less about performance and more about power, those kinds of applications. So we see that RISC-V addresses a lot of these various applications, and it's kind of interesting. And I think I've talked through the core piece. I've already got Phil on the hook (you're all in, Phil, it's all good, okay: object store 2022), so we've seen it at the core, at the regional level, and remote at the edge, etcetera.

There's another part of what made RISC-V so interesting to us: it allows us to use one common architecture. And I'm going to say something that may sound a little conflicting. On one side, no part of our decision was a cost decision. In other words, there was no part of our decision-making process that said, "hey, this particular core we're using is too expensive, and this will allow us to save cost." That was completely outside of, and not part of, the decision-making. But at the same time, we do realize that once we get the bulk of the R&D teams focused on the same real-time operating system, the same core instruction set, the same firmware tools, and the same build tools, all of those things do in the end give us something: it's less about cost and more about flexibility and efficiency in the organization, and the ability to have product flexibility and design new things at a rate and pace we wouldn't have been able to achieve before. So that was also very interesting to us.

So this was the contribution we wanted to make. As I said, we announced early, and we informed the world that today we ship about a billion processor cores a year. That is our current volume of processor cores. And we have made a commitment as a company that over the next number of years, with the right roadmap intersections with our products, we will transition the vast majority of our products over to RISC-V. In that process we also see a significant amount of growth in devices, so that the billion cores a year we ship today becomes a couple of billion cores a year. We also wanted to make it clear that we are going all-in, to tell the rest of the industry that you can come all-in with us.

So this gives you a sense of the various things we're after. We're doing development of open source IP: we went very quickly and developed a core, but we also want to work with the community at large, and we have other tools and capabilities that we'll be releasing to the open source community to help the RISC-V movement. And we are partnering and investing. We've invested in SiFive, that's an announcement we've made, and we've invested in Esperanto, and we have others in the queue that we're investing in. I think there's another one, but I can't talk about it yet. Surprise! News: there's another one coming. We're going to continue to do that, again in order to help the ecosystem.
So we're going to do this thing that's going to feel inconsistent: we're investing in processor companies like Esperanto and SiFive, but we don't want to be in the processor business. Our goal is really to grow the ecosystem and make that happen, so we're going to do this development around processors. We're developing our own cores, and in some cases we'll use other people's cores. Developing our own cores was a very important part of the process, for us to learn what it takes to do this, but there are other cases, like Esperanto, which is doing a very interesting GPGPU-style part: 4,000 RISC-V mini cores tied to sixteen big cores. A very interesting processor, and we see a lot of potential for that RISC-V processor in the machine learning space. That might be an example of something we use from the outside. But again, this is a long game for us. This is not "hey, new product next week"; this is a platform foundation for the next number of years.

The good news is we're not alone, and we think our announcement had quite an impact. Before we announced, I believe the member count was eighty or ninety, something like that, below 100, and now we've seen a number of companies come on board. We have Mellanox in the networking space, we have Tesla in the automotive space, we have folks like NXP and Qualcomm coming in, and we're talking to all of them. One of the other beauties of open source is working with competitors. In fact, the day after our announcement I had a wonderful multi-hour phone call with Samsung, and we talked all about RISC-V and open source. That's just another, I'll say fun, part of doing this. But we're at 129 members and growing, so other people are seeing the opportunity too.

So this is what it means to innovate for the data-centric world: big data and fast data (hopefully that theme is coming through today in all the presentations you've seen), plus openness and working with the community. We think the rate and pace of innovation in, say, the Linux world is dramatically faster than any one vendor could have achieved, but there's a bit of a flywheel effect. Right now we're in that early stage of turning the crank, turning the flywheel, and we're working as hard as we can to get onto the path of really rapid innovation, at a rate and pace in line with something like the Linux operating system. And we're bringing the full power of Western Digital to the table with a billion cores a year, and taking it from there. We're very excited about doing this.
