IU Cloud Computing MOOC: CometCloud 1: In Support of Clouds

Are we ready to go? I’m pleased to start the fourth day of the summer school. We have a great presentation coming from Professor Manish Parashar of Rutgers University on the CometCloud system, and that will be followed by two contributions on integrating HPC with Clouds. With that I’d like to hand over to Manish, who’s ready to go at Rutgers.

(Manish Parashar)
But anyway, I will start from the outline again. This is a brief outline of how today is going to progress. I’m going to spend the next forty-five minutes or so talking very briefly about Cloud computing and then, in more detail, about how Cloud computing can support CDS&E applications, that is, Computational and Data-Enabled Science and Engineering applications. Following that, we’ll give you an introduction to CometCloud, which is a Cloud federation infrastructure that we have developed here at Rutgers University. There’s a URL on your slide. It basically provides support for running applications on federated infrastructure, which includes Clouds. We’ll go through CometCloud in some detail this morning, and we’ll end that with a demo movie demonstrating the capabilities of CometCloud. After that we’ll present some material to set up for this afternoon’s hands-on session, which will consist of two applications that you will develop using CometCloud and run on FutureGrid. [pause]

So, moving on to the introduction, I wanted to start with a definition of Cloud computing, and I know it’s late in the
week, so you are probably very familiar with this, but I wanted to set the context before we move ahead. Here are two definitions. The top one is picked up from Wikipedia, and the next one is a slightly more formal definition coming from NIST. But basically there are two ways one can look at Clouds.

One can look at Clouds as a business model, wherein you have some infrastructure that you can use as a Service. This typically consists of infrastructure hosted in a consolidated, virtualized data center. You access it using web services, and your access is governed by a set of SLAs, which basically define how much you pay, what guarantees you get in return, and possibly what happens if you don’t get what’s promised to you. This is the business model that’s been used by companies such as Microsoft and Amazon, and a lot of others, to provide services at different levels, including infrastructure, platforms, software components, and entire applications.

There’s another way you can look at Clouds, more as an abstraction, and that’s really being able to access resources (these could be computing, data, or software resources) as a Service in an elastic, on-demand manner. What that really means is that, from your application, you can think about the resources that are accessible to you, and you can add more as you need them. This elastic abstraction is an interesting way to think about your problems independently of the infrastructure, because you can get more resources as you go. So looking at the Cloud
abstraction, independent of any specific infrastructure, is also very important.

Okay, so, moving on, the focus here is really to see how Clouds can be used for science, and this slide goes into some detail. It builds on the observation that Clouds are very quickly joining more traditional cyber-infrastructure, by which I mean grids and HPC systems, supercomputers, clusters, and data stores, as viable platforms for scientific exploration and discovery. This reminds me of a point that Geoffrey made in his introductory slides, which highlighted the fact that Clouds are significantly larger in terms of capability than even the fastest supercomputer that exists right now. So if Clouds are an important part of our integrated cyber-infrastructure, we really need to understand what application formulations and usage modes are meaningful on this kind of infrastructure. Are there new capabilities that Clouds can provide? Can we formulate our problems differently because we have these abstractions that we can use? And how can applications really utilize this infrastructure, in terms of what services, what middleware capabilities, and what programming systems we need to make applications use this hybrid infrastructure more
effectively.

Looking at some possible usage modes, and this is definitely not exhaustive, one can think of Clouds having the same impact on Computational and Data-Enabled Science and Engineering that they have had in the enterprise space, where they can take care of, or offload, a lot of the tedious aspects of managing and configuring the infrastructure itself. That allows scientists to focus on their problems, increasing efficiency and productivity, and possibly cost-effectiveness in many cases.

As I mentioned earlier, Cloud abstractions can also potentially support new classes of algorithms, new formulations, and new usage modes that were not possible before, because of the unique capabilities they provide. For example, you could spawn off resources or services for online analytics as needed: when you see something interesting happening, or you want to classify the uncertainty in some data, you could run those on the fly on dynamically acquired resources or services. So the elasticity and on-demand nature that Clouds provide can allow you to think about more interesting problem formulations.

Democratization is another dimension. What I mean here is that the Cloud abstraction allows more uniform access to resources, without necessarily depending on the local resources at the site where you are. It is well established and well documented that access to resources has a tremendous impact on research productivity. So giving potentially everybody access to resources through these service abstractions removes this differentiator between the researchers who have access to such resources and those who don’t. In a sense, this can lead to a democratization of research that is computationally and data intensive. [pause]

This really goes back to one of the earlier points: when I’m solving a problem and running my application, I don’t necessarily have to limit it to the amount of resources that I have locally.
So if I have only ‘x’ gigabytes of memory or ‘y’ cores, I would otherwise have to formulate my problem to fit that. Instead, I can say, ‘Okay, what does my science need? If there’s interesting physics going on, I need a lot more memory to resolve the features, and I need a lot more compute to solve my problem.’ Well, I can elastically go and get them when I need them. So I don’t have to artificially limit the resolution based on the system, and this could potentially lead to more interesting formulations. Clearly, implementing them right requires appropriate abstractions for science: they have to expose Cloud services that can be integrated with applications more seamlessly.

So clearly there are advantages to integrating Clouds into science as platforms for scientific exploration and discovery, and there are also many challenges. There’s a research agenda that many people are exploring quite actively, looking at the application types and capabilities that can be supported by Clouds. Would adding Cloud services to the infrastructure allow you to do things that you couldn’t do otherwise? Can it allow new ways to formulate your applications? And what abstractions, services, and middleware platforms are essential to support these formulations and usage modes on these hybrid platforms? This is just a subset of the research challenges that need to be addressed to make federated hybrid cyber-infrastructure that includes Clouds usable by the science community.
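
As a rough illustration of the point about ‘x’ gigabytes of memory and ‘y’ cores, the arithmetic can be sketched in a few lines of Python. This is only a sketch under stated assumptions and is not part of CometCloud: the function names `nodes_needed` and `extra_nodes_to_acquire`, the cubic grid, the 8 bytes per cell, and the node sizes are all hypothetical, chosen only to show how the resolution the science calls for could translate into resources to acquire elastically.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def nodes_needed(cells_per_side: int, node_memory_gib: int,
                 bytes_per_cell: int = 8) -> int:
    """Nodes required to hold an n x n x n grid entirely in memory.

    Assumption: one value of `bytes_per_cell` bytes per grid cell and no
    other memory overhead (a real application would need more).
    """
    required_bytes = cells_per_side ** 3 * bytes_per_cell
    node_bytes = node_memory_gib * GIB
    # Ceiling division with integers, so there is no rounding error.
    return -(-required_bytes // node_bytes)


def extra_nodes_to_acquire(cells_per_side: int, local_nodes: int,
                           node_memory_gib: int) -> int:
    """Nodes to request on demand beyond what is available locally."""
    needed = nodes_needed(cells_per_side, node_memory_gib)
    return max(0, needed - local_nodes)


if __name__ == "__main__":
    # Hypothetical site: 2 local nodes with 4 GiB of memory each.
    for n in (1024, 2048):
        print(n, nodes_needed(n, 4), extra_nodes_to_acquire(n, 2, 4))
```

With these assumed numbers, a grid of 1024 cells per side fits on the two local nodes, while doubling the resolution to 2048 cells per side needs 16 nodes, so 14 would have to be acquired on demand rather than shrinking the problem to fit.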
