Distributed Computing


Distributed computing is a relatively recent development in computation that changes the way we
approach a lot of problems. What was once basically impossible is now attainable with relative ease.
In today’s episode, we’ll take a look at some definitions and concepts behind distributed computing,
and then we’ll move on to some examples, including some that you can participate in. All this and
more after the break!

Hello and welcome back to another random Wednesday episode. Let’s take distributed computing from
the very top. What is it, and why is it such a helpful and useful thing?

Now traditionally, computation is performed on one machine. That is, you have a computer, you give it
input, it processes this input and it throws out some output. This is in fact the very standard thing
that you and I experience every single day when we use our computers, our mobile phones, or whatever
other computational device. Now, it’s true that in our day-to-day lives this is very much enough. But
when we are talking about larger-scale projects, for example if you’re doing 3D graphics or video
rendering, or on an even larger scale, if you are a researcher trying to crack a complicated
scientific problem, chances are the processing power of your single computer may not be enough. A
single computer may be too slow to solve a large problem, and that is where distributed computing
comes in.

The idea is simple: you take a large, complex task, chop it up into little bits, and distribute this
workload over a large number of computers. Each computer only needs to chew through a small job, but
all the computers are working in unison, and as a result you get back the result of your large
computation in far less time. So conceptually, it’s very simple.

However, when it comes to choosing a task that
is actually suitable for distributed computing, some care has to be taken. Certain tasks are not very
suitable for distributed computing, and you can imagine such a task to be something along these
lines: if you have one complicated task that has to be done in a single large step, you may have
difficulty cutting it up into little bits to distribute to a large number of computers. Such a task
may have a large number of computations, but every computation relies on the result of the previous
computation. Once there is this sort of serial pattern involved, it is much harder to distribute the
task.

So the most suitable tasks for distributed computing are parallelizable tasks. That is, a task that
may require a large number of complicated operations, but where many of these operations can take
place independently of each other. What this means is that you take each one of these operations and
simply distribute it out. Since one doesn’t rely on the results of another, they can all be done at
the same time without regard to the others.

The way this is done is simple. Basically
you have a host computer as well as an array of computers that are going to help out with the
distributed computing. The host computer is the computer on which you set up your task and run your
main program. This computer has the job of dividing the work up into little jobs, and then
distributing them out to the rest of the computers. Each computer then processes its little jobs and
sends the results back. The host computer then takes the results of these individual jobs and puts
them all together to generate your final result.

In fact, a lot of the time this communication between the computers happens over a network, and this
network is the very same type we have in our homes. Also, the computers that contribute to the
distributed computing operation can be just any old computers. What this means is that if you have a
couple of spare computers lying around, you can set up your own distributed computing network. And in
fact our first example today illustrates this point.

Our first example is a render farm. Now if you do 3D work yourself, or if you’ve
watched me doing 3D work in my series called Speed Model, which you can watch by the way by clicking
on that link, you’ll know that setting up a 3D scene is fine. It can be done very quickly, and you
will be able to preview and work with the scene basically in real time. However, turning the scene
from what is essentially a collection of polygons into a nice JPEG image that has all the nice
shading effects, reflections and everything is a very time-consuming process.

This process of converting a 3D mesh into an image is called rendering, and a big reason why it’s
such a complicated task is that what is called ray tracing is often performed. Ray tracing is a
technique that tries to model how light works in the real world. This involves tracing individual
rays and modeling the behavior of actual light. For example, when a light ray hits a reflective
surface, it must be reflected at the correct angle. A typical scene involves millions of rays, so you
would expect the task to be a very complicated one.

What many 3D programs do is that they don’t render the entire image as one. Instead, they chop up the
image into little squares, and each one of these squares can be rendered independently of the other
squares in the image. This is where the magic of distributed computing can come in. After the scene
is set up by the host computer, each one of the little squares can be distributed out to what is
known as a render farm. A render farm is just a distributed computing setup that comprises multiple
computers. Of course, to make things more efficient you want a large number of computers, but that’s
not absolutely necessary. Each one of these computers then handles a certain number of squares of the
image, and that is basically distributed computing. You’re offloading one complicated task to
multiple computers.

Now our second example of distributed computing
is a more interesting one because this one also contains elements of crowdsourcing. What this means
is you can participate in this project by installing a copy of the program on your computer. When
you’re not using your computer, that program can run and make use of your computer’s computing power
to help out in the project.

The distributed computing project I’m talking about here is called Folding@home, a research project
that started off at Stanford University. Now very much in brief, because I’m not a biologist, this
project is trying to simulate the folding of proteins. When proteins are synthesized they are
basically just a long strand, but in the body they sort of fold up and form a certain structure. If I
understand this correctly, at present we still have some difficulty understanding this particular
process, which is why we’re trying to throw a lot of computational power at this problem to see if we
are able to better understand this phenomenon. This is a very parallelizable project, which is why
Folding@home was created.

When you download this software, it acts as a screen saver. Every time you’re idle the screen saver
starts to run, and it pulls down a particular folding operation from the Folding@home servers. Your
computer then performs some computation, and the results are returned to the server. Because many
people are contributing to this project, progress is a lot faster than it would be if this wasn’t a
distributed problem. In fact, according to what I’ve been reading online, hundreds of papers have
been produced thanks to the results from the Folding@home project.

So yeah, that is a distributed computing project that you can contribute to. Personally I think it’s
a great thing to do. It’s a great way to get a complicated problem solved in less time, and of course
for us as individuals, it’s nice to be able to donate some idle computing time to the advancement of
science.

So there you have it! That is basically distributed computing in a nutshell. I hope you’ve learned
something today, and I hope, you know, you’re interested in the Folding@home project at the very
least. But yeah, that’s it! Thank you very much for watching, and until next time, you’re watching
0612 TV with NERDfirst.net.
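
For anyone who wants to poke at the host-and-workers idea from this episode, here is a minimal Python sketch of the render farm pattern: the host chops an image into squares, each square is processed without looking at any other, and the host stitches the results back together. Everything in it is an assumption made for illustration. The function names are made up, `shade_pixel` is a stand-in formula rather than real ray tracing, and a local thread pool stands in for the separate networked computers of an actual render farm.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tiles(width, height, tile_size):
    """Host side: chop the image into independent square jobs."""
    tiles = []
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            right = min(left + tile_size, width)
            bottom = min(top + tile_size, height)
            tiles.append((left, top, right, bottom))
    return tiles


def shade_pixel(x, y):
    """Placeholder for the expensive per-pixel work (not real ray tracing)."""
    return (x * 31 + y * 17) % 256


def render_tile(tile):
    """Worker side: render one tile without looking at any other tile."""
    left, top, right, bottom = tile
    pixels = {}
    for y in range(top, bottom):
        for x in range(left, right):
            pixels[(x, y)] = shade_pixel(x, y)
    return pixels


def render_distributed(width, height, tile_size=4, workers=4):
    """Host side: hand out tiles, then stitch the results back together."""
    tiles = split_into_tiles(width, height, tile_size)
    image = [[0] * width for _ in range(height)]
    # Assumption: threads stand in for the separate machines of a render farm.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for pixels in pool.map(render_tile, tiles):
            for (x, y), value in pixels.items():
                image[y][x] = value
    return image


def render_serial(width, height):
    """Single-machine reference: same picture, one pixel after another."""
    return [[shade_pixel(x, y) for x in range(width)] for y in range(height)]


if __name__ == "__main__":
    # The distributed result should match the single-machine result exactly.
    assert render_distributed(10, 6) == render_serial(10, 6)
    print(len(split_into_tiles(10, 6, 4)), "tiles rendered")
```

The reason this splits so cleanly is that `render_tile` never needs another tile's result, which is exactly the "parallelizable" property discussed earlier. If each pixel depended on the one before it, the tiles could not be handed out independently.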
