App Server Scaling – Web Development


Okay, so we've taken a fair amount of load off our database. If we were to go through that whole solution, we'd only be doing database writes, and we'd very rarely be doing a database read at all. This is a big improvement over doing a database read on every page view. So let's go back to what our request does. On every request, remember, we process the request: that's the HTTP, the URLs, and all that stuff. We do the DB query. We collate the results, which for ASCII Chan or the blog doesn't involve much at all. And then we render the HTML. So we've improved the database query, but what if we want to improve these other pieces? Yes, we can use caching for the rendered HTML; that's definitely a technique. But we can also use another technique that handles all three of these parts of the request, which is adding additional app servers.
To date, conceptually, we've basically had one program running that handles all of our requests, in your blog and in ASCII Chan. It's a simple program: requests come in, responses go back out. But if there are so many requests that one machine can't handle them, what we can do is add multiple machines to take up some of the load. The extra requests can then be spread across all of these machines. These machines may be interacting with the database, or they may not be. Presumably each one has its own little cache, the one we just implemented that lives in our program, and this helps.
So how do we get requests to multiple machines? Well, there's a piece of technology that sits between the user and all of your app servers called a load balancer. This is a special machine, a physical machine just like your app server or your database server might be, that's optimized for doing one thing: spreading requests across multiple machines.
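To make the multiple-app-server picture concrete, here is a minimal Python sketch. The class names `Database` and `AppServer`, the `front_page` key, and the way requests are handed out are all made up for illustration; they are not code from the course. Each app server keeps its own in-memory cache, so the shared database sees one read per server rather than one read per request.

```python
class Database:
    """Hypothetical stand-in for the real database; counts reads that reach it."""

    def __init__(self, rows):
        self.rows = rows
        self.reads = 0

    def query(self, key):
        self.reads += 1
        return self.rows[key]


class AppServer:
    """One copy of the app, with its own in-memory cache (an assumption)."""

    def __init__(self, name, db):
        self.name = name
        self.db = db
        self.cache = {}

    def handle(self, key):
        # Step 1, processing the request, is skipped: key is already parsed.
        # Step 2, the DB query, only happens on a cache miss.
        if key not in self.cache:
            self.cache[key] = self.db.query(key)
        # Steps 3 and 4: collate the results and render the HTML.
        return "<p>%s (served by %s)</p>" % (self.cache[key], self.name)


db = Database({"front_page": "latest posts"})
servers = [AppServer("app1", db), AppServer("app2", db), AppServer("app3", db)]

# Hand nine requests to the three servers, one after another.
responses = [servers[i % len(servers)].handle("front_page") for i in range(9)]
```

In this sketch nine requests cause only three database reads, one cache miss per server. Each cache is private to its server, so adding servers also adds a few extra misses.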
So what happens is, this load balancer has a list of all of the app servers in existence, and requests come in from the outside world, many, many of them. The load balancer decides which app server to send each request to: send one here, then one here, then one here, and it keeps going like that. The reason the load balancer can handle all this traffic while the app servers can't is that the load balancer isn't doing anything other than taking in the request, choosing a server, and forwarding the connection along. It doesn't have to parse HTTP, or it may parse only minimal HTTP. It's not doing database queries, it's not rendering HTML, it's not going to the cache. It's doing almost nothing at all other than deciding which server to send a request to.
You probably won't ever have to write one of these, but it's good to know how they work. When you're using App Engine, Google kind of does all of this for you. They'll automatically create new servers running your program and scale. You can actually go into the App Engine admin page and see how many servers they are using to host your app, which is pretty cool. Normally this is a really challenging thing; not knowing how to do it the first time, it took me a little while to figure out when we were scaling Reddit. That doesn't mean I'm not going to make you understand these things a little bit deeper.
So, there are a couple of algorithms a load balancer can use to determine which server to send traffic to. The simplest one is probably just to choose a server at random. A request comes into the load balancer, and the load balancer just picks a server and sends the request there, which will probably work pretty well. Over time, if you have enough requests, each server should get about the same amount of load.
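The random approach can be sketched in a few lines of Python. The server names and the `pick_random` helper are hypothetical, and a real load balancer forwards connections rather than counting strings. The sketch only checks that the load evens out over many requests.

```python
import random
from collections import Counter


def pick_random(servers, rng=random):
    """Choose one server uniformly at random from the balancer's list."""
    return rng.choice(servers)


servers = ["app1", "app2", "app3"]  # hypothetical app server names
rng = random.Random(0)              # fixed seed so the run is repeatable

# Simulate 30,000 incoming requests and count where each one lands.
counts = Counter(pick_random(servers, rng) for _ in range(30000))
```

With this many requests, each server's count should land close to 10,000. With only a handful of requests the split can be quite uneven.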
Another approach is round robin. Round robin just means the load balancer chooses one server at a time, in order: first this guy, then this guy, then this guy, then this guy. That's also a fairly simple algorithm. And then some load balancers are really smart: they know what the load is on each server, how many requests are outstanding at each server, and they may use some sort of load-based algorithm. You know, this guy's already handling like five requests and this guy's not doing anything, so let's send future requests here until things even out. There are lots of approaches to doing this. What I'm going to ask you to do now, in the form of a quiz, is implement a basic round robin algorithm.
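As a rough illustration of the two approaches just described, here is one possible Python sketch. It is not the quiz solution from the course. The class names, the `pick` and `finish` methods, and the server names are assumptions made for this example. The load-based version simply tracks outstanding requests per server and picks the smallest count.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in order, wrapping back to the first one."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastLoadedBalancer:
    """Sends each request to the server with the fewest outstanding requests."""

    def __init__(self, servers):
        self.outstanding = {server: 0 for server in servers}

    def pick(self):
        # On a tie, min() returns the first server in the dict's order.
        server = min(self.outstanding, key=self.outstanding.get)
        self.outstanding[server] += 1
        return server

    def finish(self, server):
        # Call this when the server has sent its response.
        self.outstanding[server] -= 1


rr = RoundRobinBalancer(["app1", "app2", "app3"])
rr_order = [rr.pick() for _ in range(5)]

ll = LeastLoadedBalancer(["app1", "app2"])
ll.outstanding["app1"] = 5  # pretend app1 is already handling five requests
ll_order = [ll.pick() for _ in range(6)]
```

Here `rr_order` cycles through app1, app2, app3 and then starts over. `ll_order` sends the first five requests to the idle app2, and only once the two servers are even does a request go back to app1.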

5 Comments

  • zheng yu

    Good that it explains why a load balancer can handle a massive number of requests compared with an app server. Another thing: sometimes a load balancer is implemented in hardware rather than just software.

  • Boris The Soviet Love-Hammer

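    # Round robin (Python)
    servers = ["A", "B", "C", "D"]
    recent = 0

    def send_request_to(server):
        # Placeholder: a real balancer would forward the connection here.
        print("sending request to", server)

    # A real balancer loops forever; eight requests keep the example finite.
    for _ in range(8):
        send_request_to(servers[recent])
        recent = (recent + 1) % len(servers)  # wrap back to the first server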
