Routing

Part 1, Chapter 2


Chapter two covers the core functionality of a load balancer: Routing. Although, there are a number of ways to handle routing, we'll focus only on the following types:

  1. host-based
  2. path-based

Why Do We Need Routing?

Routing is a general concept in networking which refers to the path a packet takes to reach a destination. In our case, when we implement routing in a load balancer we're referring to how HTTP requests reach other collections of servers (or different backend servers). Different routing strategies can be used to control the behavior of the load balancer.

What's Host-based Routing?

With host-based routing, requests are directed based on the request host header. If the header is not found then the load balancer returns a 404 to the client.

Host-based routing

In the above example, the client on the left sends their host header as "www.mango.com" and the load balancer directs them to the mango backend servers. The client on the right, meanwhile, sends "www.notmango.com" and gets a "404 Not Found" back because the load balancer only recognizes "www.mango.com" and "www.apple.com".

So how does one set the host header? There are two ways -- explicit and implicit. The explicit way is to simply pass it in your HTTP client.

For example:

curl -H 'Host: www.google.com' 172.217.169.68

The implicit way is:

curl www.google.com

In the implicit case, curl will set the host header for you as it resolves the DNS. For the explicit case we have to pass the header ourselves otherwise we get a redirect 301 code. In the example above, www.google.com is a slightly more complicated case since it has a CDN (Content Distribution Network) in front, so we are not hitting the load balancer directly.

What's Path-based Routing?

Path-based routing relies on the URI to send the request to the backend servers.

For example:

https://www.mywebsite.com/apples
https://www.mywebsite.com/mangoes

Let's break down the above:

  1. https is the protocol
  2. www.mywebsite.com is the fully qualified domain name (FQDN)
  3. /apples and /mangoes are the paths

If someone requests the /apples path, we'll direct the request to the apples backend servers and if someone hits the /mangoes path we'll direct them to the mangoes backend server.

Path-based routing is a very common pattern for creating microservices to break a large web application into a smaller one. As with the above example, host-based and path-based routing complement each other: Essentially, path-based routing relies on host-based routing.

Implementing Host-Based Routing

Tests

Before we configure the backends up, we need to write some tests:

# test_loadbalancer.py

import pytest

from loadbalancer import loadbalancer


@pytest.fixture
def client():
    with loadbalancer.test_client() as client:
        yield client


def test_host_routing_mango(client):
    result = client.get('/', headers={'Host': 'www.mango.com'})
    assert b'This is the mango application.' in result.data


def test_host_routing_apple(client):
    result = client.get('/', headers={'Host': 'www.apple.com'})
    assert b'This is the apple application.' in result.data


def test_host_routing_notfound(client):
    result = client.get('/', headers={'Host': 'www.notmango.com'})
    assert b'Not Found' in result.data
    assert 404 == result.status_code

We removed the test_hello test and added three new tests. Did you notice that the last test also looks at the HTTP status code to confirm that we're returning the correct status code to the client? Running the above will result in all three tests failing which is what we want.

(env)$ python -m pytest

==================================================== test session starts =====================================================
platform darwin -- Python 3.9.0, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/michael/repos/testdriven/http-load-balancer
collected 3 items

test_loadbalancer.py FFF                                                                                               [100%]

========================================================== FAILURES ==========================================================
__________________________________________________ test_host_routing_mango ___________________________________________________

client = <FlaskClient <Flask 'loadbalancer'>>

    def test_host_routing_mango(client):
        result = client.get('/', headers={'Host': 'www.mango.com'})
>       assert b'This is the mango application.' in result.data
E       AssertionError: assert b'This is the mango application.' in b'hello'
E        +  where b'hello' = <Response 5 bytes [200 OK]>.data

test_loadbalancer.py:14: AssertionError
__________________________________________________ test_host_routing_apple ___________________________________________________

client = <FlaskClient <Flask 'loadbalancer'>>

    def test_host_routing_apple(client):
        result = client.get('/', headers={'Host': 'www.apple.com'})
>       assert b'This is the apple application.' in result.data
E       AssertionError: assert b'This is the apple application.' in b'hello'
E        +  where b'hello' = <Response 5 bytes [200 OK]>.data

test_loadbalancer.py:19: AssertionError
_________________________________________________ test_host_routing_notfound _________________________________________________

client = <FlaskClient <Flask 'loadbalancer'>>

    def test_host_routing_notfound(client):
        result = client.get('/', headers={'Host': 'www.notmango.com'})
>       assert b'Not Found' in result.data
E       AssertionError: assert b'Not Found' in b'hello'
E        +  where b'hello' = <Response 5 bytes [200 OK]>.data

test_loadbalancer.py:24: AssertionError
================================================== short test summary info ===================================================
FAILED test_loadbalancer.py::test_host_routing_mango - AssertionError: assert b'This is the mango application.' in b'hello'
FAILED test_loadbalancer.py::test_host_routing_apple - AssertionError: assert b'This is the apple application.' in b'hello'
FAILED test_loadbalancer.py::test_host_routing_notfound - AssertionError: assert b'Not Found' in b'hello'
===================================================== 3 failed in 0.17s ======================================================

Multiple Backends

In order to test out the load balancer's routing functionality, we need to set up multiple servers for each of our hosts. To simplify this, we'll use both Docker and Docker Compose:

  1. Docker is a containerization tool designed to simplify the development and deployment of an application.
  2. Docker-compose is a tool used for defining and running multi-container Docker applications. Each of our backends will run in a separate container.

Let's start off by creating a very basic Python web server with Flask in a new file called app.py:

# app.py

import os

from flask import Flask

app = Flask(__name__)


@app.route('/')
def sample():
    return f'This is the {os.environ["APP"]} application.'


if __name__ == '__main__':
    app.run(host='0.0.0.0')

The above application takes an environment variable, APP, and simply returns it when the / URL is hit. This allows us to quickly identify the application.

Next, add a Dockerfile:

# Dockerfile

FROM python:3
RUN pip install flask
COPY ./app.py /app/app.py
CMD ["python", "/app/app.py"]

Build the image:

$ docker build -t server .

After pulling the Python 3 base image, we installed Flask, copied app.py over to the container, and ran the app.

We can use this image to spin up multiple containers with Docker Compose:

# docker-compose.yml

version: '3'
services:
  mango1:
    image: server
    environment:
      - APP=mango
    ports:
      - "8081:5000"
  mango2:
    image: server
    environment:
      - APP=mango
    ports:
      - "8082:5000"
  apple1:
    image: server
    environment:
      - APP=apple
    ports:
      - "9081:5000"
  apple2:
    image: server
    environment:
      - APP=apple
    ports:
      - "9082:5000"

To spin up the containers, run:

$ docker-compose up -d

You should see:

Creating network "http-load-balancer_default" with the default driver
Creating http-load-balancer_mango1_1 ... done
Creating http-load-balancer_apple1_1 ... done
Creating http-load-balancer_apple2_1 ... done
Creating http-load-balancer_mango2_1 ... done

If you get errors in the above step you might have something else running on the 8081, 8082, 9081 and 9082 ports. If that's the case, change the port numbers to the ones that are free on your operating system.

Test the applications to ensure they work as expected:

$ curl localhost:8081
This is the mango application.

$ curl localhost:8082
This is the mango application.

$ curl localhost:9081
This is the apple application.

$ curl localhost:9082
This is the apple application.

Bring the containers down:

$ docker-compose down

Next, to automate spinning the containers up and then tearing them down, add a Makefile to the project root:

# Makefile

test:
        docker-compose up -d
        pytest --disable-warnings || true
        docker-compose down

Makefiles use tabs rather than spaces. So, if you see Makefile:2: *** missing separator (did you mean TAB instead of 8 spaces?). Stop. when you run make test be sure to convert the indentation in your code editor to tabs for the Makefile.

Also note that we added an || true to the pytest command so that that regardless of the exit code, we'll ignore it and teardown our containers. We want our tests to not leave anything behind so we can run them again without any manual intervention.

Run make test. You should see a similar output to this:

(env) make test

docker-compose up -d
Creating network "http-load-balancer_default" with the default driver
Creating http-load-balancer_mango1_1 ... done
Creating http-load-balancer_apple1_1 ... done
Creating http-load-balancer_apple2_1 ... done
Creating http-load-balancer_mango2_1 ... done

pytest --disable-warnings || true
==================================================== test session starts =====================================================
platform darwin -- Python 3.9.0, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/michael/repos/testdriven/http-load-balancer
collected 3 items

test_loadbalancer.py FFF                                                                                               [100%]

========================================================== FAILURES ==========================================================
__________________________________________________ test_host_routing_mango ___________________________________________________

client = <FlaskClient <Flask 'loadbalancer'>>

    def test_host_routing_mango(client):
        result = client.get('/', headers={'Host': 'www.mango.com'})
>       assert b'This is the mango application.' in result.data
E       AssertionError: assert b'This is the mango application.' in b'hello'
E        +  where b'hello' = <Response 5 bytes [200 OK]>.data

test_loadbalancer.py:14: AssertionError
__________________________________________________ test_host_routing_apple ___________________________________________________

client = <FlaskClient <Flask 'loadbalancer'>>

    def test_host_routing_apple(client):
        result = client.get('/', headers={'Host': 'www.apple.com'})
>       assert b'This is the apple application.' in result.data
E       AssertionError: assert b'This is the apple application.' in b'hello'
E        +  where b'hello' = <Response 5 bytes [200 OK]>.data

test_loadbalancer.py:19: AssertionError
_________________________________________________ test_host_routing_notfound _________________________________________________

client = <FlaskClient <Flask 'loadbalancer'>>

    def test_host_routing_notfound(client):
        result = client.get('/', headers={'Host': 'www.notmango.com'})
>       assert b'Not Found' in result.data
E       AssertionError: assert b'Not Found' in b'hello'
E        +  where b'hello' = <Response 5 bytes [200 OK]>.data

test_loadbalancer.py:24: AssertionError
================================================== short test summary info ===================================================
FAILED test_loadbalancer.py::test_host_routing_mango - AssertionError: assert b'This is the mango application.' in b'hello'
FAILED test_loadbalancer.py::test_host_routing_apple - AssertionError: assert b'This is the apple application.' in b'hello'
FAILED test_loadbalancer.py::test_host_routing_notfound - AssertionError: assert b'Not Found' in b'hello'
===================================================== 3 failed in 0.14s ======================================================

docker-compose down
Stopping http-load-balancer_mango2_1 ... done
Stopping http-load-balancer_apple2_1 ... done
Stopping http-load-balancer_apple1_1 ... done
Stopping http-load-balancer_mango1_1 ... done
Removing http-load-balancer_mango2_1 ... done
Removing http-load-balancer_apple2_1 ... done
Removing http-load-balancer_apple1_1 ... done
Removing http-load-balancer_mango1_1 ... done
Removing network http-load-balancer_default

Implementation

Time to make our tests pass. We'll us the Requests library to send the HTTP request to the backend server. Start by installing it:

(env)$ pip install requests

Update the load balancer application:

# loadbalancer.py

import random

import requests
from flask import Flask, request

loadbalancer = Flask(__name__)

MANGO_BACKENDS = ['localhost:8081', 'localhost:8082']
APPLE_BACKENDS = ['localhost:9081', 'localhost:9082']


@loadbalancer.route('/')
def router():
    host_header = request.headers['Host']
    if host_header == 'www.mango.com':
        response = requests.get(f'http://{random.choice(MANGO_BACKENDS)}')
        return response.content, response.status_code
    elif host_header == 'www.apple.com':
        response = requests.get(f'http://{random.choice(APPLE_BACKENDS)}')
        return response.content, response.status_code
    else:
        return 'Not Found', 404

First, we defined our backend servers for our hosts which are based on the Docker Compose file we generated earlier. Then we fetched the Host header and ran an if condition to check if it matches our criteria. If we got a match, we then chose a backend server at random and sent an HTTP request and returned the HTTP response body and status code back to the client. If we didn't get any matches, we returned "Not Found" along with a 404 status code.

Let's run our tests:

(env)$ make test

They should pass

==================================================== test session starts =====================================================
platform darwin -- Python 3.9.0, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/michael/repos/testdriven/http-load-balancer
collected 3 items

test_loadbalancer.py ...                                                                                               [100%]

===================================================== 3 passed in 0.22s ======================================================

We have successfully implemented host-based routing. Now let's do the same for path-based routing.

Implementing Path-Based Routing

Tests

Add the following new tests to test_loadbalancer.py:

def test_path_routing_mango(client):
    result = client.get('/mango')
    assert b'This is the mango application.' in result.data


def test_path_routing_apple(client):
    result = client.get('/apple')
    assert b'This is the apple application.' in result.data


def test_path_routing_notfound(client):
    result = client.get('/notmango')
    assert b'Not Found' in result.data
    assert 404 == result.status_code

We don't care about the host header in this case.

Implementation

# loadbalancer.py

...

@loadbalancer.route('/mango')
def mango_path():
    response = requests.get(f'http://{random.choice(MANGO_BACKENDS)}')
    return response.content, response.status_code


@loadbalancer.route('/apple')
def apple_path():
    response = requests.get(f'http://{random.choice(APPLE_BACKENDS)}')
    return response.content, response.status_code

All six tests should now pass:

==================================================== test session starts =====================================================
platform darwin -- Python 3.9.0, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/michael/repos/testdriven/http-load-balancer
collected 6 items

test_loadbalancer.py ......                                                                                            [100%]

===================================================== 6 passed in 0.22s ======================================================

Conclusion

We've gone through what host and path-based routing is, how to write tests for both cases, and how to implement them. Whilst setting up the tests, we used Docker Compose to spin up multiple backend servers for hosts supported by our load balancer.

See you in chapter 3!




Mark as Completed