Configuration

Part 1, Chapter 3


Chapter three covers the implementation of a configuration file for our load balancer. We'll explore why there's a need for configuration files and what problems they help to solve. Towards the end we'll also see how our TDD approach helps us move faster.

Why Config Files

Configuration files are common for all sorts of applications whether they are exposed via GUIs or modified by hand. They give us the ability to tweak applications for different needs and allows us to change the behavior for different environments. We can also keep our configuration out of our code which results in less code changes.

Introducing Change

Before making any changes, let's take a step back to review: We currently have a load balancer that works as expected and we're about to introduce a fairly major change. This is where our tests will play a big part. We can verify that after we implement this change our tests still pass. If they still pass we can be reasonably confident that we won't break anything or change the behavior of our application.

Implementation

Host-based Routing

Instead of hardcoding the following backend servers for each host, we'll define them in our configuration file:

MANGO_BACKENDS = ["localhost:8081", "localhost:8082"]
APPLE_BACKENDS = ["localhost:9081", "localhost:9082"]

Create a loadbalancer.yaml file:

# loadbalancer.yaml

hosts:
  - host: www.mango.com
    servers:
      - localhost:8081
      - localhost:8082
  - host: www.apple.com
    servers:
      - localhost:9081
      - localhost:9082

To use this file in our code, install PyYAML in the Dockerfile:

FROM python:3
RUN pip install flask pyyaml
COPY ./app.py /app/app.py
CMD ["python", "/app/app.py"]

Next, update loadbalancer.py:

# loadbalancer.py

import random

import requests
import yaml
from flask import Flask, request


loadbalancer = Flask(__name__)


def load_configuration(path):
    with open(path) as config_file:
        config = yaml.load(config_file, Loader=yaml.FullLoader)
    return config


config = load_configuration('loadbalancer.yaml')


@loadbalancer.route('/')
def router():
    host_header = request.headers['Host']
    for entry in config['hosts']:
        if host_header == entry['host']:
            response = requests.get(f'http://{random.choice(entry["servers"])}')
            return response.content, response.status_code
    return 'Not Found', 404

...

Here, we imported the yaml module and defined a load_configuration function to load loadbalancer.yaml into a Python dictionary. The router function then iterates through the hosts list defined in our configuration file and checks if the host key matches the Host header. If there's a match we sent a request to the backend servers defined in the servers otherwise we return a 404.

Build the image:

$ docker build -t server .

We can now run our tests:

(env)$ pip install pyyaml
(env)$ make test

Four out of six should pass:

==================================================== test session starts =====================================================
platform darwin -- Python 3.9.0, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/michael/repos/testdriven/http-load-balancer
collected 6 items

test_loadbalancer.py ...FF.                                                                                            [100%]

========================================================== FAILURES ==========================================================
__________________________________________________ test_path_routing_mango ___________________________________________________

client = <FlaskClient <Flask 'loadbalancer'>>

    def test_path_routing_mango(client):
        result = client.get('/mango')
>       assert b'This is the mango application.' in result.data
E       assert b'This is the mango application.' in b'<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">\n<title>500 Internal Server Error</title>\n<h1>Internal Serv...nd was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>\n'
E        +  where b'<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">\n<title>500 Internal Server Error</title>\n<h1>Internal Serv...nd was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>\n' = <Response 290 bytes [500 INTERNAL SERVER ERROR]>.data

test_loadbalancer.py:30: AssertionError
----------------------------------------------------- Captured log call ------------------------------------------------------
ERROR    loadbalancer:app.py:1891 Exception on /mango [GET]
Traceback (most recent call last):
  File "/Users/michael/repos/testdriven/http-load-balancer/env/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/Users/michael/repos/testdriven/http-load-balancer/env/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Users/michael/repos/testdriven/http-load-balancer/env/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/Users/michael/repos/testdriven/http-load-balancer/env/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/Users/michael/repos/testdriven/http-load-balancer/env/lib/python3.9/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Users/michael/repos/testdriven/http-load-balancer/env/lib/python3.9/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/Users/michael/repos/testdriven/http-load-balancer/loadbalancer.py", line 31, in mango_path
    response = requests.get(f'http://{random.choice(MANGO_BACKENDS)}')
NameError: name 'MANGO_BACKENDS' is not defined
__________________________________________________ test_path_routing_apple ___________________________________________________

client = <FlaskClient <Flask 'loadbalancer'>>

    def test_path_routing_apple(client):
        result = client.get('/apple')
>       assert b'This is the apple application.' in result.data
E       assert b'This is the apple application.' in b'<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">\n<title>500 Internal Server Error</title>\n<h1>Internal Serv...nd was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>\n'
E        +  where b'<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">\n<title>500 Internal Server Error</title>\n<h1>Internal Serv...nd was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>\n' = <Response 290 bytes [500 INTERNAL SERVER ERROR]>.data

test_loadbalancer.py:35: AssertionError
----------------------------------------------------- Captured log call ------------------------------------------------------
ERROR    loadbalancer:app.py:1891 Exception on /apple [GET]
Traceback (most recent call last):
  File "/Users/michael/repos/testdriven/http-load-balancer/env/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/Users/michael/repos/testdriven/http-load-balancer/env/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Users/michael/repos/testdriven/http-load-balancer/env/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/Users/michael/repos/testdriven/http-load-balancer/env/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/Users/michael/repos/testdriven/http-load-balancer/env/lib/python3.9/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Users/michael/repos/testdriven/http-load-balancer/env/lib/python3.9/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/Users/michael/repos/testdriven/http-load-balancer/loadbalancer.py", line 37, in apple_path
    response = requests.get(f'http://{random.choice(APPLE_BACKENDS)}')
NameError: name 'APPLE_BACKENDS' is not defined
================================================== short test summary info ===================================================
FAILED test_loadbalancer.py::test_path_routing_mango - assert b'This is the mango application.' in b'<!DOCTYPE HTML PUBLIC ...
FAILED test_loadbalancer.py::test_path_routing_apple - assert b'This is the apple application.' in b'<!DOCTYPE HTML PUBLIC ...
================================================ 2 failed, 4 passed in 0.29s =================================================

Path-based Routing

Update the config file like so:

# loadbalancer.yaml

hosts:
  - host: www.mango.com
    servers:
      - localhost:8081
      - localhost:8082
  - host: www.apple.com
    servers:
      - localhost:9081
      - localhost:9082
paths:
  - path: /mango
    servers:
      - localhost:8081
      - localhost:8082
  - path: /apple
    servers:
      - localhost:9081
      - localhost:9082

We can now combine mango_path and apple_path into a single view function:

@loadbalancer.route('/<path>')
def path_router(path):
    for entry in config['paths']:
        if ('/' + path) == entry['path']:
            response = requests.get(f'http://{random.choice(entry["servers"])}')
            return response.content, response.status_code
    return 'Not Found', 404

We iterated through the paths key and matched it with the path given in the URL, and then sent a request to the matching backend. If nothing is matched then we returned a 404.

loadbalancer.py should now look like this:

# loadbalancer.py

import random

import requests
import yaml
from flask import Flask, request

loadbalancer = Flask(__name__)


def load_configuration(path):
    with open(path) as config_file:
        config = yaml.load(config_file, Loader=yaml.FullLoader)
    return config


config = load_configuration('loadbalancer.yaml')


@loadbalancer.route('/')
def router():
    host_header = request.headers['Host']
    for entry in config['hosts']:
        if host_header == entry['host']:
            response = requests.get(f'http://{random.choice(entry["servers"])}')
            return response.content, response.status_code
    return 'Not Found', 404


@loadbalancer.route('/<path>')
def path_router(path):
    for entry in config['paths']:
        if ('/' + path) == entry['path']:
            response = requests.get(f'http://{random.choice(entry["servers"])}')
            return response.content, response.status_code
    return 'Not Found', 404

The tests should pass:

(env)$ make test

==================================================== test session starts =====================================================
platform darwin -- Python 3.9.0, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/michael/repos/testdriven/http-load-balancer
collected 6 items

test_loadbalancer.py ......                                                                                            [100%]

===================================================== 6 passed in 0.22s ======================================================

Conclusion

We implemented a configuration file for our load balancer application and covered reasons for why adding a configuration file makes development more flexible. Taking a TDD approach paid off for us because we were able implement this change knowing that the behavior of our application did not change.




Mark as Completed