.. title: Follow Conventions to Build Infrastructure
.. slug: follow-conventions-to-build-infrastructure
.. date: 2016-08-30 05:00:16 UTC
.. updated: 2016-08-30 05:00:16 UTC
.. tags: centos, continuous integration, docker, packaging, salt, ubuntu
.. category:
.. link:
.. description:
.. type: text

Any sufficiently large software system depends on a lot of third-party artifacts, from operating systems to libraries to pre-built packages and everything in between. For example, deploying a single web server running a Django application requires pulling in a Linux OS, Python, Django, Gunicorn, nginx, etc. Whether this complete package is deployed on a virtual machine (VM) under your control or your customers', you want to follow conventions throughout the entire lifecycle of the application.

.. TEASER_END: Read more

Let's start with the OS and build on that. Say CentOS becomes the OS of choice. You choose its minimal version for development, testing, and production deployment. Some packages you need are not provided in the base repository (repo), so you turn to additional repos like EPEL and IUS.

How do you make sure you can reliably stand up a development environment? A popular choice these days is Vagrant. I've been using it for a while as well and like it. It has its own conventions to follow, though. You can build a CentOS box yourself, but the community convention is to pull in official boxes if you don't need to heavily customize the minimal install. You certainly don't want to spend your time building Vagrant boxes when the community has already done the hard work for you.

Next you may want to use Docker to easily package and deploy your artifacts. Here again you should follow the convention of a single process per container. Since this convention is widely accepted, trying to treat a container as a mini VM goes against it. You'll find far better help through blogs, forums, etc. if you stick to conventions here.
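Pulling an official box instead of building your own can be as small as this sketch of a Vagrantfile; `centos/7` is the community-maintained minimal box on Vagrant Cloud, while the hostname and IP are hypothetical values for illustration:

```ruby
# Vagrantfile -- use the official minimal CentOS box rather than building one
Vagrant.configure("2") do |config|
  config.vm.box = "centos/7"                     # official box from Vagrant Cloud
  config.vm.hostname = "dev-web"                 # example hostname, pick your own
  config.vm.network "private_network", ip: "192.168.50.10"
end
```

A `vagrant up` from the directory containing this file downloads the box once and reuses it for every environment your team stands up.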
You now have a good way to stand up a development environment for yourself and others. You used what the Vagrant community built, and you followed Docker conventions to make your team's life and your own easier.

The next step is packaging your application regularly so it can be deployed to other members of your team, to a continuous integration (CI) environment, and maybe to a continuous deployment (CD) environment. How do you package your application? Once again, you follow conventions. In our example it's a Django application, so you do what the wider community has accepted as its convention. Thus your package fits easily into the ecosystem you have chosen for your development process: Vagrant, Docker, CI, CD. Your CI process deploys a Vagrant box, packages your application as a Docker image, deploys it as a container, runs tests on it, and then deploys to a CD environment. When you follow conventions at each step of this process, you'll likely encounter fewer bugs, and other members of the team will find it easier to understand and contribute.

I have recently asked myself a few questions about when to deviate from conventions, and by how much. Let's take nginx as an example. Should I build a custom rpm package for my uses or use what is provided in EPEL? To reduce the burden of maintaining a package's lifecycle, I use the one provided by EPEL. But what if it gets upgraded in EPEL? Do I have enough control over when packages get upgraded in my environment? I mitigate that risk by mirroring EPEL within my own network and selectively releasing packages to the mirror. This is a widely used convention, but it introduces overhead: my team takes on more duties because we want to control our risk. The same thing happens with Docker images. I should run a private registry from which I pull my images, and release newer upstream images to it when I'm ready. This is also a well-adopted convention. The problem comes when I have to customize the nginx package for deployment.
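Pointing hosts at an internal mirror is a matter of dropping in a repo file like the sketch below; the mirror hostname and key path are placeholders for whatever your own network uses:

```ini
# /etc/yum.repos.d/internal-epel.repo -- hosts pull EPEL packages only from
# the internal mirror, where packages are released selectively
# (mirror.example.internal is a placeholder for your own mirror host)
[internal-epel]
name=Internal EPEL mirror
baseurl=http://mirror.example.internal/epel/7/$basearch/
enabled=1
gpgcheck=1
gpgkey=http://mirror.example.internal/keys/RPM-GPG-KEY-EPEL-7
```

Disabling or removing the stock `epel.repo` file on managed hosts then ensures upgrades only arrive when you release them to the mirror.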
Enter configuration management. I use something, say SaltStack, to configure nginx in development, testing, and production. I take the upstream package, install it in a Docker image, and use Salt to customize the container when it's deployed. Thus the configuration "truth" stays in one place: Salt. Do not add customized configuration to a cloned nginx rpm package. Do not create dozens of Docker images with customized configuration in each. Use a configuration management tool to do the thing it's made to do. This is the conventional use of modern tools. Of course, if you use your config management tool to create config packages, that would work well too.

I have witnessed Dockerfiles that create VM-like images with multiple processes and all configuration done right there. This complicates life when deployment does not match the assumptions made at image build time. Keeping things simple and following conventions reduces your chances of making such mistakes.

In your CI system, as you package your application, think of the various artifacts that are useful when deploying it. Start with a base VM. Since we're running CentOS, our artifact must be an rpm. How difficult is it to also create a deb package so it can install on Ubuntu? In the grand scheme of things: not that difficult. Do not create a Docker image artifact in the same build. Instead, kick off a secondary build that creates two Docker images: one using a base CentOS image that installs your application from the rpm package, and the other using an Ubuntu image that installs your application from the deb package. Next, deploy all these artifacts to the appropriate environments, configure them with Salt, and run tests.

List of "Do Nots":

* Do not install and run multiple services in the same Docker container.
* Do not replicate the job of packages in a Dockerfile: for example, creating users, copying files, setting permissions, etc.
* Do not create SysV init service files.
* Do not add configuration steps in a Dockerfile.
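A Dockerfile that respects the list above stays short, because the rpm does the packaging work and Salt does the configuring. A sketch for the nginx example:

```dockerfile
# Single-process container: the nginx rpm (from EPEL) creates the user, lays
# down files, and sets permissions; configuration is applied at deploy time
# by Salt, not baked into the image.
FROM centos:7
RUN yum -y install epel-release && \
    yum -y install nginx && \
    yum clean all
EXPOSE 80
# Run nginx in the foreground as the container's single process
CMD ["nginx", "-g", "daemon off;"]
```

Note there is no `COPY nginx.conf` line and no second service: the image is the same for every environment, and only the deploy-time configuration differs.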
List of "Dos":

* Install and run one service per Docker container.
* Create users, copy files to the right locations, set file permissions, etc. within the rpm or deb package.
* Create systemd unit files.
* Be ready to deploy your application on a single bare-metal server, a single VM, multiple VMs, or multiple Docker containers.
* Configure a Docker image during deployment using a configuration management system, or install config packages (rpm, deb, etc.).

List of "You May":

* You may take an upstream vanilla source rpm or deb package and break it up into multiple packages: application source, config files, systemd unit files. Then create customized packages according to the needs of each deployment. This is helpful in industries where each install is manual, can't be touched for months on end, and has no Internet connectivity. Each deployment thus shares the same application and unit-file packages but has its own config-file package.
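For the Django application from the beginning of the post, the "create systemd unit files" advice might look like the sketch below, shipped inside the rpm or deb package. The `myapp` name, user, and paths are hypothetical examples:

```ini
# /usr/lib/systemd/system/myapp.service -- shipped in the rpm/deb package,
# which also creates the "myapp" user and the /opt/myapp tree
[Unit]
Description=Gunicorn server for the Django application
After=network.target

[Service]
User=myapp
Group=myapp
WorkingDirectory=/opt/myapp
ExecStart=/opt/myapp/venv/bin/gunicorn --workers 3 --bind 127.0.0.1:8000 myapp.wsgi
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Because the unit file lives in the package, the same artifact runs under systemd on bare metal, on a VM, or as the single foreground process of a container, and the config-file package or Salt supplies the per-deployment settings.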