Managing Docker Instances with Puppet

In a previous article, "Provisioning Docker with Puppet", in the December 2016 issue, I covered one of the ways you can install the Docker service onto a new system with Puppet. By contrast, this article focuses on how to manage Docker images and containers with Puppet.

Reasons for Integrating Docker with Puppet

There are three core use cases for integrating Docker with Puppet or with another configuration management tool, such as Chef or Ansible:

  1. Using configuration management to provision the Docker service on a host, so that it is available to manage Docker instances.

  2. Adding or removing specific Docker instances, such as a containerized web server, on managed hosts.

  3. Managing complex or dynamic configurations inside Docker containers using configuration management tools (for example, Puppet agent) baked into the Docker image.

"Provisioning Docker with Puppet", in the December 2016 issue of LJ, covered the first use case. This article is primarily concerned with the second.

Container management with Puppet allows you to do a number of things that become ever more important as an organization scales up its systems, including the following:

  1. Leveraging the organization's existing configuration management framework, rather than using a completely separate process just to manage Docker containers.

  2. Treating Docker containers as "just another resource" to converge in the configuration management package/file/service lifecycle.

  3. Installing Docker containers automatically based on hostname, node classification or node-specific facts.

  4. Orchestrating commands inside Docker containers on multiple hosts.

Although there certainly are other ways to achieve those goals (see the Picking a Toolchain sidebar), it takes very little work to extend your existing Puppet infrastructure to handle containers as part of a node's role or profile. That's the focus for this article.

Picking a Toolchain

Why focus on container management with Puppet? There certainly are other ways to manage Docker instances, containers and clusters, including some native to Docker itself. As with any other IT endeavor, your chosen toolchain both provides and limits your capabilities. For a home system, your choice of toolchain is largely a matter of taste, but in the data center, it's often better to leverage existing tools and in-house expertise whenever possible.

Puppet was chosen for this series of articles because it is a strong enterprise-class solution that has been widely deployed for more than a decade. However, you could do much the same thing with Chef or Ansible if you choose.

Puppet also was selected over other container orchestration tools because many large organizations already make use of at least one configuration management tool. In many cases, it's advantageous to include container management within the existing toolchain rather than climbing the learning curve of a more specialized tool, such as Kubernetes.

If you already use Puppet, Chef or Ansible in your data center, getting started with container management by extending your current toolset is probably smart money. However, if you find yourself bumping up against the limitations of your configuration management tool, you may want to evaluate other enterprise-class solutions, such as Apache Mesos, Kubernetes or DC/OS.

Creating a Test Environment

To follow along with the code listings and examples in the remainder of this article, ensure that Vagrant and VirtualBox are already installed. Next, you'll prepare a set of provisioning scripts to configure a test environment on an Ubuntu virtual machine.

Preparing Your Provisioning Scripts

Create a directory to work in, such as ~/Documents/puppet-docker. Place the Vagrantfile and docker.pp manifest within this directory (see Listings 1 and 2).

The Vagrantfile is a Ruby-based configuration file that Vagrant uses to drive one or more "providers". Vagrant supports VirtualBox, Hyper-V and Docker by default, but it also supports many other providers, such as VMware Fusion, DigitalOcean, Amazon AWS and more. Because the goal is to simulate the management of the Docker dæmon, images and containers on a full-fledged OS, let's focus on the cross-platform VirtualBox provider.

Listing 1. Vagrantfile


Vagrant.configure(2) do |config|
  # Install the official Ubuntu 16.04 Vagrant guest.
  config.vm.box = 'ubuntu/xenial64'

  # Forward port 8080 on the Ubuntu guest to port
  # 8080 on the VirtualBox host. Set the host value
  # to another unused port if 8080 is already in
  # use.
  config.vm.network 'forwarded_port',
                    guest: 8080,
                    host:  8080

  # Install the puppet agent whenever Vagrant
  # provisions the guest. Note that subsequent
  # releases have renamed the agent package from
  # "puppet" to "puppet-agent".
  config.vm.provision 'shell', inline: <<-SHELL
    export DEBIAN_FRONTEND=noninteractive
    apt-get -y install puppet
  SHELL
end

Note that this particular Vagrantfile (Listing 1) installs Puppet 3.8.5, which is the currently supported version for Ubuntu 16.04.1 LTS. Different versions are available as Puppet Enterprise packages or Ruby gems, but this article focuses on the version provided by Ubuntu for its current long-term support release.

Docker.pp is a Puppet manifest that uses a declarative syntax to bring a node into a defined state. The docker.pp manifest (Listing 2) makes use of an officially supported Puppet Forge module that takes care of a great deal of low-level work for you, making the installation and management of the Docker dæmon, images and containers easier than rolling your own.

On some systems, a Puppet manifest can enable a basic Docker setup with only a simple include 'docker' statement. However for this article, you will override specific settings, such as your user name on the guest OS, so that the right user is added to the group that can communicate with the Docker dæmon. You also will override the name of the Docker package to install, as you want to use the Ubuntu-specific "docker.io" package rather than the upstream "docker-engine" package the Puppet module uses by default.

Listing 2. docker.pp


# Most Vagrant boxes use 'vagrant' rather than
# 'ubuntu' as the default username, but the Xenial
# Xerus image uses the latter.
class { 'docker':
  package_name => 'docker.io',
  docker_users => ['ubuntu'],
}

# Install an Apache2 image based on Alpine Linux.
# Use port forwarding to map port 8080 on the
# Docker host to port 80 inside the container.
docker::run { 'apache2':
  image   => 'httpd:alpine',
  ports   => ['8080:80'],
  require => Class['docker'],
}

When placing docker.pp into the same directory as the Vagrantfile, Vagrant will make the Puppet manifest available inside the virtual machine automagically using its synced folder feature. As you will see shortly, this seemingly minor step can pay automation-friendly dividends when provisioning the guest OS.

Provisioning with Puppet Apply

With the Vagrantfile and docker.pp stored in your working directory, you're ready to launch and configure the test environment. Since this article is all about automation, let's go ahead and script those activities too.

Create the shell script shown in Listing 3 in the same directory as the Vagrantfile. You can name it anything you like, but a sensible name, such as vagrant_provisioning.sh, makes it clear what the script does.

Listing 3. vagrant_provisioning.sh


#!/usr/bin/env bash

# Provision an Ubuntu guest using VirtualBox.
vagrant up --provider virtualbox

# Install the officially-supported Docker module
# from the Puppet Forge as a non-root user.
vagrant ssh -c \
    'puppet module install \
     puppetlabs-docker_platform --version 2.1.0'

# Apply our local Docker manifest using the Puppet
# agent. No Puppet Master required!
#
# Note that the modulepath puppet installs to can
# vary on different Ubuntu releases, but this one is
# valid for the image defined in our Vagrantfile.
vagrant ssh -c \
    'sudo puppet apply \
     --modulepath ~/.puppet/modules \
     /vagrant/docker.pp'

# After adding the "ubuntu" user as a member of the
# "docker" group to enable non-root communications
# with the Docker daemon, we deliberately close the
# SSH control connection to avoid unhelpful Docker
# errors such as "Cannot connect to the Docker
# daemon. Is the docker daemon running on this
# host?" on subsequent connection attempts.
vagrant ssh -- -O exit

Make the script executable with chmod 755 vagrant_provisioning.sh, then run it with ./vagrant_provisioning.sh. This will start and configure the virtual machine, but it may take several minutes (and a great deal of screen output) before you're returned to the command prompt. Depending on the horsepower of your computer and the speed of your internet connection, you may want to go make yourself a cup of coffee at this point.

When you're back with coffee in hand, you may see a number of deprecation warnings caused by the Puppet Forge module, but those can be safely ignored for your purposes here. As long as the docker.pp manifest applies with warnings and not errors, you're ready to validate the configuration of both the guest OS and the Docker container you just provisioned.

Believe it or not, with the docker.pp Puppet manifest applied by the Puppet agent, you're already done! You now have a Docker container running Apache, and serving up the default "It works!" document. You can test this easily on your top-level host with curl localhost:8080 or at https://localhost:8080/ in your desktop browser.

Applying Local Manifests with Puppet Agent

By using puppet apply as shown here, you're able to perform the same process that you'd employ in a more traditional client/server Puppet configuration, but without the need to do the following:

  1. Configure a Puppet Master first.

  2. Manage SSL client certificates.

  3. Install server-side modules into the correct Puppet environment.

  4. Specify a Puppet environment for the node you want to manage.

  5. Define roles, profiles or nodes that will use the manifest.

This is actually one of the key techniques for Puppet testing and an essential skill for running a masterless Puppet infrastructure. Although a discussion of the pros and cons of masterless Puppet is well outside the scope of this article, it's important to know that Puppet does not actually require a Puppet Master to function.

Don't be fooled. Although you haven't really done anything yet that couldn't be done with a few lines at the command prompt, you've automated it in a consistent and repeatable way. Consistency and repeatability are the bedrock of automation and really can make magic once you extend the process with roles and profiles.

Controlling Docker with Puppet Roles and Profiles

It may seem like a lot of work to automate the configuration of a single machine. However, even when dealing with only a single machine, the consistency and repeatability of a managed configuration is a big win. In addition, this work lays the foundation for automating an unlimited number of machines, which is essential for scaling configuration management to hundreds or thousands of servers. Puppet makes this possible through the "roles and profiles" workflow.

In the Puppet world, roles and profiles are just special cases of Puppet manifests. It's a way to express the desired configuration through composition, where profiles are composed of component modules and then one or more profiles comprise a role. Roles are then assigned to nodes dynamically or statically, often through a site.pp file or an External Node Classifier (ENC).

Let's walk through a simplified example of what a roles-and-profiles workflow looks like. First, you'll create a new manifest in the same directory as your Vagrantfile named roles_and_profiles.pp. Listing 4 shows a useful example.

Listing 4. roles_and_profiles.pp


####################################################
# Profiles
####################################################
# The "dockerd" profile uses a forge module to
# install and manage the Docker daemon. The only
# difference between this and the "docker" class
# from the earlier docker.pp example is that we're
# wrapping it inside a profile.
class profile::dockerd {
    class { 'docker':
      package_name => 'docker.io',
      docker_users => ['ubuntu'],
    }
}

# The "alpine33" profile manages the presence or
# absence of the Alpine 3.3 Docker image using a
# parameterized class. By default, it will remove
# the image.
class profile::alpine33 ($status = 'absent') {
    docker::image { 'alpine_33':
        image     => 'alpine',
        image_tag => '3.3',
        ensure    => $status,
    }
}


# The "alpine34" profile manages the presence or
# absence of the Alpine 3.4 Docker image. By
# default, it will remove the image.
class profile::alpine34 ($status = 'absent') {
    docker::image { 'alpine_34':
        image     => 'alpine',
        image_tag => '3.4',
        ensure    => $status,
    }
}

####################################################
# Roles
####################################################
# This role combines two profiles, passing
# parameters to add or remove the specified images.
# This particular profile ensures the Alpine 3.3
# image is installed, and removes Alpine 3.4 if
# present.
class role::alpine33 {
    class { 'profile::alpine33':
        status => 'present',
    }

    class { 'profile::alpine34':
        status => 'absent',
    }
}

# This role is the inverse of role::alpine33. It
# calls the same parameterized profiles, but
# installs Alpine 3.4 and removes Alpine 3.3.
class role::alpine34 {
    class { 'profile::alpine33':
        status => 'absent',
    }

    class { 'profile::alpine34':
        status => 'present',
    }
}

####################################################
# Nodes
####################################################
# Apply role::alpine33 to any host with "alpine33"
# in its hostname.
node /alpine33/ {
    include ::role::alpine33
}

# Apply role::alpine34 to any host with "alpine34"
# in its hostname.
node /alpine34/ {
    include ::role::alpine34
}

Note that all the profiles, roles and nodes are placed into a single Puppet manifest. On a production system, those should all be separate manifests located in appropriate locations on the Puppet Master. Although this example is illustrative and extremely useful for working with masterless Puppet, be aware that a few rules are broken here for the sake of convenience.

Let me briefly discuss each section of the manifest. Profiles are the reusable building blocks of a well organized Puppet environment. Each profile should have exactly one responsibility, although you can allow the profile to take optional arguments that make it more flexible. In this case, the Alpine profiles allow you to add or remove a given Docker image depending on the value of the $status variable you pass in as an argument.

A role is the "reason for being" that you're assigning to a node. A node can have more than one role at a time, but each role should describe a singular purpose regardless of how many component parts are needed to implement that purpose. In the wild, some common roles assigned to a node might include:

  • role::ruby_on_rails

  • role::jenkins_ci

  • role::monitored_host

  • role::bastion_host

Each role is composed of one or more profiles, which together describe the purpose or function of the node as a whole. For this example, you define the alpine34 role as the presence of the Docker dæmon with Alpine 3.4 and the absence of an Alpine 3.3 image, but you could just as easily have described a more complex role composed of profiles for NTP, SSH, Ruby on Rails, Java and a Splunk forwarder.

This separation of concerns is borrowed from object-oriented programming, where you try to define nodes through composition in order to isolate the implementation details from the user-visible behavior. A less programmatic way to think of this is that profiles generally describe the features of a node, such as its packages, files or services, while roles describe the node's function within your data center.

Nodes, which are generally defined in a Puppet Master's site.pp file or an external node classifier, are where roles are statically or dynamically assigned to each node. This is where the real scaling power of Puppet becomes obvious. In this example, you define two different types of nodes. Each node definition uses a string or regular expression that is matched against the hostname (or certname in a client/server configuration) to determine what roles should be applied to that node.

In the node section of the example manifest, you tell Puppet to assign role::alpine33 to any node that includes "alpine33" as part of its hostname. Likewise, any node that includes "alpine34" in the hostname gets role::alpine34 instead. Using pattern-matching in this way means that you could have any number of hosts in your data center, and each will pick up the correct configuration based on the hostname that it's been assigned. For example, say you have five hosts with the following names:

  1. foo-alpine33

  2. bar-alpine33

  3. baz-alpine33

  4. abc-alpine34

  5. xyz-alpine34

Then, the first three will pick up the Alpine 3.3 role when they contact the Puppet Master, and the last two will pick up the Alpine 3.4 role instead. This is almost magical in its simplicity. Let's see how this type of dynamic role assignment works in practice.

Dynamic Role Assignments

Assuming that you've already placed roles_and_profiles.pp into the directory containing your Vagrantfile, you're able to access the manifest within the Ubuntu virtual machine. Let's log in to the VM and test it out (Listing 5).

Listing 5. Logging in to the Ubuntu Virtual Machine


# Ensure we're in the right directory on our Vagrant
# host.
cd ~/Documents/puppet-docker

# Ensure that the virtual machine is active. There's
# no harm in running this command multiple times,
# even if the machine is already up.
vagrant up

# Login to the Ubuntu guest.
vagrant ssh

Next, run the roles_and_profiles.pp Puppet manifest to see what happens. Hint: it's going to fail, and then you're going to explore why that's a good thing. Here's what happens:


ubuntu@ubuntu-xenial:~$ sudo puppet apply --modulepath
 ↪~/.puppet/modules /vagrant/roles_and_profiles.pp
Error: Could not find default node or by name with
 ↪'ubuntu-xenial.localdomain, ubuntu-xenial' on node
 ↪ubuntu-xenial.localdomain
Error: Could not find default node or by name with
↪'ubuntu-xenial.localdomain, ubuntu-xenial' on node
 ↪ubuntu-xenial.localdomain

Why did the manifest fail to apply? There are actually several reasons for this. The first reason is that you did not define any nodes that matched the current hostname of "ubuntu-xenial". The second reason is that you did not define a default to be applied when no other match is found. Puppet allows you to define a default, but in many cases, it's better to raise an error than to get a configuration you weren't expecting.

In this test environment, you want to show that Puppet is able to assign roles dynamically based on the hostname of the node where the Puppet agent is running. With that in mind, let's modify the hostname of the Ubuntu guest to see how a site manifest can be used to configure large clusters of machines appropriately based solely on each machine's hostname.

Changing a Linux Hostname

When changing the hostname on a Linux system, it's important to understand that the sudo utility will complain loudly and often if a number of information sources don't agree on the the current hostname. In particular, on an Ubuntu system, the following should all agree:

  1. The hostname stored in /etc/hostname.

  2. The hostname defined for 127.0.1.1 in /etc/hosts.

  3. The hostname reported by /bin/hostname.

If they don't all match, you may see errors such as:


> sudo: unable to resolve host quux

And in extreme cases, you even may lose the ability to run the sudo command. It's best to avoid the situation by ensuring that you update all three data sources to the same value when changing your hostname.

In order to avoid errors with the sudo command, you actually need to change the hostname of your virtual machine in several places. In addition, the hostname reported by the PS1 prompt will not be updated until you start a new shell. The following commands, when run inside the Ubuntu guest, will make the necessary changes:


# Must be exported to use in sudo's environment.
export new_hostname="foo-alpine33"

# Preserve the environment or sudo will lose the
# exported variable. Also, we must explicitly
# execute on localhost rather than relying on
# whatever sudo thinks the current hostname is to
# avoid "sudo: unable to resolve host" errors.
sudo \
    --preserve-env \
    --host=localhost \
    -- \
    sed --in-place \
        "s/${HOSTNAME}/${new_hostname}/g" \
        /etc/hostname /etc/hosts
sudo \
    --preserve-env \
    --host=localhost \
    -- \
    hostname "$new_hostname"

# Replace the current shell in order to pick up the
# new hostname in the PS1 prompt.
exec "$SHELL"

Your prompt now should show that the hostname has changed. When you re-run the Puppet manifest, it will match the node list because you've defined a rule for hosts that include "alpine33" in the hostname. Puppet then will apply role::alpine33 for you, simply because the hostname matches the node definition! For example:


# Apply the manifest from inside the Ubuntu guest.
sudo puppet apply \
    --modulepath ~/.puppet/modules \
    /vagrant/roles_and_profiles.pp

# Verify that the role has been correctly applied.
docker images alpine

REPOSITORY   TAG       IMAGE ID        CREATED         SIZE
alpine       3.3       6c2aa2137d97    7 weeks ago     4.805MB

Ignore "update_docker_image.sh" Errors

When running the Puppet manifest in the example, you may see several errors that contain the following substring:


> update_docker_image.sh alpine:3.4 returned 3 instead of one of [0,1]

These errors currently are caused by upstream bugs in the Puppet Docker modules used in the examples. Bugs have been filed upstream, but can safely be ignored for the immediate purposes of this article. Despite the reported error, the Docker images actually still are being properly installed, which you can verify yourself inside the virtual machine with docker images alpine.

If you want to track the progress of these bugs, please see:

To apply this role to an entire cluster of machines, all you need to do is ensure they have hostnames that match your defined criteria. For example, say you have five hosts with the following names:

  1. foo-alpine33

  2. bar-alpine33

  3. baz-alpine33

  4. abc-alpine33

  5. xyz-alpine33

Then, the single node definition for /alpine33/ would apply to all of them, because the regular expression matches each of their hostnames. By assigning roles to patterns of hostnames, you can configure large segments of your data center simply by setting the proper hostnames! What could be easier?

Reassigning Roles at Runtime

Well, now you have a way to assign a role to thousands of boxes at a time. That's impressive all by itself, but the magic doesn't stop there. What if you need to reassign a system to a different role?

Imagine that you have a box with the Alpine 3.3 image installed, and you want to upgrade that box so it hosts the Alpine 3.4 image instead. In reality, hosting multiple images isn't a problem, and these images aren't mutually exclusive. However, it's illustrative to show how you can use Puppet to add, remove, update and replace images and containers.

Given the existing node definitions, all you need to do is update the hostname to include "alpine34" and let Puppet pick up the new role:


# Define a new hostname that includes "alpine34"
# instead of "alpine33".
export new_hostname="foo-alpine34"

sudo \
    --preserve-env \
    --host=localhost \
    -- \
    sed --in-place \
        "s/${HOSTNAME}/${new_hostname}/g" \
        /etc/hostname /etc/hosts
sudo \
    --preserve-env \
    --host=localhost \
    -- \
    hostname "$new_hostname"
exec "$SHELL"

# Rerun the manifest using the new node name.
sudo puppet apply \
    --modulepath ~/.puppet/modules \
    /vagrant/roles_and_profiles.pp

# Show the Alpine images installed.
docker images alpine

REPOSITORY   TAG       IMAGE ID        CREATED         SIZE
alpine       3.4       baa5d63471ea    7 weeks ago     4.803MB

As you can see from the output, Puppet has removed the Alpine 3.3 image, and installed Alpine 3.4 instead! How did this happen? Let's break it down into steps:

  1. You renamed the host to include the substring "alpine34" in the hostname.

  2. Puppet matched the substring using a regular expression in its node definition list.

  3. Puppet applied the Alpine 3.4 role (role::alpine34) assigned to nodes that matched the "alpine34" substring.

  4. The Alpine 3.4 role called its component profiles (which are actually parameterized classes) using "present" and "absent" arguments to declare the intended state of each image.

  5. Puppet applied the image management declarations inside the Alpine 3.3 and Alpine 3.4 profiles (profile::alpine33 and profile::alpine34, respectively) to install or remove each image.

Although hostname-based role assignment is just one of the many ways to manage the configuration of multiple systems, it's a very powerful one, and certainly one of the easiest to demonstrate. Puppet supports a large number of ways to specify what configurations should apply to a given host. The ability to configure systems dynamically based on discoverable criteria makes Puppet a wonderful complement to Docker's versioned images and containerization.

Other Puppet Options for Node Assignment

Puppet can assign roles, profiles and classes to nodes in a number of ways, including the following:

  • Classifying nodes with the Puppet Enterprise Console.

  • Defining nodes in the main site manifest—for example, site.pp.

  • Implementing an External Node Classifier (ENC), which is an external tool that replaces or supplements the main site manifest.

  • Storing hierarchical data in a Hiera YAML configuration file.

  • Using Puppet Lookup, which merges Hiera information with environment and module data.

  • Crafting conditional configurations based on facts known to the server or client at runtime.

Each option represents a set of trade-offs in expressive power, hierarchical inheritance and maintainability. A thorough discussion of these trade-offs is outside the scope of this article. Nevertheless, it's important to understand that Puppet gives you a great deal of flexibility in how you classify and manage nodes at scale. This article focuses on the common use case of name-based classification, but there are certainly other valid approaches.

Conclusion

In this article, I took a close look at managing Docker images and containers with docker::image and docker::run, but the Puppet Docker module supports a lot more features that I didn't have room to cover this time around. Some of those additional features include:

  • Building images from a Dockerfile with the docker::image class.

  • Managing Docker networks with the docker::networks class.

  • Using Docker Compose with the docker::compose class.

  • Implementing private image registries using the docker::registry class.

  • Running arbitrary commands inside containers with the docker::exec class.

When taken together, this powerful collection of features allows you to compose extremely powerful roles and profiles for managing Docker instances across infrastructure of almost any scale. In addition, by leveraging Puppet's declarative syntax and its ability to automate role assignment, it's possible to add, remove and modify your Docker instances on multiple hosts without having to manage each instance directly, which is typically a huge win in enterprise automation. And finally, the standardization and repeatability of Puppet-driven container management makes systems more reliable when compared to hand-tuned, hand-crafted nodes that can "drift" from the ideal state over time.

In short, Docker provides a powerful tool for creating lightweight golden images and containerized services, while Puppet provides the means to orchestrate those images and containers in the cloud or data center. Like strawberries and chocolate, neither is "better" than the other; combine them though, and you get something greater than the sum of its parts.

Resources

Key Files from This Article, Available on GitHub: https://github.com/CodeGnome/MDIWP-Examples

Docker: https://www.docker.co

Puppet Home Page (Docs and Commercial Versions): https://puppet.com

Puppet Ruby Gem (Open-Source Version): https://rubygems.org/gems/puppet

Puppet Labs docker_platform Module: https://forge.puppet.com/puppetlabs/docker_platform

The garethr-docker Module Wrapped by docker_platform: https://github.com/garethr/garethr-docker

Official Apache HTTP Server Docker Images: https://hub.docker.com/_/http

Oracle VirtualBox: https://www.virtualbox.org

Vagrant by HashiCorp: https://www.vagrantup.com

Ubuntu Images on HashiCorp Atlas: https://atlas.hashicorp.com/ubuntu

Puppet Documentation on the "Roles and Profiles" Pattern: https://docs.puppet.com/pe/2016.4/r_n_p_intro.html

Todd A. Jacobs is the CEO of Flow Capital Group, which acquires and manages companies specializing in IT automation, DevOps & agile transformations, security and compliance, fractional CIO/CTO services, and board advisory services for cyber risk and other hot technology issues. Todd waited his whole life for Matt Smith's character on Dr. Who to make bow ties cool again. He lives near Baltimore, MD with his wife and son, to whom he hopes to pass on his love of Linux and technology—but perhaps not his fashion sense.

Load Disqus comments