Jay Taylor's notes
Launching Containers with fleet
fleet is a cluster manager that controls systemd at the cluster level. To run your services in the cluster, you must submit regular systemd units combined with a few fleet-specific properties.
If you're not familiar with systemd units, check out our Getting Started with systemd guide.
This guide assumes you're running fleetctl locally from a CoreOS machine that's part of a CoreOS cluster. You can also control your cluster remotely. All of the units referenced in this guide are contained in the unit-examples repository. You can clone this onto your CoreOS box to make unit submission easier.
Types of Fleet Units
Two types of units can be run in your cluster — standard and global units. Standard units are long-running processes that are scheduled onto a single machine. If that machine goes offline, the unit will be migrated onto a new machine and started.
Global units will be run on all machines in the cluster. These are ideal for common services like monitoring agents or components of higher-level orchestration systems like Kubernetes, Mesos or OpenStack.
There are two fleetctl commands to view units in the cluster: list-unit-files, which shows the units that fleet knows about and whether or not they are global, and list-units, which shows the current state of units actively loaded into machines in the cluster. Here's an example cluster with 3 machines, running both types of units:
$ fleetctl list-unit-files
UNIT HASH DSTATE STATE TMACHINE
global-unit.service 8ff68b9 launched launched 3 of 3
standard-unit.service 7710e8a launched launched 148a18ff.../10.10.1.1
You can view all of the machines in the cluster by running list-machines:
$ fleetctl list-machines
MACHINE IP METADATA
148a18ff-6e95-4cd8-92da-c9de9bb90d5a 10.10.1.1 -
491586a6-508f-4583-a71d-bfc4d146e996 10.10.1.2 -
c9de9451-6a6f-1d80-b7e6-46e996bfc4d1 10.10.1.3 -
Now when looking at the status of units, we should expect to see 3 copies of global-unit.service - one running on each machine:
$ fleetctl list-units
UNIT MACHINE ACTIVE SUB
global-unit.service 148a18ff.../10.10.1.1 active running
global-unit.service 491586a6.../10.10.1.2 active running
global-unit.service c9de9451.../10.10.1.3 active running
standard-unit.service 148a18ff.../10.10.1.1 active running
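The difference between the two unit types shows up as the number of machines each unit is running on. A minimal sketch for checking this, assuming the list-units output format shown above (header on the first line, unit name in the first column); the function name count_unit_instances is made up for this example:

```shell
# Count how many machines each unit is running on.
# Reads `fleetctl list-units` output on stdin; the first line is the header.
count_unit_instances() {
  awk 'NR > 1 && NF > 0 { count[$1]++ }
       END { for (u in count) print u, count[u] }' | sort
}
```

On a live cluster this would be run as fleetctl list-units | count_unit_instances. For the example cluster above, a global unit should report 3 and a standard unit 1.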
Run a Container in the Cluster
Running a single container is very easy. All you need to do is provide a regular unit file without an [Install] section. Let's run the same unit from the Getting Started with systemd guide. First save these contents as myapp.service on the CoreOS machine:
[Unit]
Description=MyApp
After=docker.service
Requires=docker.service
[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill busybox1
ExecStartPre=-/usr/bin/docker rm busybox1
ExecStartPre=/usr/bin/docker pull busybox
ExecStart=/usr/bin/docker run --name busybox1 busybox /bin/sh -c "while true; do echo Hello World; sleep 1; done"
ExecStop=/usr/bin/docker stop busybox1
If you've been running docker commands manually, be sure you don't copy a docker run command that starts a container in detached mode (-d). Detached mode won't start the container as a child of the unit's pid. This will cause the unit to run for just a few seconds and then exit.
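A quick way to catch this mistake before submitting a unit is to search its ExecStart line for the detached flag. The following is a rough heuristic, not part of fleet: the function name find_detached_run is hypothetical, and the pattern only recognizes a standalone -d or --detach (not combined short flags such as -dit). It may also flag a -d that belongs to the command run inside the container.

```shell
# Print any ExecStart line that runs docker in detached mode.
# Returns 0 when such a line is found and 1 otherwise (grep semantics).
# $1 is the path to a unit file.
find_detached_run() {
  grep -E '^ExecStart=.*docker run( .*)? (-d|--detach)( |$)' "$1"
}
```

For example, find_detached_run myapp.service prints nothing for the unit above, because its docker run command has no -d flag. The match is case-sensitive, so flags such as -D passed to the program inside the container are ignored.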
Run the start command to start up the container on the cluster:
$ fleetctl start myapp.service
The unit should have been scheduled to a machine in your cluster:
$ fleetctl list-units
UNIT MACHINE ACTIVE SUB
myapp.service c9de9451.../10.10.1.3 active running
You can view all of the machines in the cluster by running list-machines:
$ fleetctl list-machines
MACHINE IP METADATA
148a18ff-6e95-4cd8-92da-c9de9bb90d5a 10.10.1.1 -
491586a6-508f-4583-a71d-bfc4d146e996 10.10.1.2 -
c9de9451-6a6f-1d80-b7e6-46e996bfc4d1 10.10.1.3 -
Run a High Availability Service
The main benefit of using CoreOS is to have your services run in a highly available manner. Let's walk through deploying a service that consists of two identical containers running the Apache web server.
First, let's write a unit file that we'll run two copies of, named apache.1.service and apache.2.service:
[Unit]
Description=My Apache Frontend
After=docker.service
Requires=docker.service
[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill apache1
ExecStartPre=-/usr/bin/docker rm apache1
ExecStartPre=/usr/bin/docker pull coreos/apache
ExecStart=/usr/bin/docker run --rm --name apache1 -p 80:80 coreos/apache /usr/sbin/apache2ctl -D FOREGROUND
ExecStop=/usr/bin/docker stop apache1
[X-Fleet]
Conflicts=apache.*.service
The Conflicts attribute tells fleet that these two services can't be run on the same machine, giving us high availability. A full list of options for this section can be found in the fleet units guide.
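The value of Conflicts is a glob pattern matched against other unit names. The sketch below uses the shell's own glob matching as a stand-in to show which names a pattern like apache.*.service covers. It assumes that shell globbing behaves like fleet's matcher for simple patterns; unit_matches is a hypothetical helper, not a fleet command.

```shell
# Return 0 when unit name $2 matches glob pattern $1.
# Leaving $1 unquoted in the case pattern is what enables glob matching.
unit_matches() {
  case "$2" in
    $1) return 0 ;;
    *)  return 1 ;;
  esac
}

for unit in apache.1.service apache.2.service myapp.service; do
  if unit_matches 'apache.*.service' "$unit"; then
    echo "$unit: conflicts"
  else
    echo "$unit: no conflict"
  fi
done
```

Both Apache units match the pattern, so they are kept apart, while myapp.service does not match and can share a machine with either of them.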
Let's start both units and verify that they're on two different machines:
$ fleetctl start apache.*
$ fleetctl list-units
UNIT MACHINE ACTIVE SUB
myapp.service c9de9451.../10.10.1.3 active running
apache.1.service 491586a6.../10.10.1.2 active running
apache.2.service 148a18ff.../10.10.1.1 active running
As you can see, the Apache units are now running on two different machines in our cluster.
How do we route requests to these containers? The best strategy is to run a "sidekick" container that performs other duties that are related to our main container but shouldn't be directly built into that application. Examples of common sidekick containers are for service discovery and controlling external services such as cloud load balancers or DNS.
Run a Simple Sidekick
The simplest sidekick example is for service discovery. This unit blindly announces that our container has been started. We'll run one of these for each Apache unit that's already running. Make two copies of the unit called apache-discovery.1.service and apache-discovery.2.service. Be sure to change all instances of apache.1.service to apache.2.service and apache1 to apache2 when you create the second unit.
[Unit]
Description=Announce Apache1
BindsTo=apache.1.service
[Service]
ExecStart=/bin/sh -c "while true; do etcdctl set /services/website/apache1 '{ \"host\": \"%H\", \"port\": 80, \"version\": \"52c7248a14\" }' --ttl 60;sleep 45;done"
ExecStop=/usr/bin/etcdctl rm /services/website/apache1
[X-Fleet]
MachineOf=apache.1.service
This unit has a few interesting properties. First, it uses BindsTo to link the unit to our apache.1.service unit. When the Apache unit is stopped, this unit will stop as well, and its ExecStop command removes the apache1 key from our /services/website directory in etcd. The key is also written with a TTL of 60 seconds and refreshed every 45 seconds, so it expires on its own if the machine suddenly dies and the refresh stops.
Second is %H, a variable built into systemd, that represents the hostname of the machine running this unit. Variable usage is covered in our Getting Started with systemd guide as well as in systemd documentation.
The third is a fleet-specific property called MachineOf. This property causes the unit to be placed onto the same machine that apache.1.service is running on.
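The second discovery unit can be derived from the first instead of being edited by hand. This is a sketch of the renaming described earlier, with one addition that is an assumption rather than a requirement: it also changes Apache1 to Apache2 so the Description line stays accurate. The function name to_second_unit is hypothetical.

```shell
# Rewrite the first discovery unit into the second one.
# Reads a unit file on stdin and writes the renamed copy to stdout.
to_second_unit() {
  sed -e 's/apache\.1\.service/apache.2.service/g' \
      -e 's/apache1/apache2/g' \
      -e 's/Apache1/Apache2/g'
}
```

It would be used as to_second_unit < apache-discovery.1.service > apache-discovery.2.service. The same substitutions apply to the Apache unit itself when creating apache.2.service from apache.1.service.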
Let's verify that each discovery unit was placed onto the same machine as the Apache service it is bound to:
$ fleetctl start apache-discovery.*.service
$ fleetctl list-units
UNIT MACHINE ACTIVE SUB
myapp.service c9de9451.../10.10.1.3 active running
apache.1.service 491586a6.../10.10.1.2 active running
apache.2.service 148a18ff.../10.10.1.1 active running
apache-discovery.1.service 491586a6.../10.10.1.2 active running
apache-discovery.2.service 148a18ff.../10.10.1.1 active running
Now let's verify that the service discovery is working correctly:
$ etcdctl ls /services/ --recursive
/services/website
/services/website/apache1
/services/website/apache2
$ etcdctl get /services/website/apache1
{ "host": "ip-10-182-139-116", "port": 80, "version": "52c7248a14" }
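Anything that routes traffic to these containers needs the host and port out of that JSON value. The sketch below pulls them out with sed so that it has no dependencies beyond a POSIX shell. It assumes a flat, single-line object like the one above and is not a general JSON parser; json_field and backend_address are hypothetical helper names.

```shell
# Extract one top-level field from a flat, single-line JSON object.
# $1 is the field name; the JSON is read from stdin.
json_field() {
  sed -n "s/.*\"$1\": *\"\{0,1\}\([^\",} ]*\).*/\1/p"
}

# Turn a discovery record into a host:port backend address.
backend_address() {
  record=$(cat)
  host=$(printf '%s\n' "$record" | json_field host)
  port=$(printf '%s\n' "$record" | json_field port)
  printf '%s:%s\n' "$host" "$port"
}
```

On a cluster machine this could be run as etcdctl get /services/website/apache1 | backend_address, which for the record above yields ip-10-182-139-116:80.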
Run an External Service Sidekick
If you're running in the cloud, many services have APIs that can be automated based on actions in the cluster. For example, you may update DNS records or add new containers to a cloud load balancer. Our Example Deployment with fleet contains a pre-made presence container that updates an Amazon Elastic Load Balancer with new backends.
Run a Global Unit
As mentioned earlier, global units are useful for running a unit across all of the machines in your cluster. A global unit doesn't differ very much from a regular unit other than a new X-Fleet parameter called Global=true. Here's an example unit from a blog post to use Datadog with CoreOS. You'll need to set an etcd key ddapikey before this example will work; more details are in the post.
[Unit]
Description=Monitoring Service
[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill dd-agent
ExecStartPre=-/usr/bin/docker rm dd-agent
ExecStartPre=/usr/bin/docker pull datadog/docker-dd-agent
ExecStart=/usr/bin/bash -c \
"/usr/bin/docker run --privileged --name dd-agent -h `hostname` \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /proc/mounts:/host/proc/mounts:ro \
-v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
-e API_KEY=`etcdctl get /ddapikey` \
datadog/docker-dd-agent"
[X-Fleet]
Global=true
If we start this unit, it should be running on all 3 of our machines:
$ fleetctl start datadog.service
$ fleetctl list-units
UNIT MACHINE ACTIVE SUB
myapp.service c9de9451.../10.10.1.3 active running
apache.1.service 491586a6.../10.10.1.2 active running
apache.2.service 148a18ff.../10.10.1.1 active running
apache-discovery.1.service 491586a6.../10.10.1.2 active running
apache-discovery.2.service 148a18ff.../10.10.1.1 active running
datadog.service 148a18ff.../10.10.1.1 active running
datadog.service 491586a6.../10.10.1.2 active running
datadog.service c9de9451.../10.10.1.3 active running
Global units can be deployed to a subset of matching machines with the MachineMetadata parameter, which is explained in the next section.
Schedule Based on Machine Metadata
Applications with complex and specific requirements can target a subset of the cluster for scheduling via machine metadata. Powerful deployment topologies can be achieved — schedule units based on the machine's region, rack location, disk speed or anything else you can think of.
Metadata can be provided via cloud-config or a config file. Here's an example config file:
# Comma-delimited key/value pairs that are published to the fleet registry.
# This data can be referenced in unit files to affect scheduling decisions.
# An example could look like: metadata="region=us-west,az=us-west-1"
metadata="platform=metal,provider=rackspace,region=east,disk=ssd"
Metadata can be viewed in the machine list when configured:
$ fleetctl list-machines
MACHINE IP METADATA
29db5063... 172.17.8.101 disk=ssd,platform=metal,provider=rackspace,region=east
ebb97ff7... 172.17.8.102 disk=ssd,platform=cloud,provider=rackspace,region=east
f823e019... 172.17.8.103 disk=ssd,platform=cloud,provider=amazon,region=east
The unit file for a service that does a lot of disk I/O but doesn't care where it runs could look like:
[X-Fleet]
MachineMetadata=disk=ssd
If you wanted to ensure very high availability you could have 3 unit files that must be scheduled across providers but in the same region:
[X-Fleet]
Conflicts=webapp*
MachineMetadata=provider=rackspace
MachineMetadata=platform=metal
MachineMetadata=region=east
[X-Fleet]
Conflicts=webapp*
MachineMetadata=provider=rackspace
MachineMetadata=platform=cloud
MachineMetadata=region=east
[X-Fleet]
Conflicts=webapp*
MachineMetadata=provider=amazon
MachineMetadata=platform=cloud
MachineMetadata=region=east
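To preview where units like these could land, the metadata requirements can be checked against the list-machines output by hand. The sketch below does that, assuming that a machine is eligible only when it carries every listed key=value pair and that metadata values contain no spaces. machines_matching is a hypothetical helper, not a fleetctl subcommand.

```shell
# Print the ID of every machine whose METADATA column contains all of the
# key=value pairs given as arguments. Reads `fleetctl list-machines` output
# on stdin (header line first, metadata in the third column).
machines_matching() {
  awk -v reqs="$*" '
    BEGIN { n = split(reqs, want, " ") }
    NR > 1 {
      ok = 1
      # Wrap both sides in commas so only whole key=value pairs match.
      for (i = 1; i <= n; i++)
        if (index("," $3 ",", "," want[i] ",") == 0) ok = 0
      if (ok) print $1
    }'
}
```

For example, fleetctl list-machines | machines_matching provider=amazon platform=cloud region=east lists the candidates for the third unit. Against the three machines shown earlier, each of the three requirement sets matches exactly one machine, and the Conflicts=webapp* line keeps the units apart in larger clusters where several machines qualify.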
More Information
Example Deployment with fleet
fleet Unit Specifications
fleet Configuration