LXC

Original source (help.ubuntu.com)

Containers are a lightweight virtualization technology. They are more akin to an enhanced chroot than to full virtualization like Qemu or VMware, both because they do not emulate hardware and because containers share the same operating system as the host. Therefore containers are better compared to Solaris zones or BSD jails. Linux-vserver and OpenVZ are two pre-existing, independently developed implementations of container-like functionality for Linux. In fact, containers came about as a result of the work to upstream the vserver and OpenVZ functionality. Some vserver and OpenVZ functionality is still missing in containers; however, containers can boot many Linux distributions and have the advantage that they can be used with an unmodified upstream kernel.

There are two user-space implementations of containers, each exploiting the same kernel features. Libvirt allows the use of containers through the LXC driver by connecting to 'lxc:///'. This can be very convenient as it supports the same usage as its other drivers. The other implementation, called simply 'LXC', is not compatible with libvirt, but is more flexible with more userspace tools. It is possible to switch between the two, though there are peculiarities which can cause confusion.

In this document we will mainly describe the lxc package. Toward the end, we will describe how to use the libvirt LXC driver.

In this document, a container name will be shown as CN, C1, or C2.

Installation

The lxc package can be installed using


sudo apt-get install lxc

This will pull in the required and recommended dependencies, including cgroup-lite, lvm2, and debootstrap. To use libvirt-lxc, install libvirt-bin. LXC and libvirt-lxc can be installed and used at the same time.
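
For example, the libvirt LXC driver can be added alongside the lxc tools with:


sudo apt-get install libvirt-bin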

Host Setup

Basic layout of LXC files

Following is a description of the files and directories which are installed and used by LXC.

  • There are two upstart jobs:

    • /etc/init/lxc-net.conf: is an optional job which only runs if /etc/default/lxc specifies USE_LXC_BRIDGE (true by default). It sets up a NATed bridge for containers to use.

    • /etc/init/lxc.conf: runs if LXC_AUTO (true by default) is set to true in /etc/default/lxc. It looks for entries under /etc/lxc/auto/ which are symbolic links to configuration files for the containers which should be started at boot.

  • /etc/lxc/lxc.conf: There is a default container creation configuration file, /etc/lxc/lxc.conf, which directs containers to use the LXC bridge created by the lxc-net upstart job. If no configuration file is specified when creating a container, then this one will be used.

  • Examples of other container creation configuration files are found under /usr/share/doc/lxc/examples. These show how to create containers without a private network, or using macvlan, vlan, or other network layouts.

  • The various container administration tools are found under /usr/bin.

  • /usr/lib/lxc/lxc-init is a very minimal and lightweight init binary which is used by lxc-execute. Rather than `booting' a full container, it manually mounts a few filesystems, especially /proc, and executes its arguments. You are not likely to need to manually refer to this file.

  • /usr/lib/lxc/templates/ contains the `templates' which can be used to create new containers of various distributions and flavors. Not all templates are currently supported.

  • /etc/apparmor.d/lxc/lxc-default contains the default Apparmor MAC policy which works to protect the host from containers. Please see the Apparmor section for more information.

  • /etc/apparmor.d/usr.bin.lxc-start contains a profile to protect the host from lxc-start while it is setting up the container.

  • /etc/apparmor.d/lxc-containers causes all the profiles defined under /etc/apparmor.d/lxc to be loaded at boot.

  • There are various man pages for the LXC administration tools as well as the lxc.conf container configuration file.

  • /var/lib/lxc is where containers and their configuration information are stored.

  • /var/cache/lxc is where caches of distribution data are stored to speed up multiple container creations.

lxcbr0

When USE_LXC_BRIDGE is set to true in /etc/default/lxc (as it is by default), a bridge called lxcbr0 is created at startup. This bridge is given the private address 10.0.3.1, and containers using this bridge will have a 10.0.3.0/24 address. A dnsmasq instance is run listening on that bridge, so if another dnsmasq has bound all interfaces before the lxc-net upstart job runs, lxc-net will fail to start and lxcbr0 will not exist.
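
To verify that the bridge came up as expected, you can inspect it from the host. The commands below assume the default bridge name and configuration file:


ip addr show lxcbr0
grep USE_LXC_BRIDGE /etc/default/lxc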

If you have another bridge - libvirt's default virbr0, or a br0 bridge for your default NIC - you can use that bridge in place of lxcbr0 for your containers.
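
As a sketch, assuming an existing host bridge named br0 (a hypothetical name), you could point the default container creation configuration at it by editing /etc/lxc/lxc.conf:


lxc.network.type = veth
lxc.network.link = br0
lxc.network.flags = up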

Using a separate filesystem for the container store

LXC stores container information and (with the default backing store) root filesystems under /var/lib/lxc. Container creation templates also tend to store cached distribution information under /var/cache/lxc.

If you wish to use a filesystem other than /var, you can mount a filesystem which has more space into those locations. If you have a disk dedicated for this, you can simply mount it at /var/lib/lxc. If you'd like to use another location, like /srv, you can bind mount it or use a symbolic link. For instance, if /srv is a large mounted filesystem, create and symlink two directories:


sudo mkdir /srv/lxclib /srv/lxccache
sudo rm -rf /var/lib/lxc /var/cache/lxc
sudo ln -s /srv/lxclib /var/lib/lxc
sudo ln -s /srv/lxccache /var/cache/lxc

or, using bind mounts:


sudo mkdir /srv/lxclib /srv/lxccache
sudo sed -i '$a \
/srv/lxclib /var/lib/lxc    none defaults,bind 0 0 \
/srv/lxccache /var/cache/lxc none defaults,bind 0 0' /etc/fstab
sudo mount -a

Containers backed by lvm

It is possible to use LVM partitions as the backing stores for containers. Advantages of this include flexibility in storage management and fast container cloning. The tools default to using a VG (volume group) named lxc, but another VG can be used through command line options. When an LV is used as a container backing store, the container's configuration file is still /var/lib/lxc/CN/config, but the root fs entry in that file (lxc.rootfs) will point to the LV block device name, i.e. /dev/lxc/CN.

Containers with directory tree and LVM backing stores can co-exist.
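
As a minimal sketch, assuming a spare partition /dev/sdb1 (a hypothetical device name), the default volume group could be prepared with:


sudo pvcreate /dev/sdb1
sudo vgcreate lxc /dev/sdb1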

Btrfs

If your host has a btrfs /var, the LXC administration tools will detect this and automatically exploit it by cloning containers using btrfs snapshots.
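
To check which filesystem type backs the container store, you can run, for instance:


df -T /var/lib/lxc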

Apparmor

LXC ships with an Apparmor profile intended to protect the host from accidental misuses of privilege inside the container. For instance, the container will not be able to write to /proc/sysrq-trigger or to most /sys files.

The usr.bin.lxc-start profile is entered by running lxc-start. This profile mainly prevents lxc-start from mounting new filesystems outside of the container's root filesystem. Before executing the container's init, LXC requests a switch to the container's profile. By default, this profile is the lxc-container-default policy which is defined in /etc/apparmor.d/lxc/lxc-default. This profile prevents the container from accessing many dangerous paths, and from mounting most filesystems.

If you find that lxc-start is failing due to a legitimate access which is being denied by its Apparmor policy, you can disable the lxc-start profile by doing:

sudo apparmor_parser -R /etc/apparmor.d/usr.bin.lxc-start
sudo ln -s /etc/apparmor.d/usr.bin.lxc-start /etc/apparmor.d/disable/

This will make lxc-start run unconfined, but continue to confine the container itself. If you also wish to disable confinement of the container, then in addition to disabling the usr.bin.lxc-start profile, you must add:

lxc.aa_profile = unconfined

to the container's configuration file. If you wish to run a container in a custom profile, you can create a new profile under /etc/apparmor.d/lxc/. Its name must start with lxc- in order for lxc-start to be allowed to transition to that profile. After creating the policy, load it using:

sudo apparmor_parser -r /etc/apparmor.d/lxc-containers

The profile will automatically be loaded after a reboot, because it is sourced by the file /etc/apparmor.d/lxc-containers. Finally, to make container CN use this new lxc-CN-profile, add the following line to its configuration file:

lxc.aa_profile = lxc-CN-profile
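
One simple way to build such a custom profile, sketched here, is to copy the default policy and adjust it (lxc-CN-profile is just the placeholder name used above):


sudo cp /etc/apparmor.d/lxc/lxc-default /etc/apparmor.d/lxc/lxc-CN-profile
# edit the copy: rename the profile to lxc-CN-profile and adjust its rules as needed
sudo apparmor_parser -r /etc/apparmor.d/lxc-containers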

lxc-execute does not enter an Apparmor profile, but the container it spawns will be confined.

Control Groups

Control groups (cgroups) are a kernel feature providing hierarchical task grouping and per-cgroup resource accounting and limits. They are used in containers to limit block and character device access and to freeze (suspend) containers. They can be further used to limit memory use and block i/o, guarantee minimum cpu shares, and to lock containers to specific cpus. By default, LXC depends on the cgroup-lite package to be installed, which provides the proper cgroup initialization at boot. The cgroup-lite package mounts each cgroup subsystem separately under /sys/fs/cgroup/SS, where SS is the subsystem name. For instance the freezer subsystem is mounted under /sys/fs/cgroup/freezer. LXC cgroups are kept under /sys/fs/cgroup/SS/INIT/lxc, where INIT is the init task's cgroup. This is / by default, so in the end the freezer cgroup for container CN would be /sys/fs/cgroup/freezer/lxc/CN.
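
For example, you can confirm the placement of a running container CN by inspecting its freezer cgroup directly:


ls /sys/fs/cgroup/freezer/lxc/CN
cat /sys/fs/cgroup/freezer/lxc/CN/freezer.state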

Privilege

The container administration tools must be run with root user privilege. A utility called lxc-setcap was written with the intention of providing the tools with the needed file capabilities to allow non-root users to run the tools with sufficient privilege. However, as root in a container cannot yet be reliably contained, this is not worthwhile. It is therefore recommended to not use lxc-setcap, and to provide the LXC administrators with the needed sudo privilege.

The user namespace, which is expected to be available in the next Long Term Support (LTS) release, will allow containment of the container root user, as well as reduce the amount of privilege required for creating and administering containers.

LXC Upstart Jobs

As listed above, the lxc package includes two upstart jobs. The first, lxc-net, is always started when the other, lxc, is about to begin, and stops when it stops. If the USE_LXC_BRIDGE variable is set to false in /etc/default/lxc, then it will immediately exit. If it is true, and an error occurs bringing up the LXC bridge, then the lxc job will not start. lxc-net will bring down the LXC bridge when stopped, unless a container is running which is using that bridge.

The lxc job starts on runlevels 2-5. If the LXC_AUTO variable is set to true, then it will look under /etc/lxc/auto for containers which should be started automatically. When the lxc job is stopped, either manually or by entering runlevel 0, 1, or 6, it will stop those containers.

To register a container to start automatically, create a symbolic link /etc/lxc/auto/name.conf pointing to the container's config file. For instance, the configuration file for a container CN is /var/lib/lxc/CN/config. To make that container auto-start, use the command:


sudo ln -s /var/lib/lxc/CN/config /etc/lxc/auto/CN.conf

Container Administration

Creating Containers

The easiest way to create containers is using lxc-create. This script uses distribution-specific templates under /usr/lib/lxc/templates/ to set up container-friendly chroots under /var/lib/lxc/CN/rootfs, and initialize the configuration in /var/lib/lxc/CN/fstab and /var/lib/lxc/CN/config, where CN is the container name.

The simplest container creation command would look like:


sudo lxc-create -t ubuntu -n CN

This tells lxc-create to use the ubuntu template (-t ubuntu) and to call the container CN (-n CN). Since no configuration file was specified (which would have been done with `-f file'), it will use the default configuration file under /etc/lxc/lxc.conf. This gives the container a single veth network interface attached to the lxcbr0 bridge.

The container creation templates can also accept arguments. These can be listed after --. For instance


sudo lxc-create -t ubuntu -n oneiric1 -- -r oneiric

passes the arguments '-r oneiric' to the ubuntu template.

Help

Help on the lxc-create command can be seen by using lxc-create -h. However, the templates also take their own options. If you do


sudo lxc-create -t ubuntu -h

then the general lxc-create help will be followed by help output specific to the ubuntu template. If no template is specified, then only help for lxc-create itself will be shown.

Ubuntu template

The ubuntu template can be used to create Ubuntu system containers with any release at least as new as 10.04 LTS. It uses debootstrap to create a cached container filesystem which gets copied into place each time a container is created. The cached image is saved and only re-generated when you create a container using the -F (flush) option to the template, i.e.:


sudo lxc-create -t ubuntu -n CN -- -F

The Ubuntu release installed by the template will be the same as that on the host, unless otherwise specified with the -r option, i.e.


sudo lxc-create -t ubuntu -n CN -- -r lucid

If you want to create a 32-bit container on a 64-bit host, pass -a i386 to the template. If you have the qemu-user-static package installed, then you can create a container using any architecture supported by qemu-user-static.

The container will have a user named ubuntu whose password is ubuntu and who is a member of the sudo group. If you wish to inject a public ssh key for the ubuntu user, you can do so with -S sshkey.pub.

You can also bind user jdoe from the host into the container using the -b jdoe option. This will copy jdoe's password and shadow entries into the container, make sure his default group and shell are available, add him to the sudo group, and bind-mount his home directory into the container when the container is started.
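
For instance, combining these options (the release name, key path, and user name are only examples):


sudo lxc-create -t ubuntu -n CN -- -r precise -S ~/.ssh/id_rsa.pub -b jdoe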

When a container is created, the release-updates archive is added to the container's sources.list, and its package archive will be updated. If the container release is older than 12.04 LTS, then the lxcguest package will be automatically installed. Alternatively, if the --trim option is specified, then the lxcguest package will not be installed, and many services will be removed from the container. This will result in a faster-booting, but less upgradeable container.

Ubuntu-cloud template

The ubuntu-cloud template creates Ubuntu containers by downloading and extracting the published Ubuntu cloud images. It accepts some of the same options as the ubuntu template, namely -r release, -S sshkey.pub, -a arch, and -F to flush the cached image. It also accepts a few extra options. The -C option will create a cloud container, configured for use with a metadata service. The -u option accepts a cloud-init user-data file to configure the container on start. If -L is passed, then no locales will be installed. The -T option can be used to choose a tarball location to extract in place of the published cloud image tarball. Finally the -i option sets a host id for cloud-init, which by default is set to a random string.
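
As a sketch, a cloud-image based container could be created as follows (the release, key path, and user-data file name are only examples):


sudo lxc-create -t ubuntu-cloud -n CN -- -r precise -S ~/.ssh/id_rsa.pub -u my-user-data.yaml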

Other templates

The ubuntu and ubuntu-cloud templates are well supported. Other templates are available however. The debian template creates a Debian based container, using debootstrap much as the ubuntu template does. By default it installs a debian squeeze image. An alternate release can be chosen by setting the SUITE environment variable, i.e.:


sudo SUITE=sid lxc-create -t debian -n d1

Since debian cannot be safely booted inside a container, debian containers will be trimmed as with the --trim option to the ubuntu template.

To purge the container image cache, call the template directly and pass it the --clean option.


sudo SUITE=sid /usr/lib/lxc/templates/lxc-debian --clean

A fedora template exists, which creates containers based on fedora releases <= 14. Fedora releases 15 and higher are based on systemd, which the template is not yet able to convert into a container-bootable setup. Before the fedora template is able to run, you'll need to make sure that yum and curl are installed. A fedora 12 container can be created with


sudo lxc-create -t fedora -n fedora12 -- -R 12

An OpenSuSE template exists, but it requires the zypper program, which is not yet packaged. The OpenSuSE template is therefore not supported.

Two more templates exist mainly for experimental purposes. The busybox template creates a very small system container based entirely on busybox. The sshd template creates an application container running sshd in a private network namespace. The host's library and binary directories are bind-mounted into the container, though not its /home or /root. To create, start, and ssh into an ssh container, you might:


sudo lxc-create -t sshd -n ssh1
ssh-keygen -f id
sudo mkdir /var/lib/lxc/ssh1/rootfs/root/.ssh
sudo cp id.pub /var/lib/lxc/ssh1/rootfs/root/.ssh/authorized_keys
sudo lxc-start -n ssh1 -d
ssh -i id root@ssh1

Backing Stores

By default, lxc-create places the container's root filesystem as a directory tree at /var/lib/lxc/CN/rootfs. Another option is to use LVM logical volumes. If a volume group named lxc exists, you can create an lvm-backed container called CN using:


sudo lxc-create -t ubuntu -n CN -B lvm

If you want to use a volume group named schroots, with a 5G xfs filesystem, then you would use


sudo lxc-create -t ubuntu -n CN -B lvm --vgname schroots --fssize 5G --fstype xfs

Cloning

For rapid provisioning, you may wish to customize a canonical container according to your needs and then make multiple copies of it. This can be done with the lxc-clone program. Given an existing container called C1, a new container called C2 can be created using


sudo lxc-clone -o C1 -n C2

If /var/lib/lxc is a btrfs filesystem, then lxc-clone will create C2's filesystem as a snapshot of C1's. If the container's root filesystem is LVM-backed, then you can specify the -s option to create the new rootfs as an LVM snapshot of the original as follows:


sudo lxc-clone -s -o C1 -n C2

Both lvm and btrfs snapshots will provide fast cloning with very small initial disk usage.

Starting and stopping

To start a container, use lxc-start -n CN. By default lxc-start will execute /sbin/init in the container. You can provide a different program to execute, plus arguments, as further arguments to lxc-start:


sudo lxc-start -n container /sbin/init loglevel=debug

If you do not specify the -d (daemon) option, then you will see a console (on the container's /dev/console, see Consoles for more information) on the terminal. If you specify the -d option, you will not see that console, and lxc-start will immediately exit with success, even if a later part of container startup has failed. You can use lxc-wait or lxc-monitor (see Monitoring container status) to check on the success or failure of the container startup.
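
For example, to start a container in the background and then wait until it reaches the RUNNING state:


sudo lxc-start -n CN -d
sudo lxc-wait -n CN -s RUNNING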

To obtain LXC debugging information, use -o filename -l debuglevel, for instance:


sudo lxc-start -o lxc.debug -l DEBUG -n container

Finally, you can specify configuration parameters inline using -s. However, it is generally recommended to place them in the container's configuration file instead. Likewise, an entirely alternate config file can be specified with the -f option, but this is not generally recommended.

While lxc-start runs the container's /sbin/init, lxc-execute uses a minimal init program called lxc-init, which attempts to mount /proc, /dev/mqueue, and /dev/shm, executes the programs specified on the command line, and waits for those to finish executing. lxc-start is intended to be used for system containers, while lxc-execute is intended for application containers (see this article for more).

You can stop a container several ways. You can use shutdown, poweroff and reboot while logged into the container. To cleanly shut down a container externally (i.e. from the host), you can issue the sudo lxc-shutdown -n CN command. This takes an optional timeout value. If not specified, the command issues a SIGPWR signal to the container and immediately returns. If the option is used, as in sudo lxc-shutdown -n CN -t 10, then the command will wait the specified number of seconds for the container to cleanly shut down. Then, if the container is still running, it will kill it (and any running applications). You can also immediately kill the container (without any chance for applications to cleanly shut down) using sudo lxc-stop -n CN. Finally, lxc-kill can be used more generally to send any signal number to the container's init.
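
For instance, to request a clean shutdown and wait up to 30 seconds, or to stop a container immediately:


sudo lxc-shutdown -n CN -t 30
sudo lxc-stop -n CN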

While the container is shutting down, you can expect to see some (harmless) error messages, as follows:

$ sudo poweroff
[sudo] password for ubuntu:

$

Broadcast message from ubuntu@cn1
        (/dev/lxc/console) at 18:17 ...

The system is going down for power off NOW!
 * Asking all remaining processes to terminate...
   ...done.
 * All processes ended within 1 seconds....
   ...done.
 * Deconfiguring network interfaces...
   ...done.
 * Deactivating swap...
   ...fail!
umount: /run/lock: not mounted
umount: /dev/shm: not mounted
mount: / is busy
 * Will now halt

A container can be frozen with sudo lxc-freeze -n CN. This will block all its processes until the container is later unfrozen using sudo lxc-unfreeze -n CN.

Monitoring container status

Two commands are available to monitor container state changes. lxc-monitor monitors one or more containers for any state changes. It takes a container name as usual with the -n option, but in this case the container name can be a posix regular expression to allow monitoring desirable sets of containers. lxc-monitor continues running as it prints container changes. lxc-wait waits for a specific state change and then exits. For instance,


sudo lxc-monitor -n cont[0-5]*

would print all state changes to any containers matching the listed regular expression, whereas


sudo lxc-wait -n cont1 -s 'STOPPED|FROZEN'

will wait until container cont1 enters state STOPPED or state FROZEN and then exit.

Consoles

Containers have a configurable number of consoles. One always exists on the container's /dev/console. This is shown on the terminal from which you ran lxc-start, unless the -d option is specified. The output on /dev/console can be redirected to a file using the -c console-file option to lxc-start. The number of extra consoles is specified by the lxc.tty variable, and is usually set to 4. Those consoles are shown on /dev/ttyN (for 1 <= N <= 4). To log into console 3 from the host, use


sudo lxc-console -n container -t 3

or if the -t N option is not specified, an unused console will be automatically chosen. To exit the console, use the escape sequence Ctrl-a q. Note that the escape sequence does not work in the console resulting from lxc-start without the -d option.

Each container console is actually a Unix98 pty in the host's (not the guest's) pty mount, bind-mounted over the guest's /dev/ttyN and /dev/console. Therefore, if the guest unmounts those or otherwise tries to access the actual character device 4:N, it will not be serving getty to the LXC consoles. (With the default settings, the container will not be able to access that character device and getty will therefore fail.) This can easily happen when a boot script blindly mounts a new /dev.

Container Inspection

Several commands are available to gather information on existing containers. lxc-ls will report all existing containers in its first line of output, and all running containers in the second line. lxc-list provides the same information in a more verbose format, listing running containers first and stopped containers next. lxc-ps will provide lists of processes in containers. To provide ps arguments to lxc-ps, prepend them with --. For instance, for listing of all processes in container plain,


sudo lxc-ps -n plain -- -ef

lxc-info provides the state of a container and the pid of its init process. lxc-cgroup can be used to query or set the values of a container's control group limits and information. This can be more convenient than interacting with the cgroup filesystem. For instance, to query the list of devices which a running container is allowed to access, you could use


sudo lxc-cgroup -n CN devices.list

or to add mknod, read, and write access to /dev/sda,


sudo lxc-cgroup -n CN devices.allow "b 8:* rwm"

and, to limit it to 300M of RAM,


sudo lxc-cgroup -n CN memory.limit_in_bytes 300000000

lxc-netstat executes netstat in the running container, giving you a glimpse of its network state.

lxc-backup will create backups of the root filesystems of all existing containers (except lvm-based ones), using rsync to back the contents up under /var/lib/lxc/CN/rootfs.backup.1. These backups can be restored using lxc-restore. However, lxc-backup and lxc-restore are fragile with respect to customizations and therefore their use is not recommended.

Destroying containers

Use lxc-destroy to destroy an existing container.


sudo lxc-destroy -n CN

If the container is running, lxc-destroy will exit with a message informing you that you can force stopping and destroying the container with


sudo lxc-destroy -n CN -f

Advanced namespace usage

One of the Linux kernel features used by LXC to create containers is private namespaces. Namespaces allow a set of tasks to have private mappings of names to resources for things like pathnames and process IDs. (See Resources for a link to more information). Unlike control groups and other mount features which are also used to create containers, namespaces cannot be manipulated using a filesystem interface. Therefore, LXC ships with the lxc-unshare program, which is mainly for testing. It provides the ability to create new tasks in private namespaces. For instance,


sudo lxc-unshare -s 'MOUNT|PID' /bin/bash

creates a bash shell with private pid and mount namespaces. In this shell, you can do

root@ubuntu:~# mount -t proc proc /proc
root@ubuntu:~# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  6 10:20 pts/9    00:00:00 /bin/bash
root       110     1  0 10:20 pts/9    00:00:00 ps -ef

so that ps shows only the tasks in your new namespace.

Ephemeral containers

Ephemeral containers are one-time containers. Given an existing container CN, you can run a command in an ephemeral container created based on CN, with the host's jdoe user bound into the container, using:


lxc-start-ephemeral -b jdoe -o CN -- /home/jdoe/run_my_job

When the job is finished, the container will be discarded.

Container Commands

Following is a table of all container commands:

Container commands

Command                Synopsis
lxc-attach             (NOT SUPPORTED) Run a command in a running container
lxc-backup             Back up the root filesystems for all lvm-backed containers
lxc-cgroup             View and set container control group settings
lxc-checkconfig        Verify host support for containers
lxc-checkpoint         (NOT SUPPORTED) Checkpoint a running container
lxc-clone              Clone a new container from an existing one
lxc-console            Open a console in a running container
lxc-create             Create a new container
lxc-destroy            Destroy an existing container
lxc-execute            Run a command in a (not running) application container
lxc-freeze             Freeze a running container
lxc-info               Print information on the state of a container
lxc-kill               Send a signal to a container's init
lxc-list               List all containers
lxc-ls                 List all containers with shorter output than lxc-list
lxc-monitor            Monitor state changes of one or more containers
lxc-netstat            Execute netstat in a running container
lxc-ps                 View process info in a running container
lxc-restart            (NOT SUPPORTED) Restart a checkpointed container
lxc-restore            Restore containers from backups made by lxc-backup
lxc-setcap             (NOT RECOMMENDED) Set file capabilities on LXC tools
lxc-setuid             (NOT RECOMMENDED) Set or remove setuid bits on LXC tools
lxc-shutdown           Safely shut down a container
lxc-start              Start a stopped container
lxc-start-ephemeral    Start an ephemeral (one-time) container
lxc-stop               Immediately stop a running container
lxc-unfreeze           Unfreeze a frozen container
lxc-unshare            Testing tool to manually unshare namespaces
lxc-version            Print the version of the LXC tools
lxc-wait               Wait for a container to reach a particular state

Configuration File

LXC containers are very flexible. The Ubuntu lxc package sets defaults to make creation of Ubuntu system containers as simple as possible. If you need more flexibility, this chapter will show how to fine-tune your containers as you need.

Detailed information is available in the lxc.conf(5) man page. Note that the default configurations created by the ubuntu templates are reasonable for a system container and usually do not need customization.

Choosing configuration files and options

The container setup is controlled by the LXC configuration options. Options can be specified at several points:

  • During container creation, a configuration file can be specified. However, creation templates often insert their own configuration options, so we usually specify only network configuration options at this point. For other configuration, it is usually better to edit the configuration file after container creation.

  • The file /var/lib/lxc/CN/config is used at container startup by default.

  • lxc-start accepts an alternate configuration file with the -f filename option.

  • Specific configuration variables can be overridden at lxc-start using -s key=value. It is generally better to edit the container configuration file.

Network Configuration

Container networking in LXC is very flexible. It is triggered by the lxc.network.type configuration file entries. If no such entries exist, then the container will share the host's networking stack. Services and connections started in the container will use the host's IP address. If at least one lxc.network.type entry is present, then the container will have a private (layer 2) network stack. It will have its own network interfaces and firewall rules. There are several options for lxc.network.type:

  • lxc.network.type=empty: The container will have no network interfaces other than loopback.

  • lxc.network.type=veth: This is the default when using the ubuntu or ubuntu-cloud templates, and creates a veth network tunnel. One end of this tunnel becomes the network interface inside the container. The other end is attached to a bridge on the host. Any number of such tunnels can be created by adding more lxc.network.type=veth entries in the container configuration file. The bridge to which the host end of the tunnel will be attached is specified with lxc.network.link = lxcbr0.

  • lxc.network.type=phys A physical network interface (i.e. eth2) is passed into the container.

Two other options are to use vlan or macvlan; however, their use is more complicated and is not described here. A few other networking options exist (a combined example follows this list):

  • lxc.network.flags can only be set to up and ensures that the network interface is up.

  • lxc.network.hwaddr specifies a MAC address to assign to the NIC inside the container.

  • lxc.network.ipv4 and lxc.network.ipv6 set the respective IP addresses, if those should be static.

  • lxc.network.name specifies a name to assign inside the container. If this is not specified, a good default (i.e. eth0 for the first nic) is chosen.

  • lxc.network.lxcscript.up specifies a script to be called after the host side of the networking has been set up. See the lxc.conf(5) manual page for details.
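
Putting several of these options together, the network section of a container's configuration file might look like the following sketch (the MAC and IP addresses are purely illustrative):


lxc.network.type = veth
lxc.network.link = lxcbr0
lxc.network.flags = up
lxc.network.name = eth0
lxc.network.hwaddr = 00:16:3e:12:34:56
lxc.network.ipv4 = 10.0.3.100/24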

Control group configuration

Cgroup options can be specified using lxc.cgroup entries. lxc.cgroup.subsystem.item = value instructs LXC to set cgroup subsystem's item to value. It is perhaps simpler to realize that this will simply write value to the file item for the container's control group for subsystem subsystem. For instance, to set the memory limit to 320M, you could add


lxc.cgroup.memory.limit_in_bytes = 320000000

which will cause 320000000 to be written to the file /sys/fs/cgroup/memory/lxc/CN/memory.limit_in_bytes.

Rootfs, mounts and fstab

An important part of container setup is the mounting of various filesystems into place. The following is an example configuration file excerpt demonstrating the commonly used configuration options:


lxc.rootfs = /var/lib/lxc/CN/rootfs
lxc.mount.entry=proc /var/lib/lxc/CN/rootfs/proc proc nodev,noexec,nosuid 0 0
lxc.mount = /var/lib/lxc/CN/fstab

The first line says that the container's root filesystem is already mounted at /var/lib/lxc/CN/rootfs. If the filesystem is a block device (such as an LVM logical volume), then the path to the block device must be given instead.

Each lxc.mount.entry line should contain an item to mount in valid fstab format. The target directory should be prefixed by /var/lib/lxc/CN/rootfs, even if lxc.rootfs points to a block device.

Finally, lxc.mount points to a file, in fstab format, containing further items to mount. Note that all of these entries will be mounted by the host before the container init is started. In this way it is possible to bind mount various directories from the host into the container.
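
For example, a hypothetical /var/lib/lxc/CN/fstab could bind mount a host directory into the container with an entry such as:


/srv/data /var/lib/lxc/CN/rootfs/srv/data none bind 0 0

Note that the target directory must already exist under the container's root filesystem for the bind mount to succeed.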

Other configuration options

  • lxc.cap.drop can be used to prevent the container from having or ever obtaining the listed capabilities. For instance, including

    
    lxc.cap.drop = sys_admin
    
    

    will prevent the container from mounting filesystems, as well as all other actions which require cap_sys_admin. See the capabilities(7) manual page for a list of capabilities and their meanings.

  • lxc.aa_profile = lxc-CN-profile specifies a custom Apparmor profile in which to start the container. See Apparmor for more information.

  • lxc.console=/path/to/consolefile will cause console messages to be written to the specified file.

  • lxc.arch specifies the architecture for the container, for instance x86, or x86_64.

  • lxc.tty=5 specifies that 5 consoles (in addition to /dev/console) should be created. That is, consoles will be available on /dev/tty1 through /dev/tty5. The ubuntu templates set this value to 4.

  • lxc.pts=1024 specifies that the container should have a private (Unix98) devpts filesystem mount. If this is not specified, then the container will share /dev/pts with the host, which is rarely desired. The number 1024 means that 1024 ptys should be allowed in the container, however this number is currently ignored. Before starting the container init, LXC will do (essentially) a

    
    sudo mount -t devpts -o newinstance devpts /dev/pts
    
    

    inside the container. It is important to realize that the container should not mount devpts filesystems of its own. It may safely do bind or move mounts of its mounted /dev/pts. But if it does

    
    sudo mount -t devpts devpts /dev/pts
    
    

    it will remount the host's devpts instance. If it adds the newinstance mount option, then it will mount a new private (empty) instance. In neither case will it remount the instance which was set up by LXC. For this reason, and to prevent the container from using the host's ptys, the default Apparmor policy will not allow containers to mount devpts filesystems after the container's init has been started.

  • lxc.devttydir specifies a directory under /dev in which LXC will create its console devices. If this option is not specified, then the ptys will be bind-mounted over /dev/console and /dev/ttyN. However, rare package updates may try to blindly rm -f and then mknod those devices. They will fail (because the file has been bind-mounted), causing the package update to fail. When lxc.devttydir is set to LXC, for instance, then LXC will bind-mount the console ptys onto /dev/lxc/console and /dev/lxc/ttyN, and subsequently symbolically link them to /dev/console and /dev/ttyN. This allows the package updates to succeed, at the risk of making future gettys on those consoles fail until the next reboot. This problem will be ideally solved with device namespaces.
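
As a sketch, a container configuration combining several of these options might include lines like the following (the console path is only an example location):


lxc.arch = x86_64
lxc.tty = 4
lxc.pts = 1024
lxc.console = /var/log/lxc/CN.console
lxc.cap.drop = sys_module mac_admin
lxc.aa_profile = lxc-CN-profile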

Updates in Ubuntu containers

Because of some limitations which are placed on containers, package upgrades at times can fail. For instance, a package install or upgrade might fail if it is not allowed to create or open a block device. This often blocks all future upgrades until the issue is resolved. In some cases, you can work around this by chrooting into the container, to avoid the container restrictions, and completing the upgrade in the chroot.

Some of the specific things known to occasionally impede package upgrades include:

  • The container modifications performed when creating containers with the --trim option.

  • Actions performed by lxcguest. For instance, because /lib/init/fstab is bind-mounted from another file, mountall upgrades which insist on replacing that file can fail.

  • The over-mounting of console devices with ptys from the host can cause trouble with udev upgrades.

  • Apparmor policy and devices cgroup restrictions can prevent package upgrades from performing certain actions.

  • Capabilities dropped by use of lxc.cap.drop can likewise stop package upgrades from performing certain actions.

Libvirt LXC

Libvirt is a powerful hypervisor management solution with which you can administer Qemu, Xen and LXC virtual machines, both locally and remotely. The libvirt LXC driver is a separate implementation from what we normally call LXC. A few differences include:

  • Configuration is stored in xml format

  • There are no tools to facilitate container creation

  • By default there is no console on /dev/console

  • There is no support (yet) for container reboot or full shutdown

Converting a LXC container to libvirt-lxc

Creating Containers showed how to create LXC containers. If you've created a valid LXC container in this way, you can manage it with libvirt. Fetch a sample xml file from


wget http://people.canonical.com/~serge/o1.xml

Edit this file to replace the container name and root filesystem locations. Then you can define the container with:


virsh -c lxc:/// define o1.xml

Creating a container from cloud image

If you prefer to create a pristine new container just for libvirt-lxc, you can download an ubuntu cloud image, extract it, and point a libvirt LXC xml file to it. For instance, find the URL for a root tarball for the latest daily Ubuntu 12.04 LTS cloud image using


arch=amd64   # assumption: set to the architecture you want, e.g. amd64 or i386
url1=`ubuntu-cloudimg-query precise daily $arch --format "%{url}\n"`
url=`echo $url1 | sed -e 's/.tar.gz/-root\0/'`
wget $url
filename=`basename $url`

Extract the downloaded tarball, for instance


mkdir $HOME/c1
cd $HOME/c1
sudo tar zxf $filename

Download the xml template


wget http://people.canonical.com/~serge/o1.xml

In the xml template, replace the name o1 with c1 and the source directory /var/lib/lxc/o1/rootfs with $HOME/c1. Then define the container using


virsh -c lxc:/// define o1.xml

Interacting with libvirt containers

As we've seen, you can create a libvirt-lxc container using


virsh -c lxc:/// define container.xml

To start a container called container, use


virsh -c lxc:/// start container

To stop a running container, use


virsh -c lxc:/// destroy container

Note that whereas the lxc-destroy command deletes the container, the virsh destroy command stops a running container. To delete the container definition, use


virsh -c lxc:/// undefine container

To get a console to a running container, use


virsh -c lxc:/// console container

Exit the console by simultaneously pressing control and ].

The lxcguest package

In the 11.04 (Natty) and 11.10 (Oneiric) releases of Ubuntu, a package was introduced called lxcguest. An unmodified root image could not be safely booted inside a container, but an image with the lxcguest package installed could be booted as a container, on bare hardware, or in a Xen, kvm, or VMware virtual machine.

As of the 12.04 LTS release, the work previously done by the lxcguest package was pushed into the core packages, and the lxcguest package was removed. As a result, an unmodified 12.04 LTS image can be booted as a container, on bare hardware, or in a Xen, kvm, or VMware virtual machine. To use an older release, the lxcguest package should still be used.

Security

A namespace maps ids to resources. By not providing a container any id with which to reference a resource, the resource can be protected. This is the basis of some of the security afforded to container users. For instance, IPC namespaces are completely isolated. Other namespaces, however, have various leaks which allow privilege to be inappropriately exerted from a container into another container or to the host.

By default, LXC containers are started under an Apparmor policy to restrict some actions. However, while stronger security is a goal for future releases, in 12.04 LTS the goal of the Apparmor policy is not to stop malicious actions but rather to stop accidental harm of the host by the guest.

See the LXC security wiki page for more up-to-date information.

Exploitable system calls

It is a core container feature that containers share a kernel with the host. Therefore, if the kernel contains any exploitable system calls, the container can exploit these as well. Once the container controls the kernel it can fully control any resource known to the host.

Resources

  • The DeveloperWorks article LXC: Linux container tools was an early introduction to the use of containers.

  • The Secure Containers Cookbook demonstrated the use of security modules to make containers more secure.

  • Manual pages for the LXC administration tools and for lxc.conf(5), referenced above, are shipped with the lxc package.

  • The upstream LXC project is hosted at Sourceforge.

  • LXC security issues are listed and discussed at the LXC Security wiki page

  • For more on namespaces in Linux, see: S. Bhattiprolu, E. W. Biederman, S. E. Hallyn, and D. Lezcano. Virtual Servers and Checkpoint/Restart in Mainstream Linux. SIGOPS Operating Systems Review, 42(5), 2008.