Proxmox + Docker in LXC containers
The old way
When I started with Proxmox, I installed it on a disk pool with ZFS. The exact disk layout has changed since then, but the filesystem is still ZFS. Back then Docker didn't natively support ZFS as a backing filesystem: it worked, but it was really slow. In 2024 this shouldn't be the case anymore, as Docker now has a ZFS storage driver. More on that in the "The new way" section.
The best way to get Docker working, and also the recommended way, is to use a full VM. The VM will typically format its virtual disk as ext4, which performs great with Docker. A VM, however, uses more resources than an LXC container, while an LXC container uses the filesystem of the host OS it runs on, in my case ZFS. Luckily there is a workaround:
- Create an ext4 partition on ZFS: https://wiki.joeplaa.com/zfs#create-ext4-partition-on-zfs
- Create a mount point that stores the Docker data directory /var/lib/docker on this partition instead of on ZFS: https://wiki.joeplaa.com/en/proxmox#docker-in-container
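For reference, the two linked pages boil down to something like this. A minimal sketch, assuming a pool called rpool; the zvol name, its size and the container ID 100 are made up, so check the linked wiki pages for the exact steps:
# on the Proxmox host: create a zvol, format it as ext4 and mount it
zfs create -V 32G rpool/docker-ext4
mkfs.ext4 /dev/zvol/rpool/docker-ext4
mkdir -p /mnt/docker-ext4
mount /dev/zvol/rpool/docker-ext4 /mnt/docker-ext4
# bind the host directory into container 100 as /var/lib/docker
pct set 100 -mp0 /mnt/docker-ext4,mp=/var/lib/docker
Add the mount to /etc/fstab on the host if it should survive reboots.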
Pros, cons and gotchas
This approach works perfectly fine. However, there are some things to think about:
- If you migrate an LXC container to another Proxmox host, the container is stopped on the old host, replicated on the new host, started on the new host and destroyed on the old host. This will result in downtime during the replication step. When using a VM, there is virtually no downtime.
- You can save backup time and space if you uncheck "Backup" on the mount point. Why back up images, containers and volumes, right? That's part of the reason for using Docker in the first place; those should all be ephemeral.
- If, however, you use Docker volumes to store data instead of a mapped path, all your data is lost if you unchecked "Backup" on the mount point and restore the container from a backup. Yes, that happened to me. See the compose sketch after this list for the difference.
- If you decide to back up the mount point, you will probably run into exit code 23 (something like: command 'rsync --stats -h -X -A --numeric-ids -aH --delete --no-whole-file --sparse --one-file-system --relative '--exclude=/tmp/?*' '--exclude=/var/tmp/?*' '--exclude=/var/run/?*.pid' /proc/295998/root//./ /var/tmp/vzdumptmp296420_101' failed: exit code 23). This can be solved by setting the acltype to posixacl for the temporary backup directory:
  - Create a backup temp folder /tmp/vzdump: sudo mkdir /tmp/vzdump
  - Mount the ZFS dataset rpool/tmp there (create it first with sudo zfs create rpool/tmp if it doesn't exist): sudo zfs set mountpoint=/tmp/vzdump rpool/tmp
  - Set the ACL type to POSIX for the mount: sudo zfs set acltype=posixacl rpool/tmp
  - Set tmpdir: /tmp/vzdump in /etc/vzdump.conf: sudo nano /etc/vzdump.conf
# vzdump default settings
tmpdir: /tmp/vzdump
#dumpdir: DIR
#storage: STORAGE_ID
#mode: snapshot|suspend|stop
#bwlimit: KBPS
#performance: [max-workers=N][,pbs-entries-max=N]
#ionice: PRI
#lockwait: MINUTES
#stopwait: MINUTES
#stdexcludes: BOOLEAN
#mailto: ADDRESSLIST
notification-policy: failure
#prune-backups: keep-INTERVAL=N[,...]
#exclude-path: PATHLIST
#pigz: N
notes-template: {{guestname}}
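A quick way to check that the workaround sticks; the container ID 101 below is taken from the rsync error above:
# confirm the ACL type on the temporary backup dataset
zfs get acltype rpool/tmp
# run a manual backup that uses the new tmpdir
vzdump 101 --tmpdir /tmp/vzdump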
- If you want to mount an NFS share directly in a container, the container needs to be privileged. Docker doesn't like that, something to do with AppArmor. You have to open the config file /etc/pve/lxc/xyz.conf, add the lines below and restart the container to make Docker start (https://forum.proxmox.com/threads/run-docker-inside-lxc.112004/).
# xyz.conf
...
lxc.apparmor.profile: unconfined
lxc.cgroup2.devices.allow: a
lxc.cap.drop:
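To illustrate the volume gotcha from earlier: a minimal compose sketch (the image, service and path names are made up) showing the difference between a named volume, which lives under /var/lib/docker/volumes on the mount point, and a bind-mounted path, which lives outside it:
# docker-compose.yml (sketch)
services:
  db:
    image: postgres:16
    volumes:
      - db-data:/var/lib/postgresql/data   # named volume: stored under /var/lib/docker/volumes, gone if that mount point isn't backed up
      - ./postgres-config:/etc/postgresql  # bind mount: stored next to the compose file, outside /var/lib/docker
volumes:
  db-data: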
The new way (not working)
- Create /etc/docker/daemon.json and add:
{
"storage-driver": "zfs"
}
- Restart docker: systemctl restart docker
- And it doesn't work:
failed to start daemon: error initializing graphdriver: prerequisites for driver not satisfied (wrong filesystem?): zfs
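When systemctl only reports that the unit failed, the actual reason, like the message above, is in the daemon log:
# show the most recent Docker daemon log lines
journalctl -u docker.service --no-pager -n 50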
Right... So I rebooted the container, added a dedicated mount point at /var/lib/docker and made sure the filesystem is ZFS with df -T:
root@docker-test:~# df -T
Filesystem Type 1K-blocks Used Available Use% Mounted on
rpool/data/subvol-100-disk-0 zfs 8388608 808320 7580288 10% /
rpool/data/subvol-100-disk-1 zfs 8388608 256 8388352 1% /var/lib/docker
none tmpfs 492 4 488 1% /dev
efivarfs efivarfs 176 93 79 55% /sys/firmware/efi/efivars
tmpfs tmpfs 131892140 0 131892140 0% /dev/shm
tmpfs tmpfs 52756856 84 52756772 1% /run
tmpfs tmpfs 5120 0 5120 0% /run/lock
None of the above worked. It seems that Docker needs to communicate with ZFS directly, but ZFS runs on the host, not inside the LXC.
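That matches how the zfs storage driver works: the userland zfs tools talk to the kernel module through the control device /dev/zfs, and an LXC container doesn't get that device by default. A quick check from inside the container (assuming zfsutils-linux is installed there):
# is the ZFS control device present inside the container?
ls -l /dev/zfs
# without /dev/zfs, any zfs command should fail
zfs list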
I tried mapping /dev/zfs to the container (lxc.mount.entry: /dev/zfs dev/zfs none rbind,create=file), but this gave me a new error: failed to start daemon: error initializing graphdriver: Cannot find root filesystem rpool/data/subvol-100-disk-1: exit status 1: "/usr/sbin/zfs list -rHp -t filesystem -o name,origin,used,available,mountpoint,compression,type,volsize,quota,referenced,written,logicalused,usedbydataset rpool/data/subvol-100-disk-1" => cannot open 'rpool/data/subvol-100-disk-1': dataset does not exist.
This sounds reasonable, as that dataset is mounted by the host. Telling the LXC what this mount means by passing /proc/mounts (lxc.mount.entry: /proc/self/mounts proc/self/mounts none rbind,create=file) made the container not boot at all (Too many levels of symbolic links).