![]()
|
We have discussed Linux Containers in the past. There are many good reasons to run containers, such as file system isolation, ease of deployment, and fewer resources (e.g. less overhead).
In this session we'll discuss Containers from a networking aspect. Covering different Container frameworks, what are the advantages, and limitations of the frameworks and networking. We'll cover:
All of these containers use Linux Network Name Spaces (NNS) to isolate the containers from the main host networking system. However not all register their namespaces with iproute2
.
The key to Container networking, is that from inside the container, the network appears to be a normal eth0
interface with an IP address and a default route. But from the host side, one will see veth
interfaces which are used to forward the packets from the container to the hosts interface.
From the Man Page : A namespace wraps a global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource. Changes to the global resource are visible to other processes that are members of the namespace, but are invisible to other processes. One use of namespaces is to implement containers.
Container environments use Namespaces to prevent containers from stepping on other containers network configurations. to view the namespaces configured on the host use the lsns
command:
NS TYPE NPROCS PID USER COMMAND
4026531834 time 10 3768 cvmiller -ash
4026531835 cgroup 9 3768 cvmiller -ash
4026531836 pid 9 3768 cvmiller -ash
4026531837 user 3 3768 cvmiller -ash
4026531838 uts 9 3768 cvmiller -ash
4026531839 ipc 9 3768 cvmiller -ash
4026531840 net 7 3768 cvmiller -ash
4026531841 mnt 3 3768 cvmiller -ash
4026533669 user 7 4632 cvmiller catatonit -P
4026533670 mnt 5 4632 cvmiller catatonit -P
4026533671 net 3 4784 cvmiller rootlessport
4026533744 mnt 1 4811 cvmiller /whoami
4026533745 mnt 1 4780 cvmiller /usr/bin/slirp4netns --disable-host-loopback --mtu=65520 --enable-sandbox --enable-seccomp --enabl
4026533746 uts 1 4811 cvmiller /whoami
4026533747 ipc 1 4811 cvmiller /whoami
4026533748 pid 1 4811 cvmiller /whoami
4026533749 cgroup 1 4811 cvmiller /whoami
Docker uses the term bridge to describe what is really a proxy. In Classical networking a bridge is a Layer 2 device, that forwards packets based on destination MAC address. This is not how the Docker Bridge works.
Container bridges often include input/output port mapping, and even IP protocol conversion (IPv6<->IPv4).
The advantage/disadvantage of the Container framework bridge is that it creates isolation of the container from the network. For example, only certain TCP/UDP ports are forwarded into the Container.
There are two general types of Container network attachments:
Not surprisingly, NAT (Net Address Translation) has problems.
You can usually tell if NAT is involved by running the ip addr
command on the host and looking at the names of the interfaces (see table below), but not always.
Another way to determine which containers are running is via the netstat -antp
command (must be root for -p), and looking for the telltale proxy running.
sudo netstat -antp
...
tcp 0 0 :::8080 :::* LISTEN 4784/rootlessport
tcp 0 0 :::8888 :::* LISTEN 4932/docker-proxy
I have spoke of the limitations of NAT in the past (NAT is evil). One of the biggest limitations in the world of Containers is that NAT limits the container to which port (think: TCP) it can listen to.
For example, if you have webservices app in a Container that listens on ports 80 and 443 with NAT, you can map those to the external host on ports 80 and 443. No problem, yet. But if you start a second container with webservices the second container can not listen to port 80 and 443, because the first container is already listening to those ports.
Of course, you could set the second container to listen to 8080 and 8443, but then you would need to communicate that to your user-base that you are using special ports for that service. And you would have to adjust your firewall rules to allow those secondary ports to be open.
As you add more containers, this problem becomes worse, and doesn't really scale, since all the containers share the same IP address of the Host.
Here's a summary table of some of the container types, and their characteristics using the default configuration:
FrameWork | NAT? | Host Interface | Version | IPv6 support |
---|---|---|---|---|
LXD | No | bridge=veth | 6.0.0 | Yes |
Incus | No | macvlan=nothing | 5.0.2 | Yes |
Docker | Yes | docker0 & veth | 26.1.2 | Port forwarding |
Podman | Yes | nothing | 4.9.4 | Port forwarding |
All of the (above) containers support IPv6, however they take different strategies, which can be divided into three support mechanisms:
Log into Container Host and answer the following questions?
Attempt/Answer the following:
Basic commands to see what containers have been instantiated and their PIDs in a given container frame work:
Framework | show containers | show container PID |
---|---|---|
LXD | lxc ls |
lxc -ls -c n,p |
Incus | incus ls |
incus ls -c n,p |
Docker | docker ps -a |
docker top <container id> |
Podman | podman ps -a |
podman ps -a --ns |
Looking at the network namespace for each container, using PID , and running the ip addr
command from within that container's namespace:
sudo nsenter -t <PID> -n ip addr
As mentioned earlier, some container frameworks have excellent IPv6 support, allowing one to have several containers all listening on port 80 & 443, others less so. That said, Docker and Podman support Open Containers Initiative (OCI) Containers and Container Images, and therefore can run a wider range of containers. For example Podman can run Docker containers.
Unfortunately, LXD & Incus, do not support OCI container images, but use a simpler bundled compressed-tar-format, which can be created using distrobuilder.
Many reasons why it is useful to run containers, and it is possible to run multiple container frameworks on one system. The Raspberry Pi 3b+ used in the lab is running three systems simultaneously!
So now that you know a little more about what is happening under the hood, freel free to go out and containerize your applications.
Notes:
Installation Links
Good namespaces tutorial on creating namespaces manually
It is possible to bind docker and podman containers to a single IPv6 address, and therefore run services on the same ports, but listening to different IPv6 addresses. For example, one can bind to the ULA and the GUA separately (assuming your GUA starts with 2001, and your ULA begins with fd10):
export IPADDR=$(ip a | grep 2001 | cut -d ' ' -f 6 | cut -d '/' -f 1)
podman run --name iampodman --rm -p "[$IPADDR]:8888:80" docker.io/containous/whoami
export IPADDR=$(ip a | grep 'fd10:.*288' | cut -d ' ' -f 6 | cut -d '/' -f 1)
docker run --name iamdocker --rm -p "[$IPADDR]:8888:80" containous/whoami
23 May 2024