Containerization


In the last article, I talked about what virtualization is and how important the concept and the technology are to the whole cloud computing paradigm.

Now I want to talk about containerization: a different approach to isolation that does not use a hypervisor, but instead relies on specific kernel features that isolate processes from the rest of the system.

In short, containerization is a form of operating system virtualization in which applications run in isolated user spaces called containers.

Each process “container” or “jail” has a private root file system and process namespace.
While sharing the kernel and other services of the underlying OS, they cannot access files or resources outside of their container.
In essence, containers are fully packaged and portable computing environments.

Source: Unix And Linux System Administration Handbook

We said that this type of virtualization does not require a hypervisor. Because there is no need to virtualize the hardware, its resource overhead is low.

This also means that container startup time is short. Creating a container has roughly the overhead of creating a Linux process, which can be on the order of milliseconds, while booting a VM can take seconds.
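To get a rough feeling for the "order of milliseconds" claim, here is a minimal sketch that times how long it takes to start and reap a trivial child process. The function name `time_process_spawn` is my own, and the measurement is only an approximation: it includes Python interpreter startup and sets up no namespaces or cgroups, so treat it as a ballpark for process creation, not as a benchmark of any container runtime.

```python
import subprocess
import sys
import time


def time_process_spawn(runs: int = 5) -> float:
    """Return the average wall-clock seconds needed to start and reap a trivial child process."""
    start = time.perf_counter()
    for _ in range(runs):
        # The child is a Python interpreter that exits immediately ("pass").
        # This is an assumption made for portability; it adds interpreter
        # startup time on top of the bare process creation cost.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    average = time_process_spawn()
    print(f"average spawn time: {average * 1000:.1f} ms")
```

On a typical machine this should land well below the seconds a VM usually needs to boot, but the exact number depends on the hardware and the Python installation.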

A containerized application can run on various types of infrastructure (on bare metal, within VMs, and in the cloud) without needing to be refactored for each environment.

How do containers relate to VMs?

Both are portable, isolated execution environments, and both look and act like full operating systems.
Unlike a VM, which has its own OS kernel, drivers to interact with hardware, and so on, a container merely mimics an operating system. The container itself is abstracted away from the host OS, with only limited access to underlying resources. Loosely speaking, we can think of it as a lightweight VM.

The containers-on-VMs architecture is standard for containerized applications that need to run on public cloud instances.

How does containerization actually work?

We said that containerization relies on specific kernel features, but which features are those?
Containerization as we know it evolved from cgroups, a feature for isolating and controlling resource usage (e.g., how much CPU and RAM and how many threads a given process can access) within the Linux kernel.
Cgroups were originally developed by Paul Menage and Rohit Seth of Google, and their first features were merged into Linux 2.6.24. Linux Containers (LXC) then built on cgroups, combining them with namespace isolation of components such as routing tables and file systems.
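To make "controlling resource usage" a bit more concrete, here is a small sketch that interprets two limit files. It assumes the newer cgroup v2 interface, where `cpu.max` holds `<quota> <period>` (or `max <period>` for no limit) and `memory.max` holds a byte count (or `max`). The function names are hypothetical helpers of my own, and the sketch works on the file contents as strings, so it does not depend on where the cgroup filesystem is mounted.

```python
from typing import Optional


def parse_cpu_max(text: str) -> Optional[float]:
    """Turn the content of a cgroup v2 cpu.max file into a number of CPUs.

    Returns None when the quota is "max", meaning no CPU limit is set.
    """
    quota, period = text.split()
    if quota == "max":
        return None
    # quota and period are both in microseconds; their ratio is the CPU share.
    return int(quota) / int(period)


def parse_memory_max(text: str) -> Optional[int]:
    """Turn the content of a cgroup v2 memory.max file into a byte limit.

    Returns None when the value is "max", meaning no memory limit is set.
    """
    value = text.strip()
    return None if value == "max" else int(value)


if __name__ == "__main__":
    # Sample contents, made up for illustration.
    print(parse_cpu_max("50000 100000"))   # half a CPU
    print(parse_memory_max("536870912"))   # 512 MiB expressed in bytes
    print(parse_cpu_max("max 100000"))     # no limit
```

A process placed in a cgroup with `50000 100000` in `cpu.max` may use at most half of one CPU in every 100 ms period, which is the kind of limit a container engine can set for a container.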

Namespaces are a kernel mechanism for limiting the visibility that a group of processes has of the rest of a system. For example, you can limit visibility to certain process trees, network interfaces, user IDs, or filesystem mounts. Namespaces were originally developed by Eric Biederman, and the final major namespace was merged into Linux 3.8.

Since kernel version 5.6, there are eight kinds of namespaces. They all work the same way: each process is associated with a namespace and can only see or use the resources associated with that namespace, and with descendant namespaces where applicable. This way each process (or group of processes) can have its own view of the resources.

bojana@linux:~$ sudo lsns -p 270
        NS TYPE   NPROCS PID USER COMMAND
4026531835 cgroup    130   1 root /sbin/init
4026531836 pid       126   1 root /sbin/init
4026531837 user      130   1 root /sbin/init
4026531838 uts       123   1 root /sbin/init
4026531839 ipc       126   1 root /sbin/init
4026531840 mnt       114   1 root /sbin/init
4026531992 net       126   1 root /sbin/init

In Linux, lsns lists information about all currently accessible namespaces or, with the -p option, about the namespaces of a given process. The eight kinds of namespaces are:

  • cgroup (cgroup root directory)
  • mnt (mount points, filesystems)
  • pid (processes)
  • net (network stack)
  • ipc (System V IPC)
  • uts (hostname)
  • user (UIDs)
  • time (boot and monotonic clocks)
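The kernel also exposes a process's namespaces as symbolic links under /proc/<pid>/ns, with targets of the form `pid:[4026531836]`. The sketch below reads those links to produce a rough equivalent of the NS and TYPE columns shown earlier. The names `parse_ns_link` and `namespaces_of` are my own, and the `proc_root` parameter is just an assumption added so the function can be pointed at a different directory.

```python
import os
import re
from typing import Dict, Optional, Tuple

# Link targets under /proc/<pid>/ns look like "net:[4026531992]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target: str) -> Optional[Tuple[str, int]]:
    """Split a namespace link target into (kind, inode), or return None if it does not match."""
    match = _NS_LINK.match(target)
    if match is None:
        return None
    return match.group("kind"), int(match.group("inode"))


def namespaces_of(pid: str = "self", proc_root: str = "/proc") -> Dict[str, int]:
    """Map each entry in <proc_root>/<pid>/ns to the inode number of its namespace."""
    ns_dir = os.path.join(proc_root, pid, "ns")
    result: Dict[str, int] = {}
    for name in sorted(os.listdir(ns_dir)):
        try:
            target = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Entry is not a readable symlink; skip it.
            continue
        parsed = parse_ns_link(target)
        if parsed is not None:
            result[name] = parsed[1]
    return result


if __name__ == "__main__":
    # Only meaningful on Linux, where /proc/self/ns exists.
    if os.path.isdir("/proc/self/ns"):
        for name, inode in namespaces_of().items():
            print(f"{name:<20} {inode}")
```

Two processes that show the same inode number for a given kind share that namespace. A process inside a container would typically show different numbers than /sbin/init on the host.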

Modern containers evolved from these two kernel features, and LXC served as the basis for Docker, which launched in 2013. In its early years Docker used LXC, but it later developed its own library (libcontainer) instead.

Docker

For most people nowadays, Docker is the first association when you say “container”, but as we have seen, containerization technology is not that new (cgroups date back to 2008). Docker’s rapid spread can instead be attributed to the set of tools it introduced, which took advantage of the already existing containerization technology.

Docker’s tools evolved rapidly, and new versions were often incompatible with existing deployments. To counter this, Docker Inc. became one of the founding members of the Open Container Initiative, a consortium whose mission is to guide the growth of container technology in a healthily competitive direction that fosters standards and collaboration.

We will talk about Docker architecture and Docker as a container engine in one of the following articles.