Managing multi-process applications in containers using s6

How s6 can help you manage multi-process applications in a container, taking the burden of security, signal handling, configuration and log management off your shoulders and making your code more readable!
Łukasz Bednarczyk

Typically, when you are already using a process manager - and the Docker stack is a quite popular example - you do not often find yourself looking for an additional one. And I do call Docker a process supervisor because, unless you are running in swarm mode, that is exactly what it does: process control with an added containerization layer… and a few other features.

And then it may hit you that you need some kind of smart process management inside your container, because there are a few mundane tasks which you absolutely have to take care of. Configuration management, for example. The thing you really, really want to avoid here is debugging the ad-hoc scripting layer which will inevitably accumulate, along with extra packages bloating the image size in each and every Dockerfile.

Just a small hint here: s6 is the perfect option for solving that problem.

Other straightforward reasons to consider the migration include:

  • the strain on both your patience and your CI/CD system with building yet another image for some trivial but necessary subsidiary process
  • getting a bad headache from managing data synchronization between containers sharing a common task

Let me alleviate all that suffering with a solution I cannot recommend highly enough: s6!

What makes s6 so special?

Looking from the perspective of someone expecting their life to become easier, deploying an s6-backed container does exactly that in many use cases. Let me show you how…

Execline shell

With Bash being the king of scripting, you may ask: why on Earth do I need an additional shell on top of my additional process manager? The answer is: simplicity. Execline is an excellent answer to long and complex stacks of binaries calling each other, be that “su-exec”, “dockerize”, “envplate”, what have you… Packed into just 29kB of binary size, you get a quite different, yet brilliant idea for handling process definitions, where the whole script is parsed up front and executed as a single command chain. This also produces much cleaner Git diffs.

Let me clarify with a typical Bash entrypoint script:

    #!/usr/bin/env bash
    exec dockerize -wait tcp://host:3456 \
        ep -v /etc/nginx/nginx.conf -- \
        su-exec www-data nginx -g 'daemon off;'

Let’s now have a look at what a simple and uninvolved refactor to execline could look like, featuring its “exec all lines together” philosophy:

    #!/usr/bin/execlineb -P
    s6-setuidgid www-data
    dockerize -wait tcp://host:3456
    nginx -g "daemon off;"

Please note that:

  • configuration templating (“ep”) has been extracted for security reasons (explained below),
  • line-break escaping is no longer needed,
  • we don’t have to explicitly “exec” into the application,
  • we have replaced “su-exec” with the bundled “s6-setuidgid” binary,
  • we have moved “s6-setuidgid” to the top, following the principle of least privilege.

Further refactoring of common usage patterns to the s6 ecosystem is where the real fun begins. A word of caution, though: keep things simple, because the whole script becomes a single command line, and it must fit in the kernel buffer that holds both command arguments and environment variables (if any). If in doubt, check your system with “getconf ARG_MAX”; in any sane case this should not be less than 64kB. This is also why it is preferred to run execline with the “-P” switch. More on that below…
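The check is trivial to run yourself; a quick sketch, assuming any POSIX system with getconf available:

```shell
# Query the kernel's combined size limit for command-line arguments
# plus environment variables (what a long execline chain must fit in).
limit=$(getconf ARG_MAX)
echo "ARG_MAX = ${limit} bytes"
```

On a typical Linux kernel this reports 2 MB or more, so you have plenty of headroom for any sane script.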

Process security with environment variables

A good example of code simplification is handling access to the environment variables defined on the container, which, following the Twelve-Factor App config principle, are the preferred configuration scheme nowadays. This raises the usual problem of passing secrets, which should only reach specific areas - a problem Docker itself solves only in swarm mode, with its secrets framework. With a simple non-s6 Dockerfile run in the standard mode, the containerized process has access to all the defined environment variables, and those could be leaked over the network by just about any query tool (e.g. “phpinfo()”). Mitigating this usually requires complex scripting, but fortunately we have s6 to the rescue, with its pre-flight scripting framework.

Let me provide a concrete example. Let’s prepare two config files for s6. First - the pre-flight configuration handler, which will be run with full access to the container environment:

    #!/usr/bin/with-contenv bash
    # --- /etc/cont-init.d/01-myapp-env-config-script ---
    # update config files with full access to env vars
    # thanks to the "with-contenv" shebang 
    ep -v /etc/nginx/nginx.conf

And then comes the actual process definition, which is kept away from the environment variables:

    #!/usr/bin/execlineb -P
    # --- /etc/services.d/nginx/run ---
    # current process has explicitly purged environment 
    # because of the "-P" flag for execlineb
    s6-setuidgid www-data
    nginx -g "daemon off;"

As portrayed above, the pre-flight framework defines a clear separation between configuration and the execution environment, providing excellent security. This approach also works well with any kind of configuration management software, e.g. Ansible/Chef/SaltStack. For example:

    #!/usr/bin/with-contenv bash
    # --- /etc/cont-init.d/01-myapp-provision-script ---
    # provision container configuration
    ansible-playbook /usr/src/ansible/playbooks/env_configuration.yml

following that with references to environment variables in the playbook:

    #!/usr/bin/env ansible-playbook
    # --- /usr/src/ansible/playbooks/env_configuration.yml ---
    # update app configuration using environment variables
    - hosts: localhost
      connection: local
      tasks:
        - name: app - overwrite configuration file
          copy:
            content: |
              fqdn = {{ ansible_env.FQDN }}
              production_mode = {{ ansible_env.PRODUCTION_MODE }}
            dest: /etc/app/settings.conf

Environment file loader

If you think that per-process environment cleanup is all s6 can do for you, take another closer look, because there is in fact more. One of the s6 binaries was designed exactly to alleviate the problem of loading environment variables from regular files, and it is called “s6-envdir”. As the name suggests, all you need is a directory full of config files, with a single rule: the file name becomes the environment variable name, and the first line of the file becomes its value. Passing file-based variables to a process becomes as easy as adding a single line to your execline task script, for example:
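To illustrate the directory layout s6-envdir expects, here is a minimal sketch; the variable names mirror the earlier examples, and the temporary directory is purely for demonstration:

```shell
# Build a demo envdir: one file per variable, where the file name is
# the variable name and its first line is the value.
dir=$(mktemp -d)
printf 'app.example.com\n' > "$dir/FQDN"
printf '1\n' > "$dir/PRODUCTION_MODE"
# With s6 installed, `s6-envdir "$dir" myprog` would now run `myprog`
# with FQDN and PRODUCTION_MODE added to its environment.
ls "$dir"
```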

    #!/usr/bin/execlineb -P
    # --- /etc/cont-init.d/01-myapp-env-loader-script ---
    # environment purged by the "-P" flag, then re-loaded
    # from files by "s6-envdir"
    s6-envdir /environment/variables/in/a/directory
    # update configuration using the loaded variables
    # ("elglob" expands the pattern, as execline does no globbing itself)
    elglob conf /configuration/files/*.conf
    ep -v $conf

And since we have already separated configuration handling from our process definition, the latter does not need any updates.

Signal handling

Ever thought that running your application as the main container process (PID 1) was fun? Think again, because you are in for a world of surprises. The kernel’s special treatment of, and expectations towards, PID 1 are the sole reason why you need a container init system, even a simple one like the famous “dumb-init” project.

And why waste time configuring that anyway? If you want to stop scratching your head every time you switch your base image from, say, Ubuntu to Alpine, and stop wondering why your favourite init package is not there anymore, just go with s6 - it takes care of forwarding signals and reaping zombie processes just fine.
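To make the PID 1 problem concrete, here is roughly what a hand-rolled shell entrypoint has to do (and what s6 does for you): trap the signal, forward it, then reap the child. A runnable sketch using “sleep” as a stand-in service:

```shell
# A bare shell entrypoint has to trap and forward signals itself;
# otherwise SIGTERM from `docker stop` never reaches the service.
trap 'kill -TERM "$child" 2>/dev/null' TERM INT
sleep 30 & child=$!           # stand-in for the real service
( sleep 1; kill -TERM $$ ) &  # simulate `docker stop` signalling PID 1
wait "$child" || true         # interrupted by SIGTERM; trap forwards it
wait "$child" || true         # reap the child so no zombie is left
reaped=yes
echo "child reaped, exiting cleanly"
```

With s6 as /init, all of this boilerplate simply disappears from your scripts.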

Privilege control

Obviously you could not get very far without some system of privilege management, which s6 happily provides, nicely aligned with the execline philosophy. Both common patterns, privilege escalation and de-escalation, are supported: the installation package includes the “s6-sudo” and “s6-setuidgid” binaries, respectively.

Log management

Another mundane task you definitely do not want to script and/or install extra packages for is log handling, specifically rotation. And you’ve guessed it: there is an s6 binary just for that, aptly called “s6-log”, which does all the heavy lifting you may ever require.
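By s6 convention, a service gets its logger from a script in its “log” subdirectory; a minimal sketch (the log directory path and rotation parameters below are illustrative):

    #!/usr/bin/execlineb -P
    # --- /etc/services.d/nginx/log/run ---
    # read the service's stdout from stdin, rotate at ~1 MB,
    # keep 10 archived log files in /var/log/nginx
    s6-log n10 s1000000 /var/log/nginx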

Just in case you still prefer handing log management off to the Docker daemon, consider setting the environment variable “S6_LOGGING=0”, which forwards subprocess streams to the container’s stdout/stderr.

Resource management

Another cool part is the flexibility of the per-subprocess resource quota system, enforced with the “s6-softlimit” program.
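Since “s6-softlimit” is a chain-loading program, it slots straight into a service script; a sketch where “myapp” and the limit values are purely illustrative:

    #!/usr/bin/execlineb -P
    # --- /etc/services.d/myapp/run ---
    # cap memory at 256 MB and open descriptors at 1024,
    # then drop privileges and start the service
    s6-softlimit -m 268435456 -o 1024
    s6-setuidgid www-data
    myapp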

Fixing ownership and permissions

The s6 configuration framework also supports patching file/directory ownership and permissions using configuration files placed in /etc/fix-attrs.d, which is quite well documented. Just a word of warning: expect serious performance issues when recursing over deeply nested directory structures with a large number of files. It will not be any faster than running chown/chmod on each item, so select your paths wisely…
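For reference, each line in a fix-attrs.d file describes one path to fix; a minimal example (the path and account below are hypothetical):

    # --- /etc/fix-attrs.d/01-app-data ---
    # format: path recurse account fmode dmode
    /var/lib/app true www-data 0640 0750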

And last but not least…

Worth mentioning are the small s6 installation package size and the platform compatibility, which is absolutely stellar - all common base images are supported.

How to use s6 in a Docker environment

This topic has already been extensively covered elsewhere, but just in case you need a short summary, it boils down to a few pointers. First, extract the s6 overlay package to the root of your container filesystem; then add a layer with your pre-flight scripts and service definitions in the appropriate locations (/etc/cont-init.d, /etc/services.d, etc.); and lastly, set the container entrypoint to the s6 init script (/init) and voilà, you’re ready to go!
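The steps above can be sketched as a Dockerfile; the base image, release version and the “rootfs/” directory holding your scripts are illustrative, so pick a current s6-overlay release for your architecture:

    FROM nginx:alpine
    ADD https://github.com/just-containers/s6-overlay/releases/download/v2.2.0.3/s6-overlay-amd64.tar.gz /tmp/
    RUN tar xzf /tmp/s6-overlay-amd64.tar.gz -C / \
        && rm /tmp/s6-overlay-amd64.tar.gz
    # pre-flight scripts and service definitions, laid out as described above
    COPY rootfs/ /
    ENTRYPOINT ["/init"]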

Cover Image Credits: Pixabay