GKE v1.13 Upgrade

May 2019 · 1 minute read

We recently upgraded a Kubernetes cluster to GKE v1.13.5.

After the upgrade, the Docker client stopped working and crashed with a segmentation fault on every invocation we tried.

$ docker ps
Segmentation fault (core dumped)

Uh-oh.

This cluster runs a Jenkins instance that launches build containers in the cluster. Each build container eventually runs Docker commands like docker build -t ${env.IMAGE_NAME} -f path/to/Dockerfile . to build container images. Rather than installing Docker in the build image, we mount the Docker client binary and the Docker socket from the Container-Optimized OS host VM into the build container.

A simplified Pod definition of the build container:

apiVersion: v1
kind: Pod
metadata:
  name: build-container
spec:
  containers:
    - name: build-container
      image: 'centos:7'
      command: 
        - sh
        - -c 
        - '<build command>'
      volumeMounts:
        - mountPath: /var/run/docker.sock
          name: docker-sock
        - mountPath: /usr/bin/docker
          name: docker
  volumes:
    - name: docker
      hostPath:
        path: /usr/bin/docker
        type: File
    - name: docker-sock
      hostPath:
        path: /var/run/docker.sock
        type: Socket

Whether this should ever have worked is a fair question. However, it had worked for the past two years, and the mounted Docker client only broke after the upgrade to GKE v1.13.

Our current workaround for the segmentation fault is to use the official static Docker binaries in our build container instead of the client mounted from the host.

Reading from and writing to the mounted socket was never an issue, and the official Docker client binaries seem to work so far.
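
As a rough sketch of that workaround, the script below downloads a static Docker release tarball and installs only the client binary into the build image. The version (18.09.6), the architecture, the install directory, and the function names docker_static_url and install_docker_client are assumptions for illustration, not the exact setup we run; pin whichever release matches the Docker daemon on your nodes.

```shell
#!/bin/sh
# Sketch: install the official static Docker client into a build image.
# DOCKER_VERSION, DOCKER_ARCH and the default install directory are
# assumptions; adjust them to match your environment.

DOCKER_VERSION="${DOCKER_VERSION:-18.09.6}"
DOCKER_ARCH="${DOCKER_ARCH:-x86_64}"

# Print the download URL of a static release tarball.
# Usage: docker_static_url <version> <arch>
docker_static_url() {
  printf 'https://download.docker.com/linux/static/stable/%s/docker-%s.tgz\n' "$2" "$1"
}

# Download the tarball and copy only the client binary into the target
# directory (hypothetical helper; the default path is an assumption).
# Usage: install_docker_client [dest_dir]
install_docker_client() {
  dest="${1:-/usr/local/bin}"
  tmp="$(mktemp -d)" || return 1
  curl -fsSL -o "$tmp/docker.tgz" \
      "$(docker_static_url "$DOCKER_VERSION" "$DOCKER_ARCH")" &&
    tar -xzf "$tmp/docker.tgz" -C "$tmp" docker/docker &&
    install -m 0755 "$tmp/docker/docker" "$dest/docker"
  status=$?
  rm -rf "$tmp"
  return "$status"
}

# Only touch the network when explicitly asked to, e.g. `sh script.sh install`.
if [ "${1:-}" = "install" ]; then
  install_docker_client "${2:-/usr/local/bin}"
fi
```

With the client baked into the image this way, the Pod definition no longer needs the docker volume or its /usr/bin/docker mount; only the docker-sock volume has to stay.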