Use a Dockerfile for your Binder

Binder supports configuration files for package installation, environment specification, post-build shell scripts, and more. It should be possible to create the environment that you want without using a Dockerfile. For more information about the different environment configuration files that Binder can create, see Preparing a repository for Binder.

However, in case you cannot meet all your needs with these configuration files, it is also possible to use a Dockerfile to define your environment. This is considered an advanced use case, and we cannot guarantee that the Dockerfile will work.

This guide will help you in preparing your Dockerfile so that it has the components needed to run JupyterHub, allowing it to work on Binder deployments.

Important

We recommend against using a Dockerfile as a way to make your repository usable with binder. Use them as a last resort after all methods in Preparing a repository for Binder have failed.

Note

Binder’s requirements for Dockerfiles are in beta and subject to change. Dockerfiles may break on Binder from time to time during the beta period.

When should you use a Dockerfile?

Below are a few use-cases where you might want to use a Dockerfile with Binder.

When you are building complex software

Most Binder configurations can be achieved without a Dockerfile. Before resorting to a Dockerfile, we recommend trying to use postBuild commands for configuration. See the repo2docker documentation for examples.

When you’re using a language that is not directly supported

Binder supports many languages, but not all of them. If your need to use a different language, it may be possible to accomplish this with a Dockerfile.

For a list of languages that Binder supports with configuration files, see the repo2docker documentation.

Note

We welcome contributions to repo2docker to add support for new languages. If interested, please open an issue.

Preparing your Dockerfile

For a Dockerfile to work on Binder, it must meet the following requirements:

  1. It must install a recent version of Jupyter Notebook. This should be installed via pip with the notebook package. So in your dockerfile, you should have a command such as:

    RUN pip install --no-cache-dir notebook==5.*
    
  2. It must explicitly specify a tag in the image you source.

    When sourcing a pre-existing Docker image with FROM, a tag is required. The tag cannot be latest. Note that tag naming conventions differ between images, so we recommend using the SHA tag of the image.

    Here’s an example of a Dockerfile FROM statement that would work.

    FROM jupyter/scipy-notebook:cf6258237ff9
    

    Note

    The following examples would not work:

    FROM jupyter/scipy-notebook
    

    or

    FROM jupyter/scipy-notebook:latest
    
  3. It must set up a user whose uid is 1000. It is bad practice to run processes in containers as root, and on binder we do not allow root container processes. If you are using an ubuntu or debian based container image, you can create a user easily with the following directives somewhere in your Dockerfile:

    ENV NB_USER jovyan
    ENV NB_UID 1000
    ENV HOME /home/${NB_USER}
    
    RUN adduser --disabled-password \
        --gecos "Default user" \
        --uid ${NB_UID} \
        ${NB_USER}
    

    This is the user that will be running the jupyter notebook process when your repo is launched with binder. So any files you would like to be writeable by the launched binder notebook should be owned by this user.

  4. It must copy its contents to the $HOME directory and change permissions.

    To make sure that your repository contents are available to users, you must copy all contents to $HOME and then make this folder owned by the user you created in step 3. If you used the snippet provided in step 3, you can accomplish this copying with the following snippet:

    # Make sure the contents of our repo are in ${HOME}
    COPY . ${HOME}
    USER root
    RUN chown -R ${NB_UID} ${HOME}
    USER ${NB_USER}
    

    This chown is required because Docker will be default set the owner to root, which would prevent users from editing files.

  5. It must specify a default command.

    This is the command that is executed on startup.

    # Specify the default command to run
    CMD ["jupyter", "notebook", "--ip", "0.0.0.0"]
    

Ensuring reproducibility with Dockerfiles

Ensuring that your Binder environment is reproducible requires extra considerations when using a Dockerfile. This section provides some guidelines for making sure your Binder environment does not change in unexpected ways.

As mentioned above, make sure that you source your Dockerfile from a tag of another image. This ensures that you will continue building off of the same image even if the image is updated to a new version.

Next, make sure that all packages installed with your Dockerfile are pinned to specific versions. You should do this with the the image you are sourcing as well.