Repeated installation of a package inside docker image

I built a Python package called my-package. I have no intention of making it public, so it is mostly installed through our internal servers. Recently a senior developer built a Docker-based architecture in which the application is hosted, with my-package as a dependency.

The problem is that, in order to test the package, I REPEATEDLY need to COPY my code into the Docker image, uninstall the old version of the package, and re-install it from the local code.

  1. Rebuild the entire image, which takes half an hour. Not an option.
  2. Create another Dockerfile FROM the existing image and run only the commands that COPY and install the pip package. This is my current solution, but it is not very efficient.

I am pretty sure other Docker users have come across this issue, so I would like an expert opinion on the most efficient way to handle it.

UPDATE: The Dockerfile

# VERSION 1.8.2
# AUTHOR: Matthieu "Puckel_" Roisil
# DESCRIPTION: Basic Airflow container
# BUILD: docker build --rm -t puckel/docker-airflow .
# SOURCE: https://github.com/puckel/docker-airflow

FROM ubuntu:17.10
MAINTAINER Puckel_

# Never prompts the user for choices on installation/configuration of packages
ENV DEBIAN_FRONTEND noninteractive
ENV TERM linux

# Airflow
ARG AIRFLOW_VERSION=1.8.9
ARG AIRFLOW_HOME=/usr/local/airflow

# Define en_US.
ENV LANGUAGE en_US.UTF-8
ENV LANG en_US.UTF-8
ENV LC_ALL en_US.UTF-8
ENV LC_CTYPE en_US.UTF-8
ENV LC_MESSAGES en_US.UTF-8
ENV LC_ALL en_US.UTF-8

ENV MATPLOTLIBRC /etc

RUN set -ex \
    && buildDeps=' \
        python3.6-dev \
        libkrb5-dev \
        libsasl2-dev \
        libssl-dev \
        libffi-dev \
        build-essential \
        libblas-dev \
        liblapack-dev \
        libpq-dev \
        git \
        wget \
    ' \
    && apt-get update -yqq \
    && apt-get dist-upgrade -yqq \
    && apt-get install -yqq --no-install-recommends \
        $buildDeps \
        python3.6 \
        python3.6-tk \
        apt-utils \
        curl \
        netcat \
        locales \
        ca-certificates \
        sudo \
        libmysqlclient-dev \
    && ln -s /usr/bin/python3.6 /usr/bin/python \
    && sed -i 's/^# en_US.UTF-8 UTF-8$/en_US.UTF-8 UTF-8/g' /etc/locale.gen \
    && locale-gen \
    && update-locale LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 \
    && useradd -ms /bin/bash -d ${AIRFLOW_HOME} -u 1500 airflow \
    && mkdir ${AIRFLOW_HOME}/logs \
    && wget https://bootstrap.pypa.io/get-pip.py \
    && python get-pip.py \
    && rm -rf get-pip.py \
    && python -m pip install Cython \
    && python -m pip install requests \
    && python -m pip install pytz \
    && python -m pip install pyOpenSSL \
    && python -m pip install ndg-httpsclient \
    && python -m pip install pyasn1 \
    && python -m pip install Flask-OAuthlib \
    && python -m pip install apache-airflow[crypto,celery,postgres,ldap,jdbc,mysql,s3,samba]==$AIRFLOW_VERSION \
    && python -m pip install celery[redis]==4.1.0 \
    && python -m pip install boto3 \
    && python -m pip install pymongo \
    && python -m pip install statsd \
    && apt-get remove --purge -yqq $buildDeps \
    && apt-get clean \
    && rm -rf \
        /var/lib/apt/lists/* \
        /tmp/* \
        /var/tmp/* \
        /usr/share/man \
        /usr/share/doc \
        /usr/share/doc-base \
    && apt-get autoremove -yqq

The important part is at the end.

ARG CACHEBUST=1

COPY config/matplotlibrc /etc/matplotlibrc
COPY script/entrypoint.sh /entrypoint.sh
COPY script/shell.sh /shell.sh
COPY config/airflow.cfg ${AIRFLOW_HOME}/airflow.cfg

RUN chown -R airflow: ${AIRFLOW_HOME}

RUN pip install matplotlib seaborn xlsxwriter pandas Jinja2
#Add custom PIP repo - THIS IS OF INTEREST
COPY config/pip.conf /etc/pip.conf
RUN python -m pip install my-package

COPY my-package2 /usr/local/my-package2
# RUN pip uninstall my-package2
RUN python -m pip install /usr/local/my-package2

EXPOSE 8080 5555 8793

USER airflow
WORKDIR ${AIRFLOW_HOME}
ENTRYPOINT ["/entrypoint.sh"]

As you can see, I copy my-package2 from my local machine to the image and run pip install.

  1. The image size gets bigger every time I rebuild the image.
  2. Volumes are definitely an option, but I haven't tried them yet. I already make use of script/shell.sh, which contains just $@. I set that as the entry point and can then run any command I wish inside the image without much hassle.
  3. I use docker-compose, so every time I rebuild with a new tag I need to update docker-compose as well. Over time it gets annoying to do this for a single-line change in the code.

obligated to keep his content

Posted 2017-12-07T04:57:13.143

Reputation: 73

Why would that not be efficient? In addition, as for rebuilding the image, if you make it an additional layer it should improve times, as you would have the intermediate layers in place (assuming you don't delete them every time). At least that's my understanding of Docker. – Seth – 2017-12-07T06:17:02.017

Not very efficient because

  1. The image size keeps increasing with every try (not sure why; technically it shouldn't) and my Mac is running out of memory.
  2. I need to retag, rebuild and delete the old tag with every try, which is quite cumbersome.
  3. < – obligated to keep his content – 2017-12-07T06:41:16.143

I think the idea is sound (plus it seems like it's been decided for you that Docker has to be used). I think it would be better to focus on the issue of repeatedly having to copy your code. Also, changing a single layer of a Docker image shouldn't take half an hour. Maybe you can change the order in which you do things, or get a faster computer. To give advice on that, we would definitely need many more details. – mtak – 2017-12-22T07:28:38.533

Could you share enough of your Dockerfile for us to understand why it takes so long to install the pip package? But first a question: why don't you, instead of building an image for testing, just use the package from the host via the parameter -v /path/on/host:/path/in/container? – harrymc – 2017-12-22T12:38:36.877

@harrymc Updated with the Dockerfile and a few comments on my trials. – obligated to keep his content – 2017-12-26T05:32:42.080

@mtak Updated with the dockerfile – obligated to keep his content – 2017-12-26T05:33:16.990

Almost all of the Dockerfile is installing dependencies. Why don't you create an image that contains all of them, and just repeat the last few steps each time you test? – harrymc – 2017-12-26T06:41:33.290

Answers

You will need to share some of your Dockerfile so we can understand why it takes so long to install the pip package. If you wish to optimize it, these references might help:

An alternative solution is, instead of building an image for testing, to use the package directly from the host via the Docker parameter -v /host/directory:/container/directory.

This will let you immediately test your package in the context of the container, so you will only create the production image when the testing is complete.
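As a rough sketch of how that could look for my-package2 (the image name my-airflow-image, the tag airflow-dev and the file name Dockerfile.dev are hypothetical, and it assumes the package can be imported directly from the root of its source directory), a small development image can put the mount point on PYTHONPATH so that the mounted code shadows the copy installed in the image:

```dockerfile
# Dockerfile.dev (hypothetical name): development-only image that picks up
# my-package2 from a bind mount instead of reinstalling it on every change.
# ASSUMPTION: "my-airflow-image" stands for the image built from the full Dockerfile.
# BUILD: docker build -f Dockerfile.dev -t airflow-dev .
# RUN:   docker run --rm -v /path/on/host/my-package2:/usr/local/my-package2 airflow-dev <command>
FROM my-airflow-image

# Entries on PYTHONPATH are searched before site-packages, so the mounted
# source takes precedence over the copy that was pip-installed at build time.
ENV PYTHONPATH=/usr/local/my-package2
```

With this, an edit on the host is visible in the next container run without any rebuild. If the package relies on installed metadata such as entry points, this shortcut may not be enough and a real install would still be needed.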

Much more information can be found elsewhere, for example in: Understanding Volumes in Docker.


From your posted Dockerfile, it seems that almost all of it is for installing dependencies. For testing, you can create an image in which all these dependencies are already installed, then repeat only the last step, installing your application, each time you test.
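A minimal sketch of that split, assuming a hypothetical image tagged airflow-deps that was built once from everything in the posted Dockerfile before the my-package2 steps (the tags and the file name Dockerfile.test are illustrative only):

```dockerfile
# Dockerfile.test (hypothetical name): rebuilds only the package layers.
# ASSUMPTION: "airflow-deps" was built once from the full Dockerfile up to,
# but not including, the COPY/install steps for my-package2.
# BUILD: docker build -f Dockerfile.test -t airflow-test .
FROM airflow-deps

# Only these two layers are rebuilt when the package source changes.
COPY my-package2 /usr/local/my-package2
RUN python -m pip install --no-cache-dir /usr/local/my-package2

# Same tail as the original Dockerfile. ARG values are not inherited from the
# base image, so the Airflow home directory is written out literally.
EXPOSE 8080 5555 8793
USER airflow
WORKDIR /usr/local/airflow
ENTRYPOINT ["/entrypoint.sh"]
```

Building with the same tag every time avoids having to edit docker-compose after each rebuild. The superseded image is left untagged and can be removed with docker image prune, which should also keep disk usage from growing with every try.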

For readability, you could eventually write the Dockerfile as a multi-stage build, separating the building of dependencies from production, and perhaps also generating only a minimal final production image. The ONBUILD instruction might be useful here.
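One possible way to use ONBUILD here is sketched below as two separate files shown in one listing. The file name Dockerfile.base and the tag airflow-onbuild are hypothetical, and the dependency steps are elided rather than repeated:

```dockerfile
# --- File 1: Dockerfile.base (hypothetical), built once and tagged "airflow-onbuild" ---
FROM ubuntu:17.10
# ... all dependency-installation steps from the posted Dockerfile go here ...

# ONBUILD instructions are stored in the image and run only when another
# Dockerfile names this image in its FROM line.
ONBUILD COPY my-package2 /usr/local/my-package2
ONBUILD RUN python -m pip install /usr/local/my-package2

# --- File 2: Dockerfile, kept next to the my-package2 directory ---
# The two ONBUILD triggers above run right after this FROM line.
FROM airflow-onbuild
USER airflow
```

The per-build Dockerfile then stays tiny, and each build of it re-copies and re-installs the current package source on top of the cached dependency image.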

Only you know what you are trying to achieve and what your constraints are. The above links can serve as a starting point, and there are many more articles to be found on the subject.

harrymc

Posted 2017-12-07T04:57:13.143

Reputation: 306 093

Thanks for the suggestion. I ended up with a combination of the two solutions you mentioned. In docker-compose, I set a volume so that the package code is shared between container and host. Once the package code had stabilized, I rebuilt the next version of the image using ONBUILD. Although the repeated COPY still increases the image size, it is not as frequent as before, so I can live with it. – obligated to keep his content – 2017-12-26T09:53:06.327