Author: Thomas Ferreira de Lima
email: thomas@tlima.me

I wrote these notes as I was learning how to use docker and how to build python packages inside a docker image prepared by the pypa team. Wheels must be built there to get the `manylinux1` tag, meaning that they are compatible with most linux distributions. Chapter 1 is about testing on my own computer (macOS Mojave) and Chapter 2 is about how to automate this build using travis-ci.org.

# Chapter 1. Testing on your own computer

## Step 1.

Make sure you have the quay.io/pypa/manylinux1_x86_64 image.

```bash
$ docker images
REPOSITORY                       TAG       IMAGE ID       CREATED        SIZE
quay.io/pypa/manylinux1_x86_64   latest    1c8429c548f2   2 months ago   879MB
hello-world                      latest    4ab4c602aa5e   3 months ago   1.84kB
# My image was old:
$ docker pull quay.io/pypa/manylinux1_x86_64
Using default tag: latest
latest: Pulling from pypa/manylinux1_x86_64
7d0d9526f38a: Already exists
3324bfadf9cb: Pull complete
20f27c7e3062: Pull complete
5bc21fc5fe97: Pull complete
Digest: sha256:a13b2719fb21daebfe25c0173d80f8a85a2326dd994510d7879676e7a2193500
Status: Downloaded newer image for quay.io/pypa/manylinux1_x86_64:latest
```

## Step 2.

This step was inspired by https://dev.to/jibinliu/how-to-persist-data-in-docker-container-2m72

Create a volume for klayout. This is necessary because docker containers don't persist data.

```bash
$ docker volume create klayout-persist
$ docker volume inspect klayout-persist
[
    {
        "CreatedAt": "2018-12-18T15:01:48Z",
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/lib/docker/volumes/klayout-persist/_data",
        "Name": "klayout-persist",
        "Options": {},
        "Scope": "local"
    }
]
```

## Step 3.

Build image `myimage` with:

```bash
$ docker build -t myimage:latest -f Dockerfile.x86_64 .
```

This creates a (temporary) image called `myimage`. Rebuilding does not overwrite old images, so they accumulate. Tip: prune old, unused images with `docker image prune`.

Then I run a container from that image with an interactive shell and mount the volume `klayout-persist` at `/persist`:

```bash
$ docker run --name klayout --mount source=klayout-persist,target=/persist -it myimage
```

## Step 4.

In the shell, pull master from klayout.

```bash
cd /persist
git clone https://github.com/lightwave-lab/klayout.git
mkdir -p wheelhouse
cd klayout
# make wheel with python 3.6 (for example)
/opt/python/cp36-cp36m/bin/python setup.py bdist_wheel -d /persist/wheelhouse/
cd /persist
auditwheel repair "wheelhouse/klayout-0.26.0.dev8-cp36-cp36m-linux_x86_64.whl" -w wheelhouse/
# Need to manually fix the wheel
#/opt/python/cp36-cp36m/bin/pip install klayout --no-index -f /wheelhouse
```

The wheel produced by auditwheel, klayout-0.26.0.dev8-cp36-cp36m-manylinux1_x86_64.whl, is defective in the following way: dbcore.so etc. have their RPATHs reset to `$ORIGIN/.libs`, so we need to move all the `lib_*` .so's into `.libs`, as well as `db_plugins`. We also need to change the file paths in dist-info/RECORD. This is a bug in auditwheel. It should either have added a new RPATH, `$ORIGIN/.libs`, where it places libz, libcurl and libexpat, instead of replacing the existing ones, or moved the files to the right place.

Procedure to fix the wheel:

```bash
unzip wheelhouse/klayout-0.26.0.dev8-cp36-cp36m-manylinux1_x86_64.whl -d tempwheel
cd tempwheel/klayout
mv lib_* db_plugins .libs/
cd ../klayout-0.26.0.dev8.dist-info/
sed -i 's/^klayout\/lib_/klayout\/.libs\/lib_/g' RECORD
sed -i 's/^klayout\/db_plugins/klayout\/.libs\/db_plugins/g' RECORD
cd ../
rm -f ../wheelhouse/klayout-0.26.0.dev8-cp36-cp36m-manylinux1_x86_64.whl
zip -r ../wheelhouse/klayout-0.26.0.dev8-cp36-cp36m-manylinux1_x86_64.whl ./*
cd ..
rm -rf tempwheel
```

Now we can install and test:

```bash
/opt/python/cp36-cp36m/bin/pip install klayout --no-index -f /persist/wheelhouse
cd /persist/klayout
/opt/python/cp36-cp36m/bin/python -m unittest testdata/pymod/import_db.py testdata/pymod/import_rdb.py testdata/pymod/import_tl.py
# Tests passed!
```

I encoded this behavior in a script called `fix_wheel.sh`. Now you only need to run `./fix_wheel.sh wheelhouse/klayout-0.26.0.dev8-cp36-cp36m-manylinux1_x86_64.whl`, and it will overwrite the wheel.

## Step 5.

Iterate over all python versions. For that, we need something like:

```bash
# Compile wheels
for PYBIN in /opt/python/*/bin; do
    "${PYBIN}/python" setup.py bdist_wheel -d /persist/wheelhouse/
done

# Bundle external shared libraries into the wheels via auditwheel
for whl in /persist/wheelhouse/*linux_*.whl; do
    auditwheel repair "$whl" -w /persist/wheelhouse/
done

# Fix each wheel generated by auditwheel
for whl in /persist/wheelhouse/*manylinux1_*.whl; do
    ./ci-scripts/docker/fix_wheel.sh "$whl"
done

# Install packages and test
TEST_HOME=/persist/klayout/testdata
for PYBIN in /opt/python/*/bin/; do
    "${PYBIN}/pip" install klayout --no-index -f /persist/wheelhouse
    "${PYBIN}/python" $TEST_HOME/pymod/import_db.py
    "${PYBIN}/python" $TEST_HOME/pymod/import_rdb.py
    "${PYBIN}/python" $TEST_HOME/pymod/import_tl.py
done
```

I tested steps 1-5 with both quay.io/pypa/manylinux1_x86_64 and quay.io/pypa/manylinux1_i686. So far the only failure was with `cp27-cp27mu`, which gave this import error:

`ImportError: /opt/python/cp27-cp27mu/lib/python2.7/site-packages/klayout/.libs/lib_pya.so: undefined symbol: PyUnicodeUCS2_AsUTF8String`

I noticed that the ccache folder ended up at 800MB. I was hoping that the gcc compilation could reuse a lot of previously built objects, but that didn't happen. I think that's because each python comes with its own headers. So going forward it doesn't make sense to create a docker image for every python version.
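
To check whether ccache really reuses objects between python versions, one option is to look at the cache size and the hit/miss statistics before and after each build. The following is only a sketch and not part of the klayout scripts: the helper names `cache_size_mb` and `report_ccache` are made up for illustration, and the cache directory has to be passed in explicitly because these notes do not say where it lived in Chapter 1.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the klayout repository).

# Print the size of a directory in whole megabytes.
cache_size_mb() {
    du -sm "$1" | cut -f1
}

# Print the cache size and, if ccache is installed, its statistics.
report_ccache() {
    local dir="$1"
    echo "ccache dir ${dir}: $(cache_size_mb "$dir") MB"
    if command -v ccache >/dev/null 2>&1; then
        # -s shows the hit/miss counters for the given cache directory
        CCACHE_DIR="$dir" ccache -s
    fi
}
```

Calling, for example, `report_ccache /io/ccache` after building for one python version and again after the next one shows whether the hit counters went up or whether the cache simply grew.
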
I will just cache a ccache folder via travis. The ccache folder after a single build has 657MB. Go figure.

I discovered that the `fix_wheel.sh` script was not actually working properly. So instead I looked into fixing `auditwheel` directly. Here's the commit that fixes it: https://github.com/thomaslima/auditwheel/tree/87f5306ec02cc68020afaa9933543c898b1d47c1

So now the plan is to change the `docker_build.sh` script so that it uses the fixed auditwheel instead of the default one.

# Chapter 2. Testing CI flow with docker

## Step 1. Testing commands on my own computer

First I cloned the repository with:

```bash
git clone git@github.com:lightwave-lab/klayout.git -b tmp/manylinux
cd klayout
```

Let's work with a few environment variables, like in https://github.com/pypa/python-manylinux-demo/blob/master/.travis.yml

DOCKER_IMAGE options:
- quay.io/pypa/manylinux1_x86_64
- quay.io/pypa/manylinux1_i686

PY_VERSION options:
- cp27-cp27m
- cp27-cp27mu
- cp34-cp34m
- cp35-cp35m
- cp36-cp36m
- cp37-cp37m

Total of 2x6 = 12 possibilities.

```bash
export DOCKER_IMAGE=quay.io/pypa/manylinux1_x86_64
export PY_VERSION="cp36-cp36m"
docker pull $DOCKER_IMAGE
mkdir -p ccache
mkdir -p wheelhouse
# Mount the repository at /io (the path used inside the container)
# and pass PY_VERSION through to the container environment
docker run --name klayout -e PY_VERSION -v "$(pwd)":/io -it $DOCKER_IMAGE
```

Inside docker shell:

```bash
yum install -y zlib-devel
yum install -y zip
yum install -y ccache
ln -s /usr/bin/ccache /usr/lib64/ccache/c++
ln -s /usr/bin/ccache /usr/lib64/ccache/cc
ln -s /usr/bin/ccache /usr/lib64/ccache/gcc
ln -s /usr/bin/ccache /usr/lib64/ccache/g++
echo $PATH
# /usr/lib64/ccache:/opt/rh/devtoolset-2/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
export CCACHE_DIR="/io/ccache"

# The repository is mounted at /io; setup.py and ci-scripts live there
cd /io

# Compile wheel
/opt/python/$PY_VERSION/bin/python setup.py bdist_wheel -d /io/wheelhouse/

# Bundle external shared libraries into the wheels via auditwheel
for whl in /io/wheelhouse/*linux_*.whl; do
    auditwheel repair "$whl" -w /io/wheelhouse/
done

# Fix each wheel generated by auditwheel
for whl in /io/wheelhouse/*manylinux1_*.whl; do
    ./ci-scripts/docker/fix_wheel.sh "$whl"
done
```

## Step 2. Automating step 1 in travis (CI).

DOCKER_IMAGE options:
- quay.io/pypa/manylinux1_x86_64
- quay.io/pypa/manylinux1_i686

PY_VERSION options:
- cp27-cp27m
- cp27-cp27mu
- cp34-cp34m
- cp35-cp35m
- cp36-cp36m
- cp37-cp37m

Build: spawn 12 travis jobs, one for each combination of word size and python version.

Output: `./ccache` populated with compiled objects, and two wheels inside `./wheelhouse/`: one useless (`*linux_*.whl`) and one useful (`*manylinux1_*.whl`).

Post-build:
- cache `./ccache`
- deploy `./wheelhouse/*manylinux1_*.whl` to dropbox (./deploy folder)

## Step 3. Automating deployment to PyPI

TBD