Adding documentation regarding manylinux, CI, and PyPI wheels

Thomas Ferreira de Lima 2018-12-20 16:01:15 -05:00
parent 2975d57d22
commit 1bba72e45b
7 changed files with 249 additions and 207 deletions


@ -1,216 +1,22 @@
Author: Thomas Ferreira de Lima
email: thomas@tlima.me
This folder contains scripts to be run inside docker images. See instructions on how to test this yourself in ci-scripts/docker/development_notes.
## docker_build.sh
We need two environment variables to get going:
```bash
export DOCKER_IMAGE="quay.io/pypa/manylinux1_x86_64"
export PY_VERSION="cp37-cp37m"
```
The script must be run inside a container started from the $DOCKER_IMAGE image, with klayout's git repo mounted at /io. Inside the git clone folder, run:
```bash
docker run --rm -e DOCKER_IMAGE -e PY_VERSION -v `pwd`:/io $DOCKER_IMAGE $PRE_CMD "/io/ci-scripts/docker/docker_build.sh";
# $PRE_CMD is empty for now (useless currently).
```
This command will generate a wheel and place it in `wheelhouse/klayout-*manylinux1*.whl`. This is the wheel that needs to be uploaded to PyPI via twine. See ci-scripts/twine/README.md.


@ -0,0 +1,221 @@
Author: Thomas Ferreira de Lima
email: thomas@tlima.me
I wrote these notes as I was learning how to use docker and how to build python packages inside a docker image prepared by the pypa team. Building inside that image is required for wheels to carry the `manylinux1` tag, which means they should be compatible with most linux distributions around. Chapter 1 is about testing on my own computer (macOS Mojave) and Chapter 2 is about how to automate this build using travis-ci.org.
# Chapter 1. Testing on your own computer
## Step 1.
Make sure you have the quay.io/pypa/manylinux1_x86_64 image.
```bash
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
quay.io/pypa/manylinux1_x86_64 latest 1c8429c548f2 2 months ago 879MB
hello-world latest 4ab4c602aa5e 3 months ago 1.84kB
# My image was old:
$ docker pull quay.io/pypa/manylinux1_x86_64
Using default tag: latest
latest: Pulling from pypa/manylinux1_x86_64
7d0d9526f38a: Already exists
3324bfadf9cb: Pull complete
20f27c7e3062: Pull complete
5bc21fc5fe97: Pull complete
Digest: sha256:a13b2719fb21daebfe25c0173d80f8a85a2326dd994510d7879676e7a2193500
Status: Downloaded newer image for quay.io/pypa/manylinux1_x86_64:latest
```
## Step 2.
This step was inspired by https://dev.to/jibinliu/how-to-persist-data-in-docker-container-2m72
Create a volume for klayout. This is necessary because docker containers don't persist data.
```bash
$ docker volume create klayout-persist
$ docker volume inspect klayout-persist
[
{
"CreatedAt": "2018-12-18T15:01:48Z",
"Driver": "local",
"Labels": {},
"Mountpoint": "/var/lib/docker/volumes/klayout-persist/_data",
"Name": "klayout-persist",
"Options": {},
"Scope": "local"
}
]
```
## Step 3.
Build image `myimage` with:
```bash
$ docker build -t myimage:latest -f Dockerfile.x86_64 .
```
This creates an image called `myimage` (temporary). This image will not overwrite old ones. Tip: prune old, unused images with `docker image prune`.
Then I run a container from that image with an interactive shell and mount the volume klayout-persist at /persist:
```bash
$ docker run --name klayout --mount source=klayout-persist,target=/persist -it myimage
```
## Step 4.
In the shell, clone klayout (master branch) and build a wheel.
```bash
cd /persist
git clone https://github.com/lightwave-lab/klayout.git
mkdir -p wheelhouse
cd klayout
# make wheel with python 3.6 (for example)
/opt/python/cp36-cp36m/bin/python setup.py bdist_wheel -d /persist/wheelhouse/
cd /persist
auditwheel repair "wheelhouse/klayout-0.26.0.dev8-cp36-cp36m-linux_x86_64.whl" -w wheelhouse/
# Need to manually fix the wheel
#/opt/python/cp36-cp36m/bin/pip install klayout --no-index -f /wheelhouse
```
The wheel produced by auditwheel, klayout-0.26.0.dev8-cp36-cp36m-manylinux1_x86_64.whl, is defective in the following way: dbcore.so etc. have their RPATHs reset to `$ORIGIN/.libs`, so we need to move all the `lib_*` .so files, as well as `db_plugins`, into `.libs`. We also need to update the file paths in dist-info/RECORD. This is a bug in auditwheel: it should either have added `$ORIGIN/.libs` (where it places libz, libcurl, libexpat) as a new RPATH instead of replacing the existing ones, or moved the files to the right place.
Procedure to fix the wheel:
```bash
unzip wheelhouse/klayout-0.26.0.dev8-cp36-cp36m-manylinux1_x86_64.whl -d tempwheel
cd tempwheel/klayout
mv lib_* db_plugins .libs/
cd ../klayout-0.26.0.dev8.dist-info/
sed -i 's/^klayout\/lib_/klayout\/.libs\/lib_/g' RECORD
sed -i 's/^klayout\/db_plugins/klayout\/.libs\/db_plugins/g' RECORD
cd ../
rm -f ../wheelhouse/klayout-0.26.0.dev8-cp36-cp36m-manylinux1_x86_64.whl
zip -r ../wheelhouse/klayout-0.26.0.dev8-cp36-cp36m-manylinux1_x86_64.whl ./*
cd ..
rm -rf tempwheel
```
Now we can install and test:
```bash
/opt/python/cp36-cp36m/bin/pip install klayout --no-index -f /persist/wheelhouse
cd /persist/klayout
/opt/python/cp36-cp36m/bin/python -m unittest testdata/pymod/import_db.py testdata/pymod/import_rdb.py testdata/pymod/import_tl.py
# Tests passed!
```
I encoded this procedure in a script called fix_wheel.sh. Now you only need to run `./fix_wheel.sh wheelhouse/klayout-0.26.0.dev8-cp36-cp36m-manylinux1_x86_64.whl`, and it will overwrite the wheel.
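The RECORD rewrite from the procedure above can be tried on a throwaway file before touching a real wheel. This is only a sketch, not the actual `fix_wheel.sh`: the function name `rewrite_record_paths` and the sample RECORD entries are made up for illustration, and it writes to a temporary file instead of using `sed -i` so that it behaves the same with GNU and BSD sed.

```bash
# Hypothetical helper: prefix lib_* and db_plugins entries in a RECORD file with .libs/
rewrite_record_paths() {
    record="$1"
    tmp="$(mktemp)"
    sed -e 's|^klayout/lib_|klayout/.libs/lib_|' \
        -e 's|^klayout/db_plugins|klayout/.libs/db_plugins|' \
        "$record" > "$tmp" && mv "$tmp" "$record"
}

# Try it on a made-up RECORD file (real entries carry real hashes and sizes)
sample="$(mktemp)"
printf '%s\n' \
    'klayout/lib_pya.so,sha256=aaa,1' \
    'klayout/db_plugins/gds2.so,sha256=bbb,2' \
    'klayout/dbcore.so,sha256=ccc,3' > "$sample"
rewrite_record_paths "$sample"
cat "$sample"
```

Only the first two entries should gain the `.libs/` prefix; `dbcore.so` stays where it is, which matches the move performed by `mv lib_* db_plugins .libs/`.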
## Step 5. Iterate over all python versions.
For that, we need something like:
```bash
# Compile wheels
for PYBIN in /opt/python/*/bin; do
"${PYBIN}/python" setup.py bdist_wheel -d /persist/wheelhouse/
done
# Bundle external shared libraries into the wheels via auditwheel
for whl in /persist/wheelhouse/*linux_*.whl; do
auditwheel repair "$whl" -w /persist/wheelhouse/
done
# Fix each wheel generated by auditwheel
for whl in /persist/wheelhouse/*manylinux1_*.whl; do
./ci-scripts/docker/fix_wheel.sh "$whl"
done
# Install packages and test
TEST_HOME=/persist/klayout/testdata
for PYBIN in /opt/python/*/bin/; do
"${PYBIN}/pip" install klayout --no-index -f /persist/wheelhouse
"${PYBIN}/python" $TEST_HOME/pymod/import_db.py
"${PYBIN}/python" $TEST_HOME/pymod/import_rdb.py
"${PYBIN}/python" $TEST_HOME/pymod/import_tl.py
done
```
I tested steps 1-5 with both quay.io/pypa/manylinux1_x86_64 and quay.io/pypa/manylinux1_i686. So far the only failure was with `cp27-cp27mu`, which gave this import error:
`ImportError: /opt/python/cp27-cp27mu/lib/python2.7/site-packages/klayout/.libs/lib_pya.so: undefined symbol: PyUnicodeUCS2_AsUTF8String`
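The symbol name in this error is a clue. In CPython 2.7 ABI tags, a trailing `u` marks a wide-Unicode (UCS4) build, while `PyUnicodeUCS2_*` symbols are only provided by narrow (UCS2) builds, so `lib_pya.so` appears to have been compiled against a narrow build such as `cp27-cp27m`. A small sketch of that mapping, with a hypothetical function name `unicode_symbol_prefix`:

```bash
# Hypothetical helper: which PyUnicodeUCS* prefix a CPython ABI tag expects
unicode_symbol_prefix() {
    case "$1" in
        cp27-cp27mu) echo "PyUnicodeUCS4" ;;  # wide build
        cp27-cp27m)  echo "PyUnicodeUCS2" ;;  # narrow build
        *)           echo "n/a" ;;            # Python 3.3+ no longer has this split
    esac
}

unicode_symbol_prefix cp27-cp27mu   # the interpreter that failed
unicode_symbol_prefix cp27-cp27m    # matches the symbol in the error
```

The first call prints `PyUnicodeUCS4` and the second `PyUnicodeUCS2`, so the missing symbol belongs to the other 2.7 flavor.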
I noticed that the ccache folder ended up at 800MB. I was hoping that gcc could reuse a lot of previously built objects across python versions, but that didn't happen. I think that's because each python version comes with its own headers. So going forward it doesn't make sense to create a docker image for every python version. I will just cache the ccache folder via travis.
The ccache folder after a single build has 657MB. Go figure.
I discovered that the fix_wheel script was not actually working properly. So instead I looked into fixing `auditwheel` directly. Here's the commit that fixes it: https://github.com/thomaslima/auditwheel/tree/87f5306ec02cc68020afaa9933543c898b1d47c1
So now the plan is to change the `docker_build.sh` script so that it uses the patched auditwheel instead of the image's default one.
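A rough sketch of what that change could look like, assuming pip's `git+https://...@<commit>` requirement syntax is used to install the fork. The function name `patched_auditwheel_spec` is hypothetical, and the install line is left commented out because it needs network access and depends on which pip owns auditwheel inside the image.

```bash
# Repository and commit taken from the link above
AUDITWHEEL_REPO="https://github.com/thomaslima/auditwheel"
AUDITWHEEL_COMMIT="87f5306ec02cc68020afaa9933543c898b1d47c1"

# Hypothetical helper: build a pip requirement that pins the fork to that commit
patched_auditwheel_spec() {
    echo "git+${AUDITWHEEL_REPO}@${AUDITWHEEL_COMMIT}"
}

patched_auditwheel_spec
# Inside the container one might then run (assumption, untested here):
# pip install --upgrade "$(patched_auditwheel_spec)"
```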
# Chapter 2. Testing CI flow with docker
## Step 1. Testing commands on my own computer
First, clone with:
```bash
git clone git@github.com:lightwave-lab/klayout.git -b tmp/manylinux
cd klayout
```
Let's work with a few environment variables, like in https://github.com/pypa/python-manylinux-demo/blob/master/.travis.yml
DOCKER_IMAGE options: quay.io/pypa/manylinux1_x86_64 quay.io/pypa/manylinux1_i686
PY_VERSION options: cp27-cp27m cp27-cp27mu cp34-cp34m cp35-cp35m cp36-cp36m cp37-cp37m
Total of 2x6 = 12 possibilities
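To make the 12 combinations concrete, here is a minimal sketch that enumerates them. The variable names `IMAGES` and `PY_VERSIONS` and the function `list_build_matrix` are made up for this example; the values are the options listed above.

```bash
# Enumerate every image / Python tag combination listed above
IMAGES="quay.io/pypa/manylinux1_x86_64 quay.io/pypa/manylinux1_i686"
PY_VERSIONS="cp27-cp27m cp27-cp27mu cp34-cp34m cp35-cp35m cp36-cp36m cp37-cp37m"

list_build_matrix() {
    for image in $IMAGES; do
        for py in $PY_VERSIONS; do
            echo "DOCKER_IMAGE=$image PY_VERSION=$py"
        done
    done
}

list_build_matrix
list_build_matrix | wc -l   # counts the combinations: 12
```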
```bash
export DOCKER_IMAGE=quay.io/pypa/manylinux1_x86_64
export PY_VERSION="cp36-cp36m"
docker pull $DOCKER_IMAGE
mkdir -p ccache
mkdir -p wheelhouse
docker run --name klayout -v `pwd`:/io -it $DOCKER_IMAGE
```
Inside docker shell:
```bash
yum install -y zlib-devel
yum install -y zip
yum install -y ccache
ln -s /usr/bin/ccache /usr/lib64/ccache/c++
ln -s /usr/bin/ccache /usr/lib64/ccache/cc
ln -s /usr/bin/ccache /usr/lib64/ccache/gcc
ln -s /usr/bin/ccache /usr/lib64/ccache/g++
echo $PATH
# /usr/lib64/ccache:/opt/rh/devtoolset-2/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
export CCACHE_DIR="/io/ccache"
# Compile wheel (run from the repository root, mounted at /io)
cd /io
/opt/python/$PY_VERSION/bin/python setup.py bdist_wheel -d /io/wheelhouse/
# Bundle external shared libraries into the wheels via auditwheel
for whl in /io/wheelhouse/*linux_*.whl; do
auditwheel repair "$whl" -w /io/wheelhouse/
done
# Fix each wheel generated by auditwheel
for whl in /io/wheelhouse/*manylinux1_*.whl; do
./ci-scripts/docker/fix_wheel.sh "$whl"
done
```
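The `ln -s` lines above rely on ccache's masquerading mode: a symlink named after a compiler, placed in a directory that comes early in `PATH`, gets picked up instead of the real compiler. The mechanism can be seen without ccache installed by using a stand-in script. Everything below (the temporary directory and the `wrapper` script) is made up for the demonstration.

```bash
demo_dir="$(mktemp -d)"

# Stand-in for /usr/bin/ccache: just report the name it was invoked under
cat > "$demo_dir/wrapper" <<'EOF'
#!/bin/sh
echo "wrapper invoked as $(basename "$0")"
EOF
chmod +x "$demo_dir/wrapper"

# Same pattern as the ln -s lines above
for name in cc c++ gcc g++; do
    ln -s "$demo_dir/wrapper" "$demo_dir/$name"
done

# With the directory first in PATH, "gcc" resolves to the symlink
( PATH="$demo_dir:$PATH"; gcc )
```

The last line prints `wrapper invoked as gcc`. This is why `/usr/lib64/ccache` has to appear before the devtoolset directory in the `PATH` shown above.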
## Step 2. Automating step 1 in travis (CI).
DOCKER_IMAGE options: quay.io/pypa/manylinux1_x86_64 quay.io/pypa/manylinux1_i686
PY_VERSION options: cp27-cp27m cp27-cp27mu cp34-cp34m cp35-cp35m cp36-cp36m cp37-cp37m
Build: spawn 12 travis jobs, one for each combination of word-size and python version.
Output: `./ccache` populated with compiled objects, and two wheels inside `./wheelhouse/`: one useless (`*linux_*.whl`) and one useful (`*manylinux1_*.whl`).
Post-build:
- cache `./ccache`
- deploy `./wheelhouse/*manylinux1_*.whl` to dropbox (./deploy folder)
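As a sketch of what each of those jobs boils down to, the helper below only prints the docker command for a given image and python tag, without running anything. The function name `job_command` is hypothetical, and passing the variables as `-e NAME=value` is a choice made for this example.

```bash
# Hypothetical dry-run helper: print the docker command one CI job would run
job_command() {
    image="$1"
    py_version="$2"
    echo "docker run --rm -e DOCKER_IMAGE=$image -e PY_VERSION=$py_version" \
         "-v $(pwd):/io $image /io/ci-scripts/docker/docker_build.sh"
}

job_command quay.io/pypa/manylinux1_i686 cp37-cp37m
```

Printing first makes it easy to compare the command against what travis logs for a job before wiring it into the CI configuration.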
## Step 3. Automating deployment to PyPI:
TBD


@ -0,0 +1,15 @@
After travis has built all the wheels, go to the folder where they were deployed. In my case, for example, `/Users/tlima/Dropbox/Apps/travis-deploy/Builds/klayout/dist-pymod/0.26.0.dev10`.
Then, run the command
```bash
twine upload *.whl
```
which will ask for the username and password of your PyPI account.
After this upload succeeds, you can upload the source tarball. Go to klayout's git folder and run `python setup.py sdist`. Inside `dist/`, you'll find a tarball named, e.g., `klayout-0.26.0.dev10.tar.gz`. From inside `dist/`, just run
```bash
twine upload klayout-0.26.0.dev10.tar.gz
```
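If a folder ever contains both kinds of wheel, it may help to select only the `manylinux1` ones before uploading, since PyPI does not accept wheels with a plain `linux_x86_64` platform tag. A minimal sketch, with a hypothetical function name `uploadable_wheels` and empty placeholder files standing in for real wheels:

```bash
# Hypothetical helper: list only the manylinux1 wheels in a directory
uploadable_wheels() {
    for whl in "$1"/*manylinux1_*.whl; do
        [ -e "$whl" ] && echo "$whl"
    done
    return 0
}

# Demonstration with empty placeholder files in a temporary directory
demo="$(mktemp -d)"
touch "$demo/klayout-0.26.0.dev10-cp36-cp36m-manylinux1_x86_64.whl" \
      "$demo/klayout-0.26.0.dev10-cp36-cp36m-linux_x86_64.whl"
uploadable_wheels "$demo"
# A real upload could then be: twine upload $(uploadable_wheels .)
```

Only the `manylinux1_x86_64` file is listed; the `linux_x86_64` one is skipped.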