February 12, 2020

Container + Python 2 and Virtualenv: Does It Actually Work?

My GitLab CI used to play very well with container runners. However, after an image registry migration, I rebuilt the runner image. It worked fine the first time. Then, after some commits, the jobs started to fail.

"No module named zipp" from Virtualenv

The first error came up when setting up virtualenv.

$ pip install virtualenv pip==9.0.3 --upgrade
Collecting virtualenv
....
Collecting pip==9.0.3
....
$ virtualenv venv
ERROR:root:ImportError: No module named zipp
ERROR: Job failed: exit code 1

I quickly created a local container from the same runtime image, and the problem was reproducible. A little debugging showed that if I installed pip first and then virtualenv, the problem did not happen. The old pip 8.1.2 bundled in the image might be the cause of this problem.

So the solution is simple: split the install command so that pip is upgraded first.

....
before_script:
  - pip install --upgrade pip==9.0.3
  - pip install virtualenv
....
Part of .gitlab-ci.yml

"Cannot call rmtree on a symbolic link" When Uninstalling Packages

Now the virtual environment could be created and activated. I thought that was the end of a small incident, but it was nothing like that.

....
Installing collected packages: six, configparser, singledispatch, enum34, lazy-object-proxy, wrapt, backports.functools-lru-cache, astroid, mccabe, futures, isort, pylint, pyparsing, packaging, pip, PyYAML, idna, MarkupSafe, jinja2, pbr, requests, multi-key-dict, python-jenkins, monotonic, fasteners, stevedore, jenkins-job-builder
  Attempting uninstall: pip
    Found existing installation: pip 20.0.2
    Uninstalling pip-20.0.2:
ERROR: Could not install packages due to an EnvironmentError: Cannot call rmtree on a symbolic link

ERROR: Job failed: exit code 1

Again, I created a local container to test with the same commands. But this time I could not reproduce it.

Why was that? What is that symbolic link? I created a virtual environment and listed the site packages inside it.

[builds]# virtualenv venv
[builds]# ls -l venv/lib/python2.7/site-packages/
total 0
lrwxrwxrwx. 1 root root 116 Feb 11 17:18 easy_install.py -> /root/.local/share/virtualenv/seed-v1/2.7/image/SymlinkPipInstall/setuptools-44.0.0-py2.py3-none-any/easy_install.py
lrwxrwxrwx. 1 root root 117 Feb 11 17:18 easy_install.pyc -> /root/.local/share/virtualenv/seed-v1/2.7/image/SymlinkPipInstall/setuptools-44.0.0-py2.py3-none-any/easy_install.pyc
lrwxrwxrwx. 1 root root  97 Feb 11 17:18 pip -> /root/.local/share/virtualenv/seed-v1/2.7/image/SymlinkPipInstall/pip-20.0.2-py2.py3-none-any/pip
lrwxrwxrwx. 1 root root 114 Feb 11 17:18 pip-20.0.2.dist-info -> /root/.local/share/virtualenv/seed-v1/2.7/image/SymlinkPipInstall/pip-20.0.2-py2.py3-none-any/pip-20.0.2.dist-info
lrwxrwxrwx. 1 root root 125 Feb 11 17:18 pip-20.0.2.dist-info.virtualenv -> /root/.local/share/virtualenv/seed-v1/2.7/image/SymlinkPipInstall/pip-20.0.2-py2.py3-none-any/pip-20.0.2.dist-info.virtualenv
lrwxrwxrwx. 1 root root 114 Feb 11 17:18 pkg_resources -> /root/.local/share/virtualenv/seed-v1/2.7/image/SymlinkPipInstall/setuptools-44.0.0-py2.py3-none-any/pkg_resources
lrwxrwxrwx. 1 root root 111 Feb 11 17:18 setuptools -> /root/.local/share/virtualenv/seed-v1/2.7/image/SymlinkPipInstall/setuptools-44.0.0-py2.py3-none-any/setuptools
lrwxrwxrwx. 1 root root 128 Feb 11 17:18 setuptools-44.0.0.dist-info -> /root/.local/share/virtualenv/seed-v1/2.7/image/SymlinkPipInstall/setuptools-44.0.0-py2.py3-none-any/setuptools-44.0.0.dist-info
lrwxrwxrwx. 1 root root 139 Feb 11 17:18 setuptools-44.0.0.dist-info.virtualenv -> /root/.local/share/virtualenv/seed-v1/2.7/image/SymlinkPipInstall/setuptools-44.0.0-py2.py3-none-any/setuptools-44.0.0.dist-info.virtualenv
lrwxrwxrwx. 1 root root 101 Feb 11 17:18 wheel -> /root/.local/share/virtualenv/seed-v1/2.7/image/SymlinkPipInstall/wheel-0.34.2-py2.py3-none-any/wheel
lrwxrwxrwx. 1 root root 118 Feb 11 17:18 wheel-0.34.2.dist-info -> /root/.local/share/virtualenv/seed-v1/2.7/image/SymlinkPipInstall/wheel-0.34.2-py2.py3-none-any/wheel-0.34.2.dist-info
lrwxrwxrwx. 1 root root 129 Feb 11 17:18 wheel-0.34.2.dist-info.virtualenv -> /root/.local/share/virtualenv/seed-v1/2.7/image/SymlinkPipInstall/wheel-0.34.2-py2.py3-none-any/wheel-0.34.2.dist-info.virtualenv

There were a bunch of symlinks! In fact, virtualenv creates symbolic links wherever possible to save time when creating a new environment. So if there were a way to make virtualenv copy files instead, this problem would be solved.
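The wording of the error matches what Python's own shutil.rmtree raises when it is handed a symlink instead of a real directory. The following is a minimal sketch of that behavior. It assumes that pip ends up calling shutil.rmtree on one of these links during uninstall, which is a guess and not something traced through pip's code. The directory names are made up for illustration.

```python
import os
import shutil
import tempfile


def rmtree_error_for_symlink():
    """Call shutil.rmtree on a symlink to a directory and return the error text."""
    base = tempfile.mkdtemp()
    # Hypothetical names: "seed_pip" stands in for the shared seed directory,
    # "pip" for the symlink that virtualenv placed in site-packages.
    real_dir = os.path.join(base, "seed_pip")
    link = os.path.join(base, "pip")
    os.mkdir(real_dir)
    os.symlink(real_dir, link)
    try:
        shutil.rmtree(link)  # rmtree refuses to operate on a symlink
    except OSError as exc:
        return str(exc)
    finally:
        shutil.rmtree(base)  # clean up the real temporary directory
    return None


print(rmtree_error_for_symlink())
```

On a POSIX system this should print the same sentence that appears in the job log, which suggests the failure is about the link itself and not about permissions.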

Does Virtualenv Actually Work?

Is there a way to tell virtualenv to copy files? Yes!

[builds]# virtualenv --help
usage: virtualenv [--version] [-v | -q] [--discovery {builtin}] [-p py] [--creator {builtin,cpython2-posix}] [--seeder {app-data,pip}] [--no-seed] [--activators comma_separated_list] [--clear] [--system-site-packages]
                  [--symlinks | --copies] [--download | --no-download] [--extra-search-dir d [d ...]] [--pip version] [--setuptools version] [--wheel version] [--no-pip] [--no-setuptools] [--no-wheel] [--clear-app-data] [--prompt prompt]
                  [-h]
....
  --symlinks                       try to use symlinks rather than copies, when symlinks are not the default for the platform (default: True)
  --copies, --always-copy          try to use copies rather than symlinks, even when symlinks are the default for the platform (default: False)
....
The help menu of virtualenv

From the help text, specifying --copies should do the trick. But no, virtualenv still created symbolic links even with that parameter! Why was that happening? I searched for related issues, but the only one I found had already been fixed.

virtualenv --always-copy fails on CentOS (lib64 problem?) · Issue #1332 · pypa/virtualenv
Creating a virtualenv with --always-copy on a platform that uses /lib64 directory seems to be impossible. On CentOS 7.6, the same happens both with the system packages (python 2.7.5, virtualenv 15....
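
A quick way to check whether a given set of virtualenv options really avoids symlinks is to scan site-packages after creating the environment. This is a small sketch using only the standard library. The function name is illustrative and not part of virtualenv, and the example path is the one from the listing above.

```python
import os


def find_symlinks(directory):
    """Return (name, target) pairs for every symlink directly inside directory."""
    links = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.islink(path):
            links.append((name, os.readlink(path)))
    return links


# Example usage inside the container:
#   for name, target in find_symlinks("venv/lib/python2.7/site-packages"):
#       print("%s -> %s" % (name, target))
```

An empty result means that nothing at the top level of site-packages is a symlink, so an uninstall should not trip over one there.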

Ways to Avoid Removing Files

I soon came up with a new idea: if the environment is created with the pip and setuptools versions I need, I will not have to reinstall either package. Therefore, there would be no need to remove files.

Luckily, virtualenv has a --pip and a --setuptools parameter. But when I typed virtualenv --pip 9.0.3 and pressed Enter, it actually complained with KeyError: u'pip'! How am I expected to use it? From its documentation:

Named Parameter: --pip

Default Value: latest

pip version to install, bundle for bundled

So latest apparently would not work for me. But what is bundle? I tried virtualenv --pip bundle and it returned the same error. Then I found that there is another parameter, --seeder.

Named Parameter: --seeder

Default Value: app-data

seed packages install method; choice of: app-data, pip

I realized that the bundled pip versions might all be new ones, and that if I seeded with pip instead, I might be able to use an older one.

[builds]# virtualenv --seeder pip --pip 9.0.3
DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
ERROR: Could not find a version that satisfies the requirement pip==9.0.3 (from versions: 19.1.1, 20.0.2)
ERROR: No matching distribution found for pip==9.0.3
RuntimeError: failed seed with code 1

It wasn't working as expected. But it led to a question: why do I have to choose between two given versions even when I choose to seed with pip? That does not make any sense!

Seed Everything From Manual Commands

Fortunately, I had one last option: --no-pip and --no-setuptools. I created an environment by running virtualenv --no-pip --no-setuptools. I found that easy_install was still available, so I used it to set up the rest.

(venv) [builds]# easy_install pip==9.0.3 setuptools
WARNING: The easy_install command is deprecated and will be removed in a future version.
Searching for pip==9.0.3
Reading https://pypi.org/simple/pip/
Downloading https://files.pythonhosted.org/packages/ac/95/a05b56bb975efa78d3557efa36acaf9cf5d2fd0ee0062060493687432e03/pip-9.0.3-py2.py3-none-any.whl#sha256=c3ede34530e0e0b2381e7363aded78e0c33291654937e7373032fda04e8803e5
Best match: pip 9.0.3
Processing pip-9.0.3-py2.py3-none-any.whl
Installing pip-9.0.3-py2.py3-none-any.whl to /builds/venv/lib/python2.7/site-packages
Adding pip 9.0.3 to easy-install.pth file
Installing pip script to /builds/venv/bin
Installing pip3.6 script to /builds/venv/bin
Installing pip3 script to /builds/venv/bin

Installed /builds/venv/lib/python2.7/site-packages/pip-9.0.3-py2.7.egg
Processing dependencies for pip==9.0.3
Finished processing dependencies for pip==9.0.3
Searching for setuptools
Best match: setuptools 44.0.0
Adding setuptools 44.0.0 to easy-install.pth file
Installing easy_install script to /builds/venv/bin
Installing easy_install-3.8 script to /builds/venv/bin

Using /builds/venv/lib/python2.7/site-packages
Processing dependencies for setuptools
Finished processing dependencies for setuptools

The easy_install command is deprecated, yes. But this is the final solution.

....
before_script:
  - pip install --upgrade pip==9.0.3
  - pip install virtualenv
  - virtualenv venv --no-pip --no-setuptools
  - source venv/bin/activate
  - easy_install pip==9.0.3 setuptools
....
Part of .gitlab-ci.yml

So Why on Earth Did the Error Happen?

The symptom of this removal failure actually reminds me of an old issue.

passing -n ftype=1 to mkfs.xfs for overlay2 docker storage driver support · Issue #194 · projectatomic/container-storage-setup
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/7.2_Release_Notes/technology-preview-file_systems.html says that xfs file systems must be created with the -n ftype=1 o...

When I built my first OpenShift cluster on bare-metal machines running CentOS on XFS, I forgot to create the filesystems with ftype=1. As a result, image builds would fail whenever a line contained a command that deleted files. I strongly suspect that the GitLab runners are set up on an XFS filesystem with ftype=0. But since the runners are hosted, I cannot know the details.
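
On a host you control, the ftype setting can be read from the output of xfs_info, which prints it as ftype=0 or ftype=1 on its naming line. Here is a minimal sketch, assuming xfsprogs is installed and the given path is on an XFS filesystem. The helper names are hypothetical, and none of this helps with hosted runners where the host is out of reach.

```python
import re
import subprocess


def parse_ftype(xfs_info_output):
    """Return the ftype value (0 or 1) found in xfs_info output, or None."""
    match = re.search(r"\bftype=(\d)", xfs_info_output)
    return int(match.group(1)) if match else None


def xfs_ftype(mount_point):
    """Run xfs_info on mount_point and return its ftype value, or None.

    Assumes the xfs_info command is available and mount_point is on XFS;
    subprocess raises an error otherwise.
    """
    output = subprocess.check_output(["xfs_info", mount_point])
    return parse_ftype(output.decode("utf-8", "replace"))
```

A result of 0 from xfs_ftype would point to the situation described in the linked issue.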

So why did this happen? After a night of hard work, I have a solution now. But I still don't know the answer. It remains a mystery. All I can tell is that virtualenv is a useful tool but not a reliable one.


Encore: "--only-binary" is not a New "--use-wheel"

There is some very old code in this project, including things like pip install --use-wheel .... And it is causing errors.

....
no such option: --use-wheel
ERROR: Job failed: exit code 1

I learned that --use-wheel has been deprecated since pip 7:

--no-use-wheel and --use-wheel are deprecated in favour of new options --no-binary and --only-binary. The equivalent of --no-use-wheel is --no-binary=:all:. (#2699)

And I found something like this to replace them all:

git grep -l -- --use-wheel | while read f; do sed -i -e 's|use-wheel|only-binary=:all:|g' ${f}; done

But it still failed after applying the trick.

....
    pip.main(['install', '--only-binary', '--retries', '3'] + pkg_name + extra_cmd)
  File "/builds/libvirt-auto/libvirt-ci/venv/lib/python2.7/site-packages/pip/__init__.py", line 18, in main
    return _wrapper(args)
....
  File "/builds/libvirt-auto/libvirt-ci/venv/lib/python2.7/site-packages/pip/_internal/models/format_control.py", line 48, in handle_mutual_excludes
    "--no-binary / --only-binary option requires 1 argument."
pip._internal.exceptions.CommandError: --no-binary / --only-binary option requires 1 argument.
ERROR: Job failed: exit code 1

No, it is not a simple replacement of --use-wheel with --only-binary. So how does it work? I know that --use-wheel tells pip to use binary wheels where possible instead of building from source. And I found that this has become the default behavior, meaning that I do not have to specify the parameter at all. Therefore, removing every --use-wheel solves the problem.
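
Since the project builds pip argument lists in Python (as the traceback above shows), the cleanup can also be expressed as a small filter. This is only a sketch. The function name is made up, and the mapping follows the changelog entry quoted above: drop --use-wheel, and turn --no-use-wheel into --no-binary=:all:.

```python
def modernize_wheel_flags(args):
    """Return a copy of a pip argument list without the removed wheel flags."""
    result = []
    for arg in args:
        if arg == "--use-wheel":
            # Wheels are already preferred by default, so the flag is dropped.
            continue
        if arg == "--no-use-wheel":
            # Equivalent given in the pip changelog entry.
            result.append("--no-binary=:all:")
            continue
        result.append(arg)
    return result


print(modernize_wheel_flags(["install", "--use-wheel", "--retries", "3", "six"]))
# ['install', '--retries', '3', 'six']
```

By contrast, --only-binary always needs a value such as :all: or a package name, which is why passing it bare ended in "requires 1 argument".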