Order of files in the generated ZIP file is indeterministic #183

Closed
agronholm opened this issue Apr 3, 2017 · 14 comments

@agronholm
Contributor

Originally reported by: Matthias Bach (Bitbucket: MatthiasBach, GitHub: Unknown)


While the order of entries in the RECORD file is explicitly made deterministic, everything except the dist-info directory is written to the generated ZIP file in arbitrary order. The code relies on os.walk, which yields entries in an unspecified order. When attempting to create reproducible builds, this can cause builds from different working copies to produce wheels with different checksums. Sadly, this is pretty hard to test for: because the order of os.walk is unspecified, there is no definitive way to provoke the failure.
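One way to remove the dependence on filesystem order is to sort what os.walk yields before anything is written. The following is only a minimal sketch of that idea, not wheel's actual code; the helper name iter_files_sorted is hypothetical.

```python
import os


def iter_files_sorted(root):
    """Yield file paths under root, relative to root, in a stable order."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting dirnames in place makes os.walk descend in sorted order.
        dirnames.sort()
        for name in sorted(filenames):
            yield os.path.relpath(os.path.join(dirpath, name), root)
```

Because the sort happens on names rather than on directory-entry order, two working copies with the same contents should yield the same sequence.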


@FRidh

FRidh commented Nov 23, 2017

This issue does not seem to be solved entirely:

$ diffoscope /nix/store/bcr7zspwddzr93ab31a7zw4qv6bj3z45-python3.6-py-1.4.34 /nix/store/bcr7zspwddzr93ab31a7zw4qv6bj3z45-python3.6-py-1.4.34.check
--- /nix/store/bcr7zspwddzr93ab31a7zw4qv6bj3z45-python3.6-py-1.4.34
+++ /nix/store/bcr7zspwddzr93ab31a7zw4qv6bj3z45-python3.6-py-1.4.34.check
├── lib
│ ├── python3.6
│ │ ├── site-packages
│ │ │ ├── py-1.4.34.dist-info
│ │ │ │ ├── RECORD
│ │ │ │ │┄ ordering differences only
│ │ │ │ │ @@ -35,40 +35,40 @@
│ │ │ │ │  py-1.4.34.dist-info/LICENSE.txt,sha256=lzT2iwmQMhJkdHxOoJR_hS0kgPQRX2RJzjv7_aF32OM,1080
│ │ │ │ │  py-1.4.34.dist-info/METADATA,sha256=hEWF428bV0ssV8MGbb9VHcTn-XAn2DAvnzf1PkHJb-Y,1732
│ │ │ │ │  py-1.4.34.dist-info/RECORD,,
│ │ │ │ │  py-1.4.34.dist-info/WHEEL,sha256=kdsN-5OJAZIiHN-iO4Rhl82KyS0bDWf4uBwMbkNafr8,110
│ │ │ │ │  py-1.4.34.dist-info/metadata.json,sha256=5kCkr9nTzxboZYvPq9Cjw2k8WuzOL2UjsIcYaGRXJe0,1010
│ │ │ │ │  py-1.4.34.dist-info/top_level.txt,sha256=rwh8_ukTaGscjyhGkBVcsGOMdc-Cfdz2QH7BKGENv-4,3
│ │ │ │ │  py-1.4.34.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4
│ │ │ │ │ -py/_path/__pycache__/common.cpython-36.pyc,,
│ │ │ │ │ -py/_path/__pycache__/__init__.cpython-36.pyc,,
│ │ │ │ │ -py/_path/__pycache__/local.cpython-36.pyc,,
│ │ │ │ │ -py/_path/__pycache__/cacheutil.cpython-36.pyc,,
│ │ │ │ │ -py/_path/__pycache__/svnurl.cpython-36.pyc,,
│ │ │ │ │ -py/_path/__pycache__/svnwc.cpython-36.pyc,,
│ │ │ │ │ +py/__pycache__/test.cpython-36.pyc,,
│ │ │ │ │ +py/__pycache__/_xmlgen.cpython-36.pyc,,
│ │ │ │ │  py/__pycache__/_std.cpython-36.pyc,,
│ │ │ │ │ -py/__pycache__/_builtin.cpython-36.pyc,,
│ │ │ │ │ -py/__pycache__/__init__.cpython-36.pyc,,
│ │ │ │ │  py/__pycache__/_iniconfig.cpython-36.pyc,,
│ │ │ │ │ -py/__pycache__/_xmlgen.cpython-36.pyc,,
│ │ │ │ │ -py/__pycache__/test.cpython-36.pyc,,
│ │ │ │ │ -py/__pycache__/_apipkg.cpython-36.pyc,,
│ │ │ │ │  py/__pycache__/_error.cpython-36.pyc,,
│ │ │ │ │ +py/__pycache__/_builtin.cpython-36.pyc,,
│ │ │ │ │ +py/__pycache__/_apipkg.cpython-36.pyc,,
│ │ │ │ │  py/__pycache__/__metainfo.cpython-36.pyc,,
│ │ │ │ │ -py/_io/__pycache__/saferepr.cpython-36.pyc,,
│ │ │ │ │ -py/_io/__pycache__/__init__.cpython-36.pyc,,
│ │ │ │ │ -py/_io/__pycache__/capture.cpython-36.pyc,,
│ │ │ │ │ -py/_io/__pycache__/terminalwriter.cpython-36.pyc,,
│ │ │ │ │ -py/_process/__pycache__/cmdexec.cpython-36.pyc,,
│ │ │ │ │ +py/__pycache__/__init__.cpython-36.pyc,,
│ │ │ │ │ +py/_process/__pycache__/killproc.cpython-36.pyc,,
│ │ │ │ │  py/_process/__pycache__/forkedfunc.cpython-36.pyc,,
│ │ │ │ │ +py/_process/__pycache__/cmdexec.cpython-36.pyc,,
│ │ │ │ │  py/_process/__pycache__/__init__.cpython-36.pyc,,
│ │ │ │ │ -py/_process/__pycache__/killproc.cpython-36.pyc,,
│ │ │ │ │ -py/_log/__pycache__/__init__.cpython-36.pyc,,
│ │ │ │ │ -py/_log/__pycache__/log.cpython-36.pyc,,
│ │ │ │ │ +py/_path/__pycache__/svnwc.cpython-36.pyc,,
│ │ │ │ │ +py/_path/__pycache__/svnurl.cpython-36.pyc,,
│ │ │ │ │ +py/_path/__pycache__/local.cpython-36.pyc,,
│ │ │ │ │ +py/_path/__pycache__/common.cpython-36.pyc,,
│ │ │ │ │ +py/_path/__pycache__/cacheutil.cpython-36.pyc,,
│ │ │ │ │ +py/_path/__pycache__/__init__.cpython-36.pyc,,
│ │ │ │ │  py/_log/__pycache__/warning.cpython-36.pyc,,
│ │ │ │ │ -py/_code/__pycache__/_assertionold.cpython-36.pyc,,
│ │ │ │ │ +py/_log/__pycache__/log.cpython-36.pyc,,
│ │ │ │ │ +py/_log/__pycache__/__init__.cpython-36.pyc,,
│ │ │ │ │ +py/_io/__pycache__/terminalwriter.cpython-36.pyc,,
│ │ │ │ │ +py/_io/__pycache__/saferepr.cpython-36.pyc,,
│ │ │ │ │ +py/_io/__pycache__/capture.cpython-36.pyc,,
│ │ │ │ │ +py/_io/__pycache__/__init__.cpython-36.pyc,,
│ │ │ │ │  py/_code/__pycache__/source.cpython-36.pyc,,
│ │ │ │ │ -py/_code/__pycache__/_py2traceback.cpython-36.pyc,,
│ │ │ │ │  py/_code/__pycache__/code.cpython-36.pyc,,
│ │ │ │ │ -py/_code/__pycache__/__init__.cpython-36.pyc,,
│ │ │ │ │ -py/_code/__pycache__/_assertionnew.cpython-36.pyc,,
│ │ │ │ │  py/_code/__pycache__/assertion.cpython-36.pyc,,
│ │ │ │ │ +py/_code/__pycache__/_py2traceback.cpython-36.pyc,,
│ │ │ │ │ +py/_code/__pycache__/_assertionold.cpython-36.pyc,,
│ │ │ │ │ +py/_code/__pycache__/_assertionnew.cpython-36.pyc,,
│ │ │ │ │ +py/_code/__pycache__/__init__.cpython-36.pyc,,

I've had more cases where the order is still not deterministic, e.g.

$ diffoscope /nix/store/vn45ajf0ikbwz8flghfi2h8r73b8b9bg-python2.7-setuptools_scm-1.15.6 /nix/store/vn45ajf0ikbwz8flghfi2h8r73b8b9bg-python2.7-setuptools_scm-1.15.6.check
--- /nix/store/vn45ajf0ikbwz8flghfi2h8r73b8b9bg-python2.7-setuptools_scm-1.15.6
+++ /nix/store/vn45ajf0ikbwz8flghfi2h8r73b8b9bg-python2.7-setuptools_scm-1.15.6.check
├── lib
│ ├── python2.7
│ │ ├── site-packages
│ │ │ ├── setuptools_scm-1.15.6.dist-info
│ │ │ │ ├── RECORD
│ │ │ │ │ @@ -17,12 +17,12 @@
│ │ │ │ │  setuptools_scm-1.15.6.dist-info/top_level.txt,sha256=kiu-91q3_rJLUoc2wl8_lC4cIlpgtgdD_4NaChF4hOA,15
│ │ │ │ │  setuptools_scm-1.15.6.dist-info/zip-safe,sha256=AbpHGcgLb-kRsJGnwFEktk7uzpZOCcBY74-YBdrKVGs,1
│ │ │ │ │  setuptools_scm-1.15.6.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4
│ │ │ │ │  setuptools_scm/hacks.pyc,,
│ │ │ │ │  setuptools_scm/git.pyc,,
│ │ │ │ │  setuptools_scm/integration.pyc,,
│ │ │ │ │  setuptools_scm/utils.pyc,,
│ │ │ │ │ -setuptools_scm/discover.pyc,,
│ │ │ │ │  setuptools_scm/__main__.pyc,,
│ │ │ │ │  setuptools_scm/version.pyc,,
│ │ │ │ │  setuptools_scm/hg.pyc,,
│ │ │ │ │ +setuptools_scm/discover.pyc,,
│ │ │ │ │  setuptools_scm/__init__.pyc,,

cc @agronholm @theMarix

@agronholm
Contributor Author

How did you reproduce the problem?

@agronholm
Contributor Author

I'd like to point out that the files marked here are .pyc files which are NOT packaged by wheel.

@FRidh

FRidh commented Nov 23, 2017

I'd like to point out that the files marked here are .pyc files which are NOT packaged by wheel.

Well, they do appear in the RECORD file as you can see.

This is with our Nix builds, with NixOS/nixpkgs@a30fa6d, which has wheel at 0.30.0, setuptools at 36.4.0 and pip at 9.0.1.

@FRidh

FRidh commented Nov 23, 2017

Also, according to PEP 427 wheels may contain bytecode:

Wheel, being an installation format that is intended to work across multiple versions of Python, does not generally include .pyc files.

All possible entries are hashed, including any generated files such as .pyc files
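For reference, the hashed RECORD lines in the diffoscope output above have the shape path, sha256 digest, size. A rough sketch of how such a line can be computed is below; record_row is a hypothetical helper, and it assumes paths without commas (real tools use CSV quoting).

```python
import base64
import hashlib


def record_row(path, data):
    """Build one RECORD line: path, sha256 digest, size in bytes."""
    digest = hashlib.sha256(data).digest()
    # RECORD uses URL-safe base64 with the trailing '=' padding removed.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "{},sha256={},{}".format(path, encoded, len(data))
```

Entries such as the .pyc lines above, which end in two commas, simply leave the hash and size fields empty.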

@agronholm
Contributor Author

Can you give me exact repro steps for this? No matter what I do, I cannot reproduce the problem locally.

@agronholm
Contributor Author

And by that I mean that even if the .pyc files exist, wheel does not pack them.

@FRidh

FRidh commented Nov 23, 2017

That is going to be interesting, due to our build infra. Long story short, we build our packages using python setup.py bdist_wheel and install them with pip install *.whl.

What is relevant is how we do the building of the wheels. The exact line for building the wheel is:

(${python.interpreter} nix_run_setup.py ${lib.optionalString (setupPyBuildFlags != []) ("build_ext " + (lib.concatStringsSep " " setupPyBuildFlags))} bdist_wheel)

where nix_run_setup.py is a shim that inserts setuptools. This is likely not needed anymore though. Note that there are no setupPyBuildFlags so that part evaluates to "".
Furthermore, SOURCE_DATE_EPOCH=1 is set.
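To illustrate what a byte-for-byte reproducible ZIP needs in addition to a fixed entry order, here is a small sketch that pins timestamps from SOURCE_DATE_EPOCH. It is an assumption about how one could do this with the standard zipfile module, not a description of what wheel 0.30.0 does; build_zip is a hypothetical helper.

```python
import io
import os
import time
import zipfile


def build_zip(files):
    """Return ZIP bytes for a {name: bytes} mapping, independent of input order."""
    # ZIP cannot store dates before 1980, so clamp small epochs such as 1.
    epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "315532800"))
    epoch = max(epoch, 315532800)  # 1980-01-01T00:00:00Z
    date_time = time.gmtime(epoch)[:6]
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in sorted(files):
            info = zipfile.ZipInfo(name, date_time=date_time)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o644 << 16  # fixed permissions
            zf.writestr(info, files[name])
    return buf.getvalue()
```

With the order, timestamps and permissions all fixed, two runs over the same contents on the same machine should produce identical archives, which is what a checksum comparison needs.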

@agronholm
Contributor Author

agronholm commented Nov 23, 2017

So are you saying that the actual generated wheel files (and not just the installation directories) contain .pyc files?

@FRidh

FRidh commented Nov 23, 2017

So, the wheels do not contain .pyc files, and neither does the RECORD file inside the wheel. However, when pip installs the wheel, it writes the RECORD to the py-1.4.34.dist-info folder, and then apparently updates it to include the bytecode files. It seems like I am in the wrong project then ;)

@FRidh

FRidh commented Nov 23, 2017

So it seems that wheel does indeed update the RECORD when installing a wheel:
https://github.com/pypa/wheel/blob/master/wheel/install.py#L318

@pfmoore
Member

pfmoore commented Nov 23, 2017

That's correct, and pip does this as well. The RECORD file in the installed metadata needs to contain actual paths, to support uninstall.

@FRidh

FRidh commented Nov 23, 2017

I'll open an issue on the pip tracker. Both pip and wheel seem to have their own implementation of how to install a wheel. Scanning wheel's implementation, it seems to do the ordering correctly (though I could have missed something), but looking at pip's implementation, it seems to use a set to record what is installed. Now, I suppose that should still be fine with PYTHONHASHSEED=0, but maybe it goes wrong somewhere else.
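As an illustration of the concern about sets: iteration order of a set of strings is not something to rely on, so sorting the rows just before writing sidesteps the question. This is a sketch under that assumption, not pip's or wheel's actual code; write_record is a hypothetical helper.

```python
import csv
import io


def write_record(rows):
    """Serialize RECORD rows (path, hash, size) in sorted order."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # sorted() fixes the order regardless of how the set iterates.
    for row in sorted(rows):
        writer.writerow(row)
    return out.getvalue()
```

Sorting at write time means the output does not depend on PYTHONHASHSEED or on the order in which files were installed.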

@FRidh

FRidh commented Nov 23, 2017

Oh, look at that. It should be fixed on Pip master: pypa/pip#4667
