Add mark requested for installing packages #12948

Closed
1 task done
eirnym opened this issue Sep 1, 2024 · 17 comments
Labels
resolution: needs standard Should be agreed as a standard before implementation state: needs discussion This needs some more discussion type: feature request Request for a new feature

Comments

@eirnym

eirnym commented Sep 1, 2024

What's the problem this feature will solve?

On my development machine I have two kinds of virtual environments: per-project ones and ones for general development. I'd like to upgrade the packages I'm interested in and have their dependencies upgraded in cascade, as most system package managers do.

The virtual environments (VEs) for general development are usually shared between projects: one for general tools such as black, ruff and tox, another for general toying with JupyterLab and a few other packages. In these environments I install packages manually and have no requirements.txt or pyproject.toml files in any form, as there's no specific app or library associated with them.

I'd like to do regular upgrades using script automation for those environments without recreating them from scratch.

The problem is that when I list packages which are not required by anything (pip list --not-required), some of the tools and libraries I requested are also dependencies of other packages, so they don't show up. Good examples are black, ruff and plugins for pytest/jupyterlab/etc. I'd like to see only the packages I explicitly requested, including those that are often in the requirements of some project.

Describe the solution you'd like

When I use a system package manager such as FreeBSD ports, MacPorts, Homebrew, pacman/yay/apt and others, I can obtain the list of packages which I have explicitly requested, and thus the most important ones.

The primary difference between the statuses "requested" and "not required by anything" is that requested packages are important to the user, as they may be used elsewhere (e.g. I use go to build some projects, and it is also a build dependency for packages). Thus, packages that are not marked as requested and that nothing depends on are marked as leaves and could be removed at any time.
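To make the distinction concrete, here is a minimal sketch of how a tool might classify packages once a per-package "requested" flag exists. The `Package` record and the `removable_leaves` function are hypothetical names used only for illustration; nothing here is pip's API, and the sample environment is made up.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """Hypothetical record of one installed package."""
    name: str
    requested: bool = False  # True if the user asked for it by name
    requires: set[str] = field(default_factory=set)  # direct dependencies


def removable_leaves(installed: dict[str, Package]) -> set[str]:
    """Return packages that nobody requested and nothing depends on."""
    needed = {dep for pkg in installed.values() for dep in pkg.requires}
    return {
        name
        for name, pkg in installed.items()
        if not pkg.requested and name not in needed
    }


# Made-up environment: one requested tool, one dependency, one orphan.
env = {
    "jupyterlab": Package("jupyterlab", requested=True, requires={"tornado"}),
    "tornado": Package("tornado"),
    "webcolors": Package("webcolors"),
}
print(sorted(removable_leaves(env)))  # prints ['webcolors']
```

Without the flag, a requested package with no dependents would be indistinguishable from an orphan, which is the gap described above.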

The list of requested packages is also the list of packages a user would like to have after installing a new version of the system and/or package manager.

This requires an additional flag to be stored in the metadata, and maybe additional package management commands/flags to toggle this flag explicitly and to list the flagged packages.

In my opinion, packages that should be marked as requested are those:

  • Manually installed using pip install ..., including editable installations
  • Installed using a requirements file, as you have no project to track them otherwise.

This solution would help community members who use projects such as pip-autoremove.

Alternative Solutions

Naive approach

Execute pip install -U <package> for all packages you are interested in. For virtual environments shared between projects, it's easy to forget to upgrade a tool. Also, if you don't list all the tools you're interested in, it's easy to make the VE unusable by not upgrading packages for too long ("it has just worked for a year, why should I upgrade?").

Naive semi-automated approach

The naive semi-automated approach is pip install -U $(pip list --format=freeze | awk -F= '{print $1}'). This easily leads to an unmaintainable virtual environment due to dependency incompatibilities. Also, for environments with a lot of packages, the list may be too long for the command-line argument limit, so you have to split it, which means more potential incompatibilities.

Upgrade not requested

A better naive approach is pip install -U $(pip list --not-required --format=freeze | awk -F= '{print $1}'). For example, with JupyterLab this would list a JupyterLab plugin, but not JupyterLab itself.
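For readers who prefer not to rely on awk, the name extraction and the splitting into smaller command lines can be sketched in Python. This is only an illustration of the two naive approaches above: `package_names` and `batches` are made-up helper names, the sample text and its versions are invented, and splitting still carries the incompatibility risk already mentioned.

```python
from itertools import islice
from typing import Iterator


def package_names(freeze_output: str) -> list[str]:
    """Extract names from `name==version` lines, like awk -F= '{print $1}'."""
    names = []
    for line in freeze_output.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            names.append(line.split("=", 1)[0])
    return names


def batches(names: list[str], size: int) -> Iterator[list[str]]:
    """Yield successive chunks so no single command line grows too long."""
    it = iter(names)
    while chunk := list(islice(it, size)):
        yield chunk


# Invented sample of `pip list --format=freeze` output.
sample = "black==1.0.0\nruff==1.0.0\njupyterlab-git==1.0.0\n"
for chunk in batches(package_names(sample), 2):
    print("pip install -U " + " ".join(chunk))
```

The sketch only prints the commands it would run; each printed command is resolved independently by pip, so the batches do not see each other's constraints.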

Dummy project

An alternative solution would be to keep a dummy pyproject.toml project which has no sources, only dependencies, and do semi-manual upgrades.

This solution is less robust and less maintainable in the long run, as it requires high organizational standards from the person who manages the environment.

Additional context

Code of Conduct

@eirnym eirnym added S: needs triage Issues/PRs that need to be triaged type: feature request Request for a new feature labels Sep 1, 2024
@eirnym
Author

eirnym commented Dec 27, 2024

Any updates on this?

@pfmoore
Member

pfmoore commented Dec 27, 2024

As this requires new metadata, it would need a new packaging standard before pip would implement it.

However, in practice, this is something better handled by an environment/project/workflow manager like uv, hatch, PDM or poetry. Those tools are far better suited to this problem than pip is - pip by design works at a lower level, where information about what is “requested” (as opposed to being installed as a dependency) isn’t available.

@ichard26 ichard26 added state: needs discussion This needs some more discussion resolution: needs standard Should be agreed as a standard before implementation and removed S: needs triage Issues/PRs that need to be triaged labels Dec 27, 2024
@jsirois
Contributor

jsirois commented Dec 28, 2024

Isn't this the metadata needed here?: https://peps.python.org/pep-0376/#requested

That said, PyPA does this extremely confusing thing and uses new, non-PEP pages for a subset of packaging PEPs. In this case that page mentions neither REQUESTED nor its elision in history: https://packaging.python.org/en/latest/specifications/recording-installed-packages/#history

@jsirois
Contributor

jsirois commented Dec 28, 2024

Aha, I did not read closely enough. History eventually gets you here: https://peps.python.org/pep-0627/#the-requested-file-removed-from-spec

So, in this case, it seems, a standard was made, not followed, and then loosened as a result. That's.... unfortunate to say the least.

@pfmoore
Member

pfmoore commented Dec 28, 2024

It's not just that the standard was not followed, it's that it isn't practical to follow in the first place. Consider the following case:

  1. You install B.
  2. You install A, which has B as a dependency.
  3. You no longer need B as a standalone package. You can't uninstall it, because A depends on it. But you can't record the fact that you no longer want B to be "requested" any other way.

Workflow tools don't have this problem, because they maintain a list of "what is requested" independently of the packages themselves (for example, uv add and uv remove handle "requesting" and "unrequesting" packages - in fact uv does so by managing a "dummy" pyproject.toml in essentially the way that's described in the original post). But pip doesn't have any way of doing this [1] - and more importantly, there's no place where a standard can store that information.

Footnotes

  1. There's no relationship between the directory pip is being run from and the environment pip is targeting, so the "dummy project" approach won't work for us.

@jsirois
Contributor

jsirois commented Dec 28, 2024

But @pfmoore, IIUC the package managers referenced in the OP (apt, dnf, etc.) have the exact same problem space but do solve it; so they deem it practical to solve.

@pfmoore
Member

pfmoore commented Dec 28, 2024

Maybe, but whatever they do is presumably more sophisticated than the REQUESTED file approach, as that didn't work (see my example above). It's possible that their approach can be made to work for Python. I don't have any experience with those systems so I can't say. It would still need a PEP and a new standard, though.

@jsirois
Contributor

jsirois commented Dec 28, 2024

When I use a system package manager such as FreeBSD ports, MacPorts, Homebrew, pacman/yay/apt and others, I can obtain the list of packages which I have explicitly requested, and thus the most important ones.

That's from the OP and I think a fair snip that is indicative of what is being asked for. That is exactly solved by REQUESTED, it seems, i.e. the PEP existed, was standards track final, and was then partially yanked. The OP does not talk about the problem you indicate in step 3, which involves uninstalling. They just want a list of what things were explicitly requested for install. That ~X/Y problem aside, step 3 can clearly be solved by the uninstall of B not uninstalling B (since A depends on it), but only removing its REQUESTED file. That would probably go along with a UI decision on the part of the installer to warn, etc., but this very sort of modification of existing metadata was covered by the original PEP in the other direction [1]:

  1. You request A which also installs dependency B.
  2. Later you request B which is already installed, but not REQUESTED; so a REQUESTED file is added to its dist-info dir.

So not only was the PEP written and approved, but it addresses the style of thing needed in your example case - mutating the dist-info dir of an already installed distribution to change its REQUESTED status.
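The two transitions discussed in this exchange (adding the marker when an already installed dependency is later requested, and dropping only the marker when a still-needed package is "uninstalled") can be sketched with a toy in-memory model. `ToyEnvironment` is entirely hypothetical: it does not touch real dist-info directories and is not how pip or any other installer behaves today.

```python
class ToyEnvironment:
    """Hypothetical in-memory stand-in for an environment's dist-info state."""

    def __init__(self) -> None:
        self.requires: dict[str, set[str]] = {}  # package -> direct deps
        self.requested: set[str] = set()  # packages carrying a REQUESTED marker

    def install(self, name: str, deps: tuple[str, ...] = ()) -> None:
        """Install `name` by request; pull in deps without marking them."""
        for dep in deps:
            self.requires.setdefault(dep, set())
        self.requires.setdefault(name, set()).update(deps)
        # Also covers the PEP 376 case: already present as a dependency,
        # now requested by name, so the marker is added.
        self.requested.add(name)

    def uninstall(self, name: str) -> str:
        """Drop the marker; remove the package only if nothing needs it."""
        self.requested.discard(name)
        dependents = sorted(p for p, deps in self.requires.items() if name in deps)
        if dependents:
            return f"kept {name}: still required by {', '.join(dependents)}"
        self.requires.pop(name, None)
        return f"removed {name}"


env = ToyEnvironment()
env.install("B")               # step 1: B is requested
env.install("A", deps=("B",))  # step 2: A is requested and depends on B
print(env.uninstall("B"))      # prints: kept B: still required by A
print(sorted(env.requested))   # prints: ['A']
```

Whether an installer should behave this way, and how it should warn the user, is exactly the UI question raised above; the sketch only shows that the state change itself is small.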

Footnotes

  1. Snipped from https://peps.python.org/pep-0376/#requested

    If a distribution that was already installed on the system as a dependency is later installed by name, the distutils install command will create the REQUESTED file in the .dist-info directory of the existing installation.

@pfmoore
Member

pfmoore commented Dec 28, 2024

Feel free to make a proposal to standardise this. Pip won't implement it without a standard backing it. While the history of the REQUESTED file standard might be interesting, it isn't really relevant any more - what matters is what the current standards say.

@jsirois
Contributor

jsirois commented Dec 28, 2024

I have absolutely 0 interest in perpetuating the growing list of standards. I'm just here to raise a fairly obvious one that was not pointed out in the initial response to the OP. I learned along the way that REQUESTED had since been yanked. I think the takeaway for @eirnym is there was a well-suited standard and it was yanked; so any further progress involves re-visiting that standard and its history and its conclusions right or wrong.

@eirnym
Author

eirnym commented Dec 29, 2024

There are basically 2 ways of using venvs:

  1. Fully automated temporary venvs for production use, which will be fully removed afterwards and reinstalled from scratch
  2. Long-term venvs for installing various tooling, sometimes shared between projects.

The first kind is quite easy to manage via a pyproject.toml file and/or other files. The second kind is… currently not so easy to manage.

You can argue about whether I can use a system package manager instead, and the general answer is simply no. As an example, I'd suggest installing JupyterLab with various extensions using a system package manager and seeing how easy it is to manage (including installing the third-party libraries you'd need for your tasks) and how often packages are upgraded in system package manager repositories.

Note about distutils

The implementation was… the least performant I've seen, and I'm glad that THAT implementation was scrapped.

A better implementation is used by system package managers: they keep some kind of database/registry where this flag can be stored along with other package information and package contents.

This registry is dedicated to making package management simpler (e.g. better handling of conflicts).

If the metadata (currently stored in files) were moved into a database, package management would be much easier from both the user and tooling perspectives, as it would cause fewer internal conflicts than it does now. For example, from time to time I have to completely remove a venv because too many files are left after I remove all packages besides pip, setuptools and wheel.

@sbidoul
Member

sbidoul commented Dec 29, 2024

I'll note that pip does implement REQUESTED, and if I understand the OP correctly, there is a solution today: pip inspect | jq -r '.installed[] | select(.requested == true) | .metadata.name'.
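For environments where jq is not available, the same filter can be sketched in Python. This assumes the report layout the jq expression above relies on (an `installed` list whose items have a `requested` boolean and a `metadata.name`); the function names and the hand-written sample are illustrative only.

```python
import json
import subprocess
import sys


def requested_names(inspect_report: dict) -> list[str]:
    """Return names of distributions whose `requested` field is true."""
    return sorted(
        item["metadata"]["name"]
        for item in inspect_report.get("installed", [])
        if item.get("requested") is True
    )


def inspect_current_environment() -> dict:
    """Run `pip inspect` for the interpreter executing this script."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "inspect"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Hand-written sample shaped like the relevant part of a report.
sample = {
    "installed": [
        {"metadata": {"name": "jupyterlab"}, "requested": True},
        {"metadata": {"name": "tornado"}, "requested": False},
    ]
}
print(requested_names(sample))  # prints ['jupyterlab']
```

Against a real environment the call would be `requested_names(inspect_current_environment())`, run with the interpreter of the virtual environment in question.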

pip implemented it before it was removed from the PEP.

If one uses tools other than pip to manipulate the environment, results may vary of course, as not all tools implement it and some implement it incorrectly (for instance uv emits REQUESTED for non top-level packages).

As a side note, I for one don't quite understand why it was removed from the PEP.

you can't record the fact that you no longer want B to be "requested" any other way.

One can remove the REQUESTED metadata file to achieve that, right? So to me this is a UX issue but not in itself a problem with the standard.
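For reference, the marker file can also be inspected without pip. The following is a minimal sketch using the standard library's importlib.metadata; `has_requested_marker` and `requested_distributions` are made-up helper names, and the sketch only reads the marker, it does not remove it.

```python
from importlib.metadata import distributions


def has_requested_marker(dist) -> bool:
    """True if the distribution's metadata directory contains REQUESTED."""
    # read_text returns None when the file is missing. The marker itself is
    # normally empty, so compare against None rather than testing truthiness.
    return dist.read_text("REQUESTED") is not None


def requested_distributions(dists=None) -> list[str]:
    """Names of distributions that carry a REQUESTED marker.

    With no argument, scans the environment of the running interpreter.
    """
    names = set()
    for dist in distributions() if dists is None else dists:
        name = dist.metadata["Name"]
        if name and has_requested_marker(dist):
            names.add(name)
    return sorted(names)
```

Calling `requested_distributions()` inside a virtual environment should give roughly the same list as the `pip inspect` filter, subject to the caveat above that other tools may write the marker differently.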

@notatallshaw
Member

This issue is a duplicate of #7811; as best I can tell it was implemented and has not been removed, so this issue can be closed.

Discussion has been veering into being a duplicate of #9812 (and the many other issues that ask pip to be a package manager rather than just a package installer).

If you would like a pip-like interface that handles package management, there is pip-tools or uv pip. The workflow of both is to use a requirements.in file that represents your core dependencies and that generates a requirements.txt file which represents your environment, and then you use a sync command to get your environment to match it. You can emulate the same workflow directly with pip, but you will need to delete your virtual environment each time and recreate it, as pip does not directly have a sync command.
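What a sync step has to compute can be sketched as a diff between the pinned set and the installed set. This is a simplified illustration, not how pip-tools or uv implement it: `sync_plan` is a hypothetical function, names are assumed to be already normalized, and the sample versions are invented.

```python
def sync_plan(
    installed: dict[str, str], pinned: dict[str, str]
) -> tuple[list[str], list[str]]:
    """Compare installed versions with pinned ones.

    Returns (to_install, to_uninstall): pins that are missing or at a
    different version, and installed packages that have no pin at all.
    """
    to_install = sorted(
        f"{name}=={version}"
        for name, version in pinned.items()
        if installed.get(name) != version
    )
    to_uninstall = sorted(name for name in installed if name not in pinned)
    return to_install, to_uninstall


# Invented example: ruff is outdated and webcolors is no longer pinned.
installed = {"ruff": "1.0", "black": "1.0", "webcolors": "1.0"}
pinned = {"ruff": "1.1", "black": "1.0"}
print(sync_plan(installed, pinned))  # prints (['ruff==1.1'], ['webcolors'])
```

The second list is the part pip has no command for, which is why emulating the workflow with pip alone means recreating the environment.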

If you would like a more complete package/project manager there are several options, such as poetry, uv, pdm, and hatch. Further if you have external dependencies that are satisfied by conda you can use pixi.

I have absolutely 0 interest in perpetuating the growing list of standards.

That's fine, but I would say there is currently no appetite from the pip maintainers to implement bespoke processes, workflows, file formats, etc. We would rather push forward standards that anyone can implement without needing to reference pip's implementation. So further discussion would need to take place at https://discuss.python.org/c/packaging/14. Although I would imagine PEP 751 (standard lock file) would need to be agreed and implemented first as any stepping stone towards package management, discussion for that is ongoing now.

Also, before starting any discussion it would be useful to read up on why pip has never implemented this, both from the answers already given here and the many issues posted in the past, e.g. #12106, #9118, #5823, #2635. There may also be past discussion on the discussion forum.

@notatallshaw notatallshaw closed this as not planned Dec 29, 2024
@eirnym
Author

eirnym commented Dec 29, 2024

@sbidoul nice find, however I see a few packages which I didn't really request, such as webcolors. Could it be that leaves (orphans) become "requested" somehow?

I agree that different tools may store metadata in a different way.

@eirnym
Author

eirnym commented Dec 29, 2024

@notatallshaw FYI: pip is seen as a package manager by everybody, while distutils/setuptools/poetry/etc. are seen as package installers.

@notatallshaw
Member

notatallshaw commented Dec 29, 2024

FYI: pip is seen as a package manager by everybody, while distutils/setuptools/poetry/etc. are seen as package installers.

Unfortunately these terms are not standardized and mean different things to different people. I can assure you, not everyone uses them as you describe.

In this context I mean a package manager as a tool which manages the lifecycle of a package, which would be poetry etc. and not pip, setuptools, or distutils.

Pip "just" manages installing packages, not the lifecycle of a package. With just two install commands, even in a clean environment, pip can fail to produce a consistent environment, e.g. pip install foo followed by pip install -U bar can be inconsistent (the most common example being that the latest version of foo depends on an old version of bar). A single install in a clean environment, e.g. pip install foo bar, is guaranteed to be consistent. These problems are solved by package and project managers that track the lifecycle of user requested packages.
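The foo/bar scenario can be illustrated with a toy consistency check. Everything in this sketch is hypothetical: versions are reduced to major numbers, `REQUIRES` stands in for index metadata, and `conflicts` is a made-up helper that only mimics the kind of check a tool would run, not pip's resolver.

```python
# Hypothetical index metadata: foo 2 works only with bar major version 1.
REQUIRES = {("foo", 2): {"bar": range(1, 2)}}  # dep -> acceptable major versions


def conflicts(installed: dict[str, int]) -> list[str]:
    """List dependency requirements that the installed versions violate."""
    problems = []
    for name, version in installed.items():
        for dep, allowed in REQUIRES.get((name, version), {}).items():
            found = installed.get(dep)
            if found not in allowed:
                problems.append(
                    f"{name} {version} needs {dep} in {list(allowed)}, found {found}"
                )
    return problems


env = {"foo": 2, "bar": 1}  # state after the first install: consistent
assert not conflicts(env)
env["bar"] = 2              # state after upgrading bar on its own
print(conflicts(env))       # prints ['foo 2 needs bar in [1], found 2']
```

A tool that remembers that both foo and bar were requested can re-resolve them together on every change, which is the lifecycle tracking described above.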

@eirnym
Author

eirnym commented Dec 30, 2024

Nevertheless, I'd move the metadata to a SQLite db and install/remove packages based on this info. As I mentioned before, pip uninstall doesn't always uninstall things properly, even when I have a venv for a project and use pip install -e/pip-autoremove. The issue is intermittent and not easily reproducible, so I haven't reported it.

Additionally, pip doesn't handle conflicts and many other things.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 30, 2025
6 participants