Skip to content
This repository was archived by the owner on Sep 9, 2020. It is now read-only.

Command line args, --no-recurse, --shallow #1589

Closed
colek42 opened this issue Jan 27, 2018 · 4 comments
Closed

Command line args, --no-recurse, --shallow #1589

colek42 opened this issue Jan 27, 2018 · 4 comments

Comments

@colek42
Copy link

colek42 commented Jan 27, 2018

I propose:

dep should have command line args to clone without recursing and shallow.

--no-recurse would not clone submodules
--shallow would clone with a depth of 1.

Both of these options would save bytes across the wire, as well as improve build stability and speed for my team. If there is interest I will make the PR.

@sdboyer
Copy link
Member

sdboyer commented Jan 27, 2018

hi, welcome! Thanks for your interest, and the suggestion. But there needs to be considerably more detail here:

as well as improve build stability

How?

speed

In what specific stages of your pipeline would you anticipate using these flags?

Adding flags like this is unlikely, at least in the short term. Any changes in this area further complicate the state machine that we use to manage the on-disk repositories, and that's a complex, highly concurrent code path - there has to be a major upside to justify the risk. Allowing control of those things via flags tends to be much worse, though, because we typically have to either record sidecar state, or infer from disk, how the on-disk repositories were managed on previous runs, and therefore what state they're in, and therefore how we interact with them at different points in the state machine.

Before anything such flags can be considered, we have to exhaust the possibilities for improvement that we can utilize just from the state the system already knows. i'm pretty sure that, for example, we could detect if no local clone yet exists, and if so then fetch tarballs instead of cloning, if the host supports it (as GitHub does, via their HTTP API). That's something i'd accept contributions on - though we need #1250 in order to have the necessary info in the state machines.

@colek42
Copy link
Author

colek42 commented Jan 29, 2018

@sdboyer thanks for the quick response.

  1. Improve Build Stability.

By this, I mean the stability of our build pipelines. We have submodules that are related to our frontend in our repos. These submodules, with full recursion, can be fairly large. Dep will needlessly download these submodules that are not needed to build the go code. This takes a good deal of time and is dependent on network conditions.

  1. Improve build speed

Shallow clones and no clone recursion can save a huge amount of bytes.

In what specific stages of your pipeline would you anticipate using these flags?

These flags would be used in all stages of our build pipeline.


We changed the command context here:

func (r *gitRepo) get(ctx context.Context) error {
to change the default behavior to NOT recurse. It has solved the issue on our end.

After actually looking through the code paths to get to that execution point from the command line does seem complicated. I would tend to agree with you that there would need to be a significant reason for making this change.

Few questions:

  1. Why is recurse the default behavior?
  2. Why is a full clone, not shallow, the default behavior?

@cben
Copy link

cben commented May 27, 2019

#795 (comment) has some good explanations why dep didn't shallow clone by default (2 years old, no idea if still relevant).

@sdboyer please consider however the use case of throw-away checkouts in stateless CI.
Yes, enabling DEPCACHEDIR that survives across builds is a major speedup and generally a good idea, but:

  1. In environments like Travis where by default nothing survives across builds, it takes extra effort to set caching up.
  2. In environments like Travis where persistence is actually upload->cloud->download, caching is not always a win.
  3. In my team, we've seen caching cause problems(*), and chose to disable it. Given tradeoff between slowness vs even a rumor of stateful issues, I suspect a lot of teams would choose slow but stateless. It's now becoming too slow for us, so we'll try caching again... But if dep had a semi-fast stateless mode based on shallow cloning, we'd probably happily stick with it.

(*) Sorry don't have details. I'm not even sure it was dep's fault! (disabling caching was a step towards finding out but we didnt get further...)
I realize my last point sounds egocentric "I don't even have time to report bugs but I'm expecting you give me a workaround". Just wanted to mention a use case, I trust your judgement of project direction, thank you for all the work on dep in any case ❤️

@mvdan
Copy link
Member

mvdan commented Sep 4, 2020

Dep was officially deprecated earlier this year, and the proposal to archive this repository was accepted. As such, I'm closing outstanding issues before archiving the repository. For any further comments, please use the proposal thread on the Go issue tracker. Thanks!

@mvdan mvdan closed this as completed Sep 4, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants