-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Traverse refactor, part 1 #288
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good step towards the traverse logic 😍
Thank you for writing informative commit messages, they are an easy read forming a story :) Also, the graph makes it easy to understand the main idea behind the PR!
Remarks
The thing I would like you to look at closer is the data flow f the CodeSource
and DepsSource
.
CodeSource
The problem is that the validation step for CodeSource
is done twice in is_valid_code_source
and then again in parse_source
. It would be more readable and less error-prone to pass an already validated source to parse_source
and make it on the type level. In that case, we could use pydantic-type validation in the CodeSource
object, which has either a path to a particular file or stdin. In pydantic it would look like this:
class CodeSource(BaseModel):
"""A Python code source to be parsed for imports statements."""
path: PathOrSpecial
base_dir: Optional[Path] = None
@validator("base_dir")
def extension_must_be_in_parsable_set(cls, v)
if v is not None and v.suffix not in {".py", ".ipynb"+
raise UnparsablePathException(ctx="Supported formats are .py and .ipynb; Cannot parse code", path=base_dir)
return v
and the corresponding section in is_valid_code_source
will change to:
if path.is_file():
return CodeSource(path, base_dir)
That way we will not face redoing the validation in parse_source
.
DepsSource
As for DepsSource
we could also move the whole validation section for a single file to the DepsSource
object.
Other remarks
Also from what I understand, both DepsSource
and CodeSource
path parts are actually files, never directories. That could also be either checked at the object creation validation stage or informed by a comment.
3b87def
to
3346cdf
Compare
70278ef
to
2e620ed
Compare
@mknorps writes:
I refactored
Yes, I attempted a similar/symmetrical When re-reviewing, it's probably better to review commit-by-commit, starting from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very good. My main concern was double validating sources, and you managed to work around it with __post_init__
🎉
Kudos for structuring the changes after PR comments, it was an easy read, not overwhelming, and it was easy to continue the review.
🚀 🚀 🚀
This will be used in traverse tests to verify that we don't traverse into dot directories.
This module will take over the traversal of the project files/directories specified in the Settings object, and generate sources that will then be processed other parts of FawltyDeps, primarily extract_imports and extract_declared_dependencies. We start by reimplementing the file/directory traversal that currently lives in the extract_imports and extract_declared_dependencies modules, except that we focus on traversing the project _once_ and generating CodeSource and DepsSource objects that will later be passed on to the extract_imports and extract_declared_dependencies modules, respectively. For now, the new code is only used by a test suite that exercises various traversal scenarios on the projects under tests/sample_projects.
The new parse_sources() and parse_source() functions will soon replace the older functions that operate directly on files/dirs instead of CodeSource objects (parse_any_args(), parse_any_arg(), parse_dir()). Rephrase test cases for the functions that are about to go away to use the new parse functions instead. (Remove some test cases already covered better by test_traverse_project.)
The new parse_sources() and parse_source() functions will soon replace the older functions that operate directly on files/dirs instead of DepsSource objects (extract_declared_dependencies() and extract_declared_dependencies_from_path()). Rephrase test cases for the functions that are about to go away to use the new parse functions instead. (Remove some test cases already covered better by test_traverse_project.)
The _ordering_ of log messages is not something we should be too strict about in tests, as it just makes it harder to refactor our main logic without having to deal with lots of "collateral diff damage" in our tests. Establish a common pattern across all our command-line tests for verifying expected log messages: We declare a list of expected log message lines, we then split the stderr into lines and use the assert_unordered_equivalence() helper to communicate that the _ordering_ of log messages is not important.
Use the new traverse_project module to find Source objects from the given settings, and pass those on to the new parsing interface in extract_imports and extract_declared_dependencies.
extract_declared_dependencies_from_path() and extract_declared_dependencies() have now been replaced by parse_source() and parser_sources(), respectively.
Use CodeSource.__post_init__() to verify invariants about the CodeSource object, instead of doing this in validate_code_parse() and then re-asserting the same in parse_source(). Some asserts remain in order to appease Mypy. Suggested-by: Maria Knorps <[email protected]>
In preparation for storing parser choice inside DepsSource.
The parser choice was passed into both validate_deps_source() (to help select which files become DepsSource objects), and into parse_source() (to select the parser to be used for each DepsSource object). This is redundant when we instead can record the parser_choice as part of the DepsSource object.
250d980
to
e5835ec
Compare
When rebasing this onto The end result was no longer recognizable as 99c2c96 ("test_cmdline: Remove unnecessary asserts for log messages"), so I ended up removing that commit from this PR and spinning it off into a separate PR (to be posted as soon as this is merged). |
This is the first part of several with the ultimate goal to:
--list-code-sources
and--list-deps-sources
, not yet tracked in any issue, I believe)This part of the bigger refactoring, however, does not do any of that.
There should in fact be ~zero observable difference before and after this PR (modulo some changed INFO log messages). This PR is all about providing a single traversal structure that will more easily enable the goals above to be done in future PRs.
In graphical terms, this PR tries to implement this restructuring:

Probably best viewed on a per-commit basis.
Commits:
sample_project/no_issues
: Add a dummy package inside the bundled venvtraverse_project
moduleextract_imports
: Add new parse functionsextract_declared_dependencies
: Add new parse functionstest_cmdline
: Unify testing of log messagesmain
: Use new traversal logic and new parserstest_cmdline
: Remove unnecessary asserts for log messagesextract_imports
: Remove obsolete API for traversing files/dirsextract_declared_dependencies
: Remove obsolete API