Skip to content

createNodeBuilder, declaration emit, and associated utility port #791

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 52 commits into
base: main
Choose a base branch
from

Conversation

weswigham
Copy link
Member

@weswigham weswigham commented Apr 11, 2025

This branch to replaces the current typeToString implementation with the ported node-builder backed implementation, which is the top-down goal of this PR, ultimately. It also enables declaration emit and tests for it. CI on it should not pass until the node builder is mostly (if not totally) completely ported, due to interdependencies between many of its' subsystems. Baseline diffs are seperately in weswigham#1 because if they were merged into this PR, it would become impossible to view in the github web UI - so we'll only do that once reviewers are done with it.

Structurally, nodebuilder.go is the inner contents of createNodeBuilder's closure environment, lifted into members of a struct. All context parameters have been removed and replaced with lookups of context on the struct itself, since we're OO now, but that's about the only refactor. nodebuilderapi.go is the wrapper returned by createNodeBuilder which maps all the internal closed over methods to the internal node builder API shape (which is recorded as the NodeBuilderInterface... interface)- basically it's the logic that handles setting up context objects for each request. Some of that might get renamed to reduce confusion eventually, but the structure seems sound. nodebuilderscopes.go may or may not go away - NodeBuilder.enterNewScope was pretty big but isolated (used by mapped type and signature construction), so felt like it could stand alone, and it has some utilities only it uses. symbolaccessibility.go is for the checker's symbol accessibility functionality - these are also mostly self-contained, though do depend on one-another and some common utilities (though I only have stubs here right now - my previous attempts to optimize them as I ported them have broken them, so we're just gonna port them straight as we can for now).

This is already a bit of a beast to review, size-wise, and I'd say there's still a fair bit left for a full port - but if we add some extra unit tests, some subsystems, like the specifier generation and maybe accessibility, can reasonably stand alone as changes. Those things just aren't currently unit tested outside of their integrations into the builder in strada, though, so those tests'd all be additional greenfield work.

The remaining features to port (from the TODOs left in the code), which may or may not make this PR or followups depending on reviewer satisfaction, are:

  • Expando function declaration emit (is this fully implemented in checker now?)
  • Late bound index signature declaration emit (checker implementation here is currently partial, and missing the parts required for accurate declaration emit)
  • JS declaration emit support (will need to be substantively rewritten given different upfront parsing and checking of JSDoc structures - likely for the better)
  • isolatedDeclarations support and associated node reuse logic (this is a large amount of error checking code for very little practical payoff)
  • support for attaching type arguments to identifiers in the node builder for quickinfo
  • support attaching synthetic comments to nodes for better truncated output in some modes
  • extract most nodebuilder logic from the checker package into the nodebuilder package with an interface indirection over the checker (likely requires renaming a lot of things on checker to make them "public" and reworking how they're accessed)
  • clean up layering of emitContext/emitResolver/nodebuilderapi layers to automatically pass through more bits so things like EmitContexts don't need to be arguments to functions on the nodebuilderapi
  • bundled emit support? seems mostly cut for js output, but we're the only provider of upfront-bundled declaration output
  • support preserving input quote style in declaration emit (not currently supported on the AST itself)
  • Memoize printers used for diagnostics and node builder in the checker (is this even actually needed? spawning new printers on demand seems like a pretty cheap allocation)
  • Support for the OmitTrailingSemicolon and NeverAsciiEscape printer options (only affects some diagnostic output minorly - also warrants investigating if createSemicolonDeferringWriter actually needs to be ported or if it overlaps with the printer flag of the same)
  • project references support in module specifier generation (currently stubbed, since project references don't exist)
  • cached getOutputPathsFor on the emit host (or decide if this is cheap enough a cache isn't worth bothering with)
  • stripInternal support? Unsure if we planned to support this. It's not bad in the declaration emitter so long as jsdoc tag parsing is in place...
  • Flatten immediately nested module objects? We now parse module a.b {} into module a { export module b {} }, but AFAIK the AST doesn't even support representing the former anymore, which makes printing it back that way for declaration emit difficult!
  • resolutionMode is not currently varied with usage declarations - the helper for calculating it is missing and unused at module lookup sites. This is a more general issue across the compiler presently, but persists into this declaration emit logic.
  • Symlink cache support in module specifier generation to support generating specifiers for modules not directly imported in a project (the loader doesn't keep a symlink cache around anymore as far as I can tell, so that's required before it can be reused in specifier generation)
  • output path remapping in specifier generation for import maps (output path calculation helpers are currently in weird places and need some refactoring to be reused in the modulespecifiers package)
  • underlineing type baseliner to resume measuring node reuse, likely after isolated declarations logic is ported

Copy link
Member

@jakebailey jakebailey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll be interested to see if we can move any of this out of the checker package; it's getting pretty big in there.

(I'm not sure how much you want eyes on the contents of the PR yet but I'm happy to look if you do.)

This needs more pulled in from stashed old work to fill it out,
(part of the way through the declaration transform itself and resolver)
and it's useless without the nodebuilder itself done, but has the full
declaration emit pipeline hooked in and tests enabled - another thing
showing just how not complete it is! If the PR was already massive,
this won't help~

For real, this is all going to have to be un-integrated to be at all
mergable in the end, but all up it will be able to signal completion
with green CI in the end, so long as nobody blind baseline-accepts...
¯\_(ツ)_/¯
…ans type parameter smuggling and module specifier generation
@weswigham weswigham marked this pull request as ready for review April 30, 2025 22:06
@weswigham
Copy link
Member Author

I have weswigham#1 separately open with the baseline changes, because if I merge those into this PR, this change goes from "practically unreviewable" to "actually impossible to navigate in the github web UI", so it's probably best if reviews of this are done before any baseline updates get merged in.

@jakebailey
Copy link
Member

I don't want to spam the PR with this comment, but I think we're trying to put any exported function shims in exports.go rather than rename functions to be exported, just to make it super clear where "public" things are sitting.

weswigham added 4 commits May 19, 2025 11:25

Verified

This commit was signed with the committer’s verified signature.
weswigham Wesley Wigham

Verified

This commit was signed with the committer’s verified signature.
weswigham Wesley Wigham
return &SymbolTrackerImpl{context, inner, false, tchost}
}

func (this *SymbolTrackerImpl) GetModuleSpecifierGenerationHost() modulespecifiers.ModuleSpecifierGenerationHost {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this 🤨

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, it is analogous and it's not a keyword, so why not keep it similar where we actually did use objects/classes in Strada?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is the only class in the old codebase, but I still don't like it 😄

Comment on lines +9 to +11
type SymbolTrackerImpl struct {
context *NodeBuilderContext
inner nodebuilder.SymbolTracker
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The number of things that are types that are Impl or Interface and members that are impl or inner makes me feel like something is in the wrong place and/or we’re trying to be too clever with polymorphism. How much of this is temporary until more things are moved out of checker.ts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SymbolTracker interface defined outside checker is required - otherwise the printer and checker end up circularly dependent on one another, so don't work. I did just push a removal of the NodeBuilderInterface since, for now at least, it's not strictly required.

Verified

This commit was signed with the committer’s verified signature.
weswigham Wesley Wigham
@jakebailey
Copy link
Member

hugeDeclarationOutputGetsTruncatedWithError appears to time out on my machine; maybe we need to skip its output? Or, there's a perf bug?

@weswigham
Copy link
Member Author

Definite perf issue - it completes locally for me in checks around 50 seconds.

@weswigham
Copy link
Member Author

I think the perf issue is in the test suite - pprof shows almost all 50 seconds spent in myers.Diff...

@jakebailey
Copy link
Member

Hm, unfortunate; the test suite doesn't have a way to skip a diff yet, but could a la fixupOld plumbed downward.

@weswigham
Copy link
Member Author

Can we just specify a higher timeout for an individual test in the interim? For now, the diff is technically still useful - just expensive to calculate for a large output such as that test.

@jakebailey
Copy link
Member

jakebailey commented May 20, 2025

No, unfortunately the test timeouts are global as golang/go#48157 isn't yet implemented.

Copy link
Member

@jakebailey jakebailey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a pass of some of the files; have not ventured into the larger files quite yet.

return &SymbolTrackerImpl{context, inner, false, tchost}
}

func (this *SymbolTrackerImpl) GetModuleSpecifierGenerationHost() modulespecifiers.ModuleSpecifierGenerationHost {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is the only class in the old codebase, but I still don't like it 😄

@@ -519,6 +520,7 @@ type WideningContext struct {
}

type Program interface {
Host
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like us going back to the old days of "everything's a host you have to forward though"; is there a way to avoid this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...? No? The program is, by request, the only argument to making a checker, ergo it must provide all external, configurable/queryable state functionality the checker needs. All the metadata/state must flow through it, since it's the only point of contact between the checker and anything else. None of the methods added by this host are onerous for a normal program to implement, either (hence why the emit host, which wraps a program, can also easily provide them) - they're either methods on its own internal FS/host, or metadata it's already calculating during setup (like package json cache info). Like, you can expose the underlying things the program owns to implement some of the additional methods directly, like the FS or Resolver, but all that does is push complexity farther down and make implementing the interface for test/secondary use harder by accidentally exposing more scope...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this goes back to my question about "is a Program a host, or does it have a host"? And I prefer the latter but I can get why the former is more convenient

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Historically, it is a host (with an internal host of it's own for fs access) that we wrap/window/subset/augment as needed. In strada it's the primary implementation of TypeCheckerHost - we've just changed the term in use here - rather than naming the interface checker.TypeCheckerHost or checker.Host, we named it checker.Program. Just because, I guess.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It used to be that they were in the same package (compiler), so it was concrete before and was just left as an interface when the refactor happened.

@@ -38,7 +38,7 @@ func NewCommonJSModuleTransformer(emitContext *printer.EmitContext, compilerOpti
tx.assignmentPatternVisitor = emitContext.NewNodeVisitor(tx.visitAssignmentPattern)
tx.languageVersion = compilerOptions.GetEmitScriptTarget()
tx.moduleKind = compilerOptions.GetEmitModuleKind()
return tx.newTransformer(tx.visit, emitContext)
return tx.NewTransformer(tx.visit, emitContext)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just because the declaration emitter is not in this same package?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but defining transforms outside the core transform package could also be done in some quickfixes and would be done for some lint rule (fixes) - these methods on the base transform should be exported, imo, to support defining transforms where's they're needed, rather than only in one place.

Copy link
Member

@jakebailey jakebailey May 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think declaration emit being "different" compared to the other transforms seems fine. It really does do something different than downleveling transforms. (I kinda want the other transforms in a different package, too.)

@@ -22,6 +23,42 @@ type fakeProgram struct {
getResolvedModule func(currentSourceFile *ast.SourceFile, moduleReference string) *ast.SourceFile
}

func (p *fakeProgram) FileExists(path string) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, having to have all of these is sort of a shame but maybe there's no way around it ☹️

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup. I certainly don't wanna leak all the stuff on program's internal bits into the checker scope - minimally it'd make writing the interfaces incredibly painful.

weswigham added 7 commits May 23, 2025 13:03

Verified

This commit was signed with the committer’s verified signature.
weswigham Wesley Wigham

Verified

This commit was signed with the committer’s verified signature.
weswigham Wesley Wigham

Verified

This commit was signed with the committer’s verified signature.
weswigham Wesley Wigham

Verified

This commit was signed with the committer’s verified signature.
weswigham Wesley Wigham

Verified

This commit was signed with the committer’s verified signature.
weswigham Wesley Wigham

Verified

This commit was signed with the committer’s verified signature.
weswigham Wesley Wigham

Verified

This commit was signed with the committer’s verified signature.
weswigham Wesley Wigham
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants