Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C#: split table codegen logic from type codegen logic #1573

Merged
merged 5 commits into from
Sep 5, 2024

Conversation

RReverser
Copy link
Contributor

@RReverser RReverser commented Aug 7, 2024

Description of Changes

Over time, we've been adding more table features that affect base BSATN type generation as well.

Initially this was limited to BSATN.Codegen looking up any types marked with [SpacetimeDB.Table] - which was already an abstraction leak but not too bad on its own.

However, most recent example that made entangling even tighter is timer tables, where codegen needs to add extra fields to the type early in the pipeline, so that they'd be visible both to BSATN (de)serialization and to table column methods. Those extra fields had to be added separately to the type and table codegens and must be kept in sync.

This pushed the entangling over the line and, back in timer tables implementation, I left a TODO to instead subclass the table codegen from the type one, which is what I'm doing here.

After this change, the BSATN.Codegen's Type generator doesn't need to know anything about table and column attributes - it doesn't even need to look for [SpacetimeDB.Table] types anymore - and, instead, is only responsible for taking a list of fields and generating BSATN type methods, and only runs on types with the [SpacetimeDB.Type] marker.

On the other hand, Codegen's TableDeclaration now inherits from TypeDeclaration, adds those extra fields in its own constructor, parses column-specific attributes in ColumnDeclaration, and appends extra code in its own codegen. The combined type+table output for a table now ends up in a single file.

This change will make it easier to add table-specific features in the future without touching BSATN.Codegen altogether.

Note for reviewers: I recommend reviewing commit by commit and with whitespace hidden; I left detailed commit descriptions.

API and ABI breaking changes

If this is an API or ABI breaking change, please apply the
corresponding GitHub label.

Expected complexity level and risk

2 - the changes move some code around, so there's small risk of codegen breakage, but as long as snapshot tests don't show it, I'm fairly confident it works.

How complicated do you think these changes are? Grade on a scale from 1 to 5,
where 1 is a trivial change, and 5 is a deep-reaching and complex change.

This complexity rating applies not only to the complexity apparent in the diff,
but also to its interactions with existing and future code.

If you answered more than a 2, explain what is complex about the PR,
and what other components it interacts with in potentially concerning ways.

Testing

Describe any testing you've done, and any testing you'd like your reviewers to do,
so that you're confident that all the changes work as expected!

  • dotnet test
  • Write a test you want a reviewer to do here, so they can check it off when they're satisfied.

Copy link
Contributor

@kazimuth kazimuth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay on this, I'd meant to leave an approval a few days back but forgot to submit it. This all seems reasonable to me, although I'm still not intimately familiar with all the Roslyn plumbing.

@RReverser RReverser force-pushed the ingvar/csharp-table-inherit-type branch 2 times, most recently from 85ceffa to 07987a8 Compare August 21, 2024 16:54
Copy link
Contributor

@kazimuth kazimuth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah wait, I wasn't supposed to approve this one.

@RReverser
Copy link
Contributor Author

Heh yeah I'm waiting for more reviews anyway (in particular @lcodes and @SteveBoytsun) before merging this.

@RReverser RReverser requested a review from SteveBoytsun August 22, 2024 18:17
Copy link
Contributor

@SteveBoytsun SteveBoytsun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welp, I have no idea what's going on here, but I don't see any obvious issues, so... LGTM (Let's Gamble, Try Merging)


if (type.Kind is TypeKind.Sum)
// Ensure all enums fit in `byte` as that's what SATS uses for tags.
if (enumType.Members.Count > 256)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some deeply-rooted reason for this? it's not difficult to go over 8 bits in enums (especially if you're using them as a bitmask). I know that's not the point of this PR, but I wanted to bring it up in case there isn't a ticket for this already

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's limitation of SATS itself. I came when it was already like that. Maybe @cloutiertyler can answer?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The motivation AFAIK is that it minimizes the size of BSATN-ified sum values without having to add any more complex logic like var-len int encodings or computed sizeof(tag).

Also note that we don't (and have no plans to) allow users to choose enum discriminants, so using SpacetimeTyped enums as masks is not meaningful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it minimizes the size of BSATN-ified sum values without having to add any more complex logic like var-len int encodings or computed sizeof(tag).

I wonder if we should leave some wiggle room though. 256 variants is not that rare yo run out of even without bitflags.

We could either extend to uint16 or keep to uint8 but limit range to 127 variants max for now. With the 2nd option we don't need any changes to encoding for now, but it does leave us room to use LEB128 in the future if we decide we need it.

@RReverser RReverser force-pushed the ingvar/csharp-table-inherit-type branch from 818fd32 to 5e3ca05 Compare September 3, 2024 13:38
RReverser and others added 4 commits September 4, 2024 14:40
Aside from the trailing comma change, this is a non-functional refactor that moves 1) all the parsing code from anonymous `transform` lambdas and moves it into constructors of the corresponding classes, and 2) all the string generation code from anonymous `Select` lambdas and moves it into methods of the corresponding classes.

This makes each class self-contained and easier to reuse instead of having huge anonymous functions.

If you want to review the actual code diff, I'd recommend something like VSCode diff viewer which has a feature to show "moved code blocks", otherwise the diff is pretty ugly.
Several classes - fields, columns and reducer params - have very similar structure (name, type, BSATN type info) and codegen.

This commit unifies that shared functionality by making MemberDeclaration a base class for the others.
Over time, we've been adding more table features that affect base BSATN type generation as well.

Most recent example of this is timer tables, which need to add extra fields to the type early in the pipeline, so that they'd be visible both to BSATN (de)serialization and to table column methods.

The duplication has gotten ugly enough and, back in timer tables implementation, I left a TODO to instead subclass the table codegen from the type one, which is what I'm doing here. After this change, the BSATN.Codegen's Type generator doesn't need to know anything about table and column attributes - it doesn't even need to look for `[SpacetimeDB.Table]` types anymore - and, instead, is only responsible for taking a list of fields and generating BSATN type methods.

On the other hand, Codegen's TableDeclaration now inherits from TypeDeclaration, adds those extra fields in its own constructor as well as parses extra attributes in `ColumnDeclaration`, and appends extra code in its own codegen to that provided by the base TypeDeclaration. The combined type+table output for a table now ends up in a single file.

Overall, this change will make it easier to add table-specific features in the future without touching BSATN.Codegen altogether.
@RReverser RReverser force-pushed the ingvar/csharp-table-inherit-type branch from 5e3ca05 to 6f9e677 Compare September 4, 2024 13:54
@RReverser RReverser mentioned this pull request Sep 4, 2024
2 tasks
@cloutiertyler
Copy link
Contributor

@SteveBoytsun I've asked @jdetter to bring you up to speed on how to test this SDK change on top of BitCraft. I would like to hold off on merging this until we have updated BitCraft to 0.11 and can test this change on top.

@SteveBoytsun
Copy link
Contributor

Tested this with BitCraft and it works

@RReverser
Copy link
Contributor Author

Going to merge this since BitCraft has upgraded to 0.11 and we finally tested this PR against it.

We won't be doing any more major refactorings for now, but this one was open for >a month and splitting the large string codegen by types helps some other feature PRs as well.

@RReverser RReverser added this pull request to the merge queue Sep 5, 2024
Merged via the queue into master with commit 68e3565 Sep 5, 2024
8 checks passed
@RReverser RReverser deleted the ingvar/csharp-table-inherit-type branch September 5, 2024 18:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-any To be landed in any release window
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants