C#: split table codegen logic from type codegen logic #1573

RReverser · 2024-08-07T13:08:56Z

Description of Changes

Over time, we've been adding more table features that affect base BSATN type generation as well.

Initially this was limited to BSATN.Codegen looking up any types marked with [SpacetimeDB.Table] - which was already an abstraction leak but not too bad on its own.

However, most recent example that made entangling even tighter is timer tables, where codegen needs to add extra fields to the type early in the pipeline, so that they'd be visible both to BSATN (de)serialization and to table column methods. Those extra fields had to be added separately to the type and table codegens and must be kept in sync.

This pushed the entangling over the line and, back in timer tables implementation, I left a TODO to instead subclass the table codegen from the type one, which is what I'm doing here.

After this change, the BSATN.Codegen's Type generator doesn't need to know anything about table and column attributes - it doesn't even need to look for [SpacetimeDB.Table] types anymore - and, instead, is only responsible for taking a list of fields and generating BSATN type methods, and only runs on types with the [SpacetimeDB.Type] marker.

On the other hand, Codegen's TableDeclaration now inherits from TypeDeclaration, adds those extra fields in its own constructor, parses column-specific attributes in ColumnDeclaration, and appends extra code in its own codegen. The combined type+table output for a table now ends up in a single file.

This change will make it easier to add table-specific features in the future without touching BSATN.Codegen altogether.

Note for reviewers: I recommend reviewing commit by commit and with whitespace hidden; I left detailed commit descriptions.

API and ABI breaking changes

If this is an API or ABI breaking change, please apply the
corresponding GitHub label.

Expected complexity level and risk

2 - the changes move some code around, so there's small risk of codegen breakage, but as long as snapshot tests don't show it, I'm fairly confident it works.

How complicated do you think these changes are? Grade on a scale from 1 to 5,
where 1 is a trivial change, and 5 is a deep-reaching and complex change.

This complexity rating applies not only to the complexity apparent in the diff,
but also to its interactions with existing and future code.

If you answered more than a 2, explain what is complex about the PR,
and what other components it interacts with in potentially concerning ways.

Testing

Describe any testing you've done, and any testing you'd like your reviewers to do,
so that you're confident that all the changes work as expected!

dotnet test
Write a test you want a reviewer to do here, so they can check it off when they're satisfied.

crates/bindings-csharp/BSATN.Codegen/Type.cs

kazimuth

Sorry for the delay on this, I'd meant to leave an approval a few days back but forgot to submit it. This all seems reasonable to me, although I'm still not intimately familiar with all the Roslyn plumbing.

kazimuth

Ah wait, I wasn't supposed to approve this one.

RReverser · 2024-08-22T17:13:31Z

Heh yeah I'm waiting for more reviews anyway (in particular @lcodes and @SteveBoytsun) before merging this.

SteveBoytsun

Welp, I have no idea what's going on here, but I don't see any obvious issues, so... LGTM (Let's Gamble, Try Merging)

SteveBoytsun · 2024-08-23T17:50:14Z

crates/bindings-csharp/BSATN.Codegen/Type.cs

-
-                    if (type.Kind is TypeKind.Sum)
+                    // Ensure all enums fit in `byte` as that's what SATS uses for tags.
+                    if (enumType.Members.Count > 256)


Is there some deeply-rooted reason for this? it's not difficult to go over 8 bits in enums (especially if you're using them as a bitmask). I know that's not the point of this PR, but I wanted to bring it up in case there isn't a ticket for this already

It's limitation of SATS itself. I came when it was already like that. Maybe @cloutiertyler can answer?

The motivation AFAIK is that it minimizes the size of BSATN-ified sum values without having to add any more complex logic like var-len int encodings or computed sizeof(tag).

Also note that we don't (and have no plans to) allow users to choose enum discriminants, so using SpacetimeTyped enums as masks is not meaningful.

it minimizes the size of BSATN-ified sum values without having to add any more complex logic like var-len int encodings or computed sizeof(tag).

I wonder if we should leave some wiggle room though. 256 variants is not that rare yo run out of even without bitflags.

We could either extend to uint16 or keep to uint8 but limit range to 127 variants max for now. With the 2nd option we don't need any changes to encoding for now, but it does leave us room to use LEB128 in the future if we decide we need it.

Aside from the trailing comma change, this is a non-functional refactor that moves 1) all the parsing code from anonymous `transform` lambdas and moves it into constructors of the corresponding classes, and 2) all the string generation code from anonymous `Select` lambdas and moves it into methods of the corresponding classes. This makes each class self-contained and easier to reuse instead of having huge anonymous functions. If you want to review the actual code diff, I'd recommend something like VSCode diff viewer which has a feature to show "moved code blocks", otherwise the diff is pretty ugly.

Several classes - fields, columns and reducer params - have very similar structure (name, type, BSATN type info) and codegen. This commit unifies that shared functionality by making MemberDeclaration a base class for the others.

Over time, we've been adding more table features that affect base BSATN type generation as well. Most recent example of this is timer tables, which need to add extra fields to the type early in the pipeline, so that they'd be visible both to BSATN (de)serialization and to table column methods. The duplication has gotten ugly enough and, back in timer tables implementation, I left a TODO to instead subclass the table codegen from the type one, which is what I'm doing here. After this change, the BSATN.Codegen's Type generator doesn't need to know anything about table and column attributes - it doesn't even need to look for `[SpacetimeDB.Table]` types anymore - and, instead, is only responsible for taking a list of fields and generating BSATN type methods. On the other hand, Codegen's TableDeclaration now inherits from TypeDeclaration, adds those extra fields in its own constructor as well as parses extra attributes in `ColumnDeclaration`, and appends extra code in its own codegen to that provided by the base TypeDeclaration. The combined type+table output for a table now ends up in a single file. Overall, this change will make it easier to add table-specific features in the future without touching BSATN.Codegen altogether.

cloutiertyler · 2024-09-04T19:54:07Z

@SteveBoytsun I've asked @jdetter to bring you up to speed on how to test this SDK change on top of BitCraft. I would like to hold off on merging this until we have updated BitCraft to 0.11 and can test this change on top.

SteveBoytsun · 2024-09-05T16:38:30Z

Tested this with BitCraft and it works

RReverser · 2024-09-05T18:00:37Z

Going to merge this since BitCraft has upgraded to 0.11 and we finally tested this PR against it.

We won't be doing any more major refactorings for now, but this one was open for >a month and splitting the large string codegen by types helps some other feature PRs as well.

RReverser requested review from bfops, kazimuth, lcodes and jdetter August 7, 2024 13:08

RReverser force-pushed the ingvar/csharp-table-inherit-type branch from 0bbd0da to 018009d Compare August 7, 2024 13:09

kazimuth mentioned this pull request Aug 9, 2024

Require scheduled reducers to have table fields scheduled_at: ScheduledAt and scheduled_id: u64 declared #1577

Closed

bfops added the release-any To be landed in any release window label Aug 12, 2024

kazimuth reviewed Aug 15, 2024

View reviewed changes

crates/bindings-csharp/BSATN.Codegen/Type.cs Show resolved Hide resolved

kazimuth approved these changes Aug 15, 2024

View reviewed changes

RReverser force-pushed the ingvar/csharp-table-inherit-type branch 2 times, most recently from 85ceffa to 07987a8 Compare August 21, 2024 16:54

kazimuth reviewed Aug 22, 2024

View reviewed changes

RReverser requested a review from SteveBoytsun August 22, 2024 18:17

SteveBoytsun approved these changes Aug 23, 2024

View reviewed changes

RReverser force-pushed the ingvar/csharp-table-inherit-type branch from 818fd32 to 5e3ca05 Compare September 3, 2024 13:38

RReverser and others added 4 commits September 4, 2024 14:40

Reuse MemberDeclaration base for other members

bb8d5fb

Several classes - fields, columns and reducer params - have very similar structure (name, type, BSATN type info) and codegen. This commit unifies that shared functionality by making MemberDeclaration a base class for the others.

[ingvar/csharp-table-inherit-type]: Empty commit to bump Jenkins CI

6f9e677

RReverser force-pushed the ingvar/csharp-table-inherit-type branch from 5e3ca05 to 6f9e677 Compare September 4, 2024 13:54

RReverser mentioned this pull request Sep 4, 2024

Migrate C# ModuleDef to V9 #1670

Merged

2 tasks

Remove unnecessary usings

0db76c2

RReverser added this pull request to the merge queue Sep 5, 2024

Merged via the queue into master with commit 68e3565 Sep 5, 2024
8 checks passed

RReverser deleted the ingvar/csharp-table-inherit-type branch September 5, 2024 18:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

C#: split table codegen logic from type codegen logic #1573

C#: split table codegen logic from type codegen logic #1573

RReverser commented Aug 7, 2024 •

edited

Loading

kazimuth left a comment

kazimuth left a comment

RReverser commented Aug 22, 2024

SteveBoytsun left a comment

SteveBoytsun Aug 23, 2024

RReverser Aug 23, 2024

gefjon Aug 23, 2024

RReverser Aug 24, 2024

cloutiertyler commented Sep 4, 2024

SteveBoytsun commented Sep 5, 2024

RReverser commented Sep 5, 2024

C#: split table codegen logic from type codegen logic #1573

C#: split table codegen logic from type codegen logic #1573

Conversation

RReverser commented Aug 7, 2024 • edited Loading

Description of Changes

API and ABI breaking changes

Expected complexity level and risk

Testing

kazimuth left a comment

Choose a reason for hiding this comment

kazimuth left a comment

Choose a reason for hiding this comment

RReverser commented Aug 22, 2024

SteveBoytsun left a comment

Choose a reason for hiding this comment

SteveBoytsun Aug 23, 2024

Choose a reason for hiding this comment

RReverser Aug 23, 2024

Choose a reason for hiding this comment

gefjon Aug 23, 2024

Choose a reason for hiding this comment

RReverser Aug 24, 2024

Choose a reason for hiding this comment

cloutiertyler commented Sep 4, 2024

SteveBoytsun commented Sep 5, 2024

RReverser commented Sep 5, 2024

RReverser commented Aug 7, 2024 •

edited

Loading