Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ABI v9 validation code #1572

Merged
merged 14 commits into from
Aug 27, 2024
Merged

ABI v9 validation code #1572

merged 14 commits into from
Aug 27, 2024

Conversation

kazimuth
Copy link
Contributor

@kazimuth kazimuth commented Aug 7, 2024

Description of Changes

WIP validation code for moduledefv9, plus the definition for ModuleDef. The new module-def should also be backwards-compatible with V8BackCompat and the existing TableSchema, Header, etc types. I have not implemented those bridges yet.

API and ABI breaking changes

A few API changes in components I have recently added, otherwise this is just adding code.

Expected complexity level and risk

1, there is a good amount of code but I am generally happy with the redesigned interfaces. It's fun playing with my new multi-error combinator.

Testing

Test porting is WIP, I have more proptest helpers now and I am going to use those to fuzz this once I've got the rest of the stuff ported.

@kazimuth kazimuth marked this pull request as draft August 7, 2024 05:56
@kazimuth kazimuth changed the base branch from jgilles/new_module_def to master August 9, 2024 05:27
@kazimuth kazimuth force-pushed the jgilles/validation branch 2 times, most recently from bb4d9c4 to c10127a Compare August 9, 2024 05:57
@kazimuth kazimuth marked this pull request as ready for review August 9, 2024 05:58
@kazimuth
Copy link
Contributor Author

kazimuth commented Aug 9, 2024

@Centril @cloutiertyler This is ready for review.

@kazimuth
Copy link
Contributor Author

kazimuth commented Aug 9, 2024

I didn't touch C#, so not sure where those SDK failures are coming from.

Copy link
Contributor

@Centril Centril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review; more to come.

@bfops bfops added the release-any To be landed in any release window label Aug 12, 2024
@kazimuth
Copy link
Contributor Author

kazimuth commented Aug 13, 2024

@Centril, please review #1584, #1585, #1586 before coming back to this one.

@kazimuth kazimuth force-pushed the jgilles/validation branch 2 times, most recently from 485f7da to 893e373 Compare August 16, 2024 17:13
Apply suggestions from code review

Co-authored-by: Mazdak Farrokhzad <[email protected]>
Signed-off-by: james gilles <[email protected]>

Address review comments

Address more review comments

Remove dead code

Fix tests

Add primary key support

Fix #1591 by handling scoped type names explicitly

Update validate for new ColList API

Whoops, that derive was wrong

Better ColList usage

Updates that Phoebe pointed out

Whoops, removed something we needed

Test that indexes are generated correctly in v9

Whoops, forgot to update raw_def
@kazimuth kazimuth force-pushed the jgilles/validation branch from e1afd15 to f695d0d Compare August 23, 2024 22:13
@kazimuth
Copy link
Contributor Author

@Centril @cloutiertyler I consider this ready for merge.

// Alias provided? Relate `name -> slot_ref`.
if let Some(name) = name {
// Right now, we just split the name on patterns likely to split up module paths.
// TODO(1.0): build namespacing directly into the bindings macros so that we don't need to do this.
Copy link
Contributor Author

@kazimuth kazimuth Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this will happen when we are redoing the Rust binding macros to return v9. It will not be an ABI break because the TypespaceBuilder traitTypespaceBuilder lives inside the module, so it is not part of the ABI. It is part of the API surface touched by the macros, so this will be an API break, but it's one of those weird actually-private-despite-being-formally-public APIs you get with macros.

@kazimuth kazimuth changed the title WIP v9 validation code ABI v9 validation code Aug 26, 2024
@kazimuth kazimuth requested a review from RReverser August 26, 2024 17:55
types: HashMap<ScopedTypeName, TypeDef>,

/// The typespace of the module definition.
typespace: Typespace,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we add AlgebraicTypeUse and AlgebraicTypeDefinition, we may want to change this to a datastructure that stores those for client codegen. On the other hand, I think the database engine still wants to think about algebraic types structurally, so we may be in the uncomfortable position of needing both.

Copy link
Contributor

@cloutiertyler cloutiertyler Aug 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in the future we will want to think about how we can implement nominal types within a structural type system on a holistic level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that would be very worthwhile.

pub mod validate;

/// A map from `Identifier`s to values of type `T`.
pub type IdentifierMap<T> = HashMap<Identifier, T>;
Copy link
Contributor

@RReverser RReverser Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, all the types you store here implement your nice ModuleDefLookup trait and already allow to get a map key from T.

One thing I've done in the past for similar scenarios is use HashSet instead + a newtype that implements Eq + Hash + Borrow<Key> by delegating to the key comparisons.

That way you should be able to avoid a lot of allocations in all these (foobar.name.clone(), foobar) KV construction lines, as well as save some memory by simply reusing foobar itself as both the key and the value of each entry.

Idk how critical it is, so feel free to leave this as a future optimisation, just thought I'd throw it out there as I noticed a lot of those clones in the PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's a great idea. That should get rid of almost all of these clones and get rid of some possible invalid states.

I'm tempted to leave it as a future optimization for now though just in favor of getting this merged.

Copy link
Contributor

@cloutiertyler cloutiertyler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any significant issues with this change, provided you address the things you've already responded to. Excited to merge!

@kazimuth kazimuth added this pull request to the merge queue Aug 27, 2024
Merged via the queue into master with commit 6e6bee2 Aug 27, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-any To be landed in any release window
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants