-
Notifications
You must be signed in to change notification settings - Fork 236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Frontend support for language extension with #lang-X #3363
Conversation
… node in the frontend AST, to tolerate syntax errors in the IDE
@@ -133,6 +133,8 @@ let go _ = | |||
| Success -> | |||
fstar_files := Some filenames; | |||
|
|||
load_native_tactics (); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
load_native_tactics early, since now even the early phases of the parser, dependence scanning etc. can depend on a native plugin
@@ -686,6 +696,22 @@ match %sedlex lexbuf with | |||
uninterpreted_blob snap name pos buffer lexbuf | |||
| _ -> assert false | |||
|
|||
and use_lang_blob snap name pos buffer lexbuf = | |||
match %sedlex lexbuf with | |||
| "#end-lang" -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about ending an extension block with #lang-fstar
(or #lang-pulse
for that matter)?
I wonder if it is feasible for the extension parser/lexer to call back into the main parser. That is, if we could just add a #lang-
rule to the pulse grammer?
One disadvantage of detecting the end of the extension block here is that it is unaware of the lexical syntax of the extension language (namely string literals and comments).
#lang-pulse
(*
We can exit a pulse block like this:
#end-lang
*)
fn foo () requires emp returns _: string ensures emp {
"This is how you end a pulse extension block:
#end-lang
Everything after that is F* again."
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or an optional #end-lang-X to match a #lang-X?
The extension parser for Pulse does call back into the F* parser and lexer. So, it can, in principle, handle a nested #lang-x ... #end-lang-x
block
The disadvantage you mentioned remains, but I guess it's not so bad ... you should be aware of the placement of the end-lang block.
That said, for Pulse itself, I don't see any reason to actually end the #lang-pulse block. Do you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I'm suggesting is that #lang-foo
means "parse the rest of the file using Foo." And then the parser for Foo could have the same command, calling back to the main parser.
#lang-pulse
fn foo () requires emp ensures emp { () }
// not really necessary, as you point out
#lang-fstar
let bar = 42
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, this works. You can even do
#lang-pulse
fn stuff ...
#lang-pulse
fn more stuff
And it will recursively call back into the Pulse parser for the second block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also registered an #lang-fstar parser and removed #end-lang
…, for use with #lang-X
…r use with #lang-X and fstar-mode.el
… register a #lang-fstar parser; remove #end-lang markers
I also got an Everest green with this; it is still compatible with ``` pulse .. . ``` style extensions. |
For Pulse and other DSLs, we want tighter syntax integration. This PR provides support for writing
I.e., after #lang-something, the rest of the file is parsed by a custom parser registered for language "somelang". Such a parser can be an extension of the F* parser and what follows could be a mixture of F* and some custom syntax. The block can optionally end with an #end-lang, without it, the entire rest of the file is considered to be in extension syntax.
This is used in Pulse---where you can find a fully worked out example of how all this works.
Some of the main pieces:
This is the main parser hook for #lang extensions. The registered parser takes a string and a source range for the suffix of the buffer after the #lang pragma, and should produce a list of F* declarations (or an error)
Unparseable : decl'
which is a token to indicate a terminal parsing failure in a file, associated with a range. By including this token as a decl, we can produce a list of decls including up to the first error, suppressing the error. This is useful when launching the IDE on a file that may have syntax errors in it.DeclToBeDesugared
: A decl node which allows an extension to stash arbitrary data (e.g., its own parse tree) within the AST, for downstream processing. The node also includes callbacks to let an extension check blobs for equality (using for caching in the IDE), and to scan blobs for dependency analysis.DeclToBeDesugared
node with the desugaring environment appropriate for that point in a file---using this ensures that an extension doesn't see out-of-order names in scope.FStar.Options provides an ide_filename which is set programmatically by the IDE mode, allowing other modules (notably extension parsers) to query the name of the file on which the IDE was launched. This is not a command line option.
FStar.Interactive.Ide: This is all mostly transparent to FStar.Interactive.IDE when using the full-buffer mode, i.e., for VSCode. However, for backwards compat with fstar-mode.el, which only sends fragments of buffers, I rigged the repl-state to keep track of which language parser is currently active and use that to parse the next chunk sent by emacs.