[Request] Support for low-level incremental tokenization #3702

Closed
fabiospampinato opened this issue Jan 29, 2023 · 4 comments
Labels
cantfix / wontfix (Impossible to fix), enhancement (An enhancement or new feature), parser

Comments

@fabiospampinato

Is your request related to a specific problem you're having?

I'm making a new code editor, and I'd like to explore adding a Highlight.js backend to it for syntax highlighting, since I suspect it might be much faster than the current TextMate-based one.

The thing is, Highlight.js doesn't seem to have tokenization APIs. Ideally I would need an API that tells me which colors apply to which ranges, with the ability to pause and resume tokenization at any point. Otherwise it would block the main thread for an indefinite amount of time.
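To make the request concrete, here is a rough sketch of the kind of API shape I mean. Everything in it is hypothetical: `Token`, `LineState`, `tokenizeLine` and `tokenizeChunk` are illustrative names that do not exist in Highlight.js, and the toy rules (block comments and decimal numbers only) stand in for a real grammar. The point is only that each line returns ranges plus an end state, so the caller can stop after any line and resume later.

```typescript
// Hypothetical shapes for illustration; none of this is a Highlight.js API.
interface Token { start: number; end: number; scope: string; }
interface LineState { inBlockComment: boolean; }
interface LineResult { tokens: Token[]; endState: LineState; }
interface ChunkResult { lines: Token[][]; nextLine: number; state: LineState; }

// Toy grammar: recognizes only /* ... */ comments and runs of decimal digits.
function tokenizeLine(line: string, state: LineState): LineResult {
  const tokens: Token[] = [];
  let inComment = state.inBlockComment;
  let i = 0;
  while (i < line.length) {
    const opensComment = !inComment && line.slice(i, i + 2) === "/*";
    if (inComment || opensComment) {
      // Search for the terminator, skipping the opener if we just saw one.
      const close = line.indexOf("*/", opensComment ? i + 2 : i);
      const end = close === -1 ? line.length : close + 2;
      tokens.push({ start: i, end, scope: "comment" });
      inComment = close === -1; // still open at end of line
      i = end;
      continue;
    }
    const code = line.charCodeAt(i);
    if (code >= 48 && code <= 57) {
      let j = i;
      while (j < line.length && line.charCodeAt(j) >= 48 && line.charCodeAt(j) <= 57) j++;
      tokens.push({ start: i, end: j, scope: "number" });
      i = j;
      continue;
    }
    i++;
  }
  return { tokens, endState: { inBlockComment: inComment } };
}

// Tokenizes at most maxLines lines, then hands back where and how to resume.
function tokenizeChunk(
  lines: string[],
  startLine: number,
  state: LineState,
  maxLines: number
): ChunkResult {
  const out: Token[][] = [];
  let n = startLine;
  let current = state;
  while (n < lines.length && out.length < maxLines) {
    const result = tokenizeLine(lines[n], current);
    out.push(result.tokens);
    current = result.endState;
    n++;
  }
  return { lines: out, nextLine: n, state: current };
}
```

An editor would call `tokenizeChunk` with a small `maxLines` per frame and pass `nextLine` and `state` from the previous result into the next call, so the main thread is never blocked for longer than one chunk.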

The solution you'd prefer / feature you'd like to see added...

I'd like to see a low-level tokenization API added.

Any alternative solutions you considered...

Not adding a Highlight.js backend to the editor.

Additional context...

N/A

@fabiospampinato fabiospampinato added enhancement An enhancement or new feature parser labels Jan 29, 2023
@joshgoebel joshgoebel added the cantfix / wontfix Impossible to fix label Jan 29, 2023
@joshgoebel
Member

Sorry, this usage is out-of-scope and an entirely different usage than what HLJS is intended for. We're not that interested in the live code editor space - lots of solutions there already. That said, you'd be more than welcome to piggy back our grammars and build your own incremental tokenizer on top, but I wouldn't say that's a trivial undertaking. Of course the underlying parser code is all there for forking as well, but I wouldn't say it's super clean - it's never been FULLY refactored since I came onboard.

Worth noting we do not highlight nearly as well as big editors with large (and complete) grammars... we try to exist in the "smart pattern matching" space rather than the "fully parse and understand this language" space... that results in much, much smaller grammars, but often less fidelity. This becomes obvious in more complex grammars with a lot of nuance/context like TSX or Markdown, etc...

@joshgoebel
Member

We've actually been curious about the opposite direction - allowing big editor tokenization to be plugged into us - allowing our themes/rendering engine, but super high fidelity highlighting... of course the price is much, much larger JS payloads... and so far no one seems super interested in that work. #3621

@fabiospampinato
Author

fabiospampinato commented Jan 29, 2023

It's probably impossible to take existing highlight.js themes and use them with TextMate grammars with great results: presumably the whole scoping will be different, and maybe highlight.js themes don't even use scoping.

Closing since it's out of scope.

@joshgoebel
Member

joshgoebel commented Jan 29, 2023

We use scopes.

the whole scoping will be different

Indeed, but a 20-30 line mapping table would get one pretty far... so yes... to use another tokenizer one would have to remap its named scopes into HLJS named scopes, but that's the easy part AFAIC.
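For illustration only, a mapping table along those lines might look like the sketch below. The specific pairings are assumptions chosen for the example, not an official or complete mapping, and `TEXTMATE_TO_HLJS` and `toHljsScope` are hypothetical names. The lookup falls back from the most specific TextMate scope to shorter prefixes, since TextMate scopes usually carry extra trailing segments such as the language name.

```typescript
// Assumed, incomplete pairings from TextMate-style scopes to HLJS scope names.
const TEXTMATE_TO_HLJS: Record<string, string> = {
  "comment": "comment",
  "string": "string",
  "string.regexp": "regexp",
  "constant.numeric": "number",
  "constant.language": "literal",
  "keyword": "keyword",
  "keyword.operator": "operator",
  "storage.type": "keyword",
  "entity.name.function": "title.function",
  "entity.name.type": "title.class",
  "entity.name.tag": "name",
  "entity.other.attribute-name": "attr",
  "support.function": "built_in",
  "variable": "variable",
  "variable.parameter": "params",
  "punctuation": "punctuation",
};

// Returns the HLJS scope for the longest matching prefix, or undefined if
// no prefix of the TextMate scope appears in the table.
function toHljsScope(textMateScope: string): string | undefined {
  let candidate = textMateScope;
  while (candidate.length > 0) {
    if (Object.prototype.hasOwnProperty.call(TEXTMATE_TO_HLJS, candidate)) {
      return TEXTMATE_TO_HLJS[candidate];
    }
    const dot = candidate.lastIndexOf(".");
    if (dot === -1) break;
    candidate = candidate.slice(0, dot); // drop the last segment and retry
  }
  return undefined;
}
```

With this table, `toHljsScope("keyword.control.flow.js")` resolves through the `keyword` entry, while a scope with no matching prefix returns `undefined` and can be left unstyled.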
