
More “Stream” Options #316

Open
Phoseele opened this issue Feb 20, 2025 · 1 comment
Labels
enhancement New feature or request

Comments


Phoseele commented Feb 20, 2025

Describe the feature

When I ask for a response with the stream option on, it seems like string concatenation is performed for every generated character, and for a long response this causes a lot of GC allocations, which is a serious performance problem.
So I'm wondering whether it would be possible to add an option that returns only the newest generated chunk instead of the entire response built by string concatenation. The developer could then decide how to use each generated chunk.
Alternatively, add an overload of "LLMCharacter.Chat" that changes the callback parameter type from "Callback<string>" to "Callback<StringBuilder>" and accepts an externally supplied StringBuilder, which would avoid the string-concatenation GC.
[Two screenshots attached.]
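For illustration, here is a minimal self-contained C# sketch of the two callback shapes proposed above. `StreamingSketch`, `FakeTokenSource`, `ChatDelta`, and `ChatBuilder` are hypothetical stand-ins, not LLMUnity's actual API; `FakeTokenSource` simulates tokens arriving from LlamaLib.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public class StreamingSketch
{
    // Stand-in for tokens streamed from the native library.
    static IEnumerable<string> FakeTokenSource()
    {
        foreach (string t in new[] { "Hel", "lo", ", ", "wor", "ld", "!" })
            yield return t;
    }

    // Option A: deliver only the newest chunk; the caller owns accumulation,
    // so the library performs no per-token string concatenation.
    public static void ChatDelta(Action<string> onDelta)
    {
        foreach (string chunk in FakeTokenSource())
            onDelta(chunk);
    }

    // Option B: append into a caller-supplied StringBuilder, reusing one
    // growing buffer instead of allocating a new string per token.
    public static void ChatBuilder(StringBuilder sb, Action<StringBuilder> onUpdate)
    {
        foreach (string chunk in FakeTokenSource())
        {
            sb.Append(chunk);
            onUpdate(sb); // e.g. refresh a UI text element from sb
        }
    }

    public static void Main()
    {
        var accumulated = new StringBuilder();
        ChatDelta(chunk => accumulated.Append(chunk));
        Console.WriteLine(accumulated); // "Hello, world!"

        var sb = new StringBuilder();
        ChatBuilder(sb, b => { /* update UI */ });
        Console.WriteLine(sb); // "Hello, world!"
    }
}
```

Option A keeps the library allocation-free and leaves the accumulation policy to the caller; Option B preserves the current "full text so far" semantics while avoiding a new string allocation on every token.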

Phoseele added the enhancement (New feature or request) label on Feb 20, 2025
amakropoulos (Collaborator) commented

Yes, this can be done; it needs a bit of engineering on the LlamaLib side.
