Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ConditionalRouter output type is not working anymore #8161

Closed
1 task done
gustavo-has-stone opened this issue Aug 2, 2024 · 13 comments · Fixed by #8176
Closed
1 task done

ConditionalRouter output type is not working anymore #8161

gustavo-has-stone opened this issue Aug 2, 2024 · 13 comments · Fixed by #8176
Assignees
Labels
P1 High priority, add to the next sprint
Milestone

Comments

@gustavo-has-stone
Copy link

Describe the bug
We tried to update the Haystack version to 2.3.1 due to the security update and our code stopped working. We identified that the ConditionalRouter is returning a string instead of the type assigned in output_type

Error message
We are getting errors related to the fact that a string is now being returned. Here is an example of the error when running the provided code:

AttributeError: 'str' object has no attribute 'content'

Expected behavior
We expected that the output_type provided in the route would be returned

Additional context
The problem is related to version 2.3.1. Our provided example works on 2.3.0

To Reproduce
The following code works on haystack 2.3.0 but it does not work on haystack 2.3.1:

from haystack.components.routers import ConditionalRouter
from haystack.dataclasses import ChatMessage, ChatRole

routes = [
    {
        "condition": "{{streams|length > 2}}",
        "output": "{{message}}",
        "output_name": "long_streams_message",
        "output_type": ChatMessage,
    },
    {
        "condition": "{{streams|length <= 2}}",
        "output": "{{message}}",
        "output_name": "short_streams_message",
        "output_type": ChatMessage,
    },
]

message = ChatMessage(content='Hi', role=ChatRole('user'), name='', meta={})
print(message.content)

router = ConditionalRouter(routes)

kwargs = {"streams": [1, 2, 3], "message": message}
result = router.run(**kwargs)
print(result['long_streams_message'].content)

FAQ Check

System:

  • OS: Ubuntu 22.04 LTS
  • GPU/CPU: CPU
  • Haystack version (commit or version number): 2.3.1
@silvanocerza
Copy link
Contributor

silvanocerza commented Aug 2, 2024

That's expected.

Because of the changes to fix the security issue we had to make this breaking change. The ConditionalRouter can only return the following Python literal types now: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, None and Ellipsis.
Any other will be returned as a string.

I would suggest creating a custom Component to cover those cases where the ConditionalRouter can't be used as before.

Something like this should do the trick:

from typing import List

from haystack import component
from haystack.dataclasses import ChatMessage

@component
class Decision:
    @component.output_types(short_streams_message=ChatMessage, long_streams_message=ChatMessage)
    def run(self, streams: List[str], message: ChatMessage):
        if len(streams) > 2:
            return {"long_streams_message": message}
        return {"short_streams_message": message}

@gustavo-has-stone
Copy link
Author

I see. So the parameter output_type will not work anymore?

We did not expect to see a breaking change with a patch version change. Should we worry about those updates too?

@silvanocerza
Copy link
Contributor

output_type has never been necessary to convert the type, it's only used to set the correct output type for the Component. Even before this change you could put any type in there but it wouldn't guarantee that the Component will return that type when calling run. That's only necessary when calling Pipeline.connect() to validate the connection.

We decided to release this security change as a patch release even if it's a breaking change as it's a pretty critical, it could lead to remote code execution.

@mathislucka
Copy link
Member

mathislucka commented Aug 3, 2024

Perhaps we could add an option to the ConditionalRouter so that it can route inputs to the selected route's output without rendering a jinja2 template?

I also had this case, where I wanted to route a list of Documents depending on a reply generated from the LLM. I don't have to go through jinja to route these documents because I don't want to change anything about them.

So something like:

routes = [
{"condition": "{{...}}", output: "documents", output_name: "a_documents", "output_type": List[Document], "use_literal_output": True}
]
router = ConditionalRouter(routes=routes)

router.run(documents=[Document(content="hey")])

# returns the document passed in unchanged

# We now evaluate the `output` expression to determine the route output

This part would need to be changed to something like:

if route.get("use_literal_output"):
    output = kwargs[route["output"]]
else:
    ...

Would that solve your use case too @gustavo-has-stone ?

@julian-risch julian-risch added the P1 High priority, add to the next sprint label Aug 5, 2024
@gustavo-has-stone
Copy link
Author

In our scenario, we were routing a variable of a custom type (similar to ChatMessage) based on the value of an attribute of this variable. I did not fully understand your example. Could you provide more details?

@alexsniffin
Copy link

Hi, I ran into this problem too when upgrading to 2.3.1. I use a custom dataclass type for my pipeline which isn't working as expected anymore. Is there a suggested fix for this without creating a custom router?

@silvanocerza
Copy link
Contributor

@gustavo-has-stone @alexsniffin

We discussed a bit internally and decided to give the possibility to enable the previous behaviour by creating ConditionalRouter and OutputAdapter with unsafe=True.

I just created a PR to bring that change in and we'll release it on the upcoming 2.4.0 some time next week. For the time being, and if you consider it safe, you can still rely on the previous behaviour by using 2.3.0 instead of the latest 2.3.1.

@gustavo-has-stone
Copy link
Author

gustavo-has-stone commented Aug 8, 2024

I don't think that using an unsafe version is a good solution for us. We will try to find an alternative solution for our use case

@dmvieira
Copy link

dmvieira commented Aug 8, 2024

Hey @silvanocerza Idk if it`s a good solution because you are publishing a version that the only way to use previews behavior is forcing unsafe. It forces people to open security breaches when need a feature.

An approach that could help us is using Haystack ChatMessage as Router output. Do you know if is it possible?

@silvanocerza
Copy link
Contributor

I don't think that using an unsafe version is a good solution for us. We will try to find an alternative solution for our use case

It's unsafe only if the Jinja templates used are users inputs too. If you know the templates it's completely safe to use.

Hey @silvanocerza Idk if it`s a good solution because you are publishing a version that the only way to use previews behavior is forcing unsafe. It forces people to open security breaches when need a feature.

An approach that could help us is using Haystack ChatMessage as Router output. Do you know if is it possible?

Setting unsafe to True doesn't automatically open you to security breaches. It enables a behaviour that is considered unsafe in certain circumstances, that is when ConditionalRouter or OutputAdapter use templates that can be considered users input.

If you know what you're doing and know you're not using users inputs as Jinja template that is completely fine to use.

@gustavo-has-stone
Copy link
Author

Let me see if I understand. If we try to route based on some controlled condition (e.g., message length), we will be fine. However, if we try to use the content of a message for conditional routing, we could be in trouble. Is that right?

@shadeMe shadeMe added this to the 2.4.1 milestone Aug 12, 2024
@silvanocerza
Copy link
Contributor

Let me see if I understand. If we try to route based on some controlled condition (e.g., message length), we will be fine. However, if we try to use the content of a message for conditional routing, we could be in trouble. Is that right?

No. You are vulnerable if you let your users write their own templates from scratch.

A simple example:

from typing import List
from haystack.components.routers import ConditionalRouter

user_input = input("Write a Jinja template:")

routes = [
    {
        "condition": user_input,
        "output": "{{streams}}",
        "output_name": "enough_streams",
        "output_type": List[int],
    },
    {
        "condition": "{{streams|length <= 2}}",
        "output": "{{streams}}",
        "output_name": "insufficient_streams",
        "output_type": List[int],
    },
]
router = ConditionalRouter(routes)
...
router.run(whatever_input)

@dmvieira
Copy link

Thank you @silvanocerza . It sounds like a plan! 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1 High priority, add to the next sprint
Projects
None yet
7 participants