Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Orchestration input IInputConverter does not handle colletion iterfaces #3035

Open
Ilia-Kosenkov opened this issue Feb 6, 2025 · 2 comments · May be fixed by #3061
Open

Orchestration input IInputConverter does not handle colletion iterfaces #3035

Ilia-Kosenkov opened this issue Feb 6, 2025 · 2 comments · May be fixed by #3061
Labels

Comments

@Ilia-Kosenkov
Copy link

Ilia-Kosenkov commented Feb 6, 2025

Description

Orchestrations receive null instead of an input if input binding type is anything other than List<T> or T[] (perhaps some other select cases).
If input type is e.g. IEnuemrable<T> or IReadOnlyCollection<T>, input silently gets converted into null.

Expected behavior

Input is set to the value that was passed to the orchestration.

Actual behavior

We get null

Relevant source code snippets

Test setup looks like this:

public static class Endpoints
{
    [Function(nameof(PassIntCollectionOrchestration))]
    public static async Task PassIntCollectionOrchestration([OrchestrationTrigger] TaskOrchestrationContext context)
    {
        var input = Enumerable.Range(0, 10).ToList();

        await context.CallSubOrchestratorAsync(nameof(CheckForNullOrchestration), input);
    }

    [Function(nameof(CheckForNullOrchestration))]
    public static async Task CheckForNullOrchestration([OrchestrationTrigger] TaskOrchestrationContext context, IEnumerable<int> input)
    {
        if (input is null)
        {
            throw new InvalidOperationException("Got null where collection was expected");
        }

        await Task.CompletedTask;
    }

    [Function(nameof(HttpEntryPoint))]
    public static async Task HttpEntryPoint([HttpTrigger(AuthorizationLevel.Anonymous, "GET", Route = "debug")] HttpRequestData req, [DurableClient] DurableTaskClient client)
    {
        await client.ScheduleNewOrchestrationInstanceAsync(nameof(PassIntCollectionOrchestration));
    }
}

Now I dug deep into the framework and found the following:

  • The input is correctly serialized, at least when executing locally backed by an instance of Azurite. Verified using Durable Functions Monitor extension for VSCode.
  • While traversing the execution stack with debugger, I found OrchestrationInputConverter from Microsoft.Azure.Functions.Worker.Extensions.DurableTask, which I believe to me the main source of problems, since it returns ConversionResult.Unhandled when in reality it should return Success.
  • I also discovered DefaultFunctionInputBindingFeature from Microsoft.Azure.Functions.Worker.Context.Features, which fall-backs to setting default (i.e. null) values to bindings that failed conversion with all available input converters (i.e. if all returned ConversionResult.Unhandled).

public ValueTask<ConversionResult> ConvertAsync(ConverterContext context)
{
if (context is null)
{
throw new ArgumentNullException(nameof(context));
}
// We only convert if:
// 1. The "Source" is null - IE: there are no declared binding parameters.
// 2. We have a cached input.
// 3. The TargetType matches our cached type.
// If these are met, then we assume this parameter is the orchestration input.
if (context.Source is null
&& context.FunctionContext.Items.TryGetValue(OrchestrationInputKey, out object? value)
&& context.TargetType == value?.GetType())
{
// Remove this from the items so we bind this only once.
context.FunctionContext.Items.Remove(OrchestrationInputKey);
return new(ConversionResult.Success(value));
}
return new(ConversionResult.Unhandled());
}

Why is this wrong (in my opinion)? JSON (de-)serializer deserializes a collection (e.g., [1, 2, 3, 4]) into a List<T>, so value.GetType() is actually List<T>. However, the target type comes from the function signature! And it can be, e.g., an IEnumerable<T>! Naturally, typeof(List<T>) != typeof(IEnumerable<T>) and conversion fails. I believe the last check in condition should be replaced with context.TargetType.IsInstanceOfType(value), which will allow passing List<T> as IEnumerable<T> (or practically any other meaningful collection as long as interfaces match). I did not check if target type is used for deseriazliation, but I believe it is the expected behavior when deserializing into collection interfaces -- you always get some sort of List<T> or Dictionary<TK, TV>.

Alternatively, a special branch could be introduced to only handle IEnumerable<T>-based types, if context.TargetType.IsInstanceOfType(value) is deemed too broad for general use cases.

The second issue is here:

https://github.com/Azure/azure-functions-dotnet-worker/blob/5366211c497e10e7c1534010069f79a9d06106ca/src/DotNetWorker.Core/Context/Features/DefaultFunctionInputBindingFeature.cs#L71-L92

When all conversions fail, the input binding is set to null because IEnumerable<T> is a reference type. Personally I would expect this branch to hard fail -- we have an input that is unbound and has no default parameter values, hence there is no way to populate it. That way we will get a much more concise and informative exception rather than dealing with wild NullReferenceExceptions because our collection input is suddenly null.

Known workarounds

Yup, steal OrchestrationInputConverter code and tweak the condition mentioned above:

ModifiedOrchestrationInputConverter .cs
using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Converters;

public sealed class ModifiedOrchestrationInputConverter : IInputConverter
{
    private const string OrchestrationInputKey = "__orchestrationInput__";

    public static InputContext GetInputContext(FunctionContext context)
    {
        if (context is null)
        {
            throw new ArgumentNullException(nameof(context));
        }

        FunctionDefinition definition = context.FunctionDefinition;
        foreach (FunctionParameter parameter in definition.Parameters)
        {
            if (!IgnoredInputType(parameter.Type)
                && !definition.InputBindings.ContainsKey(parameter.Name)
                && !definition.OutputBindings.ContainsKey(parameter.Name))
            {
                // We take the first parameter we encounter which is not an explicitly ignored type
                // and has exactly 0 bindings (in or out) as our input candidate.
                return new(parameter.Type, context);
            }
        }

        return new(null, context);
    }

    public ValueTask<ConversionResult> ConvertAsync(ConverterContext context)
    {
        if (context is null)
        {
            throw new ArgumentNullException(nameof(context));
        }

        // We only convert if:
        // 1. The "Source" is null - IE: there are no declared binding parameters.
        // 2. We have a cached input.
        // 3. The TargetType matches our cached type.
        // If these are met, then we assume this parameter is the orchestration input.
        if (context.Source is null
            && context.FunctionContext.Items.TryGetValue(OrchestrationInputKey, out object? value)
// >>>>>>>HERE IS THE MODIFIED LINE<<<<<<<<<<<<<<<
            && context.TargetType.IsInstanceOfType(value))
        {
            // Remove this from the items so we bind this only once.
            context.FunctionContext.Items.Remove(OrchestrationInputKey);
            return new(ConversionResult.Success(value));
        }

        return new(ConversionResult.Unhandled());
    }

    private static bool IgnoredInputType(Type type)
    {
        // These are input types we know other converters handle.
        // TODO: is there a more concrete way we can determine if a type is already handled?
        return type == typeof(FunctionContext) || type == typeof(CancellationToken);
    }

    public class InputContext
    {
        private readonly FunctionContext context;
        private readonly Type? type;

        public InputContext(Type? type, FunctionContext context)
        {
            this.type = type;
            this.context = context;
        }

        public Type Type => this.type ?? typeof(object);

        public void PrepareInput(object? input)
        {
            // Short circuit if there is no input for our converter to handle, or if we have already prepared it.
            if (this.type is null || input is null || this.context.Items.ContainsKey(OrchestrationInputKey))
            {
                return;
            }

            this.context.Items[OrchestrationInputKey] = input;
        }
    }
}

Which can be then globally injected as

var host = new HostBuilder()
    .ConfigureFunctionsWorkerDefaults(
        builder => {  }, 
        opts =>
        {
            opts.InputConverters.RegisterAt<ModifiedOrchestrationInputConverter>(0);
        }
    )
    .Build();

App Details

  • <PackageReference Include="Microsoft.Azure.Functions.Worker" Version="2.0.0" />
  • <PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.DurableTask" Version="1.2.2" />
  • <PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.Http" Version="3.2.0" />
  • <PackageReference Include="Microsoft.Azure.Functions.Worker.Sdk" Version="2.0.0" />

Screenshots

Image

@Ilia-Kosenkov
Copy link
Author

Since this is the third time we report serialization-related issue here, and #3013 is still open, I am more than happy to help if you accept public contributions (with guidance of course).

@lilyjma
Copy link
Contributor

lilyjma commented Feb 28, 2025

@Ilia-Kosenkov - apologies for the delay. Yes! we accept public contributions. If you submit a PR, our team will review it.

@Ilia-Kosenkov Ilia-Kosenkov linked a pull request Mar 2, 2025 that will close this issue
15 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants