Chain-of-thought and structured output #800

Closed

Description

@akira108

Please read this first

  • Have you read the docs? Agents SDK docs: Yes
  • Have you searched for related issues? Others may have had similar requests: Yes

Question

I'd like to force the LLM to do chain-of-thought reasoning with step-by-step tool calls, and ultimately return a structured output built from a Pydantic model. To achieve that, I'm using StopAtTools and stopping when the structured output is built.

tool_use_behavior = StopAtTools(stop_at_tool_names=[build_output.name])

where

@function_tool
def build_output(foo: Foo) -> Foo:  # Foo is a Pydantic model
    return foo

Today I can make this work by calling run_streamed, inspecting each interim tool-call result, and returning once a build_output call with the expected type appears, roughly as sketched below.
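
For context, here is a simplified sketch of my current streaming approach (build_output and Foo are from the snippet above; the event-type checks follow my reading of the streaming docs, so treat this as an approximation):

async def run_until_output(agent, user_input: str) -> Foo:
    # Stream the run and inspect each item as it is produced.
    result = Runner.run_streamed(agent, input=user_input)
    async for event in result.stream_events():
        # Run-item events wrap messages, tool calls, and tool outputs.
        if event.type == "run_item_stream_event" and event.item.type == "tool_call_output_item":
            # Stop as soon as build_output has returned the structured result.
            if isinstance(event.item.output, Foo):
                return event.item.output
    raise RuntimeError("Agent finished without calling build_output")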

Is there a way to achieve the same thing without using run_streamed?

Activity

rm-openai (Collaborator) commented on Jun 2, 2025

@akira108 I might be missing something, but why can't you do this:

agent = Agent(
  model="o3", # reasoning model,
  tools=[...], 
  output_type=MyOutputType,
)

result = await Runner.run(agent, input)
final_output = result.final_output_as(MyOutputType)

The reasoning model should automatically call tools in its CoT and keep going until it produces an output of that type.

akira108 (Author) commented on Jun 2, 2025

@rm-openai

Thanks for the reply!

I’d love to use the reasoning model, but due to cost considerations, I’m trying to make it work with gpt-4.1.

Ref: https://cookbook.openai.com/examples/gpt4-1_prompting_guide#prompting-induced-planning--chain-of-thought

Do you have any suggestions for achieving similar behavior with gpt-4.1?

rm-openai (Collaborator) commented on Jun 2, 2025

Oh gotcha, your approach should work for that. Note that because you're prompting the agent, it might be finicky:

  1. It might call the tool directly, without doing the step-by-step thinking.
  2. It might do the thinking, but not call the tool.

You could also try o4-mini; it might fit your budget.

import asyncio

from pydantic import BaseModel

from agents import (
    Agent,
    ItemHelpers,
    MessageOutputItem,
    Runner,
    function_tool,
)


class MathResult(BaseModel):
    result: int


@function_tool
def commit_final_result(result: int) -> MathResult:
    """When you have the final result, call this function."""
    print(f"[debug] Final result: {result}")
    return MathResult(result=result)


async def main():
    agent = Agent(
        name="Assistant",
        instructions="Think step by step. Once you have the final answer, call the commit_final_result tool exactly once. Do NOT call the commit_final_result until AFTER your thought process is done.",
        tool_use_behavior={
            "stop_at_tool_names": [commit_final_result.name],
        },
        tools=[commit_final_result],
    )

    result = await Runner.run(agent, "What is 373 * 41 + the year Roger Federer was born?")
    new_items = result.new_items

    assert isinstance(new_items[0], MessageOutputItem), (
        f"First item should be a message output item, got {type(new_items[0])}"
    )

    print(f"===Message===:\n {ItemHelpers.text_message_output(new_items[0])}")

    print(f"===Final output===\n {result.final_output}")


if __name__ == "__main__":
    asyncio.run(main())

output:

[debug] Final result: 17274
===Message===:
 To solve this problem, I'll break it down into steps:

1. **Calculate \(373 \times 41\):**

   \(373 \times 41 = 373 \times (40 + 1) = 373 \times 40 + 373 \times 1\)

   \[
   373 \times 40 = 373 \times 4 \times 10 = 1492 \times 10 = 14920
   \]

   \[
   373 \times 1 = 373
   \]

   \[
   373 \times 41 = 14920 + 373 = 15293
   \]

2. **Find the year Roger Federer was born:**

   Roger Federer was born in 1981.

3. **Add the results from Step 1 and Step 2:**

   \[
   15293 + 1981 = 17274
   \]

Now, I'll provide the final result.
===Final output===
 result=17274

akira108 (Author) commented on Jun 2, 2025

@rm-openai

Thanks so much — that really helps!

I didn’t realize that even without specifying an output_type, final_output would take on the return type of the tool.

Appreciate the heads-up on the two pitfalls when prompting gpt-4.1, and I’ll definitely give o4-mini a try too!
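
In case it helps anyone else: if I'm reading the SDK right, final_output_as can also recover the typed value from the untyped final_output, e.g.:

# My understanding of the API: coerce final_output back to the tool's
# Pydantic type; raise_if_incorrect_type=True raises TypeError on a mismatch.
math_result = result.final_output_as(MathResult, raise_if_incorrect_type=True)
print(math_result.result)  # 17274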
