Description
Describe your environment
OS: macOS and Debian 12 (confirmed on both)
Python version: Python 3.13
Package version: opentelemetry-python-contrib v1.31.0/0.52b0 or later
What happened?
When using Amazon Bedrock's ConverseStream API with tool_use functionality, JSON parsing errors occur during response processing. The issue is caused by the _decodetool_use function in bedrock_utils.py incorrectly handling double-quoted strings in chunked responses, removing necessary quotes and breaking JSON structure.
Steps to Reproduce
import json

import boto3
from opentelemetry.instrumentation.botocore import BotocoreInstrumentor

# Initialize instrumentation
BotocoreInstrumentor().instrument()

# Use Bedrock ConverseStream API with tool_use
client = boto3.client("bedrock-runtime", region_name="us-east-1")
response = client.converse_stream(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # or any model supporting tools
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "text": "Use the get_cities_list tool to provide exactly 10 popular tourist cities in Japan. Call the tool with a cities array containing: Tokyo, Osaka, Kyoto, Hiroshima, Nara, Yokohama, Sapporo, Fukuoka, Sendai, and Nagoya."
                }
            ],
        }
    ],
    toolConfig={
        "tools": [
            {
                "toolSpec": {
                    "name": "get_cities_list",
                    "description": "Get a list of cities",
                    "inputSchema": {"json": {"type": "object", "properties": {"cities": {"type": "array", "items": {"type": "string"}}}}},
                }
            }
        ]
    },
)

res = ""
# Process the streaming response - error occurs here
for chunk in response["stream"]:
    if "contentBlockDelta" in chunk:
        delta = chunk["contentBlockDelta"]["delta"]
        if "toolUse" in delta:
            # Default to "" so a delta without "input" does not raise TypeError
            res += delta["toolUse"].get("input", "")

# Parse the output and print it
print(json.loads(res))
Expected Result
The ConverseStream API with tool_use should work seamlessly with OpenTelemetry instrumentation. Chunked responses should be properly processed without JSON parsing errors, maintaining the integrity of JSON strings across chunk boundaries.
For example, when a complete response is {"cities":["Tokyo","Osaka","Kyoto","Hiroshima","Nara","Yokohama","Sapporo","Fukuoka","Sendai","Nagoya"]} and it's split across chunks, the final assembled JSON should remain valid regardless of how the chunks are divided.
Actual Result
JSON parsing error occurs when chunked responses contain complete quoted strings.
Specific problem:
- Expected complete JSON:
  {"cities":["Tokyo","Osaka","Kyoto","Hiroshima","Nara","Yokohama","Sapporo","Fukuoka","Sendai","Nagoya"]}
- When delta chunks contain complete quoted strings like "Tokyo", "Osaka", "Kyoto", etc.
- The _decodetool_use function strips the quotes: Tokyo, Osaka, Kyoto, etc.
- Final assembled JSON becomes:
  {"cities":[Tokyo,Osaka,Kyoto,"Hiroshima",Nara,"Yokohama",Sapporo,"Fukuoka",Sendai,"Nagoya"]} ← Invalid JSON
- Result: JSON parsing fails due to multiple unquoted string values in the array
Root cause location: instrumentation/opentelemetry-instrumentation-botocore/src/opentelemetry/instrumentation/botocore/extensions/bedrock_utils.py
The issue occurs inconsistently depending on chunk boundaries:
- ✅ Works: Chunks that split strings mid-value (quotes not stripped due to incomplete strings), like
  {"cities":["Tok
  yo","Osa
  ka","Kyoto"]}
- ❌ Fails: Chunks that each hold a complete quoted string (complete quoted strings get quotes stripped), like
  "Tokyo"
  "Osaka"
  "Kyoto"
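The boundary dependence can be illustrated with a minimal sketch. It assumes, without confirmation from the library source, that each delta's input fragment is JSON-decoded on its own and the raw text is kept when decoding fails. `decode_fragment` and `assemble` are hypothetical stand-ins, not the instrumentation's functions.

```python
import json


def decode_fragment(fragment: str):
    """Hypothetical stand-in: try to JSON-decode one streamed fragment."""
    try:
        # A fragment such as '"Tokyo"' is valid JSON on its own, so it
        # decodes to the bare Python string Tokyo (quotes are lost).
        return json.loads(fragment)
    except json.JSONDecodeError:
        # Fragments that are not valid JSON by themselves pass through unchanged.
        return fragment


def assemble(fragments):
    """Concatenate fragments the way the reproducer builds `res`."""
    return "".join(str(decode_fragment(f)) for f in fragments)


# Strings split mid-value: no fragment is valid JSON alone, nothing is altered.
partial = ['{"cities":["Tok', 'yo","Osa', 'ka","Kyoto"]}']
# Each city arrives as a complete quoted string: those fragments get decoded.
complete = ['{"cities":[', '"Tokyo"', ",", '"Osaka"', ",", '"Kyoto"', "]}"]

print(assemble(partial))   # {"cities":["Tokyo","Osaka","Kyoto"]}
print(assemble(complete))  # {"cities":[Tokyo,Osaka,Kyoto]}
```

Under that assumption, only fragments that happen to be valid JSON by themselves are changed, which would explain why the failure depends on where the service cuts the stream.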
Additional context
- This is a regression introduced in v1.31.0/0.52b0
- The same code works correctly with versions prior to v1.31.0/0.52b0
- Confirmed on multiple platforms: macOS and Debian 12
- The issue only affects ConverseStream API with tool_use functionality
- Regular Bedrock API calls (non-streaming) are not affected
- The problem impacts applications that rely on tool calling with streaming responses
The bug appears to be in the string parsing logic of the _decodetool_use function, which doesn't properly handle the case where a complete quoted string appears as a single chunk in the streaming response.
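One possible direction, sketched under assumptions rather than taken from the project: collect the raw input fragments per content block, decode only once the block is complete, and never write back into the event handed to the caller. `ToolInputBuffer` and its methods are hypothetical names used for illustration only.

```python
import json


class ToolInputBuffer:
    """Hypothetical helper: collect raw tool-input fragments without altering them."""

    def __init__(self) -> None:
        self._parts: list[str] = []

    def add(self, delta: dict) -> None:
        # Read the fragment only; the caller's delta dict is left untouched.
        fragment = delta.get("toolUse", {}).get("input")
        if fragment:
            self._parts.append(fragment)

    def decode(self):
        # Decode the full document once, after all fragments have arrived.
        raw = "".join(self._parts)
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            # Fall back to the raw text if the stream ended mid-document.
            return raw


deltas = [
    {"toolUse": {"input": '{"cities":['}},
    {"toolUse": {"input": '"Tokyo"'}},
    {"toolUse": {"input": "]}"}},
]
buffer = ToolInputBuffer()
for delta in deltas:
    buffer.add(delta)

print(buffer.decode())
```

Because decoding happens on the concatenated text, a fragment that is a complete quoted string is never parsed in isolation, and the fragments seen by the application stay byte-for-byte as received.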
Would you like to implement a fix?
None
Activity
xrmx commented on May 23, 2025
Thanks for providing a reproducer
xrmx commented on May 23, 2025
@sightseeker I'm not able to reproduce this and it prints
{'cities': ['Tokyo', 'Osaka', 'Kyoto', 'Hiroshima', 'Nara', 'Yokohama', 'Sapporo', 'Fukuoka', 'Sendai', 'Nagoya']}
Do you have a stacktrace? Also we don't have a _decodetool_use function here.
sightseeker commented on May 23, 2025
@xrmx I don’t have my PC with me right now, so I’ll send the stack trace later.
Since it depends on the stream response chunks, the issue only occurs once every 10 to 20 times.
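Because the failure is intermittent, a small counting harness can help when checking a candidate fix. This is a sketch only: `count_invalid_runs` is a hypothetical helper, and `run_once` stands for any callable that performs one ConverseStream call and returns the assembled tool input string (for example, the reproducer above wrapped in a function).

```python
import json
from typing import Callable


def count_invalid_runs(run_once: Callable[[], str], attempts: int = 20) -> int:
    """Run the reproducer several times and count runs whose output is not valid JSON."""
    invalid = 0
    for _ in range(attempts):
        assembled = run_once()
        try:
            json.loads(assembled)
        except json.JSONDecodeError:
            # Quotes were lost somewhere between the service and the caller.
            invalid += 1
    return invalid
```

With a failure rate of roughly one in 10 to 20 runs, a count of zero over only a handful of attempts does not show the bug is absent.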
sightseeker commented on May 24, 2025
@xrmx
Note about the reproduction code: The final two print statements are designed to demonstrate the bug clearly:
print(f"Final output: {res}") - This will show the corrupted JSON string where quotes have been incorrectly stripped by the _decodetool_use function. You'll see something like {"cities":[Tokyo,Osaka,Kyoto,...]} instead of the expected {"cities":["Tokyo","Osaka","Kyoto",...]}
print(f"Parsed output: {json.loads(res)}") - This line will fail with a JSONDecodeError because the assembled string is no longer valid JSON due to the missing quotes around string values.
To reproduce the issue: Replace the last two lines with the code above and run it.
Important: Since this issue depends on how the streaming response is chunked, it may only occur intermittently - approximately once every 10 to 20 attempts. You may need to run the code multiple times to observe the bug.
When the issue occurs with the affected versions (v1.31.0/0.52b0+), you should see the malformed JSON output first, followed by a JSON parsing exception. This clearly demonstrates how the instrumentation is breaking the streaming response processing.
When working correctly (quotes preserved):
When the bug occurs (quotes stripped by _decode_tool_use):
Notice in the failure case how Nara appears without quotes, making the JSON invalid and causing the parsing error.
xrmx commented on May 26, 2025
Tested ~30 times, still no joy in reproducing.
I've pushed a branch here https://github.com/xrmx/opentelemetry-python-contrib/tree/fix-converse-stream-tool-3537 with a test you can run on your machine and record the aws response so I can reproduce it too.
Per instrumentation/opentelemetry-instrumentation-botocore/tests/README.md you need to let tox access AWS env vars for recording calls:
You can run the test from the root of the -contrib repo with:
As long as the recording is not the one we expect, you need to remove it before running tox again:
When you have a recording reproducing the issue please attach it to this issue so I can reproduce it too.
sightseeker commented on May 26, 2025
@xrmx Thank you for creating the test! I found the issue with reproduction - the test code wasn't enabling instrumentation, which is required to trigger the bug.
I've created a commit that enables instrumentation in the test: sightseeker@7b0afd9
To reproduce the issue with instrumentation enabled:
- Enable instrumentation (BotocoreInstrumentor().instrument() before the API call in the test)
- Record a new cassette and run the test:
export TOX_OVERRIDE=testenv.pass_env=AWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY,AWS_SESSION_TOKEN,AWS_DEFAULT_REGION
rm instrumentation/opentelemetry-instrumentation-botocore/tests/cassettes/test_converse_stream_tool_call_parsing_errors.yaml
tox -e py313-test-instrumentation-botocore-1 -- -k test_converse_stream_tool_call_parsing_errors
Once I successfully reproduce the issue with instrumentation enabled, I'll attach the recorded test_converse_stream_tool_call_parsing_errors.yaml file that demonstrates the bug. The key difference is that without instrumentation, the _decodetool_use function in bedrock_utils.py is never called, so the quote-stripping bug doesn't occur.
sightseeker commented on May 26, 2025
stdout