When computing the anchors on the traceback, results may be wrong if unicode chars are used #99103

Closed
@fabioz

Description

Bug report

Consider the code below:

d = {
    "ó": {
        "á": {
            "í": {
                "theta": 1
            }
        }
    }
}

try:
    result = d["ó"]["á"]["í"]["beta"]
except:
    import traceback;traceback.print_exc()

The output provided is:

Traceback (most recent call last):
  File "W:\pydev.debugger\check\snippet2.py", line 12, in <module>
    result = d["ó"]["á"]["í"]["beta"]
             ~~~~~~~~~~~~~~~~~~~^^^^^^^^
KeyError: 'beta'

Notice that one extra `~` is added for each Unicode (non-ASCII) character on the line, so the markers extend past the expression they are meant to underline.

This seems to happen because, when the anchors are computed in `traceback._extract_caret_anchors_from_line_segment`, the column offsets of the AST nodes produced by `ast.parse` are UTF-8 byte offsets rather than character offsets.
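
The mismatch can be reproduced outside of `traceback` with a minimal sketch, assuming the failing line is parsed on its own with `ast.parse`. The helper `byte_to_char_offset` is hypothetical and only for illustration; it is not part of the `traceback` module and is not necessarily how a fix would be implemented.

```python
import ast

# The failing line from the report; ó, á, í are written as escapes so the
# example does not depend on how this file is encoded.
line = 'result = d["\u00f3"]["\u00e1"]["\u00ed"]["beta"]'

# The right-hand side is the outer Subscript (...["beta"]); its .value is the
# inner expression d["ó"]["á"]["í"] that the `~` markers should cover.
outer = ast.parse(line).body[0].value
inner = outer.value


def byte_to_char_offset(source_line: str, byte_offset: int) -> int:
    """Hypothetical helper: convert a UTF-8 byte offset to a character offset."""
    return len(source_line.encode("utf-8")[:byte_offset].decode("utf-8"))


# ast reports column offsets in UTF-8 bytes, so each two-byte character
# pushes the end of the inner expression one column too far.
byte_end = inner.end_col_offset
char_end = byte_to_char_offset(line, byte_end)

width_from_bytes = byte_end - inner.col_offset
width_from_chars = char_end - byte_to_char_offset(line, inner.col_offset)

print(byte_end, char_end)                  # 28 25
print(width_from_bytes, width_from_chars)  # 19 16
```

Here the byte-based width is three columns larger than the character-based one, one column per two-byte character, which is consistent with the drift described above.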

Your environment

  • CPython versions tested on: 3.11.0
  • Operating system and architecture: Windows 10
