diff --git a/Doc/glossary.rst b/Doc/glossary.rst index c5c7994f1262a9..fad892998480d3 100644 --- a/Doc/glossary.rst +++ b/Doc/glossary.rst @@ -1304,6 +1304,11 @@ Glossary See also :term:`borrowed reference`. + t-string + String literals prefixed with ``'t'`` or ``'T'`` are commonly called + "t-strings" which is short for + :ref:`template string literals `. See also :pep:`750`. + text encoding A string in Python is a sequence of Unicode code points (in range ``U+0000``--``U+10FFFF``). To store or transfer a string, it needs to be diff --git a/Doc/library/ast.rst b/Doc/library/ast.rst index ef6c62dca1e124..efab63afbf58e8 100644 --- a/Doc/library/ast.rst +++ b/Doc/library/ast.rst @@ -325,6 +325,54 @@ Literals Constant(value='.3')]))])) +.. class:: TemplateStr(values) + + A t-string, comprising a series of :class:`Interpolation` and :class:`Constant` + nodes. + + .. doctest:: + + >>> print(ast.dump(ast.parse('t"{name} finished {place:ordinal}"', mode='eval'), indent=4)) + Expression( + body=TemplateStr( + values=[ + Interpolation( + value=Name(id='name'), + str='name', + conversion=-1), + Constant(value=' finished '), + Interpolation( + value=Name(id='place'), + str='place', + conversion=-1, + format_spec=JoinedStr( + values=[ + Constant(value='ordinal')]))])) + + .. versionadded:: 3.14 + + +.. class:: Interpolation(value, str, conversion, format_spec) + + Node representing a single interpolation field in a t-string. + + * ``value`` is any expression node (such as a literal, a variable, or a + function call). + * ``str`` is a constant containing the text of the interpolation expression. + * ``conversion`` is an integer: + + * -1: no conversion + * 115: ``!s`` string conversion + * 114: ``!r`` repr conversion + * 97: ``!a`` ascii conversion + + * ``format_spec`` is a :class:`JoinedStr` node representing the formatting + of the value, or ``None`` if no format was specified. Both + ``conversion`` and ``format_spec`` can be set at the same time. + + .. 
versionadded:: 3.14 + + .. class:: List(elts, ctx) Tuple(elts, ctx) diff --git a/Doc/library/dis.rst b/Doc/library/dis.rst index 11685a32f48e4f..6e5f65e122dcdb 100644 --- a/Doc/library/dis.rst +++ b/Doc/library/dis.rst @@ -1120,6 +1120,48 @@ iterations of the loop. .. versionadded:: 3.12 +.. opcode:: BUILD_TEMPLATE + + Constructs a new :class:`~string.templatelib.Template` from a tuple + of strings and a tuple of interpolations and pushes the resulting instance + onto the stack:: + + interpolations = STACK.pop() + strings = STACK.pop() + STACK.append(_build_template(strings, interpolations)) + + .. versionadded:: 3.14 + + +.. opcode:: BUILD_INTERPOLATION (format) + + Constructs a new :class:`~string.templatelib.Interpolation` from a + value and its source expression and pushes the resulting instance onto the + stack. + + If no conversion or format specification is present, ``format`` is set to + ``2``. + + If the low bit of ``format`` is set, it indicates that the interpolation + contains a format specification. + + If ``format >> 2`` is non-zero, it indicates that the interpolation + contains a conversion. The value of ``format >> 2`` is the conversion type + (e.g. ``0`` for no conversion, ``1`` for ``!s``, ``2`` for ``!r``, and + ``3`` for ``!a``):: + + conversion = format >> 2 + if format & 1: + format_spec = STACK.pop() + else: + format_spec = None + expression = STACK.pop() + value = STACK.pop() + STACK.append(_build_interpolation(value, expression, conversion, format_spec)) + + .. versionadded:: 3.14 + + .. opcode:: BUILD_TUPLE (count) Creates a tuple consuming *count* items from the stack, and pushes the diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst index 394c302fd354b9..cce38d3a48ca6e 100644 --- a/Doc/library/stdtypes.rst +++ b/Doc/library/stdtypes.rst @@ -2673,9 +2673,9 @@ For example: lead to a number of common errors (such as failing to display tuples and dictionaries correctly). 
Using the newer :ref:`formatted string literals `, the :meth:`str.format` interface, or :ref:`template strings - ` may help avoid these errors. Each of these - alternatives provides their own trade-offs and benefits of simplicity, - flexibility, and/or extensibility. + ($-strings) ` may help avoid these errors. + Each of these alternatives provides their own trade-offs and benefits of + simplicity, flexibility, and/or extensibility. String objects have one unique built-in operation: the ``%`` operator (modulo). This is also known as the string *formatting* or *interpolation* operator. diff --git a/Doc/library/string.rst b/Doc/library/string.rst index 23e15780075435..7b624d18d7f8e0 100644 --- a/Doc/library/string.rst +++ b/Doc/library/string.rst @@ -198,8 +198,9 @@ Format String Syntax The :meth:`str.format` method and the :class:`Formatter` class share the same syntax for format strings (although in the case of :class:`Formatter`, subclasses can define their own format string syntax). The syntax is -related to that of :ref:`formatted string literals `, but it is -less sophisticated and, in particular, does not support arbitrary expressions. +related to that of :ref:`formatted string literals ` and +:ref:`template string literals `, but it is less sophisticated +and, in particular, does not support arbitrary expressions. .. index:: single: {} (curly brackets); in string formatting @@ -306,7 +307,7 @@ Format Specification Mini-Language "Format specifications" are used within replacement fields contained within a format string to define how individual values are presented (see -:ref:`formatstrings` and :ref:`f-strings`). +:ref:`formatstrings`, :ref:`f-strings`, and :ref:`t-strings`). They can also be passed directly to the built-in :func:`format` function. Each formattable type may define how the format specification is to be interpreted. @@ -789,10 +790,20 @@ Nesting arguments and more complex examples:: -.. _template-strings: +.. 
_template-strings-pep292: -Template strings ----------------- +Template strings ($-strings) +---------------------------- + +.. note:: + + The feature described here was introduced in Python 2.4. It is unrelated + to, and should not be confused with, the newer + :ref:`template strings ` feature and + :ref:`t-string literal syntax ` introduced in Python 3.14. + T-string literals evaluate to instances of a different + :class:`~string.templatelib.Template` class, found in the + :mod:`string.templatelib` module. Template strings provide simpler string substitutions as described in :pep:`292`. A primary use case for template strings is for @@ -972,3 +983,9 @@ Helper functions or ``None``, runs of whitespace characters are replaced by a single space and leading and trailing whitespace are removed, otherwise *sep* is used to split and join the words. + + + +.. toctree:: + + string.templatelib.rst diff --git a/Doc/library/string.templatelib.rst b/Doc/library/string.templatelib.rst new file mode 100644 index 00000000000000..ed2e9e3f7f84b7 --- /dev/null +++ b/Doc/library/string.templatelib.rst @@ -0,0 +1,276 @@ +:mod:`!string.templatelib` --- Templates and Interpolations for t-strings +========================================================================= + +.. module:: string.templatelib + :synopsis: Support for t-string literals. + +**Source code:** :source:`Lib/string/templatelib.py` + +-------------- + + +.. seealso:: + + :ref:`T-strings tutorial ` + :ref:`Format strings ` + :ref:`T-string literal syntax ` + + +.. _template-strings: + +Template strings +---------------- + +.. versionadded:: 3.14 + +Template strings are an extension of :ref:`f-strings ` +that allow for greater control of formatting behavior. The :class:`Template` +class gives you access to the static and interpolated (in curly braces) +parts of a string *before* they are combined into a final string. + +See the :ref:`t-strings tutorial ` for an introduction. + + +.. 
_templatelib-template: + +Template +-------- + +The :class:`!Template` class describes the contents of a template string. + +:class:`!Template` instances are shallow immutable: their attributes cannot be +reassigned. + +.. class:: Template(*args) + + Create a new :class:`!Template` object. + + :param args: A mix of strings and :class:`Interpolation` instances in any order. + :type args: str | Interpolation + + The most common way to create a :class:`!Template` instance is to use the + :ref:`t-string literal syntax `. This syntax is identical to that of + :ref:`f-strings` except that it uses a ``t`` instead of an ``f``: + + >>> name = "World" + >>> template = t"Hello {name}!" + >>> type(template) + + >>> template.strings + ('Hello ', '!') + >>> template.values + ('World',) + + While literal syntax is the most common way to create :class:`!Template` + instances, it is also possible to create them directly using the constructor: + + >>> from string.templatelib import Interpolation, Template + >>> name = "World" + >>> template = Template("Hello, ", Interpolation(name, "name"), "!") + >>> list(template) + ['Hello, ', Interpolation('World', 'name', None, ''), '!'] + + If two or more consecutive strings are passed, they will be concatenated + into a single value in the :attr:`~Template.strings` attribute. For example, + the following code creates a :class:`Template` with a single final string: + + >>> from string.templatelib import Template + >>> template = Template("Hello ", "World", "!") + >>> template.strings + ('Hello World!',) + + If two or more consecutive interpolations are passed, they will be treated + as separate interpolations and an empty string will be inserted between them. 
+ For example, the following code creates a template with three empty strings in + the :attr:`~Template.strings` attribute: + + >>> from string.templatelib import Interpolation, Template + >>> template = Template(Interpolation("World", "name"), Interpolation("!", "punctuation")) + >>> template.strings + ('', '', '') + + .. attribute:: strings + :type: tuple[str, ...] + + A :ref:`tuple ` of the static strings in the template. + + >>> name = "World" + >>> t"Hello {name}!".strings + ('Hello ', '!') + + Empty strings *are* included in the tuple: + + >>> name = "World" + >>> t"Hello {name}{name}!".strings + ('Hello ', '', '!') + + The ``strings`` tuple is never empty, and always contains one more + string than the ``interpolations`` and ``values`` tuples: + + >>> t"".strings + ('',) + >>> t"{'cheese'}".strings + ('', '') + >>> t"{'cheese'}".values + ('cheese',) + + .. attribute:: interpolations + :type: tuple[Interpolation, ...] + + A tuple of the interpolations in the template. + + >>> name = "World" + >>> t"Hello {name}!".interpolations + (Interpolation('World', 'name', None, ''),) + + The ``interpolations`` tuple may be empty and always contains one fewer + value than the ``strings`` tuple: + + >>> t"Hello!".interpolations + () + + .. attribute:: values + :type: tuple[Any, ...] + + A tuple of all interpolated values in the template. + + >>> name = "World" + >>> t"Hello {name}!".values + ('World',) + + The ``values`` tuple is always the same length as the + ``interpolations`` tuple. It is equivalent to + ``tuple(i.value for i in template.interpolations)``. + + .. method:: __iter__() + + Iterate over the template, yielding each string and + :class:`Interpolation` in order. 
+ + >>> name = "World" + >>> list(t"Hello {name}!") + ['Hello ', Interpolation('World', 'name', None, ''), '!'] + + Empty strings are *not* included in the iteration: + + >>> name = "World" + >>> list(t"Hello {name}{name}") + ['Hello ', Interpolation('World', 'name', None, ''), Interpolation('World', 'name', None, '')] + + .. method:: __add__(other) + + Concatenate this template with another, returning a new + :class:`!Template` instance: + + >>> name = "World" + >>> list(t"Hello " + t"there {name}!") + ['Hello there ', Interpolation('World', 'name', None, ''), '!'] + + Concatenation between a :class:`!Template` and a ``str`` is *not* supported. + This is because it is ambiguous whether the string should be treated as + a static string or an interpolation. If you want to concatenate a + :class:`!Template` with a string, you should either wrap the string + directly in a :class:`!Template` (to treat it as a static string) or use + an :class:`!Interpolation` (to treat it as dynamic): + + >>> from string.templatelib import Template, Interpolation + >>> template = t"Hello " + >>> # Treat "there " as a static string + >>> template += Template("there ") + >>> # Treat name as an interpolation + >>> name = "World" + >>> template += Template(Interpolation(name, "name")) + >>> list(template) + ['Hello there ', Interpolation('World', 'name', None, '')] + + +.. class:: Interpolation(value, expression="", conversion=None, format_spec="") + + Create a new :class:`!Interpolation` object. + + :param value: The evaluated, in-scope result of the interpolation. + :type value: object + + :param expression: The text of a valid Python expression, or an empty string. + :type expression: str + + :param conversion: The optional :ref:`conversion ` to be used, one of ``r``, ``s``, or ``a``. + :type conversion: Literal["a", "r", "s"] | None + + :param format_spec: An optional, arbitrary string used as the :ref:`format specification ` to present the value. 
+ :type format_spec: str + + The :class:`!Interpolation` type represents an expression inside a template string. + + :class:`!Interpolation` instances are shallow immutable: their attributes cannot be + reassigned. + + .. property:: value + + :returns: The evaluated value of the interpolation. + :rtype: object + + >>> t"{1 + 2}".interpolations[0].value + 3 + + .. property:: expression + + :returns: The text of a valid Python expression, or an empty string. + :rtype: str + + The :attr:`~Interpolation.expression` is the original text of the + interpolation's Python expression, if the interpolation was created + from a t-string literal. Developers creating interpolations manually + should either set this to an empty string or choose a suitable valid + Python expression. + + >>> t"{1 + 2}".interpolations[0].expression + '1 + 2' + + .. property:: conversion + + :returns: The conversion to apply to the value, or ``None`` + :rtype: Literal["a", "r", "s"] | None + + The :attr:`!Interpolation.conversion` is the optional conversion to apply + to the value: + + >>> t"{1 + 2!a}".interpolations[0].conversion + 'a' + + .. note:: + + Unlike f-strings, where conversions are applied automatically, + the expected behavior with t-strings is that code that *processes* the + :class:`!Template` will decide how to interpret and whether to apply + the :attr:`!Interpolation.conversion`. + + .. property:: format_spec + + :returns: The format specification to apply to the value. + :rtype: str + + The :attr:`!Interpolation.format_spec` is an optional, arbitrary string + used as the format specification to present the value: + + >>> t"{1 + 2:.2f}".interpolations[0].format_spec + '.2f' + + .. note:: + + Unlike f-strings, where format specifications are applied automatically + via the :func:`format` protocol, the expected behavior with + t-strings is that code that *processes* the :class:`!Template` will + decide how to interpret and whether to apply the format specification. 
+ As a result, :attr:`!Interpolation.format_spec` values in + :class:`!Template` instances can be arbitrary strings, even those that + do not necessarily conform to the rules of Python's :func:`format` + protocol. + + .. property:: __match_args__ + + :returns: A tuple of the attributes to use for structural pattern matching. + + The tuple returned is ``('value', 'expression', 'conversion', 'format_spec')``. + This allows for :ref:`pattern matching ` on :class:`!Interpolation` instances. + diff --git a/Doc/reference/compound_stmts.rst b/Doc/reference/compound_stmts.rst index e95fa3a6424e23..a416cbb4cc8eab 100644 --- a/Doc/reference/compound_stmts.rst +++ b/Doc/reference/compound_stmts.rst @@ -852,8 +852,8 @@ A literal pattern corresponds to most The rule ``strings`` and the token ``NUMBER`` are defined in the :doc:`standard Python grammar <./grammar>`. Triple-quoted strings are -supported. Raw strings and byte strings are supported. :ref:`f-strings` are -not supported. +supported. Raw strings and byte strings are supported. :ref:`f-strings` +and :ref:`t-strings` are not supported. The forms ``signed_number '+' NUMBER`` and ``signed_number '-' NUMBER`` are for expressing :ref:`complex numbers `; they require a real number diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 567c70111c20ec..fc385a61cd36b9 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -913,6 +913,44 @@ See also :pep:`498` for the proposal that added formatted string literals, and :meth:`str.format`, which uses a related format string mechanism. +.. _t-strings: +.. _template-string-literals: + +t-strings +--------- + +.. versionadded:: 3.14 + +A :dfn:`template string literal` or :dfn:`t-string` is a string literal +that is prefixed with ``'t'`` or ``'T'``. 
These strings follow the same +syntax and evaluation rules as :ref:`formatted string literals `, with +the following differences: + +- Rather than evaluating to a ``str`` object, t-strings evaluate to a + :class:`~string.templatelib.Template` object from the + :mod:`string.templatelib` module. + +- The :func:`format` protocol is not used. Instead, the format specifier and + conversions (if any) are passed to a new :class:`~string.templatelib.Interpolation` + object that is created for each evaluated expression. It is up to code that + processes the resulting :class:`~string.templatelib.Template` object to + decide how to handle format specifiers and conversions. + +- Format specifiers containing nested replacement fields are evaluated eagerly, + prior to being passed to the :class:`~string.templatelib.Interpolation` object. + +- When the equal sign ``'='`` is provided in an interpolation expression, the + resulting :class:`~string.templatelib.Template` object will have the expression + text along with a ``'='`` character placed in its + :attr:`~string.templatelib.Template.strings` attribute. The + :attr:`~string.templatelib.Template.interpolations` attribute will also + contain an ``Interpolation`` instance for the expression. By default, the + :attr:`~string.templatelib.Interpolation.conversion` attribute will be set to + ``'r'`` (i.e. :func:`repr`), unless there is a conversion explicitly specified + (in which case it overrides the default) or a format specifier is provided (in + which case, the ``conversion`` defaults to ``None``). + + .. _numbers: Numeric literals @@ -1044,7 +1082,7 @@ readability:: Either of these parts, but not both, can be empty. For example:: - 10. # (equivalent to 10.0) + 10. 
# (equivalent to 10.0) .001 # (equivalent to 0.001) Optionally, the integer and fraction may be followed by an *exponent*: diff --git a/Doc/sphinx-warnings.txt b/Doc/sphinx-warnings.txt new file mode 100644 index 00000000000000..e69de29bb2d1d6 diff --git a/Doc/tutorial/inputoutput.rst b/Doc/tutorial/inputoutput.rst index 35b8c7cd8eb049..dbf14fe142c921 100644 --- a/Doc/tutorial/inputoutput.rst +++ b/Doc/tutorial/inputoutput.rst @@ -34,13 +34,28 @@ printing space-separated values. There are several ways to format output. >>> f'Results of the {year} {event}' 'Results of the 2016 Referendum' +* When greater control is needed, :ref:`template string literals ` + can be useful. T-strings -- which begin with ``t`` or ``T`` -- share the + same syntax as f-strings but, unlike f-strings, produce a + :class:`~string.templatelib.Template` instance rather than a simple ``str``. + Templates give you access to the static and interpolated (in curly braces) + parts of the string *before* they are combined into a final string. + + :: + + >>> name = "World" + >>> template = t"Hello {name}!" + >>> template.strings + ('Hello ', '!') + >>> template.values + ('World',) + * The :meth:`str.format` method of strings requires more manual effort. You'll still use ``{`` and ``}`` to mark where a variable will be substituted and can provide detailed formatting directives, but you'll also need to provide the information to be formatted. In the following code block there are two examples of how to format variables: - :: >>> yes_votes = 42_572_654 @@ -95,10 +110,11 @@ Some examples:: >>> repr((x, y, ('spam', 'eggs'))) "(32.5, 40000, ('spam', 'eggs'))" -The :mod:`string` module contains a :class:`~string.Template` class that offers -yet another way to substitute values into strings, using placeholders like -``$x`` and replacing them with values from a dictionary, but offers much less -control of the formatting. 
+The :mod:`string` module also contains support for so-called +:ref:`$-strings ` that offer yet another way to +substitute values into strings, using placeholders like ``$x`` and replacing +them with values from a dictionary. This syntax is easy to use, although +it offers much less control of the formatting. .. index:: single: formatted string literal @@ -160,6 +176,185 @@ See :ref:`self-documenting expressions ` for more informatio on the ``=`` specifier. For a reference on these format specifications, see the reference guide for the :ref:`formatspec`. + +.. _tut-t-strings: + +Template String Literals +------------------------- + +:ref:`Template string literals ` (also called t-strings for short) +are an extension of :ref:`f-strings ` that let you access the +static and interpolated parts of a string *before* they are combined into a +final string. This provides for greater control over how the string is +formatted. + +The most common way to create a :class:`~string.templatelib.Template` instance +is to use the :ref:`t-string literal syntax `. This syntax is +identical to that of :ref:`f-strings` except that it uses a ``t`` instead of +an ``f``: + + >>> name = "World" + >>> template = t"Hello {name}!" + >>> template.strings + ('Hello ', '!') + >>> template.values + ('World',) + +:class:`!Template` instances are iterable, yielding each +string and :class:`~string.templatelib.Interpolation` in order: + +.. testsetup:: + + name = "World" + template = t"Hello {name}!" + +.. doctest:: + + >>> list(template) + ['Hello ', Interpolation('World', 'name', None, ''), '!'] + +Interpolations represent expressions inside a t-string. They contain the +evaluated value of the expression (``'World'`` in this example), the text of +the original expression (``'name'``), and optional conversion and format +specification attributes. + +Templates can be processed in a variety of ways. 
For instance, here's code that +converts static strings to lowercase and interpolated values to uppercase: + + >>> from string.templatelib import Template + >>> + >>> def lower_upper(template: Template) -> str: + ... return ''.join( + ... part.lower() if isinstance(part, str) else part.value.upper() + ... for part in template + ... ) + ... + >>> name = "World" + >>> template = t"Hello {name}!" + >>> lower_upper(template) + 'hello WORLD!' + +Template strings are particularly useful for sanitizing user input. Imagine +we're building a web application that has user profile pages. Perhaps the +``User`` class is defined like this: + + >>> from dataclasses import dataclass + >>> + >>> @dataclass + ... class User: + ... name: str + ... + +Imagine using f-strings to generate HTML for the ``User``: + +.. testsetup:: + + class User: + name: str + def __init__(self, name: str): + self.name = name + + +.. doctest:: + + >>> # Warning: this is dangerous code. Don't do this! + >>> def user_html(user: User) -> str: + ... return f"

{user.name}

" + ... + +This code is dangerous because our website lets users type in their own names. +If a user types in a name like ``""``, the +browser will execute that script when someone else visits their profile page. +This is called a *cross-site scripting (XSS) vulnerability*, and it is a form +of *injection vulnerability*. Injection vulnerabilities occur when user input +is included in a program without proper sanitization, allowing malicious code +to be executed. The same sorts of vulnerabilities can occur when user input is +included in SQL queries, command lines, or other contexts where the input is +interpreted as code. + +To prevent this, instead of using f-strings, we can use t-strings. Let's +update our ``user_html()`` function to return a :class:`~string.templatelib.Template`: + + >>> from string.templatelib import Template + >>> + >>> def user_html(user: User) -> Template: + ... return t"

{user.name}

" + +Now let's implement a function that sanitizes *any* HTML :class:`!Template`: + + >>> from html import escape + >>> from string.templatelib import Template + >>> + >>> def sanitize_html_template(template: Template) -> str: + ... return ''.join( + ... part if isinstance(part, str) else escape(part.value) + ... for part in template + ... ) + ... + +This function iterates over the parts of the :class:`!Template`, escaping any +interpolated values using the :func:`html.escape` function, which converts +special characters like ``<``, ``>``, and ``&`` into their HTML-safe +equivalents. + +Now we can tie it all together: + +.. testsetup:: + + from dataclasses import dataclass + from string.templatelib import Template + from html import escape + @dataclass + class User: + name: str + def sanitize_html_template(template: Template) -> str: + return ''.join( + part if isinstance(part, str) else escape(part.value) + for part in template + ) + def user_html(user: User) -> Template: + return t"

{user.name}

" + +.. doctest:: + + >>> evil_user = User(name="") + >>> template = user_html(evil_user) + >>> safe = sanitize_html_template(template) + >>> print(safe) +

&lt;script&gt;alert(&#x27;evil&#x27;);&lt;/script&gt;

+ +We are no longer vulnerable to XSS attacks because we are escaping the +interpolated values before they are included in the rendered HTML. + +Of course, there's no need for code that processes :class:`!Template` instances +to be limited to returning a simple string. For instance, we could imagine +defining a more complex ``parse_html()`` function that returns a structured +representation of the HTML: + + >>> from dataclasses import dataclass + >>> from string.templatelib import Template + >>> from html.parser import HTMLParser + >>> + >>> @dataclass + ... class Element: + ... tag: str + ... attributes: dict[str, str] + ... children: list[str | Element] + ... + >>> def parse_html(template: Template) -> Element: + ... """ + ... Uses Python's built-in HTMLParser to parse the template, + ... handle any interpolated values, and return a tree of + ... Element instances. + ... """ + ... pass + ... + +A full implementation of this function would be quite complex and is not +provided here. That said, the fact that it is possible to implement a function +like ``parse_html()`` showcases the flexibility and power of t-strings. + + .. _tut-string-format: The String format() Method