From 4d2d58930f7c4748d213ed5e38a4b7517d95a3f1 Mon Sep 17 00:00:00 2001
From: Serhiy Storchaka
Date: Fri, 7 Jul 2023 15:38:30 +0300
Subject: [PATCH 1/2] gh-106482: Clarify documentation of character set in RE

---
 Doc/library/re.rst | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/Doc/library/re.rst b/Doc/library/re.rst
index b7510b93d75427..f3904a8f80cd27 100644
--- a/Doc/library/re.rst
+++ b/Doc/library/re.rst
@@ -250,15 +250,22 @@ The special characters are:
      ``[a\-z]``) or if it's placed as the first or last character
      (e.g. ``[-a]`` or ``[a-]``), it will match a literal ``'-'``.
 
-   * Special characters lose their special meaning inside sets. For example,
+   * Special characters except backslash lose their special meaning inside sets.
+     For example,
      ``[(+*)]`` will match any of the literal characters ``'('``, ``'+'``,
      ``'*'``, or ``')'``.
 
    .. index:: single: \ (backslash); in regular expressions
 
-   * Character classes such as ``\w`` or ``\S`` (defined below) are also accepted
-     inside a set, although the characters they match depends on whether
-     :const:`ASCII` or :const:`LOCALE` mode is in force.
+   * Backslash either escapes characters which have special meaning in a set
+     such as ``'-'``, ``']'``, ``'^'`` and ``'\\'`` itself or signals
+     a special sequence which represents a single character such as
+     ``\xa0`` or ``\n`` or a character class such as ``\w`` or ``\S``
+     (defined below).
+     Note that ``\b`` is used to represent a single "backspace" character,
+     not word boundaries as outside a set.
+     Special sequences which do not match a single character such as ``\A``
+     and ``\Z`` are not allowed.
 
    .. index:: single: ^ (caret); in regular expressions
 

From 2907f262835adc315a566148d6ff16d17e6dbee2 Mon Sep 17 00:00:00 2001
From: Serhiy Storchaka
Date: Sun, 9 Jul 2023 10:47:30 +0300
Subject: [PATCH 2/2] Update Doc/library/re.rst

Co-authored-by: Martin Panter
---
 Doc/library/re.rst | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Doc/library/re.rst b/Doc/library/re.rst
index f3904a8f80cd27..87461af78a871d 100644
--- a/Doc/library/re.rst
+++ b/Doc/library/re.rst
@@ -262,8 +262,9 @@ The special characters are:
      a special sequence which represents a single character such as
      ``\xa0`` or ``\n`` or a character class such as ``\w`` or ``\S``
      (defined below).
-     Note that ``\b`` is used to represent a single "backspace" character,
-     not word boundaries as outside a set.
+     Note that ``\b`` represents a single "backspace" character,
+     not a word boundary as outside a set, and numeric escapes
+     such as ``\1`` are always octal escapes, not group references.
      Special sequences which do not match a single character such as ``\A``
      and ``\Z`` are not allowed.
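
Reviewer note (not part of the patch): the behaviour the new wording describes can be checked with a short script. This is only a sketch against the standard `re` module on a recent CPython 3; the helper name `compiles` is made up for illustration and appears in neither the patch nor `re.rst`.

```python
import re


def compiles(pattern):
    """Hypothetical helper: return True if `pattern` compiles, False on re.error."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


# Special characters other than backslash are literal inside a set.
literal_hits = re.findall(r'[(+*)]', '(a+b)*c')

# Backslash escapes the characters that are special in a set: '-', ']', '^', '\'.
escaped_set = re.compile(r'[\-\]\^\\]')
escapes_ok = all(escaped_set.fullmatch(ch) for ch in '-]^\\')

# Single-character sequences and character classes are accepted in a set.
class_hits = re.findall(r'[\w.]', 'a.b!')
newline_ok = re.fullmatch(r'[\xa0\n]', '\n') is not None

# Inside a set, \b is the backspace character (U+0008), not a word boundary.
backspace_ok = re.fullmatch(r'[\b]', '\x08') is not None
boundary_hit = re.search(r'[\b]', 'a b')

# Inside a set, \1 is an octal escape (chr(1)), not a reference to group 1.
octal_ok = re.fullmatch(r'(a)[\1]', 'a\x01') is not None
backref_hit = re.fullmatch(r'(a)[\1]', 'aa')

# Sequences that do not match a single character are rejected inside a set.
anchor_a_ok = compiles(r'[\A]')
anchor_z_ok = compiles(r'[\Z]')
```

With these checks, `boundary_hit` and `backref_hit` should both be `None`, and both `anchor_*_ok` flags should be `False`, which matches the text added in PATCH 1/2 and PATCH 2/2.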