python315/CVE-2025-15282-urllib-ctrl-chars.patch
Matěj Cepl 9205d3700f Update to 3.15.0a5:
  - Tools/Demos
    - gh-142095: Make gdb ‘py-bt’ command use frame from thread
      local state when available. Patch by Sam Gross and Victor
      Stinner.
  - Tests
    - gh-143460: Skip tests relying on infinite recursion if stack
      size is unlimited.
    - gh-143553: Add support for parametrized resources, such as
      -u xpickle=2.7.
    - bpo-31391: Forward-port test_xpickle from Python 2 to
      Python 3 and add the resource back to test’s command line.
  - Library
    - gh-143706: Fix multiprocessing forkserver so that sys.argv
      is correctly set before __main__ is preloaded. Previously,
      sys.argv was empty during main module import in forkserver
      child processes. This fixes a regression introduced in
      3.13.8 and 3.14.1. Root caused by Aaron Wieczorek, test
      provided by Thomas Watson, thanks!
    - gh-143638: Forbid reentrant calls of the pickle.Pickler and
      pickle.Unpickler methods for the C implementation.
      Previously, this could cause crash or data corruption, now
      concurrent calls of methods of the same object raise
      RuntimeError.
    - gh-143658: importlib.metadata: Use str.translate() to
      improve performance of
      importlib.metadata.Prepared.normalize(). Patch by Hugo van
      Kemenade and Henry Schreiner.
    - gh-78724: Raise RuntimeError when the user attempts to call
      methods on half-initialized Struct objects, for example
      those created by Struct.__new__(Struct). Patch by Sergey
      B Kirpichev.
    - gh-143196: Fix crash when the internal encoder object
      returned by undocumented function
      json.encoder.c_make_encoder() was called with non-zero
      second (_current_indent_level) argument.
    - gh-143191: _thread.stack_size() now raises ValueError if
      the stack size is too small. Patch by Victor Stinner.
    - gh-143547: Fix sys.unraisablehook() when the hook raises an
      exception and changes sys.unraisablehook(): hold a strong
      reference to the old hook. Patch by Victor Stinner.
    - gh-139686: Revert 0a97941245f1dda6d838f9aaf0512104e5253929
      and 57db12514ac686f0a752ec8fe1c08b6daa0c6219 which made
      importlib.reload a no-op for lazy modules; caused Buildbot
      failures.
    - gh-143517: annotationlib.get_annotations() no longer raises
      a SyntaxError when evaluating a stringified starred
      annotation that starts with one or more whitespace
      characters followed by a *. Patch by Bartosz Sławecki.
    - gh-143474: Add os.RWF_ATOMIC constant for Linux 6.11+.
    - gh-143445: Speed up copy.deepcopy() by 1.04x.
    - gh-143378: Fix use-after-free crashes when a BytesIO object
      is concurrently mutated during write() or writelines().
    - gh-143368: Fix endless retry loop in profiling.sampling
      blocking mode when threads cannot be seized due to EPERM.
      Such threads are now skipped instead of causing repeated
      error messages. Patch by Pablo Galindo.
    - gh-143346: Fix incorrect wrapping of the Base64 data in
      plistlib._PlistWriter when the indent contains a mix of
      tabs and spaces.
    - gh-140025: queue: Fix SimpleQueue.__sizeof__() computation.
    - gh-143310: tkinter: fix a crash when a Python list is
      mutated during the conversion to a Tcl object (e.g., when
      setting a Tcl variable). Patch by Bénédikt Tran.
    - gh-143309: Fix a crash in os.execve() on non-Windows
      platforms when given a custom environment mapping which is
      then mutated during parsing. Patch by Bénédikt Tran.
    - gh-143308: pickle: fix use-after-free crashes when
      a PickleBuffer is concurrently mutated by a custom buffer
      callback during pickling. Patch by Bénédikt Tran and Aaron
      Wieczorek.
    - gh-142939: Performance optimisations for
      difflib.get_close_matches().
    - gh-124951: The base64 implementation behind the binascii,
      base64, and related codec has been optimized for modern
      pipelined CPU architectures and now performs 2-3x faster
      across all platforms.
    - gh-143237: Fix support of named pipes in the rotating
      logging handlers.
    - gh-143249: Fix possible buffer leaks in Windows overlapped
      I/O on error handling.
    - gh-143241: zoneinfo: fix infinite loop in
      ZoneInfo.from_file when parsing a malformed TZif file.
      Patch by Fatih Celik.
    - gh-142830: sqlite3: fix use-after-free crashes when the
      connection’s callbacks are mutated during a callback
      execution. Patch by Bénédikt Tran.
    - gh-143200: xml.etree.ElementTree: fix use-after-free
      crashes in __getitem__() and __setitem__() methods of
      Element when the element is concurrently mutated. Patch by
      Bénédikt Tran.
    - gh-143214: Add the wrapcol parameter in
      binascii.b2a_base64() and base64.b64encode().
    - gh-142195: Updated timeout evaluation logic in subprocess
      to be compatible with deterministic environments like
      Shadow where time moves exactly as requested.
    - gh-140739: Fix several crashes due to reading invalid
      memory in the new Tachyon sampling profiler. Patch by Pablo
      Galindo.
    - gh-142164: Fix the ctypes bitfield overflow error message
      to report the correct offset and size calculation.
    - gh-143145: Fixed a possible reference leak in ctypes when
      constructing results with multiple output parameters on
      error.
    - gh-143103: Add padding support to base64.z85encode() via
      the pad parameter.
    - gh-130796: Undeprecate the locale.getdefaultlocale()
      function. Patch by Victor Stinner.
    - gh-74902: Add the iter_graphemes() function in the
      unicodedata module to iterate over grapheme clusters
      according to rules defined in Unicode Standard Annex #29,
      “Unicode Text Segmentation”. Add grapheme_cluster_break(),
      indic_conjunct_break() and extended_pictographic()
      functions to get the properties of the character which are
      related to the above algorithm.
    - gh-143004: Fix a potential use-after-free in
      collections.Counter.update() when user code mutates the
      Counter during an update.
    - gh-140648: The asyncio REPL now respects the -I flag
      (isolated mode). Previously, it would load and execute
      PYTHONSTARTUP even if the flag was set. Contributed by
      Bartosz Sławecki.
    - gh-142991: Fixed socket operations such as recvfrom() and
      sendto() for FreeBSD divert(4) socket.
    - gh-116738: Make the attributes in lzma thread-safe on the
      free threaded build.
    - gh-142950: Fix regression in argparse where format
      specifiers in help strings raised ValueError.
    - gh-142881: Fix concurrent and reentrant call of
      atexit.unregister().
    - gh-142615: Fix possible crashes when initializing
      asyncio.Task or asyncio.Future multiple times. These
      classes can now be initialized only once and any subsequent
      initialization attempt will raise a RuntimeError. Patch by
      Kumar Aditya.
    - gh-142517: The non-compat32 email policies now correctly
      handle refolding encoded words that contain bytes that can
      not be decoded in their specified character set. Previously
      this resulted in an encoding exception during folding.
    - gh-138122: The Tachyon profiler’s live TUI now integrates
      with the experimental _colorize theming system. Users can
      customize colors via _colorize.set_theme() (experimental
      API, subject to change). A LiveProfilerLight theme is
      provided for light terminal backgrounds. Patch by Pablo
      Galindo.
    - gh-142306: Improve errors for Element.remove.
    - gh-63016: Add a flags parameter to mmap.mmap.flush() to
      control synchronization behavior.
    - gh-139262: Some keystrokes can be swallowed in the new
      PyREPL on Windows, especially when used together with the
      ALT key. Fix by Chris Eibl.
    - gh-138897: Improved license/copyright/credits display in
      the REPL: now uses a pager.
    - gh-135852: Add _winapi.RegisterEventSource(),
      _winapi.DeregisterEventSource() and _winapi.ReportEvent().
      Use these functions in NTEventLogHandler to replace
      pywin32.
    - gh-109263: Starting a process from spawn context in
      multiprocessing no longer sets the start method globally.
    - gh-132715: Skip writing objects during marshalling once
      a failure has occurred.
  - Documentation
    - gh-140806: Add documentation for enum.bin().
  - Core and Builtins
    - gh-134584: Eliminate redundant refcounting from
      _CONTAINS_OP, _CONTAINS_OP_SET and _CONTAINS_OP_DICT.
    - gh-143604: Fix a reference counting issue in the JIT tracer
      where the current executor could be prematurely freed
      during tracing.
    - gh-143469: Enable LOAD_ATTR_MODULE specialization even if
      __getattr__() is defined in module.
    - gh-134584: Eliminate redundant refcounting from
      TO_BOOL_STR.
    - gh-143377: Fix a crash in _interpreters.capture_exception()
      when the exception is incorrectly formatted. Patch by
      Bénédikt Tran.
    - gh-139757: Add BINARY_OP_SUBSCR_USTR_INT to specialize
      reading an ASCII character from any string. Patch by Chris
      Eibl.
    - gh-141504: Factor out tracing and optimization heuristics
      into a single object. Patch by Donghee Na.
    - gh-142982: Specialize CALL_FUNCTION_EX for Python and
      non-Python callables.
    - gh-136924: The interactive help mode in the REPL no longer
      incorrectly syntax highlights text input as Python code.
      Contributed by Olga Matoula.
    - gh-139757: Fix unintended bytecode specialization for
      non-ascii string. Patch by Donghee Na, Ken Jin and Chris
      Eibl.
    - gh-143361: Add PY_VECTORCALL_ARGUMENTS_OFFSET to
      _Py_CallBuiltinClass_StackRefSteal to avoid redundant
      allocations
    - gh-131798: The JIT optimizer now understands more generator
      instructions.
    - gh-134584: Eliminate redundant refcounting from
      _LOAD_ATTR_SLOT.
    - gh-143189: Fix crash when inserting a non-str key into
      a split table dictionary when the key matches an existing
      key in the split table but has no corresponding value in
      the dict.
    - gh-143228: Fix use-after-free in perf trampoline when
      toggling profiling while threads are running or during
      interpreter finalization with daemon threads active. The
      fix uses reference counting to ensure trampolines are not
      freed while any code object could still reference them.
      Patch by Pablo Galindo.
    - gh-142664: Fix a use-after-free crash in
      memoryview.__hash__ when the __hash__ method of the
      referenced object mutates that object or the view. Patch by
      Bénédikt Tran.
    - gh-142557: Fix a use-after-free crash in bytearray.__mod__
      when the bytearray is mutated while formatting the %-style
      arguments. Patch by Bénédikt Tran.
    - gh-143195: Fix use-after-free crashes in bytearray.hex()
      and memoryview.hex() when the separator’s __len__() mutates
      the original object. Patch by Bénédikt Tran.
    - gh-143183: Fix a bug in the JIT when dealing with
      unsupported control-flow or operations.
    - gh-142975: Fix crash after unfreezing all objects tracked
      by the garbage collector on the free threaded build.
    - gh-143135: Set sys.flags.inspect to 1 when PYTHONINSPECT is
      0. Previously, it was set to 0 in this case.
    - gh-143123: Protect the JIT against recursive tracing.
    - gh-143092: Fix a crash in the JIT when dealing with
      list.append(x) style code.
    - gh-143003: Fix an overflow of the shared empty buffer in
      bytearray.extend() when __length_hint__() returns 0 for
      non-empty iterator.
    - gh-143006: Fix a possible assertion error when comparing
      negative non-integer float and int with the same number of
      bits in the integer part.
    - gh-116738: Fix thread safety of contextvars.Context.run().
    - gh-142829: Fix a use-after-free crash in
      contextvars.Context comparison when a custom __eq__ method
      modifies the context via set().
    - gh-142863: Generate optimized bytecode when calling list or
      set with generator expression.
    - gh-41779: Allowed defining any __slots__ for a class
      derived from tuple (including classes created by
      collections.namedtuple()).
    - gh-69605: Fix edge-cases around already imported modules in
      the REPL auto-completion of imports.
    - gh-138568: Adjusted the built-in help() function so that
      empty inputs are ignored in interactive mode.
    - gh-131798: Remove bounds check when indexing into tuples
      with a constant index.
    - gh-134584: Eliminate redundant refcounting from
      _CALL_TYPE_1. Patch by Tomas Roun
    - gh-132108: Speed up int.from_bytes() by ~1.2x when the passed
      object supports the buffer protocol, such as bytearray.
    - gh-128334: Make the slice class subscriptable at runtime to
      be consistent with typing implementation.
  - C API
    - gh-141671: PyMODINIT_FUNC (and the new PyMODEXPORT_FUNC)
      now adds a linkage declaration (__declspec(dllexport)) on
      Windows.
Update to 3.15.0a4:
  - Tests
    - gh-142836: Accommodated Solaris in
      test_pdb.test_script_target_anonymous_pipe.
  - Library
    - gh-122431: Corrected the error message in
      readline.append_history_file() to state that nelements must
      be non-negative instead of positive.
    - gh-143046: The asyncio REPL no longer prints copyright and
      version messages in the quiet mode (-q). Patch by Bartosz
      Sławecki.
    - gh-80744: Fix issue where pdb would read a .pdbrc twice if
      launched from the home directory.
    - gh-138122: Add blocking mode to Tachyon for accurate stack
      traces in applications with many generators or
      fast-changing call stacks. Patch by Pablo Galindo.
    - gh-143010: Fixed a bug in mailbox where the precise timing
      of an external event could result in the library opening an
      existing file instead of a file it expected to create.
    - gh-112127: Fix possible use-after-free in
      atexit.unregister() when the callback is unregistered
      during comparison.
    - gh-138122: Fix incomplete stack traces in the Tachyon
      profiler’s frame cache when profiling code with deeply
      nested generators. The frame cache now validates that stack
      traces reach the base frame before caching, preventing
      broken flamegraphs. Patch by Pablo Galindo.
    - gh-142834: Change the pdb commands command to use the last
      available breakpoint instead of failing when the most
      recently created breakpoint was deleted.
    - gh-142783: Fix zoneinfo use-after-free with descriptor
      _weak_cache. A descriptor used as _weak_cache could cause
      crashes during object creation. The fix ensures proper
      reference counting for descriptor-provided objects.
    - gh-76007: Deprecate VERSION from xml.etree.ElementTree and
      version from xml.sax.expatreader and xml.sax.handler. Patch
      by Hugo van Kemenade.
    - gh-142784: The asyncio REPL now properly closes the loop
      upon the end of interactive session. Previously, it could
      cause surprising warnings. Contributed by Bartosz Sławecki.
    - gh-138122: Add binary output format to profiling.sampling
      for compact storage of profiling data. The new --binary
      option captures samples to a file that can be converted to
      other formats using the replay command. Patch by Pablo
      Galindo
    - gh-142495: collections.defaultdict now prioritizes
      __setitem__() when inserting default values from
      default_factory. This prevents race conditions where
      a default value would overwrite a value set before
      default_factory returns.
    - gh-142654: Show a clearer error message when using
      profiling.sampling on an unknown PID.
    - gh-142560: Fix use-after-free in bytearray search-like
      methods (find(), count(), index(), rindex(), and rfind())
      by marking the storage as exported which causes
      reallocation attempts to raise BufferError. For contains(),
      split(), and rsplit() the buffer protocol is used for this.
    - gh-142419: mmap.mmap.set_name() method added to annotate an
      anonymous memory map if Linux kernel supports
      PR_SET_VMA_ANON_NAME (Linux 5.17 or newer). Patch by
      Donghee Na.
    - gh-139971: pydoc: Ensure that the link to the online
      documentation of a stdlib module is correct.
    - gh-124098: Fix issue where methods in handlers that lacked
      the protocol name but matched a valid base handler method
      (e.g., _open() or error()) were incorrectly added to
      urllib.request.OpenerDirector’s handlers. Contributed by
      Andrea Mattei.
    - gh-136282: Add support for UNNAMED_SECTION when creating
      a section via the mapping protocol access
  - Core and Builtins
    - gh-143057: Avoid locking in PyTraceMalloc_Track() and
      PyTraceMalloc_Untrack() when tracemalloc is not enabled.
    - gh-139109: Add missing terminator in certain cases when
      tracing in the new JIT compiler.
    - gh-142961: Fix a segfault in the JIT when constant folding
      len(tuple).
    - gh-142776: Fix a file descriptor leak in import.c
    - gh-139757: Fix building JIT stencils on free-threaded
      builds.
    - gh-129068: Make concurrent iteration over the same range
      iterator thread-safe in the free threading build.
    - gh-142543: Fix a stack overflow on Clang JIT build
      configurations with full LTO.
    - gh-142448: Fix a bug when using monitoring with the JIT.
    - gh-142766: Clear the frame of a generator when
      generator.close() is called.
    - gh-134584: Eliminate redundant refcounting from
      _LOAD_ATTR_INSTANCE_VALUE.
    - gh-134584: Eliminate redundant refcounting from
      _STORE_ATTR_WITH_HINT.
    - gh-142476: Fix a memory leak in the experimental Tier
      2 optimizer when creating executors. Patched by Shamil
      Abdulaev.
    - gh-100964: Fix reference cycle in exhausted generator
      frames. Patch by Savannah Ostrowski.
    - gh-139922: Allow building CPython with the tail calling
      interpreter on Visual Studio 2026 MSVC. This provides
      a performance gain over the prior interpreter for MSVC.
      Patch by Ken Jin, Brandt Bucher, and Chris Eibl. With help
      from the MSVC team including Hulon Jenkins.
Remove upstreamed patch:
  - longer-time-test_thread_time.patch
2026-02-08 14:38:36 +01:00

From d8850aac54c234201966c66e83225564302cd15c Mon Sep 17 00:00:00 2001
From: Seth Michael Larson <seth@python.org>
Date: Fri, 16 Jan 2026 10:54:09 -0600
Subject: [PATCH 1/2] Add 'test.support' fixture for C0 control characters
---
Lib/test/test_urllib.py | 8 ++++++++
Lib/urllib/request.py | 5 +++++
Misc/NEWS.d/next/Security/2026-01-16-11-51-19.gh-issue-143925.mrtcHW.rst | 1 +
3 files changed, 14 insertions(+)
Index: Python-3.15.0a5/Lib/test/test_urllib.py
===================================================================
--- Python-3.15.0a5.orig/Lib/test/test_urllib.py 2026-02-08 14:31:49.004578010 +0100
+++ Python-3.15.0a5/Lib/test/test_urllib.py 2026-02-08 14:34:10.667653549 +0100
@@ -10,6 +10,7 @@
 from test import support
 from test.support import os_helper
 from test.support import socket_helper
+from test.support import control_characters_c0
 import os
 import socket
 try:
@@ -590,6 +591,13 @@
         # missing padding character
         self.assertRaises(ValueError,urllib.request.urlopen,'data:;base64,Cg=')
 
+    def test_invalid_mediatype(self):
+        for c0 in control_characters_c0():
+            self.assertRaises(ValueError,urllib.request.urlopen,
+                              f'data:text/html;{c0},data')
+        for c0 in control_characters_c0():
+            self.assertRaises(ValueError,urllib.request.urlopen,
+                              f'data:text/html{c0};base64,ZGF0YQ==')
 
 class urlretrieve_FileTests(unittest.TestCase):
     """Test urllib.urlretrieve() on local files"""
Index: Python-3.15.0a5/Lib/urllib/request.py
===================================================================
--- Python-3.15.0a5.orig/Lib/urllib/request.py 2026-02-08 14:31:49.344934070 +0100
+++ Python-3.15.0a5/Lib/urllib/request.py 2026-02-08 14:34:10.668244681 +0100
@@ -1636,6 +1636,11 @@
         scheme, data = url.split(":",1)
         mediatype, data = data.split(",",1)
 
+        # Disallow control characters within mediatype.
+        if re.search(r"[\x00-\x1F\x7F]", mediatype):
+            raise ValueError(
+                "Control characters not allowed in data: mediatype")
+
         # even base64 encoded data URLs might be quoted so unquote in any case:
         data = unquote_to_bytes(data)
         if mediatype.endswith(";base64"):
Index: Python-3.15.0a5/Misc/NEWS.d/next/Security/2026-01-16-11-51-19.gh-issue-143925.mrtcHW.rst
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ Python-3.15.0a5/Misc/NEWS.d/next/Security/2026-01-16-11-51-19.gh-issue-143925.mrtcHW.rst 2026-02-08 14:34:10.668611672 +0100
@@ -0,0 +1 @@
+Reject control characters in ``data:`` URL media types.
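
The check added to Lib/urllib/request.py can be tried in isolation. The following is a minimal sketch, not the code in urllib itself: `split_data_url()` and `c0_control_characters()` are hypothetical names introduced here for illustration, and the second is only an assumed stand-in for the `control_characters_c0` fixture imported by the test, whose exact contents are not shown in this patch. The regular expression and the error message are copied from the hunk above.

```python
import re

# Same character class as the patch: C0 controls (0x00-0x1F) plus DEL (0x7F).
_CONTROL_CHARS = re.compile(r"[\x00-\x1F\x7F]")


def split_data_url(url):
    """Split a data: URL into (mediatype, data), rejecting control characters.

    Hypothetical helper for illustration only; it mirrors the two split()
    calls and the new check, and does no decoding of the payload.
    """
    scheme, rest = url.split(":", 1)
    mediatype, data = rest.split(",", 1)
    if _CONTROL_CHARS.search(mediatype):
        raise ValueError("Control characters not allowed in data: mediatype")
    return mediatype, data


def c0_control_characters():
    """Assumed stand-in for the test fixture: every character the regex rejects."""
    return [chr(c) for c in range(0x00, 0x20)] + ["\x7f"]


if __name__ == "__main__":
    print(split_data_url("data:text/html;base64,ZGF0YQ=="))
    for c0 in c0_control_characters():
        try:
            split_data_url(f"data:text/html{c0};base64,ZGF0YQ==")
        except ValueError:
            continue
        raise SystemExit(f"control character {c0!r} was not rejected")
```

In this sketch only the media type, the part before the first comma, is inspected. The payload after the comma is left untouched, which matches where the patch places its check.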