Commit c74d0dc

gh-121999: Change default tarfile filter to 'data'
1 parent 7b36b67 commit c74d0dc

4 files changed

+21
-63
lines changed

Doc/library/tarfile.rst

Lines changed: 19 additions & 22 deletions
@@ -40,9 +40,11 @@ Some facts and figures:
 Archives are extracted using a :ref:`filter <tarfile-extraction-filter>`,
 which makes it possible to either limit surprising/dangerous features,
 or to acknowledge that they are expected and the archive is fully trusted.
-By default, archives are fully trusted, but this default is deprecated
-and slated to change in Python 3.14.

+.. versionchanged:: 3.14
+The default extraction filter was 'fully trusted' but is now 'data' which
+which disallows dangerous features like links to absolute paths or paths
+outside the destination.

 .. function:: open(name=None, mode='r', fileobj=None, bufsize=10240, **kwargs)

@@ -495,19 +497,23 @@ be finalized; only the internally used file object will be closed. See the
 The *filter* argument specifies how ``members`` are modified or rejected
 before extraction.
 See :ref:`tarfile-extraction-filter` for details.
-It is recommended to set this explicitly depending on which *tar* features
-you need to support.
+It is recommended to set this explicitly only if unusual *tar* features
+are required.

 .. warning::

-Never extract archives from untrusted sources without prior inspection.
+The default filter is set to ``filter='data'`` to prevent the most
+dangerous security issues, read the :ref:`tarfile-extraction-filter`
+section for details.
+
+Never extract archives from untrusted sources without prior inspection,
+even when using the ``'data'`` filter, but especially if using the
+``'tar'`` or ``'fully_trusted'`` filters.
+
 It is possible that files are created outside of *path*, e.g. members
 that have absolute filenames starting with ``"/"`` or filenames with two
 dots ``".."``.

-Set ``filter='data'`` to prevent the most dangerous security issues,
-and read the :ref:`tarfile-extraction-filter` section for details.
-
 .. versionchanged:: 3.5
 Added the *numeric_owner* parameter.

@@ -538,8 +544,9 @@ be finalized; only the internally used file object will be closed. See the

 See the warning for :meth:`extractall`.

-Set ``filter='data'`` to prevent the most dangerous security issues,
-and read the :ref:`tarfile-extraction-filter` section for details.
+The default filter is set to ``filter='data'`` to prevent the most
+dangerous security issues, read the :ref:`tarfile-extraction-filter`
+section for details.

 .. versionchanged:: 3.2
 Added the *set_attrs* parameter.
@@ -603,12 +610,7 @@ be finalized; only the internally used file object will be closed. See the
 argument to :meth:`~TarFile.extract`.

 If ``extraction_filter`` is ``None`` (the default),
-calling an extraction method without a *filter* argument will raise a
-``DeprecationWarning``,
-and fall back to the :func:`fully_trusted <fully_trusted_filter>` filter,
-whose dangerous behavior matches previous versions of Python.
-
-In Python 3.14+, leaving ``extraction_filter=None`` will cause
+calling an extraction method without a *filter* argument will cause
 extraction methods to use the :func:`data <data_filter>` filter by default.

 The attribute may be set on instances or overridden in subclasses.
@@ -992,12 +994,7 @@ can be:

 * ``None`` (default): Use :attr:`TarFile.extraction_filter`.

-If that is also ``None`` (the default), raise a ``DeprecationWarning``,
-and fall back to the ``'fully_trusted'`` filter, whose dangerous behavior
-matches previous versions of Python.
-
-In Python 3.14, the ``'data'`` filter will become the default instead.
-It's possible to switch earlier; see :attr:`TarFile.extraction_filter`.
+If that is also ``None`` (the default), the ``'data'`` filter will be used.

 * A callable which will be called for each extracted member with a
 :ref:`TarInfo <tarinfo-objects>` describing the member and the destination

Lib/tarfile.py

Lines changed: 1 addition & 7 deletions
@@ -2248,13 +2248,7 @@ def _get_filter_function(self, filter):
         if filter is None:
             filter = self.extraction_filter
             if filter is None:
-                import warnings
-                warnings.warn(
-                    'Python 3.14 will, by default, filter extracted tar '
-                    + 'archives and reject files or modify their metadata. '
-                    + 'Use the filter argument to control this behavior.',
-                    DeprecationWarning, stacklevel=3)
-                return fully_trusted_filter
+                return data_filter
             if isinstance(filter, str):
                 raise TypeError(
                     'String names are not supported for '

Lib/test/test_tarfile.py

Lines changed: 0 additions & 34 deletions
@@ -738,31 +738,6 @@ def test_extract_directory(self):
         finally:
             os_helper.rmtree(DIR)

-    def test_deprecation_if_no_filter_passed_to_extractall(self):
-        DIR = pathlib.Path(TEMPDIR) / "extractall"
-        with (
-            os_helper.temp_dir(DIR),
-            tarfile.open(tarname, encoding="iso8859-1") as tar
-        ):
-            directories = [t for t in tar if t.isdir()]
-            with self.assertWarnsRegex(DeprecationWarning, "Use the filter argument") as cm:
-                tar.extractall(DIR, directories)
-            # check that the stacklevel of the deprecation warning is correct:
-            self.assertEqual(cm.filename, __file__)
-
-    def test_deprecation_if_no_filter_passed_to_extract(self):
-        dirtype = "ustar/dirtype"
-        DIR = pathlib.Path(TEMPDIR) / "extractall"
-        with (
-            os_helper.temp_dir(DIR),
-            tarfile.open(tarname, encoding="iso8859-1") as tar
-        ):
-            tarinfo = tar.getmember(dirtype)
-            with self.assertWarnsRegex(DeprecationWarning, "Use the filter argument") as cm:
-                tar.extract(tarinfo, path=DIR)
-            # check that the stacklevel of the deprecation warning is correct:
-            self.assertEqual(cm.filename, __file__)
-
     def test_extractall_pathlike_dir(self):
         DIR = os.path.join(TEMPDIR, "extractall")
         with os_helper.temp_dir(DIR), \
@@ -4011,15 +3986,6 @@ def test_data_filter(self):
             self.assertIs(filtered.name, tarinfo.name)
             self.assertIs(filtered.type, tarinfo.type)

-    def test_default_filter_warns(self):
-        """Ensure the default filter warns"""
-        with ArchiveMaker() as arc:
-            arc.add('foo')
-        with warnings_helper.check_warnings(
-                ('Python 3.14', DeprecationWarning)):
-            with self.check_context(arc.open(), None):
-                self.expect_file('foo')
-
     def test_change_default_filter_on_instance(self):
         tar = tarfile.TarFile(tarname, 'r')
         def strict_filter(tarinfo, path):
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Update tarfile library to use 'data' filter by default when extracting
