zipfile regression: When writing a zip64 entry to an unseekable file, the local file header compressed/uncompressed fields are not set to 0 #106218

Closed
@chenxiaolong

Description

Bug report

When creating a zip file with a zip64 entry in a streaming way (unseekable file), my understanding from the spec is that the headers should be set up so that:

  1. The local file header general flags field has bit 3 set (§4.4.4)
  2. The local file header crc32, compressed size, and uncompressed size fields are set to 0 (§4.4.4)
  3. The local file header has a zip64 (0x0001) extra record so that data descriptor sizes are interpreted as 64-bit integers (§4.3.9.2)
  4. A data descriptor header exists (§4.3.9.1)
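As a rough illustration of conditions 1 to 3, the following sketch parses one local file header from raw bytes and reports which conditions hold. The function name `check_streamed_zip64_header` and the result keys are hypothetical, chosen only for this example. Condition 4 is not checked, because locating the data descriptor requires knowing where the payload ends.

```python
import struct

LOCAL_HEADER_FMT = "<4sHHHHHLLLHH"  # 30-byte fixed part of a local file header
LOCAL_HEADER_SIZE = struct.calcsize(LOCAL_HEADER_FMT)
FLAG_DATA_DESCRIPTOR = 0x0008  # general purpose flag, bit 3
ZIP64_EXTRA_ID = 0x0001


def check_streamed_zip64_header(data, offset=0):
    """Return a dict of booleans for conditions 1-3 on one local header."""
    (sig, _version, flags, _method, _time, _date, crc, csize, usize,
     name_len, extra_len) = struct.unpack_from(LOCAL_HEADER_FMT, data, offset)
    if sig != b"PK\x03\x04":
        raise ValueError("not a local file header")

    extra_start = offset + LOCAL_HEADER_SIZE + name_len
    extra = data[extra_start:extra_start + extra_len]

    # Walk the extra field: a sequence of (id, length, payload) records.
    has_zip64 = False
    pos = 0
    while pos + 4 <= len(extra):
        rec_id, rec_len = struct.unpack_from("<HH", extra, pos)
        if rec_id == ZIP64_EXTRA_ID:
            has_zip64 = True
        pos += 4 + rec_len

    return {
        "bit3_set": bool(flags & FLAG_DATA_DESCRIPTOR),
        "crc_and_sizes_zero": (crc, csize, usize) == (0, 0, 0),
        "zip64_extra_present": has_zip64,
    }
```

Passing the first bytes of an archive to this function gives a quick yes/no per condition; an affected archive would report `crc_and_sizes_zero` as false while the other two entries are true.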

zipfile normally does all this, but (2) seems to have broken after #103861/#103863. With that change, the local file header's compressed and uncompressed sizes are set to 0xffffffff instead of 0x00000000. (0xffffffff is correct for the usual case where files are seekable and data descriptors are not used.)

From initial testing, it seems like this is all that's needed to fix the issue:

--- zipfile.py	2023-06-08 22:29:05.000000000 -0400
+++ zipfile2.py	2023-06-28 17:50:45.841709476 -0400
@@ -463,8 +463,9 @@
             fmt = '<HHQQ'
             extra = extra + struct.pack(fmt,
                                         1, struct.calcsize(fmt)-4, file_size, compress_size)
-            file_size = 0xffffffff
-            compress_size = 0xffffffff
+            if not (self.flag_bits & _MASK_USE_DATA_DESCRIPTOR):
+                file_size = 0xffffffff
+                compress_size = 0xffffffff
             min_version = ZIP64_VERSION
 
         if self.compress_type == ZIP_BZIP2:

To reproduce

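import zipfile


class UnseekableFile:
    # Exposes only write() and flush(); without tell()/seek(), zipfile
    # treats the target as unseekable and uses data descriptors.
    def __init__(self, fp):
        self.fp = fp

    def write(self, data):
        return self.fp.write(data)

    def flush(self):
        self.fp.flush()


with open('test.zip', 'wb') as f_raw:
    with zipfile.ZipFile(UnseekableFile(f_raw), 'w') as z:
        # force_zip64=True makes zipfile emit the zip64 extra record
        # even though the payload is tiny.
        with z.open('foobar', 'w', force_zip64=True) as f:
            f.write(b'Hello, world!')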
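For a quick check without an external tool, the same reproduction can be run in memory and the relevant local header fields read back with `struct`. This is only a sketch: the `WriteOnly` wrapper and `local_header_fields` helper are hypothetical names for this example, and the values printed depend on whether the Python version in use is affected.

```python
import io
import struct
import zipfile


class WriteOnly:
    """Hypothetical wrapper exposing only write() and flush(), so zipfile
    finds no tell()/seek() and treats the target as unseekable."""

    def __init__(self, fp):
        self.fp = fp

    def write(self, data):
        return self.fp.write(data)

    def flush(self):
        self.fp.flush()


def local_header_fields():
    """Write one forced-zip64 entry and return (flags, crc, csize, usize)
    as stored in the first local file header."""
    buf = io.BytesIO()
    with zipfile.ZipFile(WriteOnly(buf), "w") as z:
        with z.open("foobar", "w", force_zip64=True) as f:
            f.write(b"Hello, world!")
    raw = buf.getvalue()
    # Fixed offsets in the local file header: flags at 6, CRC at 14,
    # compressed size at 18, uncompressed size at 22.
    (flags,) = struct.unpack_from("<H", raw, 6)
    crc, csize, usize = struct.unpack_from("<LLL", raw, 14)
    return flags, crc, csize, usize


flags, crc, csize, usize = local_header_fields()
print(f"flags={flags:#06x} crc={crc:#010x} "
      f"compressed={csize:#010x} uncompressed={usize:#010x}")
```

On an affected version the two size fields should print as 0xffffffff; with the behavior described in point (2) they should print as zero.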

The resulting file looks like this, as reported by zipdetails (lines prefixed with + are good, lines prefixed with - are problematic).

 0000 LOCAL HEADER #1       04034B50
 0004 Extract Zip Spec      2D '4.5'
 0005 Extract OS            00 'MS-DOS'
+0006 General Purpose Flag  0008
+     [Bit  3]              1 'Streamed'
 0008 Compression Method    0000 'Stored'
 000A Last Mod Time         00210000 'Mon Dec 31 19:00:00 1979'
+000E CRC                   00000000
-0012 Compressed Length     FFFFFFFF
-0016 Uncompressed Length   FFFFFFFF
 001A Filename Length       0006
 001C Extra Length          0014
 001E Filename              'foobar'
+0024 Extra ID #0001        0001 'ZIP64'
+0026   Length              0010
+0028   Uncompressed Size   0000000000000000
+0030   Compressed Size     0000000000000000
 0038 PAYLOAD               Hello, world!
 
+0045 STREAMING DATA HEADER 08074B50
+0049 CRC                   EBE6C6E6
+004D Compressed Length     000000000000000D
+0055 Uncompressed Length   000000000000000D
 
 005D CENTRAL HEADER #1     02014B50
 0061 Created Zip Spec      2D '4.5'
 0062 Created OS            03 'Unix'
 0063 Extract Zip Spec      2D '4.5'
 0064 Extract OS            00 'MS-DOS'
 0065 General Purpose Flag  0008
      [Bit  3]              1 'Streamed'
 0067 Compression Method    0000 'Stored'
 0069 Last Mod Time         00210000 'Mon Dec 31 19:00:00 1979'
 006D CRC                   EBE6C6E6
 0071 Compressed Length     0000000D
 0075 Uncompressed Length   0000000D
 0079 Filename Length       0006
 007B Extra Length          0000
 007D Comment Length        0000
 007F Disk Start            0000
 0081 Int File Attributes   0000
      [Bit 0]               0 'Binary Data'
 0083 Ext File Attributes   01800000
 0087 Local Header Offset   00000000
 008B Filename              'foobar'
 
 0091 END CENTRAL HEADER    06054B50
 0095 Number of this disk   0000
 0097 Central Dir Disk no   0000
 0099 Entries in this disk  0001
 009B Total Entries         0001
 009D Size of Central Dir   00000034
 00A1 Offset to Central Dir 0000005D
 00A5 Comment Length        0000
 Done
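To illustrate why the zip64 extra record matters for the data descriptor, here is a minimal sketch that parses a descriptor with either 32-bit or 64-bit size fields. The `parse_data_descriptor` helper is hypothetical, and it assumes the descriptor carries the optional 0x08074B50 signature, as it does in the dump above. The sample bytes are rebuilt from the values shown at offsets 0x45 to 0x5C.

```python
import struct

DATA_DESCRIPTOR_SIG = 0x08074B50


def parse_data_descriptor(data, zip64):
    """Parse a signed data descriptor and return (crc, csize, usize).

    The size fields are 8 bytes wide when zip64 is true, 4 bytes otherwise.
    """
    fmt = "<LLQQ" if zip64 else "<LLLL"
    sig, crc, csize, usize = struct.unpack_from(fmt, data)
    if sig != DATA_DESCRIPTOR_SIG:
        raise ValueError("missing data descriptor signature")
    return crc, csize, usize


# Same field values as the STREAMING DATA HEADER in the dump above.
descriptor = struct.pack("<LLQQ", DATA_DESCRIPTOR_SIG, 0xEBE6C6E6, 13, 13)
print(parse_data_descriptor(descriptor, zip64=True))
```

Reading the same bytes with `zip64=False` would take the upper half of the 64-bit compressed size as the uncompressed size, which is the kind of misinterpretation a reader risks when it cannot tell that the entry is zip64.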

Your environment

  • CPython versions tested on: 3.11.4
  • Operating system and architecture: Alpine Linux x86_64

Labels: type-bug (An unexpected behavior, bug, or error)
